English (US) Call Center Speech Dataset for Healthcare

This audio dataset contains Healthcare call center conversations between native English speakers from the US, with detailed metadata and accurate transcriptions.

Category

Unscripted Call Center Conversations

Total Volume

30 Speech Hours

Last updated

July 2023

Number of participants

60

Get this Speech Dataset

About this Off-the-shelf Speech Dataset

What’s Included

Welcome to the English Language Call Center Speech Dataset for the Healthcare domain. It is a specialized and comprehensive collection of voice data designed to enhance the development of call center speech recognition models specifically for the Healthcare industry.


With high-quality call center audio recordings, detailed metadata, and accurate transcriptions, it helps researchers and developers improve natural language processing, conversational AI, and generative voice AI algorithms in the Healthcare domain. It also supports the creation of voice assistants and voice bots tailored to the linguistic nuances of English as spoken in the United States.


Speech Data:

This training dataset comprises 30 hours of call center audio recordings covering various topics and scenarios related to the Healthcare domain, to build robust and accurate customer service speech technology.


To curate realistic call center interactions, we collaborated with a diverse network of 60 expert native English speakers from different states of the United States. This collaborative effort ensures a balanced representation of US accents, dialects, and demographics, promoting inclusivity and reducing biases in the dataset.


Each recording captures an unscripted, spontaneous conversation between a call center agent and a customer, and calls run from 5 to 15 minutes each. The dataset includes both inbound and outbound calls, covering scenarios such as inquiries, promotional offers, complaints, and technical support. It also contains conversations with both positive and negative outcomes, which keeps the data diverse and realistic.
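As a rough sanity check, the stated volume and call lengths bound how many calls the dataset can contain. The sketch below assumes every call falls inside the 5 to 15 minute range; the actual file count is not stated on this page, and `call_count_bounds` is a hypothetical helper name.

```python
import math

TOTAL_HOURS = 30        # total volume stated for this dataset
MIN_CALL_MINUTES = 5    # shortest stated call length
MAX_CALL_MINUTES = 15   # longest stated call length


def call_count_bounds(total_hours, min_minutes, max_minutes):
    """Return (fewest, most) calls that could add up to the total duration."""
    total_minutes = total_hours * 60
    fewest = math.ceil(total_minutes / max_minutes)  # every call at the maximum length
    most = total_minutes // min_minutes              # every call at the minimum length
    return fewest, most


fewest, most = call_count_bounds(TOTAL_HOURS, MIN_CALL_MINUTES, MAX_CALL_MINUTES)
print(f"Between {fewest} and {most} calls")
```

This is only an estimate of scale for planning train and test splits; check the delivered file list for the real number.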


The speech data is provided in WAV format with two (stereo) channels, a bit depth of 16 bits, and a sample rate of 8 kHz, the rate commonly used for telephony audio. The recording environment is generally quiet, without background noise or echo.
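Before training, it can help to confirm that each file matches the stated audio format. The following is a minimal sketch using Python's standard `wave` module; the function names and the `EXPECTED` dictionary are illustrative assumptions, and the sketch assumes each speaker sits on its own channel.

```python
import wave

# Expected values taken from the description above; adjust if your copy differs.
EXPECTED = {"channels": 2, "sample_width_bytes": 2, "sample_rate_hz": 8000}


def read_wav_properties(path):
    """Return channel count, sample width, sample rate, and duration of a WAV file."""
    with wave.open(path, "rb") as wav:
        frames = wav.getnframes()
        rate = wav.getframerate()
        return {
            "channels": wav.getnchannels(),
            "sample_width_bytes": wav.getsampwidth(),
            "sample_rate_hz": rate,
            "duration_s": frames / rate,
        }


def matches_spec(props, expected=EXPECTED):
    """True when every expected property equals the file's property."""
    return all(props[key] == value for key, value in expected.items())


def split_channels(path):
    """Return the two channels of a 16-bit stereo WAV as separate byte strings."""
    with wave.open(path, "rb") as wav:
        if wav.getnchannels() != 2 or wav.getsampwidth() != 2:
            raise ValueError("expected 16-bit stereo audio")
        raw = wav.readframes(wav.getnframes())
    # Frames are interleaved: 2 bytes for channel 1, then 2 bytes for channel 2.
    first = b"".join(raw[i:i + 2] for i in range(0, len(raw), 4))
    second = b"".join(raw[i + 2:i + 4] for i in range(0, len(raw), 4))
    return first, second
```

Splitting the channels this way gives one mono stream per speaker, which is often convenient for speaker-specific ASR training.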


Metadata:

In addition to the audio recordings, our dataset provides comprehensive metadata for each participant. This includes the participant’s age, gender, country, state, and dialect. Additionally, it includes metadata like domain, topic, call type, outcome, bit depth, and sample rate for each conversation.


The metadata serves as a powerful tool for understanding and characterizing the data, enabling informed decision-making in the development of English language call center speech recognition models for the Healthcare domain.
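The metadata fields can be used to slice the dataset, for example to check balance or to select a subset of calls. The sketch below is illustrative only: the key names mirror the fields listed above, but the real file layout and key names may differ, and the three records are made-up placeholders rather than actual dataset entries.

```python
from collections import Counter

# Hypothetical records for illustration; not real dataset entries.
records = [
    {"age": 29, "gender": "male", "state": "Arizona",
     "call_type": "outbound", "outcome": "positive"},
    {"age": 24, "gender": "female", "state": "Arizona",
     "call_type": "inbound", "outcome": "negative"},
    {"age": 41, "gender": "female", "state": "Texas",
     "call_type": "inbound", "outcome": "positive"},
]


def count_by(rows, field):
    """Count how many records carry each value of the given field."""
    return Counter(row[field] for row in rows)


def select(rows, **criteria):
    """Return the records whose fields equal all of the given criteria."""
    return [row for row in rows
            if all(row.get(key) == value for key, value in criteria.items())]


print(count_by(records, "call_type"))
print(select(records, call_type="inbound", outcome="positive"))
```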


Transcription:

To facilitate your workflow, the dataset includes a manual verbatim transcription of each call center audio file in JSON format. The transcriptions are segmented by speaker and time-coded, include non-speech labels and tags, and cover both the agent and the customer sides of each conversation.


These ready-to-use transcriptions accelerate the development of Healthcare call center conversational AI and ASR models for the English language.
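A time-coded, speaker-wise transcript lends itself to simple analysis such as talk time per speaker. The sketch below assumes each JSON file is an array of segments with the keys `label`, `start`, `end`, `channel`, and `transcript`; this schema is an assumption based on the columns shown in the dataset sample, so check the actual files before relying on it.

```python
import json
from collections import defaultdict

# Assumed schema; the three segments are taken from the sample on this page.
SAMPLE = """
[
  {"label": "Speech", "start": 4.774, "end": 5.799,
   "channel": "Speaker 2", "transcript": "Hello Futurebee."},
  {"label": "Speech", "start": 7.578, "end": 8.477,
   "channel": "Speaker 1", "transcript": "Hello Futurebee."},
  {"label": "Noise", "start": 7.703, "end": 7.961,
   "channel": "-", "transcript": "-"}
]
"""


def talk_time_by_speaker(segments):
    """Sum the duration of Speech segments for each speaker, in seconds."""
    totals = defaultdict(float)
    for segment in segments:
        if segment["label"] == "Speech":
            totals[segment["channel"]] += segment["end"] - segment["start"]
    return dict(totals)


segments = json.loads(SAMPLE)
for speaker, seconds in sorted(talk_time_by_speaker(segments).items()):
    print(f"{speaker}: {seconds:.3f} s")
```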


Updates and Customization:

We understand the importance of collecting data in various environments to build robust ASR models. Therefore, our call center voice dataset is regularly updated with new audio data captured in diverse real-world conditions.


If you require a custom training dataset with specific environmental conditions, we can accommodate your request. We can provide voice data at custom sample rates ranging from 8 kHz to 48 kHz, so you can fine-tune your models for different audio recording setups. We can also customize the transcription to follow your specific guidelines and requirements, to further support your ASR development process.


License:

This Healthcare call center audio dataset is created by FutureBeeAI and is available for commercial use!


Conclusion:

Whether you are training or fine-tuning speech recognition models, advancing NLP algorithms, or building state-of-the-art voice assistants to improve customer experiences in the Healthcare sector, our dataset serves as a trusted resource to meet your goals.


Use Cases

Automatic Speech Recognition (ASR)

Conversational AI

Chatbot and voicebot creation

Language Modeling

Text-to-Speech (TTS)

Speech Analytics

Dataset Sample(s)

ATTRIBUTES

Channel 1 | Channel 2 | Format
Male (29) | Female (24) | wav, json

TRANSCRIPTION

LABEL | START | END | CHANNEL | TRANSCRIPT
Speech | 4.774 | 5.799 | Speaker 2 | Hello Futurebee.
Speech | 7.578 | 8.477 | Speaker 1 | Hello Futurebee.
Noise | 7.703 | 7.961 | - | -
Noise | 9.519 | 9.878 | - | -
Speech | 12.669 | 14.336 | Speaker 2 | Hello is this <PII>Mr. Micheal</PII>?
Speech | 15.579 | 17.335 | Speaker 1 | This is, whom I speaking to?
Speech | 18.562 | 24.603 | Speaker 2 | Hi this is Kelly. I am with [filler], Dr Brigers office. [filler] I am calling because we
Noise | 24.329 | 24.518 | - | -
Speech | 25.082 | 26.722 | Speaker 2 | needed you a little pre test
Speech | 27.361 | 29.411 | Speaker 2 | screening before your appointment tomorrow.
Speech | 30.103 | 31.312 | Speaker 2 | You have just ten minutes.
Speech | 33.868 | 37.679 | Speaker 1 | [filler]yeah I think so. [filler] yeah, yeah I have got.
Noise | 37.722 | 38.222 | - | -
Speech | 39.207 | 44.883 | Speaker 2 | Okay perfect. So we just do this phone call to make the check in process easier once you get here because
Speech | 45.649 | 47.847 | Speaker 2 | ever since our office is reopened
Speech | 48.283 | 50.347 | Speaker 2 | we had such a backlog of patients
Speech | 51.024 | 53.265 | Speaker 2 | that when we reopened
Noise | 52.816 | 54.182 | - | -
Speech | 54.133 | 60.191 | Speaker 2 | we are trying to take on more patients than usual. So by doing this process it helps us to get through
Speech | 60.740 | 61.731 | Speaker 2 | the check in process
Speech | 61.957 | 65.248 | Speaker 2 | more quickly when you actually come to the office. Okay?
Speech | 68.328 | 75.453 | Speaker 1 | I will tell you what, that makes me real happy because one of the things I hate about going into doctors offices is I get there in time to my appointment.
Speech | 75.894 | 81.087 | Speaker 1 | And then I get a quick half an hour to call out all the paper work. That's in a, that's in a eliminate this right?
Speech | 83.287 | 86.912 | Speaker 2 | For the most part, [filler] thankfully you are already [filler]
Speech | 87.552 | 90.194 | Speaker 2 | repeat patient that we won't have to call out
Speech | 90.587 | 91.686 | Speaker 2 | any paper work
Speech | 92.436 | 95.052 | Speaker 2 | like sometimes you have to do with your new patient.
Speech | 95.953 | 100.677 | Speaker 2 | [filler], but I just want you to know that like I said we have been really
Speech | 101.403 | 103.453 | Speaker 2 | swamped with new patients and
Speech | 103.843 | 105.686 | Speaker 2 | a backlog of patients so
Speech | 106.170 | 108.412 | Speaker 2 | please continue to be patient with us if you are
Speech | 108.811 | 110.569 | Speaker 2 | Appointment is on perfectly on time.
Speech | 114.045 | 116.670 | Speaker 1 | Okay. Alright [filler], thanks for giving me a heads up.
Speech | 117.203 | 120.170 | Speaker 2 | Alright so I just have a few questions to ask you.
Speech | 121.927 | 126.644 | Speaker 2 | [filler]and they are just some general questions about your health within the last few days.
Speech | 127.412 | 131.703 | Speaker 2 | So just think back to your last few days and you can give me an answer. Are you ready?
Speech | 135.287 | 135.961 | Speaker 1 | I am ready.
Speech | 135.453 | 138.978 | Speaker 2 | Okay my first question is, have you had a new fever?
Speech | 139.425 | 145.336 | Speaker 2 | of a hundred and four degrees or higher, Of one hundred point four degrees or higher? Yes or no?
Speech | 148.151 | 156.961 | Speaker 1 | [filler], I have not had any meter to take my own temperature in the last forty eight hours. I have a sense of fever. So [filler], I don't have any measurements to tell you this. But no, I, I don't.
Noise | 157.032 | 157.663 | - | -
Speech | 158.788 | 163.216 | Speaker 2 | Okay so that means that within the last couple of days there has been no symptoms that could be
Noise | 159.274 | 160.066 | - | -
Speech | 163.757 | 165.532 | Speaker 2 | connected with a fever either right?
Speech | 167.757 | 168.191 | Speaker 1 | Correct
Speech | 169.223 | 172.032 | Speaker 2 | Okay that's great. Let me just type that into my computer.
Noise | 172.626 | 172.723 | - | -
Speech | 174.830 | 176.247 | Speaker 2 | Okay and my next question.
Speech | 176.782 | 181.449 | Speaker 2 | Have you had a new cough, you can now attribute to another health condition?
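The sample transcript marks hesitations with `[filler]` and wraps personal information in `<PII>` tags. For tasks that need plain text, such as language modeling, these tags can be normalized. The sketch below is one possible approach, not the dataset's official tooling; `clean_transcript` and the placeholder value are illustrative assumptions.

```python
import re

FILLER = re.compile(r"\[filler\]")
PII = re.compile(r"<PII>.*?</PII>")


def clean_transcript(text, pii_placeholder="<PII>"):
    """Mask tagged personal information and remove filler tags from one segment."""
    text = PII.sub(pii_placeholder, text)        # hide the tagged name or number
    text = FILLER.sub(" ", text)                 # drop hesitation markers
    text = re.sub(r"\s+([,.?!])", r"\1", text)   # no space before punctuation
    text = re.sub(r"\s+", " ", text).strip()     # collapse repeated whitespace
    return text.lstrip(" ,")                     # drop a comma left by a leading filler


print(clean_transcript("Hello is this <PII>Mr. Micheal</PII>?"))
print(clean_transcript("Okay. Alright [filler], thanks for giving me a heads up."))
```

Whether to mask, drop, or keep the tagged spans depends on the downstream task; for acoustic modeling the filler tags may be worth keeping as explicit tokens.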

Dataset Demographics

Language

English

Language code

en-us

Country

USA

Accents

Arizona,...more

Gender Distribution

M: 55, F: 45

Age Group

18-70

Audio File Details

Environment

Silent, Noisy

Bit Depth

16 bit

Format

wav

Sample rate

8 kHz

Channel

Dual separate channel

Audio file duration

5-15 minutes

Download Sample Speech Dataset Now!

Explore the audio data, metadata, and transcription to get hands-on experience with this dataset.

Download Free Dataset

Start your AI/ML model creation journey with FutureBeeAI!

Contact Us
