English (India) Call Center Speech Dataset for Healthcare

The audio dataset includes call center conversations in Healthcare, featuring native English speakers from India, with detailed metadata and accurate transcriptions.

Category

Unscripted Call Center Conversations

Total Volume

30 Speech Hours

Last updated

July 2023

Number of participants

60

About this Off-the-shelf Speech Dataset

What’s Included

Welcome to the English Language Call Center Speech Dataset for the Healthcare domain. It is a specialized and comprehensive collection of voice data designed to enhance the development of call center speech recognition models specifically for the Healthcare industry.


With high-quality call center audio recordings, detailed metadata, and accurate transcriptions, it empowers researchers and developers to enhance natural language processing, conversational AI, and generative voice AI algorithms in the Healthcare domain. Moreover, it facilitates the creation of sophisticated voice assistants and voice bots tailored to the unique linguistic nuances found in the English language spoken in India.


Speech Data:

This training dataset comprises 30 hours of call center audio recordings covering various topics and scenarios in the Healthcare domain. It is intended for building robust and accurate customer service speech technology.


To curate realistic call center interactions, we collaborated with a diverse network of 60 expert native English speakers from different states/provinces of India. This collaborative effort ensures a balanced representation of Indian accents, dialects, and demographics, promoting inclusivity and reducing biases in the dataset.


Each audio recording captures an unscripted, spontaneous conversation between a call center agent and a customer, with call durations ranging from 5 to 15 minutes. The dataset includes both inbound and outbound calls, covering scenarios such as inquiries, promotional offers, complaints, technical support, and more. It also contains conversations with both positive and negative outcomes, which keeps the data diverse and realistic.


The speech data is available in WAV format with stereo channels, a bit depth of 16 bits, and a sample rate of 8 kHz, ensuring high-quality audio for accurate analysis. The recording environment is generally quiet, without background noise or echo.
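As a minimal sketch of how the stated audio format could be checked before training, the following Python uses only the standard library to confirm that a file is 16-bit, two-channel, 8 kHz WAV and to separate the two channels. The function names and the file name in the usage note are hypothetical and are not part of the dataset.

```python
import struct
import wave

# Expected properties, taken from the format description above.
EXPECTED = {"channels": 2, "sample_width_bytes": 2, "sample_rate_hz": 8000}


def check_wav_format(path):
    """Report a WAV file's properties and whether they match EXPECTED."""
    with wave.open(path, "rb") as wav:
        found = {
            "channels": wav.getnchannels(),
            "sample_width_bytes": wav.getsampwidth(),
            "sample_rate_hz": wav.getframerate(),
        }
        duration_s = wav.getnframes() / wav.getframerate()
    return {"found": found, "matches": found == EXPECTED, "duration_s": duration_s}


def split_channels(path):
    """Read a 16-bit stereo WAV and return (channel_1, channel_2) sample lists."""
    with wave.open(path, "rb") as wav:
        if wav.getnchannels() != 2 or wav.getsampwidth() != 2:
            raise ValueError("expected 16-bit stereo audio")
        raw = wav.readframes(wav.getnframes())
    # 16-bit PCM in WAV is little-endian and signed; stereo samples are
    # interleaved as ch1, ch2, ch1, ch2, ...
    samples = struct.unpack("<%dh" % (len(raw) // 2), raw)
    return list(samples[0::2]), list(samples[1::2])
```

For example, `check_wav_format("call_0001.wav")["matches"]` would be `True` for a file that follows the description, where `call_0001.wav` is a placeholder name.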


Metadata:

In addition to the audio recordings, our dataset provides comprehensive metadata for each participant. This includes the participant’s age, gender, country, state, and dialect. Additionally, it includes metadata like domain, topic, call type, outcome, bit depth, and sample rate for each conversation.


The metadata serves as a powerful tool for understanding and characterizing the data, enabling informed decision-making in the development of English language call center speech recognition models for the Healthcare domain.


Transcription:

To facilitate your workflow, the dataset includes a manual verbatim transcription of each call center audio file in JSON format. Each transcription is segmented by speaker with start and end timestamps, includes non-speech labels and tags, and covers both the agent and customer sides of the conversation.


These ready-to-use transcriptions accelerate the development of Healthcare call center conversational AI and ASR models for the English language.
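The exact JSON schema is not documented on this page, so the following Python is only a sketch under an assumed layout: a list of segments whose keys (`label`, `start`, `end`, `channel`, `transcript`) mirror the columns of the sample transcription on this page. The real files may use different field names or nesting.

```python
import json
from collections import defaultdict

# Hypothetical JSON layout (an assumption, not the documented schema).
# Values are copied from the sample transcription on this page.
SAMPLE_JSON = """
[
  {"label": "Speech", "start": 1.175, "end": 2.350,
   "channel": "Speaker 1", "transcript": "Hello Futurebee."},
  {"label": "Speech", "start": 3.700, "end": 4.825,
   "channel": "Speaker 2", "transcript": "Hello Futurebee."},
  {"label": "Noise", "start": 12.864, "end": 13.813,
   "channel": "-", "transcript": "-"}
]
"""


def speech_segments(segments):
    """Keep only segments labeled as speech, dropping non-speech labels."""
    return [seg for seg in segments if seg["label"] == "Speech"]


def talk_time_by_speaker(segments):
    """Sum the duration of speech segments per speaker."""
    totals = defaultdict(float)
    for seg in speech_segments(segments):
        totals[seg["channel"]] += seg["end"] - seg["start"]
    return dict(totals)


segments = json.loads(SAMPLE_JSON)
totals = talk_time_by_speaker(segments)
```

Per-speaker totals like these can help balance agent and customer speech when selecting training material.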


Updates and Customization:

We understand the importance of collecting data in various environments to build robust ASR models. Therefore, our call center voice dataset is regularly updated with new audio data captured in diverse real-world conditions.


If you require a custom training dataset with specific environmental conditions, we can accommodate your request. We can provide voice data with customized sample rates ranging from 8 kHz to 48 kHz, allowing you to fine-tune your models for different audio recording setups. We can also customize the transcription to follow your specific guidelines and requirements, to further support your ASR development process.


License:

This Healthcare call center audio dataset is created by FutureBeeAI and is available for commercial use!


Conclusion:

Whether you are training or fine-tuning speech recognition models, advancing NLP algorithms, or building state-of-the-art voice assistants to improve customer experiences in the Healthcare sector, our dataset serves as a trusted resource to meet your goals.


Use Cases

Automatic Speech Recognition (ASR)

Conversational AI

Chatbot and voicebot creation

Language Modelling

Text-to-speech (TTS)

Speech Analytics

Dataset Sample(s)

ATTRIBUTES

Channel 1 | Channel 2 | Format
Female (21) | Male (24) | wav, json

TRANSCRIPTION

LABEL | START | END | CHANNEL | TRANSCRIPT
Speech | 1.175 | 2.350 | Speaker 1 | Hello Futurebee.
Speech | 3.700 | 4.825 | Speaker 2 | Hello Futurebee.
Speech | 5.150 | 6.150 | Speaker 1 | Good morning.
Speech | 7.625 | 8.698 | Speaker 2 | Good morning.
Speech | 8.823 | 11.900 | Speaker 1 | So, this is <PII>Dr. Varsha</PII>. How can I assist you today?
Noise | 12.864 | 13.813 | - | -
Speech | 13.798 | 15.448 | Speaker 2 | Good morning doctor. I am <PII>Rishabh</PII>.
Speech | 15.961 | 24.437 | Speaker 2 | I have been experience some persistent headache, over the past few days. And I though it would be best to consult with the professionals. So, I am here.
Speech | 25.707 | 33.009 | Speaker 1 | Okay. So, I am glad <PII>Rishabh</PII>. You reach out and I am here to help you. So, let start by discussing your symptoms
Speech | 33.405 | 41.304 | Speaker 1 | in more detail. And [filler] can you tell me when the headache started and how frequently you have been experiencing them?
Speech | 44.316 | 51.091 | Speaker 2 | [filler]The headaches started about a week ago and they have been occurring almost [filler] every day since then.
Speech | 51.432 | 56.408 | Speaker 2 | [filler]They usually begin in the morning and last for several hours.
Speech | 56.868 | 60.268 | Speaker 2 | I have tried over the counter pain relievers [filler] but
Speech | 60.603 | 62.554 | Speaker 2 | they only provide temporary relief.
Speech | 63.478 | 70.102 | Speaker 1 | Okay. I see. [filler] Have you notice any specific trigger or pattern associated with the headache.
Speech | 70.423 | 76.373 | Speaker 1 | For example do they worsen after certain activity or exposer to certain environment.
Speech | 79.552 | 86.427 | Speaker 2 | [filler]I think not that I am aware of. They seems to come on without any apparent reason.
Speech | 86.694 | 91.795 | Speaker 2 | I haven't notice any particular triggers that which you, which you mentioned that.
Speech | 92.165 | 100.540 | Speaker 1 | Okay. Beside the headache have you been experiencing any other symptoms such as dizziness, nausea or changes in vision and all?
Speech | 103.027 | 108.902 | Speaker 2 | [filler]No. None of those symptoms. It just mainly headaches that have been bothering me.
Speech | 109.134 | 114.242 | Speaker 2 | And little bit cold is also I am facing right now.
Speech | 114.471 | 121.822 | Speaker 1 | Understood. Based on your description it's possible that may be you experiencing tension headache or migraine.
Speech | 122.447 | 126.897 | Speaker 1 | Because however it's important to conduct [filler]
Speech | 127.286 | 128.634 | Speaker 1 | (())
Speech | 128.948 | 139.048 | Speaker 1 | Because to rule out any underline diseases or any cause [filler] you have experience similar headache in the past or is it new occurrence?
Noise | 140.985 | 142.461 | - | -
Speech | 142.605 | 145.407 | Speaker 2 | [filler]I have had
Speech | 145.625 | 149.925 | Speaker 2 | occasional headaches before but they were never this frequent or intense.
Speech | 150.241 | 154.143 | Speaker 2 | [filler]This is definitely different from what I have experience past.
Speech | 154.584 | 161.709 | Speaker 1 | Okay. I see. Given the change in the frequency and intensity it would be prudent to conduct further test
Speech | 161.830 | 162.657 | Speaker 1 | to ensure
Speech | 162.918 | 170.342 | Speaker 1 | [filler]we have a comprehensive understanding of your condition. I recommend (()) appointment.
Speech | 170.930 | 177.330 | Speaker 1 | Perform our physical examination and potentially order some diagnostic test. Because such as
Speech | 177.587 | 185.312 | Speaker 1 | blood (()) or brain images scan. This will help us rule out any other possibilities and provide a more accurate
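
The sample transcription above contains inline tags such as `<PII>...</PII>`, `[filler]`, and `(())`. As a rough sketch of how such tags might be normalized before ASR or language-model training, the Python below strips or masks them. The interpretation of `(())` as unintelligible speech is an assumption, and the function name and the `mask_pii` option are hypothetical.

```python
import re


def normalize_transcript(text, mask_pii=False):
    """Remove annotation tags from one transcript segment.

    mask_pii=False keeps the words inside <PII>...</PII> and drops the tags;
    mask_pii=True replaces the whole tagged span with a "<PII>" placeholder.
    """
    if mask_pii:
        text = re.sub(r"<PII>.*?</PII>", "<PII>", text)
    else:
        text = re.sub(r"</?PII>", "", text)
    # Drop filler markers and the (()) marker (assumed to mean unintelligible speech).
    text = re.sub(r"\[filler\]|\(\(\)\)", " ", text)
    # Collapse the whitespace left behind by removed tags.
    return re.sub(r"\s+", " ", text).strip()
```

Whether to keep, mask, or drop PII spans depends on the downstream task, so the choice is left as a parameter rather than fixed.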


Dataset Demographics

Language

English

Language code

en-IN

Country

India

Accents

Chandigarh, ...

Gender Distribution

M: 55, F: 45

Age Group

18-70

Audio File Details

Environment

Silent, Noisy

Bit Depth

16 bit

Format

wav

Sample rate

8 kHz

Channel

Dual separate channel

Audio file duration

5-15 minutes

Download Sample Speech Dataset Now!

Explore the audio data, metadata, and transcription to get more clarity and hands-on experience with this dataset.

Start your AI/ML model creation journey with FutureBeeAI!
