English (India) Call Center Speech Dataset for BFSI

The audio dataset includes call center conversations in BFSI, featuring native English speakers from India, with detailed metadata and accurate transcriptions.

Category

Unscripted Call Center Conversations

Total Volume

30 Speech Hours

Last updated

July 2023

Number of participants

60

About this Off-the-shelf Speech Dataset

What’s Included

Welcome to the English Language Call Center Speech Dataset for the BFSI domain. It is a specialized and comprehensive collection of voice data designed to enhance the development of call center speech recognition models specifically for the BFSI industry.


With high-quality call center audio recordings, detailed metadata, and accurate transcriptions, it empowers researchers and developers to enhance natural language processing, conversational AI, and generative voice AI algorithms in the BFSI domain. Moreover, it facilitates the creation of sophisticated voice assistants and voice bots tailored to the unique linguistic nuances found in the English language spoken in India.


Speech Data:

This training dataset comprises 30 hours of call center audio recordings covering various topics and scenarios related to the BFSI domain, to build robust and accurate customer service speech technology.


To curate realistic call center interactions, we collaborated with a diverse network of 60 expert native English speakers from different states/provinces of India. This collaborative effort ensures a balanced representation of Indian accents, dialects, and demographics, promoting inclusivity and reducing biases in the dataset.


Each audio recording captures an unscripted, spontaneous conversation between a call center agent and a customer, with call durations ranging from 5 to 15 minutes. The dataset includes both inbound and outbound calls, covering scenarios such as inquiries, promotional offers, complaints, technical support, and more. It also contains calls with both positive and negative outcomes, which keeps the data diverse and realistic.


The speech data is available in WAV format with stereo channels, a bit depth of 16 bits, and a sample rate of 8 kHz, ensuring high-quality audio for accurate analysis. The recording environment is generally quiet, without background noise and echo.
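As a rough illustration of how the stated audio format could be checked on delivery, the sketch below reads a WAV header with Python's standard `wave` module and reports any field that differs from the values listed above (2 channels, 16-bit samples, 8 kHz). The function names and the in-memory test file are assumptions made for this example; they are not part of the dataset or of any FutureBeeAI tooling.

```python
import io
import wave

# Expected values taken from the format description above.
EXPECTED = {"channels": 2, "sample_width_bytes": 2, "sample_rate_hz": 8000}


def read_wav_params(source):
    """Return channel count, sample width, sample rate and duration.

    `source` may be a file path or a binary file-like object.
    """
    with wave.open(source, "rb") as wav:
        frames = wav.getnframes()
        rate = wav.getframerate()
        return {
            "channels": wav.getnchannels(),
            "sample_width_bytes": wav.getsampwidth(),
            "sample_rate_hz": rate,
            "duration_s": frames / rate,
        }


def format_mismatches(params, expected=EXPECTED):
    """List the fields whose value differs from the documented format."""
    return [key for key, value in expected.items() if params[key] != value]


def make_silent_wav(seconds=1, channels=2, rate=8000):
    """Build an in-memory 16-bit WAV of silence (demo helper only)."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(2)  # 2 bytes per sample = 16 bits
        wav.setframerate(rate)
        wav.writeframes(b"\x00\x00" * channels * rate * seconds)
    buffer.seek(0)
    return buffer
```

Calling `format_mismatches(read_wav_params("call_001.wav"))` on a delivered file (the file name is hypothetical) would return an empty list when the header matches the description.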


Metadata:

In addition to the audio recordings, our dataset provides comprehensive metadata for each participant. This includes the participant’s age, gender, country, state, and dialect. Additionally, it includes metadata like domain, topic, call type, outcome, bit depth, and sample rate for each conversation.


The metadata serves as a powerful tool for understanding and characterizing the data, enabling informed decision-making in the development of English language call center speech recognition models for the BFSI domain.


Transcription:

To facilitate your workflow, the dataset includes a manual verbatim transcription of each call center audio file in JSON format. Each transcription is segmented by speaker and time-coded, and includes non-speech labels and tags, covering both the agent's and the customer's side of the conversation.
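The exact JSON schema is not documented on this page. The sketch below assumes, purely for illustration, that each file holds a list of segments whose keys mirror the columns of the sample transcript on this page (`label`, `start`, `end`, `channel`, `transcript`). Under that assumption, it sums the speaking time of each speaker while skipping non-speech segments.

```python
import json
from collections import defaultdict


def speech_seconds_by_speaker(transcript_json):
    """Sum the duration of 'Speech' segments per speaker.

    Assumes a JSON list of objects with the hypothetical keys
    'label', 'start', 'end' and 'channel' (times in seconds).
    """
    segments = json.loads(transcript_json)
    totals = defaultdict(float)
    for seg in segments:
        if seg["label"] != "Speech":
            continue  # skip Noise, Babble and other non-speech labels
        totals[seg["channel"]] += seg["end"] - seg["start"]
    return {speaker: round(total, 3) for speaker, total in totals.items()}


# Segments copied from the sample transcript, in the assumed schema.
SAMPLE = json.dumps([
    {"label": "Speech", "start": 0.000, "end": 1.149,
     "channel": "Speaker 1", "transcript": "Hello Futurebee."},
    {"label": "Speech", "start": 2.923, "end": 3.700,
     "channel": "Speaker 1", "transcript": "Hello ma'am."},
    {"label": "Speech", "start": 3.450, "end": 5.000,
     "channel": "Speaker 2", "transcript": "Hello Fururebee."},
    {"label": "Babble", "start": 32.125, "end": 32.975,
     "channel": "-", "transcript": "-"},
])
```

If the delivered files use different key names or nest the segments under another field, only the key lookups in the loop would need to change.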


These ready-to-use transcriptions accelerate the development of BFSI call center conversational AI and ASR models for the English language.


Updates and Customization:

We understand the importance of collecting data in various environments to build robust ASR models. Therefore, our call center voice dataset is regularly updated with new audio data captured in diverse real-world conditions.


If you require a custom training dataset with specific environmental conditions, we can accommodate your request. We can provide voice data with customized sample rates ranging from 8 kHz to 48 kHz, allowing you to fine-tune your models for different audio recording setups. We can also customize the transcription to follow your specific guidelines and requirements, to further support your ASR development process.


License:

This BFSI call center audio dataset is created by FutureBeeAI and is available for commercial use!


Conclusion:

Whether you are training or fine-tuning speech recognition models, advancing NLP algorithms, or building state-of-the-art voice assistants to improve customer experiences in the BFSI sector, our dataset serves as a trusted resource to meet your goals.


Use Cases

Automatic Speech Recognition (ASR)

Conversational AI

Chatbot and voicebot creation

Language modeling

Text-to-speech (TTS)

Speech analytics

Dataset Sample(s)

ATTRIBUTES

Channel 1 | Channel 2 | Format
Male (21) | Female (23) | wav, json

TRANSCRIPTION

LABEL | START | END | CHANNEL | TRANSCRIPT
Speech | 0.000 | 1.149 | Speaker 1 | Hello Futurebee.
Speech | 2.923 | 3.700 | Speaker 1 | Hello ma'am.
Speech | 3.450 | 5.000 | Speaker 2 | Hello Fururebee.
Speech | 5.500 | 7.025 | Speaker 1 | Hello ma'am. How can I help you?
Speech | 9.849 | 18.350 | Speaker 2 | Hello. I have a few concerns regarding my credit card and some transactions that seems to be incorrect. Can you assist me with this?
Speech | 19.000 | 21.250 | Speaker 1 | Certainly ma'am. I am here to help you.
Speech | 19.774 | 20.750 | Speaker 2 | (())
Speech | 21.824 | 34.625 | Speaker 1 | I apologize for any inconvenience caused by the incorrect transaction. To assist you better could you please provide me with your credit card number and some details about the specific transaction?
Babble | 32.125 | 32.975 | - | -
Speech | 38.423 | 43.100 | Speaker 2 | [noise] Sure, my credit card number is <PII>four two one three</PII>.
Speech | 43.673 | 44.825 | Speaker 2 | <PII>three two one four</PII>
Speech | 45.673 | 47.024 | Speaker 2 | <PII>eight nine double zero</PII>
Speech | 47.774 | 48.575 | Speaker 2 | <PII>zero nine.</PII>
Speech | 49.725 | 52.825 | Speaker 2 | I have noticed two transactions (())
Speech | 50.798 | 51.600 | Speaker 1 | I noted.
Speech | 54.125 | 66.025 | Speaker 2 | on my statement that I do not recognize. They seems to be fraudulent. I did like to dispute them an insure they are removed from my account.
Speech | 67.849 | 74.500 | Speaker 1 | Okay. Am sorry hear to [filler] hear about the fraudel transaction on your tran~ account.
Speech | 75.400 | 86.450 | Speaker 1 | I understand your concern and I will do everything I can assist you. Let me go ahead and review the transaction for you. Please bear with me while I check the details for you.
Noise | 76.674 | 77.525 | - | -
Noise | 79.225 | 79.625 | - | -
Speech | 89.375 | 93.174 | Speaker 2 | [noise] Okay. Please take your time. I will be waiting.
Speech | 93.724 | 94.150 | Speaker 1 | Yeah
Speech | 94.650 | 108.974 | Speaker 1 | Yeah Thank you ma'am for you patience. I have reviewed all the transactions on your account and I can confirm that they do not, they do appear to be fraud. I will immediately initiate a dispute resolution.
Speech | 109.700 | 113.625 | Speaker 1 | process for this transaction. As part of this process
Speech | 114.174 | 120.025 | Speaker 1 | We will investigate the transaction and if found to be fraud, they will be removed from your account.
Speech | 120.599 | 128.500 | Speaker 1 | And you will not be held liable for them. Additionally, I will be blocking your security purpose.
Speech | 129.074 | 132.300 | Speaker 1 | And issue a new one with a different card number.
Noise | 134.650 | 136.298 | - | -
Speech | 136.350 | 138.824 | Speaker 2 | Yeah Thank you for your action.
Speech | 139.425 | 144.598 | Speaker 2 | I appreciate your help in resolving this issue. What should I do next.
Speech | 145.098 | 150.699 | Speaker 2 | [filler]Do I need to provide any additional information or any documents for the dispute?
Speech | 150.900 | 158.824 | Speaker 1 | Your welcome ma'am. I am glad I could assist you. Regarding the dispute process, I will be sending you a dispute form via mail.
Speech | 159.300 | 160.875 | Speaker 1 | depending on your preference.
Speech | 161.400 | 173.949 | Speaker 1 | [filler]this form will outline the necessary steps and additional information or documentation required to complete the dispute. Once you receive the form, please fill it out, accurately.
Noise | 172.574 | 174.550 | - | -
Speech | 174.875 | 186.750 | Speaker 1 | provide any supporting documents requested, and return it was within the specific time frame. Our team will thoroughly investigate the matter and keep you updated on the process.
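The sample transcript above is verbatim and carries inline markers such as `[noise]`, `[filler]`, `(())` and `<PII>...</PII>`. This page does not define them; the sketch below assumes the common transcription conventions (bracketed tags for non-speech events, `(())` for unintelligible speech, `<PII>` for personally identifiable information) and shows one possible way to turn a segment into plain training text. The function name and the choice to mask PII with a single token are assumptions for this example.

```python
import re

BRACKET_TAG = re.compile(r"\[[a-z_]+\]")   # e.g. [noise], [filler]
UNINTELLIGIBLE = re.compile(r"\(\(\)\)")    # (())
PII_SPAN = re.compile(r"<PII>.*?</PII>")     # <PII>four two one three</PII>


def clean_transcript(text, pii_token="<PII>"):
    """Strip inline tags and mask PII spans in one transcript segment.

    Truncated-word markers such as 'tran~' are left untouched.
    """
    text = PII_SPAN.sub(pii_token, text)     # replace the whole span with one token
    text = BRACKET_TAG.sub(" ", text)        # drop non-speech tags
    text = UNINTELLIGIBLE.sub(" ", text)     # drop unintelligible markers
    return " ".join(text.split())            # collapse leftover whitespace
```

For example, `clean_transcript("[noise] Sure, my credit card number is <PII>four two one three</PII>.")` returns `"Sure, my credit card number is <PII>."`. Whether to mask, keep, or drop PII spans depends on the downstream task and your own data-handling policy.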

Dataset Demographics

Language

English

Language code

en-IN

Country

India

Accents

Chandigarh,...more

Gender Distribution

M:55, F:45

Age Group

18-70

Audio File Details

Environment

Silent, Noisy

Bit Depth

16 bit

Format

wav

Sample rate

8 kHz

Channel

Dual separate channel

Audio file duration

5-15 minutes
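Because the two parties are recorded on separate channels, many pipelines split each stereo file into two mono streams before feature extraction. The following is a minimal sketch using only the Python standard library; it assumes 16-bit interleaved PCM as listed above, and the function names are illustrative. Which channel holds the agent and which the customer is not stated on this page, so the code refers to them only as channel 1 and channel 2.

```python
import io
import wave
from array import array


def split_stereo(source):
    """Split a 16-bit stereo WAV into two mono PCM byte strings.

    Returns (channel_1_bytes, channel_2_bytes, sample_rate_hz).
    """
    with wave.open(source, "rb") as wav:
        if wav.getnchannels() != 2 or wav.getsampwidth() != 2:
            raise ValueError("expected 16-bit stereo audio")
        rate = wav.getframerate()
        frames = wav.readframes(wav.getnframes())
    samples = array("h")  # 2-byte items, one per sample
    samples.frombytes(frames)
    # Stereo PCM is interleaved: ch1, ch2, ch1, ch2, ...
    return samples[0::2].tobytes(), samples[1::2].tobytes(), rate


def write_mono(target, pcm_bytes, rate):
    """Write raw 16-bit PCM bytes as a mono WAV to a path or file object."""
    with wave.open(target, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)
        wav.setframerate(rate)
        wav.writeframes(pcm_bytes)
```

The samples are only regrouped, never reinterpreted, so the split does not change any audio values.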

Download Sample Speech Dataset Now!

Explore Audio Data, Metadata and Transcription to get more clarity and hands on experience of this dataset.


Start your AI/ML model creation journey with FutureBeeAI!
