English (India) Call Center Speech Dataset for Real Estate

The audio dataset includes call center conversations in Real Estate, featuring native English speakers from India, with detailed metadata and accurate transcriptions.

Category: Unscripted Call Center Conversations
Total Volume: 30 Speech Hours
Last updated: July 2023
Number of participants: 60


About this Off-the-shelf Speech Dataset

What’s Included

Welcome to the English Language Call Center Speech Dataset for the Real Estate domain. It is a specialized and comprehensive collection of voice data designed to enhance the development of call center speech recognition models specifically for the Real Estate industry.


With high-quality call center audio recordings, detailed metadata, and accurate transcriptions, it empowers researchers and developers to enhance natural language processing, conversational AI, and generative voice AI algorithms in the Real Estate domain. Moreover, it facilitates the creation of sophisticated voice assistants and voice bots tailored to the unique linguistic nuances found in the English language spoken in India.


Speech Data:

This training dataset comprises 30 hours of call center audio recordings covering a range of topics and scenarios in the Real Estate domain. It is intended for building robust and accurate customer service speech technology.


To curate realistic call center interactions, we collaborated with a diverse network of 60 expert native English speakers from different states/provinces of India. This collaborative effort ensures a balanced representation of Indian accents, dialects, and demographics, promoting inclusivity and reducing biases in the dataset.


Each audio recording captures an unscripted, spontaneous conversation between a call center agent and a customer, with call durations ranging from 5 to 15 minutes. The dataset includes both inbound and outbound calls, covering scenarios such as inquiries, promotional offers, complaints, technical support, and more. It also contains conversations with both positive and negative outcomes, which makes the collection more diverse and realistic.


The speech data is available in WAV format with the two speakers recorded on separate stereo channels, a bit depth of 16 bits, and a sample rate of 8 kHz, the standard rate for telephony audio. The recording environment is generally quiet, without background noise or echo.
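As a quick sanity check on delivered files, the audio properties listed above can be verified with Python's standard-library wave module. The snippet below is a minimal sketch under stated assumptions: the function names are illustrative, and the page does not say which channel carries the agent and which the customer, so the channels are simply returned in file order.

```python
import sys
import wave
from array import array

# Values stated on this page; adjust them for a custom delivery.
EXPECTED = {"channels": 2, "sample_width_bytes": 2, "sample_rate_hz": 8000}


def describe_wav(path):
    """Read the header fields of a WAV file."""
    with wave.open(path, "rb") as wav:
        return {
            "channels": wav.getnchannels(),
            "sample_width_bytes": wav.getsampwidth(),
            "sample_rate_hz": wav.getframerate(),
            "duration_s": wav.getnframes() / wav.getframerate(),
        }


def matches_spec(info, expected=EXPECTED):
    """Return True when every expected header value matches."""
    return all(info[key] == value for key, value in expected.items())


def split_channels(path):
    """Return (channel_1, channel_2) as arrays of signed 16-bit samples."""
    with wave.open(path, "rb") as wav:
        if wav.getnchannels() != 2 or wav.getsampwidth() != 2:
            raise ValueError("expected 16-bit stereo audio")
        samples = array("h")
        samples.frombytes(wav.readframes(wav.getnframes()))
    if sys.byteorder == "big":
        samples.byteswap()  # WAV PCM samples are stored little-endian
    # Stereo frames are interleaved: channel 1, channel 2, channel 1, ...
    return samples[0::2], samples[1::2]
```

Splitting the channels this way gives one mono stream per speaker, which is often convenient for per-speaker ASR training or diarization checks.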


Metadata:

In addition to the audio recordings, our dataset provides comprehensive metadata for each participant. This includes the participant’s age, gender, country, state, and dialect. Additionally, it includes metadata like domain, topic, call type, outcome, bit depth, and sample rate for each conversation.


The metadata serves as a powerful tool for understanding and characterizing the data, enabling informed decision-making in the development of English language call center speech recognition models for the Real Estate domain.


Transcription:

To facilitate your workflow, the dataset includes a manual verbatim transcription of each call center audio file in JSON format. Each transcription is segmented by speaker with start and end timestamps, includes non-speech labels and tags, and covers both the agent and the customer sides of the conversation.


These ready-to-use transcriptions accelerate the development of Real Estate call center conversational AI and ASR models for the English language.
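For ASR training, the inline tags usually need to be normalized before the text is used as a target. The sketch below handles the three tags visible in the dataset sample: [filler], <PII>...</PII>, and (()). It rests on assumptions: the function name and the placeholder token are illustrative, and (()) is treated here as an unintelligible-speech marker, which this page does not state explicitly.

```python
import re

FILLER = re.compile(r"\[filler\]")
PII = re.compile(r"<PII>.*?</PII>")
UNCLEAR = re.compile(r"\(\(\)\)")  # assumed to mark unintelligible speech


def clean_transcript(text, pii_token="[PII]"):
    """Drop filler and unclear markers, mask PII spans, tidy whitespace."""
    text = FILLER.sub(" ", text)
    text = PII.sub(lambda match: pii_token, text)
    text = UNCLEAR.sub(" ", text)
    return " ".join(text.split())


if __name__ == "__main__":
    line = ("[filler]I am [filler] <PII>Tina</PII>, talking from Thane "
            "district [filler] and I am looking for a two bedroom apartment.")
    print(clean_transcript(line))
```

Whether fillers should be dropped or kept as a dedicated token depends on the model being trained, so this choice is best treated as a configurable step rather than a fixed rule.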


Updates and Customization:

We understand the importance of collecting data in various environments to build robust ASR models. Therefore, our call center voice dataset is regularly updated with new audio data captured in diverse real-world conditions.


If you require a custom training dataset with specific environmental conditions, we can accommodate your request. We can provide voice data with customized sample rates ranging from 8 kHz to 48 kHz, allowing you to fine-tune your models for different audio recording setups. We can also customize the transcription to follow your specific guidelines and requirements, to further support your ASR development process.


License:

This Real Estate call center audio dataset is created by FutureBeeAI and is available for commercial use!


Conclusion:

Whether you are training or fine-tuning speech recognition models, advancing NLP algorithms, or building state-of-the-art voice assistants to improve customer experiences in the Real Estate sector, our dataset serves as a trusted resource to meet your goals.


Use Cases

ASR (Automatic Speech Recognition)
Conversational AI
Chatbot and voicebot creation
Language Modelling
TTS (Text-to-speech)
Speech Analytics

Dataset Sample(s)

ATTRIBUTES

Channel 1 | Channel 2 | Format
Female (29) | Female (28) | wav, json

TRANSCRIPTION

LABEL | START | END | CHANNEL | TRANSCRIPT
Noise | 0.024 | 0.299 | - | -
Noise | 1.024 | 1.974 | - | -
Speech | 2.649 | 8.298 | Speaker 1 | [filler]Hello, am I talking with dosti real estate?
Noise | 2.673 | 2.948 | - | -
Noise | 3.899 | 4.224 | - | -
Noise | 5.274 | 6.049 | - | -
Noise | 8.548 | 8.948 | - | -
Speech | 11.698 | 16.199 | Speaker 2 | Hello, good morning. Thank you for calling dosti real estate.
Speech | 16.774 | 18.774 | Speaker 2 | How may I assist you today?
Noise | 19.449 | 19.998 | - | -
Speech | 22.248 | 29.724 | Speaker 1 | Hi, I am interested in buying a property in the downtown area. Can you help me with that?
Noise | 30.449 | 30.748 | - | -
Speech | 32.198 | 32.798 | Speaker 2 | [filler]
Noise | 32.999 | 33.249 | - | -
Noise | 33.972 | 34.323 | - | -
Noise | 38.674 | 39.524 | - | -
Noise | 42.949 | 43.999 | - | -
Speech | 50.298 | 58.948 | Speaker 2 | Absolutely yes absolutely. I'll be happy to help you with that. Can you tell me a little bit
Speech | 59.722 | 65.447 | Speaker 2 | more about what are you looking for? First of all tell me your details first.
Noise | 60.347 | 60.823 | - | -
Noise | 64.572 | 65.224 | - | -
Speech | 66.349 | 70.322 | Speaker 2 | [filler]From where your talking, which area, what's your name
Speech | 70.774 | 73.524 | Speaker 2 | and what kind of property you're looking for?
Speech | 76.924 | 87.174 | Speaker 1 | [filler]I am [filler] <PII>Tina</PII>, talking from Thane district [filler] and I am looking for a two bedroom apartment.
Noise | 85.474 | 86.049 | - | -
Speech | 88.299 | 91.623 | Speaker 1 | with a balcony and a view of a city.
Speech | 93.822 | 96.998 | Speaker 2 | Okay. Can I know what is your budget around?
Speech | 98.674 | 103.024 | Speaker 1 | Okay, my budget is around [filler] four to five lakh.
Noise | 103.424 | 105.724 | - | -
Speech | 106.224 | 107.072 | Speaker 2 | (())
Speech | 107.498 | 108.498 | Speaker 2 | Great great.
Speech | 109.424 | 112.274 | Speaker 2 | We have several option that might interest you?
Speech | 112.774 | 116.974 | Speaker 2 | Can you tell me your preferred location?
Speech | 117.774 | 120.774 | Speaker 2 | and when do you plan on moving in?
Speech | 121.947 | 124.072 | Speaker 2 | Which location you want? Which area?
Noise | 125.649 | 125.974 | - | -
Speech | 126.697 | 137.622 | Speaker 1 | Okay, I am planning for a property in the Thane district only [filler] preferably near business district. Actually my business is located near Viviana mall.
Noise | 137.122 | 137.549 | - | -
Speech | 137.774 | 149.324 | Speaker 1 | So preferably, there I am looking for a property [filler] and I am flexible with my moving in day. But preferably if it would be in next three months then it would be great.
Noise | 139.774 | 140.049 | - | -
Noise | 149.723 | 149.997 | - | -
Speech | 151.526 | 152.372 | Speaker 2 | Okay
Speech | 152.949 | 156.247 | Speaker 2 | That side I'm having one property at Viviana mall but
Speech | 157.348 | 161.923 | Speaker 2 | the possession date of that said property is more than three month.
Speech | 163.448 | 170.323 | Speaker 2 | Are you okay with that? It'll come near about some end of the year, somewhere at the December.
Noise | 170.298 | 170.673 | - | -
Noise | 172.122 | 172.973 | - | -
Speech | 173.044 | 179.423 | Speaker 1 | Okay, but is it sure that I will get the flat at in December, the possession in December?
Noise | 179.247 | 179.573 | - | -
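Time-coded segments like these support simple speech analytics. The sketch below, using a few rows copied from the sample above, sums talk time per speaker and lists noise events that overlap a speech segment. The tuple layout and function names are assumptions made for illustration; the JSON schema of the delivered files is not described on this page.

```python
from collections import defaultdict

# Rows copied from the sample: (label, start_s, end_s, speaker).
SEGMENTS = [
    ("Noise", 0.024, 0.299, None),
    ("Speech", 2.649, 8.298, "Speaker 1"),
    ("Noise", 2.673, 2.948, None),
    ("Speech", 11.698, 16.199, "Speaker 2"),
    ("Speech", 16.774, 18.774, "Speaker 2"),
    ("Speech", 22.248, 29.724, "Speaker 1"),
]


def talk_time(segments):
    """Sum the duration of Speech segments for each speaker."""
    totals = defaultdict(float)
    for label, start, end, speaker in segments:
        if label == "Speech":
            totals[speaker] += end - start
    return dict(totals)


def noise_during_speech(segments):
    """Return (noise, speech) pairs whose time ranges overlap."""
    speech = [s for s in segments if s[0] == "Speech"]
    noise = [s for s in segments if s[0] == "Noise"]
    return [(n, s) for n in noise for s in speech
            if n[1] < s[2] and s[1] < n[2]]


if __name__ == "__main__":
    for speaker, seconds in talk_time(SEGMENTS).items():
        print(f"{speaker}: {seconds:.3f} s")
    for noise, speech in noise_during_speech(SEGMENTS):
        print(f"Noise at {noise[1]}-{noise[2]} s overlaps {speech[3]}")
```

Flagging overlapping noise can help when selecting cleaner segments for training or when evaluating model robustness on noisier ones.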


Dataset Demographics

Language: English
Language code: en-IN
Country: India
Accents: Chandigarh,...more
Gender Distribution: M:55, F:45
Age Group: 18-70

Audio File Details

Environment: Silent, Noisy
Bit Depth: 16 bit
Format: wav
Sample rate: 8 kHz
Channel: Dual separate channel
Audio file duration: 5-15 minutes

Download Sample Speech Dataset Now!

Explore the audio data, metadata, and transcription to get more clarity and hands-on experience with this dataset.

Download Free Dataset


Start your AI/ML model creation journey with FutureBeeAI!

Contact Us
