Bahasa (Indonesia) Call Center Speech Dataset for Real Estate

The audio dataset includes call center conversations in Real Estate, featuring native Bahasa speakers from Indonesia, with detailed metadata and accurate transcriptions.


Unscripted Call Center Conversations

Total Volume

40 Speech Hours

Last updated

July 2023

Number of participants

80

About this Off-the-shelf Speech Dataset

What’s Included

Welcome to the Bahasa Language Call Center Speech Dataset for the Real Estate domain. It is a specialized and comprehensive collection of voice data designed to enhance the development of call center speech recognition models specifically for the Real Estate industry.

With high-quality call center audio recordings, detailed metadata, and accurate transcriptions, it empowers researchers and developers to enhance natural language processing, conversational AI, and generative voice AI algorithms in the Real Estate domain. Moreover, it facilitates the creation of sophisticated voice assistants and voice bots tailored to the unique linguistic nuances found in the Bahasa language spoken in Indonesia.

Speech Data:

This training dataset comprises 40 hours of call center audio recordings covering a variety of topics and scenarios in the Real Estate domain. It is intended for building robust and accurate customer service speech technology.

To curate realistic call center interactions, we collaborated with a diverse network of 80 expert native Bahasa speakers from different states/provinces of Indonesia. This collaborative effort ensures a balanced representation of Indonesian accents, dialects, and demographics, promoting inclusivity and reducing biases in the dataset.

Each audio recording captures an unscripted, spontaneous conversation between a call center agent and a customer, with call durations ranging from 5 to 15 minutes. The dataset includes both inbound and outbound calls, covering scenarios such as inquiries, promotional offers, complaints, and technical support. It also contains conversations with both positive and negative outcomes, which makes the data more diverse and realistic.

The speech data is available in WAV format with stereo channels, a bit depth of 16 bits, and a sample rate of 8 kHz, ensuring high-quality audio for accurate analysis. The recording environment is generally quiet, without background noise or echo.
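As a quick sanity check on delivered files, the stated format (stereo, 16-bit, 8 kHz WAV) can be verified with Python's standard `wave` module. The sketch below is illustrative only: the helper names are hypothetical, the audio is a synthetic stand-in built in memory, and the assumption that each stereo channel carries one side of the call is not confirmed by the text above.

```python
import io
import math
import struct
import wave

# Format stated in the dataset description: stereo, 16-bit, 8 kHz.
EXPECTED = {"channels": 2, "sample_width_bytes": 2, "sample_rate_hz": 8000}


def describe_wav(source):
    """Read the header fields of a WAV file (path or file-like object)."""
    with wave.open(source, "rb") as wav:
        frames = wav.getnframes()
        rate = wav.getframerate()
        return {
            "channels": wav.getnchannels(),
            "sample_width_bytes": wav.getsampwidth(),
            "sample_rate_hz": rate,
            "duration_s": frames / rate,
        }


def matches_spec(info, expected=EXPECTED):
    """True when every expected header field has the expected value."""
    return all(info[key] == value for key, value in expected.items())


def split_channels(source):
    """Split 16-bit stereo audio into two mono byte strings (left, right)."""
    with wave.open(source, "rb") as wav:
        if wav.getnchannels() != 2 or wav.getsampwidth() != 2:
            raise ValueError("expected 16-bit stereo audio")
        data = wav.readframes(wav.getnframes())
    # Interleaved layout: 2 bytes left, 2 bytes right, repeated per frame.
    left = b"".join(data[i:i + 2] for i in range(0, len(data), 4))
    right = b"".join(data[i + 2:i + 4] for i in range(0, len(data), 4))
    return left, right


def synthetic_call(seconds=1, rate=8000):
    """Build a stand-in stereo WAV in memory: a tone on the left, silence on the right."""
    frames = b"".join(
        struct.pack("<hh", int(8000 * math.sin(2 * math.pi * 440 * n / rate)), 0)
        for n in range(seconds * rate)
    )
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(2)
        wav.setsampwidth(2)
        wav.setframerate(rate)
        wav.writeframes(frames)
    buffer.seek(0)
    return buffer


call = synthetic_call()
info = describe_wav(call)
print(info, "matches spec:", matches_spec(info))
```

At these settings, uncompressed audio occupies 8,000 samples × 2 bytes × 2 channels = 32,000 bytes per second, or roughly 115 MB per hour of stereo recording.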


Metadata:

In addition to the audio recordings, the dataset provides metadata for each participant: age, gender, country, state, and dialect. Each conversation also carries metadata such as domain, topic, call type, outcome, bit depth, and sample rate.

The metadata serves as a powerful tool for understanding and characterizing the data, enabling informed decision-making in the development of Bahasa language call center speech recognition models for the Real Estate domain.
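To illustrate how such metadata might be used to slice the data, here is a minimal sketch. The record layout is an assumption: the key names mirror the fields listed above, but `call_id`, the flat one-record-per-call structure, and every value shown are invented placeholders, not the delivered schema.

```python
from collections import Counter

# Invented placeholder records; the real delivery format may differ.
RECORDS = [
    {"call_id": "call_001", "age": 34, "gender": "F", "country": "Indonesia",
     "state": "placeholder_state_a", "dialect": "placeholder_dialect_a",
     "domain": "Real Estate", "topic": "rental inquiry",
     "call_type": "inbound", "outcome": "positive",
     "bit_depth": 16, "sample_rate_hz": 8000},
    {"call_id": "call_002", "age": 41, "gender": "M", "country": "Indonesia",
     "state": "placeholder_state_b", "dialect": "placeholder_dialect_b",
     "domain": "Real Estate", "topic": "promotional offer",
     "call_type": "outbound", "outcome": "negative",
     "bit_depth": 16, "sample_rate_hz": 8000},
    {"call_id": "call_003", "age": 27, "gender": "M", "country": "Indonesia",
     "state": "placeholder_state_a", "dialect": "placeholder_dialect_a",
     "domain": "Real Estate", "topic": "complaint",
     "call_type": "inbound", "outcome": "negative",
     "bit_depth": 16, "sample_rate_hz": 8000},
]


def select_calls(records, **criteria):
    """Keep the records whose fields equal every given criterion."""
    return [
        record for record in records
        if all(record.get(key) == value for key, value in criteria.items())
    ]


def distribution(records, field):
    """Count how often each value of a metadata field occurs."""
    return Counter(record[field] for record in records)


complaints = select_calls(RECORDS, call_type="inbound", outcome="negative")
print([record["call_id"] for record in complaints])
print(distribution(RECORDS, "gender"))
```

Checking distributions like this before training is one way to confirm that a chosen subset still covers the accents, call types, and outcomes a model is expected to handle.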


Transcription:

To facilitate your workflow, the dataset includes manual verbatim transcriptions of each call center audio file in JSON format. The transcriptions are segmented by speaker with time codes and include non-speech labels and tags, covering both the agent and customer sides of each conversation.

These ready-to-use transcriptions accelerate the development of Real Estate call center conversational AI and ASR models for the Bahasa language.
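The following sketch shows one way speaker-wise, time-coded JSON transcriptions could be consumed. The schema is hypothetical: the `segments`, `speaker`, `start`, `end`, and `text` keys, the bracketed `[noise]` tag style, and the sample utterances are all assumptions made for illustration, so the actual files should be checked against their own documentation.

```python
import json
import re
from collections import defaultdict

# Hypothetical transcript layout; key names and tag style are assumptions.
SAMPLE = """
{
  "segments": [
    {"speaker": "agent", "start": 0.0, "end": 2.5,
     "text": "Selamat pagi, ada yang bisa saya bantu?"},
    {"speaker": "customer", "start": 2.5, "end": 6.0,
     "text": "[noise] Saya mau tanya soal sewa rumah."},
    {"speaker": "agent", "start": 6.0, "end": 7.0, "text": "Baik."}
  ]
}
"""


def load_segments(json_text):
    """Parse a transcript and return its segments ordered by start time."""
    segments = json.loads(json_text)["segments"]
    return sorted(segments, key=lambda segment: segment["start"])


def talk_time_by_speaker(segments):
    """Sum the duration in seconds of each speaker's segments."""
    totals = defaultdict(float)
    for segment in segments:
        totals[segment["speaker"]] += segment["end"] - segment["start"]
    return dict(totals)


def strip_tags(text):
    """Remove bracketed non-speech tags and normalize whitespace."""
    return " ".join(re.sub(r"\[[^\]]*\]", "", text).split())


segments = load_segments(SAMPLE)
print(talk_time_by_speaker(segments))
for segment in segments:
    print(segment["speaker"], strip_tags(segment["text"]))
```

Removing non-speech tags yields clean text for language modeling, while keeping them (and the time codes) is usually preferable when preparing ASR training targets aligned to the audio.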

Updates and Customization:

We understand the importance of collecting data in various environments to build robust ASR models. Therefore, our call center voice dataset is regularly updated with new audio data captured in diverse real-world conditions.

If you require a custom training dataset with specific environmental conditions, we can accommodate your request. We can provide voice data at custom sample rates ranging from 8 kHz to 48 kHz, so that you can fine-tune your models for different audio recording setups. We can also customize the transcription to follow your specific guidelines and requirements, to further support your ASR development process.


License:

This Real Estate call center audio dataset is created by FutureBeeAI and is available for commercial use!


Whether you are training or fine-tuning speech recognition models, advancing NLP algorithms, or building state-of-the-art voice assistants to improve customer experiences in the Real Estate sector, our dataset serves as a trusted resource to meet your goals.

Use Cases

Automatic Speech Recognition

Conversational AI

Chatbot & voicebot creation

Language Modeling

Text-to-speech

Speech Analytics

Dataset Sample(s)

Samples will be available soon!

Contact us to get the samples immediately for this dataset.


Dataset Demographics



Language code




Gender Distribution

M:55, F:45

Age Group


Audio File Details


Silent, Noisy

Bit Depth

16 bit



Sample rate

8 kHz


Dual separate channel

Audio file duration

5-15 minutes
