American English Call Center Speech Dataset for Healthcare

This American English speech dataset features real-world call center conversations from the Healthcare domain. With detailed metadata and accurate transcriptions, it’s designed to power ASR systems, voice AI, and conversational agents.

About this Off-the-shelf Speech Dataset

Introduction

This US English Call Center Speech Dataset for the Healthcare industry is purpose-built to accelerate the development of English speech recognition, spoken language understanding, and conversational AI systems. With 30 hours of unscripted, real-world conversations, it provides the linguistic and contextual depth needed to build high-performance ASR models for medical and wellness-related customer service.

Created by FutureBeeAI, this dataset empowers voice AI teams, NLP researchers, and data scientists to develop domain-specific models for hospitals, clinics, insurance providers, and telemedicine platforms.

Speech Data

The dataset features 30 hours of dual-channel call center conversations between native US English speakers. The recordings cover a variety of healthcare support topics, supporting the development of speech technologies that are contextually aware and linguistically rich.

•Participant Diversity:

•Speakers: 60 verified native US English speakers from our contributor community.

•Regions: Speakers from a range of states across the United States of America, to ensure broad dialectal representation.

•Participant Profile: Age range of 18–70 with a gender mix of 60% male and 40% female.

•Recording Details:

•Conversation Nature: Naturally flowing, unscripted conversations.

•Call Duration: Each session ranges from 5 to 15 minutes.

•Audio Format: WAV format, stereo, 16-bit depth at 8 kHz and 16 kHz sample rates.

•Recording Environment: Captured in clear conditions without background noise or echo.
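
As a quick sanity check on delivered files, the audio specification above can be verified with Python's standard `wave` module. This is a minimal sketch, not part of any official dataset tooling: the function names `describe_wav`, `matches_spec`, and `split_channels` are illustrative, and because the listing does not say which channel carries the agent and which the customer, the code simply returns the two channels in file order.

```python
import sys
import wave
from array import array


def describe_wav(path):
    """Read basic format properties from a WAV file header."""
    with wave.open(path, "rb") as wf:
        return {
            "channels": wf.getnchannels(),
            "bit_depth": wf.getsampwidth() * 8,
            "sample_rate_hz": wf.getframerate(),
            "duration_s": wf.getnframes() / wf.getframerate(),
        }


def matches_spec(info):
    """Check against the listed spec: stereo, 16-bit, 8 kHz or 16 kHz."""
    return (
        info["channels"] == 2
        and info["bit_depth"] == 16
        and info["sample_rate_hz"] in (8000, 16000)
    )


def split_channels(path):
    """Return (first_channel, second_channel) as arrays of 16-bit samples.

    Which channel holds which speaker is an assumption left to the reader.
    """
    with wave.open(path, "rb") as wf:
        if wf.getnchannels() != 2 or wf.getsampwidth() != 2:
            raise ValueError("expected a 16-bit stereo WAV file")
        raw = wf.readframes(wf.getnframes())
    samples = array("h")
    samples.frombytes(raw)
    if sys.byteorder == "big":
        samples.byteswap()  # WAV PCM data is little-endian
    # Stereo frames are interleaved: ch0, ch1, ch0, ch1, ...
    return samples[0::2], samples[1::2]
```

Splitting the channels first lets each side of the conversation be transcribed or analyzed separately, which is the usual reason for recording call center audio in dual-channel form.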

Topic Diversity

The dataset spans inbound and outbound calls, capturing a broad range of healthcare-specific interactions and sentiment types (positive, neutral, negative).

•Inbound Calls:

•Appointment Scheduling

•New Patient Registration

•Surgical Consultation

•Dietary Advice and Consultations

•Insurance Coverage Inquiries

•Follow-up Treatment Requests, and more

•Outbound Calls:

•Appointment Reminders

•Preventive Care Campaigns

•Test Results & Lab Reports

•Health Risk Assessment Calls

•Vaccination Updates

•Wellness Subscription Outreach, and more

These real-world interactions help build speech models that understand healthcare domain nuances and user intent.

Transcription

Every audio file is accompanied by high-quality, manually created transcriptions in JSON format.

•Transcription Includes:

•Speaker-identified Dialogues

•Time-coded Segments

•Non-speech Annotations (e.g., silence, cough)

•High transcription accuracy, with a word error rate below 5%, backed by dual-layer QA checks.
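
For readers unfamiliar with the metric, word error rate (WER) is conventionally the word-level edit distance (substitutions, deletions, and insertions) between a reference and a hypothesis, divided by the number of reference words. The sketch below shows that standard definition; it is not a description of how the vendor's QA process measures accuracy, and the lowercasing and whitespace tokenization are simplifying assumptions.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by reference length."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between ref[:i-1] and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # distance from ref[:i] to an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # substitution or match
            ))
        prev = curr
    return prev[-1] / len(ref)


# One dropped word out of five reference words gives a WER of 0.2.
print(word_error_rate("the patient has a fever", "the patient has fever"))
```

Under this definition, a WER below 5% corresponds to fewer than one word-level error per 20 reference words on average.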

Metadata

Each conversation and speaker includes detailed metadata to support fine-tuned training and analysis.

•Participant Metadata: ID, gender, age, region, accent, and dialect.

•Conversation Metadata: Topic, sentiment, call type, sample rate, and technical specs.
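
Conversation metadata of this kind is typically used to select training subsets, for example only inbound calls with negative sentiment at 8 kHz. The sketch below illustrates the idea. The `ConversationMeta` record, its field names, and the example values are hypothetical: they mirror the fields listed above but do not reflect the dataset's actual file layout or label vocabulary.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class ConversationMeta:
    # Hypothetical record layout based on the listed metadata fields.
    conversation_id: str
    topic: str
    sentiment: str
    call_type: str
    sample_rate_hz: int


def select_conversations(
    records: Iterable[ConversationMeta],
    *,
    call_type: Optional[str] = None,
    sentiment: Optional[str] = None,
    sample_rate_hz: Optional[int] = None,
) -> List[ConversationMeta]:
    """Keep records matching every criterion that is not None."""
    selected = []
    for rec in records:
        if call_type is not None and rec.call_type != call_type:
            continue
        if sentiment is not None and rec.sentiment != sentiment:
            continue
        if sample_rate_hz is not None and rec.sample_rate_hz != sample_rate_hz:
            continue
        selected.append(rec)
    return selected


# Illustrative records only; IDs and label spellings are made up.
catalog = [
    ConversationMeta("c001", "Appointment Scheduling", "positive", "inbound", 8000),
    ConversationMeta("c002", "Insurance Coverage Inquiries", "negative", "inbound", 16000),
    ConversationMeta("c003", "Appointment Reminders", "neutral", "outbound", 8000),
]
inbound_8k = select_conversations(catalog, call_type="inbound", sample_rate_hz=8000)
print([rec.conversation_id for rec in inbound_8k])
```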

Usage and Applications

This dataset can be used across a range of healthcare and voice AI use cases:

•Automatic Speech Recognition (ASR): Fine-tune medical English speech-to-text systems.

•Speech Analytics: Derive actionable insights from patient interactions.

•Voice Assistants & Chatbots: Build empathetic, healthcare-specific virtual assistants.

•Sentiment Analysis: Understand patient concerns and emotional tone.

•Generative AI: Train models to summarize, respond to, or simulate healthcare conversations.

Secure and Ethical Collection

•All data was captured using FutureBeeAI’s secure, proprietary platform “Yugo.”

•No personally identifiable information is included.

•Fully compliant with global data ethics and healthcare data privacy standards.

•Copyright-free and safe for commercial or research use.

Updates and Customization

We continuously enrich this dataset and offer fully customizable collection services:

•Customization Options:

•Recording Conditions: Choose silent or realistic healthcare settings.

•Sample Rate: Flexible options from 8 kHz to 48 kHz.

•Transcription Format: Adaptable to your formatting and labeling needs.

License

This Healthcare call center speech dataset is created by FutureBeeAI and is available under a commercial license for enterprise and research deployment.

Use Cases

Call Center Conversational AI

Use of speech data for Automatic Speech Recognition

ASR

Chatbot

Language Modelling

TTS

Speech Analytics

Dataset Details

Language: English

Language code: en-us

Country: USA

Accents: Arizona, California, Colorado, Connecticut, Delaware, Florida, Georgia, Hawaii, Illinois, Indiana, Iowa, Kansas, Kentucky, Louisiana, Maine, Maryland, Massachusetts, Michigan, Minnesota, Mississippi, Alabama, Alaska

Gender Distribution: M: 60%, F: 40%

Age Group: 18-70 Years