Gujarati Scripted Monologue Speech Dataset for the Healthcare Domain

This audio dataset comprises scripted monologue speech in the healthcare domain, recorded by native Gujarati speakers from India. It includes audio recordings, detailed metadata, and verified transcriptions.

  • Category: Scripted Prompt Recordings
  • Total Volume: 6000+ prompts
  • Last updated: July 2025
  • Number of participants: 60+

About this Off-the-shelf Speech Dataset

Introduction

The Gujarati Scripted Monologue Speech Dataset for the Healthcare Domain is a voice dataset built to support the development and deployment of Gujarati automatic speech recognition (ASR) systems, with a focus on real-world healthcare interactions.

Speech Data

This dataset includes over 6,000 high-quality scripted audio prompts recorded in Gujarati, representing typical voice interactions found in the healthcare industry. The data is tailored for use in voice technology systems that power virtual assistants, patient-facing AI tools, and intelligent customer service platforms.

Participant Diversity

  • Speakers: 60 native Gujarati speakers.
  • Regional Balance: Participants are sourced from multiple regions across Gujarat, reflecting diverse dialects and linguistic traits.
  • Demographics: A mix of male and female participants in a 60:40 ratio, aged 18 to 70 years.

Recording Specifications

  • Nature of Recordings: Scripted monologues based on healthcare-related use cases.
  • Duration: Each clip runs from 5 to 30 seconds, offering short, context-rich speech samples.
  • Audio Format: WAV files recorded in mono, with 16-bit depth and sample rates of 8 kHz and 16 kHz.
  • Environment: Recorded in clean, echo-free spaces for clear, noise-free audio capture.
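
The recording specifications above can be checked programmatically when the files arrive. The following is a minimal sketch using Python's standard `wave` module; the function name `check_wav` and the constant names are illustrative assumptions, not part of the dataset's tooling.

```python
import wave
from pathlib import Path

# Values taken from the recording specifications listed above.
ALLOWED_SAMPLE_RATES = {8000, 16000}   # 8 kHz and 16 kHz
EXPECTED_CHANNELS = 1                  # mono
EXPECTED_SAMPLE_WIDTH = 2              # 16-bit depth = 2 bytes per sample
MIN_SECONDS, MAX_SECONDS = 5.0, 30.0   # stated clip duration range


def check_wav(path):
    """Return a list of spec violations for one WAV file (empty if none).

    `check_wav` is a hypothetical helper written for illustration.
    """
    problems = []
    with wave.open(str(Path(path)), "rb") as wav:
        channels = wav.getnchannels()
        sample_width = wav.getsampwidth()
        rate = wav.getframerate()
        duration = wav.getnframes() / rate

    if channels != EXPECTED_CHANNELS:
        problems.append(f"expected mono, found {channels} channels")
    if sample_width != EXPECTED_SAMPLE_WIDTH:
        problems.append(f"expected 16-bit samples, found {sample_width * 8}-bit")
    if rate not in ALLOWED_SAMPLE_RATES:
        problems.append(f"unexpected sample rate {rate} Hz")
    if not MIN_SECONDS <= duration <= MAX_SECONDS:
        problems.append(f"duration {duration:.2f} s outside 5 to 30 s")
    return problems
```

Calling `check_wav` on each delivered file and logging any non-empty result is one simple way to confirm a delivery matches the stated format before training begins.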

    Topic Coverage

    The prompts span a broad range of healthcare-specific interactions, such as:

  • Patient check-in and follow-up communication
  • Appointment booking and cancellation dialogues
  • Insurance and regulatory support queries
  • Medication, test results, and consultation discussions
  • General health tips and wellness advice
  • Emergency and urgent care communication
  • Technical support for patient portals and apps
  • Domain-specific scripted statements and FAQs

    Contextual Depth

    To maximize authenticity, the prompts integrate linguistic elements and healthcare-specific terms such as:

  • Names: Gender- and region-appropriate Gujarati names
  • Addresses: Varied local address formats spoken naturally
  • Dates & Times: References to appointment dates, times, follow-ups, and schedules
  • Medical Terminology: Common medical procedures, symptoms, and treatment references
  • Numbers & Measurements: Health data like dosages, vitals, and test result values
  • Healthcare Institutions: Names of clinics, hospitals, and diagnostic centers

    These elements make the dataset well suited for training AI systems to understand and respond to natural healthcare-related speech patterns.

    Transcription

    Every audio recording is accompanied by a verbatim, manually verified transcription.

  • Content: The transcription mirrors the exact scripted prompt recorded by the speaker.
  • Format: Files are delivered in plain text (.TXT) format with consistent naming conventions for seamless integration.
  • Quality Control: Transcriptions are created and reviewed by native Gujarati transcribers to ensure precision and consistency.
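
Because each recording has a matching plain-text transcription, audio and text can be paired by file name. The sketch below assumes that a transcript shares its audio file's name apart from the extension and that the text is UTF-8 encoded; neither detail is confirmed above, and `pair_audio_with_transcripts` is a hypothetical helper.

```python
from pathlib import Path


def _files_with_suffix(directory, suffix):
    """List files in `directory` whose extension matches, ignoring case."""
    return sorted(
        p for p in Path(directory).iterdir()
        if p.is_file() and p.suffix.lower() == suffix
    )


def pair_audio_with_transcripts(audio_dir, transcript_dir):
    """Match each WAV file to the TXT file with the same stem.

    Assumption: "rec_001.wav" pairs with "rec_001.txt" (or ".TXT").
    Returns (pairs, missing): `pairs` holds (audio_path, transcript_text)
    tuples, and `missing` holds audio paths without a transcript.
    """
    transcripts = {p.stem: p for p in _files_with_suffix(transcript_dir, ".txt")}
    pairs, missing = [], []
    for wav_path in _files_with_suffix(audio_dir, ".wav"):
        txt_path = transcripts.get(wav_path.stem)
        if txt_path is None:
            missing.append(wav_path)
            continue
        # UTF-8 is assumed here because the transcripts are Gujarati text.
        text = txt_path.read_text(encoding="utf-8").strip()
        pairs.append((wav_path, text))
    return pairs, missing
```

The resulting pairs can be written out as a training manifest, and a non-empty `missing` list flags files to query before use.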

    Metadata

    Comprehensive metadata is included for each audio clip and participant, providing full traceability and analysis capabilities.

  • Participant Metadata: Unique speaker ID, age, gender, country, region/state, and dialect
  • Recording Metadata: Text transcript, recording environment details, device specifications, audio format, sample rate, and bit depth

    This level of detail allows developers to fine-tune models for regional accents, demographics, and acoustic variations.
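
As an illustration of how the participant metadata might be used, the sketch below selects speakers by dialect, gender, or age to build a targeted fine-tuning subset. It assumes the metadata is available as a CSV file with columns named `speaker_id`, `age`, `gender`, and `dialect`; the delivery format and column names are not specified above, so treat them as placeholders.

```python
import csv


def load_participants(csv_file):
    """Read participant rows from an open CSV file into a list of dicts.

    Assumed (hypothetical) columns: speaker_id, age, gender, dialect.
    """
    return list(csv.DictReader(csv_file))


def select_speakers(participants, dialect=None, gender=None, age_range=None):
    """Return speaker IDs matching every filter that is not None.

    `age_range` is an inclusive (low, high) tuple in years.
    """
    selected = []
    for row in participants:
        if dialect is not None and row["dialect"] != dialect:
            continue
        if gender is not None and row["gender"] != gender:
            continue
        if age_range is not None:
            low, high = age_range
            if not low <= int(row["age"]) <= high:
                continue
        selected.append(row["speaker_id"])
    return selected
```

For example, `select_speakers(rows, dialect="Kathiawari")` would return the IDs of speakers tagged with that accent, which can then be used to filter the audio clips for accent-specific evaluation.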

    Applications & Use Cases

    This dataset supports a wide array of healthcare-related AI and speech technology use cases:

  • ASR Model Training: Improve model accuracy for medical voice input and queries.
  • Voice Synthesis & TTS: Train synthetic voice models for interactive health applications.
  • Voice Assistants: Build intelligent healthcare bots that speak Gujarati.
  • Medical Chatbots: Enable more accurate patient communication through chatbot systems.
  • Entity Recognition (NER): Teach models to detect key medical data like drug names, appointment dates, and symptoms.
  • NLP & Language Understanding: Enhance downstream tasks like sentiment analysis and medical intent classification.

    Secure & Ethical Collection

    The dataset was created using FutureBeeAI’s proprietary platform, “Yugo,” ensuring full compliance and security throughout the process.

  • Data collection was fully consented, anonymized, and conducted ethically.
  • No personally identifiable information (PII) is captured in any part of the dataset.
  • The dataset remains securely stored and processed on our platform, in accordance with international data protection standards.

    License

    This Gujarati Scripted Monologue Speech Dataset for the Healthcare Domain is available for commercial licensing, enabling you to confidently develop and deploy speech AI solutions in the medical and healthcare sectors.

    Use Cases

  • ASR
  • Conversational AI
  • Chatbot
  • TTS
  • Speech Analytics
  • Mobile Speech

    Dataset Details

  • Language: Gujarati
  • Language code: gu-IN
  • Country: India
  • Accents: Kathiawari, Amdawadi Gujarati, and more
  • Gender Distribution: M:60, F:40
  • Age Group: 18-70 Years

    File Details

  • Environment: Silent
  • Bit Depth: 16 bit
  • Sample rate: 8 kHz & 16 kHz
  • Channel: Mono
  • Audio file duration: 5 to 30 seconds
