Urdu (Pakistan) Scripted Monologue Speech Dataset for Healthcare Domain

The audio dataset comprises scripted monologue speech data in the Healthcare domain, featuring native Urdu speakers from Pakistan. It includes speech data, detailed metadata, and accurate transcriptions.

  • Category: Scripted Prompt Recordings
  • Total Volume: 6000+ prompts
  • Last updated: July 2024
  • Number of participants: 60+

About this Off-the-shelf Speech Dataset

Introduction

Welcome to the Urdu Scripted Monologue Speech Dataset for the Healthcare Domain. This meticulously curated dataset is designed to advance the development of Urdu language speech recognition models, particularly for the Healthcare industry.

Speech Data

This training dataset comprises over 6,000 high-quality scripted prompt recordings in Urdu. These recordings cover various topics and scenarios relevant to the Healthcare domain, designed to build robust and accurate customer service speech technology.

  • Participant Diversity:
      • Speakers: 60 native Urdu speakers from different regions of Pakistan.
      • Regions: Speakers are drawn from multiple regions to give a balanced representation of Urdu accents, dialects, and demographics.
      • Participant Profile: Participants range from 18 to 70 years old, with a male-to-female ratio of 60:40.
  • Recording Details:
      • Recording Nature: Audio recordings of scripted prompts/monologues.
      • Audio Duration: Each recording is 5 to 30 seconds long.
      • Formats: WAV format, mono channel, 16-bit depth, at sample rates of 8 kHz and 16 kHz.
      • Environment: Recordings are made in quiet settings, free of background noise and echo.
  • Topic Diversity: The dataset covers a wide range of topics and conversational scenarios across the Healthcare sector. Topics include:
      • Patient Interactions
      • Medical Consultations
      • Healthcare Services Inquiries
      • Technical Support
      • General Information and Advice
      • Regulatory and Compliance Queries
      • Emergency and Urgent Care
      • Domain Specific Statements
  • Other Elements: To enhance realism and utility, the scripted prompts incorporate elements commonly encountered in Healthcare interactions:
      • Names: Region-specific male and female names in various formats.
      • Addresses: Region-specific addresses in different spoken formats.
      • Dates & Times: Dates and times in various healthcare contexts, such as appointment dates and medication schedules.
      • Medical Terms: Medical terminology relevant to diagnoses, treatments, and procedures.
      • Numbers & Measurements: Numbers and measurements related to dosages, test results, and medical statistics.
      • Healthcare Facilities: Names of hospitals, clinics, and medical institutions.

Each scripted prompt is crafted to reflect real-life scenarios encountered in the Healthcare domain, so that it is applicable to training natural language processing and speech recognition models.
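
The recording specifications listed above (WAV, mono, 16-bit, 8 kHz or 16 kHz, 5 to 30 seconds) can be checked on delivery with a short script. The following is a minimal sketch using Python's standard `wave` module. The function name `check_wav` and the choice to return a list of problems are illustrative assumptions, not part of any tooling shipped with the dataset.

```python
import wave

# Values taken from the dataset description.
EXPECTED_CHANNELS = 1             # mono
EXPECTED_SAMPLE_WIDTH_BYTES = 2   # 16-bit samples
EXPECTED_SAMPLE_RATES = {8000, 16000}
MIN_SECONDS, MAX_SECONDS = 5.0, 30.0


def check_wav(path):
    """Return a list of human-readable problems; an empty list means the file matches."""
    with wave.open(str(path), "rb") as wf:
        channels = wf.getnchannels()
        width = wf.getsampwidth()
        rate = wf.getframerate()
        frames = wf.getnframes()

    duration = frames / rate if rate else 0.0
    problems = []
    if channels != EXPECTED_CHANNELS:
        problems.append(f"expected mono, found {channels} channels")
    if width != EXPECTED_SAMPLE_WIDTH_BYTES:
        problems.append(f"expected 16-bit samples, found {8 * width}-bit")
    if rate not in EXPECTED_SAMPLE_RATES:
        problems.append(f"unexpected sample rate: {rate} Hz")
    if not MIN_SECONDS <= duration <= MAX_SECONDS:
        problems.append(f"duration {duration:.2f} s is outside 5 to 30 s")
    return problems
```

Calling `check_wav("some_file.wav")` on each delivered file and logging any non-empty result gives a quick sanity check before training. The duration bounds are treated as hard limits here, which may be stricter than the description intends.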

Transcription Data

In addition to the audio recordings, the dataset includes text files with verbatim transcriptions of each audio file. These transcriptions are essential for training accurate speech recognition models.

  • Content: Each text file contains the exact scripted prompt corresponding to its audio file, ensuring consistency.
  • Format: Transcriptions are provided in plain text (.TXT) format, with files named to match their associated audio files for easy reference.
  • Quality: All transcriptions are verified for accuracy and consistency by native Urdu transcribers.
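
Because each transcription file is named to match its audio file, the two can be paired by filename stem. The sketch below assumes (hypothetically) that audio and transcripts sit in two directories and that the text files are UTF-8 encoded; neither detail is stated in the description. Extensions are compared case-insensitively so that both `.txt` and `.TXT` are picked up.

```python
from pathlib import Path


def pair_audio_with_transcripts(audio_dir, transcript_dir):
    """Pair each WAV file with the transcript that shares its filename stem.

    Returns (pairs, missing): pairs is a list of (wav_path, text) tuples,
    missing is a list of WAV filenames that have no matching transcript.
    """
    # Index transcripts by stem, e.g. "rec_001.TXT" -> "rec_001".
    transcripts = {
        p.stem: p
        for p in Path(transcript_dir).iterdir()
        if p.suffix.lower() == ".txt"
    }

    wav_files = sorted(
        p for p in Path(audio_dir).iterdir() if p.suffix.lower() == ".wav"
    )

    pairs, missing = [], []
    for wav_path in wav_files:
        txt_path = transcripts.get(wav_path.stem)
        if txt_path is None:
            missing.append(wav_path.name)
        else:
            # Encoding is an assumption; adjust if the files use another one.
            text = txt_path.read_text(encoding="utf-8").strip()
            pairs.append((wav_path, text))
    return pairs, missing
```

The `pairs` list can feed directly into a training manifest, while a non-empty `missing` list flags files to raise with the provider.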
Metadata

The dataset provides comprehensive metadata for each audio recording and participant:

  • Participant Metadata: Unique identifier, age, gender, country, state, and dialect.
  • Other Metadata: Recording transcript, recording environment, device details, sample rate, bit depth, and file format.

This metadata helps in understanding and characterizing the data, enabling informed decision-making in the development of Urdu language speech recognition models.
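
As one illustration of how participant metadata can be used to characterize the data, the sketch below summarizes gender counts and the age range. The description does not say how the metadata is delivered, so the CSV layout, the column names (`participant_id`, `age`, `gender`, and so on), and the sample rows are all invented for this example.

```python
import csv
import io
from collections import Counter


def summarize_participants(rows):
    """Summarize an iterable of participant records (dicts with 'age' and 'gender')."""
    rows = list(rows)
    genders = Counter(r["gender"].strip().lower() for r in rows)
    ages = [int(r["age"]) for r in rows]
    return {
        "participants": len(rows),
        "gender_counts": dict(genders),
        "age_range": (min(ages), max(ages)) if ages else None,
    }


# Made-up rows in a hypothetical layout, for illustration only.
SAMPLE_CSV = """participant_id,age,gender,country,state,dialect
P001,24,Male,Pakistan,Punjab,example-dialect-a
P002,61,Female,Pakistan,Sindh,example-dialect-b
P003,35,Male,Pakistan,Punjab,example-dialect-a
"""

summary = summarize_participants(csv.DictReader(io.StringIO(SAMPLE_CSV)))
print(summary)
```

The same pattern extends to grouping by dialect or state, for example to check that a train/test split preserves the speaker mix.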

Usage and Applications

This dataset is a versatile resource for various applications within speech recognition, natural language processing, and AI-driven conversational technologies.

  • Speech Recognition Model Training: High-quality audio recordings and precise transcriptions for training and fine-tuning Urdu speech recognition models.
  • Voice Synthesis: The diverse and high-quality audio data can train generative AI models for creating synthetic voices.
  • Voice Assistants: Ideal for training voice assistants tailored to the Healthcare domain.
  • Chatbots: Transcription data can train conversational models, enabling chatbots to respond to customer queries effectively.
  • Entity Recognition: Sentences include names, dates, currencies, and other domain-specific entities for training NLP models for named entity recognition (NER) tasks.
  • Language Understanding: Improve language understanding applications like sentiment analysis and topic modeling within the Healthcare sector.
Secure and Ethical Collection

  • Our proprietary data collection and transcription platform, “Yugo,” was used throughout the dataset creation process.
  • Data remained within our secure platform, ensuring data security and confidentiality.
  • The data collection process adhered to strict ethical guidelines, ensuring the privacy and consent of all participants.
  • The dataset does not include any personally identifiable information about any participant, making it safe to use.

License

This Urdu Scripted Monologue Speech Dataset, created by FutureBeeAI, is available for commercial use.

Use Cases

  • Automatic Speech Recognition (ASR)
  • Conversational AI
  • Chatbot
  • Language modelling
  • Text-to-Speech (TTS)
  • Speech analytics

Dataset Sample(s)

Samples will be available soon!

Contact us to get the samples immediately for this dataset.


Dataset Demographics

  • Language: Urdu
  • Language code: ur
  • Country: Pakistan
  • Accents: Dakhni, ...
  • Gender Distribution: M:60, F:40
  • Age Group: 18-70

Audio File Details

  • Environment: Silent
  • Bit Depth: 16 bit
  • Sample rate: 8 kHz and 16 kHz
  • Channel: Mono
  • Audio file duration: 5 to 30 seconds

Start your AI/ML model creation journey with FutureBeeAI!
