Gujarati (India) Scripted Monologue Speech Dataset for Healthcare

The audio dataset includes scripted monologue speech data in the Healthcare domain, featuring native Gujarati speakers from India, with detailed metadata and accurate transcriptions.

Category

Scripted Utterance Recordings

Total Volume

6000+ prompts

Last updated

July 2023

Number of participants

60

Get this Speech Dataset

About this Off-the-shelf Speech Dataset

What’s Included

Welcome to the Gujarati Language Scripted Monologue Speech Dataset for the Healthcare Domain. It is a diverse collection of single-utterance voice data designed to advance the development of Gujarati speech recognition models for the Healthcare industry.

With high-quality audio recordings, detailed metadata, and accurate transcriptions, it helps researchers and developers improve natural language processing, conversational AI, and generative voice AI algorithms in the Healthcare domain. It also supports the creation of voice assistants and voice bots tailored to the linguistic nuances of Gujarati as spoken in India.

Speech Data:

This training dataset consists of 6000+ high-quality scripted single-sentence recordings in the Gujarati language. The sentences contain elements such as person names, organization names, currencies, dates, times, and locations, which makes them useful for developing robust natural language processing algorithms.

The dataset contains the voices of 60 native Gujarati speakers from different parts of Gujarat. This spread of speakers is intended to give a balanced representation of accents and demographics, reducing bias and promoting inclusivity.

Each audio recording is roughly 5 to 30 seconds long. The speech data is available in WAV format as mono files with a bit depth of 16 bits and a sample rate of 48 kHz. The recording environment is generally quiet, without background noise or echo.
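As a minimal sketch of how a downloaded file could be checked against the format described above, the following uses Python's standard `wave` module. The function name `check_wav` and the constants are illustrative assumptions, not part of the dataset or of any FutureBeeAI tooling; the expected values are simply the ones stated in this description.

```python
import wave

# Expected values taken from the dataset description (assumed to apply to every file).
EXPECTED_CHANNELS = 1          # mono
EXPECTED_SAMPLE_WIDTH = 2      # bytes per sample, i.e. 16 bits
EXPECTED_SAMPLE_RATE = 48_000  # Hz
MIN_SECONDS, MAX_SECONDS = 5.0, 30.0


def check_wav(path):
    """Return a list of problems found in the WAV file; an empty list means it matches."""
    with wave.open(path, "rb") as wav:
        channels = wav.getnchannels()
        width = wav.getsampwidth()
        rate = wav.getframerate()
        frames = wav.getnframes()

    duration = frames / rate if rate else 0.0
    problems = []
    if channels != EXPECTED_CHANNELS:
        problems.append(f"expected {EXPECTED_CHANNELS} channel, found {channels}")
    if width != EXPECTED_SAMPLE_WIDTH:
        problems.append(f"expected {EXPECTED_SAMPLE_WIDTH * 8}-bit samples, found {width * 8}-bit")
    if rate != EXPECTED_SAMPLE_RATE:
        problems.append(f"expected {EXPECTED_SAMPLE_RATE} Hz, found {rate} Hz")
    if not MIN_SECONDS <= duration <= MAX_SECONDS:
        problems.append(f"duration {duration:.2f} s is outside {MIN_SECONDS}-{MAX_SECONDS} s")
    return problems
```

Because the stated duration is approximate, the duration check is best treated as a warning rather than a hard failure.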

Metadata:

In addition to the audio recordings, the dataset provides metadata for each participant: age, gender, country, state, and dialect. Technical metadata such as recording device details, bit depth, and sample rate is also provided.

The metadata serves as a valuable tool for understanding and characterizing the data, facilitating informed decision-making in the development of Gujarati speech recognition models.

Transcription (Text File):

This dataset provides a text file containing the scripted prompt for each audio file. Each transcription is delivered in TXT format and named to match its corresponding audio file.
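The matching file names make it straightforward to pair each recording with its prompt. The sketch below assumes that an audio file and its transcription share the same base name (for example `0001.wav` and `0001.txt`, a hypothetical naming), that the files sit somewhere under one root folder, and that the TXT files are UTF-8 encoded; none of these details are confirmed by the description, so adjust them to the actual delivery.

```python
from pathlib import Path


def pair_audio_with_transcripts(root):
    """Pair each .wav file under root with the text of the .txt file sharing its base name.

    Returns (pairs, missing): pairs is a list of (wav_path, transcript_text),
    missing is a list of wav paths for which no transcript was found.
    """
    root = Path(root)
    # Index transcripts by base name (file name without extension).
    transcripts = {p.stem: p for p in root.rglob("*.txt")}

    pairs, missing = [], []
    for wav_path in sorted(root.rglob("*.wav")):
        txt_path = transcripts.get(wav_path.stem)
        if txt_path is None:
            missing.append(wav_path)
        else:
            text = txt_path.read_text(encoding="utf-8").strip()
            pairs.append((wav_path, text))
    return pairs, missing
```

Returning the unmatched recordings separately makes gaps in a delivery visible instead of silently dropping them.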

This text data can further be annotated with named entity recognition (NER) to expedite the deployment of Gujarati conversational AI and NLP models.

Updates and Customization:

We understand the importance of collecting data in various environments to build robust ASR models. Therefore, our voice dataset is regularly updated with new audio data captured in diverse real-world conditions.

If you require a custom training dataset recorded in specific environments (such as in-car, busy street, or restaurant) or at different speaking speeds (fast, slow, or normal), we can accommodate your request. We can also provide voice data at customized sample rates ranging from 8 kHz to 48 kHz, allowing you to fine-tune your models for different audio recording setups.

License:

This audio dataset, created by FutureBeeAI, is now available for commercial use.

Conclusion:

Whether you are training or fine-tuning speech recognition models, advancing NLP algorithms, exploring speech AI, or building cutting-edge voice assistants and bots, our dataset serves as a reliable and valuable resource.

Use Cases

ASR

Conversational AI

Chatbot

Language modelling

TTS

Speech Analytics

Dataset Sample(s)

TRANSCRIPTION

SPEAKER | DURATION | TRANSCRIPT
Male (25) | 0:00:05 | ધૂમ્રપાન અને દારૂ પીવું સ્વાસ્થ્ય માટે હાનિકારક છે.

Dataset Demographics

Language

Gujarati

Language code

gu-IN

Country

India

Accents

Kathiawari,...more

Gender Distribution

M:55, F:45

Age Group

18-70

Audio File Details

Environment

Silent

Bit Depth

16 bit

Sample rate

48 kHz

Channel

Mono

Audio file duration

5 to 30 seconds

Download Sample Speech Dataset Now!

Explore the audio data, metadata, and transcription to get more clarity and hands-on experience with this dataset.

Download Free Dataset


Start your AI/ML model creation journey with FutureBeeAI!

Contact Us
