English (US) General Conversation Speech Dataset

This audio dataset consists of general conversations between native English speakers from the US, along with metadata and transcriptions.

Category

Unscripted General Conversations

Total Volume

25 Speech Hours

Last updated

July 2023

Number of participants

45

Get this Speech Dataset


About this Off-the-shelf Speech Dataset


What’s Included

Welcome to the English Language General Conversation Speech Dataset, a comprehensive and diverse collection of voice data specifically curated to advance the development of English language speech recognition models, with a particular focus on US accents and dialects.


With high-quality audio recordings, detailed metadata, and accurate transcriptions, it helps researchers and developers improve natural language processing, conversational AI, and generative voice AI algorithms. It also supports the creation of voice assistants and voice bots tailored to the linguistic nuances of English as spoken in the United States.


Speech Data:

This training dataset comprises 30 hours of audio recordings covering a wide range of topics and scenarios, supporting robustness and accuracy in speech technology applications. To achieve this, we collaborated with a diverse network of 40 native English speakers from different states of the United States. This collaborative effort aims for a balanced representation of US accents, dialects, and demographics, reducing biases and promoting inclusivity.


Each audio recording captures a spontaneous, unscripted conversation between two individuals, with durations ranging from 15 to 60 minutes. The speech data is available in WAV format as stereo files, with a bit depth of 16 bits and a sample rate of 8 kHz. The recording environment is generally quiet, without background noise or echo.
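To check that a downloaded recording matches the format described above, a minimal sketch using Python's standard `wave` module might look like the following. The helper names `describe_wav` and `split_stereo_16bit` are illustrative and not part of the dataset, and the split assumes that each speaker sits on a separate channel, as the "Dual separate channel" listing suggests.

```python
import wave


def describe_wav(path):
    """Return basic format properties of a WAV file."""
    with wave.open(path, "rb") as wav:
        return {
            "channels": wav.getnchannels(),
            "bit_depth": wav.getsampwidth() * 8,  # sample width is in bytes
            "sample_rate_hz": wav.getframerate(),
            "duration_s": wav.getnframes() / wav.getframerate(),
        }


def split_stereo_16bit(frames):
    """Split interleaved 16-bit stereo PCM bytes into (channel 1, channel 2)."""
    first = bytearray()
    second = bytearray()
    # Each stereo frame is 4 bytes: 2 bytes for channel 1, then 2 for channel 2.
    for i in range(0, len(frames) - 3, 4):
        first += frames[i:i + 2]
        second += frames[i + 2:i + 4]
    return bytes(first), bytes(second)


def matches_stated_format(info):
    """Check the properties against the stated format: stereo, 16-bit, 8 kHz."""
    return (
        info["channels"] == 2
        and info["bit_depth"] == 16
        and info["sample_rate_hz"] == 8000
    )
```

For a real file, `describe_wav("conversation.wav")` (a hypothetical path) returns the dictionary that `matches_stated_format` expects, and the raw bytes passed to `split_stereo_16bit` can be read with `readframes` on the opened file.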


Metadata:

In addition to the audio recordings, our dataset provides comprehensive metadata for each participant. This metadata includes the participant's age, gender, country, state, and dialect. Additional metadata, such as recording device details, topic of recording, bit depth, and sample rate, is also provided.


The metadata serves as a valuable tool for understanding and characterizing the data, facilitating informed decision-making in the development of English language speech recognition models.


Transcription:

This dataset provides a manual verbatim transcription of each audio file to enhance your workflow efficiency. The transcriptions are available in JSON format. Each transcription is segmented by speaker with time codes and includes non-speech labels and tags.
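As a rough illustration of working with such transcriptions, the sketch below loads time-coded segments and sums speaking time per speaker. The JSON schema is not documented on this page, so the keys used here (`label`, `start`, `end`, `channel`, `transcript`) are an assumption that mirrors the column names shown in the dataset sample; the actual files may be structured differently.

```python
import json

# Hypothetical transcription payload. The keys mirror the sample's column
# names (LABEL, START, END, CHANNEL, TRANSCRIPT); the real schema may differ.
SAMPLE_JSON = """
[
  {"label": "Speech", "start": 0.626, "end": 1.597, "channel": "Speaker 1", "transcript": "Hello Futurebee."},
  {"label": "Speech", "start": 1.895, "end": 3.028, "channel": "Speaker 2", "transcript": "Hello Futurebee."},
  {"label": "Noise", "start": 32.310, "end": 32.490, "channel": "-", "transcript": "-"}
]
"""


def load_segments(text):
    """Parse the JSON text and return segments ordered by start time."""
    return sorted(json.loads(text), key=lambda seg: seg["start"])


def speech_only(segments):
    """Drop non-speech segments such as noise labels."""
    return [seg for seg in segments if seg["label"] == "Speech"]


def talk_time_by_speaker(segments):
    """Sum segment durations in seconds for each speaker."""
    totals = {}
    for seg in speech_only(segments):
        duration = seg["end"] - seg["start"]
        totals[seg["channel"]] = totals.get(seg["channel"], 0.0) + duration
    return totals


if __name__ == "__main__":
    for speaker, seconds in talk_time_by_speaker(load_segments(SAMPLE_JSON)).items():
        print(f"{speaker}: {seconds:.3f} s")
```

Filtering on the label before aggregating keeps noise segments, which carry no speaker, out of the per-speaker totals.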


Our goal is to expedite the deployment of English language conversational AI and NLP models by offering ready-to-use transcriptions, ultimately saving valuable time and resources in the development process.


Updates and Customization:

We understand the importance of collecting data in various environments to build robust ASR models. Therefore, our voice dataset is regularly updated with new audio data captured in diverse real-world conditions.


If you require a custom training dataset with specific environmental conditions such as in-car, busy street, restaurant, or any other scenario, we can accommodate your request. We can provide voice data with customized sample rates ranging from 8 kHz to 48 kHz, allowing you to fine-tune your models for different audio recording setups. We can also customize the transcription to follow your specific guidelines and requirements, to further support your ASR development process.


License:

This audio dataset, created by FutureBeeAI, is now available for commercial use.


Conclusion:

Whether you are training or fine-tuning speech recognition models, advancing NLP algorithms, exploring generative voice AI, or building cutting-edge voice assistants and bots, our dataset serves as a reliable and valuable resource.


Use Cases

Automatic Speech Recognition (ASR)

Conversational AI

Chatbot and voicebot creation

Language modeling

Text-to-speech (TTS)

Speech analytics

Dataset Sample(s)


ATTRIBUTES

Channel 1 | Channel 2 | Format
Male (29) | Female (24) | wav, json

TRANSCRIPTION

LABEL | START | END | CHANNEL | TRANSCRIPT
Speech | 0.626 | 1.597 | Speaker 1 | Hello Futurebee.
Speech | 1.895 | 3.028 | Speaker 2 | Hello Futurebee.
Speech | 6.586 | 7.107 | Speaker 2 | Okay
Speech | 6.751 | 12.791 | Speaker 1 | [filler]so (()) you can tell me what kind of vacation things you want to do?
Speech | 13.336 | 14.646 | Speaker 1 | (()).
Speech | 13.698 | 14.330 | Speaker 2 | Yeah
Speech | 15.649 | 20.942 | Speaker 2 | [filler]when Veronica comes in, [filler] I was gonna start with (())
Speech | 18.454 | 18.968 | Speaker 1 | [filler]
Speech | 22.295 | 26.100 | Speaker 2 | Because I am pretty they are gonna get here first. I am pretty sure we are gonna see.
Speech | 23.945 | 24.439 | Speaker 1 | [filler]
Speech | 26.663 | 29.669 | Speaker 2 | (()) and then towards the end of the trip
Speech | 31.001 | 34.554 | Speaker 2 | there will be, there will be doing the all the thing once Veronica and grandma come.
Noise | 32.310 | 32.490 | - | -
Noise | 32.902 | 33.279 | - | -
Speech | 35.261 | 37.088 | Speaker 2 | I don't know. I just, I am more
Speech | 37.502 | 38.962 | Speaker 2 | interested in the (()) stuff.
Speech | 38.277 | 38.813 | Speaker 1 | Yeah sure
Speech | 39.381 | 40.658 | Speaker 2 | I don't know. I am kind of looking at
Speech | 41.146 | 42.121 | Speaker 2 | whatever I guess.
Speech | 42.972 | 45.640 | Speaker 2 | [filler]but right now I am looking at
Speech | 45.405 | 46.658 | Speaker 1 | Okay [filler]
Speech | 46.158 | 49.161 | Speaker 2 | Buses from Bangkok to Phuket.
Speech | 50.317 | 54.066 | Speaker 2 | I don't know if I want to go to Phuket or no we are not going to Phuket. We are going to go visit (()).
Speech | 54.432 | 55.456 | Speaker 2 | Where is Tulip?
Speech | 56.393 | 62.920 | Speaker 1 | Yeah we are going to, we are going to visit (()). [filler] let me, let me look up where he lives (()).
Speech | 63.887 | 66.322 | Speaker 1 | You know what, I can, do you want to message him?
Speech | 68.069 | 69.123 | Speaker 2 | Me message him?
Speech | 69.387 | 70.697 | Speaker 1 | Do you want me to message him?
Speech | 70.887 | 72.290 | Speaker 2 | Yeah, yeah go ahead and message him.
Speech | 72.977 | 73.750 | Speaker 2 | Because I will look very.
Speech | 73.427 | 74.736 | Speaker 1 | Okay I will find out where he lives.
Speech | 75.501 | 75.899 | Speaker 2 | Okay.
Speech | 77.215 | 79.727 | Speaker 1 | I will find out where he lives (()).
Speech | 78.129 | 80.709 | Speaker 2 | Yeah where we can, we can, we can go with him.
Speech | 81.191 | 81.977 | Speaker 2 | [filler]
Speech | 84.197 | 86.146 | Speaker 2 | I think if we go visit to
Speech | 86.566 | 87.795 | Speaker 2 | I think it will be
Speech | 90.474 | 97.456 | Speaker 2 | I don't know if there is going to be so many (()) things to do. I am not sure (()) like what (()) or kind of.
Speech | 98.278 | 99.495 | Speaker 2 | Hoping to do
Speech | 100.266 | 105.938 | Speaker 2 | like may be they are, because like Phuket is very very (()) but it is also pretty cute.
Speech | 107.140 | 108.105 | Speaker 2 | And like it has
Speech | 109.007 | 110.736 | Speaker 2 | like cute buildings and shops and stuff.
Speech | 111.462 | 116.358 | Speaker 2 | [filler]but if we go with the (()) you know probably cheaper because it is not a tourist area.
Speech | 114.162 | 114.587 | Speaker 1 | You could.
Speech | 117.019 | 119.140 | Speaker 2 | Here have been living there for
Speech | 119.635 | 122.013 | Speaker 2 | a couple of months already. So he can, you know
Speech | 122.629 | 125.278 | Speaker 2 | show us around. Also, we could get to visit our friend.
Speech | 126.250 | 127.227 | Speaker 2 | [filler]
Speech | 128.639 | 129.554 | Speaker 2 | and
Speech | 129.973 | 132.127 | Speaker 2 | its even on a small
Speech | 133.943 | 134.729 | Speaker 2 | counts
Speech | 135.405 | 137.347 | Speaker 2 | if they are on the water
Speech | 137.782 | 139.294 | Speaker 2 | or like near the islands.
Speech | 140.566 | 145.985 | Speaker 2 | I am pretty sure they still have like, little tourist things you can like rent [filler]
Speech | 146.431 | 147.294 | Speaker 2 | tourist (())
Speech | 148.031 | 151.048 | Speaker 2 | to go take it to the different islands because thats what we did
Speech | 151.812 | 154.479 | Speaker 2 | when we were (()) and (()) is really
Speech | 154.905 | 157.229 | Speaker 2 | small town too. It is not small place.
Speech | 157.905 | 159.554 | Speaker 2 | But we were still able to rent
Speech | 160.413 | 161.359 | Speaker 2 | like a tourist
Speech | 162.163 | 162.859 | Speaker 2 | tour boat
Speech | 163.387 | 166.473 | Speaker 2 | and go to the islands and (()) and stuff so
Speech | 167.393 | 168.709 | Speaker 2 | it's, it could be
Speech | 170.020 | 174.794 | Speaker 2 | even nicer because its not super touristy and we have a person we already know.
Speech | 175.580 | 176.715 | Speaker 2 | And
Speech | 179.175 | 180.520 | Speaker 2 | the stuff will probably be cheaper.
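Several segments in the sample above overlap in time, for example Speaker 2's "Okay" at 6.586 to 7.107 and Speaker 1's turn starting at 6.751. A small sketch for locating such overlapping speech, using start and end times copied from the sample, could look like this; the tuple layout and the function name `find_overlaps` are illustrative choices, not part of the dataset.

```python
# (start, end, speaker) tuples copied from the first rows of the sample.
SEGMENTS = [
    (0.626, 1.597, "Speaker 1"),
    (1.895, 3.028, "Speaker 2"),
    (6.586, 7.107, "Speaker 2"),
    (6.751, 12.791, "Speaker 1"),
    (13.336, 14.646, "Speaker 1"),
    (13.698, 14.330, "Speaker 2"),
]


def find_overlaps(segments):
    """Return (start, end) spans where two different speakers talk at once."""
    overlaps = []
    ordered = sorted(segments, key=lambda seg: seg[0])
    for i, (start_a, end_a, speaker_a) in enumerate(ordered):
        for start_b, end_b, speaker_b in ordered[i + 1:]:
            if start_b >= end_a:
                break  # sorted by start, so no later segment can overlap
            if speaker_a != speaker_b:
                # start_b >= start_a here, so the overlap begins at start_b.
                overlaps.append((start_b, min(end_a, end_b)))
    return overlaps


if __name__ == "__main__":
    for start, end in find_overlaps(SEGMENTS):
        print(f"overlap from {start:.3f} s to {end:.3f} s")
```

Overlap spans like these can be useful when deciding whether to train on cross-talk or to exclude it.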

Dataset Demographics


Language

English

Language code

en-us

Country

USA

Accents

Arizona,...more

Gender Distribution

M:55, F:45

Age Group

18-70

Audio File Details


Environment

Silent, Noisy

Bit Depth

16 bit

Format

wav

Sample rate

8 kHz

Channel

Dual separate channel

Audio file duration

15-60 minutes

Download Sample Speech Dataset Now!

Explore the audio data, metadata, and transcription to get more clarity and hands-on experience with this dataset.

Download Free Dataset


Start your AI/ML model creation journey with FutureBeeAI!

Contact Us
