English (UK) Call Center Speech Dataset for BFSI

The audio dataset comprises call center conversations for the banking, financial services and insurance (BFSI) domain, featuring native English speakers from the UK. It includes speech data, detailed metadata and accurate transcriptions.

Category: Unscripted Call Center Conversations
Total Volume: 30 Speech Hours
Last updated: Jun 2024
Number of participants: 60

About this Off-the-shelf Speech Dataset

Introduction

Welcome to the UK English Call Center Speech Dataset for the BFSI domain, designed to support the development of call center speech recognition models for the BFSI industry. The dataset is curated to support speech recognition, natural language processing, conversational AI, and generative voice AI algorithms.

Speech Data

This training dataset comprises 30 hours of call center audio recordings covering various topics and scenarios related to the BFSI domain, designed to build robust and accurate customer service speech technology.

  • Participant Diversity:
      • Speakers: 60 native UK English speakers from the FutureBeeAI Community.
      • Regions: Different regions of the United Kingdom, ensuring a balanced representation of UK accents, dialects, and demographics.
      • Participant Profile: Participants aged 18 to 70, with a male-to-female ratio of 60:40.
  • Recording Details:
      • Conversation Nature: Unscripted and spontaneous conversations between call center agents and customers.
      • Call Duration: Average duration of 5 to 15 minutes per call.
      • Formats: WAV format with stereo channels, a bit depth of 16 bits, and sample rates of 8 kHz and 16 kHz.
      • Environment: Without background noise and without echo.
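
The recording format listed above (stereo, 16-bit, 8 kHz or 16 kHz WAV) can be checked on delivery with Python's standard `wave` module. The following is a minimal sketch, not part of the dataset tooling: the function names `check_wav` and `make_test_wav` are hypothetical, and the in-memory test file stands in for a real recording.

```python
import io
import wave

# Sample rates listed in the dataset description.
EXPECTED_RATES = {8000, 16000}


def check_wav(source):
    """Return a list of mismatches between a WAV file and the listed format."""
    problems = []
    with wave.open(source, "rb") as wav:
        if wav.getnchannels() != 2:
            problems.append(f"expected stereo, found {wav.getnchannels()} channel(s)")
        if wav.getsampwidth() != 2:  # 2 bytes per sample = 16 bits
            problems.append(f"expected 16-bit samples, found {8 * wav.getsampwidth()}-bit")
        if wav.getframerate() not in EXPECTED_RATES:
            problems.append(f"unexpected sample rate {wav.getframerate()} Hz")
    return problems


def make_test_wav(channels, sample_width, rate, frames=80):
    """Build a short silent WAV in memory so the sketch runs without dataset files."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(sample_width)
        wav.setframerate(rate)
        wav.writeframes(b"\x00" * frames * channels * sample_width)
    buffer.seek(0)
    return buffer


print(check_wav(make_test_wav(2, 2, 8000)))
print(check_wav(make_test_wav(1, 2, 44100)))
```

`check_wav` also accepts a file path, so the same function could be run over a directory of delivered recordings.
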
    Topic Diversity

    This dataset offers a diverse range of conversation topics, call types, and outcomes, including both inbound and outbound calls with positive, neutral, and negative outcomes.

  • Inbound Calls:
      • Debit Card Block Request
      • Home Loan Enquiry
      • Transaction Disputes
      • Credit Card Billing Dispute
      • Account Closure Procedures
      • Claim Procedures
      • Premium Payments
      • Policy Comparison
      • Policy Cancellation or Lapse
      • Insurance Renewal Options
      • Retirement Planning
      • Investment Risk Assessment Questionnaires
      • Tax-efficient Investment Strategies
      • Investment Performance Enquiry, and many more
  • Outbound Calls:
      • Credit Card Offers
      • Loan Offers
      • Loyalty Program Benefits
      • Customer Satisfaction Surveys
      • EMI Reminder Call
      • Policy Upgrade Offers
      • Claim Status Updates
      • Policyholder Loyalty Benefits
      • Insurance Policyholder Surveys
      • Term Life Insurance Offer
      • Investment Opportunities
      • Retirement Savings Review, and many more

    This extensive coverage ensures the dataset includes realistic call center scenarios, which is essential for developing effective customer support speech recognition models.

    Transcription

    To facilitate your workflow, the dataset includes manual verbatim transcriptions of each call center audio file in JSON format. These transcriptions feature:

  • Speaker-wise Segmentation: Time-coded segments for both agents and customers.
  • Non-Speech Labels: Tags and labels for non-speech elements.
  • Word Error Rate: Below 5%, achieved through a dual layer of QA.

    These ready-to-use transcriptions accelerate the development of BFSI call center conversational AI and ASR models for UK English.
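
The page does not state how the sub-5% word error rate is measured. Under the conventional definition, WER is the number of word substitutions, deletions and insertions needed to turn a hypothesis into the reference, divided by the number of reference words. The sketch below implements that conventional definition only; the function name `word_error_rate` is hypothetical, the lower-casing and whitespace tokenisation are assumptions, and the example strings are adapted from the sample call.

```python
def word_error_rate(reference, hypothesis):
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # previous[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    previous = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        current = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = previous[j - 1] + (ref_word != hyp_word)
            deletion = previous[j] + 1      # reference word missing from hypothesis
            insertion = current[j - 1] + 1  # extra word in hypothesis
            current.append(min(substitution, deletion, insertion))
        previous = current
    return previous[-1] / len(ref)


reference = "let me investigate this for you"
print(word_error_rate(reference, "let me investigate this for you"))
print(word_error_rate(reference, "let me investigate that for you"))
```

A real evaluation would also need to decide how to treat punctuation, filler markers and non-speech tags before scoring, which this sketch leaves out.
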

    Metadata

    The dataset provides comprehensive metadata for each conversation and participant:

  • Participant Metadata: Unique identifier, age, gender, country, state, district, accent and dialect.
  • Conversation Metadata: Domain, topic, call type, outcome/sentiment, bit depth, and sample rate.
    This metadata is a powerful tool for understanding and characterizing the data, enabling informed decision-making in the development of UK English call center speech recognition models.

    Usage and Applications

    This dataset can be used for various applications in the fields of speech recognition, natural language processing, and conversational AI, specifically tailored to the BFSI domain. Potential use cases include:

  • Speech Recognition Models: Training and fine-tuning speech recognition models for UK English.
  • Speech Analytics Models: Building speech analytics models that extract insights and identify patterns in customer conversations, enabling data-driven decision-making and process optimization within the BFSI sector.
  • Smart Assistants and Chatbots: Developing conversational agents and virtual assistants for customer service in the BFSI industry.
  • Sentiment Analysis: Analyzing customer sentiment and improving customer experience based on call center interactions.
  • Generative AI: Training generative AI models capable of generating human-like responses, summaries, or content tailored to the BFSI domain.
    Secure and Ethical Collection

  • Our proprietary data collection and transcription platform, “Yugo”, was used throughout the creation of this dataset.
  • Throughout the data collection process, the data remained within our secure platform and did not leave our environment, ensuring data security and confidentiality.
  • The data collection process adhered to strict ethical guidelines, ensuring the privacy and consent of all participants.
  • It does not include any personally identifiable information about any participant, which makes the dataset safe to use.
  • The dataset does not contain any copyrighted content.
    Updates and Customization

    Understanding the importance of diverse environments for robust ASR models, our call center voice dataset is regularly updated with new audio data captured in various real-world conditions.

  • Customization & Custom Collection Options:
      • Environmental Conditions: Custom collection in specific environmental conditions upon request.
      • Sample Rates: Customizable from 8 kHz to 48 kHz.
      • Transcription Customization: Tailored to specific guidelines and requirements.

    License

    This BFSI domain call center audio dataset is created by FutureBeeAI and is available for commercial use.

    Use Cases

  • Call Center Conversational AI
  • ASR
  • Chatbot
  • Language Modelling
  • TTS
  • Speech Analytics

    Dataset Sample(s)

    TRANSCRIPTION

    TIME
    TRANSCRIPT
    0.250 - 1.577
    Hello Future Bee.
    2.173 - 2.270
    -
    2.577 - 3.992
    Hello[noise] Future Bee.
    6.764 - 13.567
    Hello,[noise] I've just received my bank statement today and there seems to be a transaction on there that I don't recognize.
    15.557 - 23.498
    Hello, good morning. #Amm I'm sorry to hear about the unknown transaction on your statement. I I'll be glad to assist you with this.
    24.251 - 29.469
    To begin with, could you please provide me with some details about the transaction in question? [noise]
    30.931 - 37.365
    Yeah, of course. #Ah So on my statement, it's got the normal payments and everything that I've made,
    37.871 - 40.060
    but there's a charge for hundred dollars.[noise]
    40.561 - 44.243
    from a company that's called XYZ <initial>LTD</initial>
    44.701 - 51.932
    And I've got no recollection of making any purchase from them. To be honest, I've never even heard of this company, so I'm a bit concerned.
    53.154 - 57.643
    [noise]Thank you for letting me know, Madam. #Amm [noise]I understand your concern.
    58.313 - 60.526
    Let me investigate this for you. [noise]
    61.060 - 66.019
    #Ah May I have your account number, please so I can access your account details? [noise]
    67.298 - 69.980
    Yeah, sure. My account number is [noise] sorry, both me. [noise]
    70.602 - 71.441
    get my card.[noise]
    76.293 - 78.700
    All right. And yeah, my account number is[noise]
    77.177 - 77.328
    -
    79.215 - 80.798
    <PII>nine eight</PII>. [noise]
    81.557 - 81.918
    Yeah.
    82.042 - 82.623
    (())
    84.034 - 85.352
    <PII>six five</PII>[noise]
    84.530 - 84.870
    yeah,
    87.010 - 87.382
    yeah,
    87.751 - 88.131
    <PII>four</PII>[noise]
    88.727 - 89.933
    <PII>three two one</PII>[noise]
    91.087 - 101.507
    <PII>four three two one</PII>. [noise] Thank you, madam, for providing your account number. #Amm I can see the transaction that you're referring to from XYZ Limited.
    102.102 - 116.030
    Let me check their details and see if we have any additional information about this charge. [noise]It could be that you've made a purchase from someone but their account name is something different than what you would expect to see.
    116.712 - 121.971
    And so just I'm just gonna to put you on hold for#Amm a short while. Is that okay?
    125.382 - 126.718
    Yeah, sure. [noise]No problem.
    126.015 - 127.674
    Okay, just bear with me. Thank you. [noise]
    146.984 - 160.491
    Hello, #Ah thank you for waiting madam. #Amm After looking to the transaction it appears that XYZ Limited is a legitimate merchant associated with an online retail platform.
    161.056 - 165.580
    #Ah Have you or anyone you know recently made a purchase from this company?
    168.758 - 177.889
    Well no, I haven't made any purchases from XYZ Limited or any any other online retail platforms recently.[noise] I'm trying to be a bit careful with what I'm spending my money on.
    178.526 - 182.312
    And this charge is completely unknown to me. I have no idea. [noise]
    183.086 - 184.300
    It's a bit worrying really.[noise]
    185.084 - 198.741
    #Ah I see. I apologies for the confusion, madam. #Ah It seems there may have been an error or potentially fraudulent transaction. In such cases, we always recommend contacting the merchant directly[noise]
    185.651 - 185.764
    -
    194.806 - 195.544
    Hm-Mm
    199.014 - 213.072
    first to enquire about the charge, [noise] as it could be a mistake that can be resolved quickly. #Amm Would you like me to provide you with XYZ Limited's contact information so that you can contact them directly yourself? [noise]
    214.793 - 220.020
    Oh yes, please. I'd appreciate it if you could provide me with their details. Thanks very much. [noise]
    221.324 - 233.859
    No problem. I completely understand, madam. #Amm The contact number for XYZ Limited is, #Ah just bear with me a moment, it's <PII>one two three</PII>.
    234.727 - 236.955
    (()) Can I just grab a pen? (())
    236.032 - 236.538
    Sure. [noise]
    241.305 - 242.538
    Okay, yeah, I'm ready.
    242.854 - 244.002
    <PII>One two three</PII> [noise]
    244.526 - 245.919
    [noise] <PII>four five six</PII>
    246.103 - 246.235
    -
    246.596 - 248.596
    <PII>seven eight nine zero</PII>.
    247.955 - 248.276
    Yeah, [noise]
    249.382 - 249.758
    #Amm
    251.185 - 252.282
    Can I just read [noise]yeah,
    251.627 - 252.151
    Yeah.[noise]
    252.937 - 253.817
    <PII>One two three</PII> [noise]
    254.312 - 254.692
    Yeah.
    254.395 - 257.939
    <PII>four five six [noise] seven eight nine zero</PII>.[noise]
    256.370 - 256.737
    Yeah.
    258.797 - 268.913
    That's correct, madam. #Amm I'd also recommend sending them an email to their customer support address. #Ah I'll give you that address now. It's customer service,
    260.187 - 260.843
    (())
    265.978 - 266.526
    (())
    270.557 - 271.463
    all one word, [noise]
    273.107 - 273.838
    #Hmm Wow,
    274.528 - 275.716
    Yeah, that ways
    275.778 - 280.781
    at XYZ Limited. XYZ <initial>LTD</initial> Sorry.
    283.312 - 285.247
    XYZ <initial>LTD</initial> Yeah,
    284.963 - 286.134
    yeah, dot com. [noise]
    286.959 - 287.103
    -
    287.697 - 287.821
    -
    287.875 - 293.692
    I mean just explain the situation to them and request some clarification regarding the transaction. [noise]
    288.346 - 288.709
    -
    295.824 - 307.809
    That's great, thanks very much for helping me. I'm going to get in contact with them right away and enquire about the charge. And In the meantime, is there anything I should do from the bank's side to resolve this matter because I'm hundred dollars down now?
    308.831 - 321.526
    Yes, #Amm it would be advisable for you to keep a record of your communication with XYZ Limited, #Amm including any phone calls or emails exchanged. If you are unable to resolve the issue directly with them,
    321.737 - 330.785
    please do contact us again with the details of your conversation [noise] and we will assist you further in resolving the dispute and potentially reversing the charge.
    332.401 - 338.425
    Oh, that's great, thank you. I'm gonna to make sure I keep a detailed record of all my communication with XYZ Limited. [noise]
    339.451 - 346.560
    And if I need to reach out to the bank again, should I refer to this conversation or provide any additional information when I call back?
    347.704 - 361.475
    Yes, please reference this conversation and also provide any documentation or evidence you have such as emails, receipts or any proof that supports your claim. #Amm These will help us in further assisting you and resolving the issue promptly.
    363.060 - 372.052
    All right, that sounds reasonable. Thank you so much for your [noise]assistance and guidance in this matter. I'm gonna to contact this company and then get back to you if I need any further action to be taken.
    373.552 - 382.843
    Excellent. You are more than welcome, madam. I'm here to help. Please don't hesitate to contact us again if you need any further assistance or if there's anything else we can do to assist you. [noise]
    385.285 - 388.016
    Thank you. I appreciate your help. Have a great day as well. Bye.
    387.855 - 388.533
    Thank you. Bye, bye.
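
    The sample above carries inline markup: `[noise]`, `#`-prefixed fillers such as `#Amm`, `<PII>…</PII>` and `<initial>…</initial>` spans, `(())`, and segments containing only `-`. The page does not document what each tag means, so the handling below is an assumption based on the sample alone. A minimal sketch of stripping this markup to obtain plain text, for example before ASR training or scoring (the function name `clean_segment` and the `[PII]` placeholder are hypothetical):

```python
import re


def clean_segment(text, pii_token="[PII]"):
    """Strip the inline markup seen in the sample transcript.

    Assumed handling: drop "(())", "[noise]" and "#" fillers, keep the text
    inside <initial> tags, replace <PII> spans with a placeholder, and treat
    a lone "-" as an empty segment.
    """
    text = text.replace("(())", " ")
    text = re.sub(r"\[noise\]", " ", text)
    text = re.sub(r"#[A-Za-z]+", " ", text)          # fillers such as #Amm, #Ah
    text = re.sub(r"</?initial>", "", text)          # keep the spelled-out letters
    text = re.sub(r"<PII>.*?</PII>", pii_token, text)
    text = " ".join(text.split())                    # collapse leftover whitespace
    return "" if text == "-" else text


samples = [
    "Hello[noise] Future Bee.",
    "from a company that's called XYZ <initial>LTD</initial>",
    "<PII>nine eight</PII>. [noise]",
    "(()) Can I just grab a pen? (())",
    "-",
]
for line in samples:
    print(repr(clean_segment(line)))
```

    Whether fillers and noise tags should be dropped or kept depends on the model being trained; this sketch removes them only to show the tag patterns.
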

    Dataset Details

    Language: English
    Language code: en-gb
    Country: UK
    Accents: English - East and Central Midlands, English - East Anglia, and more
    Gender Distribution: M:60, F:40
    Age Group: 18-70

    File Details

    Environment: Silent, Noisy
    Bit Depth: 16 bit
    Format: WAV
    Sample rate: 8 kHz and 16 kHz
    Channel: Stereo
    Audio file duration: 5-15 minutes
