English to Gujarati BFSI Domain Parallel Corpora

This dataset consists of bilingual, sentence-aligned corpora for the BFSI domain, translated from English to Gujarati.

Category

Parallel Corpora

Volume

50K+ corpus

Last Updated

Aug 2022

Number of participants

200+ people

Get this AI Dataset

About This OTS Dataset

What’s Included

This bilingual parallel corpus consists of more than 50K sentences translated from English to Gujarati by more than 200 native translators in the BFSI domain. These domain-specific parallel corpora include native slang, phrases, and language-specific words, and follow the native way of speaking, which makes the corpus more information-rich. Many of the same sentences are translated by several native translators, which makes it possible to compare how different groups interpret the same text.

The sentences in this corpus range in length from 7 to 15 words. The data is available in Excel format and can be converted into TMX, XML, XLIFF, or other equivalent formats.

These parallel bilingual corpora can be used for research and development in bilingual lexicography and machine translation engines. They can also be used to build language resources for NLP-based applications such as predictive keyboards, spell checkers, grammar checkers, text/speech understanding systems, text-to-speech modules, and many others.

More translated sentences are constantly being added to this parallel corpus. Depending on your requirements, we can curate parallel corpora in various languages. For synthetic custom curation, do not forget to check out the FutureBeeAI community.

The license for this parallel corpus dataset belongs to FutureBeeAI!


Use Cases

MT Engine

Language model

Predictive keyboards

Spell check

Grammar correction

Text/speech systems

Dataset Sample(s)

SAMPLE

Source Language | Target Language
The system of tokenization to prevent bank frauds will come into effect from October 1. | બેન્ક ફ્રોડ રોકવા ટોકનાઈઝેશનની સિસ્ટમ ૧લી ઓક્ટોબરથી અમલમાં.
No one will be able to know your debit/credit card number with the new system. | નવી સિસ્ટમથી કોઈ તમારો ડેબિટ/ક્રેડિટ કાર્ડ નંબર જાણી નહિ શકે.
There is need to promote digital banking in rural areas. | ગ્રામીણ વિસ્તારોમાં ડિજિટલ બેંકિંગને પ્રોત્સાહન આપવાની જરૂર.
RBI introduces internet banking guidelines for rural banks | ગ્રામીણ બેંકો માટે ઈન્ટરનેટ બેંકિંગ માટેની RBIની માર્ગદર્શિકા રજૂ કરી.
The scope of services of regional rural banks is limited. | પ્રાદેશિક ગ્રામીણ બેંકોની સેવાઓનો વિસ્તાર મર્યાદિત છે.
RBI has recently issued a new guideline. | RBIએ હાલમાં જ એક નવી ગાઈડલાઈન બહાર પાડી છે.
Preserve the message received after making UPI payments. | UPI કર્યા બાદ પ્રાપ્ત થયેલા મેસેજને સાચવી રાખો.
Airtel Payments Bank has launched micro ATM on Wednesday. | એરટેલ પેમેન્ટ્સ બેંકે બુધવારે માઈક્રો એટીએમ લોન્ચ કર્યુ છે.
Customers of all banks will be able to withdraw money through micro ATMs | બધી જ બેંકોના ગ્રાહકો માઈક્રો એટીએમ દ્વારા રૂપિયા ઉપાડી શકશે
Where to invest to earn Rs 10 lakh in just three years? | માત્ર ત્રણ જ વર્ષમાં 10 લાખ કમાવવા શેમાં રોકાણ કરવું?

ATTRIBUTES

target_language: Gujarati
source_language: English
domain: BFSI

Dataset Details

Dataset type

Corpus data

Volume

50K+ corpus

Media type

Text

Language pair

English-Gujarati

File Details

Type

Bilingual

Word count

7 to 12 words/line

Format

XLSX, TMX, XML, XLIFF

Annotation

NA
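
The file details above list XLSX and TMX among the available formats. As a rough illustration of what a conversion to TMX could look like, here is a minimal Python sketch that turns a list of (English, Gujarati) sentence pairs into a TMX 1.4 document using only the standard library. The function name `pairs_to_tmx`, the header values (`creationtool`, `o-tmf`, and so on), and the language codes `en` and `gu` are assumptions made for this example; they are not taken from the dataset's actual files, and reading the pairs out of the Excel file is left out.

```python
import xml.etree.ElementTree as ET

# Namespaced attribute name that ElementTree serializes as xml:lang.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def pairs_to_tmx(pairs, src_lang="en", tgt_lang="gu"):
    """Build a TMX 1.4 string from (source, target) sentence pairs.

    Header values below are illustrative placeholders, not values
    used by the dataset provider.
    """
    tmx = ET.Element("tmx", {"version": "1.4"})
    ET.SubElement(tmx, "header", {
        "creationtool": "example-script",   # hypothetical tool name
        "creationtoolversion": "0.1",
        "segtype": "sentence",
        "o-tmf": "xlsx",                    # assumed original format
        "adminlang": "en",
        "srclang": src_lang,
        "datatype": "plaintext",
    })
    body = ET.SubElement(tmx, "body")
    for source, target in pairs:
        # One translation unit per aligned sentence pair.
        tu = ET.SubElement(body, "tu")
        for lang, text in ((src_lang, source), (tgt_lang, target)):
            tuv = ET.SubElement(tu, "tuv", {XML_LANG: lang})
            ET.SubElement(tuv, "seg").text = text
    return ET.tostring(tmx, encoding="unicode")


# One pair copied from the sample shown on this page.
sample_pairs = [
    ("RBI has recently issued a new guideline.",
     "RBIએ હાલમાં જ એક નવી ગાઈડલાઈન બહાર પાડી છે."),
]
tmx_text = pairs_to_tmx(sample_pairs)
```

The resulting string can be written to a `.tmx` file with UTF-8 encoding so that the Gujarati text is preserved.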

Download data Sample

Download a free sample of this dataset to get a clearer picture of it, or get in touch with one of our experts for hands-on experience 📨

Download Free Dataset


Need datasets for a specific AI/ML use case? Don’t worry, we’ve got you covered! 👍

Contact Us
