Leveraging AI for Linguistic Diversity

Google credits its large language model, PaLM 2, as the driving force behind this ambitious project. PaLM 2 has helped Google Translate learn closely related languages more efficiently, including languages close to Hindi, such as Awadhi and Marwadi, and French creoles such as Seychellois Creole and Mauritian Creole.

Isaac Caswell, a Senior Software Engineer at Google Translate, highlights the role of technology and collaboration: “As technology advances, and as we continue to partner with expert linguists and native speakers, we’ll support even more language varieties and spelling conventions over time.”


A Major Milestone for Google Translate

This latest update brings the total number of languages Google Translate supports to 243. The newly added languages include Cantonese, Punjabi (Shahmukhi), and NKo, reflecting a diverse range of linguistic and cultural backgrounds. Approximately 25% of the new additions are African languages, the largest expansion of African languages on Google Translate to date. These include Fon, Kikongo, Luo, Ga, Swati, Venda, and Wolof.

Source: Google.

Google estimates that the newly added languages are spoken by over 614 million people worldwide, or about 8% of the global population. The expansion is part of Google’s broader 1,000 Languages Initiative, which aims to build AI models that support the 1,000 most spoken languages in the world.


Overcoming Challenges in Language Translation

Adding new languages to Google Translate is complex: it means accounting for regional varieties, dialects, and differing spelling standards. For instance, Cantonese, one of the most requested languages, posed a challenge because written Cantonese often overlaps with written Mandarin, which makes it hard to find data and train models.

Manx, a Celtic language from the Isle of Man, nearly went extinct in 1974 with the death of its last native speaker but has since seen a revival. It now joins Google Translate’s list of supported languages, which may further aid its preservation and promotion.

Google’s approach to managing linguistic diversity involves creating hybrid models that incorporate elements from various dialects. For example, Romani, which has many dialects throughout Europe, is represented in Google Translate with text closest to Southern Vlax Romani, while also mixing in elements from Northern Vlax and Balkan Romani.


The Role of PaLM 2 in Translation

PaLM 2 has been instrumental in helping Google Translate learn and support new languages more efficiently. The model’s ability to understand and translate closely related languages has been a key factor in this expansion, making it easier to add languages that share similarities and improving the accuracy and usability of translations.

Google’s commitment to expanding its language support aligns with its vision of making information universally accessible. By adding 110 new languages, Google Translate is not only enhancing its utility for millions of users but also playing a crucial role in the preservation and revitalization of lesser-known languages.

The latest expansion follows the introduction of 24 new languages in 2022, made possible by Zero-Shot Machine Translation, in which a machine learning model learns to translate into another language without ever seeing an example translation. The ongoing efforts are part of Google’s 1,000 Languages Initiative, aimed at building AI models to support the world’s 1,000 most spoken languages.

The addition of 110 new languages to Google Translate represents a significant milestone in the platform’s evolution. With the power of AI and the capabilities of PaLM 2, Google continues to bridge linguistic gaps and foster global communication. This expansion not only highlights the technological advancements in AI but also underscores Google’s dedication to inclusivity and linguistic diversity. As Google Translate grows, it promises to be an even more valuable tool for connecting people across cultures and languages.
