Swecha’s Ambitious Project: Developing Telugu Language-Centric Large Language Models
Swecha, a non-profit organization that promotes the free software and free knowledge movements, has announced an internship program on Artificial Intelligence (AI) for engineering students. The program, called “Summer of AI,” aims to teach students AI skills while they help develop Telugu language-centric Large Language Models (LLMs).
The Vision Behind the Project
The project is a collaborative effort between Swecha, IIIT Hyderabad, Ozonetel, Meta, and TASK. The primary objective is to collect vast amounts of data through interviews with people in villages and towns, focusing on Telugu folk tales, songs, food, and local skills. This data will be used to create a dataset for both speech and text-based LLMs.
“The approach of the project is to collect speech, transcribe the speech and create a dataset for both speech and as a base LLM.” - Project Statement
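The collect, transcribe, and build-a-dataset flow described above can be illustrated with a minimal Python sketch. The record fields (audio_path, transcript, category) and the function name build_datasets are assumptions made for illustration; the project has not published its actual data format.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class Recording:
    """One interview clip plus its transcription (hypothetical schema)."""
    audio_path: str   # where the collected speech is stored
    transcript: str   # Telugu text transcribed from the audio
    category: str     # e.g. "folk_tale", "song", "food", "local_skill"


def build_datasets(recordings):
    """Return (speech_manifest, text_corpus) from transcribed recordings.

    speech_manifest: JSON lines pairing audio with text, for speech models.
    text_corpus: plain transcripts, usable as text data for a base LLM.
    Recordings with an empty transcript are skipped.
    """
    usable = [r for r in recordings if r.transcript.strip()]
    # ensure_ascii=False keeps Telugu characters readable in the output.
    speech_manifest = [json.dumps(asdict(r), ensure_ascii=False) for r in usable]
    text_corpus = [r.transcript.strip() for r in usable]
    return speech_manifest, text_corpus


if __name__ == "__main__":
    sample = [
        Recording("clips/0001.wav", "ఒక కథ", "folk_tale"),
        Recording("clips/0002.wav", "   ", "song"),  # not yet transcribed
    ]
    manifest, corpus = build_datasets(sample)
    print(len(manifest), len(corpus))
```

Keeping the audio-text pairs and the plain transcripts as two outputs of one pass reflects the stated goal of serving both speech and text models from a single collection effort.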
The Scale of the Project
The project’s scope is large, with plans to engage one lakh (100,000) interns during the month-long internship. The first batch of 10,000 interns has already begun working on the project. Upon successful completion, the approach will be replicated to collect data for other languages and regions.
The Significance of the Project
The initiative is significant given the scarcity of Indian-language and India-centric LLMs. If it succeeds, it could pave the way for more language-centric LLMs and promote linguistic diversity in AI.
Registration and Future Plans
Registration for the internship is currently open on the Swecha website. If the Telugu effort succeeds, the same approach is to be replicated for other languages and regions.
The “Summer of AI” project is a pioneering effort to build LLMs centered on an Indian language from data collected in villages and towns. Its outcomes, and the applications they make possible, will become clearer as the project progresses.