Incorporating Regional Data Models: The Key to Diverse and Accurate GenAI Predictions

Tech giants are advised to integrate regional data models for more inclusive GenAI responses. Learn how SEA-LION LLM enhances accuracy and cultural sensitivity in AI predictions.

AI Leaders Urged to Integrate Local Data Models for Diversity’s Sake

Tech giants developing generative artificial intelligence (GenAI) tools are being encouraged to incorporate regional and local data models to ensure their products better represent a diverse global population. Laurence Liew, director of AI innovation at AI Singapore, emphasized the importance of integrating models like the Southeast Asian Languages in One Network (SEA-LION) large language model (LLM) to enhance the accuracy of GenAI responses.

Liew shared a test in which SEA-LION outperformed a popular global GenAI platform in predicting the outcome of a recent Asian election. SEA-LION, available in 3-billion- and 7-billion-parameter versions, was trained on 981 billion language tokens, fragments of text drawn from English, Southeast Asian, and Chinese sources.

Most public GenAI tools today lack an Asian focus, which can lead to inherent data bias. LLMs such as SEA-LION are considered more culturally sensitive, so their responses align better with the region’s societal diversity.

Asian countries like Thailand and India have also developed their own LLMs. Because SEA-LION is open source, AI Singapore hopes major tech players like Microsoft and Google will adopt such regional and local LLMs.

AI Singapore, established in 2017, aims to boost the country’s AI capabilities through collaboration with research institutions, startups, and companies. The program is backed by government agencies like the Smart Nation and Digital Government Office.

Businesses are increasingly interested in GenAI products, with Microsoft witnessing high demand for tools like Copilot. However, a study revealed that only 30% of organizations feel equipped with the necessary IT assets to deploy GenAI effectively.

While 76% of organizations engaged with GenAI in some capacity last year, only 9% adopted it widely. Automating low-value tasks and enhancing customer service were cited as the primary reasons for adoption.

Looking ahead, 60% of respondents believe GenAI will significantly disrupt their industries in the next five years, viewing the technology as a competitive advantage rather than a threat.

Challenges to adoption include regulatory compliance, data privacy concerns, budget constraints, and a shortage of skilled professionals like machine learning engineers and AI data scientists.

Geraldine Kor, Telstra International’s South Asia managing director, highlighted the importance of processing data effectively for informed business decisions. She emphasized the need for end-to-end capabilities to handle large datasets and ensure ethical AI application.

As companies navigate the digital landscape, the decision to invest in GenAI is influenced by geopolitical issues and economic uncertainties. Kor advised organizations to identify specific functions for applying GenAI to kickstart their AI journey.

AI Singapore is addressing adoption challenges through initiatives like the AI Apprenticeship Programme and LLM Application Developer Programme. The country recently published a handbook to assist local companies, including small and midsize businesses, in adopting GenAI and acquiring the necessary skills.

In conclusion, the journey towards effective GenAI implementation requires access to real datasets, skilled AI engineers, and robust computing infrastructure. Companies must overcome data quality, storage, and talent bottlenecks to leverage the full potential of GenAI.

