The Chinese University of Hong Kong (CUHK) today (30 October) announced the launch of CLEVA-Cantonese, the world’s first dynamic evaluation platform and ecosystem dedicated to the Cantonese language. Cantonese is a vital language for communities in Hong Kong, Guangdong and other Cantonese-speaking regions. This pioneering platform delivers fair, dynamic, informative benchmarking that reveals how well various large language models (LLMs) support Cantonese. It provides researchers and developers with meaningful insights to accelerate the improvement and real-world application of Cantonese-capable LLMs.
This project is a collaboration between CUHK’s InnoHK Centre for Perceptual and Interactive Intelligence (CPII) and the CUHK Language and Vision (LaVi) Lab. It is co-led by Professor Helen Meng Mei-ling, Patrick Huen Wing Ming Professor of Systems Engineering and Engineering Management and Director of CPII, and Professor Wang Liwei, Assistant Professor in the Department of Computer Science and Engineering at CUHK, Leader of the LaVi Lab and CLEVA project leader.
An evolving ecosystem for Cantonese LLM evaluation
CLEVA (Chinese Language Models EVAluation Platform), developed by CUHK’s LaVi Lab, is widely recognised as one of the largest and most comprehensive evaluation benchmarks for Mandarin Chinese LLMs. Building upon this foundation, CLEVA-Cantonese establishes the world’s first evolving ecosystem for Cantonese LLM evaluation. It integrates a collaborative, automated workflow that cycles through four key phases: data import and filtering, language model understanding, evaluation, and feedback. This continuous process provides timely insights to guide LLM innovation, improves services for Cantonese-speaking populations and generates research outcomes that can assist in the evaluation of other low-resource languages.
Cantonese evaluation for LLMs is crucial, as it provides clear performance signals that pinpoint model strengths and areas for improvement, thereby accelerating their development. It also enables scalable, timely assessment that keeps pace with rapid model iteration cycles, while ensuring trustworthy comparisons through standardised tasks, prompts and multi-metric evaluations.
CLEVA-Cantonese is built to meet the special challenges of creating a high-quality Cantonese benchmark:
It can evaluate written vernacular Cantonese (粵語白話文) – the written form of everyday spoken Cantonese – capturing unique linguistic traits such as colloquial expressions and slang, code-switching with English and Mandarin, and romanisation in the form of Jyutping (粵拼).
CLEVA-Cantonese standardises the end-to-end workflow for evaluation, including constructing representative tasks with up-to-date data, evaluating LLMs using consistent prompts and selecting a suite of informative metrics.
Through collaboration with data providers such as Phoenix TV, CLEVA-Cantonese continuously adopts the latest data, which naturally reflects emerging language trends in Cantonese and mitigates data contamination.
Professor Wang said: “We utilise natural language understanding technology based on LLMs to assist in constructing a series of multidimensional evaluation tasks. These tasks are designed around linguistic features, ensuring the benchmark faithfully reflects the language’s structural and knowledge-based characteristics. CLEVA-Cantonese marks the beginning of an ecosystem that brings together academic research, data contributors and state-of-the-art model developers to drive LLM advancement across languages, with immediate benefits for Cantonese-speaking communities.”
Early findings and the continuous improvement loop
The CLEVA-Cantonese team has completed an initial round of evaluation with a range of international and domestic LLMs, spanning open-source and proprietary models. The findings show that even the latest models still struggle to fully capture the nuances of Cantonese, leaving substantial room for improvement in grammar, pronunciation and vocabulary. These insights will guide the next generation of LLMs, enhancing their alignment with Cantonese and performance in related tasks. As stronger models emerge, CLEVA-Cantonese will iteratively refine its evaluation criteria – completing the continuous cycle of data import, language model understanding, evaluation and feedback.
Professor Meng concluded: “Building upon CUHK’s interdisciplinary expertise, we will continuously refresh the benchmark through expanded data partnerships, develop an open evaluation platform for researchers, developers and institutions, extend CLEVA-Cantonese to support more languages, tasks and spoken Cantonese, and provide shared tools to advance collaborative research across linguistics, education, culture and related domains. CLEVA-Cantonese elevates evaluation to a systematic process. It makes gaps for improvement visible, guides research and product roadmaps, and helps ensure Cantonese is well supported across areas such as education, healthcare, public services and cultural life.”
CUHK launches world’s first dynamic evaluation platform and ecosystem for Cantonese large language models (Thu, 30 Oct 2025)