Earlier this year, MLCommons, a global organisation that works to improve AI systems, issued an expression of interest for creating prompts in non-English languages. Tattle was selected as a pilot project to contribute to the benchmark in Hindi using the participatory approach we followed with Uli 1, and we began work on the project. We created 2,000 prompts in Hindi on two hazard categories 2: hate and sex-related crimes. The prompts were created by an expert group with expertise in journalism, social work, feminist advocacy, gender studies, fact-checking, political campaigning, education, psychology, and research. All of the experts were native or fluent Hindi speakers.
The project took place over two months, during which we conducted online sessions with the experts organised into groups. They were encouraged to discuss and write Hindi prompts related to the hazards. The prompts were then collated by hazard, and we annotated them further to gather more granular insights from the exercise. We also carried out a landscape analysis of LLMs and their coverage of Indian languages. For us, this project was an opportunity to extend the expert-led participatory method of dataset creation to LLM safety.
MLCommons is now releasing AI Luminate, the v1 Safety Benchmark dataset. It is an important step in assessing the safety of LLMs. Our project provided interesting insights into the universality of the framework proposed in v0.5. Our report to MLCommons, available here, concludes with some recommendations for extending this work to low-resource languages.
Take a look at AI Luminate here for more information about this benchmark, how we’re involved, and what it means for the rest of us.