Datasets We Love

This is a list of non-Tattle datasets that are relevant to social media in India. The list isn't exhaustive. We have focused on datasets that account for the multi-modal and multi-lingual nature of social media in India.

Open Access:

  • A Dataset of Fact-Checked Images Shared on WhatsApp During the Brazilian and Indian Elections. ICWSM 2020. Paper. Dataset.

  • Claim Matching Beyond English to Scale Global Fact-Checking. ACL 2021. Paper. Dataset.

  • Short is the Road that Leads from Fear to Hate: Fear Speech in Indian WhatsApp Groups. arXiv preprint. Paper. Dataset.

Available on Request:

  • ShareChat data referenced in Characterising User Content on a Multi-Lingual Social Network. ICWSM 2020. Paper. Dataset.
  • The CoronaVirusFacts Alliance Database. About the Alliance. Database.

Want to Add a Dataset Here?

Make a pull request with suggested additions, holler at us on Twitter @tattlemade or drop us a line at

Text and illustrations on the website are licensed under Creative Commons 4.0 License. The code is licensed under GPL. For data, please look at the respective licenses.