LRW-Persian: The Largest Persian Lip Reading Dataset

About

We introduce LRW-Persian, the largest in-the-wild Persian word-level lipreading dataset, in which every word is represented by at least 100 training samples and 30 test samples. Designed as a benchmark-ready resource, LRW-Persian provides speaker-disjoint training and test splits, wide regional and dialectal coverage, and rich per-clip metadata including head pose, age, and gender.
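As a rough illustration of what the split guarantees above mean in practice, the sketch below checks a pair of splits for speaker overlap and for words that fall short of the stated minimums. The record layout (a list of `(word, speaker_id)` pairs) and the function name `check_split` are assumptions made for this example; they are not the dataset's actual metadata format or tooling.

```python
from collections import Counter


def check_split(train, test, min_train=100, min_test=30):
    """Check two splits given as lists of (word, speaker_id) pairs.

    The pair layout is a hypothetical stand-in for the real metadata.
    Returns the speakers found in both splits and the words that miss
    either per-word minimum.
    """
    # Speaker-disjoint means no speaker id occurs in both splits.
    train_speakers = {speaker for _, speaker in train}
    test_speakers = {speaker for _, speaker in test}
    shared_speakers = train_speakers & test_speakers

    # Count samples per word in each split; a missing word counts as 0.
    train_counts = Counter(word for word, _ in train)
    test_counts = Counter(word for word, _ in test)
    all_words = set(train_counts) | set(test_counts)
    underrepresented = sorted(
        word
        for word in all_words
        if train_counts[word] < min_train or test_counts[word] < min_test
    )

    return {
        "shared_speakers": shared_speakers,
        "underrepresented_words": underrepresented,
    }
```

Running such a check on the downloaded splits should report no shared speakers and no underrepresented words if the properties described above hold for the metadata as loaded.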

Download

Get the dataset splits

Train split

Download Train

Test split

Download Test

Word list

Download Word List

Publication

LRW-Persian: Lip-reading in the Wild Dataset for Persian Language
View on arXiv: https://arxiv.org/abs/2510.22716
@misc{taghizadeh2025lrwpersianlipreadingwilddataset,
  title={LRW-Persian: Lip-reading in the Wild Dataset for Persian Language},
  author={Zahra Taghizadeh and Mohammad Shahverdikondori and Arian Noori and Alireza Dadgarnia},
  year={2025},
  eprint={2510.22716},
  archivePrefix={arXiv},
  primaryClass={cs.CV},
  url={https://arxiv.org/abs/2510.22716},
}

Contact

Reach out with any questions