Everyday Speech in the Indian Subcontinent

Abstract

India has 1369 languages, of which 22 are official. About 13 different scripts are used to write these languages. A Common Label Set (CLS), based on phonetics, was developed to address the large vocabulary of units required in the End-to-End (E2E) framework for multilingual synthesis. Text in an Indian language is first converted to CLS. This approach enables seamless code switching across 13 Indian languages and English in a given native speaker's voice, which corresponds to everyday speech in the Indian subcontinent, where the population is multilingual.
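The idea of converting text in different scripts to a shared, phonetically motivated label set can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the table `TOY_LABELS`, the label names, and the function `to_toy_labels` are hypothetical and are not the actual CLS. The sketch also ignores inherent vowels, conjuncts, and any handling of English text.

```python
# Toy illustration only: this is NOT the actual Common Label Set.
# Label names and character coverage are assumptions for this example.
TOY_LABELS = {
    # Devanagari (used for Hindi, among others)
    "\u0915": "k",   # DEVANAGARI LETTER KA
    "\u092e": "m",   # DEVANAGARI LETTER MA
    "\u0932": "l",   # DEVANAGARI LETTER LA
    "\u093e": "aa",  # DEVANAGARI VOWEL SIGN AA
    # Tamil
    "\u0b95": "k",   # TAMIL LETTER KA
    "\u0bae": "m",   # TAMIL LETTER MA
    "\u0bb2": "l",   # TAMIL LETTER LA
    "\u0bbe": "aa",  # TAMIL VOWEL SIGN AA
}


def to_toy_labels(text):
    """Map each non-space character to its toy label.

    Characters without an entry in TOY_LABELS are passed through unchanged.
    """
    return [TOY_LABELS.get(ch, ch) for ch in text if not ch.isspace()]


# The same sound sequence written in two scripts yields one label sequence.
devanagari_word = "\u0915\u093e\u092e"  # KA + AA + MA in Devanagari
tamil_word = "\u0b95\u0bbe\u0bae"       # KA + AA + MA in Tamil

labels_dev = to_toy_labels(devanagari_word)
labels_tam = to_toy_labels(tamil_word)

# The shared label inventory is smaller than the set of script characters.
num_characters = len(TOY_LABELS)
num_labels = len(set(TOY_LABELS.values()))
```

In this toy table, eight script-specific characters collapse to four labels, which is the kind of vocabulary reduction the abstract attributes to a common label set. A synthesis model trained on such labels sees the same units regardless of the script of the input text.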

@article{p2025_2410.10508,
  title={Everyday Speech in the Indian Subcontinent},
  author={Utkarsh P},
  journal={arXiv preprint arXiv:2410.10508},
  year={2025}
}