Look, Listen and Recognise: Character-Aware Audio-Visual Subtitling. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2024 |
Audio-Visual Activity Guided Cross-Modal Identity Association for Active Speaker Detection. IEEE Open Journal of Signal Processing (JOSP), 2022. Rahul Sharma, Shrikanth Narayanan |
Using Active Speaker Faces for Diarization in TV shows. Rahul Sharma, Shrikanth Narayanan |