TRACE: Real-Time Multimodal Common Ground Tracking in Situated Collaborative Dialogues
Hannah VanderHoeven
Brady Bhalla
Ibrahim Khebour
Austin Youngren
Videep Venkatesha
Mariah Bradford
Jack Fitzgerald
Carlos Mabrey
Jingxuan Tu
Yifan Zhu
Kenneth Lai
Changsoo Jung
James Pustejovsky
Nikhil Krishnaswamy
Abstract
We present TRACE, a novel system for live *common ground* tracking in situated collaborative tasks. With a focus on fast, real-time performance, TRACE tracks the speech, actions, gestures, and visual attention of participants, uses these multimodal inputs to determine the set of task-relevant propositions that have been raised as the dialogue progresses, and tracks the group's epistemic position and beliefs toward them as the task unfolds. Amid increased interest in AI systems that can mediate collaborations, TRACE represents an important step forward for agents that can engage with multiparty, multimodal discourse.
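Since the abstract describes the pipeline only at a high level, the following is a minimal illustrative sketch in Python of what live common-ground tracking over a multimodal event stream might look like. The three-bank model (raised / evidenced / accepted), the `Event` and `CommonGround` types, and the promotion rule are assumptions made here for illustration, not TRACE's actual architecture or API.

```python
# Minimal sketch of live common-ground tracking over a multimodal event
# stream. All names and the promotion logic are illustrative assumptions,
# not the TRACE system's implementation.
from dataclasses import dataclass, field
from enum import Enum, auto


class Modality(Enum):
    SPEECH = auto()
    GESTURE = auto()
    ACTION = auto()
    GAZE = auto()


@dataclass(frozen=True)
class Event:
    """A timestamped multimodal observation from one participant."""
    timestamp: float
    participant: str
    modality: Modality
    content: str  # e.g. a transcribed utterance or a recognized gesture label


@dataclass
class CommonGround:
    """Task-relevant propositions, grouped by the group's epistemic stance."""
    raised: set[str] = field(default_factory=set)     # mentioned, not yet supported
    evidenced: set[str] = field(default_factory=set)  # supported by some evidence
    accepted: set[str] = field(default_factory=set)   # treated as shared belief


def extract_propositions(event: Event) -> list[str]:
    """Stub for proposition extraction; a real system would run ASR plus
    classifiers over the speech, gesture, action, and gaze streams."""
    if event.modality is Modality.SPEECH:
        # Placeholder: treat each utterance as a single proposition.
        return [event.content]
    return []


def update(cg: CommonGround, event: Event) -> None:
    """Promote propositions through the banks as corroboration arrives."""
    for prop in extract_propositions(event):
        if prop in cg.evidenced:
            cg.evidenced.discard(prop)
            cg.accepted.add(prop)      # corroborated again -> accepted
        elif prop in cg.raised:
            cg.raised.discard(prop)
            cg.evidenced.add(prop)     # restated/supported -> evidenced
        else:
            cg.raised.add(prop)        # first mention -> raised


if __name__ == "__main__":
    cg = CommonGround()
    stream = [
        Event(0.0, "P1", Modality.SPEECH, "red block weighs 10g"),
        Event(2.5, "P2", Modality.SPEECH, "red block weighs 10g"),
        Event(4.0, "P3", Modality.SPEECH, "red block weighs 10g"),
    ]
    for ev in stream:
        update(cg, ev)
    print(cg)  # the proposition ends up in `accepted` after three mentions
```

In this toy version a proposition advances one bank each time it recurs in the stream; a real tracker would instead weigh who produced the evidence, in which modality, and whether other participants signaled agreement or dissent.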
BibTeX:

@article{vanderhoeven2025_2503.09511,
  title   = {TRACE: Real-Time Multimodal Common Ground Tracking in Situated Collaborative Dialogues},
  author  = {Hannah VanderHoeven and Brady Bhalla and Ibrahim Khebour and Austin Youngren and Videep Venkatesha and Mariah Bradford and Jack Fitzgerald and Carlos Mabrey and Jingxuan Tu and Yifan Zhu and Kenneth Lai and Changsoo Jung and James Pustejovsky and Nikhil Krishnaswamy},
  journal = {arXiv preprint arXiv:2503.09511},
  year    = {2025}
}