2205.09812
Voice Activity Projection: Self-supervised Learning of Turn-taking Events
19 May 2022
Erik Ekstedt
Gabriel Skantze
Papers citing "Voice Activity Projection: Self-supervised Learning of Turn-taking Events"
16 / 16 papers shown
Title
Predicting Turn-Taking and Backchannel in Human-Machine Conversations Using Linguistic, Acoustic, and Visual Signals
Yuxin Lin
Yinglin Zheng
Ming Zeng
Wangzheng Shi
55
0
0
19 May 2025
Speculative End-Turn Detector for Efficient Speech Chatbot Assistant
Hyunjong Ok
Suho Yoo
Jaeho Lee
85
0
0
30 Mar 2025
Yeah, Un, Oh: Continuous and Real-time Backchannel Prediction with Fine-tuning of Voice Activity Projection
K. Inoue
Divesh Lala
Gabriel Skantze
Tatsuya Kawahara
48
2
0
21 Oct 2024
HuBERT: Self-Supervised Speech Representation Learning by Masked Prediction of Hidden Units
Wei-Ning Hsu
Benjamin Bolte
Yao-Hung Hubert Tsai
Kushal Lakhotia
Ruslan Salakhutdinov
Abdel-rahman Mohamed
SSL
127
2,879
0
14 Jun 2021
wav2vec 2.0: A Framework for Self-Supervised Learning of Speech Representations
Alexei Baevski
Henry Zhou
Abdel-rahman Mohamed
Michael Auli
SSL
162
5,734
0
20 Jun 2020
Language Models are Few-Shot Learners
Tom B. Brown
Benjamin Mann
Nick Ryder
Melanie Subbiah
Jared Kaplan
...
Christopher Berner
Sam McCandlish
Alec Radford
Ilya Sutskever
Dario Amodei
BDL
469
41,106
0
28 May 2020
A Simple Framework for Contrastive Learning of Visual Representations
Ting Chen
Simon Kornblith
Mohammad Norouzi
Geoffrey E. Hinton
SSL
208
18,607
0
13 Feb 2020
Unsupervised pretraining transfers well across languages
M. Rivière
Armand Joulin
Pierre-Emmanuel Mazaré
Emmanuel Dupoux
SSL
VLM
28
206
0
07 Feb 2020
PyTorch: An Imperative Style, High-Performance Deep Learning Library
Adam Paszke
Sam Gross
Francisco Massa
Adam Lerer
James Bradbury
...
Sasank Chilamkurthy
Benoit Steiner
Lu Fang
Junjie Bai
Soumith Chintala
ODL
242
42,038
0
03 Dec 2019
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
Jacob Devlin
Ming-Wei Chang
Kenton Lee
Kristina Toutanova
VLM
SSL
SSeg
943
93,936
0
11 Oct 2018
Multimodal Continuous Turn-Taking Prediction Using Multiscale RNNs
Matthew Roddy
Gabriel Skantze
N. Harte
8
40
0
31 Aug 2018
Representation Learning with Contrastive Predictive Coding
Aaron van den Oord
Yazhe Li
Oriol Vinyals
DRL
SSL
225
10,152
0
10 Jul 2018
Investigating Speech Features for Continuous Turn-Taking Prediction Using LSTMs
Matthew Roddy
Gabriel Skantze
N. Harte
13
34
0
29 Jun 2018
Attention Is All You Need
Ashish Vaswani
Noam M. Shazeer
Niki Parmar
Jakob Uszkoreit
Llion Jones
Aidan Gomez
Lukasz Kaiser
Illia Polosukhin
3DV
435
129,831
0
12 Jun 2017
Cyclical Learning Rates for Training Neural Networks
L. Smith
ODL
118
2,515
0
03 Jun 2015
Adam: A Method for Stochastic Optimization
Diederik P. Kingma
Jimmy Ba
ODL
806
149,474
0
22 Dec 2014