Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2305.13500
Cited By
Learning Emotion Representations from Verbal and Nonverbal Communication
22 May 2023
Sitao Zhang
Yimu Pan
Jianmin Wang
VLM
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Learning Emotion Representations from Verbal and Nonverbal Communication"
33 / 33 papers shown
Title
Knowledge-Aligned Counterfactual-Enhancement Diffusion Perception for Unsupervised Cross-Domain Visual Emotion Recognition
Wen Yin
Yong Wang
Guiduo Duan
Dongyang Zhang
Xin Hu
Yuan-Fang Li
Tao He
111
0
0
26 May 2025
Heterogeneous bimodal attention fusion for speech emotion recognition
Jiachen Luo
Huy Phan
Lin Wang
Joshua Reiss
92
0
0
09 Mar 2025
Bimodal Connection Attention Fusion for Speech Emotion Recognition
Jiachen Luo
Huy Phan
Lin Wang
Joshua D. Reiss
99
0
0
08 Mar 2025
Expanding Language-Image Pretrained Models for General Video Recognition
Bolin Ni
Houwen Peng
Minghao Chen
Songyang Zhang
Gaofeng Meng
Jianlong Fu
Shiming Xiang
Haibin Ling
VLM
CLIP
ViT
99
325
0
04 Aug 2022
YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors
Chien-Yao Wang
Alexey Bochkovskiy
H. Liao
ObjD
158
6,519
0
06 Jul 2022
M2FNet: Multi-modal Fusion Network for Emotion Recognition in Conversation
Vishal M. Chudasama
P. Kar
Ashish Gudmalwar
Nirmesh J. Shah
Pankaj Wasnik
N. Onoe
41
113
0
05 Jun 2022
Prompting Visual-Language Models for Efficient Video Understanding
Chen Ju
Tengda Han
Kunhao Zheng
Ya Zhang
Weidi Xie
VPVLM
VLM
79
376
0
08 Dec 2021
Grounded Language-Image Pre-training
Liunian Harold Li
Pengchuan Zhang
Haotian Zhang
Jianwei Yang
Chunyuan Li
...
Lu Yuan
Lei Zhang
Lei Li
Kai-Wei Chang
Jianfeng Gao
ObjD
VLM
120
1,061
0
07 Dec 2021
LAION-400M: Open Dataset of CLIP-Filtered 400 Million Image-Text Pairs
Christoph Schuhmann
Richard Vencu
Romain Beaumont
R. Kaczmarczyk
Clayton Mullis
Aarush Katta
Theo Coombes
J. Jitsev
Aran Komatsuzaki
VLM
MLLM
CLIP
221
1,428
0
03 Nov 2021
CLIP-Adapter: Better Vision-Language Models with Feature Adapters
Peng Gao
Shijie Geng
Renrui Zhang
Teli Ma
Rongyao Fang
Yongfeng Zhang
Hongsheng Li
Yu Qiao
VLM
CLIP
281
1,040
0
09 Oct 2021
VideoCLIP: Contrastive Pre-training for Zero-shot Video-Text Understanding
Hu Xu
Gargi Ghosh
Po-Yao (Bernie) Huang
Dmytro Okhonko
Armen Aghajanyan
Florian Metze
Luke Zettlemoyer
Florian Metze Luke Zettlemoyer Christoph Feichtenhofer
CLIP
VLM
309
578
0
28 Sep 2021
CPT: Colorful Prompt Tuning for Pre-trained Vision-Language Models
Yuan Yao
Ao Zhang
Zhengyan Zhang
Zhiyuan Liu
Tat-Seng Chua
Maosong Sun
MLLM
VPVLM
VLM
281
222
0
24 Sep 2021
Learning to Prompt for Vision-Language Models
Kaiyang Zhou
Jingkang Yang
Chen Change Loy
Ziwei Liu
VPVLM
CLIP
VLM
492
2,396
0
02 Sep 2021
Frozen in Time: A Joint Video and Image Encoder for End-to-End Retrieval
Max Bain
Arsha Nagrani
Gül Varol
Andrew Zisserman
VGen
136
1,173
0
01 Apr 2021
ViViT: A Video Vision Transformer
Anurag Arnab
Mostafa Dehghani
G. Heigold
Chen Sun
Mario Lucic
Cordelia Schmid
ViT
213
2,149
0
29 Mar 2021
Less is More: ClipBERT for Video-and-Language Learning via Sparse Sampling
Jie Lei
Linjie Li
Luowei Zhou
Zhe Gan
Tamara L. Berg
Joey Tianyi Zhou
Jingjing Liu
CLIP
118
661
0
11 Feb 2021
Is Space-Time Attention All You Need for Video Understanding?
Gedas Bertasius
Heng Wang
Lorenzo Torresani
ViT
365
2,048
0
09 Feb 2021
HERO: Hierarchical Encoder for Video+Language Omni-representation Pre-training
Linjie Li
Yen-Chun Chen
Yu Cheng
Zhe Gan
Licheng Yu
Jingjing Liu
MLLM
VLM
OffRL
AI4TS
106
503
0
01 May 2020
EmotiCon: Context-Aware Multimodal Emotion Recognition using Frege's Principle
Trisha Mittal
P. Guhan
Uttaran Bhattacharya
Rohan Chandra
Aniket Bera
Tianyi Zhou
102
133
0
14 Mar 2020
Suppressing Uncertainties for Large-Scale Facial Expression Recognition
Kai Wang
Xiaojiang Peng
Jianfei Yang
Shijian Lu
Yu Qiao
87
487
0
24 Feb 2020
A Simple Framework for Contrastive Learning of Visual Representations
Ting-Li Chen
Simon Kornblith
Mohammad Norouzi
Geoffrey E. Hinton
SSL
358
18,752
0
13 Feb 2020
Momentum Contrast for Unsupervised Visual Representation Learning
Kaiming He
Haoqi Fan
Yuxin Wu
Saining Xie
Ross B. Girshick
SSL
199
12,074
0
13 Nov 2019
DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter
Victor Sanh
Lysandre Debut
Julien Chaumond
Thomas Wolf
230
7,504
0
02 Oct 2019
RoBERTa: A Robustly Optimized BERT Pretraining Approach
Yinhan Liu
Myle Ott
Naman Goyal
Jingfei Du
Mandar Joshi
Danqi Chen
Omer Levy
M. Lewis
Luke Zettlemoyer
Veselin Stoyanov
AIMat
615
24,431
0
26 Jul 2019
HowTo100M: Learning a Text-Video Embedding by Watching Hundred Million Narrated Video Clips
Antoine Miech
Dimitri Zhukov
Jean-Baptiste Alayrac
Makarand Tapaswi
Ivan Laptev
Josef Sivic
VGen
108
1,200
0
07 Jun 2019
Multimodal Speech Emotion Recognition Using Audio and Text
Seunghyun Yoon
Seokhyun Byun
Kyomin Jung
45
295
0
10 Oct 2018
MELD: A Multimodal Multi-Party Dataset for Emotion Recognition in Conversations
Soujanya Poria
Devamanyu Hazarika
Navonil Majumder
Gautam Naik
Min Zhang
Rada Mihalcea
98
1,068
0
05 Oct 2018
Deep Facial Expression Recognition: A Survey
Shan Li
Weihong Deng
183
1,299
0
23 Apr 2018
EmotionLines: An Emotion Corpus of Multi-Party Conversations
Shengli Chen
Chao-Chun Hsu
Chuan-Chun Kuo
Ting-Hao 'Kenneth' Huang
Huang
Lun-Wei Ku
74
253
0
23 Feb 2018
MovieGraphs: Towards Understanding Human-Centric Situations from Videos
Paul Vicol
Makarand Tapaswi
Lluis Castrejon
Sanja Fidler
68
140
0
19 Dec 2017
Temporal Segment Networks for Action Recognition in Videos
Limin Wang
Yuanjun Xiong
Zhe Wang
Yu Qiao
Dahua Lin
Xiaoou Tang
Luc Van Gool
ViT
112
810
0
08 May 2017
FaceNet: A Unified Embedding for Face Recognition and Clustering
Florian Schroff
Dmitry Kalenichenko
James Philbin
3DH
370
13,143
0
12 Mar 2015
ImageNet Large Scale Visual Recognition Challenge
Olga Russakovsky
Jia Deng
Hao Su
J. Krause
S. Satheesh
...
A. Karpathy
A. Khosla
Michael S. Bernstein
Alexander C. Berg
Li Fei-Fei
VLM
ObjD
1.7K
39,525
0
01 Sep 2014
1