Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2304.02313
Cited By
v1
v2 (latest)
Personality-aware Human-centric Multimodal Reasoning: A New Task, Dataset and Baselines
5 April 2023
Yaochen Zhu
Xiangqing Shen
Rui Xia
Re-assign community
ArXiv (abs)
PDF
HTML
Papers citing
"Personality-aware Human-centric Multimodal Reasoning: A New Task, Dataset and Baselines"
28 / 28 papers shown
Title
Chat-UniVi: Unified Visual Representation Empowers Large Language Models with Image and Video Understanding
Peng Jin
Ryuichi Takanobu
Caiwan Zhang
Xiaochun Cao
Li-ming Yuan
MLLM
127
249
0
14 Nov 2023
Unifying Data Perspectivism and Personalization: An Application to Social Norms
Joan Plepi
Béla Neuendorf
Lucie Flek
Charles F Welch
103
21
0
26 Oct 2022
MBTI Personality Prediction for Fictional Characters Using Movie Scripts
Yisi Sang
Xiangyang Mou
Mo Yu
Dakuo Wang
Jing Li
Jeffrey Stanton
71
18
0
20 Oct 2022
Personality-Driven Social Multimedia Content Recommendation
Qi Yang
Sergey I. Nikolenko
Alfred Huang
Aleksandr Farseev
81
15
0
25 Jul 2022
Visual Abductive Reasoning
Chen Liang
Wenguan Wang
Tianfei Zhou
Yi Yang
LRM
81
40
0
26 Mar 2022
The Abduction of Sherlock Holmes: A Dataset for Visual Abductive Reasoning
Jack Hessel
Jena D. Hwang
Jinho Park
Rowan Zellers
Chandra Bhagavatula
Anna Rohrbach
Kate Saenko
Yejin Choi
ReLM
209
51
0
10 Feb 2022
MERLOT Reserve: Neural Script Knowledge through Vision and Language and Sound
Rowan Zellers
Jiasen Lu
Ximing Lu
Youngjae Yu
Yanpeng Zhao
Mohammadreza Salehi
Aditya Kusupati
Jack Hessel
Ali Farhadi
Yejin Choi
99
214
0
07 Jan 2022
NExT-QA:Next Phase of Question-Answering to Explaining Temporal Actions
Junbin Xiao
Xindi Shang
Angela Yao
Tat-Seng Chua
105
506
0
18 May 2021
Premise-based Multimodal Reasoning: Conditional Inference on Joint Textual and Visual Clues
Qingxiu Dong
Ziwei Qin
Heming Xia
Tian Feng
Shoujie Tong
...
Weidong Zhan
Sujian Li
Zhongyu Wei
Tianyu Liu
Zuifang Sui
LRM
62
11
0
15 May 2021
AST: Audio Spectrogram Transformer
Yuan Gong
Yu-An Chung
James R. Glass
ViT
145
884
0
05 Apr 2021
AGQA: A Benchmark for Compositional Spatio-Temporal Reasoning
Madeleine Grunde-McLaughlin
Ranjay Krishna
Maneesh Agrawala
CoGe
78
119
0
30 Mar 2021
An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale
Alexey Dosovitskiy
Lucas Beyer
Alexander Kolesnikov
Dirk Weissenborn
Xiaohua Zhai
...
Matthias Minderer
G. Heigold
Sylvain Gelly
Jakob Uszkoreit
N. Houlsby
ViT
684
41,563
0
22 Oct 2020
What is More Likely to Happen Next? Video-and-Language Future Event Prediction
Jie Lei
Licheng Yu
Tamara L. Berg
Joey Tianyi Zhou
85
73
0
15 Oct 2020
HERO: Hierarchical Encoder for Video+Language Omni-representation Pre-training
Linjie Li
Yen-Chun Chen
Yu Cheng
Zhe Gan
Licheng Yu
Jingjing Liu
MLLM
VLM
OffRL
AI4TS
131
504
0
01 May 2020
GoodNewsEveryone: A Corpus of News Headlines Annotated with Emotions, Semantic Roles, and Reader Perception
Laura Ana Maria Bostan
Evgeny Kim
Roman Klinger
74
87
0
06 Dec 2019
Multimodal Video-based Apparent Personality Recognition Using Long Short-Term Memory and Convolutional Neural Networks
Süleyman Aslan
U. Güdükbay
CVBM
48
19
0
01 Nov 2019
Abductive Commonsense Reasoning
Chandra Bhagavatula
Ronan Le Bras
Chaitanya Malaviya
Keisuke Sakaguchi
Ari Holtzman
Hannah Rashkin
Doug Downey
Scott Yih
Yejin Choi
ReLM
LRM
88
463
0
15 Aug 2019
SlowFast Networks for Video Recognition
Christoph Feichtenhofer
Haoqi Fan
Jitendra Malik
Kaiming He
169
3,286
0
10 Dec 2018
From Recognition to Cognition: Visual Commonsense Reasoning
Rowan Zellers
Yonatan Bisk
Ali Farhadi
Yejin Choi
LRM
BDL
OCL
ReLM
186
883
0
27 Nov 2018
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
Jacob Devlin
Ming-Wei Chang
Kenton Lee
Kristina Toutanova
VLM
SSL
SSeg
1.8K
95,324
0
11 Oct 2018
TVQA: Localized, Compositional Video Question Answering
Muhammad Abdul Wahab
Licheng Yu
Mounir Nasr Allah
Tamara L. Berg
101
642
0
05 Sep 2018
Focal Visual-Text Attention for Visual Question Answering
Junwei Liang
Lu Jiang
Liangliang Cao
Li Li
Alexander G. Hauptmann
63
111
0
05 Jun 2018
Investigating Audio, Visual, and Text Fusion Methods for End-to-End Automatic Personality Prediction
Onno P. Kampman
Elham J. Barezi
D. Bertero
Pascale Fung
87
96
0
02 May 2018
Dense-Captioning Events in Videos
Ranjay Krishna
Kenji Hata
F. Ren
Li Fei-Fei
Juan Carlos Niebles
152
1,251
0
02 May 2017
Google's Neural Machine Translation System: Bridging the Gap between Human and Machine Translation
Yonghui Wu
M. Schuster
Zhiwen Chen
Quoc V. Le
Mohammad Norouzi
...
Alex Rudnick
Oriol Vinyals
G. Corrado
Macduff Hughes
J. Dean
AIMat
918
6,799
0
26 Sep 2016
Visual Genome: Connecting Language and Vision Using Crowdsourced Dense Image Annotations
Ranjay Krishna
Yuke Zhu
Oliver Groth
Justin Johnson
Kenji Hata
...
Yannis Kalantidis
Li Li
David A. Shamma
Michael S. Bernstein
Fei-Fei Li
237
5,766
0
23 Feb 2016
Deep Residual Learning for Image Recognition
Kaiming He
Xinming Zhang
Shaoqing Ren
Jian Sun
MedIm
2.3K
194,641
0
10 Dec 2015
Neural Machine Translation of Rare Words with Subword Units
Rico Sennrich
Barry Haddow
Alexandra Birch
238
7,765
0
31 Aug 2015
1