2503.17069
PVChat: Personalized Video Chat with One-Shot Learning
21 March 2025
Yufei Shi
Weilong Yan
Gang Xu
Yumeng Li
Yongqian Li
Zechao Li
Fei Richard Yu
Ming Li
Si Yong Yeo
Papers citing
"PVChat: Personalized Video Chat with One-Shot Learning"
25 / 25 papers shown
Personalized Large Vision-Language Models
Chau Pham
Hoang Phan
David Doermann
Yunjie Tian
VLM
100
4
0
23 Dec 2024
MoH: Multi-Head Attention as Mixture-of-Head Attention
Peng Jin
Bo Zhu
Li Yuan
Shuicheng Yan
MoE
82
17
0
15 Oct 2024
LivePortrait: Efficient Portrait Animation with Stitching and Retargeting Control
Jianzhu Guo
Dingyun Zhang
Xiaoqiang Liu
Zhizhou Zhong
Yuan Zhang
Pengfei Wan
Di Zhang
VGen
98
62
0
03 Jul 2024
What matters when building vision-language models?
Hugo Laurençon
Léo Tronchon
Matthieu Cord
Victor Sanh
VLM
94
174
0
03 May 2024
How Far Are We to GPT-4V? Closing the Gap to Commercial Multimodal Models with Open-Source Suites
Zhe Chen
Weiyun Wang
Hao Tian
Shenglong Ye
Zhangwei Gao
...
Tong Lu
Dahua Lin
Yu Qiao
Jifeng Dai
Wenhai Wang
MLLM
VLM
108
627
0
25 Apr 2024
MyVLM: Personalizing VLMs for User-Specific Queries
Yuval Alaluf
Elad Richardson
Sergey Tulyakov
Kfir Aberman
Daniel Cohen-Or
MLLM
VLM
78
23
0
21 Mar 2024
ProSparse: Introducing and Enhancing Intrinsic Activation Sparsity within Large Language Models
Chenyang Song
Xu Han
Zhengyan Zhang
Shengding Hu
Xiyu Shi
...
Chen Chen
Zhiyuan Liu
Guanglin Li
Tao Yang
Maosong Sun
103
29
0
21 Feb 2024
PhotoMaker: Customizing Realistic Human Photos via Stacked ID Embedding
Zhen Li
Mingdeng Cao
Xintao Wang
Zhongang Qi
Ming-Ming Cheng
Ying Shan
DiffM
96
197
0
07 Dec 2023
TransNeXt: Robust Foveal Visual Perception for Vision Transformers
Dai Shi
ViT
60
89
0
28 Nov 2023
Valley: Video Assistant with Large Language model Enhanced abilitY
Ruipu Luo
Ziwang Zhao
Min Yang
Junwei Dong
Da Li
Pengcheng Lu
Tao Wang
Linmei Hu
Ming-Hui Qiu
MLLM
108
205
0
12 Jun 2023
Visual Instruction Tuning
Haotian Liu
Chunyuan Li
Qingyang Wu
Yong Jae Lee
SyDa
VLM
MLLM
560
4,861
0
17 Apr 2023
GPT-4 Technical Report
OpenAI
Josh Achiam
Steven Adler
Sandhini Agarwal
Lama Ahmad
...
Shengjia Zhao
Tianhao Zheng
Juntang Zhuang
William Zhuk
Barret Zoph
LLMAG
MLLM
1.4K
14,631
0
15 Mar 2023
BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models
Junnan Li
Dongxu Li
Silvio Savarese
Steven C. H. Hoi
VLM
MLLM
426
4,563
0
30 Jan 2023
Multi-Concept Customization of Text-to-Image Diffusion
Nupur Kumari
Bin Zhang
Richard Y. Zhang
Eli Shechtman
Jun-Yan Zhu
147
872
0
08 Dec 2022
DreamBooth: Fine Tuning Text-to-Image Diffusion Models for Subject-Driven Generation
Nataniel Ruiz
Yuanzhen Li
Varun Jampani
Yael Pritch
Michael Rubinstein
Kfir Aberman
279
2,861
0
25 Aug 2022
An Image is Worth One Word: Personalizing Text-to-Image Generation using Textual Inversion
Rinon Gal
Yuval Alaluf
Yuval Atzmon
Or Patashnik
Amit H. Bermano
Gal Chechik
Daniel Cohen-Or
161
1,889
0
02 Aug 2022
GODEL: Large-Scale Pre-Training for Goal-Directed Dialog
Baolin Peng
Michel Galley
Pengcheng He
Chris Brockett
Lars Liden
E. Nouri
Zhou Yu
Bill Dolan
Jianfeng Gao
VLM
72
74
0
22 Jun 2022
General Facial Representation Learning in a Visual-Linguistic Manner
Yinglin Zheng
Hao Yang
Ting Zhang
Jianmin Bao
Dongdong Chen
Yangyu Huang
Lu Yuan
Dong Chen
Ming Zeng
Fang Wen
CVBM
185
172
0
06 Dec 2021
LoRA: Low-Rank Adaptation of Large Language Models
J. E. Hu
Yelong Shen
Phillip Wallis
Zeyuan Allen-Zhu
Yuanzhi Li
Shean Wang
Lu Wang
Weizhu Chen
OffRL
AI4TS
AI4CE
ALM
AIMat
477
10,367
0
17 Jun 2021
Learning Transferable Visual Models From Natural Language Supervision
Alec Radford
Jong Wook Kim
Chris Hallacy
Aditya A. Ramesh
Gabriel Goh
...
Amanda Askell
Pamela Mishkin
Jack Clark
Gretchen Krueger
Ilya Sutskever
CLIP
VLM
955
29,436
0
26 Feb 2021
DeepFaceLab: Integrated, flexible and extensible face-swapping framework
Ivan Perov
Daiheng Gao
Nikolay Chervoniy
Kunlin Liu
Sugasa Marangonda
...
Jian Jiang
Sheng Zhang
Pingyu Wu
Wenbo Zhou
Weiming Zhang
CVBM
67
226
0
12 May 2020
You Impress Me: Dialogue Generation via Mutual Persona Perception
Qian Liu
Yihong Chen
B. Chen
Jian-Guang Lou
Zixuan Chen
Bin Zhou
Dongmei Zhang
69
169
0
11 Apr 2020
Personalizing Dialogue Agents: I have a dog, do you have pets too?
Saizheng Zhang
Emily Dinan
Jack Urbanek
Arthur Szlam
Douwe Kiela
Jason Weston
105
1,459
0
22 Jan 2018
Attention Is All You Need
Ashish Vaswani
Noam M. Shazeer
Niki Parmar
Jakob Uszkoreit
Llion Jones
Aidan Gomez
Lukasz Kaiser
Illia Polosukhin
3DV
713
131,652
0
12 Jun 2017
FaceNet: A Unified Embedding for Face Recognition and Clustering
Florian Schroff
Dmitry Kalenichenko
James Philbin
3DH
379
13,145
0
12 Mar 2015