2309.16270
Cited By
Social Media Fashion Knowledge Extraction as Captioning
28 September 2023
Yifei Yuan
Wenxuan Zhang
Yang Deng
Wai Lam
Papers citing
"Social Media Fashion Knowledge Extraction as Captioning"
17 / 17 papers shown
Title
Aspect Sentiment Quad Prediction as Paraphrase Generation
Wenxuan Zhang
Yang Deng
Xin Li
Yifei Yuan
Lidong Bing
W. Lam
274
190
0
02 Oct 2021
CogView: Mastering Text-to-Image Generation via Transformers
Ming Ding
Zhuoyi Yang
Wenyi Hong
Wendi Zheng
Chang Zhou
...
Junyang Lin
Xu Zou
Zhou Shao
Hongxia Yang
Jie Tang
ViT
VLM
116
782
0
26 May 2021
Zero-Shot Text-to-Image Generation
Aditya A. Ramesh
Mikhail Pavlov
Gabriel Goh
Scott Gray
Chelsea Voss
Alec Radford
Mark Chen
Ilya Sutskever
VLM
418
4,987
0
24 Feb 2021
HERO: Hierarchical Encoder for Video+Language Omni-representation Pre-training
Linjie Li
Yen-Chun Chen
Yu Cheng
Zhe Gan
Licheng Yu
Jingjing Liu
MLLM
VLM
OffRL
AI4TS
118
503
0
01 May 2020
Oscar: Object-Semantics Aligned Pre-training for Vision-Language Tasks
Xiujun Li
Xi Yin
Chunyuan Li
Pengchuan Zhang
Xiaowei Hu
...
Houdong Hu
Li Dong
Furu Wei
Yejin Choi
Jianfeng Gao
VLM
121
1,944
0
13 Apr 2020
End-to-End Learning of Visual Representations from Uncurated Instructional Videos
Antoine Miech
Jean-Baptiste Alayrac
Lucas Smaira
Ivan Laptev
Josef Sivic
Andrew Zisserman
VGen
SSL
128
712
0
13 Dec 2019
Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer
Colin Raffel
Noam M. Shazeer
Adam Roberts
Katherine Lee
Sharan Narang
Michael Matena
Yanqi Zhou
Wei Li
Peter J. Liu
AIMat
445
20,298
0
23 Oct 2019
VL-BERT: Pre-training of Generic Visual-Linguistic Representations
Weijie Su
Xizhou Zhu
Yue Cao
Bin Li
Lewei Lu
Furu Wei
Jifeng Dai
VLM
MLLM
SSL
163
1,666
0
22 Aug 2019
ViLBERT: Pretraining Task-Agnostic Visiolinguistic Representations for Vision-and-Language Tasks
Jiasen Lu
Dhruv Batra
Devi Parikh
Stefan Lee
SSL
VLM
231
3,693
0
06 Aug 2019
XLNet: Generalized Autoregressive Pretraining for Language Understanding
Zhilin Yang
Zihang Dai
Yiming Yang
J. Carbonell
Ruslan Salakhutdinov
Quoc V. Le
AI4CE
232
8,444
0
19 Jun 2019
HowTo100M: Learning a Text-Video Embedding by Watching Hundred Million Narrated Video Clips
Antoine Miech
Dimitri Zhukov
Jean-Baptiste Alayrac
Makarand Tapaswi
Ivan Laptev
Josef Sivic
VGen
115
1,200
0
07 Jun 2019
Unconstrained Fashion Landmark Detection via Hierarchical Recurrent Transformer Networks
Sijie Yan
Ziwei Liu
Ping Luo
Shi Qiu
Xiaogang Wang
Xiaoou Tang
42
59
0
07 Aug 2017
Joint Face Detection and Alignment using Multi-task Cascaded Convolutional Networks
Kaipeng Zhang
Zhanpeng Zhang
Zhifeng Li
Yu Qiao
CVBM
176
4,969
0
11 Apr 2016
Cross-domain Image Retrieval with a Dual Attribute-aware Ranking Network
Junshi Huang
Rogerio Feris
Qiang Chen
Shuicheng Yan
60
414
0
29 May 2015
VQA: Visual Question Answering
Aishwarya Agrawal
Jiasen Lu
Stanislaw Antol
Margaret Mitchell
C. L. Zitnick
Dhruv Batra
Devi Parikh
CoGe
211
5,497
0
03 May 2015
Show, Attend and Tell: Neural Image Caption Generation with Visual Attention
Ke Xu
Jimmy Ba
Ryan Kiros
Kyunghyun Cho
Aaron Courville
Ruslan Salakhutdinov
R. Zemel
Yoshua Bengio
DiffM
348
10,079
0
10 Feb 2015
Show and Tell: A Neural Image Caption Generator
Oriol Vinyals
Alexander Toshev
Samy Bengio
D. Erhan
3DV
249
6,035
0
17 Nov 2014