CommerceMM: Large-Scale Commerce MultiModal Representation Learning with Omni Retrieval

15 February 2022
Licheng Yu
Jun Chen
Animesh Sinha
Mengjiao MJ Wang
Hugo Chen
Tamara L. Berg
Ning Zhang
    VLM

Papers citing "CommerceMM: Large-Scale Commerce MultiModal Representation Learning with Omni Retrieval"

10 / 10 papers shown
Multi-Modality Transformer for E-Commerce: Inferring User Purchase Intention to Bridge the Query-Product Gap
Srivatsa Mallapragada
Ying Xie
Varsha Rani Chawan
Zeyad Hailat
Yuanbo Wang
36
0
0
28 Jan 2025
Model-as-a-Service (MaaS): A Survey
Wensheng Gan
Shicheng Wan
Philip S. Yu
21
21
0
10 Nov 2023
LRVS-Fashion: Extending Visual Search with Referring Instructions
Simon Lepage
Jérémie Mary
David Picard
23
1
0
05 Jun 2023
Unified Vision-Language Representation Modeling for E-Commerce Same-Style Products Retrieval
Ben Chen
Linbo Jin
Xinxin Wang
D. Gao
Wen Jiang
Wei Ning
14
3
0
10 Feb 2023
Multimodal Learning with Transformers: A Survey
P. Xu
Xiatian Zhu
David A. Clifton
ViT
43
525
0
13 Jun 2022
Relational Representation Learning in Visually-Rich Documents
Xin Li
Yan Zheng
Yiqing Hu
H. Cao
Yunfei Wu
Deqiang Jiang
Yinsong Liu
Bo Ren
16
12
0
05 May 2022
Masked Autoencoders Are Scalable Vision Learners
Kaiming He
Xinlei Chen
Saining Xie
Yanghao Li
Piotr Dollár
Ross B. Girshick
ViT
TPM
302
7,434
0
11 Nov 2021
VATT: Transformers for Multimodal Self-Supervised Learning from Raw Video, Audio and Text
Hassan Akbari
Liangzhe Yuan
Rui Qian
Wei-Hong Chuang
Shih-Fu Chang
Yin Cui
Boqing Gong
ViT
245
577
0
22 Apr 2021
Scaling Up Visual and Vision-Language Representation Learning With Noisy Text Supervision
Chao Jia
Yinfei Yang
Ye Xia
Yi-Ting Chen
Zarana Parekh
Hieu H. Pham
Quoc V. Le
Yun-hsuan Sung
Zhen Li
Tom Duerig
VLM
CLIP
293
3,693
0
11 Feb 2021
Unified Vision-Language Pre-Training for Image Captioning and VQA
Luowei Zhou
Hamid Palangi
Lei Zhang
Houdong Hu
Jason J. Corso
Jianfeng Gao
MLLM
VLM
250
927
0
24 Sep 2019