ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2109.02401
  4. Cited By
Vision Guided Generative Pre-trained Language Models for Multimodal
  Abstractive Summarization

Vision Guided Generative Pre-trained Language Models for Multimodal Abstractive Summarization

6 September 2021
Tiezheng Yu
Wenliang Dai
Zihan Liu
Pascale Fung
ArXivPDFHTML

Papers citing "Vision Guided Generative Pre-trained Language Models for Multimodal Abstractive Summarization"

18 / 18 papers shown
Title
Large Scale Generative Multimodal Attribute Extraction for E-commerce
  Attributes
Large Scale Generative Multimodal Attribute Extraction for E-commerce Attributes
Anant Khandelwal
Happy Mittal
S. Kulkarni
D. Gupta
17
9
0
01 Jun 2023
Learning Summary-Worthy Visual Representation for Abstractive
  Summarization in Video
Learning Summary-Worthy Visual Representation for Abstractive Summarization in Video
Zenan Xu
Xiaojun Meng
Yasheng Wang
Qinliang Su
Zexuan Qiu
Xin Jiang
Qun Liu
22
3
0
08 May 2023
Understanding Social Media Cross-Modality Discourse in Linguistic Space
Understanding Social Media Cross-Modality Discourse in Linguistic Space
Chunpu Xu
Hanzhuo Tan
Jing Li
Piji Li
21
5
0
26 Feb 2023
Summary-Oriented Vision Modeling for Multimodal Abstractive
  Summarization
Summary-Oriented Vision Modeling for Multimodal Abstractive Summarization
Yunlong Liang
Fandong Meng
Jinan Xu
Jiaan Wang
Yufeng Chen
Jie Zhou
25
19
0
15 Dec 2022
Grafting Pre-trained Models for Multimodal Headline Generation
Grafting Pre-trained Models for Multimodal Headline Generation
Lingfeng Qiao
Chen Wu
Ye Liu
Haoyuan Peng
Di Yin
Bo Ren
35
5
0
14 Nov 2022
Plausible May Not Be Faithful: Probing Object Hallucination in
  Vision-Language Pre-training
Plausible May Not Be Faithful: Probing Object Hallucination in Vision-Language Pre-training
Wenliang Dai
Zihan Liu
Ziwei Ji
Dan Su
Pascale Fung
MLLM
VLM
29
62
0
14 Oct 2022
Hierarchical3D Adapters for Long Video-to-text Summarization
Hierarchical3D Adapters for Long Video-to-text Summarization
Pinelopi Papalampidi
Mirella Lapata
VGen
27
12
0
10 Oct 2022
Every picture tells a story: Image-grounded controllable stylistic story
  generation
Every picture tells a story: Image-grounded controllable stylistic story generation
Holy Lovenia
Bryan Wilie
Romain Barraud
Samuel Cahyawijaya
Willy Chung
Pascale Fung
19
8
0
04 Sep 2022
Interpreting Song Lyrics with an Audio-Informed Pre-trained Language
  Model
Interpreting Song Lyrics with an Audio-Informed Pre-trained Language Model
Yixiao Zhang
Junyan Jiang
Gus Xia
S. Dixon
25
9
0
24 Aug 2022
Kaggle Competition: Cantonese Audio-Visual Speech Recognition for In-car
  Commands
Kaggle Competition: Cantonese Audio-Visual Speech Recognition for In-car Commands
Wenliang Dai
Samuel Cahyawijaya
Tiezheng Yu
Elham J. Barezi
Pascale Fung
16
1
0
06 Jul 2022
An Empirical Survey on Long Document Summarization: Datasets, Models and
  Metrics
An Empirical Survey on Long Document Summarization: Datasets, Models and Metrics
Huan Yee Koh
Jiaxin Ju
Ming Liu
Shirui Pan
73
122
0
03 Jul 2022
Enabling Multimodal Generation on CLIP via Vision-Language Knowledge
  Distillation
Enabling Multimodal Generation on CLIP via Vision-Language Knowledge Distillation
Wenliang Dai
Lu Hou
Lifeng Shang
Xin Jiang
Qun Liu
Pascale Fung
VLM
22
90
0
12 Mar 2022
Learning Cluster Patterns for Abstractive Summarization
Learning Cluster Patterns for Abstractive Summarization
Sung-Guk Jo
Jeong-Jae Kim
Byung-Won On
19
3
0
22 Feb 2022
Speech Summarization using Restricted Self-Attention
Speech Summarization using Restricted Self-Attention
Roshan S. Sharma
Shruti Palaskar
A. Black
Florian Metze
24
33
0
12 Oct 2021
Unifying Vision-and-Language Tasks via Text Generation
Unifying Vision-and-Language Tasks via Text Generation
Jaemin Cho
Jie Lei
Hao Tan
Mohit Bansal
MLLM
256
525
0
04 Feb 2021
VinVL: Revisiting Visual Representations in Vision-Language Models
VinVL: Revisiting Visual Representations in Vision-Language Models
Pengchuan Zhang
Xiujun Li
Xiaowei Hu
Jianwei Yang
Lei Zhang
Lijuan Wang
Yejin Choi
Jianfeng Gao
ObjD
VLM
260
157
0
02 Jan 2021
Unified Vision-Language Pre-Training for Image Captioning and VQA
Unified Vision-Language Pre-Training for Image Captioning and VQA
Luowei Zhou
Hamid Palangi
Lei Zhang
Houdong Hu
Jason J. Corso
Jianfeng Gao
MLLM
VLM
252
927
0
24 Sep 2019
Effective Approaches to Attention-based Neural Machine Translation
Effective Approaches to Attention-based Neural Machine Translation
Thang Luong
Hieu H. Pham
Christopher D. Manning
218
7,923
0
17 Aug 2015
1