The Coherence Trap: When MLLM-Crafted Narratives Exploit Manipulated Visual Contexts

23 May 2025
Yuchen Zhang
Yaxiong Wang
Yujiao Wu
Lianwei Wu
Li Zhu
    AAML

Papers citing "The Coherence Trap: When MLLM-Crafted Narratives Exploit Manipulated Visual Contexts"

40 / 40 papers shown
LMM-R1: Empowering 3B LMMs with Strong Reasoning Abilities Through Two-Stage Rule-Based RL
Yingzhe Peng
Gongrui Zhang
Miaosen Zhang
Zhiyuan You
Jie Liu
Qipeng Zhu
Kai Yang
Xingzhong Xu
Xin Geng
Xu Yang
LRM, ReLM
218
89
0
10 Mar 2025
Qwen2.5-VL Technical Report
S. Bai
Keqin Chen
Xuejing Liu
Jialin Wang
Wenbin Ge
...
Zesen Cheng
Hang Zhang
Zhibo Yang
Haiyang Xu
Junyang Lin
VLM
375
699
0
20 Feb 2025
ASAP: Advancing Semantic Alignment Promotes Multi-Modal Manipulation Detecting and Grounding
Zhenxing Zhang
Yijiao Wang
Lechao Cheng
Zhun Zhong
Dan Guo
Meng Wang
91
1
0
17 Dec 2024
DeepSeek-VL2: Mixture-of-Experts Vision-Language Models for Advanced Multimodal Understanding
Z. F. Wu
Xiaokang Chen
Zizheng Pan
Xianglong Liu
Wen Liu
...
Xingkai Yu
Haowei Zhang
Liang Zhao
Yijiao Wang
Chong Ruan
MLLM, VLM, MoE
188
158
0
13 Dec 2024
GPT-4o System Card
OpenAI
Aaron Hurst
Adam Lerer
Adam P. Goucher
...
Yuchen He
Yuchen Zhang
Yujia Jin
Yunxing Dai
Yury Malkov
MLLM
235
1,038
0
25 Oct 2024
MiRAGeNews: Multimodal Realistic AI-Generated News Detection
Runsheng Huang
Liam Dugan
Yue Yang
Chris Callison-Burch
90
4
0
11 Oct 2024
mPLUG-Owl3: Towards Long Image-Sequence Understanding in Multi-Modal Large Language Models
Jiabo Ye
Haiyang Xu
Haowei Liu
Anwen Hu
Ming Yan
Qi Qian
Ji Zhang
Fei Huang
Jingren Zhou
MLLM, VLM
84
139
0
09 Aug 2024
Cambrian-1: A Fully Open, Vision-Centric Exploration of Multimodal LLMs
Shengbang Tong
Ellis L Brown
Penghao Wu
Sanghyun Woo
Manoj Middepogu
...
Xichen Pan
Austin Wang
Rob Fergus
Yann LeCun
Saining Xie
3DV, MLLM
152
378
0
24 Jun 2024
Florence-2: Advancing a Unified Representation for a Variety of Vision Tasks
Bin Xiao
Haiping Wu
Weijian Xu
Xiyang Dai
Houdong Hu
Yumao Lu
Michael Zeng
Ce Liu
Lu Yuan
VLM
107
172
0
10 Nov 2023
Qwen Technical Report
Jinze Bai
Shuai Bai
Yunfei Chu
Zeyu Cui
Kai Dang
...
Zhenru Zhang
Chang Zhou
Jingren Zhou
Xiaohuan Zhou
Tianhang Zhu
OSLM
268
1,908
0
28 Sep 2023
Detecting and Grounding Multi-Modal Media Manipulation and Beyond
Rui Shao
Tianxing Wu
Jianlong Wu
Liqiang Nie
Ziwei Liu
72
27
0
25 Sep 2023
Unified Frequency-Assisted Transformer Framework for Detecting and Grounding Multi-Modal Manipulation
Huan Liu
Zichang Tan
Qiang Chen
Yunchao Wei
Yao-Min Zhao
Jingdong Wang
53
9
0
18 Sep 2023
Exploring Diverse In-Context Configurations for Image Captioning
Xu Yang
Yongliang Wu
Mingzhuo Yang
Haokun Chen
Xin Geng
MLLM
84
63
0
24 May 2023
Visual Instruction Tuning
Haotian Liu
Chunyuan Li
Qingyang Wu
Yong Jae Lee
SyDa, VLM, MLLM
571
4,925
0
17 Apr 2023
Detecting and Grounding Multi-Modal Media Manipulation
Rui Shao
Tianxing Wu
Ziwei Liu
96
68
0
05 Apr 2023
GPT-4 Technical Report
OpenAI
Josh Achiam
Steven Adler
Sandhini Agarwal
Lama Ahmad
...
Shengjia Zhao
Tianhao Zheng
Juntang Zhuang
William Zhuk
Barret Zoph
LLMAG, MLLM
1.5K
14,761
0
15 Mar 2023
BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models
Junnan Li
Dongxu Li
Silvio Savarese
Steven C. H. Hoi
VLM, MLLM
432
4,656
0
30 Jan 2023
Fine-Grained Face Swapping via Regional GAN Inversion
Zhian Liu
Maomao Li
Yong Zhang
Cairong Wang
Qi Zhang
Jue Wang
Yongwei Nie
CVBM
91
58
0
25 Nov 2022
Detecting and Recovering Sequential DeepFake Manipulation
Rui Shao
Tianxing Wu
Ziwei Liu
AAML
96
42
0
05 Jul 2022
Flamingo: a Visual Language Model for Few-Shot Learning
Jean-Baptiste Alayrac
Jeff Donahue
Pauline Luc
Antoine Miech
Iain Barr
...
Mikolaj Binkowski
Ricardo Barreira
Oriol Vinyals
Andrew Zisserman
Karen Simonyan
MLLM, VLM
418
3,610
0
29 Apr 2022
DaViT: Dual Attention Vision Transformers
Mingyu Ding
Bin Xiao
Noel Codella
Ping Luo
Jingdong Wang
Lu Yuan
ViT
148
251
0
07 Apr 2022
Zoom Out and Observe: News Environment Perception for Fake News Detection
Qiang Sheng
Juan Cao
Xueyao Zhang
Rundong Li
Danding Wang
Yongchun Zhu
EgoV
98
77
0
21 Mar 2022
Faking Fake News for Real Fake News Detection: Propaganda-loaded Training Data Generation
Kung-Hsiang Huang
Kathleen McKeown
Preslav Nakov
Yejin Choi
Heng Ji
105
63
0
10 Mar 2022
High-Fidelity GAN Inversion for Image Attribute Editing
Tengfei Wang
Yong Zhang
Yanbo Fan
Jue Wang
Qifeng Chen
DiffM
273
251
0
14 Sep 2021
N24News: A New Dataset for Multimodal News Classification
Zhen Wang
Xu Shan
Xiangxie Zhang
Jie Yang
VLM
105
38
0
30 Aug 2021
SimSwap: An Efficient Framework For High Fidelity Face Swapping
Renwang Chen
Xuanhong Chen
Bingbing Ni
Yanhao Ge
CVBM
80
315
0
11 Jun 2021
NewsCLIPpings: Automatic Generation of Out-of-Context Multimodal Media
Grace Luo
Trevor Darrell
Anna Rohrbach
53
94
0
13 Apr 2021
StyleCLIP: Text-Driven Manipulation of StyleGAN Imagery
Or Patashnik
Zongze Wu
Eli Shechtman
Daniel Cohen-Or
Dani Lischinski
CLIP, VLM
138
1,211
0
31 Mar 2021
Learning Transferable Visual Models From Natural Language Supervision
Alec Radford
Jong Wook Kim
Chris Hallacy
Aditya A. Ramesh
Gabriel Goh
...
Amanda Askell
Pamela Mishkin
Jack Clark
Gretchen Krueger
Ilya Sutskever
CLIP, VLM
1.0K
29,926
0
26 Feb 2021
Scaling Up Visual and Vision-Language Representation Learning With Noisy Text Supervision
Chao Jia
Yinfei Yang
Ye Xia
Yi-Ting Chen
Zarana Parekh
Hieu H. Pham
Quoc V. Le
Yun-hsuan Sung
Zhen Li
Tom Duerig
VLM, CLIP
475
3,906
0
11 Feb 2021
ViLT: Vision-and-Language Transformer Without Convolution or Region Supervision
Wonjae Kim
Bokyung Son
Ildoo Kim
VLM, CLIP
139
1,761
0
05 Feb 2021
Learning Self-Consistency for Deepfake Detection
Tianchen Zhao
Xiang Xu
Mingze Xu
Hui Ding
Yuanjun Xiong
Wei Xia
SSL
91
272
0
16 Dec 2020
Visual News: Benchmark and Challenges in News Image Captioning
Fuxiao Liu
Yinghan Wang
Tianlu Wang
Vicente Ordonez
VLM
74
116
0
08 Oct 2020
Face X-ray for More General Face Forgery Detection
Lingzhi Li
Jianmin Bao
Ting Zhang
Hao Yang
Dong Chen
Fang Wen
B. Guo
PICV, CVBM
83
850
0
31 Dec 2019
Defending Against Neural Fake News
Rowan Zellers
Ari Holtzman
Hannah Rashkin
Yonatan Bisk
Ali Farhadi
Franziska Roesner
Yejin Choi
AAML
140
1,032
0
29 May 2019
Good News, Everyone! Context driven entity-aware captioning for news images
Ali Furkan Biten
Lluís Gómez
Marçal Rusiñol
Dimosthenis Karatzas
84
141
0
02 Apr 2019
Generalized Intersection over Union: A Metric and A Loss for Bounding Box Regression
S. Hamid Rezatofighi
Deyuan Li
JunYoung Gwak
Amir Sadeghian
Ian Reid
Silvio Savarese
154
4,186
0
25 Feb 2019
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
Jacob Devlin
Ming-Wei Chang
Kenton Lee
Kristina Toutanova
VLM, SSL, SSeg
1.8K
95,324
0
11 Oct 2018
Deep Multimodal Image-Repurposing Detection
Ekraam Sabir
Wael AbdAlmageed
Yue Wu
Premkumar Natarajan
38
54
0
20 Aug 2018
"Liar, Liar Pants on Fire": A New Benchmark Dataset for Fake News Detection
William Yang Wang
HILM, GNN
115
1,365
0
01 May 2017