GPT-4o System Card

25 October 2024
OpenAI:
Aaron Hurst
Adam Lerer
Adam P. Goucher
Adam Perelman
Aditya A. Ramesh
Aidan Clark
AJ Ostrow
Akila Welihinda
Alan Hayes
Alec Radford
Aleksander Mądry
Alex Baker-Whitcomb
Alex Beutel
Alex Borzunov
Alex Carney
Alex Chow
Alex Kirillov
Alex Nichol
Alex Paino
Alex Renzin
Alex Tachard Passos
A. Kirillov
Alexi Christakis
Alexis Conneau
Ali Kamali
Allan Jabri
Allison Moyer
Allison Tam
Amadou Crookes
Amin Tootoonchian
A. Kumar
Andrea Vallone
Andrej Karpathy
Andrew Braunstein
Andrew Cann
Andrew Codispoti
Andrew Galu
Andrew Kondrich
Andrew Tulloch
Andrey Mishchenko
Angela Baek
Angela Jiang
Antoine Pelisse
Antonia Woodford
Anuj Gosalia
Arka Dhar
Ashley Pantuliano
A. Nayak
Avital Oliver
Barret Zoph
Behrooz Ghorbani
Ben Leimberger
Ben Rossen
Ben Sokolowsky
Ben Wang
Benjamin Zweig
Beth Hoover
Blake Samic
Bob McGrew
B. S.
Bogo Giertler
Bowen Cheng
Brad Lightcap
Brandon Walkin
Brendan Quinn
Brian Guarraci
Brian Hsu
Bright Kellogg
Brydon Eastman
Camillo Lugaresi
Carroll L. Wainwright
Cary Bassin
Cary Hudson
Casey Chu
Chad Nelson
Chak Ming Li
Chan Jun Shern
Channing Conger
Charlotte Barette
Chelsea Voss
Chen Ding
Cheng Lu
Chong Zhang
Chris Beaumont
Chris Hallacy
Chris Koch
C. Gibson
Christina Kim
C. Choi
C. McLeavey
Christopher Hesse
Claudia Fischer
Clemens Winter
Coley Czarnecki
Colin Jarvis
Colin Wei
Constantin Koumouzelis
Dane Sherburn
Daniel Kappler
Daniel Levin
Daniel Levy
David Carr
David Farhi
David A. Mély
David Robinson
David Sasaki
Denny Jin
Dev Valladares
Dimitris Tsipras
D. Li
D. Nguyen
Duncan Findlay
Edede Oiwoh
E. Wong
Ehsan Asdar
Elizabeth Proehl
E. Yang
Eric Antonow
Eric Kramer
Eric Peterson
Eric Sigler
Eric Wallace
E. Brevdo
Evan Mays
Farzad Khorasani
F. Such
Filippo Raso
Francis Zhang
Fred von Lohmann
Freddie Sulit
Gabriel Goh
Gene Oden
Geoff Salmon
Giulio Starace
Greg Brockman
Hadi Salman
Haiming Bao
Haitang Hu
Hannah Wong
Haoyu Wang
Heather Schmidt
Heather Whitney
Heewoo Jun
Hendrik Kirchner
Henrique Pondé de Oliveira Pinto
Hongyu Ren
Huiwen Chang
Hyung Won Chung
Ian Kivlichan
Ian O'Connell
Ian Osband
Ian Silber
Ian Sohl
Ibrahim Okuyucu
Ikai Lan
Ilya Kostrikov
Ilya Sutskever
I. Kanitscheider
Ishaan Gulrajani
Jacob Coxon
Jacob Menick
J. Pachocki
James Aung
James Betker
James Crooks
James Lennon
J. Kiros
Jan Leike
Jane Park
Jason Kwon
Jason Phang
Jason Teplitz
Jason W. Wei
Jason Wolfe
Jianfei Chen
Jeff Harris
Jenia Varavva
Jessica Gan Lee
Jessica Shieh
Ji Lin
Jiahui Yu
Jiayi Weng
Jie Tang
Jieqi Yu
Joanne Jang
Joaquin Quiñonero Candela
Joe Beutler
Joe Landers
Joel Parish
Johannes Heidecke
John Schulman
Jonathan Lachman
Jonathan McKay
J. Uesato
Jonathan Ward
Jong Wook Kim
Joost Huizinga
Jordan Sitkin
Jos Kraaijeveld
Josh Gross
Josh Kaplan
Josh Bleecher Snyder
Joshua Achiam
Joy Jiao
J. Lee
Juntang Zhuang
Justyn Harriman
Kai Fricke
Kai Hayashi
K. Singhal
Katy Shi
Kemal Kurniawan
Kayla Wood
Kendra Rimbach
K. Hsu
Kenny Nguyen
Keren Gu-Lemberg
Kevin Button
Kevin Liu
Kiel Howe
Krithika Muthukumar
Kyle Luther
Lama Ahmad
Larry Kai
Lauren Itow
Lauren Workman
Leher Pathak
Lu Chen
Li Jing
Lia Guy
Liam Fedus
Liang Zhou
Lien Mamitsuka
Lilian Weng
Lindsay McCallum
Lindsey Held
Long Ouyang
Louis Feuvrier
Lu Zhang
Lukas Kondraciuk
Lukasz Kaiser
Luke Hewitt
Luke Metz
Lyric Doshi
Mada Aflak
Maddie Simens
Madelaine Boyd
Madeleine Thompson
Marat Dukhan
Mark Chen
Mark Gray
Mark Hudnall
Marvin Zhang
Marwan Aljubeh
Mateusz Litwin
Matthew Zeng
Max Johnson
Maya Shetty
Mayank Gupta
Meghan Shah
Mehmet Yatbaz
M. Yang
Mengchao Zhong
Mia Glaese
Mianna Chen
Michael Janner
Michael Lampe
Michael Petrov
Michael Wu
Michele Wang
Michelle Fradin
Michelle Pokrass
Miguel Castro
M. Castro
Mikhail Pavlov
Miles Brundage
Ming Wang
Minal Khan
Mira Murati
Mo Bavarian
Molly Lin
Murat Yesildal
Nacho Soto
N. Gimelshein
Natalie Cone
Natalie Staudacher
Natalie Summers
Natan LaFontaine
Neil Chowdhury
Nick Ryder
Nick Stathas
Nick Turley
Nik Tezak
Niko Felix
Nithanth Kudige
N. Keskar
Noah Deutsch
Noel Bundick
Nora Puckett
Ofir Nachum
Ola Okelola
Oleg Boiko
O. Murk
Oliver Jaffe
Olivia Watkins
Olivier Godement
Owen Campbell-Moore
Patrick Chao
Paul McMillan
Pavel Belov
Peng Su
Peter Bak
Peter Bakkum
Peter Deng
Peter Dolan
Peter Hoeschele
Peter Welinder
Phil Tillet
Philip Pronin
Philippe Tillet
Prafulla Dhariwal
Qiming Yuan
Rachel Dias
Rachel Lim
Rahul Arora
R. Troll
Randall Lin
Rapha Gontijo Lopes
Raul Puri
Reah Miyara
R. Leike
Renaud Gaubert
Reza Zamani
Ricky Wang
Rob Donnelly
Rob Honsby
Rocky Smith
Rohan Sahai
Rohit Ramchandani
Romain Huet
Rory Carmichael
Rowan Zellers
Roy Chen
Ruby Chen
Ruslan Nigmatullin
Ryan Cheu
Saachi Jain
Sam Altman
S. Schoenholz
Sam Toizer
Samuel Miserendino
Sandhini Agarwal
Sara Culver
Scott Ethersmith
Scott Gray
Sean Grove
Sean Metzger
Shamez Hermani
Shantanu Jain
Shengjia Zhao
Sherwin Wu
Shino Jomoto
Shirong Wu
Shri Kiran Srinivasan
X. Xia
Sonia Phene
Spencer Papay
Srinivas Narayanan
Steve Coffey
Seanie Lee
Stewart Hall
S. Balaji
Tal Broda
Tal Stramer
Tao Xu
Tarun Gogineni
Taya Christianson
Ted Sanders
Tejal Patwardhan
Thomas Cunninghman
Thomas Degry
Thomas Dimson
Thomas Raoux
Thomas Shadwell
Tianhao Zheng
Todd Underwood
Todor Markov
Toki Sherbakov
Tom Rubin
Tom Stasi
Tomer Kaftan
Tristan Heywood
Troy Peterson
Tyce Walters
Tyna Eloundou
Valerie Qi
Veit Moeller
Vinnie Monaco
Vishal Kuo
Vlad Fomenko
W. Chang
W. J. Zheng
Wenda Zhou
Wesam Manassra
Will Sheu
Wojciech Zaremba
Yash Patil
Y. Qian
Yongjik Kim
Youlong Cheng
Yu Zhang
Yuchen He
Yuchen Zhang
Yujia Jin
Yunxing Dai
Yury Malkov
    MLLM

Papers citing "GPT-4o System Card"

50 / 206 papers shown
MVPainter: Accurate and Detailed 3D Texture Generation via Multi-View Diffusion with Geometric Control
Mingqi Shao
Feng Xiong
Zhaoxu Sun
Mu Xu
DiffM
12
0
0
19 May 2025
PANORAMA: A synthetic PII-laced dataset for studying sensitive data memorization in LLMs
Sriram Selvam
Anneswa Ghosh
7
0
0
18 May 2025
BeliefNest: A Joint Action Simulator for Embodied Agents with Theory of Mind
Rikunari Sagara
Koichiro Terao
Naoto Iwahashi
LM&Ro
12
0
0
18 May 2025
CompBench: Benchmarking Complex Instruction-guided Image Editing
Bohan Jia
Wenxuan Huang
Yuntian Tang
Junbo Qiao
Jincheng Liao
...
Lin Chen
Fei Zhao
Zihan Wang
Yuan Xie
Shaohui Lin
CoGe
14
0
0
18 May 2025
SOCIA: An End-to-End Agentic Framework for Automated Cyber-Physical-Social Simulator Generation
Yuncheng Hua
Ji Miao
Mehdi Jafari
Jianxiang Xie
Hao Xue
Flora D. Salim
7
0
0
17 May 2025
SafeVid: Toward Safety Aligned Video Large Multimodal Models
Yixu Wang
Jiaxin Song
Yifeng Gao
Xin Wang
Yang Yao
Yan Teng
Xingjun Ma
Yingchun Wang
Yu-Gang Jiang
12
0
0
17 May 2025
Group-in-Group Policy Optimization for LLM Agent Training
Lang Feng
Zhenghai Xue
Tingcong Liu
Bo An
OffRL
17
0
0
16 May 2025
Audio Turing Test: Benchmarking the Human-likeness of Large Language Model-based Text-to-Speech Systems in Chinese
Xueliang Wang
Ziyi Zhao
Siyu Ren
Shao Zhang
Song Li
...
Lin Qiu
Guanglu Wan
Xuezhi Cao
Xunliang Cai
Weinan Zhang
ALM
32
0
0
16 May 2025
CAMEO: Collection of Multilingual Emotional Speech Corpora
Iwona Christop
Maciej Czajka
19
0
0
16 May 2025
Disentangling Reasoning and Knowledge in Medical Large Language Models
Rahul Thapa
Qingyang Wu
Kevin Wu
Harrison Zhang
Angela Zhang
...
Joseph Boen
Shriya Reddy
Ben Athiwaratkun
Shuaiwen Leon Song
James Zou
ELM
AI4MH
LM&MA
LRM
25
0
0
16 May 2025
PsOCR: Benchmarking Large Multimodal Models for Optical Character Recognition in Low-resource Pashto Language
Ijazul Haq
Yingjie Zhang
Irfan Ali Khan
34
0
0
15 May 2025
Task-Core Memory Management and Consolidation for Long-term Continual Learning
Tianyu Huai
Jie Zhou
Yuxuan Cai
Qin Chen
Wen Wu
Xingjiao Wu
Xipeng Qiu
Liang He
CLL
33
0
0
15 May 2025
Beyond General Prompts: Automated Prompt Refinement using Contrastive Class Alignment Scores for Disambiguating Objects in Vision-Language Models
Lucas Choi
Ross Greer
VLM
33
0
0
14 May 2025
GlobalMood: A cross-cultural benchmark for music emotion recognition
Harin Lee
Elif Celen
Peter M. C. Harrison
Manuel Anglada-Tort
Pol van Rijn
Minsu Park
Marc Schönwiesner
Nori Jacoby
32
0
0
14 May 2025
CLTP: Contrastive Language-Tactile Pre-training for 3D Contact Geometry Understanding
Wenxuan Ma
Xiaoge Cao
Yujie Zhang
Chaofan Zhang
Shaobo Yang
Peng Hao
Bin Fang
Yinghao Cai
Shaowei Cui
Shuo Wang
36
0
0
13 May 2025
LLM-based Prompt Ensemble for Reliable Medical Entity Recognition from EHRs
K. M. S. Islam
A. S. Nipu
Jiawei Wu
Praveen Madiraju
46
0
0
13 May 2025
From Seeing to Doing: Bridging Reasoning and Decision for Robotic Manipulation
Yifu Yuan
Haiqin Cui
Yibin Chen
Zibin Dong
Fei Ni
Longxin Kou
Jinyi Liu
Pengyi Li
Yan Zheng
Jianye Hao
31
0
0
13 May 2025
MLE-Dojo: Interactive Environments for Empowering LLM Agents in Machine Learning Engineering
Rushi Qiang
Yuchen Zhuang
Yinghao Li
D. Kilman
Rongzhi Zhang
...
Ian Shu-Hei Wong
Sherry Yang
Percy Liang
Chao Zhang
Bo Dai
ELM
41
0
0
12 May 2025
Step1X-3D: Towards High-Fidelity and Controllable Generation of Textured 3D Assets
Weiyu Li
Xuanyang Zhang
Zheng Sun
Di Qi
Yiming Li
...
Zeming Li
Gang Yu
Xiangyu Zhang
Daxin Jiang
Ping Tan
48
0
0
12 May 2025
FalseReject: A Resource for Improving Contextual Safety and Mitigating Over-Refusals in LLMs via Structured Reasoning
Zhehao Zhang
Weijie Xu
Fanyou Wu
Chandan K. Reddy
29
0
0
12 May 2025
UniDiffGrasp: A Unified Framework Integrating VLM Reasoning and VLM-Guided Part Diffusion for Open-Vocabulary Constrained Grasping with Dual Arms
Xueyang Guo
Hongwei Hu
Chengye Song
J. Chen
Zilin Zhao
Yu Fu
Bowen Guan
Zhenze Liu
31
0
0
11 May 2025
LLM-Augmented Chemical Synthesis and Design Decision Programs
Haorui Wang
Jeff Guo
Lingkai Kong
R. Ramprasad
Philippe Schwaller
Yuanqi Du
Chao Zhang
36
0
0
11 May 2025
Multi-Modal Explainable Medical AI Assistant for Trustworthy Human-AI Collaboration
Honglong Yang
Shanshan Song
Yi Qin
Lehan Wang
Haonan Wang
Xinpeng Ding
Qixiang Zhang
Bodong Du
Xuelong Li
LM&MA
34
0
0
11 May 2025
System Prompt Poisoning: Persistent Attacks on Large Language Models Beyond User Injection
Jiawei Guo
Haipeng Cai
SILM
AAML
29
0
0
10 May 2025
Describe Anything in Medical Images
Xi Xiao
Yunbei Zhang
Thanh-Huy Nguyen
Ba Thinh Lam
Janet Wang
...
Xingjian Li
Xiaobei Wang
Hao Xu
Tianming Liu
Min Xu
MedIm
VLM
51
0
0
09 May 2025
LLMs Get Lost In Multi-Turn Conversation
Philippe Laban
Hiroaki Hayashi
Yingbo Zhou
Jennifer Neville
50
1
0
09 May 2025
Flow-GRPO: Training Flow Matching Models via Online RL
Jie Liu
Gongye Liu
Jiajun Liang
Yong Li
Jiaheng Liu
Xueliang Wang
Pengfei Wan
Di Zhang
Wanli Ouyang
AI4CE
73
0
0
08 May 2025
G-FOCUS: Towards a Robust Method for Assessing UI Design Persuasiveness
Jaehyun Jeon
Janghan Yoon
Minsoo Kim
Sumin Shim
Yejin Choi
Hanbin Kim
Youngjae Yu
AAML
47
0
0
08 May 2025
StreamBridge: Turning Your Offline Video Large Language Model into a Proactive Streaming Assistant
Haibo Wang
Bo Feng
Zhengfeng Lai
Mingze Xu
Shiyu Li
Weifeng Ge
Afshin Dehghan
Meng Cao
Ping Huang
OffRL
57
0
0
08 May 2025
HiBayES: A Hierarchical Bayesian Modeling Framework for AI Evaluation Statistics
Lennart Luettgau
Harry Coppock
Magda Dubois
Christopher Summerfield
Cozmin Ududec
31
0
0
08 May 2025
R^3-VQA: "Read the Room" by Video Social Reasoning
Lixing Niu
Jiapeng Li
Xingping Yu
Shu Wang
Ruining Feng
Bo Wu
Ping Wei
Yansen Wang
Lifeng Fan
51
0
0
07 May 2025
"I Can See Forever!": Evaluating Real-time VideoLLMs for Assisting Individuals with Visual Impairments
Zhe Zhang
Zhen Sun
Zhenru Zhang
Zifan Peng
Yuemeng Zhao
Zihan Wang
Zeren Luo
Ruiting Zuo
Xinlei He
42
0
0
07 May 2025
Unified Multimodal Chain-of-Thought Reward Model through Reinforcement Fine-Tuning
Yibin Wang
Zhimin Li
Yuhang Zang
Chunyu Wang
Qinglin Lu
Cheng Jin
Jize Wang
LRM
48
2
0
06 May 2025
Graph Drawing for LLMs: An Empirical Evaluation
Walter Didimo
Fabrizio Montecchiani
Tommaso Piselli
54
0
0
06 May 2025
Meta-Optimization and Program Search using Language Models for Task and Motion Planning
Denis Shcherba
Eckart Cobo-Briesewitz
Cornelius V. Braun
Marc Toussaint
LM&Ro
LRM
46
0
0
06 May 2025
WebGen-Bench: Evaluating LLMs on Generating Interactive and Functional Websites from Scratch
Zimu Lu
Yiran Yang
Houxing Ren
Haotian Hou
Han Xiao
Ke Wang
Weikang Shi
Aojun Zhou
Mingjie Zhan
Yiming Li
LLMAG
47
0
0
06 May 2025
LogisticsVLN: Vision-Language Navigation For Low-Altitude Terminal Delivery Based on Agentic UAVs
Xinyuan Zhang
Yonglin Tian
Fei Lin
Yue Liu
Jing Ma
Kornélia Sára Szatmáry
Fei Wang
48
0
0
06 May 2025
RoboOS: A Hierarchical Embodied Framework for Cross-Embodiment and Multi-Agent Collaboration
Huajie Tan
Xiaoshuai Hao
Minglan Lin
Pengwei Wang
Yaoxu Lyu
Mingyu Cao
Zhongyuan Wang
Shanghang Zhang
LM&Ro
48
0
0
06 May 2025
Voila: Voice-Language Foundation Models for Real-Time Autonomous Interaction and Voice Role-Play
Yemin Shi
Yu Shu
Siwei Dong
Guangyi Liu
Jaward Sesay
Jingwen Li
Zhiting Hu
AuLLM
VLM
50
0
0
05 May 2025
Scenethesis: A Language and Vision Agentic Framework for 3D Scene Generation
Lu Ling
C. Lin
Nayeon Lee
Yin Cui
Y. Zeng
Yichen Sheng
Yunhao Ge
Ming Liu
Aniket Bera
Zhaoshuo Li
VGen
3DV
61
0
0
05 May 2025
RM-R1: Reward Modeling as Reasoning
Xiusi Chen
Gaotang Li
Zehua Wang
Bowen Jin
Cheng Qian
...
Y. Zhang
D. Zhang
Tong Zhang
Hanghang Tong
Heng Ji
ReLM
OffRL
LRM
188
1
0
05 May 2025
Sim2Real Transfer for Vision-Based Grasp Verification
Pau Amargant
Peter Honig
Markus Vincze
34
0
0
05 May 2025
Adaptive Thinking via Mode Policy Optimization for Social Language Agents
Minzheng Wang
You Li
Haozhao Wang
Xinghua Zhang
Nan Xu
Bingli Wu
Fei Huang
Haiyang Yu
Wenji Mao
LLMAG
LRM
43
1
0
04 May 2025
Improving Physical Object State Representation in Text-to-Image Generative Systems
Tianle Chen
Chaitanya Chakka
Deepti Ghadiyaram
39
0
0
04 May 2025
RTV-Bench: Benchmarking MLLM Continuous Perception, Understanding and Reasoning through Real-Time Video
Shuhang Xun
Sicheng Tao
Jiajun Li
Yibo Shi
Zhixin Lin
...
Shikang Wang
Yong-Jin Liu
Hao Zhang
Ying Ma
Xuming Hu
VLM
LRM
50
1
0
04 May 2025
VideoHallu: Evaluating and Mitigating Multi-modal Hallucinations on Synthetic Video Understanding
Zongxia Li
Xiyang Wu
Guangyao Shi
Yubin Qin
Hongyang Du
Tianyi Zhou
Dinesh Manocha
Jordan Lee Boyd-Graber
MLLM
57
0
0
02 May 2025
Harnessing Structured Knowledge: A Concept Map-Based Approach for High-Quality Multiple Choice Question Generation with Effective Distractors
Nicy Scaria
Silvester John Joseph Kennedy
Diksha Seth
Ananya Thakur
Deepak N. Subramani
AI4Ed
28
0
0
02 May 2025
TEMPURA: Temporal Event Masked Prediction and Understanding for Reasoning in Action
Jen-Hao Cheng
Vivian Wang
Huayu Wang
Huapeng Zhou
Yi-Hao Peng
...
Wenhao Chai
Yi-Ling Chen
Vibhav Vineet
Qin Cai
Lei Li
AI4TS
196
0
0
02 May 2025
ScaleTrack: Scaling and back-tracking Automated GUI Agents
Jing Huang
Zhixiong Zeng
Wenkang Han
Yufeng Zhong
Liming Zheng
Shuai Fu
Jingyuan Chen
Lin Ma
188
0
0
01 May 2025
OET: Optimization-based prompt injection Evaluation Toolkit
Jinsheng Pan
Xiaogeng Liu
Chaowei Xiao
AAML
71
0
0
01 May 2025