Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2504.03040
Cited By
Safety Modulation: Enhancing Safety in Reinforcement Learning through Cost-Modulated Rewards
3 April 2025
Hanping Zhang
Yuhong Guo
OffRL
Re-assign community
ArXiv (abs)
PDF
HTML
Papers citing
"Safety Modulation: Enhancing Safety in Reinforcement Learning through Cost-Modulated Rewards"
22 / 22 papers shown
Title
VoCo-LLaMA: Towards Vision Compression with Large Language Models
Xubing Ye
Yukang Gan
Xiaoke Huang
Yixiao Ge
Yansong Tang
MLLM
VLM
107
28
0
18 Jun 2024
Motion-Guided Masking for Spatiotemporal Representation Learning
D. Fan
Jue Wang
Shuai Liao
Yi Zhu
Vimal Bhat
H. Santos-Villalobos
M. Rohith
Xinyu Li
VGen
78
22
0
24 Aug 2023
VAST: A Vision-Audio-Subtitle-Text Omni-Modality Foundation Model and Dataset
Sihan Chen
Handong Li
Qunbo Wang
Zijia Zhao
Ming-Ting Sun
Xinxin Zhu
Qingbin Liu
177
110
0
29 May 2023
Turbo Training with Token Dropout
Tengda Han
Weidi Xie
Andrew Zisserman
ViT
74
11
0
10 Oct 2022
SemMAE: Semantic-Guided Masking for Learning Masked Autoencoders
Gang Li
Heliang Zheng
Daqing Liu
Chaoyue Wang
Fuchun Sun
Changwen Zheng
97
130
0
21 Jun 2022
CoCa: Contrastive Captioners are Image-Text Foundation Models
Jiahui Yu
Zirui Wang
Vijay Vasudevan
Legg Yeung
Mojtaba Seyedhosseini
Yonghui Wu
VLM
CLIP
OffRL
177
1,309
0
04 May 2022
Learning To Recognize Procedural Activities with Distant Supervision
Xudong Lin
Fabio Petroni
Gedas Bertasius
Marcus Rohrbach
Shih-Fu Chang
Lorenzo Torresani
88
87
0
26 Jan 2022
Masked Autoencoders Are Scalable Vision Learners
Kaiming He
Xinlei Chen
Saining Xie
Yanghao Li
Piotr Dollár
Ross B. Girshick
ViT
TPM
477
7,837
0
11 Nov 2021
Efficiently Modeling Long Sequences with Structured State Spaces
Albert Gu
Karan Goel
Christopher Ré
217
1,832
0
31 Oct 2021
Towards Long-Form Video Understanding
Chaoxia Wu
Philipp Krahenbuhl
VLM
ViT
117
170
0
21 Jun 2021
BEiT: BERT Pre-Training of Image Transformers
Hangbo Bao
Li Dong
Songhao Piao
Furu Wei
ViT
297
2,845
0
15 Jun 2021
Emerging Properties in Self-Supervised Vision Transformers
Mathilde Caron
Hugo Touvron
Ishan Misra
Hervé Jégou
Julien Mairal
Piotr Bojanowski
Armand Joulin
735
6,139
0
29 Apr 2021
Multiscale Vision Transformers
Haoqi Fan
Bo Xiong
K. Mangalam
Yanghao Li
Zhicheng Yan
Jitendra Malik
Christoph Feichtenhofer
ViT
135
1,265
0
22 Apr 2021
ViViT: A Video Vision Transformer
Anurag Arnab
Mostafa Dehghani
G. Heigold
Chen Sun
Mario Lucic
Cordelia Schmid
ViT
225
2,168
0
29 Mar 2021
Learning Transferable Visual Models From Natural Language Supervision
Alec Radford
Jong Wook Kim
Chris Hallacy
Aditya A. Ramesh
Gabriel Goh
...
Amanda Askell
Pamela Mishkin
Jack Clark
Gretchen Krueger
Ilya Sutskever
CLIP
VLM
1.0K
29,926
0
26 Feb 2021
Less is More: ClipBERT for Video-and-Language Learning via Sparse Sampling
Jie Lei
Linjie Li
Luowei Zhou
Zhe Gan
Tamara L. Berg
Joey Tianyi Zhou
Jingjing Liu
CLIP
135
664
0
11 Feb 2021
Bootstrap your own latent: A new approach to self-supervised Learning
Jean-Bastien Grill
Florian Strub
Florent Altché
Corentin Tallec
Pierre Harvey Richemond
...
M. G. Azar
Bilal Piot
Koray Kavukcuoglu
Rémi Munos
Michal Valko
SSL
408
6,849
0
13 Jun 2020
A Simple Framework for Contrastive Learning of Visual Representations
Ting-Li Chen
Simon Kornblith
Mohammad Norouzi
Geoffrey E. Hinton
SSL
393
18,897
0
13 Feb 2020
VideoGraph: Recognizing Minutes-Long Human Activities in Videos
Noureldien Hussein
E. Gavves
A. Smeulders
146
77
0
13 May 2019
VideoBERT: A Joint Model for Video and Language Representation Learning
Chen Sun
Austin Myers
Carl Vondrick
Kevin Patrick Murphy
Cordelia Schmid
VLM
SSL
85
1,250
0
03 Apr 2019
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
Jacob Devlin
Ming-Wei Chang
Kenton Lee
Kristina Toutanova
VLM
SSL
SSeg
1.8K
95,324
0
11 Oct 2018
Efficient Estimation of Word Representations in Vector Space
Tomas Mikolov
Kai Chen
G. Corrado
J. Dean
3DV
693
31,571
0
16 Jan 2013
1