2109.10862
Cited By
Recursively Summarizing Books with Human Feedback
22 September 2021
Jeff Wu
Long Ouyang
Daniel M. Ziegler
Nissan Stiennon
Ryan J. Lowe
Jan Leike
Paul Christiano
ALM
Papers citing
"Recursively Summarizing Books with Human Feedback"
50 / 226 papers shown
Title
Improved Algorithms for Differentially Private Language Model Alignment
Keyu Chen
Hao Tang
Qinglin Liu
Yizhao Xu
26
0
0
13 May 2025
XBreaking: Explainable Artificial Intelligence for Jailbreaking LLMs
Marco Arazzi
Vignesh Kumar Kembu
Antonino Nocera
V. P.
82
0
0
30 Apr 2025
Safety in Large Reasoning Models: A Survey
Cheng Wang
Y. Liu
B. Li
Duzhen Zhang
Z. Li
Junfeng Fang
Bryan Hooi
LRM
145
1
0
24 Apr 2025
Towards NSFW-Free Text-to-Image Generation via Safety-Constraint Direct Preference Optimization
Shouwei Ruan
Zhenyu Wu
Yao Huang
Ruochen Zhang
Yitong Sun
Caixin Kang
Xingxing Wei
EGVM
37
0
0
19 Apr 2025
Sugar-Coated Poison: Benign Generation Unlocks LLM Jailbreaking
Yu-Hang Wu
Yu-Jie Xiong
Jie-Zhang
AAML
30
0
0
08 Apr 2025
How to evaluate control measures for LLM agents? A trajectory from today to superintelligence
Tomek Korbak
Mikita Balesni
Buck Shlegeris
Geoffrey Irving
ELM
27
1
0
07 Apr 2025
Do LLM Evaluators Prefer Themselves for a Reason?
Wei-Lin Chen
Zhepei Wei
Xinyu Zhu
Shi Feng
Yu Meng
ELM
LRM
42
0
0
04 Apr 2025
MAMM-Refine: A Recipe for Improving Faithfulness in Generation with Multi-Agent Collaboration
David Wan
Justin Chih-Yao Chen
Elias Stengel-Eskin
Mohit Bansal
LLMAG
LRM
60
1
0
19 Mar 2025
Mitigating Lost-in-Retrieval Problems in Retrieval Augmented Multi-Hop Question Answering
Rongzhi Zhu
Xiangyu Liu
Zequn Sun
Yiwei Wang
Wei Hu
LRM
RALM
KELM
90
1
0
21 Feb 2025
Oreo: A Plug-in Context Reconstructor to Enhance Retrieval-Augmented Generation
Sha Li
Naren Ramakrishnan
RALM
KELM
151
1
0
18 Feb 2025
DrugImproverGPT: A Large Language Model for Drug Optimization with Fine-Tuning via Structured Policy Optimization
Xuefeng Liu
Songhao Jiang
Siyu Chen
Zhuoran Yang
Yuxin Chen
Ian T. Foster
Rick L. Stevens
LM&MA
OffRL
53
0
0
11 Feb 2025
Context-Aware Hierarchical Merging for Long Document Summarization
Litu Ou
Mirella Lapata
MoMe
186
1
0
03 Feb 2025
Process-Supervised Reinforcement Learning for Code Generation
Yufan Ye
Ting Zhang
Wenbin Jiang
Hua Huang
OffRL
LRM
SyDa
63
1
0
03 Feb 2025
GuardReasoner: Towards Reasoning-based LLM Safeguards
Yue Liu
Hongcheng Gao
Shengfang Zhai
Jun-Xiong Xia
Tianyi Wu
Zhiwei Xue
Y. Chen
Kenji Kawaguchi
Jiaheng Zhang
Bryan Hooi
AI4TS
LRM
131
14
0
30 Jan 2025
Preference-Based Multi-Agent Reinforcement Learning: Data Coverage and Algorithmic Techniques
Natalia Zhang
X. Wang
Qiwen Cui
Runlong Zhou
Sham Kakade
Simon S. Du
OffRL
48
0
0
10 Jan 2025
MRJ-Agent: An Effective Jailbreak Agent for Multi-Round Dialogue
Fengxiang Wang
Ranjie Duan
Peng Xiao
Xiaojun Jia
Shiji Zhao
...
Hang Su
Jialing Tao
Hui Xue
J. Zhu
LLMAG
56
7
0
08 Jan 2025
MedG-KRP: Medical Graph Knowledge Representation Probing
Gabriel R. Rosenbaum
L. Jiang
Ivaxi Sheth
Jaden Stryker
Anton Alyakin
...
Mustafa Nasir-Moin
Jan Moritz Niehues
Karl L. Sangwon
Eunice Yang
Eric Karl Oermann
AI4MH
67
0
0
14 Dec 2024
Sharp Analysis for KL-Regularized Contextual Bandits and RLHF
Heyang Zhao
Chenlu Ye
Quanquan Gu
Tong Zhang
OffRL
57
3
0
07 Nov 2024
Prove Your Point!: Bringing Proof-Enhancement Principles to Argumentative Essay Generation
Ruiyu Xiao
Lei Wu
Yuhang Gou
Weinan Zhang
Ting Liu
26
0
0
30 Oct 2024
Negative-Prompt-driven Alignment for Generative Language Model
Shiqi Qiao
Ning Xv
Biao Liu
Xin Geng
ALM
SyDa
28
0
0
16 Oct 2024
BookWorm: A Dataset for Character Description and Analysis
Argyrios Papoudakis
Mirella Lapata
Frank Keller
23
1
0
14 Oct 2024
Coevolving with the Other You: Fine-Tuning LLM with Sequential Cooperative Multi-Agent Reinforcement Learning
Hao Ma
Tianyi Hu
Zhiqiang Pu
Boyin Liu
Xiaolin Ai
Yanyan Liang
Min Chen
42
3
0
08 Oct 2024
PersonalSum: A User-Subjective Guided Personalized Summarization Dataset for Large Language Models
Lemei Zhang
Peng Liu
Marcus Tiedemann Oekland Henriksboe
Even W. Lauvrak
J. Gulla
Heri Ramampiaro
29
1
0
04 Oct 2024
How to Train Long-Context Language Models (Effectively)
Tianyu Gao
Alexander Wettig
Howard Yen
Danqi Chen
RALM
72
38
0
03 Oct 2024
Recursive Abstractive Processing for Retrieval in Dynamic Datasets
Charbel Chucri
Rami Azouz
Joachim Ott
45
0
0
02 Oct 2024
FlipAttack: Jailbreak LLMs via Flipping
Yue Liu
Xiaoxin He
Miao Xiong
Jinlan Fu
Shumin Deng
Bryan Hooi
AAML
34
12
0
02 Oct 2024
Zeroth-Order Policy Gradient for Reinforcement Learning from Human Feedback without Reward Inference
Qining Zhang
Lei Ying
OffRL
37
2
0
25 Sep 2024
GEM-RAG: Graphical Eigen Memories For Retrieval Augmented Generation
B. Rappazzo
Yingheng Wang
Aaron Ferber
Carla P. Gomes
VLM
18
0
0
23 Sep 2024
CoCA: Regaining Safety-awareness of Multimodal Large Language Models with Constitutional Calibration
Jiahui Gao
Renjie Pi
Tianyang Han
Han Wu
Lanqing Hong
Lingpeng Kong
Xin Jiang
Zhenguo Li
41
5
0
17 Sep 2024
Semi-Supervised Reward Modeling via Iterative Self-Training
Yifei He
Haoxiang Wang
Ziyan Jiang
Alexandros Papangelis
Han Zhao
OffRL
36
2
0
10 Sep 2024
ConsistencyTrack: A Robust Multi-Object Tracker with a Generation Strategy of Consistency Model
Lifan Jiang
Zhihui Wang
Siqi Yin
Guangxiao Ma
Peng Zhang
Boxi Wu
DiffM
53
0
0
28 Aug 2024
Advances in Preference-based Reinforcement Learning: A Review
Youssef Abdelkareem
Shady Shehata
Fakhri Karray
OffRL
51
9
0
21 Aug 2024
LLMs can be Dangerous Reasoners: Analyzing-based Jailbreak Attack on Large Language Models
Shi Lin
Rongchang Li
Xun Wang
Changting Lin
Wenpeng Xing
Meng Han
60
3
0
23 Jul 2024
Prover-Verifier Games improve legibility of LLM outputs
Jan Hendrik Kirchner
Yining Chen
Harri Edwards
Jan Leike
Nat McAleese
Yuri Burda
LRM
AAML
25
24
0
18 Jul 2024
Retrieval-Enhanced Machine Learning: Synthesis and Opportunities
To Eun Kim
Alireza Salemi
Andrew Drozdov
Fernando Diaz
Hamed Zamani
56
7
0
17 Jul 2024
MERLIN: Multimodal Embedding Refinement via LLM-based Iterative Navigation for Text-Video Retrieval-Rerank Pipeline
D. Han
Eunhwan Park
Gisang Lee
Adam Lee
Nojun Kwak
40
2
0
17 Jul 2024
Preference-Guided Reinforcement Learning for Efficient Exploration
Guojian Wang
Faguo Wu
Xiao Zhang
Tianyuan Chen
Xuyang Chen
Lin Zhao
38
0
0
09 Jul 2024
DogeRM: Equipping Reward Models with Domain Knowledge through Model Merging
Tzu-Han Lin
Chen An Li
Hung-yi Lee
Yun-Nung Chen
VLM
ALM
26
4
0
01 Jul 2024
Beyond Human Preferences: Exploring Reinforcement Learning Trajectory Evaluation and Improvement through LLMs
Zichao Shen
Tianchen Zhu
Qingyun Sun
Shiqi Gao
Jianxin Li
OffRL
25
1
0
28 Jun 2024
Aligning Model Properties via Conformal Risk Control
William Overman
Jacqueline Jil Vallon
Mohsen Bayati
33
2
0
26 Jun 2024
JailbreakZoo: Survey, Landscapes, and Horizons in Jailbreaking Large Language and Vision-Language Models
Haibo Jin
Leyang Hu
Xinuo Li
Peiyan Zhang
Chonghan Chen
Jun Zhuang
Haohan Wang
PILM
36
26
0
26 Jun 2024
Order-Optimal Instance-Dependent Bounds for Offline Reinforcement Learning with Preference Feedback
Zhirui Chen
Vincent Y. F. Tan
OffRL
38
1
0
18 Jun 2024
A Systematic Survey of Text Summarization: From Statistical Methods to Large Language Models
Haopeng Zhang
Philip S. Yu
Jiawei Zhang
37
17
0
17 Jun 2024
Large Language Models for Automatic Milestone Detection in Group Discussions
Zhuoxu Duan
Zhengye Yang
Samuel Westby
Christoph Riedl
B. F. Welles
Richard J. Radke
30
0
0
16 Jun 2024
Towards Understanding Jailbreak Attacks in LLMs: A Representation Space Analysis
Yuping Lin
Pengfei He
Han Xu
Yue Xing
Makoto Yamada
Hui Liu
Jiliang Tang
34
10
0
16 Jun 2024
Prompt-Based Length Controlled Generation with Multiple Control Types
Renlong Jie
Xiaojun Meng
Lifeng Shang
Xin Jiang
Qun Liu
26
6
0
12 Jun 2024
Reinforcement Learning from Human Feedback without Reward Inference: Model-Free Algorithm and Instance-Dependent Analysis
Qining Zhang
Honghao Wei
Lei Ying
OffRL
64
1
0
11 Jun 2024
AutoSurvey: Large Language Models Can Automatically Write Surveys
Yidong Wang
Qi Guo
Wenjin Yao
Hongbo Zhang
Xin Zhang
...
M. Zhang
Qingsong Wen
Wei Ye
Shikun Zhang
Yue Zhang
LM&MA
30
19
0
10 Jun 2024
Learning Task Decomposition to Assist Humans in Competitive Programming
Jiaxin Wen
Ruiqi Zhong
Pei Ke
Zhihong Shao
Hongning Wang
Minlie Huang
ReLM
34
8
0
07 Jun 2024
Process-Driven Autoformalization in Lean 4
Jianqiao Lu
Zhengying Liu
Yingjia Wan
Yinya Huang
Haiming Wang
Zhicheng Yang
Jing Tang
Zhijiang Guo
AI4CE
37
14
0
04 Jun 2024