2303.03751
Cited By
Zeroth-Order Optimization Meets Human Feedback: Provable Learning via Ranking Oracles
7 March 2023
Zhiwei Tang
Dmitry Rybin
Tsung-Hui Chang
ALM
DiffM
Papers citing
"Zeroth-Order Optimization Meets Human Feedback: Provable Learning via Ranking Oracles"
15 / 15 papers shown
Title
Zeroth-Order Policy Gradient for Reinforcement Learning from Human Feedback without Reward Inference
Qining Zhang
Lei Ying
OffRL
37
2
0
25 Sep 2024
Comparisons Are All You Need for Optimizing Smooth Functions
Chenyi Zhang
Tongyang Li
AAML
34
1
0
19 May 2024
CoCoG: Controllable Visual Stimuli Generation based on Human Concept Representations
Chen Wei
Jiachen Zou
Dietmar Heinke
Quanying Liu
48
3
0
25 Apr 2024
Deep Representation Learning for Multi-functional Degradation Modeling of Community-dwelling Aging Population
Suiyao Chen
Xinyi Liu
Yulei Li
Jing Wu
Handong Yao
39
5
0
08 Apr 2024
Leveraging Deep Learning and Xception Architecture for High-Accuracy MRI Classification in Alzheimer Diagnosis
Shaojie Li
Haichen Qu
Xinqi Dong
Bo Dang
Hengyi Zang
Yulu Gong
37
10
0
24 Mar 2024
Advanced Feature Manipulation for Enhanced Change Detection Leveraging Natural Language Models
Zhenglin Li
Yangchen Huang
Mengran Zhu
Jingyu Zhang
Jinghao Chang
Houze Liu
26
4
0
23 Mar 2024
Development and Application of a Monte Carlo Tree Search Algorithm for Simulating Da Vinci Code Game Strategies
Ye Zhang
Mengran Zhu
Kailin Gui
Jiayue Yu
Yong Hao
Haozhan Sun
46
29
0
15 Mar 2024
FedLion: Faster Adaptive Federated Optimization with Fewer Communication
Zhiwei Tang
Tsung-Hui Chang
29
5
0
15 Feb 2024
DeepGI: An Automated Approach for Gastrointestinal Tract Segmentation in MRI Scans
Ye Zhang
Yulu Gong
Dongji Cui
Xinrui Li
Xinyu Shen
35
32
0
27 Jan 2024
Sparsity-Guided Holistic Explanation for LLMs with Interpretable Inference-Time Intervention
Zhen Tan
Tianlong Chen
Zhenyu (Allen) Zhang
Huan Liu
44
14
0
22 Dec 2023
ReConTab: Regularized Contrastive Representation Learning for Tabular Data
Suiyao Chen
Jing Wu
N. Hovakimyan
Handong Yao
36
33
0
28 Oct 2023
Fine-Tuning Language Models with Just Forward Passes
Sadhika Malladi
Tianyu Gao
Eshaan Nichani
Alexandru Damian
Jason D. Lee
Danqi Chen
Sanjeev Arora
27
177
0
27 May 2023
Prompt-Tuning Decision Transformer with Preference Ranking
Shengchao Hu
Li Shen
Ya-Qin Zhang
Dacheng Tao
OffRL
26
14
0
16 May 2023
Training language models to follow instructions with human feedback
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
...
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
OSLM
ALM
313
11,953
0
04 Mar 2022
Low-rank Matrix Recovery With Unknown Correspondence
Zhiwei Tang
Tsung-Hui Chang
X. Ye
H. Zha
39
4
0
15 Oct 2021