Preference Learning Unlocks LLMs' Psycho-Counseling Skills
Mian Zhang, S. Eack, Zhiyu Zoey Chen
arXiv:2502.19731 · 27 February 2025

Papers citing "Preference Learning Unlocks LLMs' Psycho-Counseling Skills" (11 papers)

SOSBENCH: Benchmarking Safety Alignment on Scientific Knowledge
Fengqing Jiang, Fengbo Ma, Zhangchen Xu, Yuetai Li, Bhaskar Ramasubramanian, Luyao Niu, Bo Li, Xianyan Chen, Zhen Xiang, Radha Poovendran
Communities: ALM, ELM · 27 May 2025

Adaptive Helpfulness-Harmlessness Alignment with Preference Vectors
Ren-Wei Liang, Chin-Ting Hsu, Chan-Hung Yu, Saransh Agrawal, Shih-Cheng Huang, Shang-Tse Chen, Kuan-Hao Huang, Shao-Hua Sun
27 Apr 2025

CBT-Bench: Evaluating Large Language Models on Assisting Cognitive Behavior Therapy
Mian Zhang, Xianjun Yang, Xinlu Zhang, Travis Labrum, Jamie C. Chiu, S. Eack, Fei Fang, William Wang, Zhiyu Zoey Chen
Communities: ELM, AI4MH, LM&MA · 28 Jan 2025

HelpSteer2-Preference: Complementing Ratings with Preferences
Zhilin Wang, Alexander Bukharin, Olivier Delalleau, Daniel Egert, Gerald Shen, Jiaqi Zeng, Oleksii Kuchaiev, Yi Dong
Communities: ALM · 02 Oct 2024

Understanding the performance gap between online and offline alignment algorithms
Yunhao Tang, Daniel Guo, Zeyu Zheng, Daniele Calandriello, Yuan Cao, ..., Rémi Munos, Bernardo Avila-Pires, Michal Valko, Yong Cheng, Will Dabney
Communities: OffRL, OnRL · 14 May 2024

Iterative Reasoning Preference Optimization
Richard Yuanzhe Pang, Weizhe Yuan, Kyunghyun Cho, He He, Sainbayar Sukhbaatar, Jason Weston
Communities: LRM · 30 Apr 2024

Is DPO Superior to PPO for LLM Alignment? A Comprehensive Study
Shusheng Xu, Wei Fu, Jiaxuan Gao, Wenjie Ye, Weiling Liu, Zhiyu Mei, Guangju Wang, Chao Yu, Yi Wu
16 Apr 2024

RewardBench: Evaluating Reward Models for Language Modeling
Nathan Lambert, Valentina Pyatkin, Jacob Morrison, Lester James V. Miranda, Bill Yuchen Lin, ..., Sachin Kumar, Tom Zick, Yejin Choi, Noah A. Smith, Hanna Hajishirzi
Communities: ALM · 20 Mar 2024

Baichuan 2: Open Large-scale Language Models
Ai Ming Yang, Bin Xiao, Bingning Wang, Borong Zhang, Ce Bian, ..., Youxin Jiang, Yuchen Gao, Yupeng Zhang, Guosheng Dong, Zhiying Wu
Communities: ELM, LRM · 19 Sep 2023

Judging LLM-as-a-Judge with MT-Bench and Chatbot Arena
Lianmin Zheng, Wei-Lin Chiang, Ying Sheng, Siyuan Zhuang, Zhanghao Wu, ..., Dacheng Li, Eric Xing, Haotong Zhang, Joseph E. Gonzalez, Ion Stoica
Communities: ALM, OSLM, ELM · 09 Jun 2023

Proximal Policy Optimization Algorithms
John Schulman, Filip Wolski, Prafulla Dhariwal, Alec Radford, Oleg Klimov
Communities: OffRL · 20 Jul 2017