This community focuses on research exploring methods and strategies to ensure that language models' outputs align with human values, ethics, and intentions.
| Paper | Authors |
|---|---|
| IF-RewardBench: Benchmarking Judge Models for Instruction-Following Evaluation | Bosi Wen, Yilin Niu, Cunxiang Wang, Xiaoying Ling, Ying Zhang, Pei Ke, Hongning Wang, Minlie Huang |
| When Do Language Models Endorse Limitations on Human Rights Principles? | Keenan Samway, Nicole Miu Takagi, Rada Mihalcea, Bernhard Schölkopf, Ilias Chalkidis, Daniel Hershcovich, Zhijing Jin |
| Bootstrapping Exploration with Group-Level Natural Language Feedback in Reinforcement Learning | Lei Huang, Xiang Cheng, Chenxiao Zhao, Guobin Shen, Junjie Yang, Xiaocheng Feng, Yuxuan Gu, Xing Yu, Bing Qin |
| Benchmark of Benchmarks: Unpacking Influence and Code Repository Quality in LLM Safety Benchmarks | Junjie Chu, Xinyue Shen, Ye Leng, Michael Backes, Yun Shen, Yang Zhang |
| How Small Can 6G Reason? Scaling Tiny Language Models for AI-Native Networks | Mohamed Amine Ferrag, Abderrahmane Lakas, Merouane Debbah |
| FT-Dojo: Towards Autonomous LLM Fine-Tuning with Language Agents | Qizheng Li, Yifei Zhang, Xiao Yang, Xu Yang, Zhuo Wang, Weiqing Liu, Jiang Bian |
| When Numbers Tell Half the Story: Human-Metric Alignment in Topic Model Evaluation | Thibault Prouteau, Francis Lareau, Nicolas Dugué, Jean-Charles Lamirel, Christophe Malaterre |
| RubricBench: Aligning Model-Generated Rubrics with Human Standards | Qiyuan Zhang, Junyi Zhou, Yufei Wang, Fuyuan Lyu, Yidong Ming, ..., Qingfeng Sun, Kai Zheng, Peng Kang, Xue Liu, Chen Ma |
| DEP: A Decentralized Large Language Model Evaluation Protocol | Jianxiang Peng, Junhao Li, Hongxiang Wang, Haocheng Lyu, Hui Guo, ..., Tianyu Dong, Juesi Xiao, Lei Yang, Yuqi Ren, Deyi Xiong |
| DARE-bench: Evaluating Modeling and Instruction Fidelity of LLMs in Data Science | Fan Shu, Yite Wang, Ruofan Wu, Boyi Liu, Zhewei Yao, Yuxiong He, Feng Yan |
| QEDBENCH: Quantifying the Alignment Gap in Automated Evaluation of University-Level Mathematical Proofs | Santiago Gonzalez, Alireza Amiri Bavandpour, Peter Ye, Edward Zhang, Ruslans Aleksejevs, ..., Sibel Yalçın, Jun Yan, Ji Zeng, Arman Cohan, Quanquan C. Liu |
| CAMEL: Confidence-Gated Reflection for Reward Modeling | Zirui Zhu, Hailun Xu, Yang Luo, Yong Liu, Kanchan Sarkar, Kun Xu, Yang You |
| From Human-Level AI Tales to AI Leveling Human Scales | Peter Romero, Fernando Martínez-Plumed, Zachary R. Tyler, Matthieu Téhénan, Sipeng Chen, ..., Yael Moros Daval, Daniel Romero-Alvarado, Félix Martí Pérez, Kevin Wei, José Hernández-Orallo |
| When LLM Judges Inflate Scores: Exploring Overrating in Relevance Assessment | Chuting Yu, Hang Li, Guido Zuccon, Joel Mackenzie, Teerapong Leelanupab |
| ConvApparel: A Benchmark Dataset and Validation Framework for User Simulators in Conversational Recommenders | Ofer Meshi, Krisztian Balog, Sally Goldman, Avi Caciularu, Guy Tennenholtz, Jihwan Jeong, Amir Globerson, Craig Boutilier |
| When AI Benchmarks Plateau: A Systematic Study of Benchmark Saturation | Mubashara Akhtar, Anka Reuel, Prajna Soni, Sanchit Ahuja, Pawan Sasanka Ammanamanchi, ..., Mrinmaya Sachan, Stella Biderman, Zeerak Talat, Avijit Ghosh, Irene Solaiman |
| References Improve LLM Alignment in Non-Verifiable Domains | Kejian Shi, Yixin Liu, Peifeng Wang, Alexander R. Fabbri, Shafiq Joty, Arman Cohan |
| What Is Missing: Interpretable Ratings for Large Language Model Outputs | Nicholas Stranges, Yimin Yang |
| ResearchGym: Evaluating Language Model Agents on Real-World AI Research | Aniketh Garikaparthi, Manasi Patwardhan, Arman Cohan |
| Who Do LLMs Trust? Human Experts Matter More Than Other LLMs | Anooshka Bajaj, Zoran Tiganj |
| Nanbeige4.1-3B: A Small General Model that Reasons, Aligns, and Acts | Chen Yang, Guangyue Peng, Jiaying Zhu, Ran Le, Ruixiang Feng, ..., Yunzhi Xu, Zekai Wang, Zhenwei An, Zhicong Sun, Zongchao Chen |
| Gaia2: Benchmarking LLM Agents on Dynamic and Asynchronous Environments | Romain Froger, Pierre Andrews, Matteo Bettini, Amar Budhiraja, Ricardo Silveira Cabral, ..., Mengjue Wang, Ian Yu, Amine Benhalloum, Grégoire Mialon, Thomas Scialom |
| Pedagogically-Inspired Data Synthesis for Language Model Knowledge Distillation | Bowei He, Yankai Chen, Xiaokun Zhang, Linghe Kong, Philip S. Yu, Xue Liu, Chen Ma |
| RankLLM: Weighted Ranking of LLMs by Quantifying Question Difficulty | Ziqian Zhang, Xingjian Hu, Yue Huang, Kai Zhang, Ruoxi Chen, ..., Qingsong Wen, Kaidi Xu, Xiangliang Zhang, Neil Zhenqiang Gong, Lichao Sun |
| Benchmark Illusion: Disagreement among LLMs and Its Scientific Consequences | Eddie Yang, Dashun Wang |
| Are Aligned Large Language Models Still Misaligned? | Usman Naseem, Gautam Siddharth Kashyap, Rafiq Ali, Ebad Shabbir, Sushant Kumar Ray, Abdullah Mohammad, Agrima Seth |
| Fine-Tuning GPT-5 for GPU Kernel Generation | Ali Tehrani, Yahya Emara, Essam Wissam, Wojciech Paluch, Waleed Atallah, Łukasz Dudziak, Mohamed S. Abdelfattah |
| Can Large Language Models Make Everyone Happy? | Usman Naseem, Gautam Siddharth Kashyap, Ebad Shabbir, Sushant Kumar Ray, Abdullah Mohammad, Rafiq Ali |
| Scaling Reward Modeling without Human Supervision | Jingxuan Fan, Yueying Li, Zhenting Qi, Dinghuai Zhang, Kianté Brantley, Sham M. Kakade, Hanlin Zhang |
| FlexMoRE: A Flexible Mixture of Rank-heterogeneous Experts for Efficient Federatedly-trained Large Language Models | Annemette Brok Pirchert, Jacob Nielsen, Mogens Henrik From, Lukas Galke Poech, Peter Schneider-Kamp |
| InfiCoEvalChain: A Blockchain-Based Decentralized Framework for Collaborative LLM Evaluation | Yifan Yang, Jinjia Li, Kunxi Li, Puhao Zheng, Yuanyi Wang, Zheyan Qu, Yang Yu, Jianmin Wu, Ming Li, Hongxia Yang |
| Whose Name Comes Up? Benchmarking and Intervention-Based Auditing of LLM-Based Scholar Recommendation | Lisette Espin-Noboa, Gonzalo Gabriel Mendez |
| When the Model Said 'No Comment', We Knew Helpfulness Was Dead, Honesty Was Alive, and Safety Was Terrified | Gautam Siddharth Kashyap, Mark Dras, Usman Naseem |
| R-Align: Enhancing Generative Reward Models through Rationale-Centric Meta-Judging | Yanlin Lai, Mitt Huang, Hangyu Guo, Xiangfeng Wang, Haodong Li, ..., Qi Han, Chun Yuan, Zheng Ge, Xiangyu Zhang, Daxin Jiang |
| AgentCPM-Explore: Realizing Long-Horizon Deep Exploration for Edge-Scale Agents | Haotian Chen, Xin Cong, Shengda Fan, Yuyang Fu, Ziqin Gong, ..., Yukun Yan, Zhong Zhang, Yankai Lin, Zhiyuan Liu, Maosong Sun |
| Aligning Large Language Model Behavior with Human Citation Preferences | Kenichiro Ando, Tatsuya Harada |
| SAIL: Self-Amplified Iterative Learning for Diffusion Model Alignment with Minimal Human Feedback | Xiaoxuan He, Siming Fu, Wanli Li, Zhiyuan Li, Dacheng Yin, Kang Rong, Fengyun Rao, Bo Zhang |
| Scaling Agentic Verifier for Competitive Coding | Zeyao Ma, Jing Zhang, Xiaokang Zhang, Jiaxi Yang, Zongmeng Zhang, ..., Lei Zhang, Hao Zheng, Wenting Zhao, Junyang Lin, Binyuan Hui |
| Unpacking Human Preference for LLMs: Demographically Aware Evaluation with the HUMAINE Framework | Nora Petrova, Andrew Gordon, Enzo Blindow |
| What LLMs Think When You Don't Tell Them What to Think About? | Yongchan Kwon, James Zou |
| Didactic to Constructive: Turning Expert Solutions into Learnable Reasoning | Ethan Mendes, Jungsoo Park, Alan Ritter |
| Aligning Language Model Benchmarks with Pairwise Preferences | Marco Gutierrez, Xinyi Leng, Hannah Cyberey, Jonathan Richard Schwarz, Ahmed Alaa, Thomas Hartvigsen |
| PeerRank: Autonomous LLM Evaluation Through Web-Grounded, Bias-Controlled Peer Review | Yanki Margalit, Erni Avram, Ran Taig, Oded Margalit, Nurit Cohen-Inger |
| Judging the Judges: Human Validation of Multi-LLM Evaluation for High-Quality K-12 Science Instructional Materials | Peng He, Zhaohui Li, Zeyuan Wang, Jinjun Xiong, Tingting Li |
| Why Self-Rewarding Works: Theoretical Guarantees for Iterative Alignment of Language Models | Shi Fu, Yingjie Wang, Shengchao Hu, Peng Wang, Dacheng Tao |
| CVeDRL: An Efficient Code Verifier via Difficulty-aware Reinforcement Learning | Ji Shi, Peiming Guo, Meishan Zhang, Miao Zhang, Xuebo Liu, Min Zhang, Weili Guan |