This topic covers research that actively explores methods and strategies for aligning language models' outputs with human values, ethics, and intentions, where that work makes up a significant portion of the paper's content.
| Title | Authors |
|---|---|
| CodeSimpleQA: Scaling Factuality in Code Large Language Models | Jian Yang Wei Zhang Yizhi Li Shawn Guo Haowen Wang ...Ge Zhang Zili Wang Zhoujun Li Xianglong Liu Weifeng Lv |
| SiamGPT: Quality-First Fine-Tuning for Stable Thai Text Generation | Thittipat Pairatsuppawat Abhibhu Tachaapornchai Paweekorn Kusolsomboon Chutikan Chaiwong Thodsaporn Chay-intr Kobkrit Viriyayudhakorn Nongnuch Ketui Aslan B. Wong |
| CORE: Concept-Oriented Reinforcement for Bridging the Definition-Application Gap in Mathematical Reasoning | Zijun Gao Zhikun Xu Xiao Ye Ben Zhou |
| DEER: A Comprehensive and Reliable Benchmark for Deep-Research Expert Reports | Janghoon Han Heegyu Kim Changho Lee Dahm Lee Min Hyung Park Hosung Song Stanley Jungkyu Choi Moontae Lee Honglak Lee |
| CIFE: Code Instruction-Following Evaluation | Sravani Gunnu Shanmukha Guttula Hima Patel |
| Are We on the Right Way to Assessing LLM-as-a-Judge? | Yuanning Feng Sinan Wang Zhengxiang Cheng Yao Wan Dongping Chen |
| On Assessing the Relevance of Code Reviews Authored by Generative Models | Robert Heumüller Frank Ortmeier |
| Agreement Between Large Language Models and Human Raters in Essay Scoring: A Research Synthesis | Hongli Li Che Han Chen Kevin Fan Chiho Young-Johnson Soyoung Lim Yali Feng |
| Revisiting the Reliability of Language Models in Instruction-Following | Jianshuo Dong Yutong Zhang Yan Liu Zhenyu Zhong Tao Wei Chao Zhang Han Qiu |
| The FACTS Leaderboard: A Comprehensive Benchmark for Large Language Model Factuality | Aileen Cheng Alon Jacovi Amir Globerson Ben Golan Charles Kwong ...Srinivasan Venkatachary Tulsee Doshi Yossi Matias Sasha Goldshtein Dipanjan Das |
| PACIFIC: a framework for generating benchmarks to check Precise Automatically Checked Instruction Following In Code | Itay Dreyfuss Antonio Abu Nassar Samuel Ackerman Axel Ben David Eitan Farchi Rami Katan Orna Raz Marcel Zalmanovici |
| PCMind-2.1-Kaiyuan-2B Technical Report | Kairong Luo Zhenbo Sun Xinyu Shi Shengqi Chen Bowen Yu ...Hengtao Tao Hui Wang Fangming Liu Kaifeng Lyu Wenguang Chen |
| Nanbeige4-3B Technical Report: Exploring the Frontier of Small Language Models | Chen Yang Guangyue Peng Jiaying Zhu Ran Le Ruixiang Feng ...Yuntao Wen Zekai Wang Zhenwei An Zhicong Sun Zongchao Chen |
| Counting Without Running: Evaluating LLMs' Reasoning About Code Complexity | Gregory Bolet Giorgis Georgakoudis Konstantinos Parasyris Harshitha Menon Niranjan Hasabnis Kirk W. Cameron Gal Oren |
| PEFT-Factory: Unified Parameter-Efficient Fine-Tuning of Autoregressive Large Language Models | Robert Belanec Ivan Srba Maria Bielikova |
| Financial Instruction Following Evaluation (FIFE) | Glenn Matlin Siddharth Anirudh JM Aditya Shukla Yahya Hassan Sudheer Chava |
| From Code Foundation Models to Agents and Applications: A Comprehensive Survey and Practical Guide to Code Intelligence | J. Yang Wei Emma Zhang Shark Liu J. Wu Shawn Guo ...Zizheng Zhan Jiajun Zhang Jie Zhang Zhaoxiang Zhang Bo Zheng |