A Peek into Token Bias: Large Language Models Are Not Yet Genuine Reasoners

16 June 2024
Bowen Jiang
Yangxinyu Xie
Zhuoqun Hao
Xiaomeng Wang
Tanwi Mallick
Weijie J. Su
Camillo J Taylor
Dan Roth
    LRM

Papers citing "A Peek into Token Bias: Large Language Models Are Not Yet Genuine Reasoners"

24 / 24 papers shown
Teaching Language Models to Evolve with Users: Dynamic Profile Modeling for Personalized Alignment
Weixiang Zhao
Xingyu Sui
Yulin Hu
Jiahe Guo
Haixiao Liu
Biye Li
Yanyan Zhao
Bing Qin
Ting Liu
OffRL
18
0
0
21 May 2025
SAKURA: On the Multi-hop Reasoning of Large Audio-Language Models Based on Speech and Audio Information
Chih-Kai Yang
Neo Ho
Yen-Ting Piao
Hung-yi Lee
AuLLM
LRM
21
0
0
19 May 2025
Group-in-Group Policy Optimization for LLM Agent Training
Lang Feng
Zhenghai Xue
Tingcong Liu
Bo An
OffRL
22
0
0
16 May 2025
Enigme: Generative Text Puzzles for Evaluating Reasoning in Language Models
John Hawkins
ReLM
LRM
62
0
0
08 May 2025
Optimization Problem Solving Can Transition to Evolutionary Agentic Workflows
Wenhao Li
Bo Jin
Mingyi Hong
Changhong Lu
Xiangfeng Wang
57
0
0
07 May 2025
A Theoretical Analysis of Compositional Generalization in Neural Networks: A Necessary and Sufficient Condition
Yuanpeng Li
CoGe
233
0
0
05 May 2025
SymPlanner: Deliberate Planning in Language Models with Symbolic Representation
Siheng Xiong
Jieyu Zhou
Zhangding Liu
Yusen Su
LLMAG
LM&Ro
238
0
0
02 May 2025
Know Me, Respond to Me: Benchmarking LLMs for Dynamic User Profiling and Personalized Responses at Scale
Bowen Jiang
Zhuoqun Hao
Y. Cho
B. Li
Yuan Yuan
Sihao Chen
Lyle Ungar
Camillo J Taylor
Dan Roth
49
1
0
19 Apr 2025
Socratic Chart: Cooperating Multiple Agents for Robust SVG Chart Understanding
Yuyang Ji
Haohan Wang
LRM
39
0
0
14 Apr 2025
Evaluating the Generalization Capabilities of Large Language Models on Code Reasoning
Rem Yang
Julian Dai
N. Vasilakis
Martin Rinard
ELM
LRM
37
0
0
07 Apr 2025
Lost in Cultural Translation: Do LLMs Struggle with Math Across Cultural Contexts?
Aabid Karim
Abdul Karim
Bhoomika Lohana
Matt Keon
Jaswinder Singh
A. Sattar
54
1
0
23 Mar 2025
Language Models, Graph Searching, and Supervision Adulteration: When More Supervision is Less and How to Make More More
Arvid Frydenlund
LRM
63
0
0
13 Mar 2025
None of the Others: a General Technique to Distinguish Reasoning from Memorization in Multiple-Choice LLM Evaluation Benchmarks
Eva Sánchez Salido
Julio Gonzalo
Guillermo Marco
ELM
65
3
0
18 Feb 2025
MathGAP: Out-of-Distribution Evaluation on Problems with Arbitrarily Complex Proofs
Andreas Opedal
Haruki Shirakami
Bernhard Schölkopf
Abulhair Saparov
Mrinmaya Sachan
LRM
57
3
0
17 Feb 2025
Do Large Language Models Reason Causally Like Us? Even Better?
Hanna M. Dettki
Brenden M. Lake
Charley M. Wu
Bob Rehder
ReLM
ELM
LRM
104
0
0
14 Feb 2025
PhD Knowledge Not Required: A Reasoning Challenge for Large Language Models
C. Anderson
Joydeep Biswas
Aleksander Boruch-Gruszecki
Federico Cassano
Molly Q. Feldman
Francesca Lucchetti
Zixuan Wu
Arjun Guha
ReLM
ELM
LRM
47
4
0
03 Feb 2025
Does a Large Language Model Really Speak in Human-Like Language?
Mose Park
Yunjin Choi
Jong-June Jeon
DeLMO
49
0
0
03 Jan 2025
On Memorization of Large Language Models in Logical Reasoning
Chulin Xie
Yangsibo Huang
Chiyuan Zhang
Da Yu
Xinyun Chen
Bill Yuchen Lin
Bo Li
Badih Ghazi
Ravi Kumar
LRM
58
25
0
30 Oct 2024
GSM-Symbolic: Understanding the Limitations of Mathematical Reasoning in Large Language Models
Iman Mirzadeh
Keivan Alizadeh
Hooman Shahrokhi
Oncel Tuzel
Samy Bengio
Mehrdad Farajtabar
AIMat
LRM
66
139
0
07 Oct 2024
In-Context Learning May Not Elicit Trustworthy Reasoning: A-Not-B Errors in Pretrained Language Models
Pengrui Han
Peiyang Song
Haofei Yu
Jiaxuan You
ReLM
LRM
38
1
0
23 Sep 2024
Evaluating Modern Approaches in 3D Scene Reconstruction: NeRF vs Gaussian-Based Methods
Yiming Zhou
Zixuan Zeng
Andi Chen
Xiaofan Zhou
Haowei Ni
Shiyao Zhang
Panfeng Li
Liangxi Liu
Mengyao Zheng
Xupeng Chen
3DGS
50
18
0
08 Aug 2024
Don't Make Your LLM an Evaluation Benchmark Cheater
Kun Zhou
Yutao Zhu
Zhipeng Chen
Wentong Chen
Wayne Xin Zhao
Xu Chen
Yankai Lin
Ji-Rong Wen
Jiawei Han
ELM
110
138
0
03 Nov 2023
Sparks of Artificial General Intelligence: Early experiments with GPT-4
Sébastien Bubeck
Varun Chandrasekaran
Ronen Eldan
J. Gehrke
Eric Horvitz
...
Scott M. Lundberg
Harsha Nori
Hamid Palangi
Marco Tulio Ribeiro
Yi Zhang
ELM
AI4MH
AI4CE
ALM
372
3,029
0
22 Mar 2023
Using cognitive psychology to understand GPT-3
Marcel Binz
Eric Schulz
ELM
LLMAG
254
447
0
21 Jun 2022