2305.15408
Cited By
Towards Revealing the Mystery behind Chain of Thought: A Theoretical Perspective
24 May 2023
Guhao Feng
Bohang Zhang
Yuntian Gu
Haotian Ye
Di He
Liwei Wang
LRM
Papers citing
"Towards Revealing the Mystery behind Chain of Thought: A Theoretical Perspective"
16 / 166 papers shown
Title
Think before you speak: Training Language Models With Pause Tokens
Sachin Goyal
Ziwei Ji
A. S. Rawat
A. Menon
Sanjiv Kumar
Vaishnavh Nagarajan
LRM
24
96
0
03 Oct 2023
Language Models as a Service: Overview of a New Paradigm and its Challenges
Emanuele La Malfa
Aleksandar Petrov
Simon Frieder
Christoph Weinhuber
Ryan Burnell
Raza Nazar
Anthony Cohn
Nigel Shadbolt
Michael Wooldridge
ALM
ELM
35
3
0
28 Sep 2023
Navigate through Enigmatic Labyrinth A Survey of Chain of Thought Reasoning: Advances, Frontiers and Future
Zheng Chu
Jingchang Chen
Qianglong Chen
Weijiang Yu
Tao He
Haotian Wang
Weihua Peng
Ming-Yu Liu
Bing Qin
Ting Liu
LRM
AI4CE
37
153
0
27 Sep 2023
Chain-of-Thought Reasoning is a Policy Improvement Operator
Hugh Zhang
David C. Parkes
ReLM
LM&Ro
LRM
31
12
0
15 Sep 2023
Auto-Regressive Next-Token Predictors are Universal Learners
Eran Malach
LRM
24
36
0
13 Sep 2023
Data-Juicer: A One-Stop Data Processing System for Large Language Models
Daoyuan Chen
Yilun Huang
Zhijian Ma
Hesen Chen
Xuchen Pan
...
Zhaoyang Liu
Jinyang Gao
Yaliang Li
Bolin Ding
Jingren Zhou
SyDa
VLM
31
30
0
05 Sep 2023
Cumulative Reasoning with Large Language Models
Yifan Zhang
Jingqin Yang
Yang Yuan
Andrew Chi-Chih Yao
ReLM
ELM
LRM
AI4CE
42
69
0
08 Aug 2023
Max-Margin Token Selection in Attention Mechanism
Davoud Ataee Tarzanagh
Yingcong Li
Xuechen Zhang
Samet Oymak
40
38
0
23 Jun 2023
What and How does In-Context Learning Learn? Bayesian Model Averaging, Parameterization, and Generalization
Yufeng Zhang
Fengzhuo Zhang
Zhuoran Yang
Zhaoran Wang
BDL
36
63
0
30 May 2023
A Knowledge Engineering Primer
Agnieszka Lawrynowicz
27
0
0
26 May 2023
Sparks of Artificial General Intelligence: Early experiments with GPT-4
Sébastien Bubeck
Varun Chandrasekaran
Ronen Eldan
J. Gehrke
Eric Horvitz
...
Scott M. Lundberg
Harsha Nori
Hamid Palangi
Marco Tulio Ribeiro
Yi Zhang
ELM
AI4MH
AI4CE
ALM
328
2,232
0
22 Mar 2023
In-context Learning and Induction Heads
Catherine Olsson
Nelson Elhage
Neel Nanda
Nicholas Joseph
Nova Dassarma
...
Tom B. Brown
Jack Clark
Jared Kaplan
Sam McCandlish
C. Olah
250
463
0
24 Sep 2022
Your Transformer May Not be as Powerful as You Expect
Shengjie Luo
Shanda Li
Shuxin Zheng
Tie-Yan Liu
Liwei Wang
Di He
70
51
0
26 May 2022
Large Language Models are Zero-Shot Reasoners
Takeshi Kojima
S. Gu
Machel Reid
Yutaka Matsuo
Yusuke Iwasawa
ReLM
LRM
328
4,077
0
24 May 2022
Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
Jason W. Wei
Xuezhi Wang
Dale Schuurmans
Maarten Bosma
Brian Ichter
F. Xia
Ed H. Chi
Quoc Le
Denny Zhou
LM&Ro
LRM
AI4CE
ReLM
398
8,559
0
28 Jan 2022
Train Short, Test Long: Attention with Linear Biases Enables Input Length Extrapolation
Ofir Press
Noah A. Smith
M. Lewis
253
698
0
27 Aug 2021