ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2310.04921
  4. Cited By
Crystal: Introspective Reasoners Reinforced with Self-Feedback

Crystal: Introspective Reasoners Reinforced with Self-Feedback

7 October 2023
Jiacheng Liu
Ramakanth Pasunuru
Hannaneh Hajishirzi
Yejin Choi
Asli Celikyilmaz
    LRM
    ReLM
ArXivPDFHTML

Papers citing "Crystal: Introspective Reasoners Reinforced with Self-Feedback"

17 / 17 papers shown
Title
ZEBRA: Zero-Shot Example-Based Retrieval Augmentation for Commonsense
  Question Answering
ZEBRA: Zero-Shot Example-Based Retrieval Augmentation for Commonsense Question Answering
Francesco Maria Molfese
Simone Conia
Riccardo Orlando
Roberto Navigli
ReLM
LRM
RALM
30
1
0
07 Oct 2024
Rationale-Aware Answer Verification by Pairwise Self-Evaluation
Rationale-Aware Answer Verification by Pairwise Self-Evaluation
Akira Kawabata
Saku Sugawara
LRM
39
3
0
07 Oct 2024
Unpacking DPO and PPO: Disentangling Best Practices for Learning from
  Preference Feedback
Unpacking DPO and PPO: Disentangling Best Practices for Learning from Preference Feedback
Hamish Ivison
Yizhong Wang
Jiacheng Liu
Zeqiu Wu
Valentina Pyatkin
Nathan Lambert
Noah A. Smith
Yejin Choi
Hannaneh Hajishirzi
46
40
0
13 Jun 2024
Grokked Transformers are Implicit Reasoners: A Mechanistic Journey to
  the Edge of Generalization
Grokked Transformers are Implicit Reasoners: A Mechanistic Journey to the Edge of Generalization
Boshi Wang
Xiang Yue
Yu-Chuan Su
Huan Sun
LRM
29
42
0
23 May 2024
RaFe: Ranking Feedback Improves Query Rewriting for RAG
RaFe: Ranking Feedback Improves Query Rewriting for RAG
Shengyu Mao
Yong-jia Jiang
Boli Chen
Xiao Li
Peng Wang
Xinyu Wang
Pengjun Xie
Fei Huang
Huajun Chen
Ningyu Zhang
RALM
39
19
0
23 May 2024
Monte Carlo Tree Search Boosts Reasoning via Iterative Preference
  Learning
Monte Carlo Tree Search Boosts Reasoning via Iterative Preference Learning
Yuxi Xie
Anirudh Goyal
Wenyue Zheng
Min-Yen Kan
Timothy Lillicrap
Kenji Kawaguchi
Michael Shieh
ReLM
LRM
55
90
0
01 May 2024
SELF-[IN]CORRECT: LLMs Struggle with Refining Self-Generated Responses
SELF-[IN]CORRECT: LLMs Struggle with Refining Self-Generated Responses
Dongwei Jiang
Jingyu Zhang
Orion Weller
Nathaniel Weir
Benjamin Van Durme
Daniel Khashabi
65
1
0
04 Apr 2024
Quiet-STaR: Language Models Can Teach Themselves to Think Before
  Speaking
Quiet-STaR: Language Models Can Teach Themselves to Think Before Speaking
E. Zelikman
Georges Harik
Yijia Shao
Varuna Jayasiri
Nick Haber
Noah D. Goodman
LLMAG
ReLM
LRM
55
113
0
14 Mar 2024
Focus on Your Question! Interpreting and Mitigating Toxic CoT Problems
  in Commonsense Reasoning
Focus on Your Question! Interpreting and Mitigating Toxic CoT Problems in Commonsense Reasoning
Jiachun Li
Pengfei Cao
Chenhao Wang
Zhuoran Jin
Yubo Chen
Daojian Zeng
Kang Liu
Jun Zhao
LRM
51
8
0
28 Feb 2024
Rule or Story, Which is a Better Commonsense Expression for Talking with
  Large Language Models?
Rule or Story, Which is a Better Commonsense Expression for Talking with Large Language Models?
Ning Bian
Xianpei Han
Hongyu Lin
Yaojie Lu
Xianpei Han
Le Sun
34
1
0
22 Feb 2024
Making Reasoning Matter: Measuring and Improving Faithfulness of
  Chain-of-Thought Reasoning
Making Reasoning Matter: Measuring and Improving Faithfulness of Chain-of-Thought Reasoning
Debjit Paul
Robert West
Antoine Bosselut
Boi Faltings
ReLM
LRM
41
21
0
21 Feb 2024
KnowTuning: Knowledge-aware Fine-tuning for Large Language Models
KnowTuning: Knowledge-aware Fine-tuning for Large Language Models
Yougang Lyu
Lingyong Yan
Shuaiqiang Wang
Haibo Shi
Dawei Yin
Pengjie Ren
Zhumin Chen
Maarten de Rijke
Zhaochun Ren
24
5
0
17 Feb 2024
Navigate through Enigmatic Labyrinth A Survey of Chain of Thought
  Reasoning: Advances, Frontiers and Future
Navigate through Enigmatic Labyrinth A Survey of Chain of Thought Reasoning: Advances, Frontiers and Future
Zheng Chu
Jingchang Chen
Qianglong Chen
Weijiang Yu
Tao He
Haotian Wang
Weihua Peng
Ming Liu
Bing Qin
Ting Liu
LRM
AI4CE
37
155
0
27 Sep 2023
SCOTT: Self-Consistent Chain-of-Thought Distillation
SCOTT: Self-Consistent Chain-of-Thought Distillation
Jamie Yap
Zhengyang Wang
Zheng Li
K. Lynch
Bing Yin
Xiang Ren
LRM
78
93
0
03 May 2023
Generate rather than Retrieve: Large Language Models are Strong Context
  Generators
Generate rather than Retrieve: Large Language Models are Strong Context Generators
Wenhao Yu
Dan Iter
Shuohang Wang
Yichong Xu
Mingxuan Ju
Soumya Sanyal
Chenguang Zhu
Michael Zeng
Meng Jiang
RALM
AIMat
240
323
0
21 Sep 2022
Training language models to follow instructions with human feedback
Training language models to follow instructions with human feedback
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
...
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
OSLM
ALM
375
12,081
0
04 Mar 2022
Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
Jason W. Wei
Xuezhi Wang
Dale Schuurmans
Maarten Bosma
Brian Ichter
F. Xia
Ed H. Chi
Quoc Le
Denny Zhou
LM&Ro
LRM
AI4CE
ReLM
447
8,650
0
28 Jan 2022
1