2502.15507
v3 (latest)
Activation Steering in Neural Theorem Provers
21 February 2025
Shashank Kirtania
LLMSV
Papers citing
"Activation Steering in Neural Theorem Provers"
35 / 35 papers shown
Improving Instruction-Following in Language Models through Activation Steering
Alessandro Stolfo
Vidhisha Balachandran
Safoora Yousefi
Eric Horvitz
Besmira Nushi
LLMSV
120
28
0
15 Oct 2024
Lean-STaR: Learning to Interleave Thinking and Proving
Haohan Lin
Zhiqing Sun
Yiming Yang
Sean Welleck
ReLM
LRM
135
29
0
14 Jul 2024
Multi-property Steering of Large Language Models with Dynamic Activation Composition
Daniel Scalena
Gabriele Sarti
Malvina Nissim
KELM
LLMSV
AI4CE
70
15
0
25 Jun 2024
Refusal in Language Models Is Mediated by a Single Direction
Andy Arditi
Oscar Obeso
Aaquib Syed
Daniel Paleka
Nina Panickssery
Wes Gurnee
Neel Nanda
124
213
0
17 Jun 2024
Proving Theorems Recursively
Haiming Wang
Huajian Xin
Zhengying Liu
Wenda Li
Yinya Huang
...
Zhicheng Yang
Jing Tang
Jian Yin
Zhenguo Li
Xiaodan Liang
LRM
75
16
0
23 May 2024
Extending Activation Steering to Broad Skills and Multiple Behaviours
Teun van der Weij
Massimo Poesio
Nandi Schoots
LLMSV
73
18
0
09 Mar 2024
InternLM-Math: Open Math Large Language Models Toward Verifiable Reasoning
Huaiyuan Ying
Shuo Zhang
Linyang Li
Zhejian Zhou
Yunfan Shao
...
Hang Yan
Xipeng Qiu
Jiayu Wang
Kai-xiang Chen
Dahua Lin
ReLM
LRM
61
82
0
09 Feb 2024
Look Before You Leap: A Universal Emergent Decomposition of Retrieval Tasks in Language Models
Alexandre Variengien
Eric Winsor
LRM
ReLM
144
12
0
13 Dec 2023
Steering Llama 2 via Contrastive Activation Addition
Nina Rimsky
Nick Gabrieli
Julian Schulz
Meg Tong
Evan Hubinger
Alexander Matt Turner
LLMSV
57
220
0
09 Dec 2023
The Linear Representation Hypothesis and the Geometry of Large Language Models
Kiho Park
Yo Joong Choe
Victor Veitch
LLMSV
MILM
123
186
0
07 Nov 2023
Probing Explicit and Implicit Gender Bias through LLM Conditional Text Generation
Xiangjue Dong
Yibo Wang
Philip S. Yu
James Caverlee
73
29
0
01 Nov 2023
LLMSTEP: LLM proofstep suggestions in Lean
Sean Welleck
Rahul Saha
34
28
0
27 Oct 2023
In-Context Learning Creates Task Vectors
Roee Hendel
Mor Geva
Amir Globerson
101
165
0
24 Oct 2023
Function Vectors in Large Language Models
Eric Todd
Millicent Li
Arnab Sen Sharma
Aaron Mueller
Byron C. Wallace
David Bau
55
116
0
23 Oct 2023
Llemma: An Open Language Model For Mathematics
Zhangir Azerbayev
Hailey Schoelkopf
Keiran Paster
Marco Dos Santos
Stephen Marcus McAleer
Albert Q. Jiang
Jia Deng
Stella Biderman
Sean Welleck
CLL
89
300
0
16 Oct 2023
Take a Step Back: Evoking Reasoning via Abstraction in Large Language Models
Huaixiu Steven Zheng
Swaroop Mishra
Xinyun Chen
Heng-Tze Cheng
Ed H. Chi
Quoc V. Le
Denny Zhou
RALM
LRM
62
121
0
09 Oct 2023
An In-Context Learning Agent for Formal Theorem-Proving
Amitayush Thakur
George Tsoukalas
Yeming Wen
Jimmy Xin
Swarat Chaudhuri
LLMAG
62
32
0
06 Oct 2023
The Hydra Effect: Emergent Self-repair in Language Model Computations
Tom McGrath
Matthew Rahtz
János Kramár
Vladimir Mikulik
Shane Legg
MILM
LRM
52
73
0
28 Jul 2023
Large Language Models
Michael R Douglas
LLMAG
LM&MA
138
637
0
11 Jul 2023
LeanDojo: Theorem Proving with Retrieval-Augmented Language Models
Kaiyu Yang
Aidan M. Swope
Alex Gu
Rahul Chalamala
Peiyang Song
Shixing Yu
Saad Godil
R. Prenger
Anima Anandkumar
RALM
91
244
0
27 Jun 2023
Inference-Time Intervention: Eliciting Truthful Answers from a Language Model
Kenneth Li
Oam Patel
Fernanda Viégas
Hanspeter Pfister
Martin Wattenberg
KELM
HILM
90
579
0
06 Jun 2023
Baldur: Whole-Proof Generation and Repair with Large Language Models
E. First
M. Rabe
Talia Ringer
Yuriy Brun
115
103
0
08 Mar 2023
Editing Models with Task Arithmetic
Gabriel Ilharco
Marco Tulio Ribeiro
Mitchell Wortsman
Suchin Gururangan
Ludwig Schmidt
Hannaneh Hajishirzi
Ali Farhadi
KELM
MoMe
MU
189
518
0
08 Dec 2022
Discovering Latent Knowledge in Language Models Without Supervision
Collin Burns
Haotian Ye
Dan Klein
Jacob Steinhardt
140
383
0
07 Dec 2022
NaturalProver: Grounded Mathematical Proof Generation with Language Models
Sean Welleck
Jiacheng Liu
Ximing Lu
Hannaneh Hajishirzi
Yejin Choi
AIMat
LRM
73
73
0
25 May 2022
Thor: Wielding Hammers to Integrate Language Models and Automated Theorem Provers
Albert Q. Jiang
Wenda Li
Szymon Tworkowski
K. Czechowski
Tomasz Odrzygóźdź
Piotr Miłoś
Yuhuai Wu
M. Jamnik
AIMat
LRM
85
102
0
22 May 2022
Extracting Latent Steering Vectors from Pretrained Language Models
Nishant Subramani
Nivedita Suresh
Matthew E. Peters
LLMSV
78
98
0
10 May 2022
Locating and Editing Factual Associations in GPT
Kevin Meng
David Bau
A. Andonian
Yonatan Belinkov
KELM
248
1,357
0
10 Feb 2022
Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
Jason W. Wei
Xuezhi Wang
Dale Schuurmans
Maarten Bosma
Brian Ichter
F. Xia
Ed H. Chi
Quoc Le
Denny Zhou
LM&Ro
LRM
AI4CE
ReLM
820
9,576
0
28 Jan 2022
MiniF2F: a cross-system benchmark for formal Olympiad-level mathematics
Kunhao Zheng
Jesse Michael Han
Stanislas Polu
AIMat
101
176
0
31 Aug 2021
Plug and Play Language Models: A Simple Approach to Controlled Text Generation
Sumanth Dathathri
Andrea Madotto
Janice Lan
Jane Hung
Eric Frank
Piero Molino
J. Yosinski
Rosanne Liu
KELM
147
976
0
04 Dec 2019
Graph Representations for Higher-Order Logic and Theorem Proving
Aditya Sanjay Paliwal
Sarah M. Loos
M. Rabe
Kshitij Bansal
Christian Szegedy
AI4CE
NoLa
178
98
0
24 May 2019
Learning to Prove Theorems via Interacting with Proof Assistants
Kaiyu Yang
Jia Deng
AIMat
LRM
110
146
0
21 May 2019
GamePad: A Learning Environment for Theorem Proving
Daniel Huang
Prafulla Dhariwal
Basel Alomair
Ilya Sutskever
96
110
0
02 Jun 2018
Holophrasm: a neural Automated Theorem Prover for higher-order logic
Daniel Whalen
AIMat
76
50
0
08 Aug 2016