System III: Learning with Domain Knowledge for Safety Constraints


23 April 2023
Fazl Barez
Hosein Hasanbeig
Alessandro Abate
arXiv: 2304.11593

Papers citing "System III: Learning with Domain Knowledge for Safety Constraints"

6 / 6 papers shown
Title
Towards Interpretable Sequence Continuation: Analyzing Shared Circuits in Large Language Models
Michael Lan
Phillip H. S. Torr
Fazl Barez
LRM
38
3
0
07 Nov 2023
Neuron to Graph: Interpreting Language Model Neurons at Scale
Alex Foote
Neel Nanda
Esben Kran
Ioannis Konstas
Shay B. Cohen
Fazl Barez
MILM
11
24
0
31 May 2023
N2G: A Scalable Approach for Quantifying Interpretable Neuron Representations in Large Language Models
Alex Foote
Neel Nanda
Esben Kran
Ioannis Konstas
Fazl Barez
MILM
28
3
0
22 Apr 2023
Fairness in AI and Its Long-Term Implications on Society
Ondrej Bohdal
Timothy M. Hospedales
Philip Torr
Fazl Barez
15
4
0
16 Apr 2023
Unsolved Problems in ML Safety
Dan Hendrycks
Nicholas Carlini
John Schulman
Jacob Steinhardt
186
276
0
28 Sep 2021
Modular Deep Reinforcement Learning for Continuous Motion Planning with Temporal Logic
Mingyu Cai
Mohammadhosein Hasanbeig
Shaoping Xiao
Alessandro Abate
Z. Kan
80
86
0
24 Feb 2021