ResearchTrend.AI
Concrete Problems in AI Safety

21 June 2016
Dario Amodei
C. Olah
Jacob Steinhardt
Paul Christiano
John Schulman
Dandelion Mané
arXiv: 1606.06565

Papers citing "Concrete Problems in AI Safety"

50 / 487 papers shown
Title
Model-Free Risk-Sensitive Reinforcement Learning
Grégoire Delétang
Jordi Grau-Moya
M. Kunesch
Tim Genewein
Rob Brekelmans
Shane Legg
Pedro A. Ortega
OOD
13
9
0
04 Nov 2021
MultiplexNet: Towards Fully Satisfied Logical Constraints in Neural Networks
Nicholas Hoernle
Rafael-Michael Karampatsis
Vaishak Belle
Y. Gal
27
59
0
02 Nov 2021
GalilAI: Out-of-Task Distribution Detection using Causal Active Experimentation for Safe Transfer RL
Sumedh Anand Sontakke
Stephen Iota
Zizhao Hu
Arash Mehrjou
Laurent Itti
Bernhard Schölkopf
OODD
20
2
0
29 Oct 2021
What Do We Mean by Generalization in Federated Learning?
Honglin Yuan
Warren Morningstar
Lin Ning
K. Singhal
OOD
FedML
46
71
0
27 Oct 2021
Generalized Out-of-Distribution Detection: A Survey
Jingkang Yang
Kaiyang Zhou
Yixuan Li
Ziwei Liu
193
881
0
21 Oct 2021
On games and simulators as a platform for development of artificial intelligence for command and control
Vinicius G. Goecks
Nicholas R. Waytowich
Derrik E. Asher
Song Jun Park
Mark R. Mittrick
...
Anne Logie
Mark S. Dennison
T. Trout
Priya Narayanan
Alexander Kott
41
26
0
21 Oct 2021
A TinyML Platform for On-Device Continual Learning with Quantized Latent Replays
Leonardo Ravaglia
Manuele Rusci
D. Nadalini
Alessandro Capotondi
Francesco Conti
Luca Benini
BDL
41
64
0
20 Oct 2021
Natural Attribute-based Shift Detection
Jeonghoon Park
Jimin Hong
Radhika Dua
Daehoon Gwak
Yixuan Li
Jaegul Choo
Edward Choi
OOD
25
3
0
18 Oct 2021
Multi-Agent Constrained Policy Optimisation
Shangding Gu
J. Kuba
Munning Wen
Ruiqing Chen
Ziyan Wang
Zheng Tian
Jun Wang
Alois Knoll
Yaodong Yang
98
49
0
06 Oct 2021
Trustworthy AI: From Principles to Practices
Bo-wen Li
Peng Qi
Bo Liu
Shuai Di
Jingen Liu
Jiquan Pei
Jinfeng Yi
Bowen Zhou
119
357
0
04 Oct 2021
BulletTrain: Accelerating Robust Neural Network Training via Boundary Example Mining
Weizhe Hua
Yichi Zhang
Chuan Guo
Zhiru Zhang
G. E. Suh
OOD
39
15
0
29 Sep 2021
Recursively Summarizing Books with Human Feedback
Jeff Wu
Long Ouyang
Daniel M. Ziegler
Nisan Stiennon
Ryan J. Lowe
Jan Leike
Paul Christiano
ALM
40
296
0
22 Sep 2021
MPC-Friendly Commitments for Publicly Verifiable Covert Security
Nitin Agrawal
James Bell
Adria Gascon
Matt J. Kusner
28
4
0
15 Sep 2021
No True State-of-the-Art? OOD Detection Methods are Inconsistent across Datasets
Fahim Tajwar
Ananya Kumar
Sang Michael Xie
Percy Liang
OODD
27
21
0
12 Sep 2021
Robust fine-tuning of zero-shot models
Mitchell Wortsman
Gabriel Ilharco
Jong Wook Kim
Mike Li
Simon Kornblith
...
Raphael Gontijo-Lopes
Hannaneh Hajishirzi
Ali Farhadi
Hongseok Namkoong
Ludwig Schmidt
VLM
71
697
0
04 Sep 2021
Evaluating Predictive Uncertainty under Distributional Shift on Dialogue Dataset
Nyoungwoo Lee
ChaeHun Park
Ho-Jin Choi
47
0
0
01 Sep 2021
NoiER: An Approach for Training more Reliable Fine-Tuned Downstream Task Models
Myeongjun Jang
Thomas Lukasiewicz
24
4
0
29 Aug 2021
Revealing the Distributional Vulnerability of Discriminators by Implicit Generators
Zhilin Zhao
LongBing Cao
Kun-Yu Lin
39
11
0
23 Aug 2021
Out-of-Distribution Detection Using Outlier Detection Methods
Jan Diers
Christian Pigorsch
OODD
28
3
0
18 Aug 2021
A Framework for Understanding AI-Induced Field Change: How AI Technologies are Legitimized and Institutionalized
B. Larsen
24
4
0
18 Aug 2021
Dealing with Distribution Mismatch in Semi-supervised Deep Learning for Covid-19 Detection Using Chest X-ray Images: A Novel Approach Using Feature Densities
Saul Calderon-Ramirez
Shengxiang Yang
David Elizondo
Armaghan Moemeni
OOD
28
24
0
17 Aug 2021
Reimagining an autonomous vehicle
Jeffrey Hawke
E. Haibo
Vijay Badrinarayanan
Alex Kendall
46
11
0
12 Aug 2021
Skill Preferences: Learning to Extract and Execute Robotic Skills from Human Feedback
Xiaofei Wang
Kimin Lee
Kourosh Hakhamaneshi
Pieter Abbeel
Michael Laskin
34
42
0
11 Aug 2021
Risk Averse Bayesian Reward Learning for Autonomous Navigation from Human Demonstration
Christian Ellis
Maggie B. Wigness
J. Rogers
Craig T. Lennon
L. Fiondella
90
6
0
31 Jul 2021
Who's Afraid of Thomas Bayes?
Erick Galinkin
AAML
28
0
0
30 Jul 2021
Toward AI Assistants That Let Designers Design
Sebastiaan De Peuter
Antti Oulasvirta
Samuel Kaski
AI4CE
29
19
0
22 Jul 2021
Visual Adversarial Imitation Learning using Variational Models
Rafael Rafailov
Tianhe Yu
Aravind Rajeswaran
Chelsea Finn
SSL
28
49
0
16 Jul 2021
Scalable Evaluation of Multi-Agent Reinforcement Learning with Melting Pot
Joel Z Leibo
Edgar A. Duénez-Guzmán
A. Vezhnevets
J. Agapiou
P. Sunehag
Raphael Köster
Jayd Matyas
Charlie Beattie
Igor Mordatch
T. Graepel
OffRL
58
104
0
14 Jul 2021
Online Adaptation to Label Distribution Shift
Ruihan Wu
Chuan Guo
Yi-Hsun Su
Kilian Q. Weinberger
21
47
0
09 Jul 2021
Physics-Guided Deep Learning for Dynamical Systems: A Survey
Rui Wang
Rose Yu
AI4CE
PINN
46
65
0
02 Jul 2021
Unsupervised Model Drift Estimation with Batch Normalization Statistics for Dataset Shift Detection and Model Selection
Won-Jo Lee
Seokhyun Byun
Jooeun Kim
Minje Park
Kirill Chechil
AI4TS
21
2
0
01 Jul 2021
Learning from an Exploring Demonstrator: Optimal Reward Estimation for Bandits
Wenshuo Guo
Kumar Krishna Agrawal
Aditya Grover
Vidya Muthukumar
A. Pananjady
16
8
0
28 Jun 2021
Ensembling Shift Detectors: an Extensive Empirical Evaluation
Simona Maggio
L. Dreyfus-Schmidt
AI4TS
34
3
0
28 Jun 2021
Compositional Reinforcement Learning from Logical Specifications
Kishor Jothimurugan
Suguman Bansal
Osbert Bastani
Rajeev Alur
CoGe
28
78
0
25 Jun 2021
Being a Bit Frequentist Improves Bayesian Neural Networks
Agustinus Kristiadi
Matthias Hein
Philipp Hennig
BDL
UQCV
23
15
0
18 Jun 2021
A Simple Fix to Mahalanobis Distance for Improving Near-OOD Detection
Jie Jessie Ren
Stanislav Fort
J. Liu
Abhijit Guha Roy
Shreyas Padhy
Balaji Lakshminarayanan
UQCV
33
219
0
16 Jun 2021
Safe Reinforcement Learning Using Advantage-Based Intervention
Nolan Wagener
Byron Boots
Ching-An Cheng
34
52
0
16 Jun 2021
On-Policy Deep Reinforcement Learning for the Average-Reward Criterion
Yiming Zhang
Keith Ross
OffRL
41
41
0
14 Jun 2021
Taxonomy of Machine Learning Safety: A Survey and Primer
Sina Mohseni
Haotao Wang
Zhiding Yu
Chaowei Xiao
Zhangyang Wang
J. Yadawa
31
31
0
09 Jun 2021
Definitions of intent suitable for algorithms
Hal Ashton
13
18
0
08 Jun 2021
Learning Policies with Zero or Bounded Constraint Violation for Constrained MDPs
Tao-Wen Liu
Ruida Zhou
D. Kalathil
P. R. Kumar
Chao Tian
42
78
0
04 Jun 2021
Goal Misgeneralization in Deep Reinforcement Learning
L. Langosco
Jack Koch
Lee D. Sharkey
J. Pfau
Laurent Orseau
David M. Krueger
30
78
0
28 May 2021
A Survey on Interactive Reinforcement Learning: Design Principles and Open Challenges
Christian Arzate Cruz
Takeo Igarashi
OffRL
17
94
0
27 May 2021
Quantifying Uncertainty in Deep Spatiotemporal Forecasting
Dongxian Wu
Liyao (Mars) Gao
X. Xiong
Matteo Chinazzi
Alessandro Vespignani
Yi Ma
Rose Yu
AI4TS
16
68
0
25 May 2021
True Few-Shot Learning with Language Models
Ethan Perez
Douwe Kiela
Kyunghyun Cho
21
428
0
24 May 2021
Policy Mirror Descent for Regularized Reinforcement Learning: A Generalized Framework with Linear Convergence
Wenhao Zhan
Shicong Cen
Baihe Huang
Yuxin Chen
Jason D. Lee
Yuejie Chi
24
76
0
24 May 2021
Stochastic-Shield: A Probabilistic Approach Towards Training-Free Adversarial Defense in Quantized CNNs
Lorena Qendro
Sangwon Ha
R. D. Jong
Partha P. Maji
AAML
FedML
MQ
21
7
0
13 May 2021
Intelligence and Unambitiousness Using Algorithmic Information Theory
Michael K. Cohen
Badri N. Vellambi
Marcus Hutter
16
2
0
13 May 2021
Safety of the Intended Driving Behavior Using Rulebooks
Anne-Sophie Collin
Artur Bilka
S. Pendleton
R. D. Tebbens
6
25
0
10 May 2021
Software Engineering for AI-Based Systems: A Survey
Silverio Martínez-Fernández
Justus Bogner
Xavier Franch
Marc Oriol
Julien Siebert
Adam Trendowicz
Anna Maria Vollmer
Stefan Wagner
27
211
0
05 May 2021