Auditing large language models: a three-layered approach
Jakob Mokander, Jonas Schuett, Hannah Rose Kirk, Luciano Floridi
16 February 2023 (AILaw, MLAU)

Papers citing "Auditing large language models: a three-layered approach"

Showing 50 of 98 citing papers.

Human-AI Governance (HAIG): A Trust-Utility Approach
Zeynep Engin
03 May 2025

Position: Ensuring mutual privacy is necessary for effective external evaluation of proprietary AI systems
Ben Bucknall, Robert F. Trager, Michael A. Osborne
03 Mar 2025

Multilingual != Multicultural: Evaluating Gaps Between Multilingual Capabilities and Cultural Alignment in LLMs
Jonathan Rystrøm, Hannah Rose Kirk, Scott A. Hale
23 Feb 2025

Addressing the regulatory gap: moving towards an EU AI audit ecosystem beyond the AI Act by including civil society
David Hartmann, José Renato Laranjeira de Pereira, Chiara Streitbörger, Bettina Berendt
20 Feb 2025

CALM: Curiosity-Driven Auditing for Large Language Models
Xiang Zheng, Longxiang Wang, Yi Liu, Xingjun Ma, Chao Shen, Cong Wang
06 Jan 2025 (MLAU)

The Systems Engineering Approach in Times of Large Language Models
Christian Cabrera, Viviana Bastidas, Jennifer Schooling, Neil D. Lawrence
13 Nov 2024

Safety case template for frontier AI: A cyber inability argument
Arthur Goemans, Marie Davidsen Buhl, Jonas Schuett, Tomek Korbak, Jessica Wang, Benjamin Hilton, Geoffrey Irving
12 Nov 2024

Beyond the Safety Bundle: Auditing the Helpful and Harmless Dataset
Khaoula Chehbouni, Jonathan Colaço-Carr, Yash More, Jackie CK Cheung, G. Farnadi
12 Nov 2024

A Clinical Trial Design Approach to Auditing Language Models in Healthcare Setting
Lovedeep Gondara, Jonathan Simkin
11 Nov 2024 (LM&MA)

PRISM: A Methodology for Auditing Biases in Large Language Models
Leif Azzopardi, Yashar Moshfeghi
24 Oct 2024

Causality for Large Language Models
Anpeng Wu, Kun Kuang, Minqin Zhu, Yingrong Wang, Yujia Zheng, Kairong Han, B. Li, Guangyi Chen, Fei Wu, Kun Zhang
20 Oct 2024 (LRM)

Truth or Deceit? A Bayesian Decoding Game Enhances Consistency and Reliability
Weitong Zhang, Chengqi Zang, Bernhard Kainz
01 Oct 2024

Responsible AI in Open Ecosystems: Reconciling Innovation with Risk Assessment and Disclosure
Mahasweta Chakraborti, Bert Joseph Prestoza, Nicholas Vincent, Seth Frey
27 Sep 2024

Improving governance outcomes through AI documentation: Bridging theory and practice
Amy A. Winecoff, Miranda Bogen
13 Sep 2024

Automatic Pseudo-Harmful Prompt Generation for Evaluating False Refusals in Large Language Models
Bang An, Sicheng Zhu, Ruiyi Zhang, Michael-Andrei Panaitescu-Liess, Yuancheng Xu, Furong Huang
01 Sep 2024 (AAML)

Design of a Quality Management System based on the EU Artificial Intelligence Act
Henryk Mustroph, Stefanie Rinderle-Ma
08 Aug 2024

Can LLMs be Fooled? Investigating Vulnerabilities in LLMs
Sara Abdali, Jia He, C. Barberan, Richard Anarfi
30 Jul 2024

Thorns and Algorithms: Navigating Generative AI Challenges Inspired by Giraffes and Acacias
Waqar Hussain
16 Jul 2024

Auditing of AI: Legal, Ethical and Technical Approaches
Jakob Mokander
07 Jul 2024

JailbreakHunter: A Visual Analytics Approach for Jailbreak Prompts Discovery from Large-Scale Human-LLM Conversational Datasets
Zhihua Jin, Shiyi Liu, Haotian Li, Xun Zhao, Huamin Qu
03 Jul 2024

Nicer Than Humans: How do Large Language Models Behave in the Prisoner's Dilemma?
Nicoló Fontana, Francesco Pierri, L. Aiello
19 Jun 2024

Towards Trustworthy AI: A Review of Ethical and Robust Large Language Models
Meftahul Ferdaus, Mahdi Abdelguerfi, Elias Ioup, Kendall N. Niles, Ken Pathak, Steve Sloan
01 Jun 2024

Risks and Opportunities of Open-Source Generative AI
Francisco Eiras, Aleksander Petrov, Bertie Vidgen, Christian Schroeder, Fabio Pizzati, ..., Matthew Jackson, Phillip H. S. Torr, Trevor Darrell, Y. Lee, Jakob N. Foerster
14 May 2024

Navigating LLM Ethics: Advancements, Challenges, and Future Directions
Junfeng Jiao, S. Afroogh, Yiming Xu, Connor Phillips
14 May 2024 (AILaw)

Concerns on Bias in Large Language Models when Creating Synthetic Personae
Helena A. Haxvig
08 May 2024 (SyDa)

More RLHF, More Trust? On The Impact of Human Preference Alignment On Language Model Trustworthiness
Aaron Jiaxun Li, Satyapriya Krishna, Himabindu Lakkaraju
29 Apr 2024

Near to Mid-term Risks and Opportunities of Open-Source Generative AI
Francisco Eiras, Aleksandar Petrov, Bertie Vidgen, Christian Schroeder de Witt, Fabio Pizzati, ..., Paul Röttger, Philip H. S. Torr, Trevor Darrell, Y. Lee, Jakob N. Foerster
25 Apr 2024

Holistic Safety and Responsibility Evaluations of Advanced AI Models
Laura Weidinger, Joslyn Barnhart, Jenny Brennan, Christina Butterfield, Susie Young, ..., Sebastian Farquhar, Lewis Ho, Iason Gabriel, Allan Dafoe, William S. Isaac
22 Apr 2024 (ELM)

The Necessity of AI Audit Standards Boards
David Manheim, Sammy Martin, Mark Bailey, Mikhail Samin, Ross Greutzmacher
11 Apr 2024

The Impact of Unstated Norms in Bias Analysis of Language Models
Farnaz Kohankhaki, David B. Emerson, Laleh Seyyed-Kalantari, Faiza Khan Khattak
04 Apr 2024

Responsible Reporting for Frontier AI Development
Noam Kolt, Markus Anderljung, Joslyn Barnhart, Asher Brass, K. Esvelt, Gillian K. Hadfield, Lennart Heim, Mikel Rodriguez, Jonas B. Sandbrink, Thomas Woodside
03 Apr 2024

Aurora-M: The First Open Source Multilingual Language Model Red-teamed according to the U.S. Executive Order
Taishi Nakamura, Mayank Mishra, Simone Tedeschi, Yekun Chai, Jason T Stillerman, ..., Virendra Mehta, Matthew Blumberg, Victor May, Huu Nguyen, S. Pyysalo
30 Mar 2024 (LRM)

"I'm categorizing LLM as a productivity tool": Examining ethics of LLM use in HCI research practices
Shivani Kapania, Ruiyi Wang, Toby Jia-Jun Li, Tianshi Li, Hong Shen
28 Mar 2024

Accelerating Greedy Coordinate Gradient via Probe Sampling
Yiran Zhao, Wenyue Zheng, Tianle Cai, Xuan Long Do, Kenji Kawaguchi, Anirudh Goyal, Michael Shieh
02 Mar 2024

Exploring the Potential of Large Language Models for Improving Digital Forensic Investigation Efficiency
Akila Wickramasekara, F. Breitinger, Mark Scanlon
29 Feb 2024

An Empirical Categorization of Prompting Techniques for Large Language Models: A Practitioner's Guide
Oluwole Fagbohun, Rachel M. Harrison, Anton Dereventsov
18 Feb 2024

Adversarial Nibbler: An Open Red-Teaming Method for Identifying Diverse Harms in Text-to-Image Generation
Jessica Quaye, Alicia Parrish, Oana Inel, Charvi Rastogi, Hannah Rose Kirk, ..., Nathan Clement, Rafael Mosquera, Juan Ciro, Vijay Janapa Reddi, Lora Aroyo
14 Feb 2024

AuditLLM: A Tool for Auditing Large Language Models Using Multiprobe Approach
Maryam Amirizaniani, Elias Martin, Tanya Roosta, Aman Chadha, Chirag Shah
14 Feb 2024

Mapping the Ethics of Generative AI: A Comprehensive Scoping Review
Thilo Hagendorff
13 Feb 2024

Black-Box Access is Insufficient for Rigorous AI Audits
Stephen Casper, Carson Ezell, Charlotte Siegmann, Noam Kolt, Taylor Lynn Curtis, ..., Michael Gerovitch, David Bau, Max Tegmark, David M. Krueger, Dylan Hadfield-Menell
25 Jan 2024 (AAML)

Visibility into AI Agents
Alan Chan, Carson Ezell, Max Kaufmann, K. Wei, Lewis Hammond, ..., Nitarshan Rajkumar, David M. Krueger, Noam Kolt, Lennart Heim, Markus Anderljung
23 Jan 2024

LLM-Assisted Crisis Management: Building Advanced LLM Platforms for Effective Emergency Response and Public Collaboration
Hakan T. Otal, M. A. Canbaz
12 Jan 2024

From Prompt Engineering to Prompt Science With Human in the Loop
Chirag Shah
01 Jan 2024

Foundational Moral Values for AI Alignment
Betty Hou, Brian Patrick Green
28 Nov 2023

Challenges of Large Language Models for Mental Health Counseling
N. C. Chung, George C. Dyer, L. Brocki
23 Nov 2023 (LM&MA, AI4MH)

Rethinking Large Language Models in Mental Health Applications
Shaoxiong Ji, Tianlin Zhang, Kailai Yang, Sophia Ananiadou, Erik Cambria
19 Nov 2023 (LM&MA, AI4MH)

Towards Publicly Accountable Frontier LLMs: Building an External Scrutiny Ecosystem under the ASPIRE Framework
Markus Anderljung, Everett Thornton Smith, Joe O'Brien, Lisa Soder, Ben Bucknall, Emma Bluemke, Jonas Schuett, Robert F. Trager, Lacey Strahm, Rumman Chowdhury
15 Nov 2023

A Survey of Large Language Models in Medicine: Progress, Application, and Challenge
Hongjian Zhou, Fenglin Liu, Boyang Gu, Xinyu Zou, Jinfa Huang, ..., Yefeng Zheng, Lei A. Clifton, Zheng Li, Fenglin Liu, David A. Clifton
09 Nov 2023 (LM&MA)

Contextual Confidence and Generative AI
Shrey Jain, Zoe Hitzig, Pamela Mishkin
02 Nov 2023

Trust, Accountability, and Autonomy in Knowledge Graph-based AI for Self-determination
Luis-Daniel Ibánez, J. Domingue, Sabrina Kirrane, O. Seneviratne, Aisling Third, Maria-Esther Vidal
30 Oct 2023