Data Governance in the Age of Large-Scale Data-Driven Language Technology

4 May 2022
Yacine Jernite
Huu Nguyen
Stella Biderman
A. Rogers
Maraim Masoud
V. Danchev
Samson Tan
A. Luccioni
Nishant Subramani
Gérard Dupont
Jesse Dodge
Kyle Lo
Zeerak Talat
Isaac Johnson
Dragomir R. Radev
Somaieh Nikpoor
Jorg Frohberg
Aaron Gokaslan
Peter Henderson
Rishi Bommasani
Margaret Mitchell

Papers citing "Data Governance in the Age of Large-Scale Data-Driven Language Technology"

20 / 20 papers shown
Beyond Release: Access Considerations for Generative AI Systems
Irene Solaiman
Rishi Bommasani
Dan Hendrycks
Ariel Herbert-Voss
Yacine Jernite
Aviya Skowron
Andrew Trask
62
1
0
23 Feb 2025
Data Processing for the OpenGPT-X Model Family
Nicolo' Brandizzi
Hammam Abdelwahab
Anirban Bhowmick
Lennard Helmer
Benny Jörg Stein
...
Georg Rehm
Dennis Wegener
Nicolas Flores-Herr
Joachim Kohler
Johannes Leveling
VLM
79
2
0
11 Oct 2024
Data Authenticity, Consent, & Provenance for AI are all broken: what will it take to fix them?
Shayne Longpre
Robert Mahari
Naana Obeng-Marnu
William Brannon
Tobin South
Katy Gero
Sandy Pentland
Jad Kabbara
60
5
0
19 Apr 2024
Machine Unlearning in Large Language Models
Kongyang Chen
Zixin Wang
Bing Mi
Waixi Liu
Shaowei Wang
Xiaojun Ren
Jiaxing Shen
MU
24
11
0
03 Feb 2024
Large Language Models as Superpositions of Cultural Perspectives
Grgur Kovač
Masataka Sawayama
Rémy Portelas
Cédric Colas
Peter Ford Dominey
Pierre-Yves Oudeyer
LLMAG
35
33
0
15 Jul 2023
BloombergGPT: A Large Language Model for Finance
Shijie Wu
Ozan Irsoy
Steven Lu
Vadim Dabravolski
Mark Dredze
Sebastian Gehrmann
P. Kambadur
David S. Rosenberg
Gideon Mann
AIFin
76
786
0
30 Mar 2023
AfroDigits: A Community-Driven Spoken Digit Dataset for African Languages
Chris C. Emezue
Sanchit Gandhi
Lewis Tunstall
Abubakar Abid
Josh Meyer
...
Douwe Kiela
Yacine Jernite
Julien Chaumond
Merve Noyan
Omar Sanseviero
27
2
0
22 Mar 2023
Auditing large language models: a three-layered approach
Jakob Mokander
Jonas Schuett
Hannah Rose Kirk
Luciano Floridi
AILaw
MLAU
42
194
0
16 Feb 2023
Trustworthy Social Bias Measurement
Rishi Bommasani
Percy Liang
27
10
0
20 Dec 2022
BLOOM: A 176B-Parameter Open-Access Multilingual Language Model
BigScience Workshop
Teven Le Scao
Angela Fan
Christopher Akiki
...
Zhongli Xie
Zifan Ye
M. Bras
Younes Belkada
Thomas Wolf
VLM
116
2,310
0
09 Nov 2022
NaturalAdversaries: Can Naturalistic Adversaries Be as Effective as Artificial Adversaries?
Saadia Gabriel
Hamid Palangi
Yejin Choi
AAML
37
1
0
08 Nov 2022
Pile of Law: Learning Responsible Data Filtering from the Law and a 256GB Open-Source Legal Dataset
Peter Henderson
M. Krass
Lucia Zheng
Neel Guha
Christopher D. Manning
Dan Jurafsky
Daniel E. Ho
AILaw
ELM
131
97
0
01 Jul 2022
Systematic Inequalities in Language Technology Performance across the World's Languages
Damián E. Blasi
Antonios Anastasopoulos
Graham Neubig
125
131
0
13 Oct 2021
'Just What do You Think You're Doing, Dave?' A Checklist for Responsible Data Use in NLP
Anna Rogers
Timothy Baldwin
Kobi Leins
104
64
0
14 Sep 2021
Mitigating Dataset Harms Requires Stewardship: Lessons from 1000 Papers
Kenny Peng
Arunesh Mathur
Arvind Narayanan
99
93
0
06 Aug 2021
The GEM Benchmark: Natural Language Generation, its Evaluation and Metrics
Sebastian Gehrmann
Tosin P. Adewumi
Karmanya Aggarwal
Pawan Sasanka Ammanamanchi
Aremu Anuoluwapo
...
Nishant Subramani
Wei-ping Xu
Diyi Yang
Akhila Yerukola
Jiawei Zhou
VLM
254
285
0
02 Feb 2021
Disembodied Machine Learning: On the Illusion of Objectivity in NLP
Zeerak Talat
Smarika Lulz
Joachim Bingel
Isabelle Augenstein
96
51
0
28 Jan 2021
The Pile: An 800GB Dataset of Diverse Text for Language Modeling
Leo Gao
Stella Biderman
Sid Black
Laurence Golding
Travis Hoppe
...
Horace He
Anish Thite
Noa Nabeshima
Shawn Presser
Connor Leahy
AIMat
253
1,996
0
31 Dec 2020
Extracting Training Data from Large Language Models
Nicholas Carlini
Florian Tramèr
Eric Wallace
Matthew Jagielski
Ariel Herbert-Voss
...
Tom B. Brown
D. Song
Ulfar Erlingsson
Alina Oprea
Colin Raffel
MLAU
SILM
290
1,815
0
14 Dec 2020
It's Morphin' Time! Combating Linguistic Discrimination with Inflectional Perturbations
Samson Tan
Shafiq R. Joty
Min-Yen Kan
R. Socher
166
103
0
09 May 2020