2303.10158
Data-centric Artificial Intelligence: A Survey
17 March 2023
Daochen Zha
Zaid Pervaiz Bhat
Kwei-Herng Lai
Fan Yang
Zhimeng Jiang
Shaochen Zhong
Xia Hu
Papers citing
"Data-centric Artificial Intelligence: A Survey"
50 / 112 papers shown
Title
Data Augmentation using Large Language Models: Data Perspectives, Learning Paradigms and Challenges
Bosheng Ding
Chengwei Qin
Ruochen Zhao
Tianze Luo
Xinze Li
Guizhen Chen
Wenhan Xia
Junjie Hu
A. Luu
Shafiq R. Joty
31
18
0
05 Mar 2024
A Survey on Evaluation of Out-of-Distribution Generalization
Han Yu
Jiashuo Liu
Xingxuan Zhang
Jiayun Wu
Peng Cui
OOD
47
8
0
04 Mar 2024
Clustering and Ranking: Diversity-preserved Instruction Selection through Expert-aligned Quality Estimation
Yuan Ge
Yilun Liu
Chi Hu
Weibin Meng
Shimin Tao
Xiaofeng Zhao
Hongxia Ma
Li Zhang
Hao Yang
Tong Xiao
ALM
32
24
0
28 Feb 2024
The METRIC-framework for assessing data quality for trustworthy AI in medicine: a systematic review
Daniel Schwabe
Katinka Becker
Martin Seyferth
Andreas Klass
Tobias Schäffter
34
20
0
21 Feb 2024
LongWanjuan: Towards Systematic Measurement for Long Text Quality
Kai Lv
Xiaoran Liu
Qipeng Guo
Hang Yan
Conghui He
Xipeng Qiu
Dahua Lin
33
4
0
21 Feb 2024
Thresholding Data Shapley for Data Cleansing Using Multi-Armed Bandits
Hiroyuki Namba
Shota Horiguchi
Masaki Hamamoto
Masashi Egi
TDI
19
0
0
13 Feb 2024
EXMOS: Explanatory Model Steering Through Multifaceted Explanations and Data Configurations
Aditya Bhattacharya
Simone Stumpf
Lucija Gosak
Gregor Stiglic
K. Verbert
52
18
0
01 Feb 2024
Towards Urban General Intelligence: A Review and Outlook of Urban Foundation Models
Weijiao Zhang
Jindong Han
Zhao Xu
Hang Ni
Hao Liu
Hui Xiong
AI4CE
77
15
0
30 Jan 2024
Multimodal Data Curation via Object Detection and Filter Ensembles
Tzu-Heng Huang
Changho Shin
Sui Jiet Tay
Dyah Adila
Frederic Sala
34
5
0
05 Jan 2024
Data-Centric Foundation Models in Computational Healthcare: A Survey
Yunkun Zhang
Jin Gao
Zheling Tan
Lingfeng Zhou
Kexin Ding
Mu Zhou
Shaoting Zhang
Dequan Wang
AI4CE
37
22
0
04 Jan 2024
Differentially Private Low-Rank Adaptation of Large Language Model Using Federated Learning
Xiao-Yang Liu
Rongyi Zhu
Daochen Zha
Jiechao Gao
Shan Zhong
Matt White
Meikang Qiu
23
15
0
29 Dec 2023
README: Bridging Medical Jargon and Lay Understanding for Patient Education through Data-Centric NLP
Zonghai Yao
Nandyala Siddharth Kantu
Guanghao Wei
Hieu Tran
Zhangqi Duan
Sunjae Kwon
Zhichao Yang
Readme annotation team
Hong-ye Yu
26
7
0
24 Dec 2023
Chasing Fairness in Graphs: A GNN Architecture Perspective
Zhimeng Jiang
Xiaotian Han
Chao Fan
Zirui Liu
Na Zou
Ali Mostafavi
Xia Hu
36
4
0
19 Dec 2023
Curated LLM: Synergy of LLMs and Data Curation for tabular augmentation in low-data regimes
Nabeel Seedat
Nicolas Huynh
B. V. Breugel
M. Schaar
26
25
0
19 Dec 2023
KnowGPT: Knowledge Graph based Prompting for Large Language Models
Qinggang Zhang
Junnan Dong
Hao Chen
Daochen Zha
Zailiang Yu
Xiao Huang
KELM
RALM
24
5
0
11 Dec 2023
Adversarial Learning for Feature Shift Detection and Correction
Míriam Barrabés
D. M. Montserrat
Margarita Geleta
Xavier Giró-i-Nieto
A. Ioannidis
OOD
31
2
0
07 Dec 2023
The Efficiency Spectrum of Large Language Models: An Algorithmic Survey
Tianyu Ding
Tianyi Chen
Haidong Zhu
Jiachen Jiang
Yiqi Zhong
Jinxin Zhou
Guangzhi Wang
Zhihui Zhu
Ilya Zharkov
Luming Liang
27
22
0
01 Dec 2023
Annotation Sensitivity: Training Data Collection Methods Affect Model Performance
Christoph Kern
Stephanie Eckman
Jacob Beck
Rob Chew
Bolei Ma
Frauke Kreuter
24
9
0
23 Nov 2023
Understanding Fairness Surrogate Functions in Algorithmic Fairness
Wei Yao
Zhanke Zhou
Zhicong Li
Bo Han
Yong Liu
29
3
0
17 Oct 2023
Towards Deep Learning Models Resistant to Transfer-based Adversarial Attacks via Data-centric Robust Learning
Yulong Yang
Chenhao Lin
Xiang Ji
Qiwei Tian
Qian Li
Hongshan Yang
Zhibo Wang
Chao Shen
30
7
0
15 Oct 2023
TabLib: A Dataset of 627M Tables with Context
Gus Eggert
Kevin Huo
Mike Biven
Justin Waugh
LMTD
28
10
0
11 Oct 2023
Data-centric Graph Learning: A Survey
Jixi Liu
Deyu Bo
Cheng Yang
Haoran Dai
Qi Zhang
Yixin Xiao
Yufei Peng
Chuan Shi
GNN
27
19
0
08 Oct 2023
Data Cleaning and Machine Learning: A Systematic Literature Review
Pierre-Olivier Coté
Amin Nikanjam
Nafisa Ahmed
D. Humeniuk
Foutse Khomh
41
22
0
03 Oct 2023
CODA: Temporal Domain Generalization via Concept Drift Simulator
Chia-Yuan Chang
Yu-Neng Chuang
Zhimeng Jiang
Kwei-Herng Lai
Anxiao Jiang
Na Zou
OOD
24
5
0
02 Oct 2023
Provable advantages of kernel-based quantum learners and quantum preprocessing based on Grover's algorithm
Till Muser
Elias Zapusek
Vasilis Belis
Florentin Reiter
30
5
0
25 Sep 2023
Towards Data-centric Graph Machine Learning: Review and Outlook
Xin Zheng
Yixin Liu
Zhifeng Bao
Meng Fang
Xia Hu
Alan Wee-Chung Liew
Shirui Pan
GNN
AI4CE
31
19
0
20 Sep 2023
A-Eval: A Benchmark for Cross-Dataset Evaluation of Abdominal Multi-Organ Segmentation
Ziyan Huang
Zhongying Deng
Jin Ye
Haoyu Wang
Yan-Cheng Su
...
Junjun He
Yun Gu
Shaoting Zhang
Lixu Gu
Yu Qiao
ELM
24
3
0
07 Sep 2023
DiscoverPath: A Knowledge Refinement and Retrieval System for Interdisciplinarity on Biomedical Research
Yu-Neng Chuang
Guanchu Wang
Chia-Yuan Chang
Kwei-Herng Lai
Daochen Zha
...
Fan Yang
Alfredo Costilla Reyes
Kaixiong Zhou
Xiaoqian Jiang
Xia Hu
25
4
0
04 Sep 2023
Tackling Diverse Minorities in Imbalanced Classification
Kwei-Herng Lai
Daochen Zha
Huiyuan Chen
M. Bendre
Yuzhong Chen
Mahashweta Das
Hao Yang
Xia Hu
23
0
0
28 Aug 2023
YOLOv8 for Defect Inspection of Hexagonal Directed Self-Assembly Patterns: A Data-Centric Approach
Enrique Dehaerne
Bappaditya Dey
Hossein Esfandiar
L. Verstraete
H. Suh
S. Halder
S. de Gendt
19
13
0
28 Jul 2023
A Deep Learning Approach for Overall Survival Prediction in Lung Cancer with Missing Values
Camillo Maria Caruso
V. Guarrasi
Sara Ramella
Paolo Soda
11
10
0
21 Jul 2023
UMLS-KGI-BERT: Data-Centric Knowledge Integration in Transformers for Biomedical Entity Recognition
Aidan Mannion
Thierry Chevalier
D. Schwab
Lorraine Goeuriot
MedIm
51
3
0
20 Jul 2023
FinGPT: Democratizing Internet-scale Data for Financial Large Language Models
Xiao-Yang Liu
Guoxuan Wang
Hongyang Yang
Daochen Zha
AIFin
38
43
0
19 Jul 2023
AlpaGasus: Training A Better Alpaca with Fewer Data
Lichang Chen
Shiyang Li
Jun Yan
Hai Wang
Kalpa Gunaratna
...
Zheng Tang
Vijay Srinivasan
Dinesh Manocha
Heng-Chiao Huang
Hongxia Jin
ALM
44
0
0
17 Jul 2023
Visual Analytics For Machine Learning: A Data Perspective Survey
Junpeng Wang
Shixia Liu
Wei Zhang
HAI
32
17
0
15 Jul 2023
DataCI: A Platform for Data-Centric AI on Streaming Data
Huaizheng Zhang
Yizheng Huang
Yuanming Li
20
0
0
27 Jun 2023
Transcending Traditional Boundaries: Leveraging Inter-Annotator Agreement (IAA) for Enhancing Data Management Operations (DMOps)
Damrin Kim
Namhyeok Kim
Chanjun Park
Harksoo Kim
19
1
0
26 Jun 2023
OpenGSL: A Comprehensive Benchmark for Graph Structure Learning
Zhiyao Zhou
Sheng Zhou
Bochao Mao
Xu Zhou
Jiawei Chen
Qiaoyu Tan
Daochen Zha
Yan Feng
Chun-Yen Chen
C. Wang
29
20
0
17 Jun 2023
When Do Annotator Demographics Matter? Measuring the Influence of Annotator Demographics with the POPQUORN Dataset
Jiaxin Pei
David Jurgens
40
31
0
12 Jun 2023
Transition Role of Entangled Data in Quantum Machine Learning
Xinbiao Wang
Yuxuan Du
Zhuozhuo Tu
Yong Luo
Xiao Yuan
Dacheng Tao
40
8
0
06 Jun 2023
Make-An-Audio 2: Temporal-Enhanced Text-to-Audio Generation
Jia-Bin Huang
Yi Ren
Rongjie Huang
Dongchao Yang
Zhenhui Ye
Chen Zhang
Jinglin Liu
Xiang Yin
Zejun Ma
Zhou Zhao
DiffM
18
59
0
29 May 2023
FuseCap: Leveraging Large Language Models for Enriched Fused Image Captions
Noam Rotstein
David Bensaid
Shaked Brody
Roy Ganz
Ron Kimmel
VLM
26
27
0
28 May 2023
Winner-Take-All Column Row Sampling for Memory Efficient Adaptation of Language Model
Zirui Liu
Guanchu Wang
Shaochen Zhong
Zhaozhuo Xu
Daochen Zha
...
Zhimeng Jiang
Kaixiong Zhou
V. Chaudhary
Shuai Xu
Xia Hu
41
12
0
24 May 2023
Pre-RMSNorm and Pre-CRMSNorm Transformers: Equivalent and Efficient Pre-LN Transformers
Zixuan Jiang
Jiaqi Gu
Hanqing Zhu
David Z. Pan
AI4CE
27
16
0
24 May 2023
MediTab: Scaling Medical Tabular Data Predictors via Data Consolidation, Enrichment, and Refinement
Zifeng Wang
Chufan Gao
Cao Xiao
Jimeng Sun
LMTD
25
12
0
20 May 2023
Compress, Then Prompt: Improving Accuracy-Efficiency Trade-off of LLM Inference with Transferable Prompt
Zhaozhuo Xu
Zirui Liu
Beidi Chen
Yuxin Tang
Jue Wang
Kaixiong Zhou
Xia Hu
Anshumali Shrivastava
MQ
24
29
0
17 May 2023
Comparing Foundation Models using Data Kernels
Brandon Duderstadt
Hayden S. Helm
Carey E. Priebe
21
5
0
09 May 2023
Pre-train and Search: Efficient Embedding Table Sharding with Pre-trained Neural Cost Models
Daochen Zha
Louis Feng
Liangchen Luo
Bhargav Bhushanam
Zirui Liu
...
J. McMahon
Yuzhen Huang
Bryan Clarke
A. Kejariwal
Xia Hu
52
7
0
03 May 2023
Harnessing the Power of LLMs in Practice: A Survey on ChatGPT and Beyond
Jingfeng Yang
Hongye Jin
Ruixiang Tang
Xiaotian Han
Qizhang Feng
Haoming Jiang
Bing Yin
Xia Hu
LM&MA
131
622
0
26 Apr 2023
Dynamic Datasets and Market Environments for Financial Reinforcement Learning
Xiao-Yang Liu
Ziyi Xia
Hongyang Yang
Jiechao Gao
Daochen Zha
Ming Zhu
Chris Wang
Zhaoran Wang
Jian Guo
OffRL
26
27
0
25 Apr 2023