2110.01852
Data Augmentation Approaches in Natural Language Processing: A Survey
5 October 2021
Bohan Li
Yutai Hou
Wanxiang Che
Papers citing "Data Augmentation Approaches in Natural Language Processing: A Survey"
50 / 80 papers shown
Towards High-Fidelity Synthetic Multi-platform Social Media Datasets via Large Language Models
Henry Tari
Nojus Sereiva
Rishabh Kaushal
T. Bertaglia
Adriana Iamnitchi
30
0
0
02 May 2025
Data Augmentation and Hyperparameter Tuning for Low-Resource MFA
Alessio Tosolini
Claire Bowern
21
0
0
09 Apr 2025
Data Augmentation for Fake Reviews Detection in Multiple Languages and Multiple Domains
Ming Liu
Massimo Poesio
16
0
0
09 Apr 2025
Leveraging Language Models for Analyzing Longitudinal Experiential Data in Education
Ahatsham Hayat
Bilal Khan
Mohammad Hasan
AI4Ed
68
0
0
27 Mar 2025
Enhancing Arabic Automated Essay Scoring with Synthetic Data and Error Injection
Chatrine Qwaider
Bashar Alhafni
Kirill Chirkunov
Nizar Habash
Ted Briscoe
57
1
0
22 Mar 2025
Generative Data Augmentation Challenge: Synthesis of Room Acoustics for Speaker Distance Estimation
Jackie Lin
Georg Götz
Hermes Sampedro Llopis
Haukur Hafsteinsson
Steinar Guðjónsson
...
Paris Smaragdis
Dinesh Manocha
John Hershey
Trausti Kristjansson
Minje Kim
82
2
0
22 Jan 2025
Unleashing the Power of Data Tsunami: A Comprehensive Survey on Data Assessment and Selection for Instruction Tuning of Language Models
Yulei Qin
Yuncheng Yang
Pengcheng Guo
Gang Li
Hang Shao
Yuchen Shi
Zihan Xu
Yun Gu
Ke Li
Xing Sun
ALM
88
12
0
31 Dec 2024
Channel Reflection: Knowledge-Driven Data Augmentation for EEG-Based Brain-Computer Interfaces
Ziwei Wang
Siyang Li
Jingwei Luo
Jiajing Liu
Dongrui Wu
80
6
0
04 Dec 2024
Rephrasing natural text data with different languages and quality levels for Large Language Model pre-training
Michael Pieler
Marco Bellagente
H. Teufel
Duy Phung
Nathan Cooper
...
Reshinth Adithyan
Zaid Alyafeai
Nikhil Pinnaparaju
Maksym Zhuravinskyi
Carlos Riquelme
27
0
0
28 Oct 2024
Rethinking the Uncertainty: A Critical Review and Analysis in the Era of Large Language Models
Mohammad Beigi
Sijia Wang
Ying Shen
Zihao Lin
Adithya Kulkarni
...
Ming Jin
Jin-Hee Cho
Dawei Zhou
Chang-Tien Lu
Lifu Huang
26
1
0
26 Oct 2024
Pseudo-Non-Linear Data Augmentation via Energy Minimization
Pingbang Hu
Mahito Sugiyama
23
0
0
01 Oct 2024
Advancing Post-OCR Correction: A Comparative Study of Synthetic Data
Shuhao Guan
Derek Greene
26
6
0
05 Aug 2024
Face4RAG: Factual Consistency Evaluation for Retrieval Augmented Generation in Chinese
Yunqi Xu
Tianchi Cai
Jiyan Jiang
Xierui Song
33
2
0
01 Jul 2024
A Survey on Data Quality Dimensions and Tools for Machine Learning
Yuhan Zhou
Fengjiao Tu
Kewei Sha
Junhua Ding
Haihua Chen
38
4
0
28 Jun 2024
Data Generation Using Large Language Models for Text Classification: An Empirical Case Study
Yinheng Li
Rogerio Bonatti
Sara Abdali
Justin Wagle
K. Koishida
SyDa
39
5
0
27 Jun 2024
InternalInspector $I^2$: Robust Confidence Estimation in LLMs through Internal States
Mohammad Beigi
Ying Shen
Runing Yang
Zihao Lin
Qifan Wang
Ankith Mohan
Jianfeng He
Ming Jin
Chang-Tien Lu
Lifu Huang
HILM
34
4
0
17 Jun 2024
FaceMixup: Enhancing Facial Expression Recognition through Mixed Face Regularization
Fabio A. Faria
Mateus M. Souza
R. F. D. S. Teixeira
Maurício Pamplona Segundo
47
0
0
30 May 2024
Data Augmentation Method Utilizing Template Sentences for Variable Definition Extraction
Kotaro Nagayama
Shota Kato
Manabu Kano
14
1
0
23 May 2024
Modeling Orthographic Variation Improves NLP Performance for Nigerian Pidgin
Pin-Jie Lin
Merel C. J. Scholman
Muhammed Saeed
Vera Demberg
24
2
0
28 Apr 2024
Research and application of artificial intelligence based webshell detection model: A literature review
Mingrui Ma
Lansheng Han
Chunjie Zhou
81
2
0
28 Apr 2024
Retrieval-Augmented Data Augmentation for Low-Resource Domain Tasks
Minju Seo
Jinheon Baek
James Thorne
Sung Ju Hwang
RALM
29
9
0
21 Feb 2024
Evaluation Metrics for Text Data Augmentation in NLP
Marcellus Amadeus
William Alberto Cruz Castañeda
30
1
0
09 Feb 2024
AutoAugment Is What You Need: Enhancing Rule-based Augmentation Methods in Low-resource Regimes
Juhwan Choi
Kyohoon Jin
Junho Lee
Sangmin Song
Youngbin Kim
22
1
0
08 Feb 2024
Prompt-Time Symbolic Knowledge Capture with Large Language Models
Tolga Çöplü
Arto Bendiken
Andrii Skomorokhov
Eduard Bateiko
Stephen Cobb
Joshua J. Bouw
KELM
29
1
0
01 Feb 2024
A Multi-solution Study on GDPR AI-enabled Completeness Checking of DPAs
Muhammad Ilyas Azeem
Sallam Abualhaija
35
5
0
23 Nov 2023
Portuguese FAQ for Financial Services
Paulo Finardi
Wanderley M. Melo
Edgard D. Medeiros Neto
Alex F. Mansano
Pablo B. Costa
Vinicius Fernandes Caridá
33
0
0
19 Nov 2023
The Curious Decline of Linguistic Diversity: Training Language Models on Synthetic Text
Yanzhu Guo
Guokan Shang
Michalis Vazirgiannis
Chloé Clavel
26
48
0
16 Nov 2023
A Ship of Theseus: Curious Cases of Paraphrasing in LLM-Generated Texts
Nafis Irtiza Tripto
Saranya Venkatraman
Dominik Macko
Robert Moro
Ivan Srba
Adaku Uchendu
Thai V. Le
Dongwon Lee
DeLMO
35
16
0
14 Nov 2023
Noise-Robust Fine-Tuning of Pretrained Language Models via External Guidance
Song Wang
Zhen Tan
Ruocheng Guo
Jundong Li
NoLa
14
20
0
02 Nov 2023
Data Optimization in Deep Learning: A Survey
Ou Wu
Rujing Yao
32
1
0
25 Oct 2023
Using GPT-4 to Augment Unbalanced Data for Automatic Scoring
Luyang Fang
Gyeong-Geon Lee
Xiaoming Zhai
26
17
0
25 Oct 2023
A Communication Theory Perspective on Prompting Engineering Methods for Large Language Models
Yuanfeng Song
Yuanqin He
Xuefang Zhao
Hanlin Gu
Di Jiang
Haijun Yang
Lixin Fan
Qiang Yang
32
3
0
24 Oct 2023
Enhancing Abstractiveness of Summarization Models through Calibrated Distillation
Hwanjun Song
Igor Shalyminov
Hang Su
Siffi Singh
Kaisheng Yao
Saab Mansour
19
6
0
20 Oct 2023
Let's Synthesize Step by Step: Iterative Dataset Synthesis with Large Language Models by Extrapolating Errors from Small Models
Ruida Wang
Wangchunshu Zhou
Mrinmaya Sachan
19
32
0
20 Oct 2023
A Survey of GPT-3 Family Large Language Models Including ChatGPT and GPT-4
Katikapalli Subramanyam Kalyan
LM&MA
AI4CE
LRM
AILaw
ELM
32
221
0
04 Oct 2023
AMPLIFY:Attention-based Mixup for Performance Improvement and Label Smoothing in Transformer
Leixin Yang
Yu Xiang
23
0
0
22 Sep 2023
Synthetic Text Generation using Hypergraph Representations
Natraj Raman
Sameena Shah
8
1
0
06 Sep 2023
ReLLa: Retrieval-enhanced Large Language Models for Lifelong Sequential Behavior Comprehension in Recommendation
Jianghao Lin
Rongjie Shan
Chenxu Zhu
Kounianhua Du
Bo Chen
Shigang Quan
Ruiming Tang
Yong Yu
Weinan Zhang
LRM
32
79
0
22 Aug 2023
Making Pre-trained Language Models both Task-solvers and Self-calibrators
Yangyi Chen
Xingyao Wang
Heng Ji
18
0
0
21 Jul 2023
Investigating Masking-based Data Generation in Language Models
Edward Ma
33
0
0
16 Jun 2023
Simple Data Augmentation Techniques for Chinese Disease Normalization
Wenqian Cui
Xiangling Fu
Shao-Chen Liu
Mingjun Gu
Xien Liu
Ji Wu
Irwin King
27
0
0
02 Jun 2023
GDA: Generative Data Augmentation Techniques for Relation Extraction Tasks
Xuming Hu
Aiwei Liu
Zeqi Tan
Xin Zhang
Chenwei Zhang
Irwin King
Philip S. Yu
42
16
0
26 May 2023
Out-of-Distribution Generalization in Text Classification: Past, Present, and Future
Linyi Yang
Y. Song
Xuan Ren
Chenyang Lyu
Yidong Wang
Lingqiao Liu
Jindong Wang
Jennifer Foster
Yue Zhang
OOD
32
2
0
23 May 2023
Target-Side Augmentation for Document-Level Machine Translation
Guangsheng Bao
Zhiyang Teng
Yue Zhang
21
10
0
08 May 2023
Controllable Data Augmentation for Context-Dependent Text-to-SQL
Dingzirui Wang
Longxu Dou
Wanxiang Che
15
0
0
27 Apr 2023
Conversational Process Modeling: Can Generative AI Empower Domain Experts in Creating and Redesigning Process Models?
Nataliia Klievtsova
Janik-Vasily Benzin
T. Kampik
Juergen Mangler
S. Rinderle-Ma
17
6
0
19 Apr 2023
MixPro: Simple yet Effective Data Augmentation for Prompt-based Learning
Bohan Li
Longxu Dou
Yutai Hou
Yunlong Feng
Honglin Mu
Qingfu Zhu
Qinghua Sun
Wanxiang Che
VLM
29
3
0
19 Apr 2023
Towards Understanding How Data Augmentation Works with Imbalanced Data
Damien Dablain
Nitesh V. Chawla
AI4CE
26
2
0
12 Apr 2023
A review of ensemble learning and data augmentation models for class imbalanced problems: combination, implementation and evaluation
A. Khan
Omkar Chaudhari
Rohitash Chandra
31
164
0
06 Apr 2023
No Place to Hide: Dual Deep Interaction Channel Network for Fake News Detection based on Data Augmentation
Biwei Cao
Lulu Hua
Jiuxin Cao
Jie Gui
Bo Liu
James T. Kwok
13
1
0
31 Mar 2023