Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
1911.03118
Cited By
Not Enough Data? Deep Learning to the Rescue!
8 November 2019
Ateret Anaby-Tavor
Boaz Carmeli
Esther Goldbraich
Amir Kantor
George Kour
Segev Shlomov
N. Tepper
Naama Zwerdling
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Not Enough Data? Deep Learning to the Rescue!"
50 / 69 papers shown
Title
Synthetic vs. Gold: The Role of LLM-Generated Labels and Data in Cyberbullying Detection
Arefeh Kazemi
Sri Balaaji Natarajan Kalaivendan
Joachim Wagner
Hamza Qadeer
Brian Davis
66
1
0
21 Feb 2025
Diversity-Oriented Data Augmentation with Large Language Models
Zaitian Wang
Jinghan Zhang
Xinhao Zhang
Kunpeng Liu
Pengfei Wang
Yuanchun Zhou
80
1
0
17 Feb 2025
CorrSynth -- A Correlated Sampling Method for Diverse Dataset Generation from LLMs
Suhas S Kowshik
Abhishek Divekar
Vijit Malik
SyDa
37
0
0
13 Nov 2024
Reducing and Exploiting Data Augmentation Noise through Meta Reweighting Contrastive Learning for Text Classification
Guanyi Mou
Yichuan Li
Kyumin Lee
36
3
0
26 Sep 2024
Model Agnostic Hybrid Sharding For Heterogeneous Distributed Inference
Claudio Angione
Yue Zhao
Harry Yang
Ahmad Farhan
Fielding Johnston
James Buban
Patrick Colangelo
42
1
0
29 Jul 2024
A Comprehensive Survey on Data Augmentation
Zaitian Wang
Pengfei Wang
Kunpeng Liu
Pengyang Wang
Yanjie Fu
Chang-Tien Lu
Charu Aggarwal
Jian Pei
Yuanchun Zhou
ViT
109
22
0
15 May 2024
Edisum: Summarizing and Explaining Wikipedia Edits at Scale
Marija Sakota
Isaac Johnson
Guosheng Feng
Robert West
SyDa
KELM
43
2
0
04 Apr 2024
AutoAugment Is What You Need: Enhancing Rule-based Augmentation Methods in Low-resource Regimes
Juhwan Choi
Kyohoon Jin
Junho Lee
Sangmin Song
Youngbin Kim
30
1
0
08 Feb 2024
Generative AI for Hate Speech Detection: Evaluation and Findings
Sagi Pendzel
Tomer Wullach
Amir Adler
Einat Minkov
33
11
0
16 Nov 2023
Let's Synthesize Step by Step: Iterative Dataset Synthesis with Large Language Models by Extrapolating Errors from Small Models
Ruida Wang
Wangchunshu Zhou
Mrinmaya Sachan
27
32
0
20 Oct 2023
Can LLMs Augment Low-Resource Reading Comprehension Datasets? Opportunities and Challenges
Vinay Samuel
Houda Aynaou
Arijit Ghosh Chowdhury
Karthik Venkat Ramanan
Aman Chadha
SyDa
33
7
0
21 Sep 2023
Community-Based Hierarchical Positive-Unlabeled (PU) Model Fusion for Chronic Disease Prediction
Yang Wu
Xurui Li
Xuhong Zhang
Yangyang Kang
Changlong Sun
Xiaozhong Liu
32
3
0
06 Sep 2023
I-WAS: a Data Augmentation Method with GPT-2 for Simile Detection
Yongzhu Chang
Rongsheng Zhang
Jiashu Pu
38
1
0
08 Aug 2023
From Fake to Hyperpartisan News Detection Using Domain Adaptation
Razvan-Alexandru Smadu
Sebastian-Vasile Echim
Dumitru-Clementin Cercel
Iuliana Marin
Florin-Catalin Pop
29
3
0
04 Aug 2023
Semi-supervised Relation Extraction via Data Augmentation and Consistency-training
Komal K. Teru
40
5
0
16 Jun 2023
Targeted Data Generation: Finding and Fixing Model Weaknesses
Zexue He
Marco Tulio Ribeiro
Fereshte Khani
29
13
0
28 May 2023
Boosting Event Extraction with Denoised Structure-to-Text Augmentation
Bo Wang
Heyan Huang
Xiaochi Wei
Ge Shi
Xiao Liu
Chong Feng
Tong Zhou
Shuai Wang
Dawei Yin
46
5
0
16 May 2023
NLP-LTU at SemEval-2023 Task 10: The Impact of Data Augmentation and Semi-Supervised Learning Techniques on Text Classification Performance on an Imbalanced Dataset
Sana Al-Azzawi
Gyorgy Kovács
Filip Nilsson
Tosin P. Adewumi
Marcus Liwicki
28
6
0
25 Apr 2023
UDAPDR: Unsupervised Domain Adaptation via LLM Prompting and Distillation of Rerankers
Jon Saad-Falcon
Omar Khattab
Keshav Santhanam
Radu Florian
M. Franz
Salim Roukos
Avirup Sil
Md Arafat Sultan
Christopher Potts
29
41
0
01 Mar 2023
STA: Self-controlled Text Augmentation for Improving Text Classifications
Congcong Wang
Gonzalo Fiz Pontiveros
Steven Derby
Tri Kurniawan Wijaya
46
3
0
24 Feb 2023
What happens before and after: Multi-Event Commonsense in Event Coreference Resolution
Sahithya Ravi
Christy Tanner
R. Ng
Vered Shwarz
45
16
0
20 Feb 2023
Data Augmentation for Modeling Human Personality: The Dexter Machine
Yair Neuman
Vladyslav Kozhukhov
Dan Vilenchik
SyDa
27
4
0
20 Jan 2023
Mask-then-Fill: A Flexible and Effective Data Augmentation Framework for Event Extraction
Jun Gao
Changlong Yu
Wei Wang
Huan Zhao
Ruifeng Xu
23
33
0
06 Jan 2023
A Survey of Mix-based Data Augmentation: Taxonomy, Methods, Applications, and Explainability
Chengtai Cao
Fan Zhou
Yurou Dai
Jianping Wang
Kunpeng Zhang
AAML
24
28
0
21 Dec 2022
On-the-fly Denoising for Data Augmentation in Natural Language Understanding
Tianqing Fang
Wenxuan Zhou
Fangyu Liu
Hongming Zhang
Yangqiu Song
Muhao Chen
41
1
0
20 Dec 2022
Unnatural Instructions: Tuning Language Models with (Almost) No Human Labor
Or Honovich
Thomas Scialom
Omer Levy
Timo Schick
ALM
48
362
0
19 Dec 2022
Discovering Language Model Behaviors with Model-Written Evaluations
Ethan Perez
Sam Ringer
Kamilė Lukošiūtė
Karina Nguyen
Edwin Chen
...
Danny Hernandez
Deep Ganguli
Evan Hubinger
Nicholas Schiefer
Jared Kaplan
ALM
22
367
0
19 Dec 2022
Measuring the Measuring Tools: An Automatic Evaluation of Semantic Metrics for Text Corpora
George Kour
Samuel Ackerman
Orna Raz
E. Farchi
Boaz Carmeli
Ateret Anaby-Tavor
41
10
0
29 Nov 2022
Textual Data Augmentation for Patient Outcomes Prediction
Qiuhao Lu
Dejing Dou
Thien Huu Nguyen
11
15
0
13 Nov 2022
Tuning Language Models as Training Data Generators for Augmentation-Enhanced Few-Shot Learning
Yu Meng
Martin Michalski
Jiaxin Huang
Yu Zhang
Tarek F. Abdelzaher
Jiawei Han
VLM
56
47
0
06 Nov 2022
Learning to Infer from Unlabeled Data: A Semi-supervised Learning Approach for Robust Natural Language Inference
Mobashir Sadat
Cornelia Caragea
21
2
0
05 Nov 2022
ProGen: Progressive Zero-shot Dataset Generation via In-context Feedback
Jiacheng Ye
Jiahui Gao
Jiangtao Feng
Zhiyong Wu
Tao Yu
Lingpeng Kong
SyDa
VLM
78
72
0
22 Oct 2022
UU-Tax at SemEval-2022 Task 3: Improving the generalizability of language models for taxonomy classification through data augmentation
I. Sarhan
P. Mosteiro
Marco Spruit
31
2
0
07 Oct 2022
Selective Text Augmentation with Word Roles for Low-Resource Text Classification
Biyang Guo
Songqiao Han
Hailiang Huang
19
9
0
04 Sep 2022
Annotated Dataset Creation through General Purpose Language Models for non-English Medical NLP
Johann Frei
Frank Kramer
29
1
0
30 Aug 2022
A Comprehensive Survey of Natural Language Generation Advances from the Perspective of Digital Deception
Keenan I. Jones
Enes ALTUNCU
V. N. Franqueira
Yi-Chia Wang
Shujun Li
DeLMO
39
3
0
11 Aug 2022
Multi-Level Fine-Tuning, Data Augmentation, and Few-Shot Learning for Specialized Cyber Threat Intelligence
Markus Bayer
Tobias Frey
Christian A. Reuter
AAML
24
15
0
22 Jul 2022
Leveraging QA Datasets to Improve Generative Data Augmentation
Dheeraj Mekala
Tu Vu
Timo Schick
Jingbo Shang
27
18
0
25 May 2022
TreeMix: Compositional Constituency-based Data Augmentation for Natural Language Understanding
Le Zhang
Zichao Yang
Diyi Yang
36
24
0
12 May 2022
Few-shot Mining of Naturally Occurring Inputs and Outputs
Mandar Joshi
Terra Blevins
M. Lewis
Daniel S. Weld
Luke Zettlemoyer
33
1
0
09 May 2022
High-quality Conversational Systems
Samuel Ackerman
Ateret Anaby-Tavor
E. Farchi
Esther Goldbraich
George Kour
Ella Ravinovich
Orna Raz
Saritha Route
Marcel Zalmanovici
Naama Zwerdling
AI4MH
19
0
0
27 Apr 2022
EPiDA: An Easy Plug-in Data Augmentation Framework for High Performance Text Classification
Minyi Zhao
Lu Zhang
Yi Xu
Jiandong Ding
Jihong Guan
Shuigeng Zhou
VLM
49
10
0
24 Apr 2022
Retrieval Enhanced Data Augmentation for Question Answering on Privacy Policies
Md. Rizwan Parvez
Jianfeng Chi
Wasi Uddin Ahmad
Yuan Tian
Kai-Wei Chang
RALM
21
13
0
19 Apr 2022
A Contrastive Cross-Channel Data Augmentation Framework for Aspect-based Sentiment Analysis
Bing Wang
Liang Ding
Qihuang Zhong
Ximing Li
Dacheng Tao
29
32
0
16 Apr 2022
Data Augmentation for Intent Classification with Off-the-shelf Large Language Models
Gaurav Sahu
Pau Rodríguez López
I. Laradji
Parmida Atighehchian
David Vazquez
Dzmitry Bahdanau
24
61
0
05 Apr 2022
Variational Autoencoder with Disentanglement Priors for Low-Resource Task-Specific Natural Language Generation
Zhuang Li
Lizhen Qu
Qiongkai Xu
Tongtong Wu
Tianyang Zhan
Gholamreza Haffari
CoGe
UD
DRL
44
4
0
27 Feb 2022
ZeroGen: Efficient Zero-shot Learning via Dataset Generation
Jiacheng Ye
Jiahui Gao
Qintong Li
Hang Xu
Jiangtao Feng
Zhiyong Wu
Tao Yu
Lingpeng Kong
SyDa
45
212
0
16 Feb 2022
BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation
Junnan Li
Dongxu Li
Caiming Xiong
Guosheng Lin
MLLM
BDL
VLM
CLIP
392
4,154
0
28 Jan 2022
Semantic-based Data Augmentation for Math Word Problems
Ai Li
Jiaqing Liang
Yanghua Xiao
AAML
24
7
0
07 Jan 2022
To Augment or Not to Augment? A Comparative Study on Text Augmentation Techniques for Low-Resource NLP
Gözde Gül Sahin
40
33
0
18 Nov 2021
1
2
Next