2010.01794
Cited By
v1
v2 (latest)
GenAug: Data Augmentation for Finetuning Text Generators
5 October 2020
Steven Y. Feng
Varun Gangal
Dongyeop Kang
Teruko Mitamura
Eduard H. Hovy
Papers citing
"GenAug: Data Augmentation for Finetuning Text Generators"
35 / 35 papers shown
Title
The Impact of Code-switched Synthetic Data Quality is Task Dependent: Insights from MT and ASR
Injy Hamed
Ngoc Thang Vu
Nizar Habash
74
0
0
30 Mar 2025
Few-shot LLM Synthetic Data with Distribution Matching
Jiyuan Ren
Zhaocheng Du
Zhihao Wen
Qinglin Jia
Sunhao Dai
Chuhan Wu
Zhenhua Dong
SyDa
204
0
0
09 Feb 2025
Exploring Empty Spaces: Human-in-the-Loop Data Augmentation
Catherine Yeh
Donghao Ren
Yannick Assogba
Dominik Moritz
Fred Hohman
100
0
0
01 Oct 2024
A Survey of Data Synthesis Approaches
Hsin-Yu Chang
Pei-Yu Chen
Tun-Hsiang Chou
Chang-Sheng Kao
Hsuan-Yun Yu
Yen-Ting Lin
Yun-Nung Chen
87
7
0
04 Jul 2024
Targeted Augmentation for Low-Resource Event Extraction
Sijia Wang
Lifu Huang
67
2
0
14 May 2024
Evaluation Metrics for Text Data Augmentation in NLP
Marcellus Amadeus
William Alberto Cruz Castañeda
70
1
0
09 Feb 2024
Exploring New Frontiers in Agricultural NLP: Investigating the Potential of Large Language Models for Food Applications
Saed Rezayi
Zheng Liu
Zihao Wu
Chandra Dhakal
Bao Ge
...
Gengchen Mai
Ninghao Liu
Chen Zhen
Tianming Liu
Sheng Li
73
33
0
20 Jun 2023
Data Augmentation for Low-Resource Keyphrase Generation
Krishna Garg
Jishnu Ray Chowdhury
Cornelia Caragea
85
10
0
29 May 2023
The Parrot Dilemma: Human-Labeled vs. LLM-augmented Data in Classification Tasks
Anders Giovanni Møller
Jacob Aarup Dalsgaard
Arianna Pera
L. Aiello
152
39
0
26 Apr 2023
STA: Self-controlled Text Augmentation for Improving Text Classifications
Congcong Wang
Gonzalo Fiz Pontiveros
Steven Derby
Tri Kurniawan Wijaya
67
4
0
24 Feb 2023
GENIUS: Sketch-based Language Model Pre-training via Extreme and Selective Masking for Text Generation and Augmentation
Biyang Guo
Yeyun Gong
Yelong Shen
Songqiao Han
Hailiang Huang
Nan Duan
Weizhu Chen
VLM
80
19
0
18 Nov 2022
Weakly Supervised Data Augmentation Through Prompting for Dialogue Understanding
Maximillian Chen
Alexandros Papangelis
Chenyang Tao
Andrew Rosenbaum
Seokhwan Kim
Yang Liu
Zhou Yu
Dilek Z. Hakkani-Tür
108
35
0
25 Oct 2022
CHARD: Clinical Health-Aware Reasoning Across Dimensions for Text Generation Models
Steven Y. Feng
Vivek Khetan
Bogdan Sacaleanu
A. Gershman
Eduard H. Hovy
LRM
88
10
0
09 Oct 2022
PINEAPPLE: Personifying INanimate Entities by Acquiring Parallel Personification data for Learning Enhanced generation
Sedrick Scott Keh
Kevin Lu
Varun Gangal
Steven Y. Feng
Harsh Jhamtani
Malihe Alikhani
Eduard H. Hovy
110
2
0
16 Sep 2022
PANCETTA: Phoneme Aware Neural Completion to Elicit Tongue Twisters Automatically
Sedrick Scott Keh
Steven Y. Feng
Varun Gangal
Malihe Alikhani
Eduard H. Hovy
57
4
0
13 Sep 2022
Selective Text Augmentation with Word Roles for Low-Resource Text Classification
Biyang Guo
Songqiao Han
Hailiang Huang
61
9
0
04 Sep 2022
Leveraging QA Datasets to Improve Generative Data Augmentation
Dheeraj Mekala
Tu Vu
Timo Schick
Jingbo Shang
100
18
0
25 May 2022
Self-training with Two-phase Self-augmentation for Few-shot Dialogue Generation
Wanyu Du
Hanjie Chen
Yangfeng Ji
56
1
0
19 May 2022
TreeMix: Compositional Constituency-based Data Augmentation for Natural Language Understanding
Le Zhang
Zichao Yang
Diyi Yang
106
25
0
12 May 2022
UTNLP at SemEval-2022 Task 6: A Comparative Analysis of Sarcasm Detection Using Generative-based and Mutation-based Data Augmentation
Amirhossein Abaskohi
A. Rasouli
Tanin Zeraati
B. Bahrak
63
11
0
18 Apr 2022
DAGAM: Data Augmentation with Generation And Modification
Byeong-Cheol Jo
Tak-Sung Heo
Yeongjoon Park
Yongmin Yoo
Won-Ik Cho
Kyungsun Kim
VLM
65
2
0
06 Apr 2022
Impact of Environmental Noise on Alzheimer's Disease Detection from Speech: Should You Let a Baby Cry?
Jekaterina Novikova
51
0
0
31 Mar 2022
Variational Autoencoder with Disentanglement Priors for Low-Resource Task-Specific Natural Language Generation
Zhuang Li
Zhuang Li
Xingliang Yuan
Tongtong Wu
Tianyang Zhan
Gholamreza Haffari
CoGe
UD
DRL
107
4
0
27 Feb 2022
MDPFuzz: Testing Models Solving Markov Decision Processes
Qi Pang
Yuanyuan Yuan
Shuai Wang
96
30
0
06 Dec 2021
To Augment or Not to Augment? A Comparative Study on Text Augmentation Techniques for Low-Resource NLP
Gözde Gül Sahin
68
34
0
18 Nov 2021
Data Augmentation Methods for Anaphoric Zero Pronouns
Abdulrahman Aloraini
Massimo Poesio
73
5
0
20 Sep 2021
Retrieve, Caption, Generate: Visual Grounding for Enhancing Commonsense in Text Generation Models
Steven Y. Feng
Kevin Lu
Zhuofu Tao
Malihe Alikhani
Teruko Mitamura
Eduard H. Hovy
Varun Gangal
LRM
79
13
0
08 Sep 2021
ArchivalQA: A Large-scale Benchmark Dataset for Open Domain Question Answering over Historical News Collections
Jiexin Wang
Adam Jatowt
Masatoshi Yoshikawa
132
36
0
08 Sep 2021
SAPPHIRE: Approaches for Enhanced Concept-to-Text Generation
Steven Y. Feng
Jessica Huynh
Chaitanya Narisetty
Eduard H. Hovy
Varun Gangal
VLM
58
9
0
15 Aug 2021
A Survey on Data Augmentation for Text Classification
Markus Bayer
M. Kaufhold
Christian A. Reuter
145
355
0
07 Jul 2021
An Empirical Survey of Data Augmentation for Limited Data Learning in NLP
Jiaao Chen
Derek Tam
Colin Raffel
Joey Tianyi Zhou
Diyi Yang
116
178
0
14 Jun 2021
Improving Automated Evaluation of Open Domain Dialog via Diverse Reference Augmentation
Varun Gangal
Harsh Jhamtani
Eduard H. Hovy
Taylor Berg-Kirkpatrick
49
9
0
05 Jun 2021
A Survey of Data Augmentation Approaches for NLP
Steven Y. Feng
Varun Gangal
Jason W. Wei
Sarath Chandar
Soroush Vosoughi
Teruko Mitamura
Eduard H. Hovy
AIMat
119
831
0
07 May 2021
NAREOR: The Narrative Reordering Problem
Varun Gangal
Steven Y. Feng
Malihe Alikhani
Teruko Mitamura
Eduard H. Hovy
94
26
0
14 Apr 2021
Generating Fake Cyber Threat Intelligence Using Transformer-Based Models
P. Ranade
Aritran Piplai
Sudip Mittal
A. Joshi
Tim Finin
110
71
0
08 Feb 2021