2306.04140
Increasing Diversity While Maintaining Accuracy: Text Data Generation with Large Language Models and Human Interventions
7 June 2023
John Joon Young Chung
Ece Kamar
Saleema Amershi
ALM
Papers citing
"Increasing Diversity While Maintaining Accuracy: Text Data Generation with Large Language Models and Human Interventions"
50 / 71 papers shown
Multilingual Prompting for Improving LLM Generation Diversity
Qihan Wang
Shidong Pan
Tal Linzen
Emily Black
LRM
17
0
0
21 May 2025
Cooking Up Creativity: A Cognitively-Inspired Approach for Enhancing LLM Creativity through Structured Representations
Moran Mizrahi
Chen Shani
Gabriel Stanovsky
Dan Jurafsky
Dafna Shahaf
29
0
0
29 Apr 2025
FinNLI: Novel Dataset for Multi-Genre Financial Natural Language Inference Benchmarking
Jabez Magomere
Elena Kochkina
Samuel Mensah
Simerjot Kaur
Charese Smiley
36
1
0
22 Apr 2025
Roll the dice & look before you leap: Going beyond the creative limits of next-token prediction
Vaishnavh Nagarajan
Chen Henry Wu
Charles Ding
Aditi Raghunathan
45
0
0
21 Apr 2025
FairSteer: Inference Time Debiasing for LLMs with Dynamic Activation Steering
Heng Chang
Zhiting Fan
Ruizhe Chen
Xiaotang Gai
Luqi Gong
Yan Zhang
Zuozhu Liu
LLMSV
40
1
0
20 Apr 2025
Evidencing Unauthorized Training Data from AI Generated Content using Information Isotopes
Qi Tao
Yin Jinhua
Cai Dongqi
Xie Yueqi
Wang Huili
...
Zhou Zhili
Wang Shangguang
Lyu Lingjuan
Huang Yongfeng
Lane Nicholas
40
0
0
24 Mar 2025
TreeSynth: Synthesizing Diverse Data from Scratch via Tree-Guided Subspace Partitioning
Sheng Wang
Pengan Chen
Jingqi Zhou
Qintong Li
Jingwei Dong
Jiahui Gao
Boyang Xue
Jiyue Jiang
Lingpeng Kong
Chuan Wu
SyDa
71
0
0
21 Mar 2025
Modifying Large Language Model Post-Training for Diverse Creative Writing
John Joon Young Chung
Vishakh Padmakumar
Melissa Roemmele
Yuqian Sun
Max Kreminski
MoMe
51
1
0
21 Mar 2025
Measuring Similarity in Causal Graphs: A Framework for Semantic and Structural Analysis
Ning-Yuan Georgia Liu
Flower Yang
Mohammad S. Jalali
CML
69
0
0
14 Mar 2025
Biases in Large Language Model-Elicited Text: A Case Study in Natural Language Inference
Grace Proebsting
Adam Poliak
55
0
0
06 Mar 2025
Exploring and Controlling Diversity in LLM-Agent Conversation
Kuanchao Chu
Yi-Pei Chen
Hideki Nakayama
LLMAG
50
1
0
24 Feb 2025
CODESYNC: Synchronizing Large Language Models with Dynamic Code Evolution at Scale
Chenlong Wang
Zhaoyang Chu
Zhengxiang Cheng
Xuyi Yang
Kaiyue Qiu
Yao Wan
Zhou Zhao
Xuanhua Shi
Danny Chen
ALM
SyDa
47
0
0
23 Feb 2025
Measuring Diversity in Synthetic Datasets
Yuchang Zhu
Huizhe Zhang
Bingzhe Wu
Jintang Li
Zibin Zheng
Peilin Zhao
Liang Chen
Yatao Bian
100
0
0
12 Feb 2025
Diverse Preference Optimization
Jack Lanchantin
Angelica Chen
S. Dhuliawala
Ping Yu
Jason Weston
Sainbayar Sukhbaatar
Ilia Kulikov
112
4
0
30 Jan 2025
Knowledge-Infused Prompting: Assessing and Advancing Clinical Text Data Generation with Large Language Models
Ran Xu
Hejie Cui
Yue Yu
Xuan Kan
Wenqi Shi
Yuchen Zhuang
Wei Jin
Joyce C. Ho
Carl Yang
71
14
0
28 Jan 2025
Aligning Instruction Tuning with Pre-training
Yiming Liang
Tianyu Zheng
Xinrun Du
Ge Zhang
Qingbin Liu
...
Zhaoxiang Zhang
Wenhao Huang
Jiajun Zhang
Xiang Yue
Jiajun Zhang
96
1
0
16 Jan 2025
Synthesize, Partition, then Adapt: Eliciting Diverse Samples from Foundation Models
Yeming Wen
Swarat Chaudhuri
39
0
0
11 Nov 2024
Speech is More Than Words: Do Speech-to-Text Translation Systems Leverage Prosody?
Ioannis Tsiamas
Matthias Sperber
Andrew Finch
Sarthak Garg
41
0
0
31 Oct 2024
Self-calibration for Language Model Quantization and Pruning
Miles Williams
G. Chrysostomou
Nikolaos Aletras
MQ
228
0
0
22 Oct 2024
Rolling the DICE on Idiomaticity: How LLMs Fail to Grasp Context
Maggie Mi
Aline Villavicencio
Nafise Sadat Moosavi
53
1
0
21 Oct 2024
A Comprehensive Evaluation of Cognitive Biases in LLMs
Simon Malberg
Roman Poletukhin
Carolin M. Schuster
Georg Groh
ELM
47
5
0
20 Oct 2024
A Survey on Data Synthesis and Augmentation for Large Language Models
Ke Wang
Jiahui Zhu
Minjie Ren
Zichen Liu
Shiwei Li
...
Yiming Lei
Xiaoyu Wu
Qiqi Zhan
Qingjie Liu
Yunhong Wang
SyDa
53
18
0
16 Oct 2024
Active Learning for Robust and Representative LLM Generation in Safety-Critical Scenarios
Sabit Hassan
Anthony Sicilia
Malihe Alikhani
31
2
0
14 Oct 2024
Data Advisor: Dynamic Data Curation for Safety Alignment of Large Language Models
Fei Wang
Ninareh Mehrabi
Palash Goyal
Rahul Gupta
Kai-Wei Chang
Aram Galstyan
ALM
48
2
0
07 Oct 2024
Scaling Parameter-Constrained Language Models with Quality Data
Ernie Chang
Matteo Paltenghi
Yang Li
Pin-Jie Lin
Changsheng Zhao
Patrick Huber
Zechun Liu
Rastislav Rabatin
Yangyang Shi
Vikas Chandra
62
1
0
04 Oct 2024
Unleashing the Power of Large Language Models in Zero-shot Relation Extraction via Self-Prompting
Siyi Liu
Yang Li
Jiang Li
Shan Yang
Yunshi Lan
LRM
42
3
0
02 Oct 2024
Towards Efficient and Robust VQA-NLE Data Generation with Large Vision-Language Models
Patrick Amadeus Irawan
Genta Indra Winata
Samuel Cahyawijaya
Ayu Purwarianti
39
0
0
23 Sep 2024
Text2Traj2Text: Learning-by-Synthesis Framework for Contextual Captioning of Human Movement Trajectories
Hikaru Asano
Ryo Yonetani
Taiki Sekii
Hiroki Ouchi
77
0
0
19 Sep 2024
LLM-as-a-Judge & Reward Model: What They Can and Cannot Do
Guijin Son
Hyunwoo Ko
Hoyoung Lee
Yewon Kim
Seunghyeok Hong
ALM
ELM
54
8
0
17 Sep 2024
What is the Role of Small Models in the LLM Era: A Survey
Lihu Chen
Gaël Varoquaux
ALM
66
23
0
10 Sep 2024
Automatic Metrics in Natural Language Generation: A Survey of Current Evaluation Practices
Patrícia Schmidtová
Saad Mahamood
Simone Balloccu
Ondřej Dušek
Albert Gatt
Dimitra Gkatzia
David M. Howcroft
Ondřej Plátek
Adarsa Sivaprasad
47
3
0
17 Aug 2024
The Mechanics of Conceptual Interpretation in GPT Models: Interpretative Insights
Nura Aljaafari
Danilo S. Carvalho
André Freitas
KELM
40
0
0
05 Aug 2024
Beyond Aesthetics: Cultural Competence in Text-to-Image Models
Nithish Kannen
Arif Ahmad
Marco Andreetto
Vinodkumar Prabhakaran
Utsav Prabhu
Adji Bousso Dieng
Pushpak Bhattacharyya
Shachi Dave
56
16
0
09 Jul 2024
RLHF Can Speak Many Languages: Unlocking Multilingual Preference Optimization for LLMs
John Dang
Arash Ahmadian
Kelly Marchisio
Julia Kreutzer
Ahmet Üstün
Sara Hooker
47
23
0
02 Jul 2024
UniGen: A Unified Framework for Textual Dataset Generation Using Large Language Models
Siyuan Wu
Yue Huang
Chujie Gao
Dongping Chen
Qihui Zhang
...
Tianyi Zhou
Xiangliang Zhang
Jianfeng Gao
Chaowei Xiao
Lichao Sun
SyDa
40
22
0
27 Jun 2024
Temporal Knowledge Graph Question Answering: A Survey
Miao Su
Zixuan Li
Zhuo Chen
Long Bai
Xiaolong Jin
Jiafeng Guo
61
2
0
20 Jun 2024
Self-Regulated Data-Free Knowledge Amalgamation for Text Classification
Prashanth Vijayaraghavan
Hongzhi Wang
Luyao Shi
Tyler Baldwin
David Beymer
Ehsan Degan
37
1
0
16 Jun 2024
On LLMs-Driven Synthetic Data Generation, Curation, and Evaluation: A Survey
Lin Long
Rui Wang
Ruixuan Xiao
Junbo Zhao
Xiao Ding
Gang Chen
Haobo Wang
SyDa
63
95
0
14 Jun 2024
Open-LLM-Leaderboard: From Multi-choice to Open-style Questions for LLMs Evaluation, Benchmark, and Arena
Aidar Myrzakhan
Sondos Mahmoud Bsharat
Zhiqiang Shen
ELM
44
28
0
11 Jun 2024
Synth-SBDH: A Synthetic Dataset of Social and Behavioral Determinants of Health for Clinical Text
Avijit Mitra
Emily Druhl
Raelene Goodwin
Hong Yu
43
2
0
10 Jun 2024
Deconstructing The Ethics of Large Language Models from Long-standing Issues to New-emerging Dilemmas
Chengyuan Deng
Yiqun Duan
Xin Jin
Heng Chang
Yijun Tian
...
Kuofeng Gao
Sihong He
Jun Zhuang
Lu Cheng
Haohan Wang
AILaw
48
16
0
08 Jun 2024
Text Generation: A Systematic Literature Review of Tasks, Evaluation, and Challenges
Jonas Becker
Jan Philip Wahle
Bela Gipp
Terry Ruas
37
9
0
24 May 2024
Empowering Large Language Models for Textual Data Augmentation
Yichuan Li
Kaize Ding
Jianling Wang
Kyumin Lee
32
10
0
26 Apr 2024
Unifying Bias and Unfairness in Information Retrieval: A Survey of Challenges and Opportunities with Large Language Models
Sunhao Dai
Chen Xu
Shicheng Xu
Liang Pang
Zhenhua Dong
Jun Xu
48
67
0
17 Apr 2024
Reason from Fallacy: Enhancing Large Language Models' Logical Reasoning through Logical Fallacy Understanding
Yanda Li
Dixuan Wang
Jiaqing Liang
Guochao Jiang
Qi He
Yanghua Xiao
Deqing Yang
LRM
ELM
48
7
0
04 Apr 2024
Fairness in Large Language Models: A Taxonomic Survey
Zhibo Chu
Zichong Wang
Wenbin Zhang
AILaw
48
33
0
31 Mar 2024
TrustAI at SemEval-2024 Task 8: A Comprehensive Analysis of Multi-domain Machine Generated Text Detection Techniques
Ashok Urlana
Aditya Saibewar
B. Garlapati
Charaka Vinayak Kumar
Ajeet Kumar Singh
S. Chalamala
DeLMO
48
1
0
25 Mar 2024
Synthesize Step-by-Step: Tools, Templates and LLMs as Data Generators for Reasoning-Based Chart VQA
Zhuowan Li
Bhavan A. Jasani
Peng Tang
Shabnam Ghadar
LRM
39
8
0
25 Mar 2024
EDT: Improving Large Language Models' Generation by Entropy-based Dynamic Temperature Sampling
Shimao Zhang
Yu Bao
Shujian Huang
27
8
0
21 Mar 2024
ProgGen: Generating Named Entity Recognition Datasets Step-by-step with Self-Reflexive Large Language Models
Yuzhao Heng
Chun-Ying Deng
Yitong Li
Yue Yu
Yinghao Li
Rongzhi Zhang
Chao Zhang
33
4
0
17 Mar 2024