2206.15076
BigBIO: A Framework for Data-Centric Biomedical Natural Language Processing
30 June 2022
Jason Alan Fries
Leon Weber
Natasha Seelam
Gabriel Altay
Debajyoti Datta
Samuele Garda
Myungsun Kang
Ruisi Su
Wojciech Kusa
Samuel Cahyawijaya
Fabio Barth
Simon Ott
Matthias Samwald
Stephen H. Bach
Stella Biderman
Mario Sänger
Bo Wang
A. Callahan
Daniel León Periñán
Théo Gigant
Patrick Haller
Jenny Chim
J. Posada
John Giorgi
Karthi Sivaraman
Marc Pàmies
Marianna Nezhurina
Robert Martin
Michael Cullan
M. Freidank
N. Dahlberg
Shubhanshu Mishra
Shamik Bose
N. Broad
Yanis Labrak
Shlok S Deshmukh
Sid Kiblawi
Ayush Singh
Minh Chien Vu
Trishala Neeraj
Jonas Golde
Albert Villanova del Moral
Benjamin Beilharz
LM&MA
Papers citing
"BigBIO: A Framework for Data-Centric Biomedical Natural Language Processing"
23 / 23 papers shown
Title
Knowledge-augmented Pre-trained Language Models for Biomedical Relation Extraction
Mario Sänger
Ulf Leser
52
0
0
01 May 2025
Knowledge-Infused Prompting: Assessing and Advancing Clinical Text Data Generation with Large Language Models
Ran Xu
Hejie Cui
Yue Yu
Xuan Kan
Wenqi Shi
Yuchen Zhuang
Wei Jin
Joyce C. Ho
Carl Yang
69
14
0
28 Jan 2025
A Survey of Large Language Models for Healthcare: from Data, Technology, and Applications to Accountability and Ethics
Kai He
Rui Mao
Qika Lin
Yucheng Ruan
Xiang Lan
Mengling Feng
Min Zhang
LM&MA
AILaw
107
155
0
28 Jan 2025
Prompt engineering paradigms for medical applications: scoping review and recommendations for better practices
Jamil Zaghir
Marco Naguib
Mina Bjelogrlic
Aurélie Névéol
Xavier Tannier
Christian Lovis
AI4CE
LM&MA
40
6
0
02 May 2024
Oasis: Data Curation and Assessment System for Pretraining of Large Language Models
Tong Zhou
Yubo Chen
Pengfei Cao
Kang Liu
Jun Zhao
Shengping Liu
29
3
0
21 Nov 2023
nach0: Multimodal Natural and Chemical Languages Foundation Model
M. Livne
Z. Miftahutdinov
E. Tutubalina
Maksim Kuznetsov
Daniil Polykovskiy
...
Aastha Jhunjhunwala
Anthony Costa
Alex Aliper
Alán Aspuru-Guzik
Alex Zhavoronkov
AI4CE
27
13
0
21 Nov 2023
InstructTODS: Large Language Models for End-to-End Task-Oriented Dialogue Systems
Willy Chung
Samuel Cahyawijaya
Bryan Wilie
Holy Lovenia
Pascale Fung
29
5
0
13 Oct 2023
Understanding Deep Neural Networks via Linear Separability of Hidden Layers
Chao Zhang
Xinyuan Chen
Wensheng Li
Lixue Liu
Wei Wu
Dacheng Tao
28
3
0
26 Jul 2023
End-to-End Models for Chemical-Protein Interaction Extraction: Better Tokenization and Span-Based Pipeline Strategies
Xu-Xia Ai
Ramakanth Kavuluru
27
5
0
03 Apr 2023
The Shaky Foundations of Clinical Foundation Models: A Survey of Large Language Models and Foundation Models for EMRs
Michael Wornow
Yizhe Xu
Rahul Thapa
Birju S. Patel
E. Steinberg
Scott L. Fleming
M. Pfeffer
Jason Alan Fries
N. Shah
LM&MA
28
32
0
22 Mar 2023
PUnifiedNER: A Prompting-based Unified NER System for Diverse Datasets
Jinghui Lu
Rui Zhao
Brian Mac Namee
Fei Tan
24
19
0
27 Nov 2022
How Long Is Enough? Exploring the Optimal Intervals of Long-Range Clinical Note Language Modeling
Samuel Cahyawijaya
Bryan Wilie
Holy Lovenia
Huang Zhong
Mingqian Zhong
Yuk-Yu Nancy Ip
Pascale Fung
LM&MA
25
2
0
25 Oct 2022
Towards Answering Open-ended Ethical Quandary Questions
Yejin Bang
Nayeon Lee
Tiezheng Yu
Leila Khalatbari
Yan Xu
...
Romain Barraud
Elham J. Barezi
Andrea Madotto
Hayden Kee
Pascale Fung
ELM
35
6
0
12 May 2022
Training language models to follow instructions with human feedback
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
...
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
OSLM
ALM
384
12,081
0
04 Mar 2022
PromptSource: An Integrated Development Environment and Repository for Natural Language Prompts
Stephen H. Bach
Victor Sanh
Zheng-Xin Yong
Albert Webson
Colin Raffel
...
Khalid Almubarak
Xiangru Tang
Dragomir R. Radev
Mike Tian-Jian Jiang
Alexander M. Rush
VLM
228
340
0
02 Feb 2022
Whose Language Counts as High Quality? Measuring Language Ideologies in Text Data Selection
Suchin Gururangan
Dallas Card
Sarah K. Drier
E. K. Gade
Leroy Z. Wang
Zeyu Wang
Luke Zettlemoyer
Noah A. Smith
175
74
0
25 Jan 2022
Multitask Prompted Training Enables Zero-Shot Task Generalization
Victor Sanh
Albert Webson
Colin Raffel
Stephen H. Bach
Lintang Sutawika
...
T. Bers
Stella Biderman
Leo Gao
Thomas Wolf
Alexander M. Rush
LRM
218
1,663
0
15 Oct 2021
Template-free Prompt Tuning for Few-shot NER
Ruotian Ma
Xin Zhou
Tao Gui
Y. Tan
Linyang Li
Qi Zhang
Xuanjing Huang
VLM
155
178
0
28 Sep 2021
Deduplicating Training Data Makes Language Models Better
Katherine Lee
Daphne Ippolito
A. Nystrom
Chiyuan Zhang
Douglas Eck
Chris Callison-Burch
Nicholas Carlini
SyDa
242
599
0
14 Jul 2021
Memorization vs. Generalization: Quantifying Data Leakage in NLP Performance Evaluation
Aparna Elangovan
Jiayuan He
Karin Verspoor
TDI
FedML
167
89
0
03 Feb 2021
The Pile: An 800GB Dataset of Diverse Text for Language Modeling
Leo Gao
Stella Biderman
Sid Black
Laurence Golding
Travis Hoppe
...
Horace He
Anish Thite
Noa Nabeshima
Shawn Presser
Connor Leahy
AIMat
282
2,007
0
31 Dec 2020
Explainable Automated Fact-Checking for Public Health Claims
Neema Kotonya
Francesca Toni
218
250
0
19 Oct 2020
PubMedQA: A Dataset for Biomedical Research Question Answering
Qiao Jin
Bhuwan Dhingra
Zhengping Liu
William W. Cohen
Xinghua Lu
243
825
0
13 Sep 2019