1811.01088
Sentence Encoders on STILTs: Supplementary Training on Intermediate Labeled-data Tasks
2 November 2018
Jason Phang
Thibault Févry
Samuel R. Bowman
Papers citing
"Sentence Encoders on STILTs: Supplementary Training on Intermediate Labeled-data Tasks"
50 / 112 papers shown
TT-LoRA MoE: Unifying Parameter-Efficient Fine-Tuning and Sparse Mixture-of-Experts
Pradip Kunwar
Minh Vu
Maanak Gupta
Mahmoud Abdelsalam
Manish Bhattarai
MoE
MoMe
163
0
0
29 Apr 2025
Sarcasm Detection as a Catalyst: Improving Stance Detection with Cross-Target Capabilities
Gibson Nkhata
Shi Yin Hong
Susan Gauch
58
0
0
05 Mar 2025
Superpose Singular Features for Model Merging
Haiquan Qiu
You Wu
Quanming Yao
MoMe
48
0
0
15 Feb 2025
Enhancing Emotion Prediction in News Headlines: Insights from ChatGPT and Seq2Seq Models for Free-Text Generation
Ge Gao
Jongin Kim
Sejin Paik
Ekaterina Novozhilova
Yi Liu
Sarah Bonna
Margrit Betke
Derry Wijaya
44
0
0
14 Jul 2024
When does In-context Learning Fall Short and Why? A Study on Specification-Heavy Tasks
Hao Peng
Xiaozhi Wang
Jianhui Chen
Weikai Li
Y. Qi
...
Zhili Wu
Kaisheng Zeng
Bin Xu
Lei Hou
Juanzi Li
34
28
0
15 Nov 2023
Audio-AdapterFusion: A Task-ID-free Approach for Efficient and Non-Destructive Multi-task Speech Recognition
Hillary Ngai
Rohan Agrawal
Neeraj Gaur
Ronny Huang
Parisa Haghani
P. M. Mengibar
MoMe
46
0
0
17 Oct 2023
A Knowledge-enhanced Two-stage Generative Framework for Medical Dialogue Information Extraction
Zefa Hu
Ziyi Ni
Jing Shi
Shuang Xu
Bo Xu
MedIm
43
1
0
30 Jul 2023
Investigating the Learning Behaviour of In-context Learning: A Comparison with Supervised Learning
Xindi Wang
Yufei Wang
Can Xu
Xiubo Geng
Bowen Zhang
Chongyang Tao
Frank Rudzicz
Robert E. Mercer
Daxin Jiang
33
11
0
28 Jul 2023
Advances and Challenges in Meta-Learning: A Technical Review
Anna Vettoruzzo
Mohamed-Rafik Bouguelia
Joaquin Vanschoren
Thorsteinn Rögnvaldsson
K. Santosh
OffRL
29
70
0
10 Jul 2023
Bi-Drop: Enhancing Fine-tuning Generalization via Synchronous sub-net Estimation and Optimization
Shoujie Tong
Heming Xia
Damai Dai
Runxin Xu
Tianyu Liu
Binghuai Lin
Yunbo Cao
Zhifang Sui
27
0
0
24 May 2023
Taxonomy Expansion for Named Entity Recognition
Karthikeyan K
Yogarshi Vyas
Jie Ma
Giovanni Paolini
Neha Ann John
Shuai Wang
Yassine Benajiba
Vittorio Castelli
Dan Roth
Miguel Ballesteros
19
2
0
22 May 2023
Generating multiple-choice questions for medical question answering with distractors and cue-masking
Damien Sileo
Kanimozhi Uma
Marie-Francine Moens
43
5
0
13 Mar 2023
ViM: Vision Middleware for Unified Downstream Transferring
Yutong Feng
Biao Gong
Jianwen Jiang
Yiliang Lv
Yujun Shen
Deli Zhao
Jingren Zhou
32
1
0
13 Mar 2023
Measuring the Instability of Fine-Tuning
Yupei Du
D. Nguyen
25
4
0
15 Feb 2023
Knowledge is a Region in Weight Space for Fine-tuned Language Models
Almog Gueta
Elad Venezian
Colin Raffel
Noam Slonim
Yoav Katz
Leshem Choshen
34
49
0
09 Feb 2023
CrossCodeBench: Benchmarking Cross-Task Generalization of Source Code Models
Changan Niu
Chuanyi Li
Vincent Ng
Bin Luo
ELM
ALM
34
9
0
08 Feb 2023
Revisiting Intermediate Layer Distillation for Compressing Language Models: An Overfitting Perspective
Jongwoo Ko
Seungjoon Park
Minchan Jeong
S. Hong
Euijai Ahn
Duhyeuk Chang
Se-Young Yun
23
6
0
03 Feb 2023
Multi-Tenant Optimization For Few-Shot Task-Oriented FAQ Retrieval
Asha Vishwanathan
R. Warrier
G. V. Suresh
Chandrashekhar Kandpal
13
2
0
25 Jan 2023
A Stability Analysis of Fine-Tuning a Pre-Trained Model
Z. Fu
Anthony Man-Cho So
Nigel Collier
23
3
0
24 Jan 2023
Toward Building General Foundation Models for Language, Vision, and Vision-Language Understanding Tasks
Xinsong Zhang
Yan Zeng
Jipeng Zhang
Hang Li
VLM
AI4CE
LRM
22
17
0
12 Jan 2023
Towards Proactively Forecasting Sentence-Specific Information Popularity within Online News Documents
Sayar Ghosh Roy
Anshul Padhi
Risubh Jain
Manish Gupta
Vasudeva Varma
AI4TS
26
2
0
31 Dec 2022
Dataless Knowledge Fusion by Merging Weights of Language Models
Xisen Jin
Xiang Ren
Daniel Preotiuc-Pietro
Pengxiang Cheng
FedML
MoMe
24
214
0
19 Dec 2022
MIGA: A Unified Multi-task Generation Framework for Conversational Text-to-SQL
Yingwen Fu
Wenjie Ou
Zhou Yu
Yue Lin
26
6
0
19 Dec 2022
Incorporating Emotions into Health Mention Classification Task on Social Media
O. Aduragba
Jialin Yu
Alexandra I. Cristea
25
1
0
09 Dec 2022
ColD Fusion: Collaborative Descent for Distributed Multitask Finetuning
Shachar Don-Yehiya
Elad Venezian
Colin Raffel
Noam Slonim
Yoav Katz
Leshem Choshen
MoMe
28
52
0
02 Dec 2022
Data-Efficient Finetuning Using Cross-Task Nearest Neighbors
Hamish Ivison
Noah A. Smith
Hannaneh Hajishirzi
Pradeep Dasigi
33
19
0
01 Dec 2022
ExtremeBERT: A Toolkit for Accelerating Pretraining of Customized BERT
Rui Pan
Shizhe Diao
Jianlin Chen
Tong Zhang
VLM
19
7
0
30 Nov 2022
HyperTuning: Toward Adapting Large Language Models without Back-propagation
Jason Phang
Yi Mao
Pengcheng He
Weizhu Chen
24
30
0
22 Nov 2022
Syntax-Aware On-the-Fly Code Completion
Wannita Takerngsaksiri
C. Tantithamthavorn
Yuankui Li
41
18
0
09 Nov 2022
SocioProbe: What, When, and Where Language Models Learn about Sociodemographics
Anne Lauscher
Federico Bianchi
Samuel R. Bowman
Dirk Hovy
32
7
0
08 Nov 2022
Why Is It Hate Speech? Masked Rationale Prediction for Explainable Hate Speech Detection
Jiyun Kim
Byounghan Lee
Kyung-ah Sohn
23
13
0
01 Nov 2022
Zero-Shot Text Classification with Self-Training
Ariel Gera
Alon Halfon
Eyal Shnarch
Yotam Perlitz
L. Ein-Dor
Noam Slonim
VLM
28
59
0
31 Oct 2022
Effective Cross-Task Transfer Learning for Explainable Natural Language Inference with T5
Irina Bigoulaeva
Rachneet Sachdeva
Harish Tayyar Madabushi
Aline Villavicencio
Iryna Gurevych
LRM
53
5
0
31 Oct 2022
Multilingual Auxiliary Tasks Training: Bridging the Gap between Languages for Zero-Shot Transfer of Hate Speech Detection Models
Syrielle Montariol
Arij Riabi
Djamé Seddah
29
10
0
24 Oct 2022
State-of-the-art generalisation research in NLP: A taxonomy and review
Dieuwke Hupkes
Mario Giulianelli
Verna Dankers
Mikel Artetxe
Yanai Elazar
...
Leila Khalatbari
Maria Ryskina
Rita Frieske
Ryan Cotterell
Zhijing Jin
121
94
0
06 Oct 2022
Visual Comparison of Language Model Adaptation
Rita Sevastjanova
E. Cakmak
Shauli Ravfogel
Ryan Cotterell
Mennatallah El-Assady
VLM
44
16
0
17 Aug 2022
Zero-shot Cross-lingual Transfer is Under-specified Optimization
Shijie Wu
Benjamin Van Durme
Mark Dredze
30
6
0
12 Jul 2022
Eliciting and Understanding Cross-Task Skills with Task-Level Mixture-of-Experts
Qinyuan Ye
Juan Zha
Xiang Ren
MoE
18
12
0
25 May 2022
Leveraging QA Datasets to Improve Generative Data Augmentation
Dheeraj Mekala
Tu Vu
Timo Schick
Jingbo Shang
24
18
0
25 May 2022
PreQuEL: Quality Estimation of Machine Translation Outputs in Advance
Shachar Don-Yehiya
Leshem Choshen
Omri Abend
33
10
0
18 May 2022
When to Use Multi-Task Learning vs Intermediate Fine-Tuning for Pre-Trained Encoder Transfer Learning
Orion Weller
Kevin Seppi
Matt Gardner
19
21
0
17 May 2022
Few-shot Mining of Naturally Occurring Inputs and Outputs
Mandar Joshi
Terra Blevins
M. Lewis
Daniel S. Weld
Luke Zettlemoyer
33
1
0
09 May 2022
Beyond Distributional Hypothesis: Let Language Models Learn Meaning-Text Correspondence
Myeongjun Jang
Frank Mtumbuka
Thomas Lukasiewicz
36
9
0
08 May 2022
Improving In-Context Few-Shot Learning via Self-Supervised Training
Mingda Chen
Jingfei Du
Ramakanth Pasunuru
Todor Mihaylov
Srini Iyer
Ves Stoyanov
Zornitsa Kozareva
SSL
AI4MH
38
64
0
03 May 2022
A Comparison of Approaches for Imbalanced Classification Problems in the Context of Retrieving Relevant Documents for an Analysis
Sandra Wankmüller
33
2
0
03 May 2022
Exploring the Role of Task Transferability in Large-Scale Multi-Task Learning
Vishakh Padmakumar
Leonard Lausen
Miguel Ballesteros
Sheng Zha
He He
George Karypis
31
18
0
23 Apr 2022
UMass PCL at SemEval-2022 Task 4: Pre-trained Language Model Ensembles for Detecting Patronizing and Condescending Language
David Koleczek
Alexander Scarlatos
Siddha Makarand Karkare
Preshma Linet Pereira
21
0
0
18 Apr 2022
IDPG: An Instance-Dependent Prompt Generation Method
Zhuofeng Wu
Sinong Wang
Jiatao Gu
Rui Hou
Yuxiao Dong
V. Vydiswaran
Hao Ma
VLM
38
58
0
09 Apr 2022
Fusing finetuned models for better pretraining
Leshem Choshen
Elad Venezian
Noam Slonim
Yoav Katz
FedML
AI4CE
MoMe
54
87
0
06 Apr 2022
Geographic Adaptation of Pretrained Language Models
Valentin Hofmann
Goran Glavaš
Nikola Ljubešić
J. Pierrehumbert
Hinrich Schütze
VLM
21
16
0
16 Mar 2022