Sentence Encoders on STILTs: Supplementary Training on Intermediate Labeled-data Tasks

2 November 2018
Jason Phang
Thibault Févry
Samuel R. Bowman

Papers citing "Sentence Encoders on STILTs: Supplementary Training on Intermediate Labeled-data Tasks"

50 / 112 papers shown
TT-LoRA MoE: Unifying Parameter-Efficient Fine-Tuning and Sparse Mixture-of-Experts
Pradip Kunwar
Minh Vu
Maanak Gupta
Mahmoud Abdelsalam
Manish Bhattarai
MoE
MoMe
163
0
0
29 Apr 2025
Sarcasm Detection as a Catalyst: Improving Stance Detection with Cross-Target Capabilities
Gibson Nkhata
Shi Yin Hong
Susan Gauch
58
0
0
05 Mar 2025
Superpose Singular Features for Model Merging
Haiquan Qiu
You Wu
Quanming Yao
MoMe
48
0
0
15 Feb 2025
Enhancing Emotion Prediction in News Headlines: Insights from ChatGPT and Seq2Seq Models for Free-Text Generation
Ge Gao
Jongin Kim
Sejin Paik
Ekaterina Novozhilova
Yi Liu
Sarah Bonna
Margrit Betke
Derry Wijaya
44
0
0
14 Jul 2024
When does In-context Learning Fall Short and Why? A Study on Specification-Heavy Tasks
Hao Peng
Xiaozhi Wang
Jianhui Chen
Weikai Li
Y. Qi
...
Zhili Wu
Kaisheng Zeng
Bin Xu
Lei Hou
Juanzi Li
34
28
0
15 Nov 2023
Audio-AdapterFusion: A Task-ID-free Approach for Efficient and Non-Destructive Multi-task Speech Recognition
Hillary Ngai
Rohan Agrawal
Neeraj Gaur
Ronny Huang
Parisa Haghani
P. M. Mengibar
MoMe
46
0
0
17 Oct 2023
A Knowledge-enhanced Two-stage Generative Framework for Medical Dialogue Information Extraction
Zefa Hu
Ziyi Ni
Jing Shi
Shuang Xu
Bo Xu
MedIm
43
1
0
30 Jul 2023
Investigating the Learning Behaviour of In-context Learning: A Comparison with Supervised Learning
Xindi Wang
Yufei Wang
Can Xu
Xiubo Geng
Bowen Zhang
Chongyang Tao
Frank Rudzicz
Robert E. Mercer
Daxin Jiang
33
11
0
28 Jul 2023
Advances and Challenges in Meta-Learning: A Technical Review
Anna Vettoruzzo
Mohamed-Rafik Bouguelia
Joaquin Vanschoren
Thorsteinn Rögnvaldsson
K. Santosh
OffRL
29
70
0
10 Jul 2023
Bi-Drop: Enhancing Fine-tuning Generalization via Synchronous sub-net Estimation and Optimization
Shoujie Tong
Heming Xia
Damai Dai
Runxin Xu
Tianyu Liu
Binghuai Lin
Yunbo Cao
Zhifang Sui
27
0
0
24 May 2023
Taxonomy Expansion for Named Entity Recognition
Karthikeyan K
Yogarshi Vyas
Jie Ma
Giovanni Paolini
Neha Ann John
Shuai Wang
Yassine Benajiba
Vittorio Castelli
Dan Roth
Miguel Ballesteros
19
2
0
22 May 2023
Generating multiple-choice questions for medical question answering with distractors and cue-masking
Damien Sileo
Kanimozhi Uma
Marie-Francine Moens
43
5
0
13 Mar 2023
ViM: Vision Middleware for Unified Downstream Transferring
Yutong Feng
Biao Gong
Jianwen Jiang
Yiliang Lv
Yujun Shen
Deli Zhao
Jingren Zhou
32
1
0
13 Mar 2023
Measuring the Instability of Fine-Tuning
Yupei Du
D. Nguyen
25
4
0
15 Feb 2023
Knowledge is a Region in Weight Space for Fine-tuned Language Models
Almog Gueta
Elad Venezian
Colin Raffel
Noam Slonim
Yoav Katz
Leshem Choshen
34
49
0
09 Feb 2023
CrossCodeBench: Benchmarking Cross-Task Generalization of Source Code Models
Changan Niu
Chuanyi Li
Vincent Ng
Bin Luo
ELM
ALM
34
9
0
08 Feb 2023
Revisiting Intermediate Layer Distillation for Compressing Language Models: An Overfitting Perspective
Jongwoo Ko
Seungjoon Park
Minchan Jeong
S. Hong
Euijai Ahn
Duhyeuk Chang
Se-Young Yun
23
6
0
03 Feb 2023
Multi-Tenant Optimization For Few-Shot Task-Oriented FAQ Retrieval
Asha Vishwanathan
R. Warrier
G. V. Suresh
Chandrashekhar Kandpal
13
2
0
25 Jan 2023
A Stability Analysis of Fine-Tuning a Pre-Trained Model
Z. Fu
Anthony Man-Cho So
Nigel Collier
23
3
0
24 Jan 2023
Toward Building General Foundation Models for Language, Vision, and Vision-Language Understanding Tasks
Xinsong Zhang
Yan Zeng
Jipeng Zhang
Hang Li
VLM
AI4CE
LRM
22
17
0
12 Jan 2023
Towards Proactively Forecasting Sentence-Specific Information Popularity within Online News Documents
Sayar Ghosh Roy
Anshul Padhi
Risubh Jain
Manish Gupta
Vasudeva Varma
AI4TS
26
2
0
31 Dec 2022
Dataless Knowledge Fusion by Merging Weights of Language Models
Xisen Jin
Xiang Ren
Daniel Preotiuc-Pietro
Pengxiang Cheng
FedML
MoMe
24
214
0
19 Dec 2022
MIGA: A Unified Multi-task Generation Framework for Conversational Text-to-SQL
Yingwen Fu
Wenjie Ou
Zhou Yu
Yue Lin
26
6
0
19 Dec 2022
Incorporating Emotions into Health Mention Classification Task on Social Media
O. Aduragba
Jialin Yu
Alexandra I. Cristea
25
1
0
09 Dec 2022
ColD Fusion: Collaborative Descent for Distributed Multitask Finetuning
Shachar Don-Yehiya
Elad Venezian
Colin Raffel
Noam Slonim
Yoav Katz
Leshem Choshen
MoMe
28
52
0
02 Dec 2022
Data-Efficient Finetuning Using Cross-Task Nearest Neighbors
Hamish Ivison
Noah A. Smith
Hannaneh Hajishirzi
Pradeep Dasigi
33
19
0
01 Dec 2022
ExtremeBERT: A Toolkit for Accelerating Pretraining of Customized BERT
Rui Pan
Shizhe Diao
Jianlin Chen
Tong Zhang
VLM
19
7
0
30 Nov 2022
HyperTuning: Toward Adapting Large Language Models without Back-propagation
Jason Phang
Yi Mao
Pengcheng He
Weizhu Chen
24
30
0
22 Nov 2022
Syntax-Aware On-the-Fly Code Completion
Wannita Takerngsaksiri
C. Tantithamthavorn
Yuankui Li
41
18
0
09 Nov 2022
SocioProbe: What, When, and Where Language Models Learn about Sociodemographics
Anne Lauscher
Federico Bianchi
Samuel R. Bowman
Dirk Hovy
32
7
0
08 Nov 2022
Why Is It Hate Speech? Masked Rationale Prediction for Explainable Hate Speech Detection
Jiyun Kim
Byounghan Lee
Kyung-ah Sohn
23
13
0
01 Nov 2022
Zero-Shot Text Classification with Self-Training
Ariel Gera
Alon Halfon
Eyal Shnarch
Yotam Perlitz
L. Ein-Dor
Noam Slonim
VLM
28
59
0
31 Oct 2022
Effective Cross-Task Transfer Learning for Explainable Natural Language Inference with T5
Irina Bigoulaeva
Rachneet Sachdeva
Harish Tayyar Madabushi
Aline Villavicencio
Iryna Gurevych
LRM
53
5
0
31 Oct 2022
Multilingual Auxiliary Tasks Training: Bridging the Gap between Languages for Zero-Shot Transfer of Hate Speech Detection Models
Syrielle Montariol
Arij Riabi
Djamé Seddah
29
10
0
24 Oct 2022
State-of-the-art generalisation research in NLP: A taxonomy and review
Dieuwke Hupkes
Mario Giulianelli
Verna Dankers
Mikel Artetxe
Yanai Elazar
...
Leila Khalatbari
Maria Ryskina
Rita Frieske
Ryan Cotterell
Zhijing Jin
121
94
0
06 Oct 2022
Visual Comparison of Language Model Adaptation
Rita Sevastjanova
E. Cakmak
Shauli Ravfogel
Ryan Cotterell
Mennatallah El-Assady
VLM
44
16
0
17 Aug 2022
Zero-shot Cross-lingual Transfer is Under-specified Optimization
Shijie Wu
Benjamin Van Durme
Mark Dredze
30
6
0
12 Jul 2022
Eliciting and Understanding Cross-Task Skills with Task-Level Mixture-of-Experts
Qinyuan Ye
Juan Zha
Xiang Ren
MoE
18
12
0
25 May 2022
Leveraging QA Datasets to Improve Generative Data Augmentation
Dheeraj Mekala
Tu Vu
Timo Schick
Jingbo Shang
24
18
0
25 May 2022
PreQuEL: Quality Estimation of Machine Translation Outputs in Advance
Shachar Don-Yehiya
Leshem Choshen
Omri Abend
33
10
0
18 May 2022
When to Use Multi-Task Learning vs Intermediate Fine-Tuning for Pre-Trained Encoder Transfer Learning
Orion Weller
Kevin Seppi
Matt Gardner
19
21
0
17 May 2022
Few-shot Mining of Naturally Occurring Inputs and Outputs
Mandar Joshi
Terra Blevins
M. Lewis
Daniel S. Weld
Luke Zettlemoyer
33
1
0
09 May 2022
Beyond Distributional Hypothesis: Let Language Models Learn Meaning-Text Correspondence
Myeongjun Jang
Frank Mtumbuka
Thomas Lukasiewicz
36
9
0
08 May 2022
Improving In-Context Few-Shot Learning via Self-Supervised Training
Mingda Chen
Jingfei Du
Ramakanth Pasunuru
Todor Mihaylov
Srini Iyer
Ves Stoyanov
Zornitsa Kozareva
SSL
AI4MH
38
64
0
03 May 2022
A Comparison of Approaches for Imbalanced Classification Problems in the Context of Retrieving Relevant Documents for an Analysis
Sandra Wankmüller
33
2
0
03 May 2022
Exploring the Role of Task Transferability in Large-Scale Multi-Task Learning
Vishakh Padmakumar
Leonard Lausen
Miguel Ballesteros
Sheng Zha
He He
George Karypis
31
18
0
23 Apr 2022
UMass PCL at SemEval-2022 Task 4: Pre-trained Language Model Ensembles for Detecting Patronizing and Condescending Language
David Koleczek
Alexander Scarlatos
Siddha Makarand Karkare
Preshma Linet Pereira
21
0
0
18 Apr 2022
IDPG: An Instance-Dependent Prompt Generation Method
Zhuofeng Wu
Sinong Wang
Jiatao Gu
Rui Hou
Yuxiao Dong
V. Vydiswaran
Hao Ma
VLM
38
58
0
09 Apr 2022
Fusing finetuned models for better pretraining
Leshem Choshen
Elad Venezian
Noam Slonim
Yoav Katz
FedML
AI4CE
MoMe
54
87
0
06 Apr 2022
Geographic Adaptation of Pretrained Language Models
Valentin Hofmann
Goran Glavaš
Nikola Ljubešić
J. Pierrehumbert
Hinrich Schütze
VLM
21
16
0
16 Mar 2022