Length-Induced Embedding Collapse in PLM-based Models

31 October 2024
Yuqi Zhou
Sunhao Dai
Zhanshuo Cao
Xiao Zhang
Jun Xu

Papers citing "Length-Induced Embedding Collapse in PLM-based Models"

37 / 37 papers shown
A Survey on RAG Meeting LLMs: Towards Retrieval-Augmented Large Language Models
Wenqi Fan
Yujuan Ding
Liang-bo Ning
Shijie Wang
Hengyun Li
Dawei Yin
Tat-Seng Chua
Qing Li
RALM 3DV
141
246
0
10 May 2024
LongEmbed: Extending Embedding Models for Long Context Retrieval
Dawei Zhu
Liang Wang
Nan Yang
Yifan Song
Wenhao Wu
Furu Wei
Sujian Li
RALM
93
25
0
18 Apr 2024
GISTEmbed: Guided In-sample Selection of Training Negatives for Text Embedding Fine-tuning
Aivin V. Solatorio
63
23
0
26 Feb 2024
Benchmarking and Building Long-Context Retrieval Models with LoCo and M2-BERT
Jon Saad-Falcon
Daniel Y. Fu
Simran Arora
Neel Guha
Christopher Ré
RALM
94
17
0
12 Feb 2024
BGE M3-Embedding: Multi-Lingual, Multi-Functionality, Multi-Granularity Text Embeddings Through Self-Knowledge Distillation
Jianlv Chen
Shitao Xiao
Peitian Zhang
Kun Luo
Defu Lian
Zheng Liu
645
433
0
05 Feb 2024
Nomic Embed: Training a Reproducible Long Context Text Embedder
Zach Nussbaum
John X. Morris
Brandon Duderstadt
Andriy Mulyar
101
122
0
02 Feb 2024
Improving Text Embeddings with Large Language Models
Liang Wang
Nan Yang
Xiaolong Huang
Linjun Yang
Rangan Majumder
Furu Wei
SyDa
109
185
0
31 Dec 2023
Retrieval-Augmented Generation for Large Language Models: A Survey
Yunfan Gao
Yun Xiong
Xinyu Gao
Kangxiang Jia
Jinliu Pan
Yuxi Bi
Yi Dai
Jiawei Sun
Meng Wang
Haofen Wang
3DV RALM
214
1,814
1
18 Dec 2023
Linear Log-Normal Attention with Unbiased Concentration
Yury Nahshan
Dor-Joseph Kampeas
E. Haleva
57
8
0
22 Nov 2023
Jina Embeddings 2: 8192-Token General-Purpose Text Embeddings for Long Documents
Michael Günther
Jackmin Ong
Isabelle Mohr
Alaeddine Abdessalem
Tanguy Abel
...
Saba Sturua
Bo Wang
Maximilian Werk
Nan Wang
Han Xiao
RALM
211
65
0
30 Oct 2023
Search-in-the-Chain: Interactively Enhancing Large Language Models with Search for Knowledge-intensive Tasks
Shicheng Xu
Liang Pang
Huawei Shen
Xueqi Cheng
Tat-Seng Chua
RALM KELM LRM
87
46
0
28 Apr 2023
Text Embeddings by Weakly-Supervised Contrastive Pre-training
Liang Wang
Nan Yang
Xiaolong Huang
Binxing Jiao
Linjun Yang
Daxin Jiang
Rangan Majumder
Furu Wei
VLM
246
619
0
07 Dec 2022
Scaling Instruction-Finetuned Language Models
Hyung Won Chung
Le Hou
Shayne Longpre
Barret Zoph
Yi Tay
...
Jacob Devlin
Adam Roberts
Denny Zhou
Quoc V. Le
Jason W. Wei
ReLM LRM
199
3,150
0
20 Oct 2022
MTEB: Massive Text Embedding Benchmark
Niklas Muennighoff
Nouamane Tazi
L. Magne
Nils Reimers
525
413
0
13 Oct 2022
MASSIVE: A 1M-Example Multilingual Natural Language Understanding Dataset with 51 Typologically-Diverse Languages
Jack G. M. FitzGerald
C. Hench
Charith Peris
Scott Mackie
Kay Rottmann
...
Laurie Crist
Misha Britan
Wouter Leeuwis
Gokhan Tur
Premkumar Natarajan
63
134
0
18 Apr 2022
Anti-Oversmoothing in Deep Vision Transformers via the Fourier Domain Analysis: From Theory to Practice
Peihao Wang
Wenqing Zheng
Tianlong Chen
Zhangyang Wang
ViT
76
139
0
09 Mar 2022
Large Dual Encoders Are Generalizable Retrievers
Jianmo Ni
Chen Qu
Jing Lu
Zhuyun Dai
Gustavo Hernández Ábrego
...
Vincent Zhao
Yi Luan
Keith B. Hall
Ming-Wei Chang
Yinfei Yang
DML
167
459
0
15 Dec 2021
ConSERT: A Contrastive Framework for Self-Supervised Sentence Representation Transfer
Yuanmeng Yan
Rumei Li
Sirui Wang
Fuzheng Zhang
Wei Wu
Weiran Xu
SSL
121
559
0
25 May 2021
SimCSE: Simple Contrastive Learning of Sentence Embeddings
Tianyu Gao
Xingcheng Yao
Danqi Chen
AILaw SSL
276
3,411
0
18 Apr 2021
SummScreen: A Dataset for Abstractive Screenplay Summarization
Mingda Chen
Zewei Chu
Sam Wiseman
Kevin Gimpel
72
95
0
14 Apr 2021
TWEAC: Transformer with Extendable QA Agent Classifiers
Gregor Geigle
Nils Reimers
Andreas Rucklé
Iryna Gurevych
ViT
131
26
0
14 Apr 2021
QMSum: A New Benchmark for Query-based Multi-domain Meeting Summarization
Ming Zhong
Da Yin
Tao Yu
A. Zaidi
Mutethia Mutuma
...
Ahmed Hassan Awadallah
Asli Celikyilmaz
Yang Liu
Xipeng Qiu
Dragomir R. Radev
RALM
87
338
0
13 Apr 2021
Constructing A Multi-hop QA Dataset for Comprehensive Evaluation of Reasoning Steps
Xanh Ho
A. Nguyen
Saku Sugawara
Akiko Aizawa
RALM LRM
81
465
0
02 Nov 2020
Pretrained Transformers for Text Ranking: BERT and Beyond
Jimmy J. Lin
Rodrigo Nogueira
Andrew Yates
VLM
383
627
0
13 Oct 2020
Top2Vec: Distributed Representations of Topics
D. Angelov
75
348
0
19 Aug 2020
SummEval: Re-evaluating Summarization Evaluation
Alexander R. Fabbri
Wojciech Kryściński
Bryan McCann
Caiming Xiong
R. Socher
Dragomir R. Radev
HILM
100
720
0
24 Jul 2020
Approximate Nearest Neighbor Negative Contrastive Learning for Dense Text Retrieval
Lee Xiong
Chenyan Xiong
Ye Li
Kwok-Fung Tang
Jialin Liu
Paul N. Bennett
Junaid Ahmed
Arnold Overwijk
139
1,232
0
01 Jul 2020
A Note on Over-Smoothing for Graph Neural Networks
Chen Cai
Yusu Wang
83
276
0
23 Jun 2020
Fact or Fiction: Verifying Scientific Claims
David Wadden
Shanchuan Lin
Kyle Lo
Lucy Lu Wang
Madeleine van Zuylen
Arman Cohan
Hannaneh Hajishirzi
HAI
144
459
0
30 Apr 2020
SPECTER: Document-level Representation Learning using Citation-informed Transformers
Arman Cohan
Sergey Feldman
Iz Beltagy
Doug Downey
Daniel S. Weld
AI4TS
84
556
0
15 Apr 2020
Efficient Intent Detection with Dual Sentence Encoders
I. Casanueva
Tadas Temčinas
D. Gerz
Matthew Henderson
Ivan Vulić
VLM
370
476
0
10 Mar 2020
Evaluation of Sentence Representations in Polish
Slawomir Dadas
Michal Perelkiewicz
Rafal Poswiata
169
15
0
25 Oct 2019
RoBERTa: A Robustly Optimized BERT Pretraining Approach
Yinhan Liu
Myle Ott
Naman Goyal
Jingfei Du
Mandar Joshi
Danqi Chen
Omer Levy
M. Lewis
Luke Zettlemoyer
Veselin Stoyanov
AIMat
677
24,541
0
26 Jul 2019
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
Jacob Devlin
Ming-Wei Chang
Kenton Lee
Kristina Toutanova
VLM SSL SSeg
1.8K
95,175
0
11 Oct 2018
The NarrativeQA Reading Comprehension Challenge
Tomáš Kočiský
Jonathan Richard Schwarz
Phil Blunsom
Chris Dyer
Karl Moritz Hermann
Gábor Melis
Edward Grefenstette
142
784
0
19 Dec 2017
Attention Is All You Need
Ashish Vaswani
Noam M. Shazeer
Niki Parmar
Jakob Uszkoreit
Llion Jones
Aidan Gomez
Lukasz Kaiser
Illia Polosukhin
3DV
732
132,363
0
12 Jun 2017
Character-level Convolutional Networks for Text Classification
Xiang Zhang
Jiaqi Zhao
Yann LeCun
268
6,130
0
04 Sep 2015