Smarter, Better, Faster, Longer: A Modern Bidirectional Encoder for Fast, Memory Efficient, and Long Context Finetuning and Inference

18 December 2024
Benjamin Warner
Antoine Chaffin
Benjamin Clavié
Orion Weller
Oskar Hallström
Said Taghadouini
Alexis Gallagher
Raja Biswas
Faisal Ladhak
Tom Aarsen
Nathan Cooper
Griffin Adams
Jeremy Howard
Iacopo Poli

Papers citing "Smarter, Better, Faster, Longer: A Modern Bidirectional Encoder for Fast, Memory Efficient, and Long Context Finetuning and Inference"

27 / 27 papers shown
GuRE:Generative Query REwriter for Legal Passage Retrieval
Daehee Kim
Deokhyung Kang
Jonghwi Kim
Sangwon Ryu
Gary Geunbae Lee
RALM
AILaw
14
0
0
19 May 2025
Are Sparse Autoencoders Useful for Java Function Bug Detection?
Rui Melo
Claudia Mamede
Andre Catarino
Rui Abreu
Henrique Lopes Cardoso
31
0
0
15 May 2025
LDIR: Low-Dimensional Dense and Interpretable Text Embeddings with Relative Representations
Yile Wang
Zhanyu Shen
Hui Huang
29
0
0
15 May 2025
Probability Consistency in Large Language Models: Theoretical Foundations Meet Empirical Discrepancies
Xiaoliang Luo
Xinyi Xu
Michael Ramscar
Bradley C. Love
30
0
0
13 May 2025
DyGEnc: Encoding a Sequence of Textual Scene Graphs to Reason and Answer Questions in Dynamic Scenes
S. Linok
Vadim Semenov
Anastasia Trunova
Oleg Bulichev
Dmitry A. Yudin
52
0
0
06 May 2025
Semantic Probabilistic Control of Language Models
Kareem Ahmed
Catarina G Belém
Padhraic Smyth
Sameer Singh
44
0
0
04 May 2025
How Real Are Synthetic Therapy Conversations? Evaluating Fidelity in Prolonged Exposure Dialogues
Suhas BN
Andrew M. Sherrill
Saeed Abdullah
Rosa I. Arriaga
35
1
0
30 Apr 2025
BrightCookies at SemEval-2025 Task 9: Exploring Data Augmentation for Food Hazard Classification
Foteini Papadopoulou
Osman Mutlu
Neris Özen
Bas H. M. van der Velden
I. Hendrickx
Ali Hürriyetoğlu
ViT
39
0
0
29 Apr 2025
GLaMoR: Consistency Checking of OWL Ontologies using Graph Language Models
Justin Mücke
A. Scherp
160
0
0
26 Apr 2025
llm-jp-modernbert: A ModernBERT Model Trained on a Large-Scale Japanese Corpus with Long Context Length
Issa Sugiura
Kouta Nakayama
Yusuke Oda
34
0
0
22 Apr 2025
Grounded in Context: Retrieval-Based Method for Hallucination Detection
Assaf Gerner
Netta Madvil
Nadav Barak
Alex Zaikman
Jonatan Liberman
...
Yaron Friedman
Neal Harow
Noam Bresler
Shir Chorev
Philip Tannor
HILM
29
0
0
22 Apr 2025
Ask2Loc: Learning to Locate Instructional Visual Answers by Asking Questions
Chang Zong
Bin Li
Shoujun Zhou
Jian Wan
Lei Zhang
165
0
0
22 Apr 2025
Meta-rater: A Multi-dimensional Data Selection Method for Pre-training Language Models
Xinlin Zhuang
Jiahui Peng
Ren Ma
Yucheng Wang
Tianyi Bai
Xingjian Wei
Jiantao Qiu
Chi Zhang
Ying Qian
Conghui He
53
0
0
19 Apr 2025
Transferrable Surrogates in Expressive Neural Architecture Search Spaces
Shiwen Qin
Gabriela Kadlecová
Martin Pilát
Shay B. Cohen
Roman Neruda
Elliot J. Crowley
Jovita Lukasik
Linus Ericsson
AI4CE
131
0
0
17 Apr 2025
CSPLADE: Learned Sparse Retrieval with Causal Language Models
Zhichao Xu
Aosong Feng
Yijun Tian
Haibo Ding
Lin Lee Cheong
RALM
47
0
0
15 Apr 2025
AI-Slop to AI-Polish? Aligning Language Models through Edit-Based Writing Rewards and Test-time Computation
Tuhin Chakrabarty
Philippe Laban
C. Wu
37
1
0
10 Apr 2025
GOLLuM: Gaussian Process Optimized LLMs -- Reframing LLM Finetuning through Bayesian Optimization
Bojana Ranković
P. Schwaller
BDL
208
0
0
08 Apr 2025
FISH-Tuning: Enhancing PEFT Methods with Fisher Information
Kang Xue
Ming Dong
Xinhui Tu
Tingting He
41
0
0
05 Apr 2025
Distillation and Refinement of Reasoning in Small Language Models for Document Re-ranking
Chris Samarinas
Hamed Zamani
ALM
LRM
74
0
0
04 Apr 2025
EuroBERT: Scaling Multilingual Encoders for European Languages
Nicolas Boizard
Hippolyte Gisserot-Boukhlef
Duarte M. Alves
André F. T. Martins
Ayoub Hammal
...
Maxime Peyrard
Nuno M. Guerreiro
Patrick Fernandes
Ricardo Rei
Pierre Colombo
155
1
0
07 Mar 2025
MoSE: Hierarchical Self-Distillation Enhances Early Layer Embeddings
Andrea Gurioli
Federico Pennino
João Monteiro
Maurizio Gabbrielli
51
0
0
04 Mar 2025
Reading the unreadable: Creating a dataset of 19th century English newspapers using image-to-text language models
Jonathan Bourne
77
0
0
24 Feb 2025
Machine-generated text detection prevents language model collapse
George Drayson
Emine Yilmaz
Vasileios Lampos
DeLMO
62
0
0
21 Feb 2025
SafeRoute: Adaptive Model Selection for Efficient and Accurate Safety Guardrails in Large Language Models
Seanie Lee
Dong Bok Lee
Dominik Wagner
Minki Kang
Haebin Seong
Tobias Bocklet
Juho Lee
Sung Ju Hwang
12
1
0
18 Feb 2025
Prompt-based Depth Pruning of Large Language Models
Juyun Wee
Minjae Park
Jaeho Lee
VLM
93
0
0
17 Feb 2025
Life-Code: Central Dogma Modeling with Multi-Omics Sequence Unification
Zicheng Liu
Siyuan Li
Zhiyuan Chen
Lei Xin
Fang Wu
Chang Yu
Qirong Yang
Yucheng Guo
Yifan Yang
Stan Z. Li
SyDa
AI4CE
92
0
0
11 Feb 2025
Diffusion Language Models Can Perform Many Tasks with Scaling and Instruction-Finetuning
Jiasheng Ye
Zaixiang Zheng
Yu Bao
Lihua Qian
Quanquan Gu
DiffM
54
14
0
23 Aug 2023