2407.17771
Banyan: Improved Representation Learning with Explicit Structure
25 July 2024
Mattia Opper
N. Siddharth
Papers citing
"Banyan: Improved Representation Learning with Explicit Structure"
41 / 41 papers shown
Title
TRA: Better Length Generalisation with Threshold Relative Attention
Mattia Opper
Roland Fernandez
P. Smolensky
Jianfeng Gao
100
0
0
29 Mar 2025
Compositional Generalization Across Distributional Shifts with Sparse Tree Operations
Paul Soulos
Henry Conklin
Mattia Opper
P. Smolensky
Jianfeng Gao
Roland Fernandez
123
5
0
18 Dec 2024
Self-StrAE at SemEval-2024 Task 1: Making Self-Structuring AutoEncoders Learn More With Less
Mattia Opper
Siddharth Narayanaswamy
60
3
0
02 Apr 2024
SemEval-2024 Task 1: Semantic Textual Relatedness for African and Asian Languages
N. Ousidhoum
Shamsuddeen Hassan Muhammad
Mohamed Abdalla
Idris Abdulmumin
Ibrahim Said Ahmad
...
Thamar Solorio
Nirmal Surange
Krishnapriya Vishnubhotla
Seid Muhie Yimam
Saif M. Mohammad
78
13
0
27 Mar 2024
Griffin: Mixing Gated Linear Recurrences with Local Attention for Efficient Language Models
Soham De
Samuel L. Smith
Anushan Fernando
Aleksandar Botev
George-Christian Muraru
...
David Budden
Yee Whye Teh
Razvan Pascanu
Nando de Freitas
Çağlar Gülçehre
Mamba
100
130
0
29 Feb 2024
SemRel2024: A Collection of Semantic Textual Relatedness Datasets for 13 Languages
N. Ousidhoum
Shamsuddeen Hassan Muhammad
Mohamed Abdalla
Idris Abdulmumin
Ibrahim Said Ahmad
...
Hailegnaw Getaneh Tilaye
Krishnapriya Vishnubhotla
Genta Indra Winata
Seid Muhie Yimam
Saif M. Mohammad
83
40
0
13 Feb 2024
Anisotropy Is Inherent to Self-Attention in Transformers
Nathan Godey
Eric Villemonte de la Clergerie
Benoît Sagot
43
19
0
22 Jan 2024
Mamba: Linear-Time Sequence Modeling with Selective State Spaces
Albert Gu
Tri Dao
Mamba
146
2,699
0
01 Dec 2023
On the effect of curriculum learning with developmental data for grammar acquisition
Mattia Opper
J. Morrison
N. Siddharth
57
2
0
31 Oct 2023
Pushdown Layers: Encoding Recursive Structure in Transformer Language Models
Shikhar Murty
Pratyusha Sharma
Jacob Andreas
Christopher D. Manning
AI4CE
73
14
0
29 Oct 2023
Beam Tree Recursive Cells
Jishnu Ray Chowdhury
Cornelia Caragea
59
6
0
31 May 2023
RWKV: Reinventing RNNs for the Transformer Era
Bo Peng
Eric Alcaide
Quentin G. Anthony
Alon Albalak
Samuel Arcadinho
...
Qihang Zhao
P. Zhou
Qinghua Zhou
Jian Zhu
Rui-Jie Zhu
227
597
0
22 May 2023
Just Rank: Rethinking Evaluation with Word and Sentence Similarities
Bin Wang
C.-C. Jay Kuo
Haizhou Li
ELM
49
30
0
05 Mar 2022
Transformer Grammars: Augmenting Transformer Language Models with Syntactic Inductive Biases at Scale
Laurent Sartran
Samuel Barrett
A. Kuncoro
Miloš Stanojević
Phil Blunsom
Chris Dyer
73
50
0
01 Mar 2022
Fast-R2D2: A Pretrained Recursive Neural Network based on Pruned CKY for Grammar Induction and Text Representation
Xiang Hu
Haitao Mi
Liang Li
Gerard de Melo
58
14
0
01 Mar 2022
What Makes Sentences Semantically Related: A Textual Relatedness Dataset and Empirical Study
Mohamed Abdalla
Krishnapriya Vishnubhotla
Saif M. Mohammad
52
25
0
10 Oct 2021
R2D2: Recursive Transformer based on Differentiable Tree for Interpretable Hierarchical Language Modeling
Xiang Hu
Haitao Mi
Zujie Wen
Yafang Wang
Yi Su
Jing Zheng
Gerard de Melo
40
23
0
02 Jul 2021
Can contrastive learning avoid shortcut solutions?
Joshua Robinson
Li Sun
Ke Yu
Kayhan Batmanghelich
Stefanie Jegelka
S. Sra
SSL
74
145
0
21 Jun 2021
Modeling Hierarchical Structures with Continuous Recursive Neural Networks
Jishnu Ray Chowdhury
Cornelia Caragea
54
15
0
10 Jun 2021
Paraphrastic Representations at Scale
John Wieting
Kevin Gimpel
Graham Neubig
Taylor Berg-Kirkpatrick
97
19
0
30 Apr 2021
SimCSE: Simple Contrastive Learning of Sentence Embeddings
Tianyu Gao
Xingcheng Yao
Danqi Chen
AILaw
SSL
274
3,396
0
18 Apr 2021
BEIR: A Heterogenous Benchmark for Zero-shot Evaluation of Information Retrieval Models
Nandan Thakur
Nils Reimers
Andreas Rücklé
Abhishek Srivastava
Iryna Gurevych
VLM
425
1,041
0
17 Apr 2021
Understanding Contrastive Representation Learning through Alignment and Uniformity on the Hypersphere
Tongzhou Wang
Phillip Isola
SSL
160
1,840
0
20 May 2020
Unsupervised Cross-lingual Representation Learning at Scale
Alexis Conneau
Kartikay Khandelwal
Naman Goyal
Vishrav Chaudhary
Guillaume Wenzek
Francisco Guzmán
Edouard Grave
Myle Ott
Luke Zettlemoyer
Veselin Stoyanov
223
6,565
0
05 Nov 2019
Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks
Nils Reimers
Iryna Gurevych
1.3K
12,226
0
27 Aug 2019
Well-Read Students Learn Better: On the Importance of Pre-training Compact Models
Iulia Turc
Ming-Wei Chang
Kenton Lee
Kristina Toutanova
65
224
0
23 Aug 2019
RoBERTa: A Robustly Optimized BERT Pretraining Approach
Yinhan Liu
Myle Ott
Naman Goyal
Jingfei Du
Mandar Joshi
Danqi Chen
Omer Levy
M. Lewis
Luke Zettlemoyer
Veselin Stoyanov
AIMat
665
24,528
0
26 Jul 2019
Unsupervised Latent Tree Induction with Deep Inside-Outside Recursive Autoencoders
Andrew Drozdov
Pat Verga
Mohit Yadav
Mohit Iyyer
Andrew McCallum
47
123
0
03 Apr 2019
Cooperative Learning of Disjoint Syntax and Semantics
Serhii Havrylov
Germán Kruszewski
Armand Joulin
55
48
0
25 Feb 2019
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
Alex Wang
Amanpreet Singh
Julian Michael
Felix Hill
Omer Levy
Samuel R. Bowman
ELM
1.1K
7,182
0
20 Apr 2018
ListOps: A Diagnostic Dataset for Latent Tree Learning
Nikita Nangia
Samuel R. Bowman
57
138
0
17 Apr 2018
BPEmb: Tokenization-free Pre-trained Subword Embeddings in 275 Languages
Benjamin Heinzerling
Michael Strube
58
232
0
05 Oct 2017
SemEval-2017 Task 1: Semantic Textual Similarity - Multilingual and Cross-lingual Focused Evaluation
Daniel Cer
Mona T. Diab
Eneko Agirre
I. Lopez-Gazpio
Lucia Specia
430
1,882
0
31 Jul 2017
Attention Is All You Need
Ashish Vaswani
Noam M. Shazeer
Niki Parmar
Jakob Uszkoreit
Llion Jones
Aidan Gomez
Lukasz Kaiser
Illia Polosukhin
3DV
713
132,199
0
12 Jun 2017
Unsupervised Learning of Sentence Embeddings using Compositional n-Gram Features
Matteo Pagliardini
Prakhar Gupta
Martin Jaggi
SSL
164
694
0
07 Mar 2017
Using Fast Weights to Attend to the Recent Past
Jimmy Ba
Geoffrey E. Hinton
Volodymyr Mnih
Joel Z Leibo
Catalin Ionescu
63
272
0
20 Oct 2016
Pointer Sentinel Mixture Models
Stephen Merity
Caiming Xiong
James Bradbury
R. Socher
RALM
328
2,876
0
26 Sep 2016
Improved Semantic Representations From Tree-Structured Long Short-Term Memory Networks
Kai Sheng Tai
R. Socher
Christopher D. Manning
AIMat
142
3,122
0
28 Feb 2015
Adam: A Method for Stochastic Optimization
Diederik P. Kingma
Jimmy Ba
ODL
1.9K
150,115
0
22 Dec 2014
SimLex-999: Evaluating Semantic Models with (Genuine) Similarity Estimation
Felix Hill
Roi Reichart
Anna Korhonen
101
1,303
0
15 Aug 2014
Efficient Estimation of Word Representations in Vector Space
Tomas Mikolov
Kai Chen
G. Corrado
J. Dean
3DV
680
31,512
0
16 Jan 2013