
Banyan: Improved Representation Learning with Explicit Structure

25 July 2024

Papers citing "Banyan: Improved Representation Learning with Explicit Structure"


Title
TRA: Better Length Generalisation with Threshold Relative Attention Mattia Opper Roland Fernandez P. Smolensky Jianfeng Gao 100 0 0 29 Mar 2025
Compositional Generalization Across Distributional Shifts with Sparse Tree Operations Paul Soulos Henry Conklin Mattia Opper P. Smolensky Jianfeng Gao Roland Fernandez 123 5 0 18 Dec 2024
Self-StrAE at SemEval-2024 Task 1: Making Self-Structuring AutoEncoders Learn More With Less Mattia Opper Siddharth Narayanaswamy 60 3 0 02 Apr 2024
SemEval-2024 Task 1: Semantic Textual Relatedness for African and Asian Languages N. Ousidhoum Shamsuddeen Hassan Muhammad Mohamed Abdalla Idris Abdulmumin Ibrahim Said Ahmad ... Thamar Solorio Nirmal Surange Krishnapriya Vishnubhotla Seid Muhie Yimam Saif M. Mohammad 78 13 0 27 Mar 2024
Griffin: Mixing Gated Linear Recurrences with Local Attention for Efficient Language Models Soham De Samuel L. Smith Anushan Fernando Aleksandar Botev George-Christian Muraru ... David Budden Yee Whye Teh Razvan Pascanu Nando de Freitas Çağlar Gülçehre Mamba 100 130 0 29 Feb 2024
SemRel2024: A Collection of Semantic Textual Relatedness Datasets for 13 Languages N. Ousidhoum Shamsuddeen Hassan Muhammad Mohamed Abdalla Idris Abdulmumin Ibrahim Said Ahmad ... Hailegnaw Getaneh Tilaye Krishnapriya Vishnubhotla Genta Indra Winata Seid Muhie Yimam Saif M. Mohammad 83 40 0 13 Feb 2024
Anisotropy Is Inherent to Self-Attention in Transformers Nathan Godey Eric Villemonte de la Clergerie Benoît Sagot 43 19 0 22 Jan 2024
Mamba: Linear-Time Sequence Modeling with Selective State Spaces Albert Gu Tri Dao Mamba 146 2,699 0 01 Dec 2023
On the effect of curriculum learning with developmental data for grammar acquisition Mattia Opper J. Morrison N. Siddharth 57 2 0 31 Oct 2023
Pushdown Layers: Encoding Recursive Structure in Transformer Language Models Shikhar Murty Pratyusha Sharma Jacob Andreas Christopher D. Manning AI4CE 73 14 0 29 Oct 2023
Beam Tree Recursive Cells Jishnu Ray Chowdhury Cornelia Caragea 59 6 0 31 May 2023
RWKV: Reinventing RNNs for the Transformer Era Bo Peng Eric Alcaide Quentin G. Anthony Alon Albalak Samuel Arcadinho ... Qihang Zhao P. Zhou Qinghua Zhou Jian Zhu Rui-Jie Zhu 227 597 0 22 May 2023
Just Rank: Rethinking Evaluation with Word and Sentence Similarities Bin Wang C.-C. Jay Kuo Haizhou Li ELM 49 30 0 05 Mar 2022
Transformer Grammars: Augmenting Transformer Language Models with Syntactic Inductive Biases at Scale Laurent Sartran Samuel Barrett A. Kuncoro Miloš Stanojević Phil Blunsom Chris Dyer 73 50 0 01 Mar 2022
Fast-R2D2: A Pretrained Recursive Neural Network based on Pruned CKY for Grammar Induction and Text Representation Xiang Hu Haitao Mi Liang Li Gerard de Melo 58 14 0 01 Mar 2022
What Makes Sentences Semantically Related: A Textual Relatedness Dataset and Empirical Study Mohamed Abdalla Krishnapriya Vishnubhotla Saif M. Mohammad 52 25 0 10 Oct 2021
R2D2: Recursive Transformer based on Differentiable Tree for Interpretable Hierarchical Language Modeling Xiang Hu Haitao Mi Zujie Wen Yafang Wang Yi Su Jing Zheng Gerard de Melo 40 23 0 02 Jul 2021
Can contrastive learning avoid shortcut solutions? Joshua Robinson Li Sun Ke Yu Kayhan Batmanghelich Stefanie Jegelka S. Sra SSL 74 145 0 21 Jun 2021
Modeling Hierarchical Structures with Continuous Recursive Neural Networks Jishnu Ray Chowdhury Cornelia Caragea 54 15 0 10 Jun 2021
Paraphrastic Representations at Scale John Wieting Kevin Gimpel Graham Neubig Taylor Berg-Kirkpatrick 97 19 0 30 Apr 2021
SimCSE: Simple Contrastive Learning of Sentence Embeddings Tianyu Gao Xingcheng Yao Danqi Chen AILaw SSL 274 3,396 0 18 Apr 2021
BEIR: A Heterogenous Benchmark for Zero-shot Evaluation of Information Retrieval Models Nandan Thakur Nils Reimers Andreas Rücklé Abhishek Srivastava Iryna Gurevych VLM 425 1,041 0 17 Apr 2021
Understanding Contrastive Representation Learning through Alignment and Uniformity on the Hypersphere Tongzhou Wang Phillip Isola SSL 160 1,840 0 20 May 2020
Unsupervised Cross-lingual Representation Learning at Scale Alexis Conneau Kartikay Khandelwal Naman Goyal Vishrav Chaudhary Guillaume Wenzek Francisco Guzmán Edouard Grave Myle Ott Luke Zettlemoyer Veselin Stoyanov 223 6,565 0 05 Nov 2019
Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks Nils Reimers Iryna Gurevych 1.3K 12,226 0 27 Aug 2019
Well-Read Students Learn Better: On the Importance of Pre-training Compact Models Iulia Turc Ming-Wei Chang Kenton Lee Kristina Toutanova 65 224 0 23 Aug 2019
RoBERTa: A Robustly Optimized BERT Pretraining Approach Yinhan Liu Myle Ott Naman Goyal Jingfei Du Mandar Joshi Danqi Chen Omer Levy M. Lewis Luke Zettlemoyer Veselin Stoyanov AIMat 665 24,528 0 26 Jul 2019
Unsupervised Latent Tree Induction with Deep Inside-Outside Recursive Autoencoders Andrew Drozdov Pat Verga Mohit Yadav Mohit Iyyer Andrew McCallum 47 123 0 03 Apr 2019
Cooperative Learning of Disjoint Syntax and Semantics Serhii Havrylov Germán Kruszewski Armand Joulin 55 48 0 25 Feb 2019
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding Alex Wang Amanpreet Singh Julian Michael Felix Hill Omer Levy Samuel R. Bowman ELM 1.1K 7,182 0 20 Apr 2018
ListOps: A Diagnostic Dataset for Latent Tree Learning Nikita Nangia Samuel R. Bowman 57 138 0 17 Apr 2018
BPEmb: Tokenization-free Pre-trained Subword Embeddings in 275 Languages Benjamin Heinzerling Michael Strube 58 232 0 05 Oct 2017
SemEval-2017 Task 1: Semantic Textual Similarity - Multilingual and Cross-lingual Focused Evaluation Daniel Cer Mona T. Diab Eneko Agirre I. Lopez-Gazpio Lucia Specia 430 1,882 0 31 Jul 2017
Attention Is All You Need Ashish Vaswani Noam M. Shazeer Niki Parmar Jakob Uszkoreit Llion Jones Aidan Gomez Lukasz Kaiser Illia Polosukhin 3DV 713 132,199 0 12 Jun 2017
Unsupervised Learning of Sentence Embeddings using Compositional n-Gram Features Matteo Pagliardini Prakhar Gupta Martin Jaggi SSL 164 694 0 07 Mar 2017
Using Fast Weights to Attend to the Recent Past Jimmy Ba Geoffrey E. Hinton Volodymyr Mnih Joel Z Leibo Catalin Ionescu 63 272 0 20 Oct 2016
Pointer Sentinel Mixture Models Stephen Merity Caiming Xiong James Bradbury R. Socher RALM 328 2,876 0 26 Sep 2016
Improved Semantic Representations From Tree-Structured Long Short-Term Memory Networks Kai Sheng Tai R. Socher Christopher D. Manning AIMat 142 3,122 0 28 Feb 2015
Adam: A Method for Stochastic Optimization Diederik P. Kingma Jimmy Ba ODL 1.9K 150,115 0 22 Dec 2014
SimLex-999: Evaluating Semantic Models with (Genuine) Similarity Estimation Felix Hill Roi Reichart Anna Korhonen 101 1,303 0 15 Aug 2014
Efficient Estimation of Word Representations in Vector Space Tomas Mikolov Kai Chen G. Corrado J. Dean 3DV 680 31,512 0 16 Jan 2013