Know When to Fuse: Investigating Non-English Hybrid Retrieval in the Legal Domain

2 September 2024

Papers cited by "Know When to Fuse: Investigating Non-English Hybrid Retrieval in the Legal Domain"


STARD: A Chinese Statute Retrieval Dataset with Real Queries Issued by Non-professionals Weihang Su Yiran Hu Anzhe Xie Qingyao Ai Zibing Que Ning Zheng Yun Liu Weixing Shen Yiqun Liu ELM AILaw 56 11 0 21 Jun 2024
CuSINeS: Curriculum-driven Structure Induced Negative Sampling for Statutory Article Retrieval Santosh T.Y.S.S Kristina Kaiser Matthias Grabmair 37 3 0 31 Mar 2024
ColBERTv2: Effective and Efficient Retrieval via Lightweight Late Interaction Keshav Santhanam Omar Khattab Jon Saad-Falcon Christopher Potts Matei A. Zaharia 110 417 0 02 Dec 2021
A Statutory Article Retrieval Dataset in French Antoine Louis Gerasimos Spanakis RALM AILaw 40 43 0 26 Aug 2021
On the Ethical Limits of Natural Language Processing on Legal Text D. Tsarapatsanis Nikolaos Aletras ELM AILaw 63 43 0 06 May 2021
SimCSE: Simple Contrastive Learning of Sentence Embeddings Tianyu Gao Xingcheng Yao Danqi Chen AILaw SSL 280 3,415 0 18 Apr 2021
BEIR: A Heterogenous Benchmark for Zero-shot Evaluation of Information Retrieval Models Nandan Thakur Nils Reimers Andreas Rücklé Abhishek Srivastava Iryna Gurevych VLM 425 1,055 0 17 Apr 2021
Efficiently Teaching an Effective Dense Retriever with Balanced Topic Aware Sampling Sebastian Hofstätter Sheng-Chieh Lin Jheng-Hong Yang Jimmy J. Lin Allan Hanbury VLM 93 402 0 14 Apr 2021
Approximate Nearest Neighbor Negative Contrastive Learning for Dense Text Retrieval Lee Xiong Chenyan Xiong Ye Li Kwok-Fung Tang Jialin Liu Paul N. Bennett Junaid Ahmed Arnold Overwijk 141 1,236 0 01 Jul 2020
ColBERT: Efficient and Effective Passage Search via Contextualized Late Interaction over BERT Omar Khattab Matei A. Zaharia 138 1,380 0 27 Apr 2020
Minimizing FLOPs to Learn Efficient Sparse Representations Biswajit Paria Chih-Kuan Yeh Ian En-Hsu Yen N. Xu Pradeep Ravikumar Barnabás Póczos 71 69 0 12 Apr 2020
A Simple Framework for Contrastive Learning of Visual Representations Ting Chen Simon Kornblith Mohammad Norouzi Geoffrey E. Hinton SSL 390 18,897 0 13 Feb 2020
Pre-training Tasks for Embedding-based Large-scale Retrieval Wei-Cheng Chang Felix X. Yu Yin-Wen Chang Yiming Yang Sanjiv Kumar RALM 82 306 0 10 Feb 2020
FlauBERT: Unsupervised Language Model Pre-training for French Hang Le Loïc Vial Jibril Frej Vincent Segonne Maximin Coavoux Benjamin Lecouteux A. Allauzen Benoît Crabbé Laurent Besacier D. Schwab AI4CE 96 400 0 11 Dec 2019
PyTorch: An Imperative Style, High-Performance Deep Learning Library Adam Paszke Sam Gross Francisco Massa Adam Lerer James Bradbury ... Sasank Chilamkurthy Benoit Steiner Lu Fang Junjie Bai Soumith Chintala ODL 565 42,639 0 03 Dec 2019
CamemBERT: a Tasty French Language Model Louis Martin Benjamin Muller Pedro Ortiz Suarez Yoann Dupont Laurent Romary Eric Villemonte de la Clergerie Djamé Seddah Benoît Sagot 126 976 0 10 Nov 2019
Unsupervised Cross-lingual Representation Learning at Scale Alexis Conneau Kartikay Khandelwal Naman Goyal Vishrav Chaudhary Guillaume Wenzek Francisco Guzmán Edouard Grave Myle Ott Luke Zettlemoyer Veselin Stoyanov 228 6,593 0 05 Nov 2019
DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter Victor Sanh Lysandre Debut Julien Chaumond Thomas Wolf 262 7,554 0 02 Oct 2019
Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks Nils Reimers Iryna Gurevych 1.3K 12,316 0 27 Aug 2019
Deeper Text Understanding for IR with Contextual Neural Language Modeling Zhuyun Dai Jamie Callan 65 449 0 22 May 2019
Passage Re-ranking with BERT Rodrigo Nogueira Kyunghyun Cho OOD 126 1,097 0 13 Jan 2019
End-to-End Retrieval in Continuous Space D. Gillick Alessandro Presta Gaurav Singh Tomar 119 104 0 19 Nov 2018
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding Jacob Devlin Ming-Wei Chang Kenton Lee Kristina Toutanova VLM SSL SSeg 1.8K 95,229 0 11 Oct 2018
MS MARCO: A Human Generated MAchine Reading COmprehension Dataset Payal Bajaj Daniel Fernando Campos Nick Craswell Li Deng Jianfeng Gao ... Mir Rosenberg Xia Song Alina Stoica Saurabh Tiwary Tong Wang RALM 181 2,745 0 28 Nov 2016