RACE: Large-scale ReAding Comprehension Dataset From Examinations

15 April 2017
Guokun Lai, Qizhe Xie, Hanxiao Liu, Yiming Yang, Eduard H. Hovy
ELM
arXiv: 1704.04683 (abs | PDF | HTML)

Papers citing "RACE: Large-scale ReAding Comprehension Dataset From Examinations"

Showing 50 of 815 citing papers.

A Vietnamese Dataset for Text Segmentation and Multiple Choices Reading Comprehension
Toan Nguyen Hai, Ha Nguyen Viet, Truong Quan Xuan, Duc Do Minh
19 Jun 2025

Learning-Time Encoding Shapes Unlearning in LLMs
Ruihan Wu, Konstantin Garov, Kamalika Chaudhuri
MU
18 Jun 2025

From Model to Classroom: Evaluating Generated MCQs for Portuguese with Narrative and Difficulty Concerns
Bernardo Leite, Henrique Lopes Cardoso, Pedro Pinto, Abel Ferreira, Luís Abreu, Isabel Rangel, Sandra Monteiro
18 Jun 2025

Domain2Vec: Vectorizing Datasets to Find the Optimal Data Mixture without Training
Mozhi Zhang, Howe Tissue, Lu Wang, Xipeng Qiu
12 Jun 2025

DIVE into MoE: Diversity-Enhanced Reconstruction of Large Language Models from Dense into Mixture-of-Experts
Yuchen Feng, Bowen Shen, Naibin Gu, Jiaxuan Zhao, Peng Fu, Zheng Lin, Weiping Wang
MoMe, MoE
11 Jun 2025

MesaNet: Sequence Modeling by Locally Optimal Test-Time Training
J. Oswald, Nino Scherrer, Seijin Kobayashi, Luca Versari, Songlin Yang, ..., Guillaume Lajoie, Charlotte Frenkel, Razvan Pascanu, Blaise Agüera y Arcas, João Sacramento
05 Jun 2025

MANBench: Is Your Multimodal Model Smarter than Human?
Han Zhou, Qitong Xu, Yiheng Dong, Xin Yang
04 Jun 2025

Scaling Fine-Grained MoE Beyond 50B Parameters: Empirical Evaluation and Practical Insights
Jakub Krajewski, Marcin Chochowski, Daniel Korzekwa
MoE, ALM
03 Jun 2025

Beyond Text Compression: Evaluating Tokenizers Across Scales
Jonas F. Lotz, António V. Lopes, Stephan Peitz, Hendra Setiawan, Leonardo Emili
03 Jun 2025

Recipes for Pre-training LLMs with MXFP8
Asit K. Mishra, Dusan Stosic, Simon Layton
MQ
30 May 2025

Chameleon: A Flexible Data-mixing Framework for Language Model Pretraining and Finetuning
Wanyun Xie, F. Tonin, Volkan Cevher
30 May 2025

From Chat Logs to Collective Insights: Aggregative Question Answering
Wentao Zhang, Woojeong Kim, Yuntian Deng
LMTD
29 May 2025

Pretraining Language Models to Ponder in Continuous Space
Boyi Zeng, Shixiang Song, Siyuan Huang, Yixuan Wang, He Li, Ziwei He, Xinbing Wang, Zhiyu Li, Zhouhan Lin
LRM
27 May 2025

How Does Sequence Modeling Architecture Influence Base Capabilities of Pre-trained Language Models? Exploring Key Architecture Design Principles to Avoid Base Capabilities Degradation
Xin Lu, Yanyan Zhao, Si Wei, Shijin Wang, Bing Qin, Ting Liu
24 May 2025

Locate-then-Merge: Neuron-Level Parameter Fusion for Mitigating Catastrophic Forgetting in Multimodal LLMs
Zeping Yu, Sophia Ananiadou
MoMe, KELM, CLL
22 May 2025

Zebra-Llama: Towards Extremely Efficient Hybrid Models
Mingyu Yang, Mehdi Rezagholizadeh, Guihong Li, Vikram Appia, Emad Barsoum
22 May 2025

INFERENCEDYNAMICS: Efficient Routing Across LLMs through Structured Capability and Knowledge Profiling
Haochen Shi, Tianshi Zheng, Weiqi Wang, Baixuan Xu, Chunyang Li, Chunkit Chan, Tao Fan, Yangqiu Song, Qiang Yang
22 May 2025

Improving LLM First-Token Predictions in Multiple-Choice Question Answering via Prefilling Attack
Silvia Cappelletti, Tobia Poppi, Samuele Poppi, Zheng-Xin Yong, Diego Garcia-Olano, Marcella Cornia, Lorenzo Baraldi, Rita Cucchiara
KELM, AAML
21 May 2025

LFTF: Locating First and Then Fine-Tuning for Mitigating Gender Bias in Large Language Models
Zhanyue Qin, Yue Ding, Deyuan Liu, Qingbin Liu, Junxian Cai, Xi Chen, Zhiying Tu, Dianhui Chu, Cuiyun Gao, Dianbo Sui
21 May 2025

Occult: Optimizing Collaborative Communication across Experts for Accelerated Parallel MoE Training and Inference
Shuqing Luo, Pingzhi Li, Jie Peng, Hanrui Wang, Yang, Zhao, Yu Cheng, Tianlong Chen
MoE
19 May 2025

RLAP: A Reinforcement Learning Enhanced Adaptive Planning Framework for Multi-step NLP Task Solving
Zepeng Ding, Dixuan Wang, Ziqin Luo, Guochao Jiang, Deqing Yang, Jiaqing Liang
17 May 2025

Parallel Scaling Law for Language Models
Mouxiang Chen, Binyuan Hui, Zeyu Cui, Jiaxi Yang, Dayiheng Liu, Jianling Sun, Junyang Lin, Zhongxin Liu
MoE, LRM
15 May 2025

PT-MoE: An Efficient Finetuning Framework for Integrating Mixture-of-Experts into Prompt Tuning
Zongqian Li, Yixuan Su, Nigel Collier
MoE
14 May 2025

AttentionInfluence: Adopting Attention Head Influence for Weak-to-Strong Pretraining Data Selection
Kai Hua, Steven Wu, Ge Zhang, Ke Shen
LRM
12 May 2025

MiMo: Unlocking the Reasoning Potential of Language Model -- From Pretraining to Posttraining
Xiaomi LLM-Core Team, Bingquan Xia, Bo Shen, Cici, Dawei Zhu, ..., Yun Wang, Yue Yu, Zhenru Lin, Zhichao Song, Zihao Yue
MoE, ReLM, LRM, AI4CE
12 May 2025

HiBayES: A Hierarchical Bayesian Modeling Framework for AI Evaluation Statistics
Lennart Luettgau, Harry Coppock, Magda Dubois, Christopher Summerfield, Cozmin Ududec
08 May 2025

ReplaceMe: Network Simplification via Depth Pruning and Transformer Block Linearization
Dmitriy Shopkhoev, Ammar Ali, Magauiya Zhussip, Valentin Malykh, Stamatios Lefkimmiatis, N. Komodakis, Sergey Zagoruyko
VLM
05 May 2025

Don't be lazy: CompleteP enables compute-efficient deep transformers
Nolan Dey, Bin Claire Zhang, Lorenzo Noci, Mufan Li, Blake Bordelon, Shane Bergsma, Cengiz Pehlevan, Boris Hanin, Joel Hestness
02 May 2025

HalluLens: LLM Hallucination Benchmark
Yejin Bang, Ziwei Ji, Alan Schelten, Anthony Hartshorn, Tara Fowler, Cheng Zhang, Nicola Cancedda, Pascale Fung
HILM
24 Apr 2025

Compass-V2 Technical Report
Sophia Maria
MoE, LRM
22 Apr 2025

aiXamine: Simplified LLM Safety and Security
Fatih Deniz, Dorde Popovic, Yazan Boshmaf, Euisuh Jeong, M. Ahmad, Sanjay Chawla, Issa M. Khalil
ELM
21 Apr 2025

Meta-rater: A Multi-dimensional Data Selection Method for Pre-training Language Models
Xinlin Zhuang, Jiahui Peng, Ren Ma, Yucheng Wang, Tianyi Bai, Xingjian Wei, Jiantao Qiu, Chi Zhang, Ying Qian, Conghui He
19 Apr 2025

D-GEN: Automatic Distractor Generation and Evaluation for Reliable Assessment of Generative Model
Grace Byun, Jinho D. Choi
EGVM
18 Apr 2025

InternVL3: Exploring Advanced Training and Test-Time Recipes for Open-Source Multimodal Models
Jinguo Zhu, Weiyun Wang, Zhe Chen, Ziwei Liu, Shenglong Ye, ..., Dahua Lin, Yu Qiao, Jifeng Dai, Wenhai Wang, Wei Wang
MLLM, VLM
14 Apr 2025

Improving Multilingual Capabilities with Cultural and Local Knowledge in Large Language Models While Enhancing Native Performance
Ram Mohan Rao Kadiyala, Siddartha Pullakhandam, Siddhant Gupta, Drishti Sharma, Jebish Purbey, Kanwal Mehreen, Muhammad Arham, Hamza Farooq
13 Apr 2025

Can the capability of Large Language Models be described by human ability? A Meta Study
Mingrui Zan, Yunquan Zhang, Boyang Zhang, Fangming Liu, Daning Cheng
ELM, LM&MA
13 Apr 2025

Long Context In-Context Compression by Getting to the Gist of Gisting
Aleksandar Petrov, Mark Sandler, A. Zhmoginov, Nolan Miller, Max Vladymyrov
11 Apr 2025

Efficient Evaluation of Large Language Models via Collaborative Filtering
Xu-Xiang Zhong, Chao Yi, Han-Jia Ye
05 Apr 2025

Entropy-Based Block Pruning for Efficient Large Language Models
Liangwei Yang, Yuhui Xu, Juntao Tan, Doyen Sahoo, Siyang Song, Caiming Xiong, Han Wang, Shelby Heinecke
AAML
04 Apr 2025

Advancing MoE Efficiency: A Collaboration-Constrained Routing (C2R) Strategy for Better Expert Parallelism Design
Mohan Zhang, Pingzhi Li, Jie Peng, Mufan Qiu, Tianlong Chen
MoE
02 Apr 2025

Rubrik's Cube: Testing a New Rubric for Evaluating Explanations on the CUBE dataset
Diana Galván-Sosa, Gabrielle Gaudeau, Pride Kavumba, Yunmeng Li, Hongyi gu, Zheng Yuan, Keisuke Sakaguchi, P. Buttery
LRM
31 Mar 2025

Adaptive Rank Allocation: Speeding Up Modern Transformers with RaNA Adapters
Roberto Garcia, Jerry Liu, Daniel Sorvisto, Sabri Eyuboglu
23 Mar 2025

Uncertainty Quantification and Confidence Calibration in Large Language Models: A Survey
Xiaoou Liu, Tiejin Chen, Longchao Da, Chacha Chen, Zhen Lin, Hua Wei
HILM
20 Mar 2025

Mixture of Lookup Experts
Shibo Jie, Yehui Tang, Kai Han, Yongqian Li, Duyu Tang, Zhi-Hong Deng, Yunhe Wang
MoE
20 Mar 2025

SkyLadder: Better and Faster Pretraining via Context Window Scheduling
Tongyao Zhu, Qian Liu, Haonan Wang, Shiqi Chen, Xiangming Gu, Tianyu Pang, Min-Yen Kan
19 Mar 2025

HICD: Hallucination-Inducing via Attention Dispersion for Contrastive Decoding to Mitigate Hallucinations in Large Language Models
Xinyan Jiang, Hang Ye, Yongxin Zhu, Xiaoying Zheng, Zikang Chen, Jun Gong
17 Mar 2025

X-EcoMLA: Upcycling Pre-Trained Attention into MLA for Efficient and Extreme KV Compression
Guihong Li, Mehdi Rezagholizadeh, Mingyu Yang, Vikram Appia, Emad Barsoum
VLM
14 Mar 2025

IteRABRe: Iterative Recovery-Aided Block Reduction
Haryo Akbarianto Wibowo, Haiyue Song, Hideki Tanaka, Masao Utiyama, Alham Fikri Aji, Raj Dabre
08 Mar 2025

Every FLOP Counts: Scaling a 300B Mixture-of-Experts LING LLM without Premium GPUs
Ling Team, B. Zeng, Chenyu Huang, Chao Zhang, Changxin Tian, ..., Zhaoxin Huan, Zujie Wen, Zhenhang Sun, Zhuoxuan Du, Z. He
MoE, ALM
07 Mar 2025

Llama-3.1-Sherkala-8B-Chat: An Open Large Language Model for Kazakh
Fajri Koto, Rituraj Joshi, Nurdaulet Mukhituly, Yanjie Wang, Zhuohan Xie, ..., Avraham Sheinin, Natalia Vassilieva, Neha Sengupta, Larry Murray, Preslav Nakov
ALM, KELM
03 Mar 2025