arXiv:2406.16495
OTCE: Hybrid SSM and Attention with Cross Domain Mixture of Experts to construct Observer-Thinker-Conceiver-Expresser
24 June 2024
Jingze Shi, Ting Xie, Bingheng Wu, Chunjun Zheng, Kai Wang

Papers citing "OTCE: Hybrid SSM and Attention with Cross Domain Mixture of Experts to construct Observer-Thinker-Conceiver-Expresser" (7 of 7 papers shown):

An Empirical Study of Mamba-based Language Models
R. Waleffe, Wonmin Byeon, Duncan Riach, Brandon Norick, V. Korthikanti, ..., Vartika Singh, Jared Casper, Jan Kautz, Mohammad Shoeybi, Bryan Catanzaro
12 Jun 2024

Mixtral of Experts
Albert Q. Jiang, Alexandre Sablayrolles, Antoine Roux, A. Mensch, Blanche Savary, ..., Théophile Gervet, Thibaut Lavril, Thomas Wang, Timothée Lacroix, William El Sayed
08 Jan 2024

GLM: General Language Model Pretraining with Autoregressive Blank Infilling
Zhengxiao Du, Yujie Qian, Xiao Liu, Ming Ding, J. Qiu, Zhilin Yang, Jie Tang
18 Mar 2021

Transformers are RNNs: Fast Autoregressive Transformers with Linear Attention
Angelos Katharopoulos, Apoorv Vyas, Nikolaos Pappas, François Fleuret
29 Jun 2020

CLUE: A Chinese Language Understanding Evaluation Benchmark
Liang Xu, Hai Hu, Xuanwei Zhang, Lu Li, Chenjie Cao, ..., Cong Yue, Xinrui Zhang, Zhen-Yi Yang, Kyle Richardson, Zhenzhong Lan
13 Apr 2020

HotpotQA: A Dataset for Diverse, Explainable Multi-hop Question Answering
Zhilin Yang, Peng Qi, Saizheng Zhang, Yoshua Bengio, William W. Cohen, Ruslan Salakhutdinov, Christopher D. Manning
25 Sep 2018

The NarrativeQA Reading Comprehension Challenge
Tomáš Kočiský, Jonathan Richard Schwarz, Phil Blunsom, Chris Dyer, Karl Moritz Hermann, Gábor Melis, Edward Grefenstette
19 Dec 2017