ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2305.14858
  4. Cited By
Pre-RMSNorm and Pre-CRMSNorm Transformers: Equivalent and Efficient
  Pre-LN Transformers

Pre-RMSNorm and Pre-CRMSNorm Transformers: Equivalent and Efficient Pre-LN Transformers

24 May 2023
Zixuan Jiang
Jiaqi Gu
Hanqing Zhu
David Z. Pan
    AI4CE
ArXivPDFHTML

Papers citing "Pre-RMSNorm and Pre-CRMSNorm Transformers: Equivalent and Efficient Pre-LN Transformers"

14 / 14 papers shown
Title
Qwen3 Technical Report
Qwen3 Technical Report
A. Yang
A. Li
Baosong Yang
Beichen Zhang
Binyuan Hui
...
Zekun Wang
Zeyu Cui
Z. Zhang
Zhenhong Zhou
Zihan Qiu
LLMAG
OSLM
LRM
42
0
0
14 May 2025
Bielik 11B v2 Technical Report
Bielik 11B v2 Technical Report
Krzysztof Ociepa
Łukasz Flis
Krzysztof Wróbel
Adrian Gwoździej
Remigiusz Kinas
34
0
0
05 May 2025
Bielik v3 Small: Technical Report
Bielik v3 Small: Technical Report
Krzysztof Ociepa
Łukasz Flis
Remigiusz Kinas
Krzysztof Wróbel
Adrian Gwoździej
27
0
0
05 May 2025
DARS: Dynamic Action Re-Sampling to Enhance Coding Agent Performance by Adaptive Tree Traversal
DARS: Dynamic Action Re-Sampling to Enhance Coding Agent Performance by Adaptive Tree Traversal
Vaibhav Aggarwal
Ojasv Kamal
Abhinav Japesh
Zhijing Jin
Bernhard Schölkopf
52
1
0
18 Mar 2025
Qwen2.5-1M Technical Report
A. Yang
Bowen Yu
Chong Li
Dayiheng Liu
Fei Huang
...
Xingzhang Ren
Xinlong Yang
Y. Li
Zhiying Xu
Z. Zhang
71
10
0
28 Jan 2025
Bielik 7B v0.1: A Polish Language Model -- Development, Insights, and
  Evaluation
Bielik 7B v0.1: A Polish Language Model -- Development, Insights, and Evaluation
Krzysztof Ociepa
Łukasz Flis
Krzysztof Wróbel
Adrian Gwoździej
Remigiusz Kinas
27
1
0
24 Oct 2024
Qwen2 Technical Report
Qwen2 Technical Report
An Yang
Baosong Yang
Binyuan Hui
Jian Xu
Bowen Yu
...
Yuqiong Liu
Zeyu Cui
Zhenru Zhang
Zhifang Guo
Zhi-Wei Fan
OSLM
VLM
MU
60
792
0
15 Jul 2024
RouteFinder: Towards Foundation Models for Vehicle Routing Problems
RouteFinder: Towards Foundation Models for Vehicle Routing Problems
Federico Berto
Chuanbo Hua
Nayeli Gast Zepeda
André Hottung
N. Wouda
Leon Lan
Kevin Tierney
J. Park
Jinkyoo Park
56
10
0
21 Jun 2024
AdaFisher: Adaptive Second Order Optimization via Fisher Information
AdaFisher: Adaptive Second Order Optimization via Fisher Information
Damien Martins Gomes
Yanlei Zhang
Eugene Belilovsky
Guy Wolf
Mahdi S. Hosseini
ODL
76
2
0
26 May 2024
Sailor: Open Language Models for South-East Asia
Sailor: Open Language Models for South-East Asia
Longxu Dou
Qian Liu
Guangtao Zeng
Jia Guo
Jiahui Zhou
Wei Lu
Min-Bin Lin
LRM
34
7
0
04 Apr 2024
Qwen Technical Report
Qwen Technical Report
Jinze Bai
Shuai Bai
Yunfei Chu
Zeyu Cui
Kai Dang
...
Zhenru Zhang
Chang Zhou
Jingren Zhou
Xiaohuan Zhou
Tianhang Zhu
OSLM
29
1,577
0
28 Sep 2023
$\rm SP^3$: Enhancing Structured Pruning via PCA Projection
SP3\rm SP^3SP3: Enhancing Structured Pruning via PCA Projection
Yuxuan Hu
Jing Zhang
Zhe Zhao
Chengliang Zhao
Xiaodong Chen
Cuiping Li
Hong Chen
35
1
0
31 Aug 2023
RepVGG: Making VGG-style ConvNets Great Again
RepVGG: Making VGG-style ConvNets Great Again
Xiaohan Ding
Xinming Zhang
Ningning Ma
Jungong Han
Guiguang Ding
Jian Sun
136
1,548
0
11 Jan 2021
Pre-trained Models for Natural Language Processing: A Survey
Pre-trained Models for Natural Language Processing: A Survey
Xipeng Qiu
Tianxiang Sun
Yige Xu
Yunfan Shao
Ning Dai
Xuanjing Huang
LM&MA
VLM
243
1,452
0
18 Mar 2020
1