2406.14315
AI-coupled HPC Workflow Applications, Middleware and Performance
20 June 2024
Wes Brewer
Ana Gainaru
Frédéric Suter
Feiyi Wang
M. Emani
S. Jha
Papers citing
"AI-coupled HPC Workflow Applications, Middleware and Performance"
30 / 30 papers shown
Title
Optimizing Distributed Training on Frontier for Large Language Models
Sajal Dash
Isaac Lyngaas
Junqi Yin
Xiao Wang
Romain Egele
Guojing Cong
Feiyi Wang
Prasanna Balaprakash
ALM
MoE
163
16
0
20 Dec 2023
Distributed intelligence on the Edge-to-Cloud Continuum: A systematic literature review
Daniel Rosendo
Alexandru Costan
P. Valduriez
Gabriel Antoniu
54
84
0
29 Apr 2022
PaLM: Scaling Language Modeling with Pathways
Aakanksha Chowdhery
Sharan Narang
Jacob Devlin
Maarten Bosma
Gaurav Mishra
...
Kathy Meier-Hellstern
Douglas Eck
J. Dean
Slav Petrov
Noah Fiedel
PILM
LRM
535
6,301
0
05 Apr 2022
Pathways: Asynchronous Distributed Dataflow for ML
P. Barham
Aakanksha Chowdhery
J. Dean
Sanjay Ghemawat
Steven Hand
...
Parker Schuh
Ryan Sepassi
Laurent El Shafey
C. A. Thekkath
Yonghui Wu
GNN
MoE
115
132
0
23 Mar 2022
Simulation Intelligence: Towards a New Generation of Scientific Methods
Alexander Lavin
D. Krakauer
Hector Zenil
Justin Emile Gottschlich
Tim Mattson
...
A. Hanuka
Manuela Veloso
Samuel A. Assefa
Stephan Zheng
Avi Pfeffer
135
112
0
06 Dec 2021
Colossal-AI: A Unified Deep Learning System For Large-Scale Parallel Training
Yongbin Li
Hongxin Liu
Zhengda Bian
Boxiang Wang
Haichen Huang
Fan Cui
Chuan-Qing Wang
Yang You
GNN
96
148
0
28 Oct 2021
MLPerf HPC: A Holistic Benchmark Suite for Scientific Machine Learning on HPC Systems
S. Farrell
M. Emani
J. Balma
L. Drescher
Aleksandr Drozd
...
Akihiro Tabuchi
V. Vishwanath
Mohamed Wahib
Masafumi Yamazaki
Junqi Yin
VLM
66
37
0
21 Oct 2021
Colmena: Scalable Machine-Learning-Based Steering of Ensemble Simulations for High Performance Computing
Logan T. Ward
Ganesh Sivaraman
J. G. Pauloski
Y. Babuji
Ryan Chard
...
R. Assary
Kyle Chard
L. Curtiss
R. Thakur
Ian Foster
51
39
0
06 Oct 2021
Bridging Data Center AI Systems with Edge Computing for Actionable Information Retrieval
Zhengchun Liu
Ahsan Ali
Peter Kenesei
Antonino Miceli
Hemant Sharma
...
Naoufal Layad
Jana Thayer
R. Herbst
Chun Hong Yoon
Ian Foster
52
23
0
28 May 2021
Efficient Large-Scale Language Model Training on GPU Clusters Using Megatron-LM
Deepak Narayanan
Mohammad Shoeybi
Jared Casper
P. LeGresley
M. Patwary
...
Prethvi Kashinkunti
J. Bernauer
Bryan Catanzaro
Amar Phanishayee
Matei A. Zaharia
MoE
138
707
0
09 Apr 2021
tf.data: A Machine Learning Data Processing Framework
D. Murray
Jiří Šimša
Ana Klimovic
Ihor Indyk
PINN
AI4CE
LMTD
104
89
0
28 Jan 2021
Clairvoyant Prefetching for Distributed Machine Learning I/O
Nikoli Dryden
Roman Böhringer
Tal Ben-Nun
Torsten Hoefler
61
57
0
21 Jan 2021
RLlib Flow: Distributed Reinforcement Learning is a Dataflow Problem
Eric Liang
Zhanghao Wu
Michael Luo
Sven Mika
Joseph E. Gonzalez
Ion Stoica
AI4CE
72
12
0
25 Nov 2020
IMPECCABLE: Integrated Modeling PipelinE for COVID Cure by Assessing Better LEads
Aymen Al-Saadi
Dario Alfè
Y. Babuji
Agastya P. Bhati
Ben Blaiszik
...
A. Tsaris
Matteo Turilli
H. V. Dam
S. Wan
David Wifling
58
31
0
13 Oct 2020
BraggNN: Fast X-ray Bragg Peak Analysis Using Deep Learning
Zhengchun Liu
H. Sharma
Jun-Sang Park
Peter Kenesei
Antonino Miceli
J. Almer
R. Kettimuthu
Ian Foster
35
38
0
18 Aug 2020
Kafka-ML: connecting the data stream with ML/AI frameworks
Cristian Martín
Peter Langendoerfer
Pouya Soltani Zarrin
M. Díaz
B. Rubio
26
43
0
07 Jun 2020
funcX: A Federated Function Serving Fabric for Science
Ryan Chard
Y. Babuji
Zhuozhao Li
Tyler J. Skluzacek
A. Woodard
Ben Blaiszik
Ian Foster
Kyle Chard
121
191
0
07 May 2020
Building high accuracy emulators for scientific simulations with deep neural architecture search
M. F. Kasim
D. Watson‐Parris
L. Deaconu
Sophy Oliver
P. Hatfield
...
S. Khatiwala
J. Korenaga
J. Topp-Mugglestone
E. Viezzer
S. Vinko
AI4CE
44
95
0
17 Jan 2020
MLPerf Training Benchmark
Arya D. McCarthy
Christine Cheng
Cody Coleman
Greg Diamos
Paulius Micikevicius
...
Carole-Jean Wu
Lingjie Xu
Masafumi Yamazaki
C. Young
Matei A. Zaharia
103
316
0
02 Oct 2019
Exascale Deep Learning to Accelerate Cancer Research
Robert M. Patton
J. T. Johnston
Steven R. Young
Catherine D. Schuman
T. Potok
...
Junghoon Chae
L. Hou
Shahira Abousamra
Dimitris Samaras
Joel H. Saltz
37
15
0
26 Sep 2019
DeepDriveMD: Deep-Learning Driven Adaptive Molecular Simulations for Protein Folding
Hyungro Lee
Heng Ma
Matteo Turilli
D. Bhowmik
S. Jha
A. Ramanathan
AI4CE
48
71
0
17 Sep 2019
Parsl: Pervasive Parallel Programming in Python
Y. Babuji
A. Woodard
Zhuozhao Li
Daniel S. Katz
Ben Clifford
...
Ryan Chard
Justin M. Wozniak
Ian Foster
Michael Wilde
Kyle Chard
MoE
62
253
0
06 May 2019
Scalable Deep Learning on Distributed Infrastructures: Challenges, Techniques and Tools
R. Mayer
Hans-Arno Jacobsen
GNN
74
190
0
27 Mar 2019
Deep Active Learning with a Neural Architecture Search
Yonatan Geifman
Ran El-Yaniv
AI4CE
71
45
0
19 Nov 2018
Horovod: fast and easy distributed deep learning in TensorFlow
Alexander Sergeev
Mike Del Balso
102
1,222
0
15 Feb 2018
Regularized Evolution for Image Classifier Architecture Search
Esteban Real
A. Aggarwal
Yanping Huang
Quoc V. Le
185
3,039
0
05 Feb 2018
Efficient Processing of Deep Neural Networks: A Tutorial and Survey
Vivienne Sze
Yu-hsin Chen
Tien-Ju Yang
J. Emer
AAML
3DV
120
3,028
0
27 Mar 2017
PointNet: Deep Learning on Point Sets for 3D Classification and Segmentation
C. Qi
Hao Su
Kaichun Mo
Leonidas Guibas
3DH
3DPC
3DV
PINN
500
14,371
0
02 Dec 2016
Neural Architecture Search with Reinforcement Learning
Barret Zoph
Quoc V. Le
484
5,385
0
05 Nov 2016
Hyperband: A Novel Bandit-Based Approach to Hyperparameter Optimization
Lisha Li
Kevin Jamieson
Giulia DeSalvo
Afshin Rostamizadeh
Ameet Talwalkar
240
2,336
0
21 Mar 2016