Pretraining Language Models to Ponder in Continuous Space

    LRM

Papers citing "Pretraining Language Models to Ponder in Continuous Space"

Computation vs. Communication Scaling for Future Transformers on Future Hardware
Suchita Pati
Shaizeen Aga
Mahzabeen Islam
Nuwan Jayasena
Matthew D. Sinclair
06 Feb 2023
