Ahmad Khan
committed on
Commit
·
03e7b52
1
Parent(s):
96abf36
Fix variable name for extra memory note
src/index.html +1 -1
src/index.html
CHANGED
@@ -452,7 +452,7 @@
 <div class="note-box">
 <p class="note-box-title">📝 Note</p>
 <div class="note-box-content">
-<p>Some libraries store grads in FP32, which would require an additional <d-math>m_{
+<p>Some libraries store grads in FP32, which would require an additional <d-math>m_{grad\_fp32} = 4 * N</d-math> memory. This is done, for example, in Nanotron, because BF16 is lossy for smaller values and we always prioritize stability. See <a href="https://github.com/microsoft/DeepSpeed/issues/1773">this DeepSpeed issue</a> for more information.</p>
 </div>
 </div>
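As a rough illustration of the formula in the changed note, the sketch below computes the extra memory for an FP32 gradient copy, m_grad_fp32 = 4 * N, for a parameter count N. The function names and the 7B-parameter example are hypothetical and chosen only for illustration. The note does not state units, so bytes are assumed here, since an FP32 value occupies 4 bytes.

```python
def fp32_grad_memory_bytes(num_params: int) -> int:
    """Extra memory for storing gradients in FP32.

    Assumes 4 bytes per FP32 value, i.e. m_grad_fp32 = 4 * N bytes.
    """
    if num_params < 0:
        raise ValueError("num_params must be non-negative")
    return 4 * num_params


def to_gib(num_bytes: int) -> float:
    """Convert a byte count to GiB (2**30 bytes)."""
    return num_bytes / 2**30


if __name__ == "__main__":
    # Hypothetical model size, used only as an example.
    n = 7_000_000_000
    extra = fp32_grad_memory_bytes(n)
    print(f"N = {n:,} parameters -> {extra:,} bytes ({to_gib(extra):.1f} GiB) of FP32 grads")
```

This counts only the FP32 gradient buffer described in the note. Parameters, optimizer states, and activations are separate terms and are not modeled here.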