Neat tricks
- Immediately (asynchronously) prefetch the next batch while the model is doing the forward pass on the GPU
- Pinning the batches to memory in the prefetching function `get_batch` allows us to move them to the GPU asynchronously (non_blocking=True) and faster (thanks to pinning) - see the sketch after this list
- Flush the gradients as soon as we can (i.e. right after optimizer.step()); no need to hold that memory any longer
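A minimal sketch of these tricks together, in the spirit of the notes above. The toy model, optimizer, hyperparameters, and random `data` array are illustrative stand-ins, not from the original code; only the pin-memory / non_blocking transfer, the prefetch placement, and the gradient flush are the point.

```python
import numpy as np
import torch
import torch.nn as nn
import torch.nn.functional as F

# Illustrative stand-ins (assumptions, not from the original notes): toy model, optimizer, random tokens.
vocab_size, block_size, batch_size = 1000, 64, 16
device = "cuda" if torch.cuda.is_available() else "cpu"
data = np.random.randint(0, vocab_size, size=100_000).astype(np.uint16)  # would be a memmap in practice
model = nn.Sequential(nn.Embedding(vocab_size, 128), nn.Linear(128, vocab_size)).to(device)
optimizer = torch.optim.AdamW(model.parameters(), lr=3e-4)

def get_batch(data):
    # Sample a batch of (input, target) windows from the flat token stream.
    ix = torch.randint(len(data) - block_size, (batch_size,)).tolist()
    x = torch.stack([torch.from_numpy(data[i:i + block_size].astype(np.int64)) for i in ix])
    y = torch.stack([torch.from_numpy(data[i + 1:i + 1 + block_size].astype(np.int64)) for i in ix])
    if device == "cuda":
        # Pin the host memory so the copy to the GPU can be asynchronous (non_blocking=True).
        x = x.pin_memory().to(device, non_blocking=True)
        y = y.pin_memory().to(device, non_blocking=True)
    else:
        x, y = x.to(device), y.to(device)
    return x, y

x, y = get_batch(data)
for step in range(100):
    logits = model(x)
    loss = F.cross_entropy(logits.view(-1, vocab_size), y.view(-1))
    # Prefetch the next batch now: the CUDA kernels above are queued asynchronously,
    # so the CPU can prepare data while the GPU is still busy.
    x, y = get_batch(data)
    loss.backward()
    optimizer.step()
    optimizer.zero_grad(set_to_none=True)  # flush gradients right after the step; no need to hold that memory
```

Prefetching right after the forward call works because CUDA launches are asynchronous from the CPU's point of view, so building the next batch overlaps with the GPU's compute.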
Loading extremely large files when there is not enough RAM
- Use `numpy.memmap` to read a file on disk and treat it as if it were in RAM. You just need to chunk the data beforehand and write it into a binary file or .txt (see the sketch after this list)
- Pinned memory helps with data transfer times when calling x.cuda() (https://developer.nvidia.com/blog/how-optimize-data-transfers-cuda-cc/)
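A minimal sketch of the memmap approach. The file name, dtype, and random token ids are illustrative assumptions; in practice the .bin file would be written once by a preprocessing script that tokenizes the dataset in chunks.

```python
import numpy as np

# Preprocessing (done once): write token ids to a flat binary file.
# Here the ids are random stand-ins; a real script would append chunk by chunk.
tokens = np.random.randint(0, 50_000, size=1_000_000).astype(np.uint16)
tokens.tofile("train.bin")

# Training time: memory-map the file. It behaves like a normal numpy array,
# but pages are read from disk on demand instead of loading everything into RAM.
data = np.memmap("train.bin", dtype=np.uint16, mode="r")
print(data.shape, data[:10])  # indexing and slicing work as usual
```

A `get_batch`-style function like the one sketched earlier can consume this memmapped array directly, since only the sampled slices are ever materialized in memory.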