May 19, 2025 · 1 min read
https://www.essential.ai/blog/infra layer sharding for large scale training with muon