OpenAI thoughts
Others
- General scaling
- (TBD) Scaling Laws and Compute-Optimal Training Beyond Fixed Training Durations
  - revisits scaling laws without a cosine schedule, using only warm-up, a constant learning rate, and a cooldown (see the schedule sketch after this list)
  - allows reuse of previous training runs, since a checkpoint from the constant phase can be cooled down later instead of retraining from scratch
- (TBD) Unraveling the Mystery of Scaling Laws: Part I
- (TBD) Tele-FLM Technical Report
- (TBD) Language models scale reliably with over-training and on downstream tasks
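The warm-up / constant / cooldown idea above is essentially a trapezoidal learning rate schedule. Below is a minimal sketch of one way to write it; the function name and parameters (`wsd_lr`, `peak_lr`, `warmup_steps`, `cooldown_steps`) are illustrative assumptions, not taken from the paper.

```python
def wsd_lr(step: int, total_steps: int, peak_lr: float = 3e-4,
           warmup_steps: int = 2000, cooldown_steps: int = 1000) -> float:
    """Learning rate at `step`: linear warm-up, constant plateau, linear cooldown.

    Illustrative sketch of a warm-up/constant/cooldown schedule; parameter
    values are placeholders, not the paper's settings.
    """
    if step < warmup_steps:
        # Linear warm-up from 0 to peak_lr.
        return peak_lr * step / warmup_steps
    if step < total_steps - cooldown_steps:
        # Constant plateau; a checkpoint taken here can later be cooled down
        # for a longer target duration, reusing the training done so far.
        return peak_lr
    # Linear cooldown to 0 over the final cooldown_steps.
    remaining = total_steps - step
    return peak_lr * max(remaining, 0) / cooldown_steps


if __name__ == "__main__":
    total = 10_000
    for s in (0, 1_000, 2_000, 5_000, 9_500, 10_000):
        print(s, round(wsd_lr(s, total), 6))
```

Because the plateau is flat, extending a run only changes where the cooldown starts, which is what makes intermediate checkpoints reusable across different training durations.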