Monday, July 10, 2023

What is the Chinchilla Model?

Chinchilla is a 70B-parameter language model from DeepMind, introduced in "Training Compute-Optimal Large Language Models" (Hoffmann et al., 2022). It was trained on 1.4 trillion tokens (roughly 20 tokens per parameter) using approximately the same compute budget as the much larger 280B-parameter Gopher. The key finding is that, for a fixed compute budget, model size and training tokens should be scaled in roughly equal proportion; earlier models like Gopher were undertrained for their size. Not only does Chinchilla outperform its larger counterpart on downstream tasks, but its smaller size considerably reduces inference cost.
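The scaling rule can be sketched numerically. A minimal example, assuming the common approximation that training compute is C ≈ 6·N·D (N parameters, D tokens) and the Chinchilla rule of thumb of about 20 tokens per parameter (both are simplifications of the paper's fitted scaling laws, not exact results):

```python
def chinchilla_optimal(flop_budget):
    """Estimate a compute-optimal (params, tokens) pair for a FLOP budget.

    Assumptions (simplified from the Chinchilla paper):
      - training compute C ~= 6 * N * D
      - compute-optimal data size D ~= 20 * N
    Substituting gives C ~= 120 * N**2, solved for N below.
    """
    n_params = (flop_budget / 120) ** 0.5
    n_tokens = 20 * n_params
    return n_params, n_tokens

# Chinchilla's implied training budget: 6 * 70e9 params * 1.4e12 tokens
budget = 6 * 70e9 * 1.4e12  # ~5.9e23 FLOPs
params, tokens = chinchilla_optimal(budget)
print(f"params ~ {params / 1e9:.0f}B, tokens ~ {tokens / 1e12:.1f}T")
# -> params ~ 70B, tokens ~ 1.4T
```

Plugging in Chinchilla's own budget recovers its reported configuration, which is exactly the self-consistency the rule of thumb is meant to capture.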



References:

https://sh-tsang.medium.com/brief-review-chinchilla-training-compute-optimal-large-language-models-7e4d00680142

