Recent Developments in LLaMA Models
Following the release of LLaMA 3 70B, anticipation grew for the much-discussed 400B model, which was expected to rival GPT-4 while keeping open weights. Speculation circulated that this larger model might not be freely accessible and could instead be offered through a subscription service. Surprisingly, however, it was not the anticipated 400B model that surfaced, but an enhanced version known as LLaMA 3.1.
The most significant upgrade in version 3.1 is the expansion of the context window, which has grown dramatically from 8K to 128K tokens. Initial benchmarks suggest that the 8B version of LLaMA 3.1 outperforms LLaMA 3 70B, while the newly introduced 405B model stands as a formidable competitor to GPT-4o.
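To put the 405B figure in perspective, a quick back-of-envelope calculation shows the memory needed just to hold the weights at common precisions (activations, KV cache, and runtime overhead are extra; the function name and 1 GB = 10^9 bytes convention are my own choices here):

```python
def weight_gb(num_params: float, bytes_per_param: float) -> float:
    """Rough weight-storage footprint in GB (1 GB = 1e9 bytes)."""
    return num_params * bytes_per_param / 1e9

PARAMS_405B = 405e9

# Approximate weight memory at common precisions:
print(weight_gb(PARAMS_405B, 2))    # fp16/bf16 -> 810.0 GB
print(weight_gb(PARAMS_405B, 1))    # int8      -> 405.0 GB
print(weight_gb(PARAMS_405B, 0.5))  # int4      -> 202.5 GB
```

Even at 4-bit quantization, the 405B model needs on the order of 200 GB for weights alone, which is why it remains out of reach for single consumer GPUs.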
It appears that the 70B and 8B models were derived from the 405B model through a distillation process. This approach may slightly diminish their quality compared to models trained natively at those sizes; benchmark results nonetheless indicate that they still surpass the previous LLaMA 3 70B.
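Meta has not published the exact recipe, but the core idea of distillation is that the smaller student model is trained to match the larger teacher's softened output distribution rather than only the hard labels. A minimal sketch of the standard soft-target loss (temperature value and function names are illustrative, not Meta's):

```python
import math

def softmax(logits, temperature=1.0):
    # Temperature-scaled softmax; higher temperature softens the distribution.
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    # KL(teacher || student) over temperature-softened distributions:
    # the classic soft-target term in knowledge distillation.
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

# The loss is zero when the student exactly matches the teacher,
# and positive otherwise:
print(distillation_loss([1.0, 2.0, 3.0], [1.0, 2.0, 3.0]))
print(distillation_loss([1.0, 2.0, 3.0], [3.0, 2.0, 1.0]) > 0)
```

In practice this soft-target term is usually mixed with the ordinary cross-entropy loss on the training labels; the sketch shows only the distillation component.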
Unfortunately, the Hugging Face repositories for the 8B and 70B models were swiftly removed, leaving users without working links. Hopefully, members of the community will share relevant resources in the comments.
- Discussion on Reddit regarding the 405B model can be found here: Reddit Discussion
- The 405B model is available on Hugging Face: Hugging Face Repository