Meta has this week released an open-source version of its LLM, Llama 2, for public use. The large language model (LLM) can be used to create a ChatGPT-like chatbot.
Many believe that Llama 2 is the industry’s most important release since ChatGPT in November 2022.
Llama 2 is an updated version of Llama 1, trained on a new mix of publicly available data. Compared with Llama 1, Meta increased the size of the pretraining corpus by 40%, doubled the context length of the model, and adopted grouped-query attention. Llama 2 was released in 7B, 13B, and 70B parameter variants.
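Grouped-query attention (GQA) shrinks the key/value cache that grows with context length by letting several query heads share one key/value head. A minimal sketch of the memory arithmetic, using illustrative 70B-scale settings (64 query heads, 8 KV heads, head dimension 128, 80 layers, fp16) that are assumptions here rather than figures from this post:

```python
# Sketch: KV-cache size for standard multi-head attention (MHA), where
# every query head has its own K/V head, vs. grouped-query attention
# (GQA), where a small number of K/V heads are shared across query heads.

def kv_cache_bytes(n_kv_heads, head_dim, n_layers, seq_len, bytes_per_elem=2):
    # Two cached tensors per layer (keys and values), fp16 by default.
    return 2 * n_layers * seq_len * n_kv_heads * head_dim * bytes_per_elem

# Illustrative settings; seq_len=4096 reflects the doubled context length.
mha = kv_cache_bytes(n_kv_heads=64, head_dim=128, n_layers=80, seq_len=4096)
gqa = kv_cache_bytes(n_kv_heads=8, head_dim=128, n_layers=80, seq_len=4096)

print(f"MHA cache: {mha / 2**30:.2f} GiB")  # 10.00 GiB
print(f"GQA cache: {gqa / 2**30:.2f} GiB")  # 1.25 GiB
print(f"Reduction: {mha // gqa}x")          # 8x
```

With these numbers the cache shrinks by the ratio of query heads to KV heads (64/8 = 8x), which is what makes long-context inference cheaper at the largest model size.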
Llama 2-Chat is a fine-tuned version of Llama 2 that is optimized for dialogue use cases. It is likewise available in 7B, 13B, and 70B parameter variants.
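Because Llama 2-Chat was fine-tuned on a specific dialogue format, prompts should follow its template of `[INST]`/`[/INST]` blocks with an optional `<<SYS>>` system prompt. A minimal single-turn sketch, based on Meta's reference code (tokenizer-added tokens such as `<s>` are omitted here):

```python
# Sketch of the single-turn prompt template Llama 2-Chat expects:
# the system prompt is wrapped in <<SYS>> tags inside the first [INST] block.

B_INST, E_INST = "[INST]", "[/INST]"
B_SYS, E_SYS = "<<SYS>>\n", "\n<</SYS>>\n\n"

def build_prompt(system_msg: str, user_msg: str) -> str:
    return f"{B_INST} {B_SYS}{system_msg}{E_SYS}{user_msg} {E_INST}"

prompt = build_prompt("You are a helpful assistant.", "What is Llama 2?")
print(prompt)
```

Sending plain text without this template still produces output, but the model behaves much more like the aligned chat assistant when the template matches its fine-tuning data.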
Pretraining data: The Llama 2 training corpus is a new mix of data from publicly available sources and does not include data from Meta’s products or services. Meta removed data from certain sites known to contain a high volume of personal information about private individuals. The model was trained on 2 trillion tokens of data, a point found to offer a good performance–cost trade-off, up-sampling the most factual sources in an effort to increase knowledge and dampen hallucinations.
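Up-sampling simply means drawing from higher-quality sources more often than their raw size would suggest. A minimal sketch using weighted sampling; the source names and weights below are invented for illustration (Meta's actual mixture weights are not public):

```python
# Sketch of up-sampling a data mixture: each source gets a sampling
# weight, and weights above 1.0 (relative to the baseline) mean the
# source is seen more often during training.
import random

# Hypothetical mixture: the "factual" source is up-sampled 2:1.
sources = {"factual_corpus": 2.0, "web_crawl": 1.0}

def sample_source(rng: random.Random) -> str:
    names = list(sources)
    weights = [sources[n] for n in names]
    return rng.choices(names, weights=weights, k=1)[0]

rng = random.Random(0)  # fixed seed for reproducibility
draws = [sample_source(rng) for _ in range(10_000)]
frac = draws.count("factual_corpus") / len(draws)
print(f"factual_corpus share: {frac:.3f}")  # close to 2/3
```

In a real pipeline the same idea is applied per-document or per-shard when building training batches, rather than per-draw as in this toy version.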
Fine-tuning: Llama 2-Chat is the result of several months of research and iterative application of alignment techniques, including both instruction tuning and RLHF (reinforcement learning from human feedback), requiring significant computational and annotation resources. RLHF is a training procedure applied to an already fine-tuned language model to further align its behavior with human preferences and instruction following.
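At the heart of RLHF is a reward model trained on human preference pairs: given a chosen and a rejected response, it should score the chosen one higher. The standard objective is a pairwise ranking loss, `-log(sigmoid(r_chosen - r_rejected))`; the Llama 2 paper uses a loss of this form (with an added margin term). A minimal sketch with scalar rewards standing in for a neural reward model's outputs:

```python
# Sketch of the pairwise ranking loss used to train an RLHF reward model.
# The loss is small when the chosen response is scored well above the
# rejected one, and large when the ranking is inverted.
import math

def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))

def reward_ranking_loss(r_chosen: float, r_rejected: float) -> float:
    return -math.log(sigmoid(r_chosen - r_rejected))

good = reward_ranking_loss(r_chosen=2.0, r_rejected=-1.0)  # well-separated
bad = reward_ranking_loss(r_chosen=-1.0, r_rejected=2.0)   # misranked
print(f"well-ranked loss: {good:.4f}, misranked loss: {bad:.4f}")
```

Once trained, the reward model's score is used as the optimization target for the policy (e.g. via PPO or rejection sampling), which is what pushes the chat model toward responses humans prefer.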