LLaMA, which stands for Large Language Model Meta AI, is a state-of-the-art foundational large language model publicly released by Meta as part of their commitment to open science. The model is designed to help researchers advance their work in this subfield of AI, particularly researchers who do not have access to large-scale compute infrastructure.
Features of LLaMA
- Scalability: LLaMA is available in several sizes (7B, 13B, 33B, and 65B parameters), making it accessible to researchers with varying computational resources. This range of sizes lets researchers test new approaches, validate others’ work, and explore new use cases.
- Training on Large Unlabeled Data: LLaMA is trained on a large corpus of unlabeled data, making it well suited to fine-tuning on a variety of tasks. The largest models, LLaMA 65B and LLaMA 33B, are trained on 1.4 trillion tokens, while the smallest model, LLaMA 7B, is trained on 1 trillion tokens.
- Versatility: As a foundational model, LLaMA is designed to be versatile and can be applied to many different use cases, as opposed to a fine-tuned model that is designed for a specific task. This versatility allows researchers to test new approaches to limiting or eliminating problems in large language models, such as bias, toxicity, and the potential for generating misinformation.
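The parameter counts above give a rough sense of the hardware each size demands. As a sketch, the snippet below estimates the memory needed just to hold each model's weights in fp16; the 2-bytes-per-parameter figure is a standard half-precision assumption, and the estimate ignores activations, KV caches, and optimizer state, so it is an illustrative lower bound rather than an official requirement from Meta.

```python
# Back-of-the-envelope fp16 weight-memory estimate for each LLaMA size.
# Assumption: 2 bytes per parameter (fp16), weights only -- no activations,
# KV cache, or optimizer state. Illustrative, not an official figure.

LLAMA_SIZES_BILLIONS = {"7B": 7, "13B": 13, "33B": 33, "65B": 65}

def fp16_weight_gib(params_billions: float) -> float:
    """Approximate GiB required to store the weights in fp16."""
    total_bytes = params_billions * 1e9 * 2  # 2 bytes per fp16 parameter
    return total_bytes / 2**30

for name, billions in LLAMA_SIZES_BILLIONS.items():
    print(f"LLaMA {name}: ~{fp16_weight_gib(billions):.0f} GiB of fp16 weights")
```

By this estimate, LLaMA 7B fits on a single consumer GPU with roughly 16 GiB of memory, while the 65B model needs a multi-GPU setup, which is the practical meaning of the scalability point above.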