Meta’s LLaMA, new kid on the block

Vivek Pandit
3 min read · Mar 3, 2023


For the past few weeks, the language models built and used by Microsoft, Google, and OpenAI have received a lot of attention in the IT community. But Meta, the parent company of Facebook, has also made major strides in this area and is now releasing LLaMA, a new AI language generator.

Unlike ChatGPT, LLaMA is not openly accessible. It is a research tool, and getting useful output from it takes prompt engineering. LLaMA was created to help researchers advance their work in this area of AI.

Training Data

LLaMA uses only publicly available data: CCNet, C4, GitHub, Wikipedia, books, ArXiv, and Stack Exchange. The majority, 67% of the data, comes from CCNet.

Although LLaMA was trained on text in 20 languages, it is expected to perform better in English than in the others, because English makes up most of the training data. The FAIR team also found that the model’s performance can vary across dialects.

From the research paper

Model Variants

The LLaMA model was created by the FAIR team of Meta AI between December 2022 and February 2023. This first version is an auto-regressive language model built on the transformer architecture.
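"Auto-regressive" means the model produces text one token at a time, feeding each new token back in as input for the next step. The toy sketch below illustrates only that loop. The lookup table `TOY_NEXT_TOKEN` and the function `generate` are made up for illustration and are not part of LLaMA; a real transformer would condition on the whole sequence so far and output a probability distribution over its vocabulary, whereas this toy looks only at the last token.

```python
# Toy illustration of auto-regressive generation (this is NOT LLaMA).
# Hypothetical stand-in for a model: maps the last token to the next one.
TOY_NEXT_TOKEN = {
    "<s>": "LLaMA",
    "LLaMA": "is",
    "is": "a",
    "a": "language",
    "language": "model",
    "model": "</s>",
}


def generate(prompt, max_new_tokens=10):
    """Extend a non-empty list of tokens one token at a time."""
    tokens = list(prompt)
    for _ in range(max_new_tokens):
        # Predict the next token from what has been generated so far.
        next_token = TOY_NEXT_TOKEN.get(tokens[-1], "</s>")
        if next_token == "</s>":  # end-of-sequence marker stops generation
            break
        tokens.append(next_token)  # the output becomes part of the next input
    return tokens


if __name__ == "__main__":
    print(" ".join(generate(["<s>", "LLaMA"])))
```

Prompt engineering, in this picture, amounts to choosing the starting tokens so that the continuation goes where you want it to.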

Smaller models trained on more tokens (word fragments) are easier to retrain and fine-tune for specific potential product use cases. Meta trained LLaMA 65B and LLaMA 33B on 1.4 trillion tokens, and the smallest model, LLaMA 7B, on one trillion tokens.
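To make the "smaller model, more tokens" point concrete, here is a rough back-of-the-envelope calculation of training tokens per parameter, using only the figures quoted above. It treats the nominal sizes (7B, 33B, 65B) as exact parameter counts, which is an approximation, and the names `TRAINING_RUNS`, `tokens_per_parameter` and `summarize` are just for illustration.

```python
# Rough tokens-per-parameter arithmetic from the figures quoted in this post.
# Assumption: nominal sizes (7B, 33B, 65B) are used as exact parameter counts.
TRAINING_RUNS = {
    # name: (parameters, training tokens)
    "LLaMA 7B": (7e9, 1.0e12),
    "LLaMA 33B": (33e9, 1.4e12),
    "LLaMA 65B": (65e9, 1.4e12),
}


def tokens_per_parameter(params, tokens):
    """Return how many training tokens the model saw per parameter."""
    return tokens / params


def summarize(runs=TRAINING_RUNS):
    """Map each model name to its tokens-per-parameter ratio, rounded to 0.1."""
    return {
        name: round(tokens_per_parameter(params, tokens), 1)
        for name, (params, tokens) in runs.items()
    }


if __name__ == "__main__":
    for name, ratio in summarize().items():
        print(f"{name}: ~{ratio} training tokens per parameter")
```

Under these assumptions the 7B model sees roughly 143 tokens per parameter, compared with about 42 for the 33B model and about 22 for the 65B model.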

From the research paper

Comparison with other LLMs

Measured by parameter count, LLaMA is smaller than many other LLMs, but it has been trained on a comparatively large number of tokens.

LLaMA was benchmarked against other popular LLMs on tasks such as common sense reasoning, closed-book question answering, and reading comprehension. The largest LLaMA variant appears to outperform other LLMs on quite a few of these tasks.

Access to the model

As of now, Meta hasn’t opened up the model to the public the way OpenAI has, but it will provide access to academic researchers, civil society, policymakers, and industry. You can fill out the form below, and Meta will review your application for access to LLaMA.

https://docs.google.com/forms/d/e/1FAIpQLSfqNECQnMkycAp2jP4Z9TFX0cGR4uf7b_fBxjY_OjhJILlKGA/viewform
