Leslie D'Monte

The Generative AI race just got hotter with Meta releasing Llama 2, the second version of its free, open-source large language model, for research and commercial use. The move offers an alternative to pricey proprietary LLMs such as OpenAI's ChatGPT Plus and Google's Bard, and gives open-source LLMs a boost. Developers had already flocked to LLaMA, Meta's open-source LLM released in February (https://ai.meta.com/blog/large-language-model-llama-meta-ai/); researchers made more than 100,000 requests for Llama 1, according to Meta.
LLaMA requires "far less computing power and resources to test new approaches, validate others' work, and explore new use cases", according to Meta. Meta made LLaMA available in several sizes (7B, 13B, 33B, and 65B parameters, where B stands for billion) and also shared a LLaMA model card detailing how it built the model, in contrast to the relative lack of transparency at OpenAI. OpenAI's GPT-3 (Generative Pre-trained Transformer 3), by comparison, has 175 billion parameters, while GPT-4 was rumoured to have launched with 100 trillion parameters, a claim dismissed by OpenAI CEO Sam Altman.
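For readers curious what "open for research and commercial use" looks like in practice, here is a minimal sketch of loading the smallest Llama 2 checkpoint with Hugging Face's transformers library. The repository name meta-llama/Llama-2-7b-hf is an assumption about how the weights are hosted (access is gated behind acceptance of Meta's license) and is not described in this article.

```python
# A minimal sketch (not from the article): loading a 7B-parameter Llama 2
# checkpoint via Hugging Face transformers. Assumes the gated repository
# "meta-llama/Llama-2-7b-hf" and that Meta's license has been accepted.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Llama-2-7b-hf"  # assumed repo id; access is gated

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

prompt = "Open-source large language models are"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```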
Foundation models train on a large set of unlabelled data, which makes them ideal for fine-tuning on a wide variety of tasks. For instance, ChatGPT, based on GPT-3.5, was trained on 570GB of text data from the internet containing hundreds of billions of words, harvested from books, articles, and websites, including social media. However, according to Meta, smaller models trained on more tokens (pieces of words) are easier to re-train and fine-tune for specific potential product use cases.
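Since the trade-off between parameter count and training tokens is central to Meta's argument, it helps to see what a token actually is. The sketch below uses OpenAI's open-source tiktoken library, chosen here purely for illustration; Llama models actually use a different, SentencePiece-based tokenizer that the article does not discuss.

```python
# A minimal sketch of how text breaks into tokens ("pieces of words").
# tiktoken is OpenAI's open-source tokenizer, used here only to illustrate
# the concept; it is not the tokenizer Meta's Llama models use.
import tiktoken

encoding = tiktoken.get_encoding("cl100k_base")  # encoding used by GPT-3.5/4

text = "Smaller models trained on more tokens are easier to fine-tune."
token_ids = encoding.encode(text)

print(len(text.split()), "words ->", len(token_ids), "tokens")
# Common words map to single tokens; rarer words split into several pieces.
print([encoding.decode([t]) for t in token_ids])
```

Counting tokens this way is how training-corpus sizes quoted in words translate into the token budgets Meta is referring to.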