Last week, Meta announced LLaMA, its latest stab at making a GPT-style “large language model”*. If AI is the future of tech, then big tech companies need to control their own models or be left behind by the competition. LLaMA joins OpenAI’s GPT (licensed by Microsoft for Bing and underpinning OpenAI’s own ChatGPT) and Google’s LaMDA (which will power Bard, its ChatGPT rival) in the upper echelons of the field.
Meta’s goal wasn’t simply to replicate GPT. It says that LLaMA is a “smaller, more performant model” than its peers, built to achieve the same feats of comprehension and articulation with a smaller compute footprint*, and thus with a correspondingly smaller environmental impact. (The fact that it’s cheaper to run doesn’t hurt, either.)
But the company also sought to differentiate itself in another way, by making LLaMA “open”, implicitly pointing out that despite its branding, “OpenAI” is anything but. From its announcement:
Even with all the recent advancements in large language models, full research access to them remains limited because of the resources that are required to train and run such large models. This restricted access has limited researchers’ ability to understand how and why these large language models work, hindering progress on efforts to improve their robustness and mitigate known issues, such as bias, toxicity, and the potential for generating misinformation.
By sharing the code for LLaMA, other researchers can more easily test new approaches to limiting or eliminating these problems in large language models.
By releasing LLaMA for researchers to use, Meta has cut out one of the key limits on academic AI research: the vast cost of training an LLM*. Three years ago, each training run of …