Meta has officially unveiled its latest suite of artificial intelligence models, Llama 4, introducing a notable progression within its flagship AI family. Released on a Saturday, the new lineup underscores Meta’s aggressive push to stay competitive amid intense rivalry in advanced AI development.
The Llama 4 series is composed of three distinct models: Llama 4 Scout, Llama 4 Maverick, and Llama 4 Behemoth, each trained extensively on large volumes of unlabeled textual, visual, and video data, granting them broad multimodal capabilities. According to Meta, the emergence of highly effective open-source AI solutions like those produced by China’s DeepSeek prompted the tech giant to accelerate its own AI initiatives.
The company reportedly established dedicated “war rooms” to closely study DeepSeek’s innovations, particularly aiming to understand how competitors achieved lower deployment costs for their high-performance AI models. That rapid response has culminated in the early availability of Scout and Maverick, both now accessible via Meta’s Llama.com platform and partners, including widely used infrastructure providers like Hugging Face. Meanwhile, the most powerful model of the three, Behemoth, remains in active training and has yet to be publicly released.
Meta has also incorporated Llama 4 into its widely deployed digital assistant, Meta AI, which powers features in apps such as WhatsApp, Messenger, and Instagram across 40 countries. For now, however, the multimodal capabilities are available only in English and only to users in the United States.
Some developers, however, may find Llama 4's licensing terms restrictive. Specifically, Meta's license explicitly bars businesses and individuals domiciled in the European Union from using the models, likely due to complexities introduced by EU privacy and AI governance rules, regulations that Meta has publicly criticized as overly burdensome. Moreover, companies with more than 700 million monthly active users must obtain a dedicated license from Meta, with the company reserving full authority over approvals.
Meta touts the rollout of Llama 4 as a pivotal moment: it is the company's first model family built on a Mixture of Experts (MoE) architecture. An MoE design routes each input to a small subset of specialist sub-models ("experts") rather than activating the full network, improving efficiency in both training and inference.
Maverick employs this MoE structure with 400 billion total parameters spread across 128 expert sub-models, of which only 17 billion parameters are active for any given input. Meta's internal evaluations position Maverick ahead of models like OpenAI's GPT-4o and Google's Gemini 2.0 on specific tasks, including certain coding, reasoning, multilingual, long-context, and image benchmarks. However, it still trails more recently launched models such as Google's Gemini 2.5 Pro, Anthropic's Claude 3.7 Sonnet, and OpenAI's GPT-4.5 on broader AI tasks.
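The efficiency gain described above comes from routing: a small gating network picks a few experts per input, so most parameters stay idle. The sketch below illustrates the general top-k routing idea with toy dimensions; it is not Meta's implementation, and all sizes and names here are illustrative assumptions.

```python
# Minimal sketch of Mixture-of-Experts routing (illustrative only, not
# Llama 4's actual architecture). A gating network scores each expert,
# and only the top-k experts run for a given token vector.
import numpy as np

rng = np.random.default_rng(0)

D_MODEL, N_EXPERTS, TOP_K = 16, 8, 2  # toy sizes, far smaller than Llama 4's

# Each "expert" is a small feed-forward weight matrix here.
experts = [rng.standard_normal((D_MODEL, D_MODEL)) * 0.1 for _ in range(N_EXPERTS)]
gate_w = rng.standard_normal((D_MODEL, N_EXPERTS)) * 0.1  # router weights

def moe_forward(x):
    """Route a single token vector x through its top-k experts."""
    scores = x @ gate_w                # one routing score per expert
    top = np.argsort(scores)[-TOP_K:]  # indices of the k best-scoring experts
    weights = np.exp(scores[top])
    weights /= weights.sum()           # softmax over the chosen experts only
    # Only the selected experts compute; the other experts are skipped,
    # which is why active parameters are a fraction of total parameters.
    return sum(w * (x @ experts[i]) for w, i in zip(weights, top))

token = rng.standard_normal(D_MODEL)
out = moe_forward(token)
```

With `TOP_K = 2` of 8 experts active, only a quarter of the expert parameters participate in each forward pass, mirroring (at toy scale) how Maverick activates 17 billion of its 400 billion parameters per input.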
Meanwhile, Scout stands out through its exceptional capability in large-document summarization and code analysis, notably possessing a context window that can handle up to 10 million tokens—enabling it to process documents spanning millions of words paired with images. Scout’s practicality is also highlighted by its relatively modest infrastructure requirements, as it can run on a single Nvidia H100 GPU, whereas Maverick demands a significantly more powerful Nvidia H100 DGX system.
Meta’s upcoming Behemoth model, with nearly 2 trillion total parameters and 288 billion active parameters, sets even higher demands on computational resources. Internal assessments demonstrate that Behemoth surpasses GPT-4.5, Gemini 2.0 Pro, and Claude 3.7 Sonnet in several technical and scientific benchmarks—but still falls slightly short of Gemini 2.5 Pro.
Significantly, none of these new Llama models employ a dedicated “reasoning” approach like OpenAI’s latest generation, which systematically verifies and cross-checks responses, resulting in greater reliability but longer response times. Meta suggests that its models, while quicker, may sometimes favor speed over intensive self-validation.
On content moderation, Meta has notably adjusted Llama 4 to tackle “contentious” political and social questions more frequently—and more neutrally—than prior generations. According to company representatives, these models have become intentionally more balanced and less likely to dismiss inquiries outright, reflecting an effort to mitigate concerns about AI models’ perceived political bias.
This shift in moderation policy comes amid growing political sensitivity toward AI content moderation practices, with influential voices close to President Trump, including tech mogul Elon Musk and AI policy advisor David Sacks, accusing major tech firms of embedding politically biased content filters. Meta, echoing industry peers like OpenAI, appears intent on addressing criticisms of perceived ideological censoring while acknowledging the technical difficulty of removing bias from AI content generation entirely.
In its announcement, Meta emphasized that Llama 4 is merely the starting point for a rapidly evolving AI ecosystem, paving the way for robust multimodal capabilities that bridge conventional language and image models. The company intends to keep refining Scout, Maverick, and Behemoth, viewing the Llama 4 models as a foundation upon which future AI products and innovations will be built.