IBM has announced the launch of its most advanced family of AI models to date, Granite 3.0. IBM's third generation of Granite language models matches or outperforms similarly sized models from leading vendors on many academic and industry benchmarks, demonstrating strong performance, transparency, and safety.
In line with the company's commitment to open-source AI, Granite models are released under the permissive Apache 2.0 license, which makes them unique in their combination of performance, flexibility, and autonomy for corporate clients and the community at large.
Granite 3.0 family
General language/use: Granite 3.0 8B-Instruct, Granite 3.0 2B-Instruct, Granite 3.0 8B Base, Granite 3.0 2B Base
Guardrails and safety: Granite Guardian 3.0 8B, Granite Guardian 3.0 2B
Mixture of experts (MoE): Granite 3.0 3B A800M Instruct, Granite 3.0 1B A400M Instruct, Granite 3.0 3B A800M Base, Granite 3.0 1B A400M Base
The new Granite 3.0 8B and 2B language models have been designed as "workhorses" for enterprise AI, offering high performance and economy in tasks such as retrieval-augmented generation (RAG), classification, summarization, entity extraction, and tool use. These compact, versatile models are designed to be fine-tuned on corporate data and integrated seamlessly into any business environment or workflow.
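The RAG workflow mentioned above can be sketched in a few lines. This is an illustrative toy, not IBM's implementation: the word-overlap retriever and prompt format are assumptions, and in practice the final prompt would be sent to a Granite 3.0 instruct model rather than inspected directly.

```python
# Minimal retrieval-augmented generation (RAG) sketch with a toy retriever.
import re
from collections import Counter

def tokens(text: str) -> list[str]:
    """Lowercase word tokens with punctuation stripped."""
    return re.findall(r"\w+", text.lower())

def score(query: str, doc: str) -> int:
    """Toy relevance score: occurrences of query words in the document."""
    doc_words = Counter(tokens(doc))
    return sum(doc_words[w] for w in set(tokens(query)))

def retrieve(query: str, corpus: list[str], k: int = 2) -> list[str]:
    """Return the k documents most relevant to the query."""
    return sorted(corpus, key=lambda d: score(query, d), reverse=True)[:k]

def build_prompt(query: str, corpus: list[str]) -> str:
    """Ground the model's answer in the retrieved documents."""
    context = "\n".join(retrieve(query, corpus))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

corpus = [
    "Granite 3.0 models are released under the Apache 2.0 license.",
    "The quarterly report covers revenue for the EMEA region.",
    "Granite Guardian models screen prompts and responses for risk.",
]
prompt = build_prompt("Which license covers Granite 3.0?", corpus)
```

A production system would replace the toy scorer with vector search and pass `prompt` to the language model; the grounding pattern stays the same.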
Although many large language models (LLMs) are trained on public data, much corporate data remains untapped. By combining a small Granite model with corporate data, especially using the InstructLab alignment technique introduced by IBM and Red Hat in May, IBM believes that companies can achieve task-specific performance that rivals larger models at a fraction of the cost (with observed costs 3 to 23 times lower than those of large, state-of-the-art models in several early proofs of concept).
The launch of Granite 3.0 reaffirms IBM's commitment to transparency, security, and trust in AI products. The Granite 3.0 white paper and responsible use guide describe the datasets used to train these models, detail the filtering, cleaning, and curation steps applied, and provide comprehensive results of model performance on major academic and business benchmarks.
Crucially, IBM offers an intellectual property (IP) indemnity for all Granite models on watsonx.ai, allowing corporate clients greater confidence when combining their data with these models.
Testing Granite 3.0
The Granite 3.0 language models also show promising results regarding raw performance.
On the standard academic benchmarks of Hugging Face's OpenLLM Leaderboard, the Granite 3.0 8B Instruct model's overall performance is, on average, superior to similarly sized open-source models from Meta and Mistral. On IBM's AttaQ safety benchmark, the Granite 3.0 8B Instruct model leads in all measured safety dimensions compared to the Meta and Mistral models.
On the core enterprise tasks of RAG, tool use, and cybersecurity, the Granite 3.0 8B Instruct model outperforms, on average, the open-source Mistral and Meta models of similar size.
The Granite 3.0 models were trained on more than 12 trillion tokens of data spanning 12 natural languages and 116 programming languages, using a new two-stage training method that draws on the results of several thousand experiments designed to optimize data quality, data selection, and training parameters. By the end of the year, the 8B and 2B language models are also expected to gain support for an extended 128K context window and multimodal document understanding features.
Demonstrating a strong balance between inference performance and cost, IBM is offering its Granite mixture-of-experts (MoE) models, Granite 3.0 1B A400M and Granite 3.0 3B A800M, as smaller, lightweight models that can serve low-latency applications as well as CPU-based deployments.
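Why a 3B-parameter MoE model with only 800M active parameters is cheap at inference can be illustrated with a toy top-1 router: each input activates only one expert, so compute scales with active rather than total parameters. This is a generic sketch of the MoE idea, not the Granite architecture.

```python
# Toy mixture-of-experts routing: only the top-scoring expert runs per input,
# so compute scales with *active* parameters, not total parameters.
# Generic illustration only -- not the Granite MoE architecture.

def expert_double(x: float) -> float:
    return 2 * x

def expert_square(x: float) -> float:
    return x * x

EXPERTS = [expert_double, expert_square]

def router(x: float) -> int:
    """Toy gate: route small inputs to expert 0, large ones to expert 1."""
    return 0 if x < 10 else 1

def moe_forward(x: float) -> float:
    # Only one expert (a fraction of the model's parameters) is evaluated.
    return EXPERTS[router(x)](x)
```

In a real MoE language model the gate is a learned layer that picks the top-k experts per token, but the cost argument is the same: most parameters sit idle on any given forward pass.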
IBM has also announced an updated version of its pre-trained Granite Time Series models, the first versions of which launched earlier this year. These new models are trained on three times as much data and perform strongly on the main time series benchmarks, outperforming, on average, models up to 10 times their size from Google and Alibaba. The updated models also offer greater flexibility, with support for external variables and continuous forecasting.
A new era of responsible AI
As part of this release, IBM has also introduced a new family of Granite Guardian models that allow application developers to implement guardrails by screening user prompts and LLM responses for a variety of risks. The Granite Guardian 3.0 8B and 2B models offer the most comprehensive set of risk and harm detection capabilities available today.
In addition to harm dimensions such as social bias, hate, toxicity, profanity, violence, and jailbreaking, these models also offer several unique RAG-specific checks such as groundedness, context relevance, and answer relevance. In extensive testing across 19 safety and RAG benchmarks, the Granite Guardian 3.0 8B model detected harms more accurately than all three generations of Meta's Llama Guard models. Its performance in detecting hallucinations is on par with the specialized WeCheck and MiniCheck models.
Although Granite Guardian models are derived from the corresponding Granite language models, they can be used to implement guardrails alongside any AI model, whether open or proprietary.
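The guardrail pattern described above, screening both the user prompt and the model's response before anything reaches the user, can be sketched generically. The `guardian` and `llm` callables here are hypothetical stand-ins, not the Granite Guardian API.

```python
# Generic guardrail wrapper around any text-generation model.
# `llm` and `guardian` are stand-in callables, not the Granite Guardian API.
from typing import Callable

def guarded_generate(prompt: str,
                     llm: Callable[[str], str],
                     guardian: Callable[[str], bool]) -> str:
    """Check the prompt and the response; block if either is flagged."""
    if guardian(prompt):
        return "[blocked: risky prompt]"
    response = llm(prompt)
    if guardian(response):
        return "[blocked: risky response]"
    return response

# Stub risk check and model for illustration.
risky = lambda text: "jailbreak" in text.lower()
echo_llm = lambda prompt: f"Echo: {prompt}"
```

Because the guardian is just a separate classifier sitting in front of and behind the generator, the same wrapper works whether `llm` is an open model, a proprietary API, or a Granite model.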
Availability
The entire set of Granite 3.0 models and the updated time series models are available for download from Hugging Face under the permissive Apache 2.0 license. The instruct variants of the new Granite 3.0 8B and 2B language models and the Granite Guardian 3.0 models will be available from Tuesday for commercial use on the IBM watsonx platform. Several Granite 3.0 models will also be available as NVIDIA NIM microservices and through Google Cloud's Vertex AI Model Garden integrations with Hugging Face.
A selected set of Granite 3.0 models is also available on Ollama and Replicate, giving developers easy access and local deployment options.
The latest generation of Granite models extends IBM's robust open-source portfolio of powerful LLMs. IBM has collaborated with ecosystem partners such as AWS, Docker, Domo, Qualcomm Technologies, Inc. through its AI Hub, Salesforce, and SAP, among others, to integrate a variety of Granite models into these partners' offerings or make Granite models available on their platforms, offering more options for companies around the world.