In May of this year, we shared that IBM is focusing on code generation with its open-source Granite generative AI models and introduced its InstructLab initiative. Now, IBM has announced the release of the new Granite 3.0 8B and Granite 3.0 2B models under the Apache 2.0 license.

The company also unveiled Mixture of Experts (MoE) models: Granite 3.0 3B A800M Instruct, Granite 3.0 1B A400M Instruct, Granite 3.0 3B A800M Base, and Granite 3.0 1B A400M Base. In addition, IBM introduced a new group of guardrail models, Granite Guardian 3.0 8B and Granite Guardian 3.0 2B, which stand out for their optimized safety and security features.

According to Rob Thomas, IBM's Senior Vice President and Chief Commercial Officer, the company's generative AI business has now surpassed $2 billion in volume, as noted during its most recent earnings call. Reflecting on his 25 years at IBM, Thomas said he has never seen a business area grow this quickly.

The Granite 3.0 model family targets enterprise use cases such as customer service, IT automation, Business Process Outsourcing (BPO), application development, and cybersecurity.

Training Data

The new Granite 3.0 models were trained by IBM's centralized data model factory team, which is responsible for sourcing and organizing the data IBM uses for training. The models were trained with a new two-stage method on data spanning 12 natural languages and 116 programming languages. According to Dario Gil, IBM's Senior Vice President and Director of IBM Research, the training process involved 12 trillion tokens of data.

Benchmarking Metrics

According to the company, on core enterprise tasks involving retrieval-augmented generation (RAG), tool use, and cybersecurity, the Granite 3.0 8B Instruct model leads, on average, similarly sized open-source models from Mistral and Meta.

IBM also reports that on the standard academic benchmarks of Hugging Face's Open LLM Leaderboard, the Granite 3.0 8B Instruct model leads, on average, state-of-the-art open-source models of similar size from Meta and Mistral. However, we observe that Llama 3.2 3B surpasses Granite 3.0 2B on the MMLU (Massive Multitask Language Understanding), MMLU-Pro, and AGI-Eval benchmarks, and that Llama 3.1 8B outperforms Granite 3.0 8B on MMLU.

The company also states that on the AttaQ safety benchmark, the Granite 3.0 8B Instruct model leads the models from Meta and Mistral across all measured safety dimensions. More detailed comparisons can be found in the technical documentation for the new IBM language models, which is worth reading before testing them.

The new models are available through IBM's watsonx platform, as well as on Amazon Bedrock, Amazon SageMaker, and Hugging Face. By the end of the year, the Granite 3.0 8B and 2B language models are expected to gain support for an expanded 128K-token context window and multimodal document-understanding capabilities.
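Because the models are published under Apache 2.0 on Hugging Face, a quick local test can be run with the transformers library. The snippet below is a minimal sketch, assuming the checkpoints follow IBM's usual ibm-granite naming on the Hub and expose a standard chat template; verify the exact model ID and hardware requirements before running.

```python
# Minimal sketch: loading a Granite 3.0 instruct model with Hugging Face transformers.
# The model ID is an assumption based on IBM's "ibm-granite" Hub organization; check the Hub for the exact name.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "ibm-granite/granite-3.0-2b-instruct"  # assumed model ID; verify on the Hub

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # half precision, if the hardware supports it
    device_map="auto",           # place layers on available devices automatically
)

# Instruct models are prompted through the tokenizer's chat template.
messages = [{"role": "user", "content": "Summarize the key features of Granite 3.0."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=200)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```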


Source