IBM Corp.’s Red Hat subsidiary today released to general availability the latest generation of Red Hat Enterprise Linux AI, a version of the company’s core Linux platform optimized for developing, testing and running artificial intelligence large language models.
RHEL AI provides a foundation model platform for building and running LLMs and includes image mode, which allows users to build, deploy and manage RHEL as a bootable container image. Version 1.3 supports the latest release of the Granite LLMs that IBM announced and released to open source in October. It also has improved data preparation features, expanded choices for hybrid cloud deployment and support for Intel Corp.’s Gaudi 3 AI accelerator.
Citing a recent International Data Corp. report that found 61% of enterprises plan to use open-source foundation models for generative AI, Red Hat said it’s orienting its operating system features to support smaller, open-source-licensed models, fine-tuning capabilities and inference performance engineering.
The new release also pays homage to the company’s corporate parent with support for Granite and InstructLab, an initiative birthed at IBM that aims to accelerate open-source contributions to generative AI development.
The new release supports Granite 3.0 8b, an 8 billion-parameter converged model that supports more than a dozen natural languages and has code generation and function-calling capabilities. The non-English language, code generation and function-calling features are available as a developer preview within RHEL AI 1.3, with the expectation that they will be fully supported in future RHEL AI releases.
Document unlock
It also includes support for Docling, an open-source project developed by IBM Research that converts PDFs, manuals and slide decks into structured data formats such as JavaScript Object Notation and Markdown, a lightweight markup language for adding formatting elements to plain text. Users can convert documents into Markdown for simplified data ingestion for model tuning with InstructLab.
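To illustrate the workflow, here is a minimal sketch based on Docling’s documented Python quickstart; the input filename is a placeholder:

```python
# Minimal sketch: convert a PDF to Markdown with Docling
# (pip install docling). "manual.pdf" is a placeholder path.
from docling.document_converter import DocumentConverter

converter = DocumentConverter()
result = converter.convert("manual.pdf")  # also accepts URLs and other formats

# Export the parsed document as Markdown for ingestion with InstructLab
markdown_text = result.document.export_to_markdown()
with open("manual.md", "w", encoding="utf-8") as f:
    f.write(markdown_text)
```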
Docling includes context-aware chunking, a method used in natural language processing to break down text or data into smaller, meaningful segments while considering the surrounding context. This helps resulting applications deliver more coherent and contextually appropriate responses to questions and tasks out of the box.
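As a toy illustration of the idea (not Docling’s actual implementation, which operates on the parsed document structure), a context-aware chunker might split converted Markdown on a size boundary while carrying the most recent heading along with each chunk:

```python
# Illustrative sketch of context-aware chunking: break text into
# bounded segments and prepend the nearest heading so each chunk
# retains its surrounding context.
def chunk_with_context(lines, max_chars=1000):
    chunks, current, heading = [], "", ""
    for line in lines:
        if line.startswith("#"):  # Markdown heading: new context
            heading = line.strip()
        if current and len(current) + len(line) > max_chars:
            chunks.append(f"{heading}\n{current}".strip())
            current = ""
        current += line
    if current:
        chunks.append(f"{heading}\n{current}".strip())
    return chunks

with open("manual.md", encoding="utf-8") as f:
    for chunk in chunk_with_context(f.readlines()):
        print(chunk[:80], "...")
```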
Gaudi 3 is Intel’s answer to Nvidia Corp.’s H100 graphics processing unit, which has since been succeeded by the H200 series. Intel has said Gaudi 3 can run inference at up to 2.3 times the power efficiency of the H100 while also speeding up LLM training times. Red Hat already supports GPUs from Nvidia and Advanced Micro Devices Inc.
Red Hat OpenShift AI, which natively supports RHEL AI, now supports parallelized serving across multiple nodes with the vLLM runtime, enabling multiple requests to be handled in real time. vLLM is a high-performance inference engine for serving LLMs at low latency and high throughput. Users can dynamically alter an LLM’s serving parameters, such as sharding – or distributing – the model across multiple GPUs or quantizing it to a smaller footprint.
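As a rough sketch of what that looks like in practice, the following uses vLLM’s offline Python API; the Granite model identifier and the parallelism and quantization settings are illustrative choices, not RHEL AI defaults:

```python
# Minimal sketch of serving a Granite model with vLLM (pip install vllm).
from vllm import LLM, SamplingParams

llm = LLM(
    model="ibm-granite/granite-3.0-8b-instruct",  # Granite 3.0 8b from Hugging Face
    tensor_parallel_size=2,   # shard the model across two accelerators
    quantization="fp8",       # optional: quantize weights to a smaller footprint
)

params = SamplingParams(temperature=0.7, max_tokens=128)
outputs = llm.generate(["What is context-aware chunking?"], params)
print(outputs[0].outputs[0].text)
```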