VMware Aria Automation 8.18.0, the newest release integrated into the
VMware Cloud Foundation 5.2 platform, brings significant advancements in
Private AI Automation Services. These enhancements aim to simplify processes,
expand capabilities, and make AI workload management more efficient. This post
covers the updates in three areas: licensing and drivers, configuring catalog
items, and new catalog items.
Licensing and Drivers
One of the key areas of improvement in this release is the
simplification of licensing and driver management. Cloud Administrators can now
easily provide the necessary information to ensure the proper functioning of AI
Workstations and AI Kubernetes Clusters:
- NVIDIA Client Configuration Token: This token is crucial for enabling the
  full capabilities of the vGPU driver. It is passed to the provisioned AI
  Workstation or AI Kubernetes Cluster, ensuring optimal performance.
- NVIDIA vGPU Driver Location: Administrators can choose the source of the
  vGPU driver:
  - Cloud: This option utilizes the NVIDIA Licensing Portal. Administrators
    need to provide an API key to access the portal.
  - Local: For a self-hosted setup, administrators can specify a local URL
    for the vGPU guest driver.
These streamlined processes reduce complexity and make it
easier for administrators to manage and deploy AI resources efficiently.
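As a rough illustration of how these inputs relate to each other, the sketch below models them in Python and checks that each driver location comes with the value it needs. The class name, field names, and error messages are hypothetical and are not part of the Aria Automation API; only the rules (a token in all cases, an API key for Cloud, a URL for Local) come from the description above.

```python
from dataclasses import dataclass
from typing import List, Optional
from urllib.parse import urlparse


@dataclass
class VgpuDriverSettings:
    """Hypothetical container for the values a Cloud Administrator supplies."""
    client_config_token: str          # NVIDIA Client Configuration Token
    driver_location: str              # "cloud" or "local"
    api_key: Optional[str] = None     # needed only for the Cloud option
    local_url: Optional[str] = None   # needed only for the Local option


def validate(settings: VgpuDriverSettings) -> List[str]:
    """Return a list of problems; an empty list means the inputs are consistent."""
    problems = []
    if not settings.client_config_token.strip():
        problems.append("client configuration token is missing")
    if settings.driver_location == "cloud":
        if not settings.api_key:
            problems.append("cloud option needs an NVIDIA Licensing Portal API key")
    elif settings.driver_location == "local":
        parsed = urlparse(settings.local_url or "")
        if parsed.scheme not in ("http", "https") or not parsed.netloc:
            problems.append("local option needs an http(s) URL for the guest driver")
    else:
        problems.append("driver location must be 'cloud' or 'local'")
    return problems
```

Calling `validate(VgpuDriverSettings("token", "cloud", api_key="key"))` returns an empty list, while a Local setup without a URL reports the missing value.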
Configure Catalog Items
The latest enhancements in configuring catalog items focus
on making the workflow more intuitive and efficient. VMware by Broadcom has
introduced several features to achieve this:
- Targeted Content Library: Administrators can now target a specific content
  library to quickly locate the Deep Learning Virtual Machine Image (DLVM).
  This feature limits the results to the contents of one library, making it
  easier to find the desired image.
- Automatic Filtering: If there are existing Kubernetes images, such as Tanzu
  Kubernetes Releases (TKR), within the targeted content library, they will
  be automatically filtered out. This ensures that only relevant images are
  displayed, streamlining the selection process.
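The combined effect of the two features can be pictured as a simple filter. The following sketch is an assumption-laden model, not the product's implementation: the `LibraryItem` type, its `kind` tag, and the sample names are all hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class LibraryItem:
    library: str   # name of the content library holding the item
    name: str      # item name as shown in the library
    kind: str      # hypothetical tag: "dlvm", "tkr", or "other"


def candidate_images(items: List[LibraryItem], target_library: str) -> List[str]:
    """Keep items from the targeted library and drop Kubernetes (TKR) images."""
    return [
        item.name
        for item in items
        if item.library == target_library and item.kind != "tkr"
    ]


# Hypothetical inventory spanning two content libraries.
inventory = [
    LibraryItem("ai-library", "dl-vm-image", "dlvm"),
    LibraryItem("ai-library", "tkr-photon-image", "tkr"),
    LibraryItem("general-library", "dl-vm-image-old", "dlvm"),
]

print(candidate_images(inventory, "ai-library"))
```

Only the Deep Learning VM image from the targeted library remains; the TKR image and the item from the other library are both excluded.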
Moreover, VMware by Broadcom has added support for
air-gapped environments for non-RAG AI Workstation catalog items, including
PyTorch, TensorFlow, CUDA Samples, and Triton Inferencing Server. This is
achieved by enabling the configuration of a private registry within the
quickstart workflow, pointing to a self-hosted container registry holding the
NVIDIA container images. This feature is particularly beneficial for
environments with strict security requirements or limited internet
connectivity.
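To make the idea of pointing at a self-hosted registry concrete, here is a minimal sketch of how an image reference might be redirected from NVIDIA's public registry host (`nvcr.io`) to a private mirror. It assumes the mirror keeps the same repository paths and tags; the function name, the private hostname, and the tag are hypothetical, and the quickstart workflow may handle this differently.

```python
PUBLIC_REGISTRY = "nvcr.io"  # NVIDIA's public NGC registry host


def to_private_registry(image: str, private_registry: str) -> str:
    """Swap the public registry host for a self-hosted one, keeping path and tag."""
    host, sep, remainder = image.partition("/")
    if not sep or host != PUBLIC_REGISTRY:
        return image  # leave references to other registries untouched
    return f"{private_registry.rstrip('/')}/{remainder}"


# Hypothetical mirror hostname and image tag, for illustration only.
print(to_private_registry("nvcr.io/nvidia/pytorch:example-tag",
                          "registry.example.internal"))
```

The repository path and tag are preserved, so the same catalog item can pull the same image from inside the air-gapped network.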
Additionally, support for HTTP or HTTPS Proxy Server
Configuration has been introduced. This helps customers without direct internet
access to download the vGPU driver from NVIDIA or pull down the non-RAG AI
Workstation containers mentioned earlier. This enhancement ensures that
organizations can deploy and manage AI resources even in restricted network
environments.
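For readers less familiar with what a proxy setting does for a client, the sketch below shows the general pattern using Python's standard library: all HTTP and HTTPS requests are routed through one proxy endpoint. It is a generic illustration, not how Aria Automation or the provisioned workloads implement the setting; the proxy address and function names are hypothetical.

```python
import shutil
import urllib.request


def opener_via_proxy(proxy_url: str) -> urllib.request.OpenerDirector:
    """Build an opener that sends HTTP and HTTPS requests through one proxy."""
    handler = urllib.request.ProxyHandler({"http": proxy_url, "https": proxy_url})
    return urllib.request.build_opener(handler)


def download(opener: urllib.request.OpenerDirector, url: str, destination: str) -> None:
    """Stream a remote file to disk through the configured proxy."""
    with opener.open(url, timeout=60) as response, open(destination, "wb") as out:
        shutil.copyfileobj(response, out)


# Hypothetical proxy endpoint; no request is sent until download() is called.
proxy_opener = opener_via_proxy("http://proxy.example.internal:3128")
```

A host without direct internet access would then call `download(proxy_opener, ...)` with the driver URL it has been given.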
Note, however, that the AI RAG Workstation and AI Kubernetes Cluster catalog
items still require direct internet access for deployment. These items are not
yet supported in air-gapped environments.
Catalog Items
To improve the usability and maintainability of the VMware by
Broadcom Private AI Automation Services catalog items, several significant
changes have been made:
- Splitting AI Workstation Catalog Items: The AI Workstation catalog has been
  divided into three distinct items:
  - AI Workstation: This can optionally run PyTorch, TensorFlow, CUDA
    Samples, or none.
  - AI RAG Workstation: Specifically designed for RAG-based applications.
  - Triton Inferencing Server: Dedicated to running Triton Inference Server.
  All AI Workstation catalog items can run additional custom cloud-init
  configurations if needed. This flexibility allows administrators to tailor
  the workstations to meet specific requirements.
- New AI Kubernetes RAG Cluster Catalog Item: This new catalog item
  provisions a Kubernetes Cluster with preinstalled vGPU and RAG Operators.
  It enables customers to run AI RAG-based applications like Chatbot
  Applications. This addition significantly enhances the capabilities of the
  AI Kubernetes Cluster, making it easier to deploy and manage advanced AI
  workloads.
With these updates, the total number of Private AI
Automation Services catalog items in VMware Aria Automation 8.18.0 has
increased from 2 to 5:
- 3 AI Workstation Catalog Items: AI Workstation, AI RAG Workstation, and
  Triton Inferencing Server.
- 2 AI Kubernetes Cluster Catalog Items: Standard AI Kubernetes Cluster
  and AI Kubernetes RAG Cluster.
These changes enhance the overall usability,
maintainability, and flexibility of the AI Automation Services catalog,
providing administrators with more options to meet their organizational needs.
Summary
VMware Cloud Foundation serves as the core infrastructure
platform for VMware Private AI Foundation for NVIDIA (PAIF-N), delivering
modern private cloud infrastructure software that enables organizations to
leverage Artificial Intelligence (AI) applications effectively. This platform
is essential for staying ahead in today's rapidly evolving business landscape
and driving sustainable growth.
VMware Private AI Foundation for NVIDIA (PAIF-N) provides a
high-performance, secure, cloud-native AI software platform for provisioning AI
workloads based on NVIDIA GPU Cloud (NGC) containers. These containers support
deep learning, machine learning, and high-performance computing (HPC), offering
container models, model scripts, and industry solutions. This comprehensive
platform allows data scientists, developers, and researchers to focus on
building solutions and gathering insights faster.
IT administrators benefit from robust resource governance
and control through Consumption Policies and Role-based Access Control. These
features ensure that project members can efficiently utilize AI infrastructure
services while guaranteeing optimal and secure resource usage.
In conclusion, the enhancements in VMware Aria Automation 8.18.0 streamline
AI workload management, improve usability, and expand the catalog from two
items to five, helping organizations remain competitive and innovative in
today's dynamic business environment.