Cloud Experts Documentation

Creating Images using Stable Diffusion on Red Hat OpenShift AI on ROSA cluster with GPU enabled

This content is authored by Red Hat experts, but has not yet been tested on every supported configuration.

1. Introduction

Stable Diffusion is an AI model that generates images from text descriptions. It uses a diffusion process to iteratively denoise random Gaussian noise into coherent images. This simple tutorial shows how to create images with the Stable Diffusion model on Red Hat OpenShift AI (RHOAI), formerly called Red Hat OpenShift Data Science (RHODS), which is our OpenShift platform for managing the lifecycle of AI/ML projects. RHOAI runs here on a Red Hat OpenShift Service on AWS (ROSA) cluster, which is our managed OpenShift service on AWS, with NVIDIA GPUs enabled.

Note that this guide requires a ROSA cluster with GPU enabled. The first half of this tutorial covers installing the OpenShift Service Mesh operator, then installing the RHOAI operator and creating a DataScienceCluster instance. In the second half, we’ll run the Stable Diffusion model in an RHOAI Jupyter notebook to create cat and dog images. This tutorial uses RHOAI operator version 2.12.0. Because RHOAI is under ongoing development and refinement, certain features and parts of the GUI may change over time.

Disclaimer: When using Stable Diffusion or other open-source image generation models, please be aware that while these tools include certain content filters and safety features, these are not foolproof. Therefore, it is your responsibility to use this tool in a safe manner, ensure the prompts you input are appropriate, and verify that the generated images are suitable for your intended audience. Neither the author of this tutorial nor the infrastructure providers can be held responsible for any inappropriate or unwanted results you may generate. By proceeding with this tutorial, you acknowledge that you understand the potential risks and agree to use the tool responsibly. Remember that the output of AI image generation models can sometimes be unpredictable and thus it is important to review all the generated images before sharing or using them in any context.

2. Prerequisites

2.1 Tools

2.2 Environment

  1. You will need a ROSA cluster (classic or HCP). If you don’t have one, you can follow the ROSA guide to create an HCP ROSA cluster.

    • I ran this tutorial on an HCP ROSA 4.16.8 cluster with m5.4xlarge worker nodes, with 48 vCPUs and ~185Gi of memory in total.
    • Please be sure that you have cluster admin access to the cluster.
  2. You will need a GPU enabled machine pool in your ROSA cluster. If you don’t have one, you can follow the Adding GPUs to a ROSA cluster guide to add GPUs to your cluster.

    • For the GPU machine pool, I used g5.4xlarge nodes with autoscaling enabled up to 4 nodes.

3. Setting up RHOAI

3.1 Installing OpenShift Service Mesh Operator

  1. Deploy the Operator

    cat << EOF | oc apply -f -
    apiVersion: operators.coreos.com/v1alpha1
    kind: Subscription
    metadata:
      name: servicemeshoperator
      namespace: openshift-operators
    spec:
      channel: stable
      installPlanApproval: Automatic
      name: servicemeshoperator
      source: redhat-operators
      sourceNamespace: openshift-marketplace
    EOF
    

3.2 Installing RHOAI Operator and DataScienceCluster Instance

  1. Create a project for the RHOAI operator:

    oc new-project redhat-ods-operator
    
  2. Deploy the OpenShift AI Operator:

    cat << EOF | oc apply -f -
    apiVersion: operators.coreos.com/v1
    kind: OperatorGroup
    metadata:
      name: redhat-ods-operator
      namespace: redhat-ods-operator
    spec:
      upgradeStrategy: Default
    ---
    apiVersion: operators.coreos.com/v1alpha1
    kind: Subscription
    metadata:
      name: rhods-operator
      namespace: redhat-ods-operator
    spec:
      channel: fast
      installPlanApproval: Automatic
      name: rhods-operator
      source: redhat-operators
      sourceNamespace: openshift-marketplace
    EOF
    
  3. Wait until the operator is installed

    oc wait --for=jsonpath='{.status.replicas}'=1 deployment \
      -n redhat-ods-operator rhods-operator
    

    If you’re on Linux and see an error message such as Error from server (NotFound): deployments.apps "rhods-operator" not found, wait a couple of minutes and rerun the command above.

  4. Create a DataScienceCluster

    cat << EOF | oc apply -f -
    apiVersion: datasciencecluster.opendatahub.io/v1
    kind: DataScienceCluster
    metadata:
      name: default-dsc
    spec:
      components:
        codeflare:
          managementState: Managed
        kserve:
          managementState: Managed
          serving:
            ingressGateway:
              certificate:
                type: SelfSigned
            managementState: Managed
            name: knative-serving
        trustyai: {}
        ray:
          managementState: Managed
        kueue:
          managementState: Managed
        workbenches:
          managementState: Managed
        dashboard:
          managementState: Managed
        modelmeshserving:
          managementState: Managed
        datasciencepipelines:
          managementState: Managed
    EOF
    

    Wait for the DataScienceCluster to be ready

    oc wait --for=jsonpath='{.status.phase}'=Ready datasciencecluster \
      default-dsc
    
  5. Finally, log in to the OpenShift AI console in your web browser, using the host name returned by this command:

    oc -n redhat-ods-applications get route rhods-dashboard -o jsonpath='{.spec.host}'
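
Step 3 above suggests rerunning oc wait by hand if the deployment does not exist yet. As a rough sketch, that retry could also be scripted in Python. The helper name wait_with_retry, the attempt count, and the delay are assumptions for illustration and are not part of the original steps; the oc arguments are the ones from step 3.

```python
import subprocess
import time


def wait_with_retry(cmd, attempts=10, delay_seconds=30, runner=subprocess.run):
    """Run cmd until it exits with code 0 or the attempts run out.

    Returns True on success, False otherwise. `runner` is injectable so the
    helper can be tried without a cluster (a hypothetical convenience).
    """
    for attempt in range(1, attempts + 1):
        result = runner(cmd, capture_output=True, text=True)
        if result.returncode == 0:
            return True
        # Right after the Subscription is created, oc typically reports NotFound here
        print(f"attempt {attempt}/{attempts} failed: {result.stderr.strip()}")
        if attempt < attempts:
            time.sleep(delay_seconds)
    return False


# Same command as in step 3, split into an argument list (no shell quoting needed)
OC_WAIT_CMD = [
    "oc", "wait", "--for=jsonpath={.status.replicas}=1", "deployment",
    "-n", "redhat-ods-operator", "rhods-operator",
]

# Usage (requires a logged-in oc CLI):
# wait_with_retry(OC_WAIT_CMD)
```

The same helper could wrap the DataScienceCluster wait from step 4 by passing that command's arguments instead.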
    

4. Deploying Stable Diffusion Model

In this tutorial, we’ll use the Stable Diffusion 2.1 model from Stability AI to generate images based on text prompts. We’ll generate three images based on prompts about cats and dogs, using 50 inference steps and a guidance scale of 7.5. These images are then displayed vertically using matplotlib, with each image titled by its corresponding prompt.

Now that the environment is ready, go to the RHOAI dashboard. In the navigation pane on the left-hand side, select Applications and click Enabled, where you can launch a Jupyter notebook. For more details on the console, you can also take a look at the third section of our other guide.

Click Launch application and then select the TensorFlow 2024.1 notebook image. You can leave the container size as Small. Then select NVIDIA GPU as the accelerator from the dropdown.

[Image: NVIDIA-GPU]

Click the Start server button, wait until the notebook is ready, and click Open in new tab. Once you’re in the Jupyter notebook, click the Python 3.9 notebook button at the top, and run the following script in a single cell.

# install the necessary dependencies and libraries
!pip install --upgrade diffusers transformers torch accelerate matplotlib datasets torchvision

import torch
from diffusers import StableDiffusionPipeline
import matplotlib.pyplot as plt
import gc

# clean up memory and reset CUDA cache
def cleanup_memory():
    gc.collect()
    # only touch the CUDA allocator when a GPU is actually present
    if torch.cuda.is_available():
        torch.cuda.empty_cache()
        torch.cuda.reset_peak_memory_stats()

# load the Stable Diffusion model
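def load_model(model_id):
    device = "cuda" if torch.cuda.is_available() else "cpu"
    # half precision is meant for the GPU; use float32 when falling back to CPU
    dtype = torch.float16 if device == "cuda" else torch.float32
    pipeline = StableDiffusionPipeline.from_pretrained(model_id, torch_dtype=dtype)
    pipeline = pipeline.to(device)
    return pipeline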

# generate the images
def generate_images(pipeline, prompts, num_images_per_prompt=1, num_inference_steps=50, guidance_scale=7.5):
    images = []
    for prompt in prompts:
        batch = pipeline(
            prompt, 
            num_images_per_prompt=num_images_per_prompt, 
            num_inference_steps=num_inference_steps,
            guidance_scale=guidance_scale,
            output_type="pil"
        )
        images.extend(batch.images)
        cleanup_memory()
    return images

# display the images
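def display_images(images, prompts):
    rows = len(images)
    fig, axs = plt.subplots(rows, 1, figsize=(15, 5*rows))
    
    # plt.subplots returns a single Axes (not an array) when there is one row
    if rows == 1:
        axs = [axs]  
    
    # repeat each prompt so every image is titled with the prompt that produced it,
    # even when num_images_per_prompt > 1
    per_prompt = max(1, len(images) // len(prompts))
    titles = [prompt for prompt in prompts for _ in range(per_prompt)]
    
    for img, ax, title in zip(images, axs, titles):
        ax.imshow(img)
        ax.set_title(title, fontsize=10)
        ax.axis('off')
    
    plt.tight_layout()
    plt.show()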

# execute the script
if __name__ == "__main__":
    try:        
        pipeline = load_model('stabilityai/stable-diffusion-2-1')
                
        prompts = [
            "A cute cat",
            "A cute dog",
            "A cute cat and a cute dog sit next to each other"
        ]
        num_images_per_prompt = 1
        
        generated_images = generate_images(pipeline, prompts, num_images_per_prompt, num_inference_steps=50, guidance_scale=7.5)
        display_images(generated_images, prompts)
        
    except Exception as e:
        print(f"An error occurred: {str(e)}")
    finally:
        cleanup_memory()

Here are some pictures from my run (note that the pictures may vary from run to run):

[Image: cat]

[Image: dog]

[Image: catdog]

Note that these prompts (“A cute cat”, “A cute dog”, and “A cute cat and a cute dog sit next to each other”) are just examples. You can change them to your liking by editing the prompts list in the if __name__ == "__main__": block at the end of the script.

If the kernel hangs or something similar happens, restart or refresh the RHOAI dashboard and relaunch the notebook. Alternatively, if you are using an HCP cluster, you might also want to add more nodes to the machine pool.
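
Before relaunching, it can also help to confirm that the notebook actually sees a GPU. The sketch below shells out to nvidia-smi, assuming that tool is available in the notebook image (an assumption, not something this tutorial verifies); the function name list_gpus is hypothetical.

```python
import shutil
import subprocess


def list_gpus():
    """Return one "name, total memory" string per visible GPU, or None if
    nvidia-smi is missing or fails (assumes the image ships nvidia-smi)."""
    if shutil.which("nvidia-smi") is None:
        return None
    try:
        result = subprocess.run(
            ["nvidia-smi", "--query-gpu=name,memory.total", "--format=csv,noheader"],
            capture_output=True, text=True, timeout=30,
        )
    except (OSError, subprocess.SubprocessError):
        return None
    if result.returncode != 0:
        return None
    return [line.strip() for line in result.stdout.splitlines() if line.strip()]


gpus = list_gpus()
print(gpus if gpus else "No GPU visible from this notebook")
```

If no GPU is listed, the script above selects the CPU instead, which is much slower for image generation.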

You may also see the following warning messages, which are informational and generally harmless:

  • The cache for model files in Transformers v4.22.0 has been updated…: This is just an informational message that can be safely ignored once the cache migration is complete.
  • Unable to register cuDNN/cuFFT/cuBLAS factory…: These messages indicate that these CUDA libraries are being initialized multiple times.
  • This TensorFlow binary is optimized to use available CPU instructions…: This is also just an informational message that the TensorFlow installation is working but could potentially be optimized further.
  • TF-TRT Warning: Could not find TensorRT: This warning indicates that TensorRT is not available, which might affect performance but not functionality.

5. Future research

Note that this is a simple tutorial intended to guide you through the environment setup needed once you have a ROSA cluster running, followed by a simple example of generating images with the Stable Diffusion model. If you get unsatisfactory results, i.e. inaccurate images, there are many ways to improve them, such as adjusting the parameters and using more specific prompts.

In one of my runs, the model generated an inaccurate image for the third prompt (a cat and a dog), as follows.

[Image: dogdog]

So I increased num_inference_steps from 50 to 75 and guidance_scale from 7.5 to 8.5, and changed the last prompt to “A cute cat and a cute dog sitting next to each other, both faces and bodies are in the same image and background”. This gave me the following image (note that results may vary).

[Image: catdog1]

Increasing num_inference_steps gives the model more iterations to refine the image, raising guidance_scale can lead to images that match the prompt more closely, and using more detailed prompts can help guide the model better.
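
To make the effect of guidance_scale more concrete, here is a minimal sketch of classifier-free guidance, the technique behind that parameter in Stable Diffusion pipelines. At each denoising step the model predicts the noise once without the prompt and once with it, and the two predictions are combined as shown. Plain Python lists stand in for tensors, and the numbers are made up for illustration.

```python
def apply_guidance(noise_uncond, noise_text, guidance_scale):
    """Combine unconditional and prompt-conditioned noise predictions.

    guidance_scale = 1 returns the prompt-conditioned prediction unchanged;
    larger values push further in the direction the prompt suggests.
    """
    return [
        u + guidance_scale * (t - u)
        for u, t in zip(noise_uncond, noise_text)
    ]


# Made-up toy predictions (real ones are large tensors)
uncond = [0.0, 1.0, 2.0]
text = [1.0, 1.0, 0.0]

print(apply_guidance(uncond, text, 1.0))  # identical to `text`
print(apply_guidance(uncond, text, 7.5))  # gap between the two predictions scaled 7.5x
```

Very high values tend to over-emphasize the prompt at the cost of image quality, which is why modest changes such as 7.5 to 8.5 are a reasonable first experiment.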

However, please note that even with these optimizations, generating images with multiple specific elements can be tricky due to the inherent nature of generative models. You might still need to run the code multiple times to get the desired results.
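
Rather than rerunning by hand, one option is to loop over a small grid of settings and compare the results side by side. The sketch below only builds and walks the grid. Here run_fn is a placeholder for whatever calls the pipeline (for instance a thin wrapper around generate_images from the script above), and the names settings_grid and sweep are hypothetical.

```python
from itertools import product


def settings_grid(steps_options, scale_options):
    """Return every (num_inference_steps, guidance_scale) combination as a kwargs dict."""
    return [
        {"num_inference_steps": steps, "guidance_scale": scale}
        for steps, scale in product(steps_options, scale_options)
    ]


def sweep(run_fn, prompt, grid):
    """Call run_fn(prompt, **settings) for each settings dict; return (settings, result) pairs."""
    return [(settings, run_fn(prompt, **settings)) for settings in grid]


grid = settings_grid([50, 75], [7.5, 8.5])


# Stand-in for a real pipeline call, so the sketch runs without a GPU
def fake_run(prompt, num_inference_steps, guidance_scale):
    return f"{prompt} @ {num_inference_steps} steps, scale {guidance_scale}"


for settings, result in sweep(fake_run, "A cute cat and a cute dog", grid):
    print(settings, "->", result)
```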

Note that there are many other ways to improve accuracy that I won’t delve into in this blog, such as using negative prompts to exclude what you don’t want to see in the image, fine-tuning the model, using another model, increasing the batch size, etc. These are all potential topics for future research.
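
As a brief sketch of the first idea, diffusers Stable Diffusion pipelines accept a negative_prompt argument alongside the prompt. The helper below simply forwards it. The function name and the example negative prompt are assumptions for illustration, and pipeline is expected to be the object returned by load_model in the script above.

```python
def generate_with_negative_prompt(pipeline, prompt, negative_prompt,
                                  num_inference_steps=50, guidance_scale=7.5):
    """Generate images for `prompt` while steering away from `negative_prompt`."""
    result = pipeline(
        prompt,
        negative_prompt=negative_prompt,
        num_inference_steps=num_inference_steps,
        guidance_scale=guidance_scale,
    )
    return result.images


# Usage in the notebook, after load_model() from the script above:
# images = generate_with_negative_prompt(
#     pipeline,
#     "A cute cat and a cute dog sitting next to each other",
#     negative_prompt="blurry, cropped, extra limbs",
# )
```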

© 2023 Red Hat, Inc.