The importance of model monitoring

Automating model retraining and validation with Pipelines is a powerful approach to ensuring that AI models remain up to date without manual intervention. However, this automation also introduces a level of uncertainty, as new models are continuously generated and deployed without direct oversight. In such a scenario, enabling model monitoring is a crucial step to maintaining model reliability and performance over time.

Enable Trusty AI

TrustyAI in Red Hat OpenShift AI plays an important role in ensuring machine learning models deployed at the edge are reliable. By enabling TrustyAI, data scientists can monitor essential metrics such as bias and data drift, helping to identify unfair patterns during predictions and deviations that could degrade the model performance. Let’s start configuring it:

As mentioned during the installation of RHOAI, we ensured that the TrustyAI component was enabled during the operator deployment. This was the first step before being able to enable model monitoring in our cluster. Let’s double check before continuing:
- Open your OpenShift Console and navigate to Operators > Installed Operators to locate Red Hat OpenShift AI.
- Access the Data Science Cluster tab and click on the default-dcs resource.
- The DataScienceCluster contains all the RHOAI configuration. Open the YAML tab and scroll down until you spot the trustyai component.
- Verify that the managementState shows managed as show below. If not, change it manually.
  IMAGE_HERE
Now we are going to enable monitoring in our single-model serving platform. To do so, we need to create two resources. Click on the + icon that appears on the right side of the top menu bar to import the following YAML files. Click Create for each of them:

apiVersion: v1
kind: ConfigMap
metadata:
  name: user-workload-monitoring-config
  namespace: openshift-user-workload-monitoring
data:
  config.yaml: |
    prometheus:
      logLevel: debug
      retention: 15d

apiVersion: v1
kind: ConfigMap
metadata:
  name: cluster-monitoring-config
  namespace: openshift-monitoring
data:
  config.yaml: |
    enableUserWorkload: true

There’s one last resource we need to create. That’s the TrustyAI service inside our DataScience Project. Still in the Import YAML dashboard, search for your ai-edge-project namespace in the upper drop-down Project list.
Then, paste the following code and press Create again:

apiVersion: trustyai.opendatahub.io/v1alpha1
kind: TrustyAIService
metadata:
  name: trustyai-service
spec:
  storage:
    format: "PVC"
    folder: "/inputs"
    size: "1Gi"
  data:
    filename: "data.csv"
    format: "CSV"
  metrics:
    schedule: "5s"
    batchSize: 5000

The importance of model monitoring

Enable Trusty AI

Data Collection through TrustyAI