Pipelines execution

Red Hat OpenShift AI gives the possibilities to create pipelines by dragging and connecting Notebooks using the Elyra extension. Another alternative, -and the one we will use- is by importing an existing pipeline in YAML format created using the KubeFlow Pipelines SDK. This programmatic platform uses Python to simplify the process of writing and building the pipeline.

Import Pipeline

Once the server is set up, we will create a pipeline to fully automate the process. This pipeline will collect new data from the BMS application and use it to retrain a new model. The performance of this new model will then be compared against the existing one. If it outperforms the current model, it will be uploaded to the S3 bucket in MicroShift. This process will be repeated for both the Stress Detection and Time to Failure prediction models.

In order to save some time we will import the pipeline instead of creating it manually. If you want to learn more about how to create pipelines, check the documentation.

First, open the Jupyter Environment with the Notebooks used to train the models and test the endpoints.
In the folder structure on the left, navigate to the notebooks > pipelines folder and locate the model-retraining.yaml pipeline file.
Right-click on it and select Download.
Save the file locally in your laptop.
Now, go back to your Red Hat OpenShift AI Dashboard.
Make sure you are in the Pipelines tab inside our project.
Select Import pipeline and complete the form as follows:
- Project: The ai-edge-project project should be selected by default.
- Pipeline name: Name it as Model retraining.
- Verify that the Upload a file check is marked.
- Click the Upload button and select the model-retraining.yaml from your laptop.
When completed, press Import pipeline.

You should see the different pipeline nodes and steps displayed in a graphical way, like this:

Run Pipeline

To test the pipeline functionality, we will perform a first manual execution. Follow these steps:

Once the pipeline is imported, click on the blue Actions button in the top-right corner.
Select Create run from the dropdown menu.
Configure the pipeline execution as follows:
- Project: Verify the ai-edge-project is selected.
- Experiment: Keep the Default one.
- Name: Type First run.
- Pipeline: Model retraining should be already selected.
- Pipeline version: Model retraining is pre-filled.

Now, check the following parameter values. You may need to change the InfluxDB URL based on your environment:

Parameter Value

Parameter	Value
aws_access_key_id	`minio`
aws_s3_bucket	`inference`
aws_s3_endpoint	`http://minio-microshift-vm.microshift-001.svc.cluster.local:30000`
aws_secret_access_key	`minio123`
influxdb_bucket	`bms`
influxdb_org	`redhat`
influxdb_token	`admin_token`
influxdb_url	`https://influx-db-microshift-001.{openshift_cluster_ingress_domain}`

aws_access_key_id

minio

aws_s3_bucket

inference

aws_s3_endpoint

http://minio-microshift-vm.microshift-001.svc.cluster.local:30000

aws_secret_access_key

minio123

influxdb_bucket

bms

influxdb_org

redhat

influxdb_token

admin_token

influxdb_url

https://influx-db-microshift-001.{openshift_cluster_ingress_domain}

When completed, click Create run.

At this point, the pipeline will start executing. The first node retrieves data from InfluxDB in the vehicle, then the data is processed and prepared for model training. Next, the Stress Detection and Time to Failure models are trained in parallel. Once training is complete, the following node evaluates whether the new models outperform the existing ones in the vehicle using fresh data. If not, the models are discarded. However, if the new models show better performance, the final nodes are executed to upload them to the inference bucket in MinIO on our autonomous vehicle.

When the execution finishes, all nodes will be marked with green checkmarks indicating a successful run, as shown in the image below: