Pipeline Execution

Red Hat OpenShift AI lets you create pipelines by dragging and connecting notebooks using the Elyra extension. An alternative, and the one we will use, is to import an existing pipeline in YAML format created with the Kubeflow Pipelines SDK. This programmatic approach uses Python to simplify the process of writing and building the pipeline.

Import Pipeline

Once the server is set up, we will create a pipeline to fully automate the process. This pipeline will collect new data from the BMS application and use it to retrain a new model. The performance of this new model will then be compared against the existing one. If it outperforms the current model, it will be uploaded to the S3 bucket in MicroShift. This process will be repeated for both the Stress Detection and Time to Failure prediction models.

To save time, we will import the pipeline instead of creating it manually. If you want to learn more about how to create pipelines, check the documentation.

  1. First, open the Jupyter Environment with the Notebooks used to train the models and test the endpoints.

  2. In the folder structure on the left, navigate to the notebooks > pipelines folder and locate the model-retraining.yaml pipeline file.

  3. Right-click on it and select Download.

    Download pipeline file
  4. Save the file locally on your laptop.

  5. Now, go back to your Red Hat OpenShift AI Dashboard.

  6. Make sure you are in the Pipelines tab inside our project.

  7. Select Import pipeline and complete the form as follows:

    • Project: The ai-edge-project project should be selected by default.

    • Pipeline name: Name it Model retraining.

    • Verify that the Upload a file option is selected.

    • Click the Upload button and select the model-retraining.yaml from your laptop.

  8. When completed, press Import pipeline.

You should see the different pipeline nodes and steps displayed in a graphical way, like this:

Imported Pipeline

Run Pipeline

To test the pipeline functionality, we will perform a first manual execution. Follow these steps:

  1. Once the pipeline is imported, click on the blue Actions button in the top-right corner.

  2. Select Create run from the dropdown menu.

  3. Configure the pipeline execution as follows:

    • Project: Verify the ai-edge-project is selected.

    • Experiment: Keep the Default one.

    • Name: Type First run.

    • Pipeline: Model retraining should be already selected.

    • Pipeline version: Model retraining is pre-filled.

  4. Now, check the following parameter values. You may need to change the InfluxDB URL based on your environment:

    Parameter              Value
    ---------------------  ------------------------------------------------------------------
    aws_access_key_id      minio
    aws_s3_bucket          inference
    aws_s3_endpoint        http://minio-microshift-vm.microshift-001.svc.cluster.local:30000
    aws_secret_access_key  minio123
    influxdb_bucket        bms
    influxdb_org           redhat
    influxdb_token         admin_token
    influxdb_url           https://influx-db-microshift-001.{openshift_cluster_ingress_domain}

  5. When completed, click Create run.
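Before launching a run, it can help to sanity-check the parameter values, since both endpoints must be reachable URLs. The sketch below, using only the Python standard library, collects the run parameters into a dict and validates the two URL parameters; the ingress domain shown is a placeholder you would replace with your environment's value:

```python
from urllib.parse import urlparse

# Run parameters for the "First run" (values from the table above; the
# influxdb_url ingress domain is a placeholder for your environment).
params = {
    "aws_access_key_id": "minio",
    "aws_secret_access_key": "minio123",
    "aws_s3_bucket": "inference",
    "aws_s3_endpoint": "http://minio-microshift-vm.microshift-001.svc.cluster.local:30000",
    "influxdb_bucket": "bms",
    "influxdb_org": "redhat",
    "influxdb_token": "admin_token",
    "influxdb_url": "https://influx-db-microshift-001.apps.example.com",  # replace domain
}

def check_endpoints(p: dict) -> bool:
    """Verify both endpoint parameters are well-formed http(s) URLs."""
    for key in ("aws_s3_endpoint", "influxdb_url"):
        parsed = urlparse(p[key])
        if parsed.scheme not in ("http", "https") or not parsed.netloc:
            raise ValueError(f"{key} is not a valid URL: {p[key]}")
    return True

print(check_endpoints(params))  # → True
```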

At this point, the pipeline starts executing. The first node retrieves data from InfluxDB on the vehicle, and the data is then processed and prepared for model training. Next, the Stress Detection and Time to Failure models are trained in parallel. Once training completes, the next node evaluates, using fresh data, whether the new models outperform the ones currently deployed on the vehicle. If not, the new models are discarded. If they perform better, the final nodes upload them to the inference bucket in MinIO on our autonomous vehicle.
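The promotion decision at the end of the pipeline can be sketched in plain Python. The scores and names below are illustrative only; in the real pipeline this comparison runs inside a Kubeflow node:

```python
# Sketch of the pipeline's final decision step (scores are hypothetical).

def should_promote(new_score: float, deployed_score: float) -> bool:
    """Upload the retrained model only if it beats the one on the vehicle."""
    return new_score > deployed_score

# Each model is evaluated independently on fresh data from the vehicle.
scores = {
    "stress-detection": (0.91, 0.88),  # (new, deployed) -> new wins, promote
    "time-to-failure": (0.84, 0.87),   # new model is worse -> discard
}
for model, (new, deployed) in scores.items():
    action = "upload to MinIO" if should_promote(new, deployed) else "discard"
    print(f"{model}: {action}")
```

With these sample scores, the stress-detection model would be uploaded and the time-to-failure model discarded, matching the conditional branch described above.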

When the execution finishes, all nodes will be marked with green checkmarks indicating a successful run, as shown in the image below:

Pipeline execution complete