Tutorial

13 min read

Deploying serverless MLFlow on Google Cloud Platform using Cloud Run

At GetInData, we build elastic MLOps platforms to fit our customer’s needs. One of the key functionalities of the MLOps platform is the ability to track experiments and manage the trained models in the form of a model repository. Our flexible platform allows our customers to pick best of breed technologies to provide those functionalities, whether they are open source or commercial ones. MLFlow is one of the open source tools that goes really well in that position. In this blog post, I will show you how to deploy MLFlow on top of Cloud Run, Cloud SQL and Google Cloud Storage to obtain a fully managed, serverless service for experiment tracking and model repository. The service will be protected using OAuth2.0 authorization and SSO with Google accounts.

Prerequisites

access to the Google Cloud Platform
service account dedicated to run MLFlow in Cloud Run
OAuth 2.0 Client ID and Client Secrets for authorization
Docker installation (for building the image, but this can also be done in CI)

Target setup

The final setup described in this blog post will look like this:

getindata-big-data-blog-deploying-serverless-mlflow-google-cloud-platform-using-cloud-run

One of the most important aspects here is the use of the OAuth2-Proxy as a middle layer within the Cloud Run container. Using Cloud Run’s built-in authorization mechanism (“Require authentication. Manage authorized users with Cloud IAM.” option) is usable only when programmatic requests are being made. When interactive requests to the web based UI need to be made (like with the MLFlow web UI case), there is no built-in way yet to authenticate in the Cloud Run service.

Step 1: Setting appropriate permissions

The service account that will be used to run MLFlow in Cloud Run needs to have the following permissions configured:

Cloud SQL Client
Secret Manager Secret Accessor (optionally this can be set directly on each secret separately)
Storage Object Viewer

Step 2: Pre-configuring OAuth 2.0 Client

In order to integrate OAuth 2.0 authorization with Cloud Run, OAuth2-Proxy will be used as a proxy on top of MLFlow. OAuth2-Proxy can work with many OAuth providers, including GitHub, GitLab, Facebook, Google, Azure and others. Using a Google provider allows the easy integration of both SSO in the interactive MLFlow UI but also makes it easier for service-to-service authorization using bearer tokens. Thanks to this, MLFlow will be securely accessible from its Python SDK, e.g. when the model training job is executed.

The Client ID / Client Secret can be created by visiting https://console.cloud.google.com/apis/credentials/oauthclient and configuring the new client as a Web Application as shown below. During pre-configuration, the Authorized redirect URIs will be left blank. Later it will be filled with the URL of the deployed Cloud Run instance.

getindata-big-data-blog-oauth-client-mlflow-gcp

Once created, Client ID and Client Secret will be displayed. They must be stored securely for later configuration.

Step 3: Configuring Cloud SQL / Cloud Storage

Both Cloud SQL and Cloud Storage configurations are straightforward and they will be skipped in this blog post. Official GCP documentation about connecting Cloud SQL with Cloud Run covers all aspects of this setup well.

Depending on the load, Cloud SQL instances can be scaled up as needed. For small deployments, a standard instance with 1 vCPU should be sufficient.
It’s important that both Cloud SQL and GCS bucket will be in the same GCP region as the Cloud Run instance to minimize both latency and the costs of data transfers.
Cloud SQL needs to have a database named mlflow created before it can be used.
Any database flavour supported by MLFlow can be used.

It’s good practice to have a separate GCS bucket for use by MLFlow.

Step 4: Storing configuration in Secret Manager

Both MLFlow and OAuth2-Proxy require configuration that contains sensitive data (e.g. OAuth2.0 Client Secret, database connection string). This configuration will be stored in Secret Manager and then mounted in the Cloud Run service via environment variables and files. Two secrets should be created:

Secret with connection string to the Cloud SQL (for MLFlow)
Secret with OAuth2.0 proxy configuration file as per template below:

email_domains = [
    "<SSO EMAIL DOMAIN>"
]
provider = "google"
client_id = "<CLIENT ID>"
client_secret = "<SECRET>"
skip_jwt_bearer_tokens = true
extra_jwt_issuers = "https://accounts.google.com=32555940559.apps.googleusercontent.com"
cookie_secret = "<COOKIE SECRET IN BASE64>"

Cookie secret can be generated using head -c 32 /dev/urandom | base64.

Extra JWT issuers parameter is required for the service-to-service authorization to work properly.

Step 5: Preparing MLFlow Docker image

The Docker image for running MLFlow with OAuth2-Proxy will be based on GetInData’s public MLFlow Docker image from: https://github.com/getindata/mlflow-docker (as MLFlow does not provide one yet). There are two modifications required:

Installation of OAuth2-Proxy.
Installation of Tini entrypoint.

OAuth2-Proxy will be an authorization layer for the MLFlow, which will run in the background process of the container. Tini is used for managing the container’s entrypoint.

Dockerfile

FROM gcr.io/getindata-images-public/mlflow:1.22.0
ENV TINI_VERSION v0.19.0
EXPOSE 4130

RUN apt update && apt install -y curl netcat && mkdir -p /oauth2-proxy && cd /oauth2-proxy && \
    curl -L -o proxy.tar.gz https://github.com/oauth2-proxy/oauth2-proxy/releases/download/v6.1.1/oauth2-proxy-v6.1.1.linux-amd64.tar.gz && \
    tar -xzf proxy.tar.gz && mv oauth2-proxy-*.linux-amd64/oauth2-proxy . && rm proxy.tar.gz && \
    rm -rf /var/lib/apt/lists/*

ADD https://github.com/krallin/tini/releases/download/${TINI_VERSION}/tini /tini
RUN chmod +x /tini

COPY start.sh start.sh
RUN chmod +x start.sh

ENTRYPOINT ["/tini", "--", "./start.sh"]

start.sh

#!/usr/bin/env bash
set -e

mlflow server --host 0.0.0.0 --port 8080 --backend-store-uri ${BACKEND_STORE_URI} --default-artifact-root ${DEFAULT_ARTIFACT_ROOT} &
while ! nc -z localhost 8080 ; do sleep 1 ; done
/oauth2-proxy/oauth2-proxy --upstream=http://localhost:8080 --config=${OAUTH_PROXY_CONFIG} --http-address=0.0.0.0:4180 &

wait -n

Details on this entrypoint can be found in the Step 6 section below. Once the image is built it needs to be pushed to GCR in order for Cloud Run to deploy it.

Step 6: Deploying MLFlow with OAuth2-Proxy on Cloud Run

Once the Docker image is built, it can be deployed in Cloud Run.

Setting Cloud Run parameters

Standard parameters should be set as follows:

Container image URL - path to image pushed to GCR in Step 5.
Container port - 4180
Container command / container arguments - can be left empty as they are configured in the Dockerfile
CPU allocation and pricing - see below
Capacity - 1GiB memory and 1 vCPU is enough for a small workload. Can be scaled up as needed
Request timeout - 300s (default).
Max requests per container - 80 (default). Can be fine tuned depending on the load.
Autoscaling - Min. 0, Max. N (depending on the load).
Ingress - Allow all traffic
Authentication - Allow unauthenticated invocations - they will be authenticated by OAuth2-Proxy.
Connections / Cloud SQL connections - choose the Cloud SQL instance created earlier.
Security / Service account - choose the service account created for this deployment earlier.

Important note on the CPU allocation and pricing setting

From the code of the entrypoint it’s clear that there are two services running simultaneously within a single container. The front-end service is OAuth2-Proxy and the backend service is the actual MLFlow instance. Because of this configuration, the entrypoint needs to wait before the MLFlow server starts before proceeding to the next step, which is starting the OAuth2-Proxy. Without the wait loop, the service will not start properly and accessing the deployed Cloud Run service will result in Bad Gateway: Error proxying to upstream server:

getindata-big-data-blog-MLflow-GCP-Cloud-run

The reason why this happens is directly related to the configuration of the Cloud Run. Currently, when deploying a Cloud Run service there are two options for CPU allocation and pricing:

CPU is only allocated during request processing - you are charged per request and only when the container instance processes a request.
CPU is always allocated - you are charged for the entire lifecycle of the container instance.

The first option, the default one, which is “the most serverless one”, means that the deployed container will only receive CPU time when the request is executing. Because the MLflow is a background process, it will only get CPU time to start the server when the front-end process (which in this case is OAuth2-Proxy) is processing the HTTP request. As the requests to OAuth2-Proxy are short, the CPU will be preempted and the MLFlow server will be unable to start. Forcing the entrypoint for OAuth2-Proxy to wait before the MLFlow server starts prevents the race condition when the MLFlow server might start or not, depending on how many CPU cycles the container received during initialization.

The second option is more expensive as the CPU allocated for the container is always available. This option is a good fit for services that perform backend processing (e.g. consume Pub/Sub messages). While being more expensive, permanent allocation of the CPU would prevent the MLFlow server from being throttled and it’s process will be able to initialize correctly without the HTTP requests being made to the container (excluding the first one when scaling to 0). It’s more reliable to initialize all the processes within the container first before reporting ready status to the executor service (here: Cloud Run) though.

Configuring secrets

The secrets created in the Step 4 need to be mounted in the Cloud Run service. The configuration is shown below:

getindata-big-data-blog-mlflow-google-cloud-run-service

The connection string to the database (first secret) should be mounted as BACKEND_STORE_URI environment variable - it will be consumed by the MLFlow Server. The configuration for the OAuth2-Proxy should be mounted as volume and the path specified here should be passed in the OAUTH_PROXY_CONFIG environment variable (reference the setup.sh above to understand the passing of these parameters).

Step 7: Finishing OAuth2.0 configuration

In Step 2, the OAuth 2.0 client was pre-configured but the Authorized redirect URIs were left blank. Once the Cloud Run service is deployed, it’s URL will be available in the UI:

getindata-big-data-blog-oauth-configuration-machine-learining-gcp-mlflow

This URL needs to be specified in the OAuth 2.0 client configuration as follows:

https://mlflow-.a.run.app/oauth2/callback

so that the redirections will work properly.

Accessing serverless MLFlow deployment

Once deployed, MLFlow service can be accessed either from the browser or from a backend service.

Browser

The deployed Cloud Run service will trigger OAuth 2.0 flow in order to authorize the user.

Service-to-service

When using CI/CD jobs, curl or Python SDK for MLFlow, requests need to be authorized by passing the Authorization HTTP header with a Bearer token provided by the OAuth 2.0 token provider, which in this case is Google (note that not all providers supported by the OAuth2-Proxy allow you to obtain a server-to-server token).

In order to get the token, thecommand can be used. Make sure that the account requesting the token has appropriate permissions within the GCP project (Viewer should be enough).

curl -X GET https://<redacted>.a.run.app/api/2.0/mlflow/experiments/list -H "Authorization: Bearer $(gcloud auth print-identity-token)"
{
  "experiments": [
    {
      "experiment_id": "0",
      "name": "Default",
      "artifact_location": "gs://<redacted>/0",
      "lifecycle_stage": "active"
    },
    {
      "experiment_id": "1",
      "name": "testasdasd",
      "artifact_location": "gs://<redacted>/1",
      "lifecycle_stage": "active"
    }
  ]
}

Authorizing Python SDK only requires passing the generated token to the MLFLOW_TRACKING_TOKENenvironment variable, no source code changes are required (example taken from MLFlow docs):

MLFLOW_TRACKING_TOKEN=$(gcloud auth print-identitytoken)
MLFLOW_TRACKING_URI=https://<redacted>.a.run.app python sklearn_elasticnet_wine/train.py

Summary

I hope this guide helped you to quickly deploy scalable MLFlow instances on Google Cloud Platform. Happy (serverless) experiment tracking!

Special thanks to Mateusz Pytel and Mariusz Wojakowski for helping me with the research on this deployment.

Interested in ML and MLOps solutions? How to improve ML processes and scale project deliverability? Watch our MLOps demo and sign up for a free consultation.

machine learning

MLOps

MLFlow

oAuth

Google Cloud Platform

GCP

Last updated: 1 February 2022

Written by

Marcin Zabłocki

MLOps Architect

Like this post?
Spread the word

Want more? Check our articles

DATA Pill – the blue pill that (accidentally) works!

Ever felt overwhelmed by the flood of news about the latest technologies, tools, and trends in Data, AI, and ML? A new framework here, a revolutionary…

Tutorial

From 0 to MLOps with ❄️ Snowflake Data Cloud in 3 steps with the Kedro-Snowflake plugin

MLOps on Snowflake Data Cloud MLOps is an ever-evolving field, and with the selection of managed and cloud-native machine learning services expanding…

getindator create a high tech and dynamic illustration represen a37ec8de 4a50 49d5 95b5 ba7eaf847b88

Tutorial

Flink SQL - Changelog and Races

Managing data efficiently and accurately is a significant challenge in the ever-evolving landscape of stream processing. Apache Flink, a powerful…

llm cluster hugging face gke autopilot getindataobszar roboczy 1 4

Tutorial

Deploy open source LLM in your private cluster with Hugging Face and GKE Autopilot

Deploying Language Model (LLMs) based applications can present numerous challenges, particularly when it comes to privacy, reliability and ease of…

Tutorial

Different generations of CICD tools

What is CICD? It is an acronym for Continuous Integration Continuous Delivery / Deployment. CICD can be also described as the methodology focused on…

getindata google data studio bigquery usage costs

Tutorial

Google Data Studio on BigQuery - usage and cost control

Data Studio is a reporting tool that comes along with other Google Cloud Platform products to bring out a simple yet reliable BI platform. There are…

Check All

Contact us

Interested in our solutions?
Contact us!

Together, we will select the best Big Data solutions for your organization and build a project that will have a real impact on your organization.

What did you find most impressive about GetInData?

They did a very good job in finding people that fitted in Acast both technically as well as culturally.

Type the form or send a e-mail: hello@getindata.com

Deploying serverless MLFlow on Google Cloud Platform using Cloud Run

Prerequisites

Target setup

Step 1: Setting appropriate permissions

Step 2: Pre-configuring OAuth 2.0 Client

Step 3: Configuring Cloud SQL / Cloud Storage

Step 4: Storing configuration in Secret Manager

Step 5: Preparing MLFlow Docker image

Step 6: Deploying MLFlow with OAuth2-Proxy on Cloud Run

Setting Cloud Run parameters

Important note on the CPU allocation and pricing setting

Configuring secrets

Step 7: Finishing OAuth2.0 configuration

Accessing serverless MLFlow deployment

Browser

Service-to-service

Summary

Like this post?Spread the word

Want more? Check our articles

DATA Pill – the blue pill that (accidentally) works!

From 0 to MLOps with ❄️ Snowflake Data Cloud in 3 steps with the Kedro-Snowflake plugin

Flink SQL - Changelog and Races

Deploy open source LLM in your private cluster with Hugging Face and GKE Autopilot

Different generations of CICD tools

Google Data Studio on BigQuery - usage and cost control

Contact us

Interested in our solutions?Contact us!

Like this post?
Spread the word

Interested in our solutions?
Contact us!