Deploying serverless MLFlow on Google Cloud Platform using Cloud Run

At GetInData, we build elastic MLOps platforms to fit our customers’ needs. One of the key functionalities of an MLOps platform is the ability to track experiments and manage trained models in a model repository. Our flexible platform allows our customers to pick best-of-breed technologies to provide those functionalities, whether open source or commercial. MLFlow is one of the open source tools that fits this role well. In this blog post, I will show you how to deploy MLFlow on top of Cloud Run, Cloud SQL and Google Cloud Storage to obtain a fully managed, serverless service for experiment tracking and model repository. The service will be protected using OAuth 2.0 authorization and SSO with Google accounts.

Prerequisites

  • access to the Google Cloud Platform
  • service account dedicated to run MLFlow in Cloud Run
  • OAuth 2.0 Client ID and Client Secrets for authorization
  • Docker installation (for building the image, but this can also be done in CI)

Target setup

The final setup described in this blog post will look like this:

[Figure: target architecture of serverless MLFlow on Google Cloud Platform using Cloud Run]

One of the most important aspects here is the use of OAuth2-Proxy as a middle layer within the Cloud Run container. Cloud Run’s built-in authorization mechanism (the “Require authentication. Manage authorized users with Cloud IAM.” option) is usable only for programmatic requests. For interactive requests to a web-based UI (as with the MLFlow web UI), there is no built-in way yet to authenticate in the Cloud Run service.

Step 1: Setting appropriate permissions

The service account that will be used to run MLFlow in Cloud Run needs to have the following permissions configured:

  • Cloud SQL Client
  • Secret Manager Secret Accessor (optionally this can be set directly on each secret separately)
  • Storage Object Viewer
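
As a rough illustration, the three roles above could be granted from a script. The sketch below only builds the corresponding `gcloud projects add-iam-policy-binding` commands and prints them. The project ID and service account email are hypothetical placeholders, and granting the roles at project level is an assumption (they can also be scoped more narrowly).

```python
from __future__ import annotations

import shlex

# Role IDs corresponding to the role names listed above.
ROLES = [
    "roles/cloudsql.client",               # Cloud SQL Client
    "roles/secretmanager.secretAccessor",  # Secret Manager Secret Accessor
    "roles/storage.objectViewer",          # Storage Object Viewer
]


def iam_binding_commands(project_id: str, service_account_email: str) -> list[list[str]]:
    """Return one `gcloud projects add-iam-policy-binding` argv per role."""
    member = f"serviceAccount:{service_account_email}"
    return [
        ["gcloud", "projects", "add-iam-policy-binding", project_id,
         f"--member={member}", f"--role={role}"]
        for role in ROLES
    ]


if __name__ == "__main__":
    # Hypothetical names, for illustration only.
    for argv in iam_binding_commands(
        "my-project", "mlflow-runner@my-project.iam.gserviceaccount.com"
    ):
        print(shlex.join(argv))
```

The printed commands can be reviewed and then run in a shell, or each argv list can be passed to `subprocess.run`.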

Step 2: Pre-configuring OAuth 2.0 Client

In order to integrate OAuth 2.0 authorization with Cloud Run, OAuth2-Proxy will be used as a proxy on top of MLFlow. OAuth2-Proxy can work with many OAuth providers, including GitHub, GitLab, Facebook, Google, Azure and others. Using the Google provider allows easy integration of SSO in the interactive MLFlow UI and also simplifies service-to-service authorization using bearer tokens. Thanks to this, MLFlow will be securely accessible from its Python SDK, e.g. when a model training job is executed.

The Client ID / Client Secret can be created by visiting https://console.cloud.google.com/apis/credentials/oauthclient and configuring the new client as a Web Application as shown below. During pre-configuration, the Authorized redirect URIs will be left blank. Later it will be filled with the URL of the deployed Cloud Run instance.

[Figure: OAuth client configuration for MLFlow in the GCP console]

Once created, Client ID and Client Secret will be displayed. They must be stored securely for later configuration.

Step 3: Configuring Cloud SQL / Cloud Storage

Both Cloud SQL and Cloud Storage configurations are straightforward and they will be skipped in this blog post. Official GCP documentation about connecting Cloud SQL with Cloud Run covers all aspects of this setup well.

  • Depending on the load, Cloud SQL instances can be scaled up as needed. For small deployments, a standard instance with 1 vCPU should be sufficient.
  • It’s important that both the Cloud SQL instance and the GCS bucket are in the same GCP region as the Cloud Run instance to minimize both latency and the costs of data transfers.
  • Cloud SQL needs to have a database named mlflow created before it can be used.
  • Any database flavour supported by MLFlow can be used.

It’s good practice to have a separate GCS bucket for use by MLFlow.

Step 4: Storing configuration in Secret Manager

Both MLFlow and OAuth2-Proxy require configuration that contains sensitive data (e.g. OAuth2.0 Client Secret, database connection string). This configuration will be stored in Secret Manager and then mounted in the Cloud Run service via environment variables and files. Two secrets should be created:

  1. Secret with connection string to the Cloud SQL (for MLFlow)
  2. Secret with OAuth2.0 proxy configuration file as per template below:
email_domains = [
    "<SSO EMAIL DOMAIN>"
]
provider = "google"
client_id = "<CLIENT ID>"
client_secret = "<SECRET>"
skip_jwt_bearer_tokens = true
extra_jwt_issuers = "https://accounts.google.com=32555940559.apps.googleusercontent.com"
cookie_secret = "<COOKIE SECRET IN BASE64>"

Cookie secret can be generated using head -c 32 /dev/urandom | base64.
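
For readers who prefer to generate the secret from Python, a minimal sketch follows. It uses the URL-safe base64 alphabet, as the Python one-liner in the OAuth2-Proxy documentation does, and checks that the raw secret has a valid AES key length. The function name is hypothetical.

```python
import base64
import secrets


def generate_cookie_secret(num_bytes: int = 32) -> str:
    """Return a URL-safe base64 string encoding num_bytes random bytes."""
    # OAuth2-Proxy uses the secret as an AES key, so only these sizes are valid.
    if num_bytes not in (16, 24, 32):
        raise ValueError("cookie secret must be 16, 24 or 32 bytes long")
    return base64.urlsafe_b64encode(secrets.token_bytes(num_bytes)).decode("ascii")


if __name__ == "__main__":
    print(generate_cookie_secret())
```

The printed value can be pasted into the cookie_secret field of the configuration template above.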

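The extra_jwt_issuers parameter is required for the service-to-service authorization to work properly. It takes issuer=audience pairs; the audience used here is the client ID under which the gcloud CLI issues identity tokens for user accounts.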
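
For the first secret, the connection string has to point at the Unix socket that Cloud Run exposes under /cloudsql for attached Cloud SQL instances. The sketch below builds such a string under two assumptions that are not stated in this post: the database flavour is PostgreSQL and the psycopg2 driver is available in the image. The user, password and instance names are hypothetical; other flavours need a different URL scheme and socket parameter.

```python
from urllib.parse import quote


def backend_store_uri(
    user: str,
    password: str,
    instance_connection_name: str,
    database: str = "mlflow",
) -> str:
    """Build a SQLAlchemy URL that connects through the /cloudsql Unix socket."""
    # Percent-encode the credentials so characters like '@' or '/' stay unambiguous.
    credentials = f"{quote(user, safe='')}:{quote(password, safe='')}"
    # For psycopg2, the `host` query parameter names the socket directory.
    return (
        f"postgresql+psycopg2://{credentials}@/{database}"
        f"?host=/cloudsql/{instance_connection_name}"
    )


if __name__ == "__main__":
    # Hypothetical values: instance connection names look like project:region:instance.
    print(backend_store_uri("mlflow", "p@ss/word", "my-project:europe-west1:mlflow-db"))
```

The resulting string is what would be stored in the first secret and later exposed to the container as BACKEND_STORE_URI.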

Step 5: Preparing MLFlow Docker image

The Docker image for running MLFlow with OAuth2-Proxy will be based on GetInData’s public MLFlow Docker image from: https://github.com/getindata/mlflow-docker (as MLFlow does not provide one yet). There are two modifications required:

  1. Installation of OAuth2-Proxy.
  2. Installation of Tini entrypoint.

OAuth2-Proxy acts as the authorization layer in front of MLFlow, which runs as a background process in the container. Tini serves as the container’s init process and entrypoint, forwarding signals and reaping zombie processes.

Dockerfile

FROM gcr.io/getindata-images-public/mlflow:1.22.0
ENV TINI_VERSION=v0.19.0
# Port on which OAuth2-Proxy listens (must match --http-address in start.sh)
EXPOSE 4180

# Install curl and netcat (netcat is used by start.sh), then download OAuth2-Proxy
RUN apt update && apt install -y curl netcat && mkdir -p /oauth2-proxy && cd /oauth2-proxy && \
    curl -L -o proxy.tar.gz https://github.com/oauth2-proxy/oauth2-proxy/releases/download/v6.1.1/oauth2-proxy-v6.1.1.linux-amd64.tar.gz && \
    tar -xzf proxy.tar.gz && mv oauth2-proxy-*.linux-amd64/oauth2-proxy . && rm proxy.tar.gz && \
    rm -rf /var/lib/apt/lists/*

# Tini runs as the init process (PID 1) of the container
ADD https://github.com/krallin/tini/releases/download/${TINI_VERSION}/tini /tini
RUN chmod +x /tini

COPY start.sh start.sh
RUN chmod +x start.sh

ENTRYPOINT ["/tini", "--", "./start.sh"]

start.sh

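#!/usr/bin/env bash
set -e

# Start the MLFlow server in the background
mlflow server --host 0.0.0.0 --port 8080 --backend-store-uri "${BACKEND_STORE_URI}" --default-artifact-root "${DEFAULT_ARTIFACT_ROOT}" &
# Wait until MLFlow accepts connections before starting the proxy
while ! nc -z localhost 8080 ; do sleep 1 ; done
# Start OAuth2-Proxy in front of MLFlow on the port exposed to Cloud Run
/oauth2-proxy/oauth2-proxy --upstream=http://localhost:8080 --config="${OAUTH_PROXY_CONFIG}" --http-address=0.0.0.0:4180 &

# Exit as soon as either background process terminates
wait -n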

Details on this entrypoint can be found in the Step 6 section below. Once the image is built it needs to be pushed to GCR in order for Cloud Run to deploy it.

Step 6: Deploying MLFlow with OAuth2-Proxy on Cloud Run

Once the Docker image is built, it can be deployed in Cloud Run. 

Setting Cloud Run parameters

Standard parameters should be set as follows:

  • Container image URL - path to image pushed to GCR in Step 5.
  • Container port - 4180
  • Container command / container arguments - can be left empty as they are configured in the Dockerfile
  • CPU allocation and pricing - see below
  • Capacity - 1GiB memory and 1 vCPU is enough for a small workload. Can be scaled up as needed
  • Request timeout - 300s (default).
  • Max requests per container - 80 (default). Can be fine tuned depending on the load.
  • Autoscaling - Min. 0, Max. N (depending on the load).
  • Ingress - Allow all traffic
  • Authentication - Allow unauthenticated invocations - they will be authenticated by OAuth2-Proxy.
  • Connections / Cloud SQL connections - choose the Cloud SQL instance created earlier.
  • Security / Service account - choose the service account created for this deployment earlier.

Important note on the CPU allocation and pricing setting

From the code of the entrypoint it’s clear that there are two services running simultaneously within a single container. The front-end service is OAuth2-Proxy and the backend service is the actual MLFlow instance. Because of this configuration, the entrypoint needs to wait until the MLFlow server has started before proceeding to the next step, which is starting OAuth2-Proxy. Without the wait loop, the service will not start properly and accessing the deployed Cloud Run service will result in Bad Gateway: Error proxying to upstream server:

[Figure: screenshot of the Bad Gateway error page]

The reason why this happens is directly related to the configuration of the Cloud Run. Currently, when deploying a Cloud Run service there are two options for CPU allocation and pricing:

  • CPU is only allocated during request processing - you are charged per request and only when the container instance processes a request.
  • CPU is always allocated - you are charged for the entire lifecycle of the container instance.

The first option, the default one, which is “the most serverless one”, means that the deployed container will only receive CPU time while a request is being processed. Because MLFlow runs as a background process, it will only get CPU time to start the server while the front-end process (which in this case is OAuth2-Proxy) is processing an HTTP request. As the requests to OAuth2-Proxy are short, the CPU will be preempted and the MLFlow server will be unable to start. Forcing the entrypoint to wait until the MLFlow server has started before launching OAuth2-Proxy prevents the race condition in which the MLFlow server may or may not start, depending on how many CPU cycles the container received during initialization.

The second option is more expensive as the CPU allocated for the container is always available. This option is a good fit for services that perform backend processing (e.g. consume Pub/Sub messages). While being more expensive, permanent allocation of the CPU would prevent the MLFlow server from being throttled and its process would be able to initialize correctly without HTTP requests being made to the container (excluding the first one after scaling to 0). It’s more reliable, though, to initialize all the processes within the container first before reporting ready status to the executor service (here: Cloud Run).
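
The wait loop from start.sh can also be expressed in Python, which makes it easy to add an upper bound on the waiting time. This is only a sketch of the same idea, not what the image in this post runs; the function name and the default timeout are assumptions.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 60.0, interval: float = 1.0) -> bool:
    """Poll until a TCP connection to host:port succeeds or the timeout elapses."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A successful connect means the backend is accepting connections.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Refused or timed out: give up after the deadline, otherwise retry.
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

A caller would start the proxy only when `wait_for_port("localhost", 8080)` returns True and exit with an error otherwise, so that a backend that never comes up fails the container start instead of blocking it forever.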

Configuring secrets

The secrets created in the Step 4 need to be mounted in the Cloud Run service. The configuration is shown below:

[Figure: secrets configuration of the Cloud Run service]

The connection string to the database (first secret) should be mounted as the BACKEND_STORE_URI environment variable - it will be consumed by the MLFlow server. The configuration for OAuth2-Proxy should be mounted as a volume and the path specified here should be passed in the OAUTH_PROXY_CONFIG environment variable (see start.sh above to understand how these parameters are passed).
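
The console settings from this step, including the secret mounts, can also be expressed as a single `gcloud run deploy` call. The sketch below builds that command as an argv list. All resource names, the mount path and the max-instances value are hypothetical, and setting DEFAULT_ARTIFACT_ROOT as a plain environment variable pointing at the bucket is an assumption based on the variables that start.sh reads.

```python
from __future__ import annotations

import shlex


def deploy_command(
    service: str,
    image: str,
    region: str,
    service_account: str,
    cloudsql_instance: str,
    bucket: str,
    db_secret: str,
    proxy_secret: str,
    proxy_config_path: str = "/secrets/oauth2-proxy/config.cfg",
    max_instances: int = 3,
) -> list[str]:
    """Return the argv of a `gcloud run deploy` call mirroring the settings above."""
    secret_mounts = ",".join([
        f"BACKEND_STORE_URI={db_secret}:latest",       # exposed as an environment variable
        f"{proxy_config_path}={proxy_secret}:latest",  # a path key mounts the secret as a file
    ])
    env_vars = ",".join([
        f"OAUTH_PROXY_CONFIG={proxy_config_path}",
        f"DEFAULT_ARTIFACT_ROOT=gs://{bucket}",
    ])
    return [
        "gcloud", "run", "deploy", service,
        f"--image={image}",
        f"--region={region}",
        "--port=4180",
        "--memory=1Gi", "--cpu=1",
        "--timeout=300", "--concurrency=80",
        "--min-instances=0", f"--max-instances={max_instances}",
        "--ingress=all",
        "--allow-unauthenticated",  # requests are authenticated by OAuth2-Proxy instead
        f"--add-cloudsql-instances={cloudsql_instance}",
        f"--service-account={service_account}",
        f"--set-secrets={secret_mounts}",
        f"--set-env-vars={env_vars}",
    ]


if __name__ == "__main__":
    # Hypothetical names, for illustration only.
    print(shlex.join(deploy_command(
        "mlflow", "gcr.io/my-project/mlflow-oauth:latest", "europe-west1",
        "mlflow-runner@my-project.iam.gserviceaccount.com",
        "my-project:europe-west1:mlflow-db", "my-mlflow-bucket",
        "mlflow-db-uri", "mlflow-oauth2-proxy-config",
    )))
```

The default CPU allocation mode is left untouched here; adding `--no-cpu-throttling` would select the “CPU is always allocated” option discussed above.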

Step 7: Finishing OAuth2.0 configuration

In Step 2, the OAuth 2.0 client was pre-configured but the Authorized redirect URIs were left blank. Once the Cloud Run service is deployed, its URL will be available in the UI:

[Figure: URL of the deployed Cloud Run service in the console]

This URL needs to be specified in the OAuth 2.0 client configuration as follows:

 https://mlflow-.a.run.app/oauth2/callback

so that the redirections will work properly.
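
A tiny helper can derive the redirect URI from the service URL and guard against copy-paste mistakes such as a trailing slash or a missing scheme. This is only an illustrative sketch; the function name and the example host are hypothetical, and the /oauth2/callback path is the one from the URL above.

```python
from urllib.parse import urlsplit, urlunsplit


def redirect_uri(service_url: str) -> str:
    """Return the OAuth 2.0 redirect URI for a Cloud Run service URL."""
    parts = urlsplit(service_url)
    if parts.scheme != "https" or not parts.netloc:
        raise ValueError(f"expected an https:// service URL, got {service_url!r}")
    # Keep only scheme and host, then append the callback path.
    return urlunsplit(("https", parts.netloc, "/oauth2/callback", "", ""))


if __name__ == "__main__":
    print(redirect_uri("https://mlflow-example.a.run.app/"))
```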

Accessing serverless MLFlow deployment

Once deployed, MLFlow service can be accessed either from the browser or from a backend service.

Browser

The deployed Cloud Run service will trigger OAuth 2.0 flow in order to authorize the user.

[Figures: three screenshots showing the successive steps of the browser sign-in flow]

Service-to-service

When using CI/CD jobs, curl or Python SDK for MLFlow, requests need to be authorized by passing the Authorization HTTP header with a Bearer token provided by the OAuth 2.0 token provider, which in this case is Google (note that not all providers supported by the OAuth2-Proxy allow you to obtain a server-to-server token).

In order to get the token, the gcloud auth print-identity-token command can be used. Make sure that the account requesting the token has appropriate permissions within the GCP project (Viewer should be enough).

curl -X GET https://<redacted>.a.run.app/api/2.0/mlflow/experiments/list -H "Authorization: Bearer $(gcloud auth print-identity-token)"
{
  "experiments": [
    {
      "experiment_id": "0",
      "name": "Default",
      "artifact_location": "gs://<redacted>/0",
      "lifecycle_stage": "active"
    },
    {
      "experiment_id": "1",
      "name": "testasdasd",
      "artifact_location": "gs://<redacted>/1",
      "lifecycle_stage": "active"
    }
  ]
}
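
The same request can be made from Python using only the standard library. The sketch below obtains the token by calling the gcloud command used in the curl example and sends it in the Authorization header. The function names are hypothetical, and it assumes the gcloud CLI is installed and logged in on the machine that runs it.

```python
import json
import subprocess
import urllib.request


def identity_token() -> str:
    """Ask the local gcloud CLI for an identity token."""
    result = subprocess.run(
        ["gcloud", "auth", "print-identity-token"],
        check=True, capture_output=True, text=True,
    )
    return result.stdout.strip()


def experiments_request(base_url: str, token: str) -> urllib.request.Request:
    """Build an authorized GET request for the experiments list endpoint."""
    url = base_url.rstrip("/") + "/api/2.0/mlflow/experiments/list"
    return urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})


def list_experiments(base_url: str) -> list:
    """Call the deployed service and return the parsed list of experiments."""
    request = experiments_request(base_url, identity_token())
    with urllib.request.urlopen(request) as response:
        return json.load(response)["experiments"]
```

Calling `list_experiments` with the URL of the deployed service should return the same entries as the curl output above.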


Authorizing the Python SDK only requires passing the generated token in the MLFLOW_TRACKING_TOKEN environment variable; no source code changes are required (example taken from MLFlow docs):

MLFLOW_TRACKING_TOKEN=$(gcloud auth print-identity-token) \
MLFLOW_TRACKING_URI=https://<redacted>.a.run.app python sklearn_elasticnet_wine/train.py
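
When the training script is launched from another Python process (for example a CI helper), the same two variables can be prepared programmatically. A minimal sketch, with hypothetical function names, assuming the token has already been obtained elsewhere:

```python
import os
import subprocess
import sys
from typing import Dict, Optional


def tracking_env(tracking_uri: str, token: str,
                 base: Optional[Dict[str, str]] = None) -> Dict[str, str]:
    """Return a copy of the environment with the MLFlow tracking variables set."""
    env = dict(os.environ if base is None else base)
    env["MLFLOW_TRACKING_URI"] = tracking_uri
    env["MLFLOW_TRACKING_TOKEN"] = token  # sent by the MLFlow client as a Bearer token
    return env


def run_training(script: str, env: Dict[str, str]) -> None:
    """Run a training script with the prepared environment."""
    subprocess.run([sys.executable, script], env=env, check=True)
```

Google identity tokens are short-lived (typically about one hour), so a long-running job may need to obtain a fresh token and rebuild the environment before it reconnects.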

Summary

I hope this guide helped you to quickly deploy scalable MLFlow instances on Google Cloud Platform. Happy (serverless) experiment tracking!

Special thanks to Mateusz Pytel and Mariusz Wojakowski for helping me with the research on this deployment.


1 February 2022
