How we helped our client to transfer legacy pipeline to modern one using GitLab's CI/CD - Part 1

This blog series is based on a project delivered for one of our clients. We have split the content into three parts; you can find the table of contents below. Dive in!

PART I

Problem description
General description of the solution
Problem 1: Limited job output size in GitLab
Problem 2: Limited duration of jobs running on shared runners

PART II

Problem 3: Building a container image in the job
Problem 4: The GitLab Registry token expires too quickly
Problem 5: In the paid GitLab.com plan we have a limit on the shared runners used time
Problem 6: User's names with national characters in GitLab

PART III

Problem 7: Passing on artifacts between CI/CD jobs
Problem 8: Starting docker build manually
Problem 9: We cannot rely on the error code returned by Puppet
Summary

Problem description

As part of a project for one of our clients, we had to solve the problem of building and deploying an application consisting of 30 services. About a year ago, this application was developed using the waterfall model, but the client decided to adapt its practices to the agile model. We have built a pipeline based on GitLab, which allows us to build all services as quickly as possible and upload them to the artifacts server.

Building one service consists of the following steps:

Download sources.
Compilation (creation of artifacts).
Uploading artifacts to the repository.

Before we modified (accelerated) the process of building the entire application, the components were built one by one. The whole process took about 40 hours. This meant that developers had to wait a long time for feedback.

The individual services that make up the application can be built in parallel (independently of each other). Thanks to this parallelization, we reduced the build time to 4 hours. The build of the entire application starts every day at 20:00 and ends at midnight, so everything is built overnight and there is still time to run the tests.

We based our solution on GitLab as a service.

In this article we will describe what problems we encountered and how we solved them.

General description of the solution

This is how the pipeline looks in GitLab:

General description of the pipeline:

Phase 1: Building the container base image. It consists of only one job, which builds the base image of the container, containing all the dependencies needed to compile all services. This image is used for all other jobs in the pipeline.
Phase 2: Simultaneous compilation of all 30 services.
Phase 3: Uploading artifacts to the repository.

Problem 1: Limited job output size in GitLab

GitLab has a limit on the amount of output displayed from a job. After exceeding the limit, we see the following output (important elements are marked with a red, dotted line):

At first, we tried to limit the amount of output generated by the jobs that build the services by filtering out (egrep -v) unnecessary messages. However, this is time-consuming and requires adding new rules (regular expressions) from time to time.
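As a minimal sketch of this first approach, a filter could look roughly like the one below. The patterns and the function name filter_build_output are hypothetical and only illustrate the idea; grep -E is the current spelling of egrep.

```shell
#!/bin/bash
# Hypothetical patterns for messages treated as noise; adjust them to your build.
noise_patterns='^Downloading |^\[DEBUG\]|^Progress: '

# Reads build output on stdin and drops every line matching a noise pattern.
filter_build_output() {
    # grep exits with 1 when no line passes the filter; treat that as success.
    grep -E -v -- "${noise_patterns}" || [ "$?" -eq 1 ]
}
```

It would be used as ./build-foo.sh 2>&1 | filter_build_output. Every new kind of noisy message needs another pattern, which is exactly the maintenance cost described above.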

Ultimately, we decided to write the entire output to a file in the container's file system and then upload it to Google Cloud Storage. Once implemented, this solution needs no further maintenance (unlike the first one). In addition, we keep the full output, no matter how large it is. The disadvantage is that we can't see the output in real time while the CI/CD job is running; we only see the final fragment of the output once the service has finished building. In practice this is not a big drawback: if the build is still running, it usually means that everything is OK, and watching the output in real time would not change the result.

Increasing the limit in the runner configuration was not an option in our case, because we used shared runners provided by gitlab.com as part of a paid plan. When we ran out of shared runner time, we used runners running on our own Kubernetes cluster (GCP). We prefer to use shared runners (when available) to optimize costs.

This problem is common, and other users have also encountered it: https://gitlab.com/gitlab-com/support-forum/issues/2790

What should you do to be able to store output files in Google Cloud Storage?

Create a new bucket: https://cloud.google.com/storage/docs/creating-buckets
Create a service account, then create and download a key for this account (a JSON file).
Grant the service account permission to modify the bucket.
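The three steps above can also be done from the command line. The following is only a sketch: the project ID, bucket name and service account name are placeholders we made up, and the objectAdmin role is our assumption of what "permission to modify the bucket" means here.

```shell
#!/bin/bash
# All names below are placeholders (assumptions), not values from the project.
project_id="my-project-1"
bucket="gs://my-cicd-job-logs"
sa_name="cicd-logs-upload"
sa_email="${sa_name}@${project_id}.iam.gserviceaccount.com"

setup_log_bucket() {
    # 1. Create the bucket.
    gsutil mb -p "${project_id}" "${bucket}"
    # 2. Create the service account and download a JSON key for it.
    gcloud iam service-accounts create "${sa_name}" --project "${project_id}"
    gcloud iam service-accounts keys create service-account-credentials.json \
        --iam-account "${sa_email}"
    # 3. Let the service account create, list and delete objects in the bucket.
    gsutil iam ch "serviceAccount:${sa_email}:objectAdmin" "${bucket}"
}
```

Calling setup_log_bucket as a user with sufficient rights in the project leaves the key in service-account-credentials.json in the current directory.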

We can connect to Google Cloud Storage using the gsutil tool. A slight inconvenience is that before we can use this tool to manage files in our bucket, we have to log in. The login process is interactive, which prevents it from being executed in CI/CD scripts.

However, you can work around this inconvenience, because the interactive login creates configuration files on disk that contain all the keys needed to connect to Google Cloud Storage and perform operations non-interactively. Just copy these files and place them, for example, inside the containers in which our CI/CD jobs are run.

Below are instructions on how to perform the first login and copy the GCS key files. We will use a container image that already has the gsutil utility installed with all its dependencies.

Terminal 1

$ docker run -ti --name gcloud-login-1 gcr.io/google.com/cloudsdktool/cloud-sdk:latest /bin/bash
docker$ cd /root
docker$ cat > service-account-credentials.json << EOF
... Paste credentials file here ...
EOF
docker$ gcloud auth activate-service-account --key-file=service-account-credentials.json
docker$ gsutil ls gs://__YOUR_BUCKET_NAME_HERE__
... some output ...
docker$ tar -czf gcs-cicd-credentials.tgz --exclude=.config/gcloud/logs -C "${HOME}" .gsutil/ .config/gcloud/ service-account-credentials.json

Don't exit the Docker container just yet. Open a second terminal and run:

Terminal 2

$ docker cp gcloud-login-1:/root/gcs-cicd-credentials.tgz .

Now you can exit the container from the first terminal.

The contents of the gcs-cicd-credentials.tgz file should look like this:

gcs-cicd-credentials.tgz file and directory list

$ tar -vtf gcs-cicd-credentials.tgz
drwxr-xr-x root/root         0 2020-03-06 19:27 .gsutil/
-rw-r--r-- root/root         0 2020-03-06 19:27 .gsutil/credstore2.lock
-rw-r--r-- root/root      4465 2020-03-06 19:28 .gsutil/credstore2
drwxr-xr-x root/root         0 2020-03-06 19:26 .config/gcloud/
-rw-r--r-- root/root        37 2020-03-03 18:23 .config/gcloud/.last_survey_prompt.yaml
-rw------- root/root         5 2020-03-06 19:23 .config/gcloud/gce
-rw------- root/root     12288 2020-03-06 19:26 .config/gcloud/credentials.db
-rw------- root/root     12288 2020-03-06 19:26 .config/gcloud/access_tokens.db
drwxr-xr-x root/root         0 2020-03-06 19:26 .config/gcloud/legacy_credentials/
drwx------ root/root         0 2020-03-06 19:26 .config/gcloud/legacy_credentials/some-cicd-logs-upload-service-account@foobar.iam.gserviceaccount.com/
-rw------- root/root      1967 2020-03-06 19:26 .config/gcloud/legacy_credentials/some-cicd-logs-upload-service-account@foobar.iam.gserviceaccount.com/adc.json
-rw------- root/root       142 2020-03-06 19:26 .config/gcloud/legacy_credentials/some-cicd-logs-upload-service-account@foobar.iam.gserviceaccount.com/.boto
drwxr-xr-x root/root         0 2020-03-06 19:26 .config/gcloud/configurations/
-rw-r--r-- root/root        76 2020-03-06 19:26 .config/gcloud/configurations/config_default
-rw-r--r-- root/root         7 2020-03-06 19:26 .config/gcloud/active_config
-rw-r--r-- root/root         0 2020-03-06 19:26 .config/gcloud/config_sentinel
-rw-r--r-- root/root      2329 2020-03-06 19:25 service-account-credentials.json

We still need to get these configuration files into the container in which we run the CI/CD jobs, and install the gsutil program in it. For example, we can create a container image that already includes these configuration files (with the gcs-cicd-credentials.tgz file unpacked in the user's home directory) and use it as the base image for the containers in which the CI/CD jobs run.
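Unpacking the archive while building such an image could be sketched as follows. The function name install_gcs_credentials and the chmod step are our own additions and only illustrate the idea.

```shell
#!/bin/bash
# Unpacks the credentials archive into a home directory.
# Usage: install_gcs_credentials <archive> [target-home]
install_gcs_credentials() {
    local archive="$1"
    local target_home="${2:-${HOME}}"
    mkdir -p "${target_home}"
    # GNU tar detects compression on its own when reading an archive.
    tar -xf "${archive}" -C "${target_home}"
    # The key file should be readable only by the user running the jobs.
    chmod 600 "${target_home}/service-account-credentials.json"
}
```

For example, install_gcs_credentials gcs-cicd-credentials.tgz /root places the .gsutil/ and .config/gcloud/ directories where gsutil expects them for the root user.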

Now we move to the last step of the whole process. In our case, we created a wrapper script that:

Runs the scripts to build services and saves their output to a file on disk.
After building the service, it displays the last lines of output.
Compresses the file with output.
Uploads the compressed file to Google Cloud Storage.
Displays the name of the file that has been uploaded to Google Cloud Storage.

The name of the service to build is given as the first command line argument to the wrapper.

Example of the content of such a script:

service-build-wrapper


#!/bin/bash -xe
# Usage: service-build-wrapper <service-name>
# Runs ./build-<service-name>.sh with all output redirected to a log file,
# prints the end of that log and uploads the compressed log to Google Cloud Storage.

# Job names may contain spaces or slashes; replace such characters so that
# the name is safe to use as part of a file name.
safeJobName="${CI_JOB_NAME//[^A-Za-z0-9_.-]/_}"
logFileName=$(mktemp "/tmp/log-${CI_PROJECT_NAME}-$(date +%Y%m%d-%H%M%S)-job-id-${CI_JOB_ID}-job-name-${safeJobName}-XXXXXXXXXX")

cat << EOF
#############################################################################################
Running job script. Please be patient. No output will be shown until it finishes.
All output is saved to a file on disk and will be uploaded to Google Cloud Storage at the end.
The end of the job output will be displayed here for your convenience.
It can take some time for the job to finish.
  -- $(date)
#############################################################################################
EOF

# Disable exit-on-error so that the log of a failed build is still uploaded.
set +e
bash -xe "./build-${1}.sh" >> "${logFileName}" 2>&1
retcode="$?"
set -e
echo
echo "<job-output-end>"
tail -n 1000 "${logFileName}"
echo "</job-output-end>"
echo
echo "[I] Compressing log file"
bzip2 -9 "${logFileName}"
echo "[I] Uploading log file to Google Cloud Storage"
gsutil cp "${logFileName}.bz2" gs://foobar-gitlab-ci-cd-jobs-output/
echo "[I] Log file uploaded under name '$(basename "${logFileName}").bz2'"
rm "${logFileName}.bz2"
echo "[I] Date: $(date)"
# Propagate the exit code of the build script, not of the upload steps.
echo "[I] Exiting with code '${retcode}'."
exit "${retcode}"

Thanks to this solution, we can store job logs as long as we need and we don't have to worry about their size.
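To read a stored log later, something along these lines could be used. This is a sketch: the helper name show_job_log is hypothetical, and the bucket name is the one from the wrapper script above.

```shell
#!/bin/bash
# Bucket name taken from the wrapper script above; replace it with your own.
log_bucket="gs://foobar-gitlab-ci-cd-jobs-output"

# Downloads an uploaded log by name and prints its decompressed content.
# Usage: show_job_log <log-file-name>.bz2
show_job_log() {
    local name="$1"
    local tmpdir
    tmpdir="$(mktemp -d)"
    gsutil cp "${log_bucket}/${name}" "${tmpdir}/"
    bzcat "${tmpdir}/${name}"
    rm -r "${tmpdir}"
}
```

The file name to pass is the one printed by the wrapper at the end of the job; piping the result to less or grep makes large logs easier to handle.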

Problem 2: Limited duration of jobs running on shared runners

Building some of the services in our project takes more than 2 hours. The default job duration limit is 1 hour. You can change it in the CI/CD settings of the repository in GitLab. We observed that shared GitLab runners only allow the limit to be raised to 3 hours. In our case this was enough: even the longest job took less than 3 hours.

In the CI/CD job that has timed out, we can see the following messages (marked with a red dotted line):

Message: The script exceeded the maximum execution time set for the job.

Message: ERROR: Job failed: execution took longer than 1h0m0s seconds.

If we need a limit of more than 3 hours, we must use our own runners. For example, you can use a Kubernetes cluster to run your own runners. GitLab integrates well with Kubernetes, and you can quickly connect such a cluster to it. We observed that runners running on Kubernetes accept much higher job time limits (e.g. 8 hours).

A GitLab repository can run CI/CD jobs on both shared runners and our own runners at the same time. The runner on which a job will run is selected randomly. This allows you to optimize costs.

For the project in question, we chose to run additional runners on our own Kubernetes cluster.

Such a cluster can quickly be created using GKE and easily connected to GitLab.

We created the cluster as follows.

In the Google Cloud Platform, select Kubernetes Engine and click Create Cluster:

We go through the following wizard screens:

After creating the cluster, we check the number of working nodes:

We can connect to the cluster from the command line. The command that creates the .kube/config file with the appropriate content can be seen after clicking the Connect button.

$ gcloud container clusters get-credentials my-cluster-1 --zone us-aaa1-a --project my-project-1
Fetching cluster endpoint and auth data.
kubeconfig entry generated for my-cluster-1.

To view the list of nodes:

$ kubectl get nodes
NAME                                         STATUS   ROLES    AGE   VERSION
gke-my-cluster-1-default-pool-asdfghjk-1234   Ready    <none>   15m   v1.15.9-gke.24
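If you prefer the command line to the wizard, the same cluster could presumably be created as sketched below. This is not what we did in the project. The cluster name, zone and project are the placeholders from the get-credentials command above, and the single node is an assumption based on the node list.

```shell
#!/bin/bash
# Placeholder values, matching the get-credentials example above.
cluster_name="my-cluster-1"
zone="us-aaa1-a"
project_id="my-project-1"
num_nodes=1   # assumption: one node, as in the node list above

create_runner_cluster() {
    # Create the cluster with default settings and the given number of nodes.
    gcloud container clusters create "${cluster_name}" \
        --project "${project_id}" --zone "${zone}" --num-nodes "${num_nodes}"
    # Write the kubeconfig entry so that kubectl can talk to the cluster.
    gcloud container clusters get-credentials "${cluster_name}" \
        --project "${project_id}" --zone "${zone}"
    # Check that the nodes are up.
    kubectl get nodes
}
```

Machine type, autoscaling and other settings chosen in the wizard are left at their defaults here and would need additional flags.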

We now move to GitLab and find a group to which we will add a new Kubernetes cluster. Below are screenshots with the important elements highlighted.

A wizard will appear in which we have to fill out various fields. The instructions at https://gitlab.com/help/user/project/clusters/add_remove_clusters.md#add-existing-cluster include the exact commands needed to get the required information from the cluster.

Once the cluster is connected to GitLab, the Helm Tiller and GitLab Runner services must be installed. This is done by clicking the marked buttons:

We can run the following command in a separate terminal window and we will see real-time information about events in the cluster:

$ kubectl get events --all-namespaces --watch

In the Runner list available for the repository, a group Runner should appear with the tags cluster and Kubernetes:


Last updated: 28 July 2020

Written by

Maciej Korzeń

DevOps
