Radio DaTa Podcast
8 min read

Data Journey with Alessandro Romano (FREE NOW) – Dynamic pricing in a real-time app, technology stack and pragmatism in data science.

In this episode of the RadioData Podcast, Adam Kawa talks with Alessandro Romano about FREE NOW use cases: the data, techniques, signals and KPIs used to develop the dynamic pricing ML model for a real-time mobile app. We also talk about the feedback loop and the technology stack.

We encourage you to listen to the whole podcast or, if you prefer reading, skip to the key takeaways listed below.

___________

Host: Adam Kawa, CEO of GetInData | Part of Xebia

Since 2010, Adam has been working with Big Data at Spotify (where he proudly operated one of the largest and fastest-growing Hadoop clusters in Europe), Truecaller and as a Cloudera Training Partner. Nine years ago, he co-founded GetInData | Part of Xebia – a company that helps its customers to become data-driven and builds custom Big Data solutions. Adam is also the creator of many community initiatives such as the RadioData podcast, Big Data meetups and the DATA Pill newsletter.

Guest: Alessandro Romano, Senior Data Scientist


Alessandro Romano is a Senior Data Scientist at Kuehne+Nagel, who previously worked for FREE NOW. Alessandro started working as a Data Scientist 6 years ago. He studied Computer Science and Business Informatics – a mix of statistics, computer science and economics which could be described today as a Data Science profile.

________________

FREE NOW and the FREE NOW use case

FREE NOW is a multi-mobility company whose service lets users request different types of transportation – such as a taxi, car sharing, electric scooters or a private hire vehicle, depending on the region – all in a single application.

FREE NOW processes multiple data sources and data types, which is what makes it a top mobility service on the market. One of the first problems that Alessandro had to solve was dynamic pricing for drivers. The solution had to take the supply and demand of drivers into account and provide the right price for the current situation.

_________________

Key takeaways:

1. How did FREE NOW solve the dynamic pricing problem?

The problem was solved by processing, in real time, the signals that the app receives from its environment. The basic solution is about preserving the balance between the supply and demand of drivers, rather than about engaging as many passengers or drivers as possible. Nor is it mainly about increasing revenue: it is about balancing supply and demand when the difference between the two is high (e.g. high demand and low supply, or high supply and low demand).

2. What are the typical data sources or data streams used to calculate the price dynamically?

The typical data collected for supply and demand is:

  • how many passengers are requesting a driver at one time (e.g. 10 passengers requesting a taxi),
  • how many drivers are available (e.g. 1 driver is available).

In this example, demand is very high compared to supply, so we raise the price so that only those passengers who really need a driver can afford it. In the opposite case, when supply is high but demand is low, we lower the price.
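As a toy illustration – not FREE NOW's actual formula – this balancing rule can be sketched as a surge multiplier driven by the demand/supply ratio. The square-root damping, the cap and the floor are all assumptions made here for the example:

```python
def surge_multiplier(requests: int, available_drivers: int,
                     base_multiplier: float = 1.0,
                     cap: float = 3.0) -> float:
    """Scale the base price by the demand/supply imbalance, capped."""
    if available_drivers == 0:
        return cap  # no supply at all: apply the maximum allowed surge
    ratio = requests / available_drivers
    # ratio > 1: more passengers than drivers -> raise the price;
    # ratio < 1: spare drivers -> lower it (floored at 0.5x here)
    return round(max(0.5, min(cap, base_multiplier * ratio ** 0.5)), 2)

print(surge_multiplier(10, 1))  # 3.0 – the 10-passengers-per-driver case, capped
print(surge_multiplier(2, 8))   # 0.5 – oversupply, so the price drops
```

The damping and the cap matter: without them, a momentary spike in requests would translate directly into an extreme price.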

It seems simple, but it’s quite complicated underneath, because we enrich the basic supply and demand information with, for example, weather data (which can impact the predictions).

If it is raining in London, then whoever is leaving the office is going to request a taxi, because no one wants to get wet. When there is heavy rain, we can be fairly sure that everyone is going to book a taxi no matter the price, so demand is high – and this can be predicted using additional weather data, for example.
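A minimal sketch of how such enrichment could look as model input – the field names and the rain threshold are illustrative assumptions, not the actual FREE NOW schema:

```python
def build_pricing_features(requests: int, drivers: int, weather: dict) -> dict:
    """Combine raw supply/demand counts with contextual weather signals."""
    return {
        "demand": requests,
        "supply": drivers,
        "imbalance": requests / max(drivers, 1),          # avoid division by zero
        "is_raining": weather.get("precipitation_mm", 0) > 0,
    }

print(build_pricing_features(10, 1, {"precipitation_mm": 2.5}))
```

A rainy-London feature like `is_raining` lets the model anticipate a demand spike before the requests actually arrive.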

3. What kinds of algorithms or quality metrics should you use to choose the right model to balance the supply and demand?

There are a bunch of KPIs (Key Performance Indicators) that drive the process of selecting the right model. We also use accuracy, but it’s not as important as the KPIs and testing the model online. Many experiments are run in which the models are tested against real-time data, and there is also a lot of A/B testing involved. We try to see how the model interacts with the environment and whether it meets expectations.

There is a feedback loop between the model and the environment: the model reads the environment and sets up the price, and this event changes the environment (the demand) which is a new environment for the model. This is a very complex problem to solve. The model has to react quickly to certain events and has to be stable in a constantly changing environment.

The feedback loop is quite fast, which enables FREE NOW to experiment with different pricing strategies and algorithms within minutes or hours. Additionally, when talking about predictions, we get immediate feedback, such as a comparison between the expected time of arrival and the real time of arrival – which sets FREE NOW apart from, for example, Spotify or other companies that deal with large amounts of data.
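The loop described above can be illustrated with a toy simulation in which every price update suppresses the next step's demand. The elasticity value and the price bounds are arbitrary assumptions, chosen only to show the dynamic:

```python
def simulate_feedback(demand: float, supply: float, steps: int = 5,
                      elasticity: float = 0.6) -> list:
    """Toy feedback loop: each price the model sets changes the demand it sees next."""
    history = []
    for _ in range(steps):
        # the model reads the environment and sets up the price...
        price = max(0.5, min(3.0, demand / supply))
        # ...and that price changes the environment (the demand) for the next step
        demand *= (1.0 / price) ** elasticity
        history.append((round(price, 2), round(demand, 2)))
    return history

# Starting with demand at twice the supply: the price spikes, then settles
for price, demand in simulate_feedback(10.0, 5.0):
    print(price, demand)
```

In this sketch the loop converges because the elasticity is well-behaved; in reality, keeping the model stable in a constantly changing environment is exactly the hard part.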

4. What are the most important KPIs that FREE NOW is using?

This depends on the business, but in the case of FREE NOW it’s important to check how many of the quotes (pricing requests for a ride) are converted into bookings, how many of those bookings we send to the drivers and how many of them are accepted by the drivers.
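These funnel KPIs are straightforward ratios over the quote-to-acceptance pipeline; the function and the example numbers below are made up for illustration:

```python
def funnel_kpis(quotes: int, bookings: int, dispatched: int, accepted: int) -> dict:
    """Conversion rates along the quote -> booking -> dispatch -> acceptance funnel."""
    return {
        "quote_to_booking": bookings / quotes,        # quotes converted into bookings
        "booking_to_dispatch": dispatched / bookings, # bookings sent to drivers
        "driver_acceptance": accepted / dispatched,   # dispatches accepted by drivers
        "end_to_end": accepted / quotes,              # overall conversion
    }

print(funnel_kpis(quotes=1000, bookings=400, dispatched=380, accepted=300))
```

Tracking each stage separately shows where a pricing change hurts: a price rise may lower quote-to-booking conversion while raising driver acceptance.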

It might be that the KPI interpretation changes during the use of the model, or the KPIs change from quarter to quarter.

5. What is the FREE NOW technology stack used for experimenting, developing new machine learning models and deploying them into production at scale?

The most common tools that we use are:

  • Databricks – our main tool for designing and implementing new models; it keeps everything in one place. We use PySpark and Jupyter Notebooks to tackle the Big Data and Machine Learning parts.
  • PyCharm – the main tool, best suited to developing production models.
  • Airflow – for scheduling.
  • Tableau – the main tool for data analytics.
  • TensorFlow – for dynamic pricing solutions.

Whenever we want to execute a Databricks notebook containing a training pipeline, we use the Databricks Operator, which calls the notebook from Airflow, and we build our training pipeline from there.

The cloud stack used in the background is AWS, although we don’t interact with it directly most of the time.

6. Do you use MLflow or Kedro for your Machine Learning projects?

We use MLflow, mainly because it’s available out of the box in Databricks. You can track your experiments, your model and all the information output by your notebook in one place, which is very helpful.

7. Do you use Kafka or Flink for real-time streaming and processing?

We use those tools, but we don’t maintain them ourselves; we use Kafka Streams to process the streaming data on the Kafka cluster.

8. What is unique about FREE NOW when compared to your competitors?

Having a multi-mobility app is a lot of fun. You can use it every day with many car-sharing services.

When talking about uniqueness from the data science and data engineering perspective, I can say that we have a great team with a lot of smart people who contribute every day to the whole project. This is clearly visible from the inside. As for the outside, the CEO would probably be a better person to answer this question.

9. What are the trends that you currently see in the BigData and Data Science landscape that are worth keeping an eye on in the near future?

Regarding technologies and trends, it seems that we haven't yet discovered the proper way to use neural networks and AI overall. There are a lot of stories about people trying to solve problems by following the trends and failing, because they thought that AI and neural networks would solve everything.

Right now there is more understanding of the fact that it does not have to be a neural network. There are lots of other techniques, from statistics, computer science, etc., that can be successfully applied to a wide range of problems which don’t need a fancy solution like a neural network. Sometimes even a simple algorithm like regression, or a straightforward implementation of a function, solves the problem.

Over recent years we’ve lost track of what the correct way of solving problems is. The new technologies that we started using do not solve everything. The technologies should be applied to the right classes of problems. Overengineered solutions are always hard to maintain.

___________________

These are just snippets from the entire conversation which you can listen to here: 

Subscribe to the Radio DaTa podcast to stay up-to-date with the latest technology trends and discover the most interesting data use cases! 


Data Science
dynamic pricing
real-time app
10 August 2023
