Radio DaTa Podcast

Data Journey with Alessandro Romano (FREE NOW) – Dynamic pricing in a real-time app, technology stack and pragmatism in data science.

In this episode of the RadioData Podcast, Adam Kawa talks with Alessandro Romano about FREE NOW use cases: the data, techniques, signals and KPIs used to develop the dynamic pricing ML model for a real-time mobile app. They also talk about the feedback loop and the technology stack.

We encourage you to listen to the whole podcast or, if you prefer reading, skip to the key takeaways listed below.

___________

Host: Adam Kawa, CEO of GetInData | Part of Xebia

Since 2010, Adam has been working with Big Data at Spotify (where he proudly operated one of the largest and fastest-growing Hadoop clusters in Europe), Truecaller and as a Cloudera Training Partner. Nine years ago, he co-founded GetInData | Part of Xebia – a company that helps its customers to become data-driven and builds custom Big Data solutions. Adam is also the creator of many community initiatives such as the RadioData podcast, Big Data meetups and the DATA Pill newsletter.

Guest: Alessandro Romano, Senior Data Scientist


Alessandro Romano is a Senior Data Scientist at Kuehne+Nagel who previously worked for FREE NOW. Alessandro started working as a Data Scientist 6 years ago. He studied Computer Science and Business Informatics – a mix of statistics, computer science and economics which could be described today as a Data Science profile.

________________

FREE NOW and the FREE NOW use case

FREE NOW is a multi-mobility company whose service lets users request different types of transportation, such as a taxi, car-sharing, electric scooters or a private taxi, depending on the region, all in a single application.

FREE NOW is a top mobility service on the market that processes multiple data sources and data types. One of the first problems that Alessandro had to solve was dynamic pricing. The solution had to take into account the supply of drivers and the demand from passengers, and provide the right price for the actual situation.

_________________

Key takeaways:

1. How did FREE NOW solve the dynamic pricing problem?

The problem was solved by processing the signals that the app gets from the environment in real time. The basic solution is about preserving the balance between the supply of drivers and the demand for them, and less about engaging as many passengers or as many drivers as possible. It is not mainly about increasing revenue, but about rebalancing supply and demand when the difference between the two is high (e.g. high demand and low supply, or high supply and low demand).

2. What are the typical data sources or data streams used to calculate the price dynamically?

The typical data collected for supply and demand is:

  • how many passengers are requesting a driver at one time (e.g. 10 passengers requesting a taxi),
  • how many drivers are available (e.g. 1 driver is available).

In this example, the demand is very high in comparison to the supply, so we raise the price so that only those passengers that really need a driver can afford it. In the other case, when the supply is high but the demand is low, we lower the price.

It seems simple, but it’s quite complicated underneath, because we enrich the basic information about supply and demand with, for example, weather data, which can impact the predictions.

If it is raining in London, then everyone leaving the office is going to request a taxi, because no one wants to get wet. When there is heavy rain we can be sure that everyone is going to book a taxi no matter the price, so the demand is high, and this can be predicted by using additional data such as the weather.
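As a rough sketch of what such enrichment could look like, the snippet below adjusts the observed number of requests with a rain signal. The `DemandSnapshot` type, the `expected_requests` function and the uplift coefficients are hypothetical; in a real system this relationship would be learned by the model rather than hard-coded.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DemandSnapshot:
    """Hypothetical record combining app signals with a weather signal."""
    requests: int
    available_drivers: int
    rain_mm_per_hour: float


def expected_requests(snapshot: DemandSnapshot,
                      uplift_per_mm: float = 0.15,
                      max_uplift: float = 1.0) -> float:
    """Scale observed requests by an assumed, capped rain uplift."""
    rain = max(snapshot.rain_mm_per_hour, 0.0)
    # Assumption: each mm/h of rain adds 15% demand, capped at +100%.
    uplift = min(rain * uplift_per_mm, max_uplift)
    return snapshot.requests * (1.0 + uplift)


if __name__ == "__main__":
    dry = DemandSnapshot(requests=10, available_drivers=1, rain_mm_per_hour=0.0)
    wet = DemandSnapshot(requests=10, available_drivers=1, rain_mm_per_hour=20.0)
    print(expected_requests(dry), expected_requests(wet))
```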

3. What kinds of algorithms or quality metrics should you use to choose the right model to balance the supply and demand?

There are a number of KPIs (Key Performance Indicators) that drive the process of selecting the right model. We also use accuracy, but it’s not as important as the KPIs and testing the model online. Many experiments are run in which models are tested against real-time data, and a lot of A/B testing is involved. We try to see how the model interacts with the environment and whether it meets the expectations.
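One common way to read the result of such an A/B test on a conversion KPI is a two-proportion z-test. The episode does not say which statistical test FREE NOW uses, so the sketch below is an assumption, and the function name and the sample counts are made up for illustration.

```python
import math


def two_proportion_z_test(conv_a: int, n_a: int,
                          conv_b: int, n_b: int) -> tuple[float, float]:
    """Return (z, two-sided p-value) for H0: variants A and B convert equally."""
    if n_a <= 0 or n_b <= 0:
        raise ValueError("sample sizes must be positive")
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled conversion rate under the null hypothesis.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1.0 - pooled) * (1.0 / n_a + 1.0 / n_b))
    if se == 0.0:
        # Both variants converted 0% or 100%: no evidence of a difference.
        return 0.0, 1.0
    z = (p_b - p_a) / se
    # Two-sided p-value of a standard normal: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_value


if __name__ == "__main__":
    # Made-up counts: 200/1000 bookings with model A, 260/1000 with model B.
    z, p = two_proportion_z_test(200, 1000, 260, 1000)
    print(f"z={z:.2f}, p={p:.4f}")
```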

There is a feedback loop between the model and the environment: the model reads the environment and sets the price, and this event changes the environment (the demand), which the model then sees as a new environment. This is a very complex problem to solve. The model has to react quickly to certain events and has to be stable in a constantly changing environment.

The feedback loop is quite fast, which enables FREE NOW to experiment with different pricing strategies and algorithms within minutes or hours. Additionally, when talking about predictions, we can get immediate feedback and a comparison between, for example, the expected time of arrival and the real time of arrival, which makes FREE NOW unique in comparison to, for example, Spotify or other companies that deal with a large amount of data.
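The loop can be illustrated with a toy simulation in which the price influences demand and the observed imbalance influences the next price. Everything here is an assumption made for the example: the constant-elasticity demand curve, the proportional update rule, the price floor and all parameter values. A production model would be far richer.

```python
def simulate_pricing_loop(steps: int = 50,
                          base_demand: float = 100.0,
                          supply: float = 50.0,
                          elasticity: float = 0.5,
                          step_size: float = 0.2) -> list[tuple[float, float]]:
    """Toy feedback loop; returns a list of (new_price, observed_demand)."""
    price = 1.0
    history = []
    for _ in range(steps):
        # Environment reacts to the current price (assumed demand curve).
        demand = base_demand * price ** (-elasticity)
        # Model reads the environment: relative gap between demand and supply.
        imbalance = (demand - supply) / supply
        # Model nudges the price toward balance; 0.5 is an assumed price floor.
        price = max(0.5, price * (1.0 + step_size * imbalance))
        history.append((price, demand))
    return history


if __name__ == "__main__":
    for step, (price, demand) in enumerate(simulate_pricing_loop(steps=10), 1):
        print(f"step {step:2d}: price multiplier {price:.2f}, demand {demand:.1f}")
```

A small `step_size` keeps the loop stable at the cost of a slower reaction, which mirrors the trade-off between reacting quickly and staying stable mentioned above.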

4. What are the most important KPIs that FREE NOW is using?

This depends on the business, but in the case of FREE NOW it’s important to check how many of the quotes (pricing requests for a ride) are converted into bookings, how many of those bookings we send to the drivers and how many of them are accepted by the drivers.
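These funnel KPIs are simple ratios between consecutive stages. A minimal sketch follows, assuming the stage counts are already aggregated; the `FunnelCounts` type, the KPI names and the numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FunnelCounts:
    """Hypothetical aggregated counts for one time window."""
    quotes: int            # pricing requests for a ride
    bookings: int          # quotes converted into bookings
    sent_to_drivers: int   # bookings dispatched to drivers
    accepted: int          # dispatched bookings accepted by drivers


def _rate(numerator: int, denominator: int) -> float:
    # Report 0.0 instead of failing when a stage has no traffic.
    return numerator / denominator if denominator else 0.0


def funnel_kpis(counts: FunnelCounts) -> dict[str, float]:
    """Conversion rate of each funnel step plus the end-to-end rate."""
    return {
        "quote_to_booking": _rate(counts.bookings, counts.quotes),
        "booking_to_dispatch": _rate(counts.sent_to_drivers, counts.bookings),
        "dispatch_to_acceptance": _rate(counts.accepted, counts.sent_to_drivers),
        "quote_to_acceptance": _rate(counts.accepted, counts.quotes),
    }


if __name__ == "__main__":
    window = FunnelCounts(quotes=1000, bookings=400, sent_to_drivers=380, accepted=304)
    for name, value in funnel_kpis(window).items():
        print(f"{name}: {value:.1%}")
```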

It might be that the KPI interpretation changes during the use of the model, or the KPIs change from quarter to quarter.

5. What is the FREE NOW technology stack that is used for experimenting, developing new machine learning models and deploying them in production at scale?

The most common tools that we use are:

  • Databricks – our main tool for designing and implementing new models, which keeps things in one place. We use PySpark and Jupyter Notebook to tackle the Big Data and Machine Learning part.
  • PyCharm – the main tool, best suited to production model development.
  • Airflow – for scheduling.
  • Tableau – the main tool for data analytics.
  • TensorFlow – for dynamic pricing solutions.

Whenever we want to execute a Databricks notebook that contains a training pipeline, we use the Databricks operator, which calls the notebook from Airflow, and we build our training pipeline from there.

The cloud stack that is used in the background is AWS, although we don’t interact with it directly most of the time.

6. Do you use MLflow or Kedro for your Machine Learning projects?

We use MLflow, mainly because it’s available out of the box in Databricks. You can track your experiments alongside your model and all the information that is the output of your notebook in one place, which is very helpful.

7. Do you use Kafka or Flink for real-time streaming and processing?

We use those tools but we don’t maintain them. We use Kafka Streams for stream processing of the data on the Kafka cluster.

8. What is unique about FREE NOW when compared to your competitors?

Having a multi-mobility app is a lot of fun. You can use it every day for many car-sharing services.

When talking about uniqueness from the data science and data engineering perspective, I can say that we have a great team with a lot of smart people who contribute every day to the whole project. This is clearly visible from the inside. Regarding the outside, the CEO would probably be a better person to answer this question.

9. What are the trends that you currently see in the Big Data and Data Science landscape that are worth keeping an eye on in the near future?

Regarding the technologies and trends, it seems that we haven't discovered a proper way of using neural networks and AI overall. There are a lot of stories about people trying to solve problems by following the trends and failing, because they thought that AI and NN would solve everything.

Right now there is more understanding about the fact that it does not have to be a neural network. There are lots of other technologies that come from statistics, computer science, etc. that can be successfully applied to a wide range of problems which don’t need a fancy solution like a neural network. Sometimes, even a simple algorithm like regression or an implementation of a function solves the problem.

Over recent years we’ve lost track of what the correct way of solving problems is. The new technologies that we started using do not solve everything. The technologies should be applied to the right classes of problems. Overengineered solutions are always hard to maintain.

___________________

These are just snippets from the entire conversation which you can listen to here: 

Subscribe to the Radio DaTa podcast to stay up-to-date with the latest technology trends and discover the most interesting data use cases! 


Data Science
dynamic pricing
real-time app
10 August 2023
