Big Data Event
A Review of the Big Data Technology Warsaw Summit 2024! Part 2: Private RAG-backed Data Copilot, Allegro and PLAY case studies
In this blog post series, we share takeaways from selected topics presented during the Big Data Tech Warsaw Summit ’24. In the first part, which you…
Tutorial
Real-time ingestion to Iceberg with Kafka Connect - Apache Iceberg Sink
What is Apache Iceberg? Apache Iceberg is an open table format for huge analytics datasets which can be used with commonly used big data processing…
Tutorial
Apache Spark with Apache Iceberg - a way to boost your data pipeline performance and safety
SQL was invented in the 1970s and has powered databases for decades. It allows you not only to query the data, but also to modify it easily on the…
Tutorial
Deploying MLflow on the Google Cloud Platform using App Engine
MLOps platforms delivered by GetInData allow us to pick best-of-breed technologies to cover crucial functionalities. MLflow is one of the key…
Tutorial
dbt run real-time analytics on Apache Flink. Announcing the dbt-flink-adapter!
We would like to announce the dbt-flink-adapter, which allows running pipelines defined in SQL in a dbt project on Apache Flink. Find out what the…
Use-cases/Project
Geospatial analytics on Hadoop
A few months ago I was working on a project with a lot of geospatial data. Data was stored in HDFS, easily accessible through Hive. One of the tasks…