Single interface for all your Data Science needs

Give your Data Scientists the freedom to choose the tools they want, so they can deliver the meaningful insights you need. Let them discover the data sets most relevant to their research and boost their productivity.

They get value from the Data Science Platform:

How does the Data Science Platform work?


Data Source

Your IT systems exchange vast amounts of information: technical messages about a form being opened on your website, network traffic information and sensor data, but also more meaningful information such as new orders from your customers. You already have access to most of that information in dedicated systems, in a more aggregated form and on demand. But what would you do if you could combine messages from different systems and react on the spot, just after they were generated? Event processing systems are designed to analyse messages in real time, enrich them with external information, combine them into more complex events, analyse them for patterns and trigger actions.
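To make the enrich, combine and trigger steps more concrete, here is a minimal sketch in plain Python. It is an illustration only, not the platform's implementation: the event types, the `CUSTOMER_SEGMENTS` lookup and the `FormOpenDetector` class are all hypothetical names invented for this example.

```python
from dataclasses import dataclass, field
from typing import Callable

# Hypothetical external reference data used to enrich raw events.
CUSTOMER_SEGMENTS = {"c1": "premium", "c2": "standard"}


@dataclass
class FormOpenDetector:
    """Flags customers who open the order form repeatedly without ordering."""

    threshold: int
    on_alert: Callable[[dict], None]
    _form_opens: dict = field(default_factory=dict, init=False)

    def process(self, event: dict) -> None:
        customer = event["customer_id"]
        # Enrich: attach the customer segment from the external lookup.
        segment = CUSTOMER_SEGMENTS.get(customer, "unknown")

        if event["type"] == "order_placed":
            # An order resets the pattern for this customer.
            self._form_opens.pop(customer, None)
        elif event["type"] == "form_opened":
            count = self._form_opens.get(customer, 0) + 1
            self._form_opens[customer] = count
            # Combine: several simple events become one complex event,
            # which triggers the action exactly once at the threshold.
            if count == self.threshold:
                self.on_alert({
                    "type": "possible_abandonment",
                    "customer_id": customer,
                    "segment": segment,
                    "form_opens": count,
                })


alerts = []
detector = FormOpenDetector(threshold=3, on_alert=alerts.append)
stream = [
    {"type": "form_opened", "customer_id": "c1"},
    {"type": "form_opened", "customer_id": "c2"},
    {"type": "form_opened", "customer_id": "c1"},
    {"type": "order_placed", "customer_id": "c2"},
    {"type": "form_opened", "customer_id": "c1"},
]
for event in stream:
    detector.process(event)
```

In this sketch the action is simply appending to a list; in a real deployment it could be a notification or a call to another system.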


Realtime data stream

The business value of information decreases over time. It may be useful for your use case to analyse data in real time, so you can monitor your business activities and react on the spot.


External data sources

Sometimes you may want to use data that is not available in your Data Lake. Our design allows you to access data from multiple systems, such as external databases, files and data stores, within a single query. You do not need to copy data from different sources to use it in your report.
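The idea of one query spanning several systems can be sketched with Python's built-in `sqlite3` module. This is only a small stand-in for a federated query engine, and it is an assumption of this example rather than the technology used by the platform: one SQLite database plays the role of the Data Lake, a second attached database plays the role of an external system, and the table names are hypothetical.

```python
import sqlite3

# The main database stands in for the Data Lake.
conn = sqlite3.connect(":memory:")
# A second, separate database stands in for an external system.
# It is attached before any data is written, while no transaction is open.
conn.execute("ATTACH DATABASE ':memory:' AS crm")

conn.execute("CREATE TABLE main.orders (order_id INTEGER, customer_id INTEGER, amount REAL)")
conn.executemany(
    "INSERT INTO main.orders VALUES (?, ?, ?)",
    [(1, 10, 120.0), (2, 11, 35.5), (3, 10, 80.0)],
)

conn.execute("CREATE TABLE crm.customers (customer_id INTEGER, name TEXT)")
conn.executemany(
    "INSERT INTO crm.customers VALUES (?, ?)",
    [(10, "Acme"), (11, "Globex")],
)

# A single query joins both sources; nothing is copied between them.
rows = conn.execute(
    """
    SELECT c.name, SUM(o.amount) AS total
    FROM main.orders AS o
    JOIN crm.customers AS c ON c.customer_id = o.customer_id
    GROUP BY c.name
    ORDER BY c.name
    """
).fetchall()
```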


Data Lake

The Data Lake is where your structured data (such as transactions from an e-commerce system), semi-structured data (e.g. XML or JSON files) and unstructured data (such as images or documents) is loaded and made accessible for reporting and analytics. Data is stored securely, which means it can be accessed only by authorised users, and in data structures optimised for performance.

Unified Data Science and ML


Data Science/ML Notebooks

Notebooks have become a standard interface for Data Scientists working with data. They are interactive, web-based development environments where you can combine data from different sources, use various technologies and visualise output. Notebooks are very open and flexible: they can be configured to support a wide range of workflows in data science, scientific computing and machine learning. Standard functionality can be extended with existing or custom plugins. There is also a wide variety of visualisation libraries available for static and interactive plots. Notebooks give you the freedom to choose the tools most appropriate to the task, help structure research and make it easy to share with peers. The list of supported technologies is long; to mention just a few: Python, R, Julia and Ruby.


Interactive BI

Interactive BI lets you explore data and verify hypotheses about it. Using an interactive tool, you can connect to the Data Lake or other data sources; together they form a federated data source that you can query no matter where the data is physically stored. Data can be reported on demand or on a schedule.


Data Discovery

The Data Discovery component should be the first step in data analytics. Its main goal is to improve the productivity of data analysts and data scientists. Put simply, it is a catalogue of all the data sets available for your work. Data sets are searchable and come with descriptions, popularity scores, quality metrics and named domain knowledge experts. You can easily find the most promising data sets and check with peers who have more experience working with them.
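As a rough illustration of what such a catalogue holds, the sketch below models an entry with the attributes listed above and a simple search ranked by popularity. The `DatasetEntry` fields, the sample entries and the `search` function are hypothetical and do not describe the actual catalogue tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DatasetEntry:
    name: str
    description: str
    popularity: int       # e.g. number of queries in the last month
    quality_score: float  # 0.0 to 1.0, e.g. from automated checks
    expert: str           # domain knowledge expert to contact


CATALOGUE = [
    DatasetEntry("orders_raw", "Raw order events from the web shop", 45, 0.80, "piotr"),
    DatasetEntry("orders_daily", "Daily e-commerce orders per customer", 120, 0.97, "anna"),
    DatasetEntry("sensor_readings", "Temperature sensor data", 30, 0.92, "maria"),
]


def search(catalogue, term):
    """Return entries whose name or description contains the term, most popular first."""
    needle = term.lower()
    hits = [
        entry
        for entry in catalogue
        if needle in entry.name.lower() or needle in entry.description.lower()
    ]
    return sorted(hits, key=lambda entry: entry.popularity, reverse=True)


results = search(CATALOGUE, "order")
```

Each result carries the name of an expert, so the next step after finding a promising data set is knowing whom to ask about it.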



The security and access management tool lets you control user access to data and to components of the environment. It provides audit capabilities for verifying who has access to specific resources.



Deployment automation and proper configuration management are key to ensuring high-quality software delivery and reducing the risk of production deployments. All our code is stored in a version control system. We design tests to be part of the Continuous Integration and Continuous Deployment pipelines.



A comprehensive monitoring and observability solution gives detailed information on the state and performance of the components. You can also deploy metrics to observe application processing behaviour. Monitoring also includes alerting capabilities, which are needed for reliability and supportability.



Originally, all components of the Hadoop ecosystem were installed with YARN as the orchestrator to achieve scalability and manage infrastructure resources. Nowadays Kubernetes is becoming the new standard for managing resources in distributed computing environments. We design our applications and workloads to work directly on Kubernetes.


Use cases

The need for proper reporting in your business is hard to dispute. However, unified access to all your data, and the ability to combine data from different sources in a single report, can take your analytics capabilities to a higher level. Access to the right technology will not only increase your team's productivity but also improve the reliability and consistency of your reporting. It is also a foundation for becoming a data-driven organisation.



Get Free White Paper

Take a look at some of the big data projects delivered by our big data expert team


How do we work with customers?

We have a different way of working with clients, one that allows us to build deep, trust-based partnerships that often endure for years. It is based on a few powerful and pragmatic principles, tested and refined over many years of consulting and project delivery experience.

  • Big Data is a process

    Big Data is not about technologies but about building a culture of collecting, analysing and using data in a structured way, in an innovation-friendly environment. We can help you start this journey.

  • DataOps principles

    Our code is versioned, unit tested and deployed using CI/CD. We also design unit tests for data to measure its quality in large data sets.

  • Open source or native cloud services

    We build our solutions with openness in mind, so we make extensive use of open source software. In some cases, however, we suggest using managed services offered by public cloud providers.

  • On-premise or in public cloud

    Our solutions are designed to be deployed on your local infrastructure, in hybrid cloud or fully in the public cloud.

  • Technology agnostic

    Our solutions are designed to accommodate best practices and our vast experience in Big Data, and are not based on specific technologies. This gives us the flexibility to adjust the design to the specifics of the project and the current state of the art, to better serve the goal.

  • Hadoop distribution

    For customers who want to stick to an open source, free version of Hadoop, we have prepared our own distribution built from the latest packages.
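The "unit tests for data" mentioned under DataOps principles can be pictured as small, automated checks that run against a data set the same way unit tests run against code. The sketch below is a simplified, hypothetical example in plain Python; the function names, thresholds and sample rows are assumptions made for illustration, not the checks we run in projects.

```python
def null_rate(rows, column):
    """Share of rows in which the column is missing."""
    if not rows:
        return 0.0
    missing = sum(1 for row in rows if row.get(column) is None)
    return missing / len(rows)


def check_quality(rows, required, max_null_rate, unique_key):
    """Return a list of readable failures; an empty list means the data passed."""
    failures = []
    for column in required:
        rate = null_rate(rows, column)
        if rate > max_null_rate:
            failures.append(f"{column}: null rate {rate:.2f} exceeds {max_null_rate:.2f}")
    keys = [row.get(unique_key) for row in rows]
    if len(keys) != len(set(keys)):
        failures.append(f"{unique_key}: duplicate values found")
    return failures


# Hypothetical sample with one missing amount and one duplicated key.
sample = [
    {"order_id": 1, "amount": 120.0},
    {"order_id": 2, "amount": None},
    {"order_id": 2, "amount": 35.5},
    {"order_id": 4, "amount": 80.0},
]
failures = check_quality(
    sample,
    required=["order_id", "amount"],
    max_null_rate=0.1,
    unique_key="order_id",
)
```

Run as part of a CI/CD pipeline, a non-empty `failures` list would stop the data from being published until the issue is resolved.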

Ready to build your Data Science Platform?

Please fill out the form and we will get back to you as soon as possible to schedule a meeting to discuss the GID Platform.

What did you find most impressive about GetInData?

GetInData is a relatively small agency with experienced professionals that enjoy and perform their job exceptionally well. Their attentiveness and code quality are impressive
We were super impressed with the quality of their work and the knowledge of their engineers. They have very high standards in terms of code quality, organisational skills and are always willing to contribute with their best. They also are very friendly and easy going people, what made our collaboration more fun.
They did a very good job in finding people that fitted in Acast both technically as well as culturally.

Let's start a project together

Fill in the form or send an e-mail:
By submitting this form, you agree to our  Terms & Conditions