Use-cases/Project
2 min read

Running Spark on Amazon Web Services (AWS)

running apache spark on aws
Source: Acast-Tech https://medium.com/acast-tech/

When you search thought the net looking for methods of running Apache Spark on AWS infrastructure you are most likely to be redirected to the documentation of AWS EMR (Elastic Map Reduce) service, which is Amazon's Hadoop distribution suited to run in AWS cloud environment. It's quite an easy way to deploy your data pipelines, but sometimes bootstrapping a huge cluster to perform simple ad-hoc analysis it's a cumbersome task. They say:

"to a man with a hammer everything looks like a nail" :)

and we felt into this trap with EMR once.

The article below describes two other ways of running Apache Spark jobs on AWS-managed infrastructure - AWS Glue and AWS Fargate - that we use on our clients' data warehousing projects. You will find there the key differences between these methods when it comes to flexibility and pricing, showing why there is no place for "one service fits all" approach in AWS world.

Check out!

big data
spark
AWS
Amazon Web Services
18 December 2019

Want more? Check our articles

aiobszar roboczy 1 4
Tutorial

EU Artificial Intelligence Act - where are we now

It's coming up to a year since the European Commission published its proposal for the Artificial Intelligence Act (the AI Act/AI Regulation).  The…

Read more
1 RsDrT5xOpdAcpehomqlOPg
Big Data Event

2³ Reasons To Speak at Big Data Tech Warsaw 2020 (February 27th, 2020)

Big Data Technology Warsaw Summit 2020 is fast approaching. This will be 6th edition of the conference that is jointly organised by Evention and…

Read more
datamass getindata adoption genai
Big Data Event

A Review of the Presentations at the DataMass Gdańsk Summit 2023

The Data Mass Gdańsk Summit is behind us. So, the time has come to review and summarize the 2023 edition. In this blog post, we will give you a review…

Read more
getindata big data blog apache spark iceberg
Tutorial

Apache Spark with Apache Iceberg - a way to boost your data pipeline performance and safety

SQL language was invented in 1970 and has powered databases for decades. It allows you not only to query the data, but also to modify it easily on the…

Read more
hfobszar roboczy 1 4
Tutorial

Automated Machine Learning (AutoML) with BigQuery ML. Start Machine Learning easily and validate if ML is worth investing in or not.

Machine learning is becoming increasingly popular in many industries, from finance to marketing to healthcare. But let's face it, that doesn't mean ML…

Read more
bloggobszar roboczy 1 4
Tutorial

What are Graph Neural Networks and why should you consider using them in your Recommendation System?

Introduction Graph Neural Networks (GNNs) have been one of the hottest topics in the AI world in recent years, with many potential business…

Read more

Contact us

Interested in our solutions?
Contact us!

Together, we will select the best Big Data solutions for your organization and build a project that will have a real impact on your organization.


What did you find most impressive about GetInData?

They did a very good job in finding people that fitted in Acast both technically as well as culturally.
Type the form or send a e-mail: hello@getindata.com
The administrator of your personal data is GetInData Poland Sp. z o.o. with its registered seat in Warsaw (02-508), 39/20 Pulawska St. Your data is processed for the purpose of provision of electronic services in accordance with the Terms & Conditions. For more information on personal data processing and your rights please see Privacy Policy.

By submitting this form, you agree to our Terms & Conditions and Privacy Policy