FAST SQL ON HADOOP

This two-day course teaches students how to efficiently analyze massive amounts of data available in a Hadoop cluster.

NEXT TERM

NOT SCHEDULED. If you are interested, please contact us!

Duration
2 days of training

Target audience
Data Analysts, BI Specialists

Technologies
e.g. Hive, Spark SQL, Presto, Impala, Avro, Parquet

Workshop overview

This two-day course teaches students how to efficiently analyze massive amounts of data available in a Hadoop cluster.

During the course we simulate real-world scenarios. Every participant plays the role of a data analyst who works for an imaginary company called StreamRock (inspired by Spotify – our favourite music streaming app). Students use popular open-source tools with SQL-like interfaces to quickly extract the knowledge hidden in large data sets. The workshop consists of practical exercises that are executed on a Hadoop cluster running in the public cloud.

The course is aimed at data analysts, BI specialists and anyone interested in iterating fast by using efficient SQL tools to extract knowledge from large datasets stored in a Hadoop cluster. Basic knowledge of SQL is assumed.

All you need to fully participate in our training program is a laptop with a web browser, a shell terminal (e.g. PuTTY) and a Wi-Fi connection. Our workshops are mostly technical (with some business content); however, you do not need previous experience with Big Data technologies.

The training provides a carefully prepared mix of theory, exercises, demos, discussions, quizzes and – fun! We make sure that each participant is highly engaged in hands-on exercises, discussions and teamwork.

Course agenda*

DAY 1


  • Introduction to use-case: StreamRock
  • Introduction to Hadoop
    • HDFS
    • YARN
  • File Formats
    • Text formats
    • Row-oriented format – Apache Avro
    • Column-oriented formats – Parquet and ORC
  • Apache Hive
    • Key concepts
    • Comparison with RDBMS
    • Hive Query Language
    • Hands-on exercises
    • Hive architecture
    • Execution engines: MapReduce, Tez, Spark
    • Useful features
    • Query optimisation techniques

DAY 2


  • Cloudera Impala
    • Typical use-cases
    • Comparison with Hive
    • Impala architecture
    • Hands-on exercises
  • Bonus – Facebook Presto
    • Comparison with Hive and Impala
    • Presto architecture
    • Demo
  • Spark SQL
    • Introduction to Spark
    • Key features
    • Integration with Hive
    • DataFrames
    • Hands-on exercises
  • Comparing Hive, Impala, Spark SQL and Presto
    • Benchmarks
    • When to use which

* GetInData reserves the right to make any changes and adjustments to the presented agenda.

Instructors

Our workshops and training programs are run by experienced instructors with many years of real-life Big Data experience. Get to know our team!

More information

Please contact us with any questions about our training courses, or if you would like to discuss a custom, on-site training course.

FEEDBACK FROM ATTENDEES

  • Hadoop Administrator Training
    Hadoop Administrator Training, Allegro

    I highly value the substantive content of the course as well as its great preparation and layout. Knowledge was passed on in an ordered, consistent and effective way. Participants' involvement during the workshop sessions is the best indicator of this positive training!

  • Big Data Workshop
    Big Data Workshop, Stepstone

    The Big Data workshops were led by real professionals, with tools and materials prepared in a way that allowed participants to get down to brass tacks straightaway without losing time. Attendees did not disturb each other and everyone could work comfortably and effectively. One can notice the striking knowledge of the host and the fact that it comes from real professional work experience.

  • IE Business School

    This is an excellent course and an excellent teacher. Adam was well prepared, knew the subject material, was good at transmitting his knowledge to us and had prepared exercises that added a lot of value to the sessions. I would rank this six if I could.

  • Hadoop Developer Training
    Hadoop Developer Training, Confidential

    Professionally prepared and led courses. Coaches with vast experience in the presented realm.

  • IE Business School

    Outstanding professor, the course was very well planned, he is very knowledgeable about what he taught. He talked about real-world cases and managed to get the whole class interested for 6 hours straight. Definitely one of the best courses that we have had in the masters.
