Posts Tagged ‘hcatalog’

We share our knowledge happily

In the first part of this blog series I described a few challenges that I had to face to quickly implement a simple Hive query and schedule it periodically on the Hadoop cluster. These challenges include data cataloguing, data discovery, data lineage and process scheduling. I also explained how they can be addressed using existing […]

This blog series is based on the talk “Simplified Data Management and Process Scheduling in Hadoop” that we gave at the Big Data Technical Conference in Poland in February 2015. Because the talk was very well received by the audience, we decided to convert it into a blog series. In the first part we describe possible open-source […]

In this blog post, I describe a few surprising gotchas related to importing a MySQL table into Hive using Sqoop 1.4.5 (the most recent version supported by vendors like Hortonworks or Cloudera at the time of writing this post). Real-world scenario: In my simple (yet real-world) use case, I have a MySQL table and […]

We are happy to share slides about HCatalog from the Data Analyst Training delivered by GetInData. HCatalog makes it easier for users of different data processing tools (such as Apache Hive, Apache Pig, MapReduce) to share data on the Hadoop cluster. The slides cover HCatalog’s primary motivation, goals, the most important features, currently […]
