Accessing Hadoop Data Using Hive

with Aaron Ritchie

Audience:
Beginners using Hive

Time to complete:
5 hours

Available in:
English

Writing MapReduce programs to analyze your Big Data can get complex. Hive makes querying that data much easier. Apache Hive, first created at Facebook, is a data warehouse system for Hadoop that supports data summarization, ad hoc queries, and the analysis of large datasets stored in Hadoop-compatible file systems. Hive provides a mechanism to project structure onto this data and to query it using a SQL-like language called HiveQL. This course teaches you how to get started with Hive 0.12 using IBM BigInsights 2.1.2.
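
To make "projecting structure onto data" concrete, here is a minimal sketch in plain Python. It is not Hive code and does not use any Hive API; it only illustrates the idea that raw files carry no schema and that column names and types are applied when the data is read. The sample data, the SCHEMA list, the table name "sales", and the read_with_schema helper are all hypothetical.

```python
import csv
import io

# Hypothetical raw data as it might sit in a file: no header, no types.
RAW = "1,alice,34.5\n2,bob,12.0\n3,carol,50.25\n"

# Hypothetical schema, applied only at read time (schema-on-read).
SCHEMA = [("id", int), ("name", str), ("amount", float)]


def read_with_schema(text, schema):
    """Yield one dict per line, casting each field to its declared type."""
    for row in csv.reader(io.StringIO(text)):
        yield {name: cast(value) for (name, cast), value in zip(schema, row)}


rows = list(read_with_schema(RAW, SCHEMA))

# Rough analogue of the query: SELECT name FROM sales WHERE amount > 20
# ("sales" is an assumed table name used only for illustration).
big_spenders = [r["name"] for r in rows if r["amount"] > 20]
print(big_spenders)
```

In Hive itself, the schema would be declared once with DDL and the filter written in HiveQL, with the work distributed across the cluster rather than run in a single process.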

Big Data University has been chosen by IBM as one of the issuers of badges as part of the IBM Open Badge program. Share your achievements through LinkedIn, Facebook, Twitter, and more!

Big Data University uses Pearson VUE Acclaim to help administer the IBM Open Badge program. By enrolling in this course, you agree to Big Data University sharing your details with Pearson VUE Acclaim solely for the purpose of issuing your badge once you complete the badge criteria.

The labs for this course were recently updated and you can now take it in our virtual lab environment in the cloud.

If you are looking for the older version of the course, visit here.

Course Syllabus

  • Understand what Apache Hive is, the Hive architecture, and Hive use cases.
  • Make basic configuration changes in a Hive installation.
  • Use DDL to create new Hive databases and tables with a variety of different data types.
  • Create partitioned tables that are optimized for Hadoop.
  • Create and run a variety of useful DML queries against Hive.
  • Use built-in Hive operators and functions to get work done.
  • Create your own user-defined functions in Hive.
  • Use a variety of different file formats and record formats with Hive.

General Information

  • This course is free.
  • It is self-paced.
  • It can be taken at any time.
  • It can be taken as many times as you wish.
  • Labs can be performed in the cloud using our virtual lab environment.
  • Students who pass the course (by passing the final exam) can print their online certificate of achievement immediately. Your name will appear on the certificate exactly as entered in your BigDataUniversity.com profile.
  • If you do not pass the course, you can take it again at any time.

Recommended skills prior to taking this course

  • Basic understanding of Apache Hadoop and Big Data.
  • Working knowledge of SQL.
  • Basic Linux Operating System knowledge.

Grading Scheme

  • The minimum passing mark for the course is 60%, where the final test is worth 100% of the course mark.
  • You have 3 attempts to take the test.