CCA 159 Data Analyst is one of the well-recognized Big Data certifications. This scenario-based certification exam demands in-depth knowledge of Hive and Sqoop, as well as basic knowledge of Impala.
This comprehensive course covers all aspects of the certification with real-world examples and data sets.
Overview of the Big Data ecosystem
HDFS Commands
Creating Tables in Hive
Loading/Inserting data into Hive tables
Overview of functions in Hive
Writing Basic Queries in Hive
Joining Data Sets and Set Operations in Hive
Windowing or Analytics Functions in Hive
Importing data from MySQL to HDFS
Performing Hive Import
Exporting Data from HDFS/Hive to MySQL
Submitting Sqoop Jobs and Incremental Imports
and more
Here are the objectives for the certification.
Provide Structure to the Data
Use Data Definition Language (DDL) statements to create or alter structures in the metastore for use by Hive and Impala.
Create tables using a variety of data types, delimiters, and file formats
Create new tables using existing tables to define the schema
Improve query performance by creating partitioned tables in the metastore
Alter tables to modify the existing schema
Create views in order to simplify queries
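To give a flavor of these DDL objectives, here is a minimal sketch. It is not HiveQL running on a cluster: it uses Python's built-in sqlite3 module as a local stand-in, and the table and view names (orders, orders_copy, big_orders) are hypothetical. Hive has its own syntax for some of these steps, such as CREATE TABLE ... LIKE and PARTITIONED BY, which SQLite does not support, so partitioning is not shown here.

```python
import sqlite3


def build_schema():
    """Create a table, copy its schema, alter the copy, and define a view.

    Assumption: SQLite stands in for the Hive metastore; names are made up.
    """
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        -- Create a table with a few data types
        CREATE TABLE orders (order_id INTEGER, customer TEXT, amount REAL);

        -- Create a new, empty table that reuses the existing table's columns
        CREATE TABLE orders_copy AS SELECT * FROM orders WHERE 0;

        -- Alter the new table to modify its schema
        ALTER TABLE orders_copy ADD COLUMN order_status TEXT;

        -- Create a view to simplify a frequently used query
        CREATE VIEW big_orders AS
            SELECT order_id, customer, amount FROM orders WHERE amount >= 100;

        INSERT INTO orders VALUES (1, 'alice', 100.0), (2, 'bob', 25.0);
    """)
    # PRAGMA table_info returns one row per column; index 1 is the column name
    columns = [row[1] for row in conn.execute("PRAGMA table_info(orders_copy)")]
    big = conn.execute("SELECT order_id FROM big_orders").fetchall()
    conn.close()
    return columns, big
```

Calling build_schema() returns the column names of the copied table and the rows visible through the view, which is a quick way to check that each DDL step took effect.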
Data Analysis
Use Query Language (QL) statements in Hive and Impala to analyze data on the cluster.
Prepare reports using SELECT commands including unions and subqueries
Calculate aggregate statistics, such as sums and averages, during a query
Create queries against multiple data sources by using join commands
Transform the output format of queries by using built-in functions
Perform queries across a group of rows using windowing functions
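As a rough illustration of the aggregate and windowing objectives above, the sketch below runs a GROUP BY query and a RANK() OVER query. It assumes Python's built-in sqlite3 module (SQLite 3.25 or later, for window function support) as a local stand-in for Hive or Impala, and the orders table with its sample rows is hypothetical. The same SELECT statements are written in standard SQL that Hive also accepts, but this is only a sketch, not the exam environment.

```python
import sqlite3


def run_report():
    """Run an aggregate query and a windowing query on sample data.

    Assumption: SQLite stands in for Hive/Impala; the data is made up.
    """
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE orders (order_id INTEGER, customer TEXT, amount REAL);
        INSERT INTO orders VALUES
            (1, 'alice', 100.0),
            (2, 'alice', 50.0),
            (3, 'bob', 75.0),
            (4, 'bob', 25.0),
            (5, 'bob', 200.0);
    """)
    # Aggregate statistics: one output row per customer
    totals = conn.execute("""
        SELECT customer, SUM(amount), AVG(amount)
        FROM orders
        GROUP BY customer
        ORDER BY customer
    """).fetchall()
    # Windowing function: rank each order within its customer, keeping every row
    ranked = conn.execute("""
        SELECT customer, order_id, amount,
               RANK() OVER (PARTITION BY customer ORDER BY amount DESC) AS rnk
        FROM orders
        ORDER BY customer, rnk
    """).fetchall()
    conn.close()
    return totals, ranked
```

The aggregate query collapses each customer's orders into a single row, while the windowing query keeps all five rows and adds a per-customer rank, which is the key difference to keep in mind for the exam scenarios.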
Exercises are provided to help you prepare before taking the certification exam. The intention of the course is to boost your confidence going into the exam.
All the demos are given on our state-of-the-art Big Data cluster. If you do not have a multi-node cluster, you can sign up for our labs and practice on ours.