*** THIS COURSE IS NOT FOR BEGINNERS ***
If you are a Big Data enthusiast, then you must know about Hadoop. In this course, we will cover every corner of Hadoop 3.0.
What is Hadoop?
Hadoop is an open-source project of the Apache Software Foundation: a Java-based framework for storing and processing large datasets in a distributed environment on commodity hardware.
In this course you will learn:
Introduction to Big Data
Introduction to Hadoop
Introduction to Apache Hadoop 1.x - Part 1
Why Do We Need Apache Hadoop 3.0?
The Motivation Behind Hadoop 3.0
Features of Hadoop 3.0
Other Improvements on Hadoop 3.0
Prerequisites for the Lab
Setting up a Virtual Machine
Linux Fundamentals - Part 1
Linux Users and File Permissions
Package Installation for Hadoop 3.x
Networking and SSH connection
Setting up the Environment for Hadoop 3.x
Inside the Hadoop 3.x Directory Structure
EC Architecture Extensions
Setting up a Hadoop 3.x Cluster
Cloning Machines and Changing IP
Formatting the Cluster and Starting Services
Start and Stop Cluster
HDFS Commands
Erasure Coding Commands
Running a YARN application
Cloning a machine for Commissioning
Commissioning a node
Decommissioning a node
Installing Hive on Hadoop
Working with Hive
Types of Hadoop Schedulers
Typical Hadoop Production Environment
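As a taste of the HDFS and erasure coding topics above, here is a minimal sketch of the kind of commands covered in the course. The paths and file names are illustrative, and the commands assume a running Hadoop 3.x cluster:

```shell
# Create a directory in HDFS and copy a local file into it
hdfs dfs -mkdir -p /user/demo            # /user/demo is an example path
hdfs dfs -put data.txt /user/demo/       # data.txt is a placeholder local file
hdfs dfs -ls /user/demo

# Erasure coding (new in Hadoop 3.x): list the available policies,
# then apply one to a directory instead of the default 3x replication
hdfs ec -listPolicies
hdfs ec -setPolicy -path /user/demo -policy RS-6-3-1024k
hdfs ec -getPolicy -path /user/demo
```

With the RS-6-3-1024k policy, six data blocks are stored with three parity blocks, cutting storage overhead roughly in half compared to 3x replication while still tolerating failures.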