Are you familiar with the Hadoop ecosystem? If so, you already know how much trouble it is to configure and maintain a complete setup locally. HDInsight is designed to take that burden away.
Microsoft HDInsight is a managed cloud service built on open-source frameworks, and it is easy to use as well as cost-effective. It comes with Apache Hadoop, Apache Spark, Apache Kafka, Apache HBase, Apache Hive, Apache Storm, machine learning tooling, and more.
Microsoft HDInsight supports both the Hadoop and Spark ecosystems with recent versions of the tools. It integrates easily with other Azure services such as Azure Blob storage, Azure Data Lake Analytics, and Azure Cosmos DB.
It has a dedicated console where you can monitor the jobs running on HDInsight across your clusters. With HDInsight you can use your preferred productivity tools, including Visual Studio, Eclipse, IntelliJ, Jupyter, and Zeppelin, and choose among programming languages such as Scala, Python, R, JavaScript, and .NET.
In this learning path, you'll learn to use HDInsight with the Hadoop and Spark ecosystems.
HDFS stands for Hadoop Distributed File System. It is the backbone of the Hadoop ecosystem and makes it possible to store large data sets of different types.
YARN is the brain of your Hadoop ecosystem. It manages all your processing activities by allocating resources and scheduling tasks.
MAPREDUCE is the core processing component of the Hadoop ecosystem.
APACHE PIG is used to analyze large data sets by representing them as data flows.
APACHE HIVE is a data warehousing component that reads, writes, and manages large data sets in a distributed environment through a SQL-like interface.
APACHE MAHOUT is used to apply machine learning algorithms.
APACHE SPARK is a framework for real-time data analytics in a distributed computing environment, written in Scala. By exploiting in-memory computation and other optimizations, it can be up to 100x faster than Hadoop MapReduce for some large-scale data processing workloads.
APACHE HBASE is an open-source NoSQL database, written in Java, that supports all types of data.
APACHE ZOOKEEPER coordinates Hadoop services in a distributed environment.
APACHE OOZIE works as a clock and alarm service inside the Hadoop ecosystem, scheduling jobs.
APACHE FLUME ingests unstructured or semi-structured data into HDFS.
APACHE SQOOP ingests structured data into HDFS.
APACHE AMBARI helps you provision, manage, and monitor clusters.
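To make the MapReduce entry in the list above more concrete, here is a minimal sketch of the map, shuffle, and reduce steps as a word count in plain Python. It runs on a single machine and does not use Hadoop or HDInsight at all. The function names (map_phase, shuffle_phase, reduce_phase, word_count) are hypothetical and chosen only for illustration. On a real cluster, the framework would distribute these steps across many nodes.

```python
from collections import defaultdict


def map_phase(line):
    """Map step: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle_phase(pairs):
    """Shuffle step: group all emitted values by their key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce step: collapse the values for one key into a single total."""
    return key, sum(values)


def word_count(lines):
    """Run the three steps in sequence on an in-memory list of lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    grouped = shuffle_phase(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    # Illustrative input only, not real cluster data.
    sample = ["Hadoop stores data", "Spark processes data"]
    print(word_count(sample))
```

The map and reduce functions only ever see one line or one key at a time, and that independence is what would allow a framework to run them in parallel on separate nodes.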
Enroll now and learn to work in a distributed environment with Microsoft HDInsight.