2.9  4 reviews on Udemy

Real World Spark 2 - Interactive Python pyspark Core

Build a Vagrant Python pyspark cluster and Code/Monitor against Spark 2 Core. The modern cluster computation engine.
Course from Udemy
 142 students enrolled
 en
Simply run a single command on your desktop, go for a coffee, and come back with a running distributed environment for cluster deployment
Ability to automate the installation of software across multiple Virtual Machines
Code in Python against Spark. Transformation, Actions and Spark Monitoring

Note : This course is built on top of the "Real World Vagrant - Build an Apache Spark Development Env! - Toyin Akin" course. So if you do not have a Spark environment already installed (within a VM or directly installed), you can take the stated course above.

Spark’s python shell provides a simple way to learn the API, as well as a powerful tool to analyze data interactively. It is available in Python. Start it by running the following anywhere within a bash terminal within the built Virtual Machine 

pyspark

Spark’s primary abstraction is a distributed collection of items called a Resilient Distributed Dataset (RDD). RDDs can be created from collections, Hadoop InputFormats (such as HDFS files) or by transforming other RDDs

Spark Monitoring and Instrumentation

While creating RDDs, performing transformations and executing actions, you will be working heavily within the monitoring view of the Web UI.

Every SparkContext launches a web UI, by default on port 4040, that displays useful information about the application. This includes:

A list of scheduler stages and tasks A summary of RDD sizes and memory usage Environmental information. Information about the running executors


Why Apache Spark ...

Apache Spark run programs up to 100x faster than Hadoop MapReduce in memory, or 10x faster on disk. Apache Spark has an advanced DAG execution engine that supports cyclic data flow and in-memory computing. Apache Spark offers over 80 high-level operators that make it easy to build parallel apps. And you can use it interactively from the Scala, Python and R shells. Apache Spark can combine SQL, streaming, and complex analytics.

Apache Spark powers a stack of libraries including SQL and DataFrames, MLlib for machine learning, GraphX, and Spark Streaming. You can combine these libraries seamlessly in the same application.

Real World Spark 2 - Interactive Python pyspark Core
$ 89.99
per course
Also check at

FAQs About "Real World Spark 2 - Interactive Python pyspark Core"

About

Elektev is on a mission to organize educational content on the Internet and make it easily accessible. Elektev provides users with online course details, reviews and prices on courses aggregated from multiple online education providers.
DISCLOSURE: This page may contain affiliate links, meaning when you click the links and make a purchase, we receive a commission.

SOCIAL NETWORK