3.9  10 reviews on Udemy

Learning Path: Data Science With Apache Spark 2

Get started with Spark for large-scale distributed data processing and data science
Course from Udemy
 161 students enrolled
 en
Get to know the fundamentals of Spark 2.0 and the Spark programming model using Scala and Python
Know how to use Spark SQL and DataFrames using Scala and Python
Get an introduction to Spark programming using R
Develop a complete Spark application
Obtain and clean data before processing it
Understand the Spark machine learning algorithm to build a simple pipeline
Work with interactive visualization packages in Spark
Apply data mining techniques on the available datasets
Build a recommendation engine

The real power and value proposition of Apache Spark is its speed and platform to execute data processing and data science tasks. Sounds interesting? Let’s see how easy it is!

Packt’s Video Learning Paths are a series of individual video products put together in a logical and stepwise manner such that each video builds on the skills learned in the video before it.

Spark is one of the most widely-used large-scale data processing engines and runs extremely fast. It is a framework that has tools that are equally useful for application developers as well as data scientists. Spark's unique use case is that it combines ETL, batch analytics, real-time stream analysis, machine learning, graph processing, and visualizations to allow data scientists to tackle the complexities that come with raw unstructured datasets.

This Learning Path starts with an introduction tour of Apache Spark 2. We will look at the basics of Spark, introduce SparkR, then look at the charting and plotting features of Python in conjunction with Spark data processing, and finally take a thorough look at Spark's data processing libraries. We then develop a real-world Spark application. Next, we will help you become comfortable and confident working with Spark for data science by exploring Spark’s data science libraries on a dataset of tweets.

The goal of this course to introduce you to Apache Spark 2 and teach you its data processing and data science libraries so that you are equipped with the skills required from modern data scientists.

This Learning Path is authored by some of the best in their fields.

Rajanarayanan Thottuvaikkatumana

Rajanarayanan Thottuvaikkatumana, or Raj, is a seasoned technologist with more than 23 years of software development experience at various multinational companies. His experience includes architecting, designing, and developing software applications. He has worked on various technologies including major databases, application development platforms, web technologies, and big data technologies. Currently he is building a next generation Hadoop YARN-based data processing platform and an application suite built with Spark using Scala.

Eric Charles

Eric Charles has 10 years’ experience in the field of Data Science and is the founder of Datalayer, a social network for Data Scientists. His typical day includes building efficient processing with advanced machine learning algorithms, easy SQL, streaming and graph analytics. He also focuses a lot on visualization and result sharing. He is passionate about open source and is an active Apache Member. He regularly gives talks to corporate clients and at open source events. 

Learning Path: Data Science With Apache Spark 2
$ 94.99
per course
Also check at

FAQs About "Learning Path: Data Science With Apache Spark 2"

About

Elektev is on a mission to organize educational content on the Internet and make it easily accessible. Elektev provides users with online course details, reviews and prices on courses aggregated from multiple online education providers.
DISCLOSURE: This page may contain affiliate links, meaning when you click the links and make a purchase, we receive a commission.

SOCIAL NETWORK