0 reviews on Udemy

Learn Apache Spark in Python

Processing a million-word text corpus using PySpark and SQL window functions
Course from Udemy
 10 students enrolled
In this course you'll learn the physical components of a Spark cluster and the Spark computing framework.
You will build your own local standalone cluster.
You will write Spark code.
You will learn how to run Spark jobs.
You will create Spark tables and query them using SQL.
You will learn a process for creating successful Spark applications.
You will profile a Spark application.
You will tune a Spark application.
You will learn 30+ Spark commands.
You will use Spark SQL window functions.

In this course you'll learn the physical components of a Spark cluster and the Spark computing framework. You’ll build your own local standalone install of PySpark. You’ll write Spark code. You’ll learn how to run Spark jobs in a variety of ways. You’ll create Spark tables and query them using SQL. You will learn details of Spark internals. You will learn a process for creating successful Spark applications.


What am I going to get from this course?

  • Build Spark applications.

  • Tune Spark applications.

  • Profile Spark applications.

  • Tackle 3 coding projects.

  • Learn 30+ Spark commands.

  • Master Spark SQL window functions.

  • Step through 900 lines of Spark code.

  • Apply pro tips and best practices tested in production.


Why should you take this course?


Because your time is valuable and you are data driven. Lost opportunity is a major contributor to future regret. Apply these core underlying principles to your own life and build your own dreams.

Technology moves fast. Keeping up using only free materials from scattered sources is penny wise and pound foolish. Using only your own judgment to sift through the abundance of materials before you have learned what is relevant, and what is not, is backwards. You will be better able to judge the relevance of new content to your mission once you have a good foundation. You will be able to more rapidly consume, and more importantly synthesize, new materials. Learning in a way that allows you to synthesize what you have learned is more effective than simply devouring all material indiscriminately.

Have you ever learned a new spoken language? Language instructors will tell you that it is better to learn one aspect of the language deeply and be able to apply it immediately. This gets your conversational skills up to speed most rapidly. Then your ability to use what you have learned, and with confidence, starts to have a compounding effect. Learning a programming language is analogous. There is a lot to learn and too little time to exhaustively cover the abundance of material that is available.

There are core principles that, when leveraged effectively, accelerate you through the ramp-up phase.

The best learning regimen provides new skills and insights, but also solidifies your present knowledge base. Leveraging your current level of knowledge to build on it by learning is akin to building your principal capital, namely, your knowledge base and skill set.

With the proper approach, learning has a compounding effect on knowledge. Once you have mastery of what you have learned in both breadth and depth, you are positioned to put your knowledge to work for your mission, and convert it into a currency you can spend to improve your life and the lives of those around you.

New learners are entering this field every day. Learners who make effective use of their time quickly surpass learners who do not.

Be mindful of the most important resource you wield: your attention. Spend it wisely. Learners who know the same things as all the others don't stand out from the crowd. They may do well enough in a rising tide that lifts all boats. The doers who have an edge will win a slot on the fastest ship in the finest fleet and do more than just ride the tide. They will accomplish what was previously unimagined. Investing in yourself is wise; being overly frugal about how you spend your precious time is foolish. Your attention is limited. Most importantly, it is not free. With every day that you put off your future, you miss out on the compounding effect that future growth has on present capability. Meanwhile the gap between you and other learners who understand these lessons widens.

A small edge today blossoms into a bigger one next week, and a gaping lead by next month.


Use your time and attention wisely.


Don't scrimp on your biggest investment:


Yourself.





Cut through the clutter

If you want to ramp up quickly on Apache Spark, then this course is for you.

There is an abundance of training material available for learning Spark.

Then why should you use this course?

Because your time is valuable. Wading through tens of thousands of resources to find those appropriate to your level, while filtering out the ones that are outdated, takes time and judgment. Time spent ramping up on the fundamentals has an opportunity cost that is far higher than the price of this course.

I've collected a comprehensive yet succinct lesson plan containing material gleaned from hundreds of sources, including pro tips from Spark contributors, insights from Spark consultants, many conference presentations, and first-hand experience applying this powerful tool to big data applications in production serving millions of users. If you want to cut through the clutter and ramp up quickly, then this course will help you accomplish that.



What you will learn in this course

We cover topics ranging from concepts to architecture to managing your development environment, as well as several use cases.

This course is not only for developers, but also for managers and technical leads who don’t necessarily write code but want to be able to perform code reviews or analyze the performance of a running application.



Prerequisites

All you will need is a standard development-grade laptop and an internet connection.

You should be comfortable at the Unix command line.

Although this course is taught using the Python programming language, it is also suitable for users who plan to primarily utilize R and SQL.



What you will do in this course

  • Step through over 900 lines of code in 9 original exercises utilizing over 30 Spark commands

  • Learn best practices that are tested in production applications

  • Utilize over 30 Spark commands, including:

    • read, select, count, join, where, groupBy, show, drop_duplicates, distinct, limit, subtract, withColumn, sort, split, explode, lower, length, alias, UDFs, explain, sql, cache, persist, unpersist, catalog, listTables, isCached, createOrReplaceTempView, monotonically_increasing_id, cacheTable, clearCache, and others.



Applications you can tackle with Spark

  • Discover statistically improbable phrases in text

  • Perform anomaly detection on log data

  • Apply topic modeling to text

  • Build a recommender

  • Do trend analysis



What you can do with Spark

  • Develop on your laptop and migrate to a cluster later without changing your code.

  • Spin up a cluster to use for a short period of time and spin it down when finished.

  • Run applications in a shared long-running cluster environment that autoscales up and down in size as workloads demand.

  • Develop applications intended to run on a single machine that can later be migrated to a cluster without modifying the code.

  • Work within a dynamically typed or a statically typed language, including Python, Scala, Java, R, and SQL.

  • Access data stores including AWS S3, HDFS, Hive, HBase, Cassandra, or any Hadoop data source, as well as Kafka and Redshift, among others.

  • Develop and deploy using the same language and framework.

  • Develop and test directly on a cluster.

  • Deploy applications programmatically.

  • Manage clusters programmatically.



What you'll be able to do upon completing this course

Tackle nontrivial applications, including:

  • Approximate K Nearest Neighbors

  • Alternating Least Squares

  • K-means clustering

  • Streaming

You'll also have a solid foundation for data engineering applications.


Given the flexibility of Spark, you are limited only by the data available to you and your imagination.



Learn Apache Spark in Python
$ 29.99
per course