Classbaze

Apache Spark 3 with Scala: Hands On with Big Data!

New! Updated for Spark 3.0! “Big data” analysis is a hot and highly valuable skill – and this course will teach you the hottest technology in big data: Apache...
Rating: 4.5/5
(8 reviews)
413 students
Created by Frank Kane

Classbaze Grade®: 9.5
Freshness: 9.8
Popularity: 8.6
Material: 9.6

Platform: Skillshare
Video: 8h 50m
Language: English
Next start: On Demand

Classbaze Rating

Classbaze Grade®

9.5 / 10

Classbaze Grade® helps students find the best classes. We aggregate 18 factors, including freshness, student feedback and content diversity.

Freshness

9.8 / 10
This course was last updated on 8/2021.

Course content can become outdated quite quickly. After analysing 71,530 courses, we found that the highest rated courses are updated every year. If a course has not been updated for more than 2 years, you should carefully evaluate the course before enrolling.

Popularity

8.6 / 10
We analyzed factors such as the rating (4.5/5) and the ratio between the number of reviews and the number of students, which is a great signal of student commitment.

New courses are hard to evaluate because there are no or just a few student ratings, but Student Feedback Score helps you find great courses even with fewer reviews.

Material

9.6 / 10
Video Score: 8.9 / 10
The course includes 8h 50m of video content. Courses with more video usually have a higher average rating. We have found that the sweet spot is 16 hours of video, which is long enough to teach a topic comprehensively, but not overwhelming. Courses with over 16 hours of video get the maximum score.
The average video length across 113 Apache Spark courses on Skillshare is 6 hours 47 minutes.
Detail Score: 10.0 / 10

Top online courses contain a detailed description of the course, what you will learn, and a detailed description of the instructor.

Extra Content Score: 10.0 / 10

Tests, exercises, articles and other resources help students better understand the topic and deepen their knowledge of it.

This course contains:

0 articles.
0 resources.
0 exercises.
0 tests or quizzes.

About the course

New! Updated for Spark 3.0!

“Big data” analysis is a hot and highly valuable skill – and this course will teach you the hottest technology in big data: Apache Spark. Employers including Amazon, eBay, NASA JPL, and Yahoo all use Spark to quickly extract meaning from massive data sets across a fault-tolerant Hadoop cluster. You’ll learn those same techniques, using your own Windows system right at home. It’s easier than you might think, and you’ll be learning from an ex-engineer and senior manager from Amazon and IMDb.

Spark works best when using the Scala programming language, and this course includes a crash course in Scala to get you up to speed quickly. For those more familiar with Python, however, a Python version of this class is also available: “Taming Big Data with Apache Spark and Python – Hands On”.

Learn and master the art of framing data analysis problems as Spark problems through over 20 hands-on examples, and then scale them up to run on cloud computing services in this course.

  • Learn the concepts of Spark’s Resilient Distributed Datasets (RDDs)

  • Get a crash course in the Scala programming language

  • Develop and run Spark jobs quickly using Scala

  • Translate complex analysis problems into iterative or multi-stage Spark scripts

  • Scale up to larger data sets using Amazon’s Elastic MapReduce service

  • Understand how Hadoop YARN distributes Spark across computing clusters

  • Practice using other Spark technologies, like Spark SQL, DataFrames, Datasets, Spark Streaming, and GraphX

By the end of this course, you’ll be running code that analyzes gigabytes worth of information – in the cloud – in a matter of minutes. 

We’ll have some fun along the way. You’ll get warmed up with some simple examples of using Spark to analyze movie ratings data and text in a book. Once you’ve got the basics under your belt, we’ll move to some more complex and interesting tasks. We’ll use a million movie ratings to find movies that are similar to each other, and you might even discover some new movies you might like in the process! We’ll analyze a social graph of superheroes, and learn who the most “popular” superhero is – and develop a system to find “degrees of separation” between superheroes. Are all Marvel superheroes within a few degrees of being connected to SpiderMan? You’ll find the answer.

This course is very hands-on; you’ll spend most of your time following along with the instructor as we write, analyze, and run real code together – both on your own system, and in the cloud using Amazon’s Elastic MapReduce service. 7.5 hours of video content is included, with over 20 real examples of increasing complexity you can build, run and study yourself. Move through them at your own pace, on your own schedule. The course wraps up with an overview of other Spark-based technologies, including Spark SQL, Spark Streaming, and GraphX.

Enroll now, and enjoy the course!

“I studied Spark for the first time using Frank’s course ‘Apache Spark 2 with Scala – Hands On with Big Data!’. It was a great starting point for me, gaining knowledge in Scala and, most importantly, practical examples of Spark applications. It gave me an understanding of all the relevant Spark core concepts: RDDs, DataFrames & Datasets, Spark Streaming, AWS EMR. Within a few months of completion, I used the knowledge gained from the course to propose in my current company to work primarily on Spark applications. Since then I have continued to work with Spark. I would highly recommend any of Frank’s courses as he simplifies concepts well and his teaching manner is easy to follow and continue with!” – Joey Faherty

What can you learn from this course?

This class is full of many interesting hands-on activities, involving the analysis of movie ratings and connections between superheroes! But here’s one more challenge you can try after completing the course:

Write a Spark script that analyzes the one-million-rating dataset from MovieLens we used in the course. Let’s figure out what the worst movie ever made was!

But, we don’t want a movie that only has one rating, which happens to be one star, to be the “winner.” Start by producing a list of the movies sorted by average rating, which isn’t hard – but then sort that list by the number of ratings, so that movies that have a bad rating and also a large number of ratings are the ones that show up first.

You’ll probably still need to scroll past a lot of spurious 1-star results, however – so next, implement a filter that removes any movies that have fewer than, say, 10 ratings. That should filter out obscure films that we just don’t have enough data for. 10 is an arbitrary cutoff; you may find yourself playing with that number.
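The ranking and filtering steps described above can be sketched in plain Python before porting them to Spark. This is only an illustration of the logic, not the course's solution: the function name `worst_movies`, the `(movie_id, rating)` input shape, and the tie-breaking rule (lowest average first, then most ratings) are assumptions made for this sketch.

```python
from collections import defaultdict


def worst_movies(ratings, min_ratings=10):
    """Rank movies from lowest to highest average rating.

    ratings: iterable of (movie_id, rating) pairs (assumed input shape).
    min_ratings: arbitrary cutoff; movies with fewer ratings are dropped.
    """
    totals = defaultdict(lambda: [0.0, 0])  # movie_id -> [rating sum, rating count]
    for movie_id, rating in ratings:
        totals[movie_id][0] += rating
        totals[movie_id][1] += 1

    # Keep only movies with enough ratings to be meaningful.
    stats = [
        (movie_id, total / count, count)
        for movie_id, (total, count) in totals.items()
        if count >= min_ratings
    ]
    # Lowest average first; among equal averages, more ratings first.
    return sorted(stats, key=lambda s: (s[1], -s[2]))


# Tiny made-up sample: movie 2 has a single 1-star rating and is filtered out.
sample = [(1, 1.0)] * 12 + [(2, 1.0)] + [(3, 4.0)] * 10 + [(4, 2.0)] * 15
print(worst_movies(sample)[:2])
```

In a Spark job, the same three steps would typically be expressed as a by-key aggregation, a filter on the count, and a sort; changing `min_ratings` is the "playing with that number" the challenge mentions.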

You’ll also face the challenge of the output being split up across the various cores that are processing this data. You can try just using “local” instead of “local[*]” to get around that, but it would be even better to devise a way to merge the results together – either with a script, or by keeping track of a global “winner” with a broadcast variable.
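One way to approach the "merge the results with a script" option is sketched below in plain Python. It assumes each partition's output has already been parsed into a list of `(movie_id, average_rating, rating_count)` tuples, that each list is sorted worst-first, and that every movie appears in only one partition; the function name `merge_partitions` and the tuple layout are hypothetical.

```python
import heapq


def merge_partitions(partitions):
    """Merge per-partition result lists into one globally sorted list.

    partitions: list of lists of (movie_id, avg_rating, count) tuples,
    each already sorted by (avg_rating, -count). Assumed input shape.
    """
    def sort_key(stat):
        return (stat[1], -stat[2])

    # heapq.merge lazily interleaves already-sorted inputs.
    return list(heapq.merge(*partitions, key=sort_key))


# Made-up output from two partitions.
part_a = [(10, 1.2, 40), (11, 3.5, 200)]
part_b = [(20, 1.1, 15), (21, 4.0, 90)]
merged = merge_partitions([part_a, part_b])
print(merged[0])  # the overall worst-rated movie in this sample
```

If a movie's ratings could be spread over several partitions, the sums and counts would need to be combined per movie before computing averages, rather than merged as finished rows.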

What looks to be the worst movie ever?

What do you need to start the course?

Basic knowledge of Apache Spark is required to start this course, as this is an intermediate-level course.

Who is this course made for?

This course was made for intermediate-level students.

Are there coupons or discounts for Apache Spark 3 with Scala: Hands On with Big Data!? What is the current price?

You can enrol in this course with a Skillshare subscription that costs $8/month, but you start with a FREE 7-day trial. You can also enrol in thousands of courses on a variety of topics with your subscription, including several Apache Spark courses.
The average price of the 113 Apache Spark courses on Skillshare is $17.1, so this course is 100% less expensive than the average.

Will I be refunded if I'm not satisfied with the Apache Spark 3 with Scala: Hands On with Big Data! course?

There is no money-back guarantee with Skillshare, but you can start with a free one-week trial to learn without risk. With the subscription, you can download classes to your tablet or phone using the Skillshare app.

Is there any financial aid for this course?

At the moment we couldn't find any available scholarships for Apache Spark 3 with Scala: Hands On with Big Data!, but you can access more than 30 thousand classes for $8/month on Skillshare, including this one!

Who will teach this course? Can I trust Frank Kane?

Frank Kane has created 6 courses, which have received 82 generally positive reviews. He has taught 5,586 students and holds a 4.5 average rating across those 82 reviews. Based on the information available, we think that Frank Kane is an instructor you can trust.
Founder of Sundog Education, ex-Amazon
Machine Learning & Big Data, ex-Amazon