Classbaze

Disclosure: when you buy through links on our site, we may earn an affiliate commission.

Data Engineering using Kafka and Spark Structured Streaming

A comprehensive Data Engineering course on building streaming pipelines using Kafka and Spark Structured Streaming
4.4/5
(41 reviews)
850 students
Created by Durga Viswanatha Raju Gadiraju

Classbaze Grade®: 9.2
Freshness: 9.4
Popularity: 8.1
Material: 9.4

Platform: Udemy
Video: 9h 36m
Language: English
Next start: On Demand

Classbaze Rating

Classbaze Grade®

9.2 / 10

CourseMarks Score® helps students to find the best classes. We aggregate 18 factors, including freshness, student feedback and content diversity.

Freshness

9.4 / 10
This course was last updated on 10/2021.

Course content can become outdated quite quickly. After analysing 71,530 courses, we found that the highest rated courses are updated every year. If a course has not been updated for more than 2 years, you should carefully evaluate the course before enrolling.

Popularity

8.1 / 10
We analyzed factors such as the rating (4.4/5) and the ratio between the number of reviews and the number of students, which is a great signal of student commitment.

New courses are hard to evaluate because there are no or just a few student ratings, but Student Feedback Score helps you find great courses even with fewer reviews.

Material

9.4 / 10
Video Score: 9.0 / 10
The course includes 9h 36m of video content. Courses with more video usually have a higher average rating. We have found that the sweet spot is 16 hours of video, which is long enough to teach a topic comprehensively but not overwhelming. Courses with over 16 hours of video get the maximum score.
The average video length across 48 Apache Kafka courses on Udemy is 4 hours 26 minutes.
Detail Score: 9.8 / 10

The top online course contains a detailed description of the course, what you will learn and also a detailed description about the instructor.

Extra Content Score: 9.5 / 10

Tests, exercises, articles and other resources help students deepen their understanding of the topic.

This course contains:

2 articles.
0 resources.
0 exercises.
0 tests.

About the course

As part of this course, you will be learning to build streaming pipelines by integrating Kafka and Spark Structured Streaming. Let us go through the details about what is covered in the course.
• First of all, we need a proper environment to build streaming pipelines using Kafka and Spark Structured Streaming on top of Hadoop or any other distributed file system. As part of the course, you will start by setting up a self-support lab with all the key components, such as Hadoop, Hive, Spark, and Kafka, on a single-node Linux-based system.
• Once the environment is set up, you will go through the details of getting started with Kafka. As part of that process, you will create a Kafka topic, produce messages into the topic, and consume messages from the topic.
• You will also learn how to use Kafka Connect to ingest data from web server logs into a Kafka topic, as well as to ingest data from a Kafka topic into HDFS as a sink.
• Once you understand Kafka from the perspective of Data Ingestion, you will get an overview of some of the key concepts related to Spark Structured Streaming.
• After learning Kafka and Spark Structured Streaming separately, you will build a streaming pipeline that consumes data from a Kafka topic using Spark Structured Streaming, then processes it and writes it to different targets.
• You will also learn how to take care of incremental data processing using Spark Structured Streaming.
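The Kafka-to-Spark step described above can be sketched in PySpark roughly as follows. This is a minimal sketch, not the course's own code: the broker address `localhost:9092` and the topic name `retail_logs` are assumptions, and calling `read_kafka_stream` requires an existing Spark session that has the Kafka connector package (`spark-sql-kafka-0-10`) available.

```python
def kafka_source_options(bootstrap_servers, topic, starting_offsets="latest"):
    """Build the options Spark's Kafka source reads (the values are assumptions)."""
    return {
        "kafka.bootstrap.servers": bootstrap_servers,  # e.g. "localhost:9092"
        "subscribe": topic,                            # topic to read from
        "startingOffsets": starting_offsets,           # "latest" or "earliest"
    }


def read_kafka_stream(spark, options):
    """Return a streaming DataFrame over a Kafka topic.

    `spark` is an existing pyspark.sql.SparkSession; it is passed in so that
    this sketch does not need to import pyspark itself.
    """
    raw = spark.readStream.format("kafka").options(**options).load()
    # Kafka delivers key and value as bytes; cast them to strings for processing.
    return raw.selectExpr(
        "CAST(key AS STRING) AS key",
        "CAST(value AS STRING) AS value",
    )


# Hypothetical broker and topic, reading the topic from the beginning.
options = kafka_source_options("localhost:9092", "retail_logs", "earliest")
```

Keeping the options in a plain dictionary makes it easy to switch brokers or topics between a lab setup and another environment without touching the reading logic.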
Course Outline
Here is a brief outline of the course. You can choose either Cloud9 or GCP to provision a server to set up the environment.
• Setting up Environment using AWS Cloud9 or GCP
• Setup Single Node Hadoop Cluster
• Setup Hive and Spark on top of Single Node Hadoop Cluster
• Setup Single Node Kafka Cluster on top of Single Node Hadoop Cluster
• Getting Started with Kafka
• Data Ingestion using Kafka Connect – Web server log files as a source to Kafka Topic
• Data Ingestion using Kafka Connect – Kafka Topic to HDFS as a sink
• Overview of Spark Structured Streaming
• Kafka and Spark Structured Streaming Integration
• Incremental Loads using Spark Structured Streaming
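The two Kafka Connect steps in the outline are driven by connector configuration files. The sketch below renders two such configurations as `.properties` text. The connector classes and property keys are those of Kafka's bundled FileStreamSource connector and Confluent's HDFS 3 Sink connector; the connector names, log path, topic name, HDFS URL, and flush size are assumptions chosen for illustration, not values taken from the course.

```python
def to_properties(config):
    """Render a dict as Java-style .properties text, one key=value per line."""
    return "\n".join(f"{key}={value}" for key, value in config.items())


# Source side: stream lines of a web server log file into a Kafka topic.
file_source = {
    "name": "web-log-source",                      # assumed connector name
    "connector.class": "org.apache.kafka.connect.file.FileStreamSourceConnector",
    "tasks.max": 1,
    "file": "/var/log/webserver/access.log",       # assumed log path
    "topic": "retail_logs",                        # assumed topic name
}

# Sink side: write records from the same topic into HDFS.
hdfs_sink = {
    "name": "hdfs-sink",                           # assumed connector name
    "connector.class": "io.confluent.connect.hdfs3.Hdfs3SinkConnector",
    "tasks.max": 1,
    "topics": "retail_logs",
    "hdfs.url": "hdfs://localhost:9000",           # assumed NameNode address
    "flush.size": 1000,                            # records per file, assumed
}

print(to_properties(file_source))
print()
print(to_properties(hdfs_sink))
```

Each rendered block would typically be saved to its own file and passed to a Kafka Connect worker, for example in standalone mode together with a worker configuration file.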
Udemy-based support
In case you run into technical challenges while taking the course, feel free to raise your concerns using Udemy Messenger. We will make sure that the issue is resolved within 48 hours.

What can you learn from this course?

✓ Setting up self support lab with Hadoop (HDFS and YARN), Hive, Spark, and Kafka
✓ Overview of Kafka to build streaming pipelines
✓ Data Ingestion to Kafka topics using Kafka Connect using File Source
✓ Data Ingestion to HDFS using Kafka Connect using HDFS 3 Connector Plugin
✓ Overview of Spark Structured Streaming to process data as part of Streaming Pipelines
✓ Incremental Data Processing using Spark Structured Streaming using File Source and File Target
✓ Integration of Kafka and Spark Structured Streaming – Reading Data from Kafka Topics
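The idea behind incremental processing from a file source to a file target can be illustrated without Spark. The toy below is plain Python, not Spark Structured Streaming: it records which files earlier runs have handled in a small checkpoint file and processes only the new ones. The function name, the JSON checkpoint format, and the upper-casing "transformation" are all hypothetical.

```python
import json
import tempfile
from pathlib import Path


def process_new_files(source_dir, target_dir, checkpoint_file):
    """Process only files not seen in earlier runs; return the names handled."""
    source_dir, target_dir = Path(source_dir), Path(target_dir)
    checkpoint_file = Path(checkpoint_file)
    # Load the names handled by previous runs, if a checkpoint exists.
    seen = set()
    if checkpoint_file.exists():
        seen = set(json.loads(checkpoint_file.read_text()))
    new_files = sorted(
        p for p in source_dir.iterdir() if p.is_file() and p.name not in seen
    )
    target_dir.mkdir(parents=True, exist_ok=True)
    for path in new_files:
        # Stand-in transformation: upper-case the file's text.
        (target_dir / path.name).write_text(path.read_text().upper())
        seen.add(path.name)
    # Persist progress so the next run skips these files.
    checkpoint_file.write_text(json.dumps(sorted(seen)))
    return [p.name for p in new_files]


with tempfile.TemporaryDirectory() as tmp:
    src, tgt = Path(tmp, "src"), Path(tmp, "tgt")
    ckpt = Path(tmp, "checkpoint.json")
    src.mkdir()
    (src / "a.txt").write_text("first batch")
    first_run = process_new_files(src, tgt, ckpt)   # picks up a.txt
    (src / "b.txt").write_text("second batch")
    second_run = process_new_files(src, tgt, ckpt)  # picks up only b.txt
    third_run = process_new_files(src, tgt, ckpt)   # nothing new to process
```

In Spark Structured Streaming, the equivalent bookkeeping is handled by the engine through the `checkpointLocation` option set on the streaming write, so no hand-written tracking like this is needed.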

What do you need to start the course?

• Laptop with decent configuration
• Decent internet speed to watch the lessons
• Self Support lab (instructions will be provided as part of the course) or ITVersity labs
• Knowledge about Functional Programming (preferably Python or Scala)
• Knowledge or experience using Spark

Who is this course made for?

• Experienced ETL Developers who want to learn Kafka and Spark to build streaming pipelines
• Experienced PL/SQL Developers who want to learn Kafka and Spark to build streaming pipelines
• Beginner or Experienced Data Engineers who want to learn Kafka and Spark to build streaming pipelines

Are there coupons or discounts for Data Engineering using Kafka and Spark Structured Streaming? What is the current price?

The course currently costs $12.99, a 48% discount on the original price of $24.99, so you save $12 if you enroll in the course now.
The average price of the 48 Apache Kafka courses on Udemy is $15.90, so this course is 18% cheaper than the average Apache Kafka course.
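The percentages above can be checked with a few lines of arithmetic. This is a minimal sketch that uses only the prices quoted in this section; the variable names are illustrative.

```python
current_price = 12.99
original_price = 24.99
average_price = 15.90  # average across the 48 Apache Kafka courses

# Amount saved relative to the original price, rounded to cents.
saving = round(original_price - current_price, 2)
# Discount as a whole-number percentage of the original price.
discount_pct = round(100 * saving / original_price)
# How far the current price sits below the category average.
below_average_pct = round(100 * (average_price - current_price) / average_price)

print(saving, discount_pct, below_average_pct)
```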

Will I be refunded if I'm not satisfied with the Data Engineering using Kafka and Spark Structured Streaming course?

YES, Data Engineering using Kafka and Spark Structured Streaming has a 30-day money back guarantee. The 30-day refund policy is designed to allow students to study without risk.

Is there any financial aid for this course?

Currently we could not find a scholarship for the Data Engineering using Kafka and Spark Structured Streaming course, but there is a $12 discount from the original price ($24.99). So the current price is just $12.99.

Who will teach this course? Can I trust Durga Viswanatha Raju Gadiraju?

Durga Viswanatha Raju Gadiraju has created 16 courses, which have received 9,322 reviews that are generally positive. Durga Viswanatha Raju Gadiraju has taught 224,446 students and holds a 4.4 average rating across those 9,322 reviews. Based on the information available, we think that Durga Viswanatha Raju Gadiraju is an instructor you can trust.
Technology Adviser and Evangelist
13+ years of experience in executing complex projects using a vast array of technologies, including Big Data and Cloud.

ITVersity, Inc. is a US-based organization that provides quality training for IT professionals, and we have a track record of training hundreds of thousands of professionals globally.

Helping people build IT careers by providing the tools they need to upskill and cross-skill, such as high-quality material, labs, and live support, is paramount for our organization.

At this time our training offerings are focused on the following areas:

* Application Development using Python and SQL

* Big Data and Business Intelligence

* Cloud

* Data Warehousing, Databases
