Taming Big Data with Apache Spark and Python – Hands On! – Udemy

(10 customer reviews)

$16

Description

What you’ll learn

  • Use DataFrames and Structured Streaming in Spark 3
  • Frame big data analysis problems as Spark problems
  • Use Amazon’s Elastic MapReduce service to run your job on a cluster with Hadoop YARN
  • Install and run Apache Spark on a desktop computer or on a cluster
  • Use Spark’s Resilient Distributed Datasets to process and analyze large data sets across many CPUs
  • Implement iterative algorithms such as breadth-first-search using Spark
  • Use the MLlib machine learning library to answer common data mining questions
  • Understand how Spark SQL lets you work with structured data
  • Understand how Spark Streaming lets you process continuous streams of data in real time
  • Tune and troubleshoot large jobs running on a cluster
  • Share information between nodes on a Spark cluster using broadcast variables and accumulators
  • Understand how the GraphX library helps with network analysis problems

New! Updated for Spark 3, more hands-on exercises, and a stronger focus on DataFrames and Structured Streaming.

“Big data” analysis is a hot and highly valuable skill – and this course will teach you the hottest technology in big data: Apache Spark. Employers including Amazon, eBay, NASA JPL, and Yahoo all use Spark to quickly extract meaning from massive data sets across a fault-tolerant Hadoop cluster. You’ll learn those same techniques, using your own Windows system right at home. It’s easier than you might think.

Learn and master the art of framing data analysis problems as Spark problems through over 20 hands-on examples, and then scale them up to run on cloud computing services in this course. You’ll be learning from an ex-engineer and senior manager from Amazon and IMDb.

  • Learn the concepts of Spark’s DataFrames and Resilient Distributed Datasets

  • Develop and run Spark jobs quickly using Python

  • Translate complex analysis problems into iterative or multi-stage Spark scripts

  • Scale up to larger data sets using Amazon’s Elastic MapReduce service

  • Understand how Hadoop YARN distributes Spark across computing clusters

  • Learn about other Spark technologies, like Spark SQL, Spark Streaming, and GraphX

By the end of this course, you’ll be running code that analyzes gigabytes worth of information – in the cloud – in a matter of minutes. 

This course uses the familiar Python programming language; if you’d rather use Scala to get the best performance out of Spark, see my “Apache Spark with Scala – Hands On with Big Data” course instead.

We’ll have some fun along the way. You’ll get warmed up with some simple examples of using Spark to analyze movie ratings data and text in a book. Once you’ve got the basics under your belt, we’ll move to some more complex and interesting tasks. We’ll use a million movie ratings to find movies that are similar to each other, and you may even discover some new movies you like in the process! We’ll analyze a social graph of superheroes, and learn who the most “popular” superhero is – and develop a system to find “degrees of separation” between superheroes. Are all Marvel superheroes within a few degrees of being connected to The Incredible Hulk? You’ll find the answer.
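To give a rough feel for the “degrees of separation” idea, here is a minimal sketch in plain Python rather than Spark. It is not the course’s code: the function name, the toy graph, and every connection in it are made up for illustration. It only shows the breadth-first search logic that a distributed version would have to reproduce across a cluster.

```python
from collections import deque


def degrees_of_separation(graph, start, target):
    """Return the number of hops from start to target, or None if unconnected."""
    if start == target:
        return 0
    visited = {start}
    frontier = deque([(start, 0)])  # (node, distance from start)
    while frontier:
        node, distance = frontier.popleft()
        for neighbor in graph.get(node, ()):
            if neighbor == target:
                return distance + 1
            if neighbor not in visited:
                visited.add(neighbor)
                frontier.append((neighbor, distance + 1))
    return None


# Hypothetical toy graph: these connections are invented, not real data.
toy_graph = {
    "HULK": ["IRON MAN", "THOR"],
    "IRON MAN": ["HULK", "SPIDER-MAN"],
    "THOR": ["HULK"],
    "SPIDER-MAN": ["IRON MAN", "DAREDEVIL"],
    "DAREDEVIL": ["SPIDER-MAN"],
    "HOWARD THE DUCK": [],
}

print(degrees_of_separation(toy_graph, "HULK", "DAREDEVIL"))  # prints 3
```

On a single machine this fits in memory; the appeal of doing it in Spark is that the same level-by-level expansion can be spread across many nodes when the graph is too large for one.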

This course is very hands-on; you’ll spend most of your time following along with the instructor as we write, analyze, and run real code together – both on your own system, and in the cloud using Amazon’s Elastic MapReduce service. 7 hours of video content is included, with over 20 real examples of increasing complexity you can build, run and study yourself. Move through them at your own pace, on your own schedule. The course wraps up with an overview of other Spark-based technologies, including Spark SQL, Spark Streaming, and GraphX.

Wrangling big data with Apache Spark is an important skill in today’s technical world. Enroll now!

  • “I studied ‘Taming Big Data with Apache Spark and Python’ with Frank Kane, and it helped me build a great platform for Big Data as a Service for my company. I recommend the course!” – Cleuton Sampaio De Melo Jr.

Who this course is for:

  • People with some software development background who want to learn the hottest technology in big data analysis will want to check this out. This course focuses on Spark from a software development standpoint; we introduce some machine learning and data mining concepts along the way, but that’s not the focus. If you want to learn how to use Spark to carve up huge datasets and extract meaning from them, then this course is for you.
  • If you’ve never written a computer program or a script before, this course isn’t for you – yet. I suggest starting with a Python course first, if programming is new to you.
  • If your software development job involves, or will involve, processing large amounts of data, you need to know about Spark.
  • If you’re training for a new career in data science or big data, Spark is an important part of it.

Course content

  • Getting Started with Spark
  • Spark Basics and the RDD Interface
  • SparkSQL, DataFrames, and DataSets
  • Advanced Examples of Spark Programs
  • Running Spark on a Cluster
  • Machine Learning with Spark ML
  • Spark Streaming, Structured Streaming, and GraphX
  • You Made It! Where to Go from Here.

10 reviews for Taming Big Data with Apache Spark and Python – Hands On! – Udemy

  1. Donny Phan

    Super practical. Lessons are catered towards anyone looking to find work in this industry. It felt very comprehensive and gave me a broad understanding of the programming spectrum

  2. Madhav raj Verma

    Thanks for your great effort. I am fully satisfied with this course; the way you teach and your explanations are very clear. No one else could provide this content at this price.

  3. Sachin Gupta

    I really didn’t want to leave a low rating as Angela is a great teacher. The 1st half of this course was terrific. The 2nd half was terrible. Under the justification of “teaching students how to figure things out on their own”, pretty much all videos and all explanations were dropped. You were just told what to do, given links to documentation and told to figure it out on your own. I understand doing that to some degree, but to revert to that entirely for nearly half the content barely makes this a course. It’s just a list of things for you to learn, then you’re left on your own to learn them. The 2nd half was so bad, especially the data science component, that I didn’t bother finishing the course.

  4. Vincent Beaudet

    Amazing 40 days course.
    Angela is a great teacher.
    The other 60 days are all about web development, interacting with web pages, on your own with little to no explanation. I did not expect that at all. I wanted to learn more about software and scripting.
    This left me disappointed and confused, and I started to doubt myself. Not a fun experience after the amount of effort I’ve put into this course.

    The exercise format and explanations for the first 40 days were worth it, though.

  5. Ben K

    Not just an introduction to python, but really helps you learn fundamental aspects of python and coding in general. Some parts may require some knowledge on the subject (data science comes to mind) and there is quite some web development in the course. So, a few areas were not completely to my liking (I would have liked to see it done differently), but this course deserves the 5 stars in my opinion.

  6. Omid Alikhel

    I found the method a bit difficult: code is written and then changed to something different, without enough explanation of how something happened and where it came from, or a step-by-step explanation of why something is happening. I have no doubt about the instructor’s talent, but we are beginners!

  7. Devang Jain

    The course is not updated and most of the solution codes don’t work and there are no video solutions towards the end

  8. Szymon Kozak

    I think that the course tutor is really good in giving right information to learn at the right time. Thanks to this fact, my understanding of coding in python after 29 days of learning is above my expectations.

  9. Begoña Ruiz Diaz

    It has been the best choice I could have made.

  10. Vaibhav Sachdeva

    I want to thank Angela for making such an amazing course. It really helped me explore more things with python.
