Spark, Hadoop, and Snowflake for Data Engineering
About this Course
Gain the skills for building efficient and scalable data pipelines. Explore essential data engineering platforms (Hadoop, Spark, and Snowflake) and learn how to optimize and manage them. Delve into Databricks, a powerful platform for executing data analytics and machine learning tasks, while honing your Python data science skills with PySpark. Finally, discover the key concepts of MLflow, an open-source platform for managing the end-to-end machine learning lifecycle, and learn how to integrate it with Databricks. This course is designed for learners who want to pursue or advance a career in data science or data engineering, and for software developers or engineers who want to grow their data management skill set. Beyond the technologies themselves, you will learn methodologies that sharpen your project management and workflow skills for data engineering, including Kaizen, DevOps, and DataOps best practices. With quizzes to test your knowledge throughout, this comprehensive course will help guide your learning journey to become a proficient data engineer, ready to tackle the challenges of today's data-driven world.
Created by: Duke University

Related Online Courses
This is the third course of a four-course series for cloud architects and engineers with existing Azure knowledge. It compares Google Cloud and Azure solutions and guides professionals on their...
This comprehensive Multichannel Marketing Specialization equips you with the essential skills to excel in various facets of digital marketing, from social media and email marketing to mobile app...
While prescription opioids serve an invaluable role for the treatment of cancer pain and pain at the end of life, their overuse for acute and chronic non-cancer pain as well as the increasing...
This is a self-paced lab that takes place in the Google Cloud console. Use BigQuery to explore the NCAA dataset of basketball games, teams, and players. The data covers plays from 2009 and scores...
In this lab you will create a serverless web app with Firebase, which allows users to upload information and make appointments with the fictional Pet Theory clinic. Created by: Google Cloud