Building ETL and Data Pipelines with Bash, Airflow and Kafka
About this Course
Well-designed, automated data pipelines and ETL processes are the foundation of a successful Business Intelligence platform. Defining your data workflows, pipelines, and processes early in the platform design ensures that the right raw data is collected, transformed, and loaded into the desired storage layers, and that it is available for processing and analysis when required. This course is designed to give you the critical knowledge and skills that Data Engineers and Data Warehousing specialists need to create and manage ETL, ELT, and data pipeline processes. Upon completing this course, you will have a solid understanding of Extract, Transform, Load (ETL) and Extract, Load, Transform (ELT) processes; have practiced extracting data, transforming it, and loading the transformed data into a staging area; and have created an ETL data pipeline using Bash shell scripting, built a batch ETL workflow using Apache Airflow, and built a streaming data pipeline using Apache Kafka. You will gain hands-on experience through practice labs throughout the course and work on a real-world-inspired project, building data pipelines with several technologies, that you can add to your portfolio to demonstrate your ability to perform as a Data Engineer. This course requires prior experience working with datasets, SQL, relational databases, and Bash shell scripts.
Created by: IBM
Level: Introductory

Related Online Courses
This course provides an introduction to the Java programming language. It gives students a foundational overview and history of Java, and students will learn about the language’s basic syntax. At t...
IBM CICS is the trusted core of enterprise applications and transaction processing. You will experience writing, updating and running CICS applications as well as the new APIs, capabilities and...
With the advent of systems like AWS Lambda, the term serverless gained much popularity. However, many people are still unsure what it is for, and how it can help them build applications faster than...
Big data and AI: these words are everywhere today. That is because more and more things can now be handled computationally, and the use of big data and the AI technologies built on it have come into full swing. Computer science is the field that has developed the ways of using "computation", which lies at the center of this trend. In this course, you will learn the essence of computer science. Starting from very basic, introductory material, cutting-edge compu...
Port cities are dynamic environments. They face ever-changing challenges and demands from port activities under continually evolving economic and environmental circumstances. They also offer a rich...