Data Cleaning in Snowflake: Techniques to Clean Messy Data
About this Course
In 2006, the British mathematician Clive Humby coined the phrase "Data is the new oil". The analogy has held up: data now powers entire industries, but if left unrefined it is effectively worthless. This 2.5-hour guided project is designed for business analysts and data engineers eager to learn how to clean messy data in the Snowflake Data Platform.
By the end of the project, you will be able to:
- Identify common data quality issues, then use SQL string functions to remove unwanted characters and split rows into multiple columns.
- Extract dates from text fields, then use SQL date functions for comparisons and calculations.
- Identify and correct missing and duplicated data, then answer business questions using SQL statements.
To achieve these objectives, we will work on a real example from the field. You will play the role of a data analyst in the marketing department who has been tasked with answering a business question, but whose customer data presents several data quality challenges.
Note: To be successful in this project, you need beginner-level Snowflake knowledge: creating a trial account, databases, tables, and virtual warehouses. If you are not familiar with Snowflake and want to learn the basics, start with my previous Guided Project, Snowflake for Beginners: Make your First Snowsight Dashboard, which covers the basics of Snowflake and shows you how to create your trial account.
Created by: Coursera Project Network
Related Online Courses
If your web hosting requirements aren't directly supported by the Azure Web app platform, you can leverage virtual machines to customize and control every aspect of a web server. In this course,...
The purpose of this course is to provide you with an understanding of central bank policies and how such policies affect financial markets and the economy. The main aim of this course is to provide...
In 2020 the world will generate 50 times the amount of data as in 2011, and 75 times the number of information sources (IDC, 2011). Being able to use this data provides huge opportunities and to...
Welcome to the Database, Big Data, and DevOps Services in Google Cloud Platform (GCP) course! This course is designed to equip learners with comprehensive knowledge and practical skills in...
The course begins with a discussion about data: how to improve data quality and perform exploratory data analysis. We describe Vertex AI AutoML and how to build, train, and deploy an ML model...