Data Cleansing 101 - A Simplified Guide Developed For Business
Data Cleansing can be defined as the process by which a Data Analyst, Data Scientist, Data Engineer or Data Architect reviews and cleans data so that it is readable and fit for use.
Data Cleansing can also be defined as part of Data Quality. It deals with errors in the data, often by removing outliers that do not genuinely belong to the dataset.
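As an illustration of the outlier handling mentioned above, here is a minimal Python sketch that flags values using the common interquartile range (IQR) rule. The order values, the function name split_outliers and the 1.5 multiplier are assumptions chosen for the example, not something this guide prescribes, and flagged values would normally be reviewed before being removed.

```python
import statistics

def split_outliers(values, k=1.5):
    """Split values into (kept, outliers) using the IQR rule.

    A value is treated as an outlier when it lies more than k * IQR
    below the first quartile or above the third quartile.
    """
    q1, _median, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    kept = [v for v in values if low <= v <= high]
    outliers = [v for v in values if v < low or v > high]
    return kept, outliers

# Hypothetical daily order counts; 95 looks like a data-entry error.
orders = [12, 14, 15, 15, 16, 18, 95]
kept, outliers = split_outliers(orders)
print("kept:", kept)
print("outliers:", outliers)
```

The rule is only a heuristic: an unusually large value can be perfectly genuine, so the sketch separates suspicious values instead of silently deleting them.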
Data cleansing essentially deals with four aspects:
Identification
Identifying the records that contain incorrect or missing values.
Correction
Substituting correct values into the identified records so that they can be used properly.
Transformation
Transforming data into a different format for future processing.
Enrichment
Adding new data to make existing data more valuable.
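The four aspects above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the customer records, the field names, the 0 to 120 age range, the median fill-in and the REGION_BY_COUNTRY lookup are all hypothetical and would differ in a real business dataset.

```python
import statistics
from datetime import datetime

# Hypothetical customer records; None marks a missing value.
records = [
    {"id": 1, "country": "us", "signup": "2021-03-01", "age": 34},
    {"id": 2, "country": "DE", "signup": "2021-04-15", "age": None},
    {"id": 3, "country": "fr", "signup": "2021-05-20", "age": -5},
]

# Hypothetical lookup table used for enrichment.
REGION_BY_COUNTRY = {"US": "Americas", "DE": "Europe", "FR": "Europe"}

def identify(record):
    """Identification: list the fields that are missing or invalid."""
    problems = []
    age = record["age"]
    if age is None or not (0 <= age <= 120):
        problems.append("age")
    return problems

def correct(record, default_age):
    """Correction: replace a flagged age with the chosen default."""
    fixed = dict(record)
    if "age" in identify(record):
        fixed["age"] = default_age
    return fixed

def transform(record):
    """Transformation: normalise country codes and parse dates."""
    fixed = dict(record)
    fixed["country"] = record["country"].upper()
    fixed["signup"] = datetime.strptime(record["signup"], "%Y-%m-%d").date()
    return fixed

def enrich(record):
    """Enrichment: add a region derived from the country code."""
    fixed = dict(record)
    fixed["region"] = REGION_BY_COUNTRY.get(record["country"], "Unknown")
    return fixed

def cleanse(rows):
    """Run all four steps; the median of valid ages is the assumed default."""
    valid_ages = [r["age"] for r in rows if not identify(r)]
    default_age = statistics.median(valid_ages) if valid_ages else None
    return [enrich(transform(correct(r, default_age))) for r in rows]

cleaned = cleanse(records)
for row in cleaned:
    print(row)
```

Each step returns a new record instead of modifying the original, so the raw data stays available for auditing. Filling in a missing age with the median is only one possible correction policy; in some cases it is more appropriate to leave the value empty or to go back to the source.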
Data Cleansing is becoming increasingly important in the Data World because of the sheer volume of data that businesses collect. Data Cleansing supports Data Integrity, which is a foundation of Data Quality.