Data Cleaning in Machine Learning: Guideline and Checklist

Melissa AU Team | Australia, Data Cleansing

Machine Learning (ML) is helping businesses across industries solve problems and deliver tangible benefits. The chatbots customers interact with to get more information and the customized product recommendations on websites are examples of how ML helps businesses improve customer service and increase conversion rates at the same time. You can build ML models for many real-world applications, but their usefulness hinges on one key factor: the quality of the data being analysed. …

Data Cleaning – The What, Why and How

Melissa AU Team | Data Cleansing

Deciding what to include and exclude in the new product range, deciding price points, estimating delivery timelines: all of this and more is based on analyzing vast amounts of data. In 2020, on average, every individual generated 1.7 megabytes of data per second. But quantity isn’t everything. Poor data quality can cost businesses between $9.7 million and $14.2 million annually. Thus, as our reliance on data increases, so must the effort put into cleaning it.

A Quick 7-Minute Tutorial on Cleaning Your Data in Excel

Blog Administrator | Address Correction, Address Quality, Address Validation, Address Verification, Data Quality, Listware for Excel

In this short demo, learn how to use Listware for Excel – our free Excel add-in that cleans and enriches the contact information in your list. Learn how to check, verify, and update names, addresses, phone numbers and email addresses. Also find out how to add missing contact information to your contact record. Listware for Excel’s summary counts and color-coordinated cells make it easy to view and analyze your results, so you know exactly how your data’s been improved.

Record Linkage & Fuzzy Matching Part 2a (More on “Blocking” for Performance Improvement)

Blog Administrator | Uncategorized

Over at the LinkedIn Group for Data Matching run by Henrik Liliendahl Sorensen, Bill Winkler, principal researcher at the U.S. Census Bureau, has shared several reference papers on “blocking.” They are excellent, and I wanted to share them with you.


According to Winkler, “The following three papers are primarily concerned with ‘blocking.’ The third gives a methodology for estimating false negatives (false non-matches) in a narrow range of situations.”
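
For readers new to the term, here is a minimal Python sketch of what blocking does in record linkage. It is an illustration only and is not taken from Winkler's papers: the toy records, the choice of postcode as the blocking key, and the helper names `blocking_key` and `candidate_pairs` are all assumptions made for this example. The idea is to compare records only when they share a key, which cuts the number of comparisons but can also hide true matches that land in different blocks.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical toy records: (id, surname, postcode)
records = [
    (1, "Smith", "2000"),
    (2, "Smyth", "2000"),
    (3, "Jones", "3000"),
    (4, "Jones", "3000"),
    (5, "Smith", "2001"),
]


def blocking_key(record):
    """Assumed blocking key for this sketch: the postcode field."""
    return record[2]


def candidate_pairs(records, key):
    """Group records by key, then pair records only within each block."""
    blocks = defaultdict(list)
    for rec in records:
        blocks[key(rec)].append(rec)
    pairs = []
    for block in blocks.values():
        pairs.extend(combinations(block, 2))
    return pairs


# Without blocking, every record is compared with every other record.
all_pairs = list(combinations(records, 2))

# With blocking, only records sharing a postcode are compared.
blocked_pairs = candidate_pairs(records, blocking_key)
blocked_ids = sorted((a[0], b[0]) for a, b in blocked_pairs)

print(len(all_pairs), "pairs without blocking")
print(len(blocked_pairs), "pairs with blocking:", blocked_ids)
```

In this toy data, records 1 and 5 share a surname but sit in different postcode blocks, so they are never compared. If they were the same person, that would be a false non-match of the kind Winkler mentions, and it shows why the choice of blocking key trades speed against missed matches.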