Record Linkage & Fuzzy Matching Part 2a (More on “Blocking” for Performance Improvement)

Over at the LinkedIn Group for Data Matching run by Henrik Liliendahl Sorensen, Bill Winkler, principal researcher at the U.S. Census Bureau, has shared several reference papers on "blocking." They are excellent and I wanted to share them with you. According to Winkler, "The following three papers are primarily concerned with 'blocking.' The third gives a methodology for estimating…

Continue Reading

Record Linkage and Fuzzy Matching Part 2

This blog series addresses the steps necessary for efficient data/record processing that includes a record linkage or fuzzy matching step. In part 1, we covered the overall approach. Today, we will cover the following steps: 1. Categorize 2. Split records. In academia, these steps are defined as creating a "Blocking Index." (We will cover cleansing next;…

Continue Reading

Survey: Most Common BI Problem…Can You Guess What It Is?

Recent survey results uncovered by U.K.-based Business Application Research Center (BARC) surprised even its researchers! This year, for the first time, the biggest company complaint on Business Intelligence (BI) issues wasn't slow query performance, company politics, or even a lack of end-user skills. Find out what an overwhelming majority of companies say is their biggest BI obstacle. www.melissadata.com/enews/articles/10212010/2.htm …

Continue Reading

Record Linkage and Fuzzy Matching Part 1

This will be the first in a series of posts on identifying similar records between two different sources, or grouping records from a single source, based on the string values in existing columns. We will define an approach and review actual implementations with various tools and vendors' products. There are many facets to review. I would like to start by drawing from…

Continue Reading

Melissa Data Launches Multiplatform API RightFielder Object

RightFielder Object is a multiplatform API that will intelligently parse and organize inconsistent or unformatted data into correctly formatted fields. Using proprietary parsing logic, RightFielder Object can analyze free-form data, recognize where fields begin and end, and assign the contents to the correct output property. RightFielder Object also identifies multiple types of data in the same field (e.g. city and state)…

Continue Reading

Revenues Jump When Big Companies Improve Data, Study Suggests

Small tweaks and incremental investments in data accessibility and intelligence can have big returns in revenues, growth and innovation, according to findings from a study of Fortune 1000 companies released today. Touting millions of dollars in added revenue possibilities and avenues of business, the study - entitled "Measuring the Business Impacts of Effective Data" - was conducted by Sybase, an…

Continue Reading

Melissa Data Report from Oracle OpenWorld 2010

This year, Oracle OpenWorld was bigger than ever. With the recent acquisition of Sun Microsystems, this year's event also featured MySQL Sunday and the JavaOne conference. Held in downtown San Francisco at the Moscone Center, this conference featured tens of thousands of attendees and hundreds of exhibitors (one of which was us). This event wasn't all work though, as Oracle…

Continue Reading