Data Scrubbing Software Companies and Data Cleansing Tools
Over 265,036 professionals have used IT Central Station research.
Compare the best Data Scrubbing Software vendors based on product reviews, ratings, and comparisons.
All reviews and ratings are from real users, validated by our triple authentication process.
The total ranking of a product, represented by the bar length, is based on a weighted aggregate score.
The score is calculated as follows: The product with the highest count in each area gets the highest available score.
(20 points for Reviews; 16 points for Views, Comparisons, and Followers.)
Every other product is assigned points in proportion to the #1 product in
that area. For example, if a product has 80% as many reviews as the product
with the most reviews, then the product's score for Reviews would be 20 points
(the Reviews weighting) × 80% = 16 points. For Average Rating, the maximum score is 32 points, awarded linearly based on our
rating scale of 1-10. If a product has fewer than ten reviews, the point contribution
for Average Rating is reduced (one-third reduction in points for products with 5-9 reviews;
two-thirds reduction for products with fewer than five reviews). Reviews that are more than 24 months old,
as well as those written by resellers, are completely excluded from the ranking algorithm.
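The scoring arithmetic above can be sketched in code. The weights and reduction rules come from the text; the product counts below are made-up illustrations:

```python
# Sketch of the weighted ranking described above.
# Weights and reduction rules are from the text; the sample data is invented.

WEIGHTS = {"reviews": 20, "views": 16, "comparisons": 16, "followers": 16}
MAX_RATING_POINTS = 32  # awarded linearly on the 1-10 rating scale

def rating_points(avg_rating, review_count):
    points = MAX_RATING_POINTS * (avg_rating / 10.0)
    if review_count < 5:
        points *= 1 / 3          # two-thirds reduction
    elif review_count < 10:
        points *= 2 / 3          # one-third reduction
    return points

def total_score(product, leaders):
    """leaders maps each area to the count held by the #1 product."""
    score = sum(
        WEIGHTS[area] * product[area] / leaders[area] for area in WEIGHTS
    )
    return score + rating_points(product["avg_rating"], product["reviews"])

# A product with 80% of the leader's reviews earns 20 * 0.8 = 16 review points.
leaders = {"reviews": 100, "views": 5000, "comparisons": 400, "followers": 250}
product = {"reviews": 80, "views": 5000, "comparisons": 400,
           "followers": 250, "avg_rating": 9.0}
score = total_score(product, leaders)
```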
The high value in this tool is its relatively low cost, ease of use, tight integration with SSIS, and attribute-level advanced survivorship logic. This vendor offers a large variety of components, from on-prem to cloud SaaS as well as hybrid... more»
De-duplicates our customer data in a low-support, high-performance process so that we are able to reduce marketing costs and increase the quality of communication with customers. It replaced a weekly primitive de-duplication (best record)... more»
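A "best record" de-duplication of this kind can be sketched generically. The match key (name + postal code) and the survivorship rule (each field taken from the newest record that has a value) are illustrative assumptions, not the vendor's actual logic:

```python
from collections import defaultdict

# Minimal sketch of key-based de-duplication with "best record" survivorship.
# The match key and the "newest non-empty value wins" rule are assumptions.

def match_key(rec):
    return (rec["name"].strip().lower(), rec["postal"].strip())

def survive(group):
    """Build one surviving record, taking each field from the newest
    record in the group that has a non-empty value for it."""
    newest_first = sorted(group, key=lambda r: r["updated"], reverse=True)
    return {
        field: next((r[field] for r in newest_first if r[field]), "")
        for field in ("name", "postal", "email", "updated")
    }

def dedupe(records):
    groups = defaultdict(list)
    for rec in records:
        groups[match_key(rec)].append(rec)
    return [survive(g) for g in groups.values()]
```

Run weekly (or continuously), this replaces a manual pick-the-best-record pass: duplicates collapse to one row, and no non-empty attribute is lost.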
First, a caution: the 2016 version is very buggy. I think we were one of their first customers on 2016, so we ended up being the first to discover a few pretty significant bugs. That said, after about six weeks working through the issues we... more»
We have a large volume of customers (millions) and this suite is the best priced for what we need. We use the cloud version for NCOA and the on-premise version for the normal batch processing from our POS systems. The two components we use... more»
Before, we used DQS and an algorithm for matching, and only updated valid address information quarterly. This caused returned mailings and upset customers. This suite has monthly updates to ensure we are up to date with recent new home... more»
Not in order of importance:
1. Ability to multi-thread. Currently, we have to split the data into different paths to try to quicken processing time. In all fairness, we do have a very large customer master to match against.
2. Data types... more»
* Being able to open up large files
* The extensive merge/purge capabilities
We handle large amounts of data, in the terabytes. It's important to have a program that can handle that large amount of data at one time, and effectively... more»
We had a project where we had purchased data files. We ran a standard name, address, and zip code, internal dedupe between the different files we had, and we were able to quickly notify our vendor that they had tens of thousands of... more»
One of the problems that we ran into this year was we probably spent over 40 hours finding and trying to drill down to where specific bugs were in the program, which was a tremendous waste of time for us. There were a couple of updates to... more»
We like having the ability to write our own utilities/software to process our records and store the final output the way we want. We use the objects inside of programs that we wrote in VB/VB.NET. This tool works better for us than using a... more»
When we first started using Melissa Data products, we needed it for cleaning up parcel addresses and adding Zip+4 to its latitude and longitude coordinates. At the time, we were actually using the products for verifying the addresses and... more»
The numerous components provided by Talend. With these components you’re able to create jobs quickly and efficiently. I also really like the fact that there are no out-of-the-box solutions regarding the development of jobs. Other vendors may... more»
It’s easy to monitor the processes. Every morning I’ll open the Talend Administration Center to check the status of the process. Within seconds I’m able to see which process ran successfully and which have failed and why they failed. We’re... more»
When we upgraded to Version 6.4.1, we tried using a Git repository instead of an SVN repository. After a few incidents where things disappeared and changes were not saved, we decided to go back to an SVN repository.
We only use the address hygiene tool, so I can't really comment on their other tools. We did look at their email verification as well, because we also use another service for it. We did not switch to them, but it was not due to the... more»
We do hundreds of thousands to millions of customer deliveries. By using Melissa Data, we are able to scrub and verify, then better validate the end customer's address to ensure a more consistent delivery of products.
An area for improvement is where an end customer's address is not found in the Melissa Data database, even though it is a valid address. I would like a little more intelligence around handling addresses that are not in the system and... more»
The usual data cleansing: The suggestions it made for data formats (size, type, etc.) especially, and how quick it was. It was very fast in taking unstructured data, processing it and spitting out all the different data types, all the likely... more»
The product was dropped not long after I left the organization, I was told. The data team I was on moved to SSIS/SSRS, I heard. I suspect it didn't fit in with the overall goal of creating a data warehouse as quickly as possible. My manager... more»
The manual calculations and formulae. They were a bit complex. The formulae were a bit abstract. Not easy to understand. Not intuitive. I sat beside an SSIS guru and he took one look at them and said “Good luck Geoff”. I coded them all and... more»
PowerBuilder Consultant at a government agency with 10,001+ employees
Dec 18 2017
What do you think of Melissa Data Quality?
Primary Use Case
We use Personator to validate any address entered into our system. We also use their GeoPoints to get the most precise, rooftop level geocoding. We do not use the name, email, or phone validation built into Personator. We use Express Entry Service to fuzzy search; in case there are multiple matches for the address entered, we present them to the user to pick the right one.
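The pick-from-multiple-matches flow described above can be illustrated without the actual web service. The snippet below uses Python's difflib as a stand-in for Express Entry's fuzzy search; the address list and the cutoff value are invented:

```python
import difflib

# Illustrative stand-in for "present multiple matches to the user".
# A real deployment would call the address verification web service;
# here a small in-memory list and difflib approximate the fuzzy search.

KNOWN_ADDRESSES = [
    "100 Main St, Springfield, IL 62701",
    "100 Maine Ave, Springfield, IL 62704",
    "10 Main St, Shelbyville, IL 62565",
]

def candidate_addresses(user_input, n=3, cutoff=0.6):
    """Return close matches, best first; if more than one comes back,
    the UI should let the user pick the right one."""
    return difflib.get_close_matches(user_input, KNOWN_ADDRESSES,
                                     n=n, cutoff=cutoff)

matches = candidate_addresses("100 Main Street, Springfield IL")
```

When `matches` has several entries, presenting them for the user to choose mirrors the Express Entry flow the review describes.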
• Improvements to My Organization
We used to use a third-party product that had to be installed and maintained onsite. Since we switched to Melissa Data web services, we do not need to maintain those servers and/or software. Also, we get the most up-to-date addresses from USPS.
• Valuable Features
Address validation, standardizing and formatting.
• Use of Solution
One to three...
It gives me an assessed value of the property in question. My partner and I are property investors, and it's good to get an assessed value. Other databases that we extract information from don't always give us that. So we're able to get an... more»
It really hasn't given us a phone number for the owner of the property, and that's one thing I'd really like to be getting. Either a phone number or email. One thing I would want to have, when you're doing a property search, you can do it... more»
We mainly communicate with our customers via email, so we primarily use it to find a phone number so we can contact them more efficiently. This allows us to talk to them and resolve their issues much more quickly.
We would appreciate it if there was a larger database so that we could find information more often. For example, we can search for 10 people and only find the information for three of them, if we are lucky. Increasing this success rate would... more»
It provides address standardization and NCOA, and deliverable addresses. This saves our clients time, and money in postage fees, by providing updated and undeliverable addresses.
It saves a huge amount of time. Before using this service, we used a vendor that manually ran our lists through this NCOA list, which might have taken one to three business days to return the file. This was a huge bottleneck in our process,... more»
The billing structure does not seem very accurate. We’ve had issues with miscounted batch records processed. We also ran into some data quality issues, but they have been rectified and we haven’t noticed any issues since.
Primary Use Case
We have a legacy system (Wins + DB2), which stores all our data. For reporting purposes (from SQL), we need to analyze the data. We use it for making decisions, for example, whether to display data elements in our reports based on whether a column ever gets a value entered by a user, or what distinct values we are receiving for transformation purposes. We use it to check patterns, like zip codes, state codes, and phone numbers. We also check data value frequency for business decisions when mapping from one system to another.
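The two profiling checks this review describes, value-frequency counts and pattern conformance, can be sketched in plain Python. The ZIP-code pattern and the sample column values are illustrative assumptions:

```python
from collections import Counter
import re

# Sketch of two data-profiling checks: value frequency and pattern
# conformance (US ZIP codes as the example). Column data is invented.

ZIP_PATTERN = re.compile(r"^\d{5}(-\d{4})?$")

def value_frequency(values):
    """Distinct values and how often each occurs, most common first."""
    return Counter(values).most_common()

def pattern_violations(values, pattern=ZIP_PATTERN):
    """Values that do not conform to the expected pattern."""
    return [v for v in values if not pattern.match(v)]

zips = ["62701", "62701", "6270", "62704-1234", "ABCDE"]
freq = value_frequency(zips)
bad = pattern_violations(zips)
```

The frequency output supports mapping decisions (which distinct values a column actually carries), while the violation list flags rows needing cleanup before conversion.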
• Improvements to My Organization
With its frequency function, we were able to pick a line of business to be addressed first in one of our conversion projects.
• Valuable Features
We have used value frequency and patterns. We have been...
Applied Data Science Entrepreneur at a consultancy with 1-10 employees
Dec 24 2017
What do you think of Melissa Data Quality?
Primary Use Case
Data merge, purge, quality, and append.
• Improvements to My Organization
I was able to dedupe millions of records in the past, and append the most recent email.
• Valuable Features
The flexibility of their products: services for all manner of data-driven organizations, no matter their size or budget.
• Room for Improvement
Better email append coverage (but every vendor struggles with this).
• Use of Solution
More than five years.
• Stability Issues
No issues. Everything worked as expected, and exceeded expectations.
• Scalability Issues
It depends on whether you can run extremely large lists on multiple servers. Any sort of dedupe or fuzzy match is processing intensive, regardless of the vendor.
• Customer Service and Technical...
We have, however, benefited in additional ways from the service, in that our customer database is now significantly more accurate and reliable. This is particularly important when it comes to both cost-saving and cost-generating measures.... more»
I like the components provided by Data Quality, such as:
* Address standardization
* Fuzzy match
* Schema compliance check
They pack a lot of code, which is required to perform these standard data operations. Doing the same by coding would... more»
* The report generation and using the report in DI job steps could be improved.
* There are too many functions which could be streamlined.
* The report generated often has too many pages to go through, if not loaded into a DB.
* There are... more»
I have only just started to use it at my current company so no improvements yet. In my previous role, we used Listware to reduce returned mail and reduce mailing costs.
The world is moving/has moved to the cloud. I get that. But it would be nice to easily integrate the solution with our own internal systems/processes in a way that keeps IT happy. Right now I live at a company with (exceedingly) tight IT... more»
It's most useful in getting us names of homeowners. The phone numbers also make it convenient and easy to contact potential leads. This allows us to personalize postcards to advertise our services.
We have only been using this for about two months, but it has sped up our processing significantly. It makes data mining easy and fast. We don't have to spend an entire month gathering correct information on leads. With LWO, all we need is a... more»
My organization's requirement was to build a centralized, quality data repository by accumulating client data from various source systems. This tool helped in understanding the quality of the data available in each system. By understanding... more»
Although DQA can fetch data from most of the commonly used data sources, it has limited modifiers to get data, meaning that the number of technologies from which the data can be acquired is limited. For example, DQA does not support fetching... more»
Head of Data Partnerships at a tech services company with 11-50 employees
Jan 22 2018
What do you think of Melissa Data Quality?
Primary Use Case
We use Melissa Data Global Address Verification Web Service for address validation and parsing into individual attributes.
• Improvements to My Organization
By validating and parsing the addresses our customers submit to us, we have reduced the number of addressing errors encountered during our processing.
• Valuable Features
Address validation and parsing are most valuable to us, as we require the address segmented into its individual attributes for processing.
• Room for Improvement
Address validation and parsing in a few countries have room for improvement.
• Use of Solution
One to three years.
• Stability Issues
The solution has been very stable.
• Scalability Issues
No issues with scalability.
• Customer Service and Technical...
* The file fetch process is impeccable.
* We are able to get emails from URLs very easily using this function when others fail.
* tLogRows are also great for finding bad data.
Coming into the department with no knowledge of Talend, the interface has been user-friendly enough to allow me to come up to speed in four to five months on almost all its functions and use it like a pro.
NullPointerExceptions are going to be the death of me and are a big reason for our transition away from Talend. One day, it is fine with 1,000 blank rows; the next day, it will find one blank cell and break down. When we are dealing... more»