Talend Data Quality Primary Use Case

SP
IT Manager at an insurance company with 10,001+ employees

Talend has different modules: Talend Data Integration (DI), Talend Data Quality (DQ), Talend MDM, and Talend Data Mapper (TDM). We have Talend DI, Talend DQ, and TDM, and our use cases span these modules. We don't use Talend MDM because we have a different solution for MDM; our EDF team is using an Informatica solution for that.

We have a platform that deals with MongoDB, Oracle, and SQL Server databases. We also have Teradata and Kafka. The first use case was to ensure that when data traverses from one application to another, there is no data loss. This use case was more about data reconciliation, and it was only loosely tied to data quality.
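As a rough illustration of the reconciliation idea described here (not how Talend implements it), the following Python sketch compares a source and a target record set by key. The function name `reconcile`, the `id` key, and the sample rows are assumptions made for the example.

```python
def reconcile(source_rows, target_rows, key="id"):
    """Compare two record sets by key and report losses and differences."""
    source = {row[key]: row for row in source_rows}
    target = {row[key]: row for row in target_rows}
    return {
        # Keys that left the source but never arrived in the target.
        "missing_in_target": sorted(source.keys() - target.keys()),
        # Keys that appear in the target without a source counterpart.
        "unexpected_in_target": sorted(target.keys() - source.keys()),
        # Keys present on both sides whose field values differ.
        "changed": sorted(
            k for k in source.keys() & target.keys() if source[k] != target[k]
        ),
    }


# Hypothetical sample data: record 3 is lost and record 2 is altered in transit.
source = [
    {"id": 1, "amount": 100},
    {"id": 2, "amount": 250},
    {"id": 3, "amount": 75},
]
target = [
    {"id": 1, "amount": 100},
    {"id": 2, "amount": 205},
]
report = reconcile(source, target)
```

An empty report on all three lists would indicate that nothing was lost or altered between the two applications, at least for the fields compared.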

The second use case was related to data consistency. We wanted to make sure that the data is consistent across various applications. For example, we are a healthcare company. If I'm validating just the claim system, I need to see how I can inject data into those systems without any issues.

The third use case was related to whether the data is matching the configurations. For example, in production, I want to see:

  • Is there any data issue or duplicate data?
  • Is the data coming from different states getting fed into the system and matching the configurations that have been set in our different engines, such as enrollment, billing, and all those things?
  • Is it able to process this data with our configuration?
  • Is it giving the right output?

The fourth use case was to see whether I can create data virtually. For example, I want to test with data that is not available in the current environment, or I'm trying to create EDI files, which are 834 and 837 transaction files. These are the enrollment and claims processing files that come from different providers. If I want to test these files, do I have the right information within my systems, and who can give me that information?

The fifth use case was related to masking information so that people in the test environment don't have access to certain data. For example, across the industry, people pull data from production, push it into a lower environment, and test. Because this is healthcare data, we have a lot of PHI and PII. If your PHI and PII are in production and I am pulling that data, I have everything that is in production in the test environment. So, I know your address, and I know your residence. I can hack into your systems, and I can do anything. This is the main issue for us with HIPAA compliance. How do we mask that information so that people in the lower environment don't have access to it?
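One common way to approach this kind of masking is keyed, deterministic pseudonymization, so that the same production value always maps to the same masked token and joins still work in the lower environment. The sketch below is only an illustration in Python, not Talend's masking component; the field names, the `MASKED_` prefix, and the key are assumptions, and a real key would come from a secrets store.

```python
import hashlib
import hmac

# Hypothetical key for the example only; never hard-code a real key.
SECRET_KEY = b"example-only-key"

# Hypothetical list of fields treated as PHI/PII in this sketch.
SENSITIVE_FIELDS = {"name", "address", "ssn"}


def mask_value(value: str) -> str:
    """Replace a value with a keyed, deterministic token."""
    digest = hmac.new(SECRET_KEY, value.encode("utf-8"), hashlib.sha256).hexdigest()
    return "MASKED_" + digest[:12]


def mask_record(record: dict) -> dict:
    """Mask sensitive fields and pass every other field through unchanged."""
    return {
        field: mask_value(str(value))
        if field in SENSITIVE_FIELDS and value is not None
        else value
        for field, value in record.items()
    }


# Hypothetical production row.
production_row = {"member_id": 42, "name": "Jane Doe", "address": "1 Main St", "state": "OH"}
test_row = mask_record(production_row)
```

Because the token depends only on the key and the input, masking the same row twice gives the same result, which keeps test data consistent across tables.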

These are the use cases with which we started our journey. Now, it is moving more into the cloud, and we are using Talend to interact with various cloud environments in AWS. We are also interacting with Redshift and Snowflake by using Talend. So, it is expanding. We are using version 7.1, and we are migrating to version 7.3 very soon.

WesamHabboub - PeerSpot reviewer
Chief Consultant at Insight360

We recently deployed it for one of our clients, who use it to enhance the quality of their government-related customer data. The primary focus is on ensuring compliance with government policies, and it serves as a crucial component in achieving data quality improvements.

SV
Practice Manager

Data Quality is used to automate the quality control checks on the data loaded from batch jobs. This includes BCA for field-level data quality and cross-table checks for key column mismatches.
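A cross-table key check of this kind can be pictured as looking for child keys that have no matching parent. The Python sketch below is a simplified illustration under assumed names (`hub_customer`, `sat_customer`, `customer_hk`); it is not the reviewer's actual job.

```python
def orphan_keys(child_rows, parent_rows, child_key, parent_key):
    """Return child key values that have no matching row in the parent table."""
    parent_keys = {row[parent_key] for row in parent_rows}
    child_keys = {row[child_key] for row in child_rows}
    return sorted(child_keys - parent_keys)


# Hypothetical hub and satellite rows; "c3" has no hub entry.
hub_customer = [{"customer_hk": "c1"}, {"customer_hk": "c2"}]
sat_customer = [
    {"customer_hk": "c1", "city": "Columbus"},
    {"customer_hk": "c3", "city": "Dayton"},
]
mismatches = orphan_keys(sat_customer, hub_customer, "customer_hk", "customer_hk")
```

Wrapping the check in a function with the key names as parameters is what makes it reusable across many table pairs.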

The data is in Redshift and the load volume is around 10 million records per batch load over more than 100 tables in a Data Vault model.

This is for a short three-month project. I have used it from the dev phase through QA. It reduces the QA effort immensely by handling most of the test scenarios in a reusable way.

Buyer's Guide
Talend Data Quality
March 2024
Learn what your peers think about Talend Data Quality. Get advice and tips from experienced pros sharing their opinions. Updated: March 2024.
768,740 professionals have used our research since 2012.
SK
Software Developer at a tech consulting company with 51-200 employees

Talend Data Quality helps me find and fix problems in my data. It checks for errors and follows rules to ensure my data is accurate. If it finds issues, it works together with me and the data stewards to fix them. It is like a team effort to make sure my data is good quality from the start.

JW
ETL/SQL Developer at an insurance company with 201-500 employees

We have a legacy system (Wins + DB2), which stores all our data.

For reporting purposes (from SQL), we need to analyze the data. We use it to make decisions, for example, whether to display a data element in our reports based on whether the column ever gets a value entered by a user, or which distinct values we are receiving for transformation purposes.

We use it to check patterns, like zip codes, state codes, and phone numbers.
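To give a sense of what such pattern checks look like, here is a minimal Python sketch using regular expressions. The patterns assume US-style formats and the field names are hypothetical; the state pattern checks only the shape (two capital letters), not whether the code is a real state.

```python
import re

# Assumed US-style formats for the example.
PATTERNS = {
    "zip_code": re.compile(r"\d{5}(-\d{4})?"),
    "state_code": re.compile(r"[A-Z]{2}"),
    "phone": re.compile(r"\(?\d{3}\)?[-. ]?\d{3}[-. ]?\d{4}"),
}


def pattern_failures(rows):
    """Return (row index, field, value) for every value that breaks its pattern."""
    failures = []
    for index, row in enumerate(rows):
        for field, pattern in PATTERNS.items():
            value = row.get(field)
            # A missing value counts as a failure in this sketch.
            if value is None or not pattern.fullmatch(str(value)):
                failures.append((index, field, value))
    return failures


# Hypothetical rows: the second one has a short zip code and a spelled-out state.
rows = [
    {"zip_code": "43215", "state_code": "OH", "phone": "614-555-0100"},
    {"zip_code": "4321", "state_code": "Ohio", "phone": "(614) 555-0100"},
]
failures = pattern_failures(rows)
```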

We also check data value frequency to inform business decisions when mapping from one system to another.
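A value-frequency check can be sketched in a few lines of Python. This is only an illustration of the idea, with a hypothetical `status` column; it also shows how often a column is left empty, which relates to deciding whether a column ever receives a value.

```python
from collections import Counter


def value_frequency(rows, column):
    """Return (value, count, share) tuples for a column, most frequent first."""
    counts = Counter(row.get(column) for row in rows)
    total = sum(counts.values())
    return [(value, count, count / total) for value, count in counts.most_common()]


# Hypothetical legacy rows; None stands for a value the user never entered.
legacy_rows = [
    {"status": "A"},
    {"status": "A"},
    {"status": None},
    {"status": "T"},
]
frequencies = value_frequency(legacy_rows, "status")
```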

UN
Data Scientist at a wellness & fitness company with 51-200 employees

The primary use case is data ingestion. We currently have HDP 2.6 installed on Ubuntu 16.04.

HU
Practice Manager (Digital Solutions) at a computer software company with 201-500 employees

Our use cases vary, but mainly we are using it to implement a master data management platform. We get data from multiple sources and create a golden record that can be used to ingest the data from that single source into any of the platforms.

DN
Data Consultant at a tech vendor with 11-50 employees

We’ve created an MDM-like system. The MDM hub is built on an Oracle Database. The system retrieves data from different sources, such as files, Microsoft SQL Server, and Oracle DB. The data is processed by our cleansing process. We’re using Talend DQ components, web services, and custom Java code to clean our data. Once the data is cleansed, we load it into the MDM hub, where the records are matched and consolidated. The consolidated records are then written back to specific target sources.
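The match-and-consolidate step can be pictured with the simplified Python sketch below. It is not the reviewer's implementation: the matching rule (same e-mail after trimming and lowercasing) and the survivorship rule (first non-empty value wins) are assumptions chosen to keep the example short.

```python
def match_key(record):
    """Hypothetical matching rule: same e-mail after trimming and lowercasing."""
    return record["email"].strip().lower()


def consolidate(records):
    """Group matching records and merge each group into one consolidated record."""
    merged = {}
    for record in records:
        key = match_key(record)
        golden = merged.setdefault(key, {"email": key})
        for field, value in record.items():
            # Assumed survivorship rule: the first non-empty value wins.
            if value not in (None, "") and field not in golden:
                golden[field] = value
    return list(merged.values())


# Hypothetical cleansed input from two sources.
cleansed = [
    {"email": "A@x.com ", "name": "Ann", "phone": ""},
    {"email": "a@x.com", "name": "Ann Lee", "phone": "555-0100"},
    {"email": "b@x.com", "name": "Bob", "phone": None},
]
consolidated = consolidate(cleansed)
```

Real matching usually relies on fuzzier rules and more deliberate survivorship choices, such as preferring the most recent or most trusted source.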

it_user827655 - PeerSpot reviewer
Principal Developer

We use it to load our big data system with S3 and Redshift. We also use it to process HL7 messages from hospitals in real time.

it_user848511 - PeerSpot reviewer
VP of Professional Services at a tech services company with 51-200 employees

  • Fixing data by using regular expressions or synonyms and Data Stewardship.
  • Using data profiling to gauge the quality of the data before and after it’s used/needed.
  • Master Data Management - Authoring and matching survivorship, including Data Stewardship.
it_user826299 - PeerSpot reviewer
Junior ETL Developer at a marketing services firm with 51-200 employees

We are a marketing and advertising company. We use this tool to fetch data from Google, Bing, and Adobe. We receive marketing data daily via email, FTP, and API, then process the data into MySQL tables.

it_user826677 - PeerSpot reviewer
Technical Consultant

Data migration (database to database using direct DB access and commands or using web services).
