AWS Glue Primary Use Case

Ajaykumar Myana - PeerSpot reviewer
Senior Software Developer at a computer software company with 10,001+ employees

My source data was unstructured and had no fixed format, and my responsibility was to convert it into structured data. I used PySpark for this task, building the data frame in Python inside a Glue job. Because Glue jobs are serverless, I only had to deploy my code to the job to get the work done.
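A minimal sketch of the unstructured-to-structured step described above, written in plain Python so it runs without a Glue or Spark environment. The raw line layout and the names `LINE_PATTERN`, `parse_line`, and `to_records` are assumptions made for illustration; the review does not say what the source data looked like.

```python
import re
from typing import Optional

# Hypothetical raw line format, assumed for illustration only:
#   "<timestamp> <LEVEL> <free-text message>"
LINE_PATTERN = re.compile(r"^(?P<ts>\S+)\s+(?P<level>[A-Z]+)\s+(?P<message>.*)$")


def parse_line(line: str) -> Optional[dict]:
    """Turn one raw line into a structured record, or None if it does not match."""
    match = LINE_PATTERN.match(line.strip())
    return match.groupdict() if match else None


def to_records(lines):
    """Keep only the lines that parse; each result maps column name to value."""
    parsed = (parse_line(line) for line in lines)
    return [row for row in parsed if row is not None]


if __name__ == "__main__":
    raw = [
        "2024-04-01T10:00:00Z INFO job started",
        "garbage",
        "2024-04-01T10:00:05Z ERROR missing field",
    ]
    for row in to_records(raw):
        print(row)
```

In a PySpark job, a list of records like this could then be handed to `spark.createDataFrame` to obtain a data frame; that step is left out here to keep the sketch self-contained.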

AmitMataghare - PeerSpot reviewer
Associate Director at a consultancy with 10,001+ employees

In my company, we use AWS Glue to build data engineering pipelines: we ingest data from S3 or other sources and load it into Redshift, which serves as our data lake or data warehouse.

CE
Senior Software Engineer at a consumer goods company with 10,001+ employees

We are collecting some TV audience data and analyzing it.

Buyer's Guide
AWS Glue
April 2024
Learn what your peers think about AWS Glue. Get advice and tips from experienced pros sharing their opinions. Updated: April 2024.
768,740 professionals have used our research since 2012.
ParamShah - PeerSpot reviewer
Engineering Manager at Milestone Technologies

We use the solution to build tables on CSV data. We get data from some different sources, pull it in S3, and then create tables using Glue to get some metrics out of that data.

Mbaye Babacar Gueye - PeerSpot reviewer
Owner at Bennen

One common use case is migrating data from one system to another. Mostly it is data migration and data engineering: getting real-time or near-real-time data using Lambda functions, and migrating historical big data from on-prem to the cloud before starting a project.

Vimalathithan M - PeerSpot reviewer
Associate Director - Delivery (Technology DWH & Data Engineer) at MOBIUS KNOWLEDGE SERVICES PRIVATE LIMITED

Our primary use cases include pulling data from multiple sources and loading it into the central capacity for data transformation, integration, and processing.

RajKumar23 - PeerSpot reviewer
Sr Associate at Cognizant

We use AWS Glue for data analytics.

Syed Zakaulla - PeerSpot reviewer
Project Manager at Softway

We're using GPU 0.2 in ten verticals and wanted to use AWS Glue only for one purpose: to optimize Amazon Redshift. 

We have millions of records to back up. Previously, we did it once every six months, but the client data has become very interactive, and we now need continuous back-and-forth data exchange in real time. Almost one million records come and go every second. The client wanted to keep all of the data because they use it for analytics, and wanted it backed up every second without delay. While trying to optimize Amazon Redshift, we found AWS Glue, which comes with massive costs, but the client is willing to pay.

Joaquin Marques - PeerSpot reviewer
CEO - Founder / Principal Data Scientist / Principal AI Architect at Kanayma LLC

We use the solution for the usual transformations that previously required a separate ETL tool, mainly transforming data from source to target. We are also replacing our usual ETL tools with Glue.

Neelabh Sharma - PeerSpot reviewer
Data Engineer at Scania

We use AWS Glue for ETL batch processing purposes.

UjjwalGupta - PeerSpot reviewer
Module Lead at Mphasis

We use AWS Glue for building ETL pipelines.

Sunil Morya - PeerSpot reviewer
Consultant at a tech vendor with 10,001+ employees

When you receive data and don't know its structure, Glue is very helpful for working it out. One component, the Glue Crawler, is quite useful for this task: it goes through segments of your data and tries to guess their structure. It outputs the structure it found, and you can modify it as needed.

It works well for ETL when your files are stored in an S3 bucket. Glue also supports other external sources. That said, most of the time we have proposed it to clients whose data is already available in S3.
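The idea of sampling data and guessing its structure can be illustrated with a small, self-contained sketch. This is not the Glue Crawler's actual algorithm, which the review does not describe; the functions `infer_type` and `infer_schema`, the three type names, and the widening rule are all assumptions chosen for illustration.

```python
def infer_type(value: str) -> str:
    """Guess a column type for one string value: int, double, or string."""
    try:
        int(value)
        return "int"
    except ValueError:
        pass
    try:
        float(value)
        return "double"
    except ValueError:
        return "string"


# Widening order: a column that mixes types takes the widest one seen.
_RANK = {"int": 0, "double": 1, "string": 2}


def infer_schema(rows):
    """Infer {column: type} from a sample of dict rows with string values."""
    schema = {}
    for row in rows:
        for column, value in row.items():
            guess = infer_type(value)
            current = schema.get(column)
            if current is None or _RANK[guess] > _RANK[current]:
                schema[column] = guess
    return schema


if __name__ == "__main__":
    sample = [
        {"id": "1", "price": "9.5", "name": "a"},
        {"id": "2", "price": "3", "name": "7"},
    ]
    print(infer_schema(sample))
```

As with the crawler output the reviewer mentions, the inferred mapping is only a starting point that a person would review and adjust.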

Liana Iuhas - PeerSpot reviewer
CEO at Quark Technologies SRL

My colleagues work with Spark, PySpark, and Scala as programming languages for writing complex aggregations. They have a repository in order to have a general view of all the sources and jobs on the platform and AWS Glue is very helpful.

YC
Data Engineer at Tata Consultancy

Our company has five data engineers who use the solution for metadata catalogs and ETL pipelines that are built in S3 or EC2.  

Murilo Hallgren - PeerSpot reviewer
Data Engineer at a consultancy with self employed

We are using AWS Glue for transforming firewall data synced to the Data Lake in the bronze zone. The ETL uses the solution to transform fields in the silver layer, and later we will produce the gold zone. We are using the Delta Lake Architecture.

DS
ECM CONSULTANT/ARCHITECT/SOFTWARE DEVELOPER, DELUXE MN at a tech services company with 5,001-10,000 employees

Glue is a NoSQL-based data ETL tool that has some advantages over IIS and ISAs. It is tailored and customized to use with SQL Server, which works very well in that platform.

If you want to use other data sources, the NoSQL concept makes it very easy, because missing data can be inserted as a new column or with null values.

That is not the case with many other tools. If you have on-premises tools, such as IIS, they don't manage missing data well.
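The point about missing data arriving as a new column or as null values can be shown with a short sketch. It is plain Python rather than Glue code, and the function name `align_records` and the sample records are hypothetical.

```python
def align_records(records):
    """Give every record the same columns, filling absent ones with None."""
    columns = []
    for record in records:
        for key in record:
            if key not in columns:
                columns.append(key)  # preserve first-seen column order
    return [{col: record.get(col) for col in columns} for record in records]


if __name__ == "__main__":
    rows = [{"a": 1}, {"a": 2, "b": "x"}]
    for row in align_records(rows):
        print(row)
```

Here the second record introduces column `b`, and the first record receives `None` for it instead of causing the load to fail.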

UK
Consultant - Business Operations at a computer software company with 10,001+ employees

Our company uses the solution for ETL data movement for our customers, such as on-premises to cloud, cloud to cloud, and cloud to Snowflake. We also catalog data and schedule ETL jobs. We are able to monitor all jobs through AWS services.

Jorge Encinas - PeerSpot reviewer
Sr. Data Engineer at a tech services company with 5,001-10,000 employees

We used AWS Glue to build our data warehouse. We built prototypes that go all the way across their warehouse platforms. From AWS Glue to spreadsheets and then QuickSight, that's how we're building their warehouse.

Senthil Kumar Veerasamy - PeerSpot reviewer
Senior Manager, Analytics at Azendian

We are implementing a solution in AWS for one of our customers. It is more of a data analytics solution. We wanted to process data from different sources and put it into a central repository that can be used for any analysis or predictive modeling.

ShilpaShivapuram - PeerSpot reviewer
Principal Data Architect at Wells Fargo

I primarily use AWS Glue as a lightweight ETL to migrate our existing on-prem workloads to a cloud environment without looking at a lot of migration paths. 

Shifa Shah - PeerSpot reviewer
Data engineer at nust

I constructed a straightforward ETL job using AWS Glue, wherein I had to load a couple of files in the Teradata database.

Adriano Junior Gouveia Gonçalves - PeerSpot reviewer
Professor at a tech services company with 51-200 employees

I use AWS Glue to create a data lake using research data hosted in S3. 

SP
Associate Consultant at a tech vendor with 10,001+ employees

Currently, we are utilizing AWS Glue for various ETL workloads, specifically in the life sciences domain. Our primary objective is to acquire data from various sources. Then, we store it in Redshift. This is where the complete use case of AWS Glue comes into the picture.

Sainagaraju Vaduka - PeerSpot reviewer
Data solution architect at a pharma/biotech company with 5,001-10,000 employees

We are primarily using it for batch processing and transformations.

YB
Consultant Data junior at a computer software company with 51-200 employees

The primary use cases of AWS Glue in our organization are for implementing ETL processes and for data flow.

BV
Manager at a construction company with 51-200 employees

Our primary use case is ETL.

MA
Cloud Data Engineer at jems groupe

Our company is creating data warehousing in the cloud. Our team includes four data engineers, two data ops, and two data administrators. 

We use S3 as a data lake to prepare data from two databases, MySQL and Oracle. For the migration, we use DMS.

Then, we use the solution to perform data transformation. For Oracle, we use Data Catalog and Data Crawler to create our catalog. Dev Endpoint is used to develop complex data transformations. We then migrate to Studio Notebook where we develop and schedule a complex Spark job. 

Finally, we load the transformed data to Redshift so our data analyst team can visualize it with QuickSight. 

GV
Data Engineer at a computer software company with 501-1,000 employees

We use the solution to collect customers' data, which arrives as multiple files, and convert it into a common database. Later, we send the database for SQL ingestion.

Sashi Dhar - PeerSpot reviewer
Operations executive at Wipro Infotech

We are using it for day-to-day ETL jobs. It is being used to transfer data from Teradata to the cloud.

We are using its latest version.

BR
CEO and Founder at HartB

It is a good tool for us. All the implementation in our company is done with AWS Glue. We use it to execute all the ETL processes. We have collected more or less five terabytes of information from the internet by now. We process all this data in our cloud platform and normalize the information. We first put it on a data lake that we have here on the AWS tool. After that, we use AWS Glue to transform all the information collected around the internet and put the normalized information into a data warehouse.

Suraj Sachdeva - PeerSpot reviewer
Data Engineer | Developer at Sakshath Technologies

The key role of Glue is that it hosts our metadata before we roll out our actual data. This is the major advantage of using this solution, and our clients have been very satisfied with it.

DB
Net Full-Stack developer at a tech services company with 201-500 employees

We use the solution as a layer for loading data from the source systems.

AS
Team Lead at a financial services firm with 5,001-10,000 employees

We are using it for file ingestion. Its primary role is to ingest a file from a vendor to a database.

KM
Cloud Solution Architect at a tech services company with 1-10 employees

AWS Glue is a versatile tool and we mostly use it for "lift and shift" server migrations.

Diksha Hirole - PeerSpot reviewer
Data Engineer at a tech services company with 201-500 employees

I mainly use AWS Glue for ETL purposes and batch processing of data.
