AWS Glue - Serverless ETL

AWS Glue is a serverless ETL service for data integration, transformation, and cataloging in data lakes.

Pricing Model: Pay per DPU-hour

Availability: All major regions

Data Sovereignty: EU regions available

Reliability: 99.9% availability SLA

What is AWS Glue?

AWS Glue is a serverless ETL (Extract, Transform, Load) service for data integration. The service automates discovering, preparing, and combining data for analytics and machine learning. Glue consists of three main components: Data Catalog, ETL Engine, and Glue Studio for visual ETL development.

Core Features

Data Catalog: Central metadata repository for table definitions and schemas, populated automatically by crawlers and compatible with Athena, Redshift, and EMR
Glue Crawlers: Automatic scanning of data sources and schema detection for S3, RDS, and JDBC databases
Glue ETL: Serverless Spark-based transformations in Python or Scala
Glue Studio: Visual ETL editor for drag-and-drop pipeline development
Glue DataBrew: No-code data preparation with over 250 pre-built transformations
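
To make the crawler and Data Catalog features above more concrete, here is a minimal sketch of how a crawler for an S3 path might be defined through the AWS SDK for Python. It assumes the boto3 Glue client methods `create_crawler` and `start_crawler`; the crawler name, IAM role ARN, database name, and bucket path are hypothetical placeholders, not values from this page.

```python
def build_crawler_request(name, role_arn, database, s3_path, schedule=None):
    """Build the keyword arguments for the Glue create_crawler call.

    All argument values are supplied by the caller; the ones used in the
    example at the bottom are hypothetical placeholders.
    """
    request = {
        "Name": name,
        "Role": role_arn,
        "DatabaseName": database,  # Data Catalog database that receives the tables
        "Targets": {"S3Targets": [{"Path": s3_path}]},
    }
    if schedule is not None:
        # Glue expects a cron expression such as "cron(0 2 * * ? *)"
        request["Schedule"] = schedule
    return request


def create_and_start_crawler(glue_client, request):
    """Register the crawler, then trigger a first run.

    glue_client is assumed to be boto3.client("glue") or an object
    exposing the same two methods.
    """
    glue_client.create_crawler(**request)
    glue_client.start_crawler(Name=request["Name"])


if __name__ == "__main__":
    # Hypothetical values for illustration only.
    example = build_crawler_request(
        name="raw-sales-crawler",
        role_arn="arn:aws:iam::123456789012:role/ExampleGlueRole",
        database="sales_raw",
        s3_path="s3://example-bucket/raw/sales/",
        schedule="cron(0 2 * * ? *)",
    )
    print(example)
```

With boto3 installed and credentials configured, `create_and_start_crawler(boto3.client("glue"), example)` would submit the request. Keeping the request builder separate from the call makes the configuration easy to inspect without touching an AWS account.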

Typical Use Cases

Data Lake Construction

Glue Crawlers scan various data sources and create a unified catalog. ETL jobs transform raw data into analyzable formats like Parquet and load them into S3-based data lakes.

Data Warehouse Integration

Data from operational systems is transformed and loaded into Amazon Redshift. Glue handles schema mapping, data type conversion, and incremental loads.
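
Incremental loads of this kind are typically driven by job bookmarks. The following is a minimal sketch of starting a Glue job run with bookmarks enabled, assuming the boto3 Glue client method `start_job_run` and the special job argument `--job-bookmark-option`. The job name and the extra argument are hypothetical.

```python
def build_job_run_request(job_name, enable_bookmarks=True, extra_arguments=None):
    """Build keyword arguments for the Glue start_job_run call.

    The bookmark option tells Glue whether to track already-processed
    data between runs of the same job.
    """
    arguments = dict(extra_arguments or {})
    arguments["--job-bookmark-option"] = (
        "job-bookmark-enable" if enable_bookmarks else "job-bookmark-disable"
    )
    return {"JobName": job_name, "Arguments": arguments}


def start_incremental_run(glue_client, job_name, extra_arguments=None):
    """Start a bookmarked run and return its run ID.

    glue_client is assumed to be boto3.client("glue") or an object with
    a compatible start_job_run method.
    """
    request = build_job_run_request(job_name, True, extra_arguments)
    response = glue_client.start_job_run(**request)
    return response["JobRunId"]


if __name__ == "__main__":
    # Hypothetical job name and argument for illustration only.
    print(build_job_run_request("orders-to-redshift", extra_arguments={"--target_schema": "staging"}))
```

Enabling the option on the run is only half of the picture: the job script itself also has to initialize and commit the job and label its sources with a transformation context so that Glue can record what has been processed.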

Machine Learning Data Preparation

DataBrew cleans and normalizes data for ML workflows: it handles missing values, detects outliers, and prepares features for training.

Benefits

No infrastructure management: automatic scaling of Spark clusters
Pay-per-use billing by DPU-hour (one Data Processing Unit provides 4 vCPUs and 16 GB of memory)
Integration with the entire AWS analytics stack
Reusable transformations and job bookmarks for incremental processing
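
To illustrate how DPU-hour billing translates into a job cost, here is a small arithmetic sketch. The price of 0.44 USD per DPU-hour and the 60-second minimum billing duration are assumptions used as defaults; actual rates vary by region and Glue version, so check the current pricing page before relying on the numbers.

```python
from decimal import Decimal


def estimate_job_cost(dpus, runtime_seconds,
                      price_per_dpu_hour=Decimal("0.44"),
                      minimum_seconds=60):
    """Estimate the cost of one job run in USD, rounded to cents.

    price_per_dpu_hour and minimum_seconds are assumed defaults,
    not values taken from this page.
    """
    # Runs shorter than the minimum are billed as if they hit the minimum.
    billed_seconds = max(runtime_seconds, minimum_seconds)
    billed_hours = Decimal(billed_seconds) / Decimal(3600)
    cost = Decimal(dpus) * billed_hours * price_per_dpu_hour
    return cost.quantize(Decimal("0.01"))


if __name__ == "__main__":
    # 10 DPUs for 15 minutes: 10 * 0.25 h * 0.44 USD
    print(estimate_job_cost(10, 900))  # prints 1.10
```

Under these assumptions, doubling the DPU count doubles the hourly rate, so more capacity only pays off when it shortens the runtime by a comparable factor.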

Integration with innFactory

As an AWS Reseller, innFactory supports you with AWS Glue by building data lake architectures, developing ETL pipelines in Python or Scala, and integrating Glue with existing data warehouse systems.

AWS Cloud Expertise

innFactory is an AWS Reseller with certified cloud architects. We provide consulting, implementation, and managed services for AWS.

Similar Products from Other Clouds

Microsoft Fabric - Azure Analytics & Big Data

Dataflow - Managed Stream and Batch Processing

BigQuery ML - Machine Learning with SQL

Managed Service for Apache Kafka - Managed Kafka Streaming

Dataproc Metastore - Managed Hive Metastore

Azure Stream Analytics - Azure Analytics & Big Data