Cost-Optimized ETL Modernization: Transitioning Traditional Workloads to Cloud-Native IDMC/IICS + AWS
Abstract
Cloud-native data modernization addresses the growing limitations of existing on-premise ETL tools by taking advantage of elastic compute, serverless execution, automatic scaling, and built-in observability. This work introduces a cloud-native ETL system built on Amazon Web Services, in which Amazon S3 provides scalable storage, Amazon Aurora PostgreSQL serves as the main analytical and operational data store, AWS Lambda handles orchestration, and Amazon CloudWatch is recommended for monitoring and governance. Aurora PostgreSQL also acts as a single platform for extract-load-transform (ELT) processing, transactional analytics, and near-real-time reporting, which lowers architectural complexity by removing the separation between transactional and analytical databases. The proposed architecture was evaluated under an enterprise workload of 3 TB of daily ingestion from 17 heterogeneous source systems, processing 2.4 billion rows per day. Performance benchmarking shows statistically verifiable gains in data processing efficiency, query responsiveness, and operational robustness over legacy on-premise implementations. Cost analysis also reveals considerable savings in infrastructure and operational costs through elastic compute scaling and pay-as-you-go resource usage. These results indicate that Amazon Aurora PostgreSQL is a viable and scalable alternative for organizations that want to modernize their ETL pipelines and serve both analytical and operational workloads. The findings provide a practical basis for implementing cloud-native frameworks that offer performance, cost effectiveness, and operational simplicity in enterprise data hubs.