Virtual Tech Gurus
Job Summary
We are looking for a highly experienced Senior Data Engineer to lead the migration of legacy Cloudera (CDP-PC) data platforms to modern cloud-native architectures. The ideal candidate will have strong expertise in data engineering, ETL/ELT modernization, and cloud platforms (AWS), along with experience handling large-scale, sensitive data environments, including VIP/restricted datasets.
Key Responsibilities
- Lead migration of Cloudera CDP-PC workloads to AWS-based data platforms.
- Design and develop scalable cloud-native data solutions using Amazon S3, Redshift, and Aurora PostgreSQL.
- Analyze and re-engineer existing ETL pipelines into modern ELT frameworks.
- Support the existing Ab Initio pipelines used for data extraction and feed generation.
- Build and maintain batch and near real-time data pipelines.
- Design optimized logical and physical data models for simplified and scalable architectures.
- Perform data mapping, lineage tracking, and dependency analysis across systems.
- Implement secure data handling practices, including record-level access controls for sensitive/VIP data.
- Handle and process large-scale datasets (75+ TB) across structured and unstructured sources.
- Integrate data from DB2 on mainframe and Linux systems and from external vendor feeds.
- Collaborate with business, BI, and downstream application teams to ensure data accuracy and availability.
Required Skills & Experience
Core Data Engineering
- Strong experience in ETL/ELT development and optimization
- Hands-on experience with large-scale data migration projects
- Expertise in batch and near real-time data processing
Cloud & Big Data
- Experience with AWS services (S3, Redshift, Aurora PostgreSQL)
- Strong background in Cloudera CDP / Hive / Parquet
- Experience handling large data volumes (50+ TB)
Tools & Technologies
- Ab Initio (must-have, for existing pipeline support)
- Erwin Data Modeling Tool
- Hive, SQL, and distributed data processing frameworks
Data Architecture
- Strong skills in data modeling (logical & physical)
- Experience with data mapping, lineage, and cataloging
Streaming / Real-Time
- Experience building near real-time (NRT) or streaming pipelines
Data Security
- Experience handling restricted/VIP data
- Knowledge of data governance, masking, and access controls
Job ID: 12323
