Senior Data Engineer – Cloudera Migration & Cloud Data Platform

Virtual Tech Gurus
Published: April 3, 2026
Location: Remote
Category: Default

Description

Job Summary

We are looking for a highly experienced Senior Data Engineer to lead the migration of legacy Cloudera (CDP-PC) data platforms to modern cloud-native architectures. The ideal candidate will have strong expertise in data engineering, ETL/ELT modernization, and cloud platforms (AWS), along with experience managing large-scale, sensitive data environments, including VIP/restricted datasets.

Key Responsibilities

  • Lead migration of Cloudera CDP-PC workloads to AWS-based data platforms.
  • Design and develop scalable cloud-native data solutions using Amazon S3, Redshift, and Aurora PostgreSQL.
  • Analyze and re-engineer existing ETL pipelines into modern ELT frameworks.
  • Work with Ab Initio pipelines for data extraction and feed generation (existing system support).
  • Build and maintain batch and near real-time data pipelines.
  • Design optimized logical and physical data models for simplified and scalable architectures.
  • Perform data mapping, lineage tracking, and dependency analysis across systems.
  • Implement secure data handling practices, including record-level access controls for sensitive/VIP data.
  • Handle and process large-scale datasets (75+ TB) across structured and unstructured sources.
  • Integrate data from mainframe systems (DB2), Linux DB2, and external vendor feeds.
  • Collaborate with business, BI, and downstream application teams to ensure data accuracy and availability.

Required Skills & Experience

Core Data Engineering

  • Strong experience in ETL/ELT development and optimization
  • Hands-on experience with large-scale data migration projects
  • Expertise in batch and near real-time data processing

Cloud & Big Data

  • Experience with AWS services (S3, Redshift, Aurora PostgreSQL)
  • Strong background in Cloudera CDP / Hive / Parquet
  • Experience handling large data volumes (50+ TB)

Tools & Technologies

  • Ab Initio (must-have; supporting existing pipelines)
  • Erwin Data Modeling Tool
  • Hive, SQL, and distributed data processing frameworks

Data Architecture

  • Strong skills in data modeling (logical & physical)
  • Experience with data mapping, lineage, and cataloging

Streaming / Real-Time

  • Experience building near real-time (NRT) or streaming pipelines

Data Security

  • Experience handling restricted/VIP data
  • Knowledge of data governance, masking, and access controls

JOBID: 12323
