Data Engineer with IAM Modernization
Role Summary
We are looking for a Data Engineer with strong expertise in the Hadoop ecosystem, ETL development, and data transformation logic, focused on modernizing IAM data flows. This role involves terminating legacy batch SQL jobs, re-pointing feeds via NDM, and pushing IAM data into a Cyber Data Lake built on Hadoop. The engineer will design and implement push-based, near real-time ingestion pipelines with transformation logic applied during ingestion, enabling scalable, secure, and audit-ready IAM datasets.
Key Responsibilities
- Modernization & Migration
  - Decommission existing batch SQL jobs and migrate to modern ingestion architecture.
  - Re-point upstream and downstream feeds using NDM for secure data transfers.
  - Onboard IAM datasets into a Cyber Data Lake (Hadoop) with optimized storage formats (Parquet/ORC) and partitioning.
- Pipeline Development & Transformation
  - Build ETL/ELT pipelines using Spark/Hive to perform transformations during ingestion (schema mapping, normalization, deduplication).
  - Implement push-based near real-time ingestion (event-driven or micro-batch) instead of scheduled pulls.
  - Apply complex IAM-specific transformation logic for identities, accounts (human & non-human), roles, entitlements, and policies.
- Data Quality & Observability
  - Define and automate data quality checks (completeness, accuracy, referential integrity).
  - Implement monitoring, logging, and alerting for ingestion pipelines and NDM transfers.
- Performance & Optimization
  - Tune Spark jobs, Hive queries, and storage strategies for scale and cost efficiency.
  - Optimize resource allocation and implement backpressure controls for streaming ingestion.
- Security & Compliance
  - Enforce least privilege and secure handling of sensitive IAM attributes (PII).
  - Maintain metadata, lineage, and data dictionaries; ensure compliance with audit requirements.
- Client Collaboration
  - Work onsite with client IAM teams, application owners, and auditors to clarify requirements and deliver modernization milestones.
  - Maintain detailed documentation (ERDs, flow diagrams, runbooks).
Required Qualifications
- 5-8 years of experience in Data Engineering, with exposure to IAM data and modernization projects.
- Strong hands-on experience with the Hadoop ecosystem: HDFS, Hive, Spark (SQL/Scala/PySpark).
- Proven experience in ETL/ELT design, data transformation logic, and pipeline optimization.
- Experience terminating legacy batch SQL jobs and migrating to modern ingestion patterns.
- Practical knowledge of NDM for secure data transfers.
- Expertise in push-based ingestion and near real-time data processing.
- Understanding of IAM concepts: identities, service/non-human accounts, roles, entitlements, policies.

