Responsibilities:
- Design and implement scalable data architectures (Data Lakehouse, Mesh, or Fabric) using cloud-native services (AWS, GCP, or Azure).
- Evaluate emerging data technologies and integrate those that fit the platform.
- Establish engineering best practices, including CI/CD for data (DataOps), version control, and automated testing.
- Build and optimize batch and real-time streaming pipelines (Kafka, Flink, Spark) that process high volumes of data.
- Implement self-healing pipelines and metadata-driven ingestion frameworks to reduce manual boilerplate.
- Manage cloud costs by optimizing query performance, storage tiering, and compute resource allocation (FinOps).
- Drive the roadmap for data engineering sprints, ensuring on-time delivery of critical data products.
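The metadata-driven ingestion framework mentioned above can be sketched as follows — new sources are registered as configuration rather than hand-written jobs. All names here (`SourceConfig`, `SOURCES`, `build_ingestion_plan`) and the example paths are hypothetical illustrations, not tied to any particular stack:

```python
from dataclasses import dataclass

@dataclass
class SourceConfig:
    name: str          # logical source name
    path: str          # hypothetical source location
    fmt: str           # e.g. "csv" or "json"
    target_table: str  # destination table in the lakehouse

# Hypothetical registry: onboarding a source means adding metadata, not code.
SOURCES = [
    SourceConfig("orders", "s3://example-bucket/orders/", "csv", "raw.orders"),
    SourceConfig("events", "s3://example-bucket/events/", "json", "raw.events"),
]

def build_ingestion_plan(sources):
    """Derive ingestion tasks from metadata instead of per-source pipelines."""
    return [
        {"task": f"ingest_{s.name}", "read": s.path, "fmt": s.fmt, "write": s.target_table}
        for s in sources
    ]

plan = build_ingestion_plan(SOURCES)
```

In practice the generated plan would be handed to an orchestrator (e.g. an Airflow DAG factory), so adding a feed never requires touching pipeline code.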
