About the role
Title: Principal Data Engineer – Databricks
Key Requirements
- 10+ years of overall data engineering experience
- 8+ years of experience in enterprise Data Warehouse and Data Lake platforms
- 5+ years of hands-on experience with Databricks and Spark at scale
- Strong experience in modernizing legacy Cloudera platforms (CDH/CDP, Hive, HBase, Impala, Spark) to Databricks Lakehouse
- Redesign ingestion, transformation, and consumption patterns from HDFS-based architecture to cloud object storage and Delta Lake
- Refactor legacy Hive/Impala logic into PySpark and Spark SQL ELT pipelines
- Ensure data reconciliation, audit integrity, and consistency during migration
- Design and govern enterprise Data Warehouse and Data Lake/Lakehouse architectures
- Implement layered architecture including Raw/Landing, Curated/Conformed, and Semantic/Consumption layers
- Modernize traditional EDW platforms into scalable lakehouse architectures
- Strong experience in finance and risk data models including General Ledger, Sub-ledger, financial hierarchies, and risk exposure models (credit, liquidity, market risk)
- Enable reporting use cases including aggregation, drill-down, and drill-back capabilities
- Build and manage semantic/consumption layers for BI, reporting, and analytics
- Define business metrics, dimensions, hierarchies, and KPIs
- Experience with Databricks SQL, Delta tables, and dbt or similar frameworks
- Develop and optimize large-scale data pipelines using PySpark, Spark SQL, and Delta Lake
- Implement Medallion architecture (Bronze, Silver, Gold layers)
- Optimize workloads using OPTIMIZE with Z-ordering (ZORDER BY), caching, and cluster configuration tuning
- Implement data governance, data quality frameworks, reconciliation controls, and exception handling
- Establish data lineage and metadata management
- Ensure data security, access control, and compliance standards
- Experience with cloud platforms such as AWS or Azure
- Experience with CI/CD pipelines using Git, Terraform, Jenkins, or Azure DevOps
- Familiarity with orchestration tools such as Airflow or Databricks Workflows
- Experience with dbt is a plus
- Act as a technical authority and lead architecture decisions
- Mentor and guide senior engineers and establish engineering standards
- Strong stakeholder management with finance, risk, analytics, and governance teams
- Ability to translate complex data structures into business-ready insights
Nice to Have
- Experience in BFSI, Capital Markets, or regulatory reporting
- Exposure to SAP Finance, Oracle Financials, or S/4HANA
- Experience supporting AI/ML workloads
- Databricks or cloud certifications
Impact
- Lead Cloudera-to-Databricks transformation initiatives
- Shape enterprise finance and risk data platforms
- Support regulatory, management, and analytical reporting systems