Data Engineer
Listed on 2026-03-13
Software Development
3 Commercial Pl, 15th Floor, Norfolk, VA 23510, USA
The Data Engineer will design, build, and maintain batch ETL pipelines on a modern Databricks Lakehouse platform, delivering high-quality data solutions that support critical banking functions. You will take ownership of pipelines end-to-end, from ingestion through transformation, quality assurance, and delivery to downstream consumers, including Power BI dashboards and analytical models.
The ideal candidate combines strong technical depth in Spark and Delta Lake with a natural orientation toward documentation, process improvement, and clear communication. This role requires the ability to work autonomously, prioritize effectively across competing demands, and contribute to the ongoing maturation of the team’s engineering practices.
Responsibilities
- Data Pipeline Development: Design, build, and maintain batch ETL pipelines that ingest data from diverse source systems into the Databricks environment. Own the full pipeline lifecycle, including ingestion, transformation, serving, monitoring, and incident resolution.
- Data Quality and Integrity: Implement automated validation, reconciliation checks, and data quality gates across pipelines. Ensure data meets standards for accuracy, completeness, timeliness, and consistency. Maintain historical data for auditability and compliance.
- Performance Optimization: Optimize data processing performance on Databricks through efficient Spark SQL, partitioning strategies, and Delta Lake table maintenance.
- Data Modeling for Analytics: Design dimensional models (star schemas, aggregation tables) that serve Power BI dashboards and self‑service analytics effectively. Prepare the semantic layer for AI‑powered analytics capabilities, including Databricks Genie Rooms, through clean business logic, well‑documented table relationships, and intuitive naming conventions.
- Process and Documentation: Establish and maintain runbooks, deployment procedures, coding standards, and operational documentation. Contribute to code review practices, automated quality checks, and repeatable processes that enable the team to scale.
- Governance and Compliance: Adhere to enterprise data governance policies and implement security best practices for sensitive financial data. Enforce access controls, encryption, and data lineage tracking across pipelines in accordance with banking regulations.
- Cross‑Team Collaboration: Work with data architects, analysts, and business stakeholders to gather requirements and translate business needs into scalable data solutions. Communicate technical constraints and timelines clearly to non‑technical partners.
- Mentoring and Knowledge Sharing: Support the development of junior and mid‑level engineers through code review, pairing, and in‑context coaching. Contribute to a culture of continuous learning and shared technical ownership.
- Continuous Improvement: Identify and implement improvements to enhance pipeline stability, efficiency, and scalability. Evaluate and adopt emerging Databricks features and industry best practices as appropriate.
- Adhere to applicable federal laws, rules, and regulations, including those related to Anti‑Money Laundering (AML) and the Bank Secrecy Act (BSA).
- Other duties as assigned.
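The automated validation and reconciliation checks described above might look like the following minimal sketch in plain Python (field names, thresholds, and the batch shape are hypothetical stand-ins; in a real pipeline these checks would run as PySpark expectations before data lands in a curated table):

```python
# Minimal data-quality gate sketch. A batch is modeled as a list of
# dicts; field names and the tolerance are illustrative assumptions.

def run_quality_gate(rows, required_fields, row_count_expected, tolerance=0.01):
    """Return a list of failed check names for one batch of records."""
    failures = []

    # Completeness: every required field must be present and non-null.
    for field in required_fields:
        if any(row.get(field) is None for row in rows):
            failures.append(f"completeness:{field}")

    # Reconciliation: batch row count must match the source extract
    # within the given relative tolerance.
    if row_count_expected:
        drift = abs(len(rows) - row_count_expected) / row_count_expected
        if drift > tolerance:
            failures.append("reconciliation:row_count")

    return failures


batch = [
    {"account_id": "A1", "balance": 100.0},
    {"account_id": "A2", "balance": None},
]
print(run_quality_gate(batch, ["account_id", "balance"], row_count_expected=2))
# → ['completeness:balance']
```

A gate like this would typically run between ingestion and serving, quarantining the batch (rather than failing silently) whenever the returned list is non-empty.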
Requirements
- Experience: Bachelor’s degree in Computer Science or a related field (or equivalent practical experience). 5+ years of experience as a data engineer in complex, large‑scale data environments, preferably in the cloud.
- Databricks and Spark Proficiency: Strong hands‑on expertise with Databricks and the Apache Spark ecosystem (PySpark, Spark SQL) for building and optimizing large‑scale data pipelines. Production experience with Delta Lake tables and Lakehouse architectural patterns.
- Delta Lake Operations: Working experience with OPTIMIZE, VACUUM, Z‑ordering, MERGE INTO for upserts, and time travel for debugging and auditing. Ability to articulate practical differences between Delta Lake and raw Parquet.
- Programming and SQL: Proficient in Python (including PySpark) for data processing. Strong SQL skills for complex querying and transformation. Emphasis on…
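The Delta Lake operations named above can be sketched in Spark SQL (table names and the version number are hypothetical; this is an illustrative sketch, not production maintenance code):

```sql
-- Upsert the latest batch into a Delta table (hypothetical table names).
MERGE INTO curated.accounts AS tgt
USING staging.accounts_batch AS src
  ON tgt.account_id = src.account_id
WHEN MATCHED THEN UPDATE SET *
WHEN NOT MATCHED THEN INSERT *;

-- Compact small files and co-locate rows frequently filtered by account_id.
OPTIMIZE curated.accounts ZORDER BY (account_id);

-- Remove data files no longer referenced by the table log.
VACUUM curated.accounts;

-- Time travel: query an earlier table version for debugging or audits.
SELECT * FROM curated.accounts VERSION AS OF 42;
```

These four statements map directly onto the operations listed in the requirements; the time-travel query in particular is what distinguishes Delta Lake from raw Parquet, whose files carry no versioned transaction log.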