Data Engineer
Listed on 2026-03-13
Software Development
3 Commercial Pl, 15th Floor, Norfolk, VA 23510, USA
The Data Engineer will design, build, and maintain batch ETL pipelines on a modern Databricks Lakehouse platform, delivering high-quality data solutions that support critical banking functions. You will take ownership of pipelines end-to-end, from ingestion through transformation, quality assurance, and delivery to downstream consumers, including Power BI dashboards and analytical models.
The ideal candidate combines strong technical depth in Spark and Delta Lake with a natural orientation toward documentation, process improvement, and clear communication. This role requires the ability to work autonomously, prioritize effectively across competing demands, and contribute to the ongoing maturation of the team’s engineering practices.
Responsibilities
- Data Pipeline Development: Design, build, and maintain batch ETL pipelines that ingest data from diverse source systems into the Databricks environment. Own the full pipeline lifecycle, including ingestion, transformation, serving, monitoring, and incident resolution.
- Data Quality and Integrity: Implement automated validation, reconciliation checks, and data quality gates across pipelines. Ensure data meets standards for accuracy, completeness, timeliness, and consistency. Maintain historical data for auditability and compliance.
- Performance Optimization: Optimize data processing performance on Databricks through efficient Spark SQL, partitioning strategies, and Delta Lake table maintenance.
- Data Modeling for Analytics: Design dimensional models (star schemas, aggregation tables) that serve Power BI dashboards and self‑service analytics effectively. Prepare the semantic layer for AI‑powered analytics capabilities, including Databricks Genie Rooms, through clean business logic, well‑documented table relationships, and intuitive naming conventions.
- Process and Documentation: Establish and maintain runbooks, deployment procedures, coding standards, and operational documentation. Contribute to code review practices, automated quality checks, and repeatable processes that enable the team to scale.
- Governance and Compliance: Adhere to enterprise data governance policies and implement security best practices for sensitive financial data. Enforce access controls, encryption, and data lineage tracking across pipelines in accordance with banking regulations.
- Cross‑Team Collaboration: Work with data architects, analysts, and business stakeholders to gather requirements and translate business needs into scalable data solutions. Communicate technical constraints and timelines clearly to non‑technical partners.
- Mentoring and Knowledge Sharing: Support the development of junior and mid‑level engineers through code review, pairing, and in‑context coaching. Contribute to a culture of continuous learning and shared technical ownership.
- Continuous Improvement: Identify and implement improvements to enhance pipeline stability, efficiency, and scalability. Evaluate and adopt emerging Databricks features and industry best practices as appropriate.
- Adhere to applicable federal laws, rules, and regulations, including those related to Anti‑Money Laundering (AML) and the Bank Secrecy Act (BSA).
- Other duties as assigned.
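The automated validation and reconciliation checks described above might look like the following minimal sketch in plain Python (field names, thresholds, and the batch shape are hypothetical stand-ins; in a real pipeline these checks would run as PySpark expectations before data lands in a curated table):

```python
# Minimal data-quality gate sketch. A batch is modeled as a list of
# dicts; field names and the tolerance are illustrative assumptions.

def run_quality_gate(rows, required_fields, row_count_expected, tolerance=0.01):
    """Return a list of failed check names for one batch of records."""
    failures = []

    # Completeness: every required field must be present and non-null.
    for field in required_fields:
        if any(row.get(field) is None for row in rows):
            failures.append(f"completeness:{field}")

    # Reconciliation: batch row count must match the source extract
    # within the given relative tolerance.
    if row_count_expected:
        drift = abs(len(rows) - row_count_expected) / row_count_expected
        if drift > tolerance:
            failures.append("reconciliation:row_count")

    return failures


batch = [
    {"account_id": "A1", "balance": 100.0},
    {"account_id": "A2", "balance": None},
]
print(run_quality_gate(batch, ["account_id", "balance"], row_count_expected=2))
# → ['completeness:balance']
```

A gate like this would typically run between ingestion and serving, quarantining the batch (rather than failing silently) whenever the returned list is non-empty.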
Requirements
- Experience: Bachelor’s degree in Computer Science or a related field (or equivalent practical experience). 5+ years of experience as a data engineer in complex, large‑scale data environments, preferably in the cloud.
- Databricks and Spark Proficiency: Strong hands‑on expertise with Databricks and the Apache Spark ecosystem (PySpark, Spark SQL) for building and optimizing large‑scale data pipelines. Production experience with Delta Lake tables and Lakehouse architectural patterns.
- Delta Lake Operations: Working experience with OPTIMIZE, VACUUM, Z‑ordering, MERGE INTO for upserts, and time travel for debugging and auditing. Ability to articulate practical differences between Delta Lake and raw Parquet.
- Programming and SQL: Proficient in Python (including PySpark) for data processing. Strong SQL skills for complex querying and transformation. Emphasis on…
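The Delta Lake operations named above can be sketched in Spark SQL (table names and the version number are hypothetical; this is an illustrative sketch, not production maintenance code):

```sql
-- Upsert the latest batch into a Delta table (hypothetical table names).
MERGE INTO curated.accounts AS tgt
USING staging.accounts_batch AS src
  ON tgt.account_id = src.account_id
WHEN MATCHED THEN UPDATE SET *
WHEN NOT MATCHED THEN INSERT *;

-- Compact small files and co-locate rows frequently filtered by account_id.
OPTIMIZE curated.accounts ZORDER BY (account_id);

-- Remove data files no longer referenced by the table log.
VACUUM curated.accounts;

-- Time travel: query an earlier table version for debugging or audits.
SELECT * FROM curated.accounts VERSION AS OF 42;
```

These four statements map directly onto the operations listed in the requirements; the time-travel query in particular is what distinguishes Delta Lake from raw Parquet, whose files carry no versioned transaction log.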