Senior Data Engineer; GCP/Databricks
Listed on 2026-01-12
IT/Tech
Data Engineer, Big Data, Data Analyst, Data Science Manager
Senior Data Engineer
We are looking for a Senior Data Engineer to design, develop, and optimize our data infrastructure on Google Cloud Platform (GCP). You will architect scalable pipelines using Databricks, BigQuery, Google Cloud Storage, Apache Airflow, dbt, Dataflow, and Pub/Sub, ensuring high availability and performance across our ETL/ELT processes. You will leverage Great Expectations to enforce data quality standards. The role also involves building our Data Mart (Data Mesh) environment, containerizing services with Docker and Kubernetes (K8s), and implementing CI/CD best practices.
A successful candidate has extensive knowledge of cloud-native data solutions, strong proficiency with ETL/ELT frameworks (including dbt), and a passion for building robust, cost‑effective pipelines.
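To give a flavor of the dbt work involved, a version-controlled transformation model for BigQuery might look like the following sketch (the model, source, and column names are purely illustrative):

```sql
-- models/marts/fct_daily_orders.sql
-- Illustrative dbt model: aggregates raw order events into a daily fact table.
{{ config(materialized='incremental', unique_key='order_date') }}

select
    date(created_at)  as order_date,
    count(*)          as order_count,
    sum(total_amount) as gross_revenue
from {{ source('raw', 'orders') }}
{% if is_incremental() %}
-- On incremental runs, only process days newer than what is already loaded.
where date(created_at) > (select max(order_date) from {{ this }})
{% endif %}
group by 1
```

dbt compiles the Jinja templating into plain SQL and runs it against BigQuery, so models like this live in version control alongside their tests and documentation.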
Key Responsibilities

Data Architecture & Strategy
- Define and implement the overall data architecture on GCP, including data warehousing in BigQuery, data lake patterns in Google Cloud Storage, and Data Mart (Data Mesh) solutions.
- Integrate Terraform for Infrastructure as Code to provision and manage cloud resources efficiently.
- Establish both batch and real‑time data processing frameworks to ensure reliability, scalability, and cost efficiency.
- Design, build, and optimize ETL/ELT pipelines using Apache Airflow for workflow orchestration.
- Implement dbt (Data Build Tool) transformations to maintain version-controlled data models in BigQuery, ensuring consistency and reliability across the data pipeline.
- Use Google Dataflow (based on Apache Beam) and Pub/Sub for large‑scale streaming/batch data processing and ingestion.
- Automate job scheduling and data transformations to deliver timely insights for analytics, machine learning, and reporting.
- Implement event‑driven or asynchronous data workflows between microservices.
- Employ Docker and Kubernetes (K8s) for containerization and orchestration, enabling flexible and efficient microservices‑based data workflows.
- Implement CI/CD pipelines for streamlined development, testing, and deployment of data engineering components.
- Enforce data quality standards using Great Expectations or similar frameworks, defining and validating expectations for critical datasets.
- Define and uphold metadata management, data lineage, and auditing standards to ensure trustworthy datasets.
- Implement security best practices, including encryption at rest and in transit, Identity and Access Management (IAM), and compliance with GDPR or CCPA where applicable.
- Integrate with Looker (or similar BI tools) to provide data consumers with intuitive dashboards and real‑time insights.
- Collaborate with Data Science, Analytics, and Product teams to ensure the data infrastructure supports advanced analytics, including machine learning initiatives.
- Maintain Data Mart (Data Mesh) environments that cater to specific business domains, optimizing access and performance for key stakeholders.
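The data-quality responsibilities above are typically expressed as declarative expectations over critical datasets. The dependency-free Python sketch below illustrates the idea; it mimics the declarative style of Great Expectations rather than calling its API, and the dataset and column names are hypothetical:

```python
# Minimal sketch of expectation-style data validation (stdlib only).

def expect_not_null(rows, column):
    """Return rows where the column is missing or None."""
    return [r for r in rows if r.get(column) is None]

def expect_between(rows, column, low, high):
    """Return rows where the column is null or outside [low, high]."""
    bad = []
    for r in rows:
        value = r.get(column)
        if value is None or not (low <= value <= high):
            bad.append(r)
    return bad

def validate(rows, expectations):
    """Run each (name, check) pair; map expectation name -> failing rows."""
    return {name: check(rows) for name, check in expectations}

# Hypothetical orders dataset.
orders = [
    {"order_id": 1, "total_amount": 42.0},
    {"order_id": 2, "total_amount": -5.0},  # violates the range check
    {"order_id": 3, "total_amount": None},  # violates the not-null check
]

failures = validate(orders, [
    ("total_amount_not_null", lambda r: expect_not_null(r, "total_amount")),
    ("total_amount_in_range", lambda r: expect_between(r, "total_amount", 0, 10_000)),
])
```

In production, the same declarative checks would run inside the pipeline (for example as an Airflow task) and fail the run or quarantine bad records before they reach downstream consumers.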
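The event-driven workflows mentioned above generally follow a publish/subscribe pattern: producers emit messages to a topic, and decoupled consumers react independently. A minimal in-process Python sketch of that pattern (in practice Cloud Pub/Sub replaces the in-memory broker; the topic and handler names here are made up):

```python
from collections import defaultdict

class InMemoryBroker:
    """Toy stand-in for a Pub/Sub topic: fans messages out to subscribers."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, handler):
        """Register a callback to receive every message on the topic."""
        self._subscribers[topic].append(handler)

    def publish(self, topic, message):
        """Deliver the message to all handlers subscribed to the topic."""
        for handler in self._subscribers[topic]:
            handler(message)

broker = InMemoryBroker()
received = []

# Two independent consumers react to the same event without knowing
# about each other or about the producer.
broker.subscribe("orders.created", lambda msg: received.append(("warehouse", msg["order_id"])))
broker.subscribe("orders.created", lambda msg: received.append(("analytics", msg["order_id"])))

broker.publish("orders.created", {"order_id": 101})
```

The design point is decoupling: new consumers can be added without touching the producer, which is what makes asynchronous workflows between microservices maintainable.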