Senior Data Engineer
Listed on 2026-03-01
IT/Tech
Data Engineer, Cloud Computing
About Baselayer
Trusted by 2,200+ financial institutions, Baselayer is the intelligent business identity platform that helps verify any business, automate KYB, and monitor real‑time risk. Baselayer’s B2B risk solutions and identity graph network leverage state and federal government filings and proprietary data sources to prevent fraud, accelerate onboarding, and lower credit losses.
About the Role
We are looking for a Data Engineer to design, build, and operate the data infrastructure that powers Baselayer’s analytics and machine learning capabilities. You will own robust, scalable pipelines that ingest, transform, and validate structured and unstructured data from internal systems and external sources, with a strong focus on reliability, observability, and data quality.
This is a hands‑on role for someone who thrives in complexity, cares deeply about correctness, and wants to work close to AI and product workflows in a regulated domain.
What You’ll Do
Design, build, and maintain robust ETL and ELT pipelines that power analytics and machine learning use cases
Own and improve the architecture and tooling for storing, processing, and querying large‑scale datasets in cloud data platforms
Implement orchestration and automation for data workflows using tools such as Airflow, dbt, or similar
Build and maintain reusable data models to enable faster experimentation and reliable reporting
Implement data quality checks, observability, and alerting to ensure integrity and reliability across environments
Partner with Data Science, ML Engineering, Product, and Engineering to ensure reliable data delivery and feature readiness for modeling
Optimize warehouse and query performance, scalability, and cost as data volumes grow
Maintain clear documentation, runbooks, and operational processes for pipelines and datasets
Partner with security and compliance stakeholders to ensure pipelines and access controls meet regulatory and internal standards
About You
You want to learn fast, take ownership, and build systems that other teams can rely on. You are not just doing this for the win. You are doing it because you have something to prove and want to be great.
You care about data integrity and reliability, you enjoy turning messy inputs into clean systems, and you are comfortable operating without a playbook. You are curious about AI and ML infrastructure and want to build the foundation that powers it.
Required Experience and Skills
4 to 12 years of experience in data engineering or analytics engineering
Strong Python and SQL skills, with experience building production‑grade data workflows
Experience building and maintaining ETL or ELT pipelines and working with cloud data warehouses or analytics databases
Familiarity with orchestration, workflow scheduling, and transformation tooling (for example Airflow, dbt, Dagster, Prefect, or similar)
Comfort working with both structured and unstructured data and designing scalable data architectures
Strong understanding of data quality, testing, observability, and operational best practices
Ability to communicate clearly across technical and non-technical audiences
What Sets You Apart
Experience working in regulated environments or with sensitive identity, risk, fraud, compliance, or financial services data
Experience integrating external data sources and APIs, including government or registry data
Familiarity with near‑real‑time or streaming data patterns
Highly feedback‑oriented with a desire for continuous improvement
Strong bias toward ownership and building systems that scale
Work Location
Hybrid in San Francisco, in office 3 days per week
Compensation and Benefits
Salary range of $135,000 to $220,000
Equity package
Unlimited vacation
Fully paid health insurance, dental, and vision