AI Data Engineer; ML Data Pipelines
Remote / Online - Candidates ideally in Saint Paul, Ramsey County, Minnesota, 55199, USA
Listed on 2026-03-05
Listing for: Empowers Staffing Inc.
Remote/Work from Home position
Job specializations:
- IT/Tech: Data Engineer, Machine Learning/ML Engineer
- Engineering: Data Engineer
Job Description & How to Apply Below
- Work Experience: Python, SQL, Spark, Databricks, Airflow, Feature Engineering, Data Pipelines, Data Quality, Great Expectations, AWS, Azure, GCP, Kafka
- Required Skills: Airflow, AWS, +20
- Remote Job
This is a remote position.
We are seeking an AI Data Engineer to design and build production-grade data pipelines that power machine learning systems. This role focuses on creating scalable ingestion, transformation, and feature engineering workflows that support model training, evaluation, and real‑time inference.
You will work closely with Data Scientists, Machine Learning Engineers, and Platform teams to ensure high‑quality, reliable, and efficient data flows across cloud environments. The ideal candidate understands both traditional data engineering and the unique data needs of ML systems.
Key Responsibilities
- Design and build scalable data pipelines for ML workflows
- Develop feature engineering and data preparation processes
- Implement batch and real‑time data ingestion systems
- Ensure data quality, validation, and monitoring
- Collaborate with ML engineers to support model training and deployment
- Integrate pipelines with orchestration tools (Airflow or similar)
- Optimize pipeline performance and cloud cost efficiency
- Maintain documentation and version control of data workflows
Requirements
- 4+ years of experience in Data Engineering
- Strong Python and SQL skills
- Experience building data pipelines for ML or analytics systems
- Hands‑on experience with Spark, Databricks, or similar distributed processing frameworks
- Experience with orchestration tools (Airflow or similar)
- Experience in AWS, Azure, or GCP environments
- Familiarity with data quality validation and monitoring frameworks
- Understanding of feature engineering and model data lifecycle
- Experience with streaming systems (Kafka, Kinesis, Pub/Sub)
- Experience supporting model deployment and MLOps workflows
- Experience with feature stores or vector databases
- Familiarity with ML frameworks (TensorFlow, PyTorch)