Senior Kafka Platform Engineer, Chicago area, Illinois, USA (IT/Tech)

Overview

We're looking for a Senior Kafka Platform Engineer to design, automate, and scale a mission-critical event-streaming platform. In this role, you'll own the core Kafka environment—from brokers and storage through security, automation, and observability—while driving modern, Kubernetes-based deployment patterns. You'll build self-service tooling, define reliability standards, and collaborate closely with engineering teams to ensure robust, performant, and secure streaming capabilities. The ideal candidate brings deep Kafka expertise, strong automation skills, and a cloud-native engineering mindset.

Key Responsibilities

Kafka Platform Ownership: Architect, deploy, and operate production-grade Kafka clusters (self-managed or cloud-hosted), overseeing upgrades, scaling strategies, capacity modeling, and multi-AZ/region resiliency.
Kubernetes & Automation: Run Kafka on Kubernetes using Operators, Helm, and GitOps; build automation frameworks and guardrails using IaC to support repeatable, compliant, zero-downtime deployments.
Ecosystem Services: Manage and optimize Kafka Connect, Schema Registry, and replication technologies (MirrorMaker 2, Cluster Linking); define connector standards and enable self-service provisioning.
Reliability Engineering: Establish SLOs, own incident response, maintain runbooks, conduct postmortems, and develop automated remediation and resilience patterns.
Observability: Build and maintain monitoring for metrics, logs, traces, consumer lag, partition health, and capacity insights using tools such as Prometheus, Grafana, Burrow, Cruise Control, or OpenTelemetry.
Security & Compliance: Implement encryption, authentication, authorization, secrets management, network policies, and audit controls for secure data-in-motion.
Streaming Best Practices: Guide application teams on topic strategy, partitioning, retention and compaction tuning, idempotency, ordering guarantees, schema evolution, DLQs, and exactly-once semantics.
Cross-Functional Collaboration: Partner with application, data, platform, and SRE teams to provide tooling, documentation, enablement, and architectural guidance.
Technical Leadership: Mentor engineers, help shape platform strategy, and contribute to long-term standards and roadmap decisions.

Core Skills

Kafka Expertise: Extensive hands-on experience operating Kafka in production environments at scale, including brokers, controllers, replication, ISR dynamics, rebalancing, storage tiers, and failure recovery.
Kubernetes Skills: Strong background operating stateful systems on Kubernetes using Operators, Helm, CRDs, and cloud-native patterns.
Automation: Proficiency with IaC tools (e.g., Terraform), GitOps workflows (Argo CD or Flux), and CI/CD tooling for full lifecycle automation.
Programming: Strong scripting and development experience in Python, Go, or Java, plus solid Bash and Linux fundamentals (networking, file systems, JVM tuning).
Observability & Tuning: Expertise in Kafka performance troubleshooting, capacity planning, monitoring stacks, and alerting workflows.
Security: Hands-on experience with TLS/mTLS, SASL/OAuth, ACL/RBAC, and secret-management solutions such as Vault.
Ecosystem Components: Experience with Kafka Connect, Schema Registry, and MirrorMaker 2 or Cluster Linking; familiarity with Cruise Control.
Cloud: Knowledge of AWS, Azure, or GCP networking, IAM, and managed streaming services such as Confluent Cloud or AWS MSK.
Operational Excellence: Demonstrated ability to write runbooks, lead incidents, and drive platform improvements.

Preferred Qualifications

Experience with stream-processing frameworks (Kafka Streams, Flink, Spark Structured Streaming).
Background running Strimzi or Confluent for Kubernetes in production.
Knowledge of CDC technologies and connector operations at scale (e.g., Debezium).
Experience designing multi-region architectures, cluster-linking strategies, and disaster-recovery processes.

Locations

Chicago, IL
New York, NY
