Senior Kafka Platform Engineer

Job in Chicago, Cook County, Illinois, 60290, USA
Listing for: Selby Jennings
Full Time position
Listed on 2026-03-01
Job specializations:
  • IT/Tech
    Cloud Computing, Systems Engineer, SRE/Site Reliability, Data Engineer
Salary/Wage Range or Industry Benchmark: 80,000 - 100,000 USD yearly
Job Description & How to Apply Below

Overview

We're looking for a Senior Kafka Platform Engineer to design, automate, and scale a mission-critical event-streaming platform. In this role, you'll own the core Kafka environment—from brokers and storage through security, automation, and observability—while driving modern, Kubernetes-based deployment patterns. You'll build self-service tooling, define reliability standards, and collaborate closely with engineering teams to ensure robust, performant, and secure streaming capabilities. The ideal candidate brings deep Kafka expertise, strong automation skills, and a cloud-native engineering mindset.

Key Responsibilities
  • Kafka Platform Ownership: Architect, deploy, and operate production-grade Kafka clusters (self-managed or cloud-hosted), overseeing upgrades, scaling strategies, capacity modeling, and multi-AZ/region resiliency.
  • Kubernetes & Automation: Run Kafka on Kubernetes using Operators, Helm, and GitOps; build automation frameworks and guardrails using IaC to support repeatable, compliant, zero-downtime deployments.
  • Ecosystem Services: Manage and optimize Kafka Connect, Schema Registry, and replication technologies (MirrorMaker 2, Cluster Linking); define connector standards and enable self-service provisioning.
  • Reliability Engineering: Establish SLOs, own incident response, maintain runbooks, conduct postmortems, and develop automated remediation and resilience patterns.
  • Observability: Build and maintain monitoring for metrics, logs, traces, consumer lag, partition health, and capacity insights using tools such as Prometheus, Grafana, Burrow, Cruise Control, or OpenTelemetry.
  • Security & Compliance: Implement encryption, authentication, authorization, secrets management, network policies, and audit controls for secure data-in-motion.
  • Streaming Best Practices: Guide application teams on topic strategy, partitioning, retention and compaction tuning, idempotency, ordering guarantees, schema evolution, DLQs, and exactly-once semantics.
  • Cross-Functional Collaboration: Partner with application, data, platform, and SRE teams to provide tooling, documentation, enablement, and architectural guidance.
  • Technical Leadership: Mentor engineers, help shape platform strategy, and contribute to long-term standards and roadmap decisions.
Core Skills
  • Kafka Expertise: Extensive hands-on experience operating Kafka in production environments at scale, including brokers, controllers, replication, ISR dynamics, rebalancing, storage tiers, and failure recovery.
  • Kubernetes Skills: Strong background operating stateful systems on Kubernetes using Operators, Helm, CRDs, and cloud-native patterns.
  • Automation: Proficiency with IaC tools (e.g., Terraform), GitOps workflows (Argo CD or Flux), and CI/CD tooling for full lifecycle automation.
  • Programming: Strong scripting and development experience in Python, Go, or Java, plus solid Bash and Linux fundamentals (networking, file systems, JVM tuning).
  • Observability & Tuning: Expertise in Kafka performance troubleshooting, capacity planning, monitoring stacks, and alerting workflows.
  • Security: Hands-on experience with TLS/mTLS, SASL/OAuth, ACL/RBAC, and secret-management solutions such as Vault.
  • Ecosystem Components: Experience with Kafka Connect, Schema Registry, MirrorMaker 2/Cluster Linking; familiarity with Cruise Control.
  • Cloud: Knowledge of AWS, Azure, or GCP networking, IAM, and managed streaming services such as Confluent Cloud or AWS MSK.
  • Operational Excellence: Demonstrated ability to write runbooks, lead incidents, and drive platform improvements.
Preferred Qualifications
  • Experience with stream-processing frameworks (Kafka Streams, Flink, Spark Structured Streaming).
  • Background running Strimzi or Confluent for Kubernetes in production.
  • Knowledge of CDC technologies and connector operations at scale (e.g., Debezium).
  • Experience designing multi-region architectures, cluster-linking strategies, and disaster-recovery processes.
Locations
  • Chicago, IL
  • New York, NY
Position Requirements
10+ years of work experience