We are seeking a highly skilled and experienced Senior Data Engineer to join our data engineering team. In this role, you will be responsible for designing, implementing, and optimizing real-time data pipelines that process terabytes of data.
The ideal candidate will have 3+ years of hands-on experience in data engineering and strong expertise in modern data platforms such as Databricks, PySpark, Delta Lake, Amazon S3, and Kafka.
This role offers an opportunity to work on cutting-edge technologies in a fast-paced environment with a strong focus on performance optimization, scalability, and reliability.
Key Responsibilities:
• Design, build, and maintain robust, scalable, and efficient real-time data pipelines using Databricks, PySpark, Kafka, and Delta Lake.
• Architect and implement data ingestion pipelines for high-volume streaming and batch data into Amazon S3 and Delta Lake.
• Optimize data pipelines and workflows for performance, scalability, and cost-efficiency.
• Process and analyze terabytes of structured and unstructured data to enable near real-time decision-making.
• Collaborate closely with stakeholders to define data requirements and ensure data integrity, security, and availability.
• Implement advanced data transformations, deduplication, and enrichment logic.
• Continuously improve data engineering best practices, automation, and reliability.
• Monitor, troubleshoot, and resolve issues in data pipelines to ensure high availability.
Experience:
• 3+ years of hands-on experience in data engineering, building and operating large-scale data pipelines in production environments.
Must-Have Skills:
• Proven experience with Databricks and PySpark for large-scale data processing.
• Strong expertise in Delta Lake for real-time and batch data workloads.
• In-depth knowledge of Apache Kafka for real-time data streaming.
• Hands-on experience with AWS S3 or equivalent cloud storage solutions.
• Solid understanding of distributed computing concepts and performance tuning.
• Experience processing and managing terabytes of data in production systems.
• Strong background in ETL/ELT design, data modeling, and pipeline optimization.
• Proficiency in writing clean, efficient, and maintainable Python (PySpark) code.
Nice-to-Have Skills:
• Familiarity with DevOps practices, CI/CD pipelines, and Kubernetes.
• Knowledge of data security, governance, and compliance best practices.
• Experience with monitoring and alerting tools such as Prometheus, Grafana, or Amazon CloudWatch.