Responsibilities:
Data Engineering and Processing:
• Develop and manage data pipelines using PySpark on Databricks.
• Implement ETL/ELT processes to process structured and unstructured data at scale.
• Optimize data pipelines for performance, scalability, and cost-efficiency in Databricks (a minimal pipeline sketch follows this list).
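To illustrate the kind of pipeline work this role involves, here is a minimal PySpark ETL sketch; the paths, table name, and columns are hypothetical placeholders, not part of any actual project.

# Minimal PySpark ETL sketch (paths, table, and columns are hypothetical).
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("orders_etl").getOrCreate()

# Extract: read raw JSON landed in cloud storage.
raw = spark.read.json("/mnt/raw/orders/")

# Transform: deduplicate, parse timestamps, and filter bad records.
cleaned = (
    raw.dropDuplicates(["order_id"])
       .withColumn("order_ts", F.to_timestamp("order_ts"))
       .withColumn("order_date", F.to_date("order_ts"))
       .filter(F.col("amount") > 0)
)

# Load: write a partitioned Delta table for downstream analytics.
(cleaned.write
        .format("delta")
        .mode("overwrite")
        .partitionBy("order_date")
        .saveAsTable("analytics.orders"))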
Databricks Platform Expertise:
• Design, develop, and deploy solutions using Azure services (Data Factory, Databricks, PySpark, SQL).
• Develop and maintain scalable data pipelines and build new Data Source integrations to support increasing data volume and complexity.
• Leverage the Databricks Lakehouse architecture for advanced analytics and machine learning workflows.
• Manage Delta Lake for ACID transactions and data versioning (see the upsert sketch after this list).
• Develop notebooks and workflows for end-to-end data solutions.
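For the Delta Lake duties above, an ACID upsert plus a time-travel read might look like the following sketch; the table names, paths, and join key are assumptions for illustration.

# Delta Lake upsert and time-travel sketch (names, paths, and keys are hypothetical).
from delta.tables import DeltaTable
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()
updates = spark.read.format("delta").load("/mnt/staging/orders_updates")

target = DeltaTable.forName(spark, "analytics.orders")

# MERGE executes as a single ACID transaction on the Delta table.
(target.alias("t")
       .merge(updates.alias("u"), "t.order_id = u.order_id")
       .whenMatchedUpdateAll()
       .whenNotMatchedInsertAll()
       .execute())

# Data versioning: query the table as it existed at an earlier version.
previous = spark.sql("SELECT * FROM analytics.orders VERSION AS OF 0")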
Cloud Platforms and Deployment:
• Deploy and manage Databricks on Azure (Azure Databricks).
• Use Databricks Jobs, Clusters, and Workflows to orchestrate data pipelines (see the Jobs API sketch after this list).
• Optimize resource utilization and troubleshoot performance issues on the Databricks platform.
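As a sketch of orchestrating a multi-task workflow programmatically, the Databricks Jobs 2.1 REST API can create a pipeline of dependent notebook tasks; the workspace URL, token, notebook paths, and cluster ID below are placeholders.

# Sketch: create a two-task workflow via the Databricks Jobs 2.1 REST API.
# Workspace URL, token, notebook paths, and cluster ID are placeholders.
import requests

HOST = "https://adb-1234567890.12.azuredatabricks.net"
TOKEN = "dapi-REDACTED"

job_spec = {
    "name": "nightly_orders_pipeline",
    "tasks": [
        {
            "task_key": "ingest",
            "notebook_task": {"notebook_path": "/Repos/data/ingest_orders"},
            "existing_cluster_id": "0101-000000-abcd123",
        },
        {
            "task_key": "transform",
            "depends_on": [{"task_key": "ingest"}],  # runs after ingest succeeds
            "notebook_task": {"notebook_path": "/Repos/data/transform_orders"},
            "existing_cluster_id": "0101-000000-abcd123",
        },
    ],
}

resp = requests.post(
    f"{HOST}/api/2.1/jobs/create",
    headers={"Authorization": f"Bearer {TOKEN}"},
    json=job_spec,
)
resp.raise_for_status()
print("Created job:", resp.json()["job_id"])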
CI/CD and Testing:
• Build and maintain CI/CD pipelines for Databricks workflows using tools like Azure DevOps, GitHub Actions, or Jenkins.
• Write unit and integration tests for PySpark code using frameworks like pytest or unittest (a pytest sketch follows this list).
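A minimal pytest example for a PySpark transformation might look like this; the function under test and its columns are illustrative only.

# Sketch: unit-testing a PySpark transformation with pytest.
# add_order_date and its columns are hypothetical examples.
import pytest
from pyspark.sql import SparkSession, functions as F

def add_order_date(df):
    # Example transformation: derive order_date from an order_ts timestamp.
    return df.withColumn("order_date", F.to_date("order_ts"))

@pytest.fixture(scope="session")
def spark():
    return SparkSession.builder.master("local[1]").appName("tests").getOrCreate()

def test_add_order_date(spark):
    df = spark.createDataFrame(
        [("o1", "2024-01-15 10:30:00")], ["order_id", "order_ts"]
    ).withColumn("order_ts", F.to_timestamp("order_ts"))

    result = add_order_date(df)

    assert str(result.first()["order_date"]) == "2024-01-15"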
Collaboration and Documentation:
• Work closely with data scientists, data analysts, and IT teams to deliver robust data solutions.
• Document Databricks workflows, configurations, and best practices for internal use.
Qualifications:
• 4+ years of experience in data engineering or distributed systems development.
• Strong programming skills in Python and PySpark.
• Hands-on experience with Databricks and its ecosystem, including Delta Lake and Databricks SQL.
• Knowledge of big data frameworks like Hadoop, Spark, and Kafka.
• Proficiency in setting up and managing Databricks Workspaces, Clusters, and Jobs.
• Familiarity with Databricks MLflow for machine learning workflows is a plus (a minimal tracking sketch follows this list).
• Expertise in deploying Databricks solutions on Azure (e.g., Data Lake, Synapse).
• Knowledge of Kubernetes for managing containerized workloads is advantageous.
• Experience with both SQL (e.g., PostgreSQL, SQL Server) and NoSQL databases (e.g., MongoDB, Cosmos DB).
• Bachelor’s Degree in Computer Science, Data Engineering, or a related field is preferred.
• Relevant certifications in Databricks, PySpark, or cloud platforms are highly desirable.
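To illustrate the MLflow familiarity mentioned above, a minimal experiment-tracking sketch follows; the experiment name, parameter, and metric are made-up examples.

# Sketch: basic MLflow experiment tracking (names and values are made up).
import mlflow

mlflow.set_experiment("/Shared/orders_demand_forecast")

with mlflow.start_run(run_name="baseline"):
    mlflow.log_param("model", "linear_regression")
    mlflow.log_metric("rmse", 12.34)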