Responsibilities:
Data Engineering and Processing:
• Develop and manage data pipelines using PySpark on Databricks.
• Implement ETL/ELT processes to process structured and unstructured data at scale.
• Optimize data pipelines for performance, scalability, and cost-efficiency in Databricks (a minimal pipeline sketch follows this list).
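To illustrate the kind of pipeline work this role involves, here is a minimal PySpark ETL sketch; the paths, table name, and columns are hypothetical placeholders, not part of any actual project.

# Minimal PySpark ETL sketch (paths, table, and columns are hypothetical).
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("orders_etl").getOrCreate()

# Extract: read raw JSON landed in cloud storage.
raw = spark.read.json("/mnt/raw/orders/")

# Transform: deduplicate, parse timestamps, and filter bad records.
cleaned = (
    raw.dropDuplicates(["order_id"])
       .withColumn("order_ts", F.to_timestamp("order_ts"))
       .withColumn("order_date", F.to_date("order_ts"))
       .filter(F.col("amount") > 0)
)

# Load: write a partitioned Delta table for downstream analytics.
(cleaned.write
        .format("delta")
        .mode("overwrite")
        .partitionBy("order_date")
        .saveAsTable("analytics.orders"))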
Databricks Platform Expertise:
• Design, develop, and deploy solutions using Azure services (Data Factory, Databricks, PySpark, SQL).
• Develop and maintain scalable data pipelines and build new Data Source integrations to support increasing data volume and complexity.
• Leverage the Databricks Lakehouse architecture for advanced analytics and machine learning workflows.
• Manage Delta Lake for ACID transactions and data versioning (see the upsert sketch after this list).
• Develop notebooks and workflows for end-to-end data solutions.
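For the Delta Lake duties above, an ACID upsert plus a time-travel read might look like the following sketch; the table names, paths, and join key are assumptions for illustration.

# Delta Lake upsert and time-travel sketch (names, paths, and keys are hypothetical).
from delta.tables import DeltaTable
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()
updates = spark.read.format("delta").load("/mnt/staging/orders_updates")

target = DeltaTable.forName(spark, "analytics.orders")

# MERGE executes as a single ACID transaction on the Delta table.
(target.alias("t")
       .merge(updates.alias("u"), "t.order_id = u.order_id")
       .whenMatchedUpdateAll()
       .whenNotMatchedInsertAll()
       .execute())

# Data versioning: query the table as it existed at an earlier version.
previous = spark.sql("SELECT * FROM analytics.orders VERSION AS OF 0")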
Cloud Platforms and Deployment:
• Deploy and manage Databricks on Azure (Azure Databricks).
• Use Databricks Jobs, Clusters, and Workflows to orchestrate data pipelines (see the Jobs API sketch after this list).
• Optimize resource utilization and troubleshoot performance issues on the Databricks platform.
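As a sketch of orchestrating a multi-task workflow programmatically, the Databricks Jobs 2.1 REST API can create a pipeline of dependent notebook tasks; the workspace URL, token, notebook paths, and cluster ID below are placeholders.

# Sketch: create a two-task workflow via the Databricks Jobs 2.1 REST API.
# Workspace URL, token, notebook paths, and cluster ID are placeholders.
import requests

HOST = "https://adb-1234567890.12.azuredatabricks.net"
TOKEN = "dapi-REDACTED"

job_spec = {
    "name": "nightly_orders_pipeline",
    "tasks": [
        {
            "task_key": "ingest",
            "notebook_task": {"notebook_path": "/Repos/data/ingest_orders"},
            "existing_cluster_id": "0101-000000-abcd123",
        },
        {
            "task_key": "transform",
            "depends_on": [{"task_key": "ingest"}],  # runs after ingest succeeds
            "notebook_task": {"notebook_path": "/Repos/data/transform_orders"},
            "existing_cluster_id": "0101-000000-abcd123",
        },
    ],
}

resp = requests.post(
    f"{HOST}/api/2.1/jobs/create",
    headers={"Authorization": f"Bearer {TOKEN}"},
    json=job_spec,
)
resp.raise_for_status()
print("Created job:", resp.json()["job_id"])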
CI/CD and Testing:
• Build and maintain CI/CD pipelines for Databricks workflows using tools like Azure DevOps, GitHub Actions, or Jenkins.
• Write unit and integration tests for PySpark code using frameworks like pytest or unittest (a pytest sketch follows this list).
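A minimal pytest example for a PySpark transformation might look like this; the function under test and its columns are illustrative only.

# Sketch: unit-testing a PySpark transformation with pytest.
# add_order_date and its columns are hypothetical examples.
import pytest
from pyspark.sql import SparkSession, functions as F

def add_order_date(df):
    # Example transformation: derive order_date from an order_ts timestamp.
    return df.withColumn("order_date", F.to_date("order_ts"))

@pytest.fixture(scope="session")
def spark():
    return SparkSession.builder.master("local[1]").appName("tests").getOrCreate()

def test_add_order_date(spark):
    df = spark.createDataFrame(
        [("o1", "2024-01-15 10:30:00")], ["order_id", "order_ts"]
    ).withColumn("order_ts", F.to_timestamp("order_ts"))

    result = add_order_date(df)

    assert str(result.first()["order_date"]) == "2024-01-15"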
Collaboration and Documentation:
• Work closely with data scientists, data analysts, and IT teams to deliver robust data solutions.
• Document Databricks workflows, configurations, and best practices for internal use.
Qualifications:
• 4+ years of experience in data engineering or distributed systems development.
• Strong programming skills in Python and PySpark.
• Hands-on experience with Databricks and its ecosystem, including Delta Lake and Databricks SQL.
• Knowledge of big data frameworks like Hadoop, Spark, and Kafka.
• Proficiency in setting up and managing Databricks Workspaces, Clusters, and Jobs.
• Familiarity with Databricks MLflow for machine learning workflows is a plus (a minimal tracking sketch follows this list).
• Expertise in deploying Databricks solutions on Azure (e.g., Data Lake, Synapse).
• Knowledge of Kubernetes for managing containerized workloads is advantageous.
• Experience with both SQL (e.g., PostgreSQL, SQL Server) and NoSQL databases (e.g., MongoDB, Cosmos DB).
• Bachelor’s Degree in Computer Science, Data Engineering, or a related field is preferred.
• Relevant certifications in Databricks, PySpark, or cloud platforms are highly desirable.
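To illustrate the MLflow familiarity mentioned above, a minimal experiment-tracking sketch follows; the experiment name, parameter, and metric are made-up examples.

# Sketch: basic MLflow experiment tracking (names and values are made up).
import mlflow

mlflow.set_experiment("/Shared/orders_demand_forecast")

with mlflow.start_run(run_name="baseline"):
    mlflow.log_param("model", "linear_regression")
    mlflow.log_metric("rmse", 12.34)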