Responsibilities
Develop solutions leveraging cloud big data technology to ingest, process, and analyze large, disparate data sets to exceed business requirements.
Develop data lake solutions to store structured and unstructured data from internal and external sources, and provide technical guidance to help colleagues migrate to a modern technology platform.
Contribute to and adhere to CI/CD processes and development best practices, and strengthen the discipline of the Data Engineering organization.
Develop systems that ingest, cleanse, and normalize diverse datasets; develop data pipelines from various internal and external sources; and build structure for previously unstructured data (a minimal sketch follows this list).
Using PySpark and Spark SQL, extract, manipulate, and transform data from various sources, such as databases, data lakes, APIs, and files, to prepare it for analysis and modeling.
Perform unit testing, system integration testing, and regression testing, and assist with user acceptance testing.
Consult with the business to develop documentation and communication materials that ensure accurate usage and interpretation of JLL data.
Implement data security best practices, including data encryption, access controls, and compliance with data protection regulations. Ensure data privacy, confidentiality, and integrity throughout the data engineering processes.
Perform the data analysis required to troubleshoot data-related issues and assist in their resolution.
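
For illustration only: a minimal PySpark sketch of the ingest-cleanse-normalize pattern described above, assuming a Databricks-style Spark environment. The storage paths, column names, and application name are hypothetical, not part of this posting.

# Minimal sketch of an ingest-cleanse-normalize pipeline (all names hypothetical).
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("ingest_normalize").getOrCreate()

# Ingest raw CSV from a landing zone in the data lake (hypothetical path).
raw = spark.read.option("header", True).csv("abfss://landing@account.dfs.core.windows.net/orders/")

# Cleanse: drop exact duplicates and rows missing the business key.
clean = raw.dropDuplicates().filter(F.col("order_id").isNotNull())

# Normalize: standardize types and trim string columns.
normalized = (
    clean
    .withColumn("order_ts", F.to_timestamp("order_ts"))
    .withColumn("amount", F.col("amount").cast("decimal(18,2)"))
    .withColumn("customer_name", F.trim(F.col("customer_name")))
)

# Persist as Parquet in a curated zone for downstream analysis and modeling.
normalized.write.mode("overwrite").parquet("abfss://curated@account.dfs.core.windows.net/orders/")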
Experience & Education
Minimum of 2 years of experience as a data developer using Python, PySpark, Spark SQL, and SQL Server, with a solid grasp of ETL concepts.
Bachelor's degree in Information Science, Computer Science, Mathematics, Statistics, or a quantitative discipline in science, business, or social science.
Experience with the Azure cloud platform, Databricks, and Azure Storage.
Effective written and verbal communication skills, including technical writing.
Excellent technical, analytical, and organizational skills.
Technical Skills & Competencies
Experience handling unstructured and semi-structured data, working in a data lake environment, leveraging data streaming, and developing data pipelines driven by events/queues (see the streaming sketch at the end of this section).
Hands-on experience with real-time/near-real-time processing and readiness to code.
Hands-on experience with PySpark, Databricks, and Spark SQL.
Knowledge of JSON, Parquet, and other file formats, and the ability to work effectively with them.
Knowledge of NoSQL databases such as HBase, MongoDB, Cosmos DB, etc.
Preferred: cloud experience on Azure or AWS, including Python/Spark, Spark Streaming, Azure SQL Server, Cosmos DB/MongoDB, Azure Event Hubs, Azure Data Lake Storage, Azure Search, etc.
Team player: a reliable, self-motivated, and self-disciplined individual capable of executing multiple projects simultaneously in a fast-paced environment while working with cross-functional teams.
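
For illustration only: a hedged sketch of the near-real-time, event-driven processing named above, using Spark Structured Streaming and assuming JSON events arrive in a data lake folder. Paths, field names, and window sizes are hypothetical; an Event Hubs or Kafka source would swap in with a different readStream format and options.

# Sketch of a near-real-time pipeline with Structured Streaming (all names hypothetical).
from pyspark.sql import SparkSession, functions as F
from pyspark.sql.types import StructType, StructField, StringType, DoubleType, TimestampType

spark = SparkSession.builder.appName("stream_events").getOrCreate()

# Streaming file sources require an explicit schema.
schema = StructType([
    StructField("event_id", StringType()),
    StructField("event_ts", TimestampType()),
    StructField("sensor", StringType()),
    StructField("reading", DoubleType()),
])

# Treat newly arriving JSON files as a stream of events.
events = spark.readStream.schema(schema).json("/mnt/lake/landing/events/")

# Near-real-time aggregation: per-sensor averages over 5-minute windows,
# with a watermark to bound state for late-arriving events.
agg = (
    events
    .withWatermark("event_ts", "10 minutes")
    .groupBy(F.window("event_ts", "5 minutes"), "sensor")
    .agg(F.avg("reading").alias("avg_reading"))
)

# Write results to Parquet with checkpointing for fault-tolerant output.
query = (
    agg.writeStream
    .outputMode("append")
    .format("parquet")
    .option("path", "/mnt/lake/curated/sensor_averages/")
    .option("checkpointLocation", "/mnt/lake/checkpoints/sensor_averages/")
    .start()
)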