Data Engineer
Location: 500016, Prakāshamnagar, Telangana, India
Listed on: 2026-02-03
Listed by: Confidential
Position type: Full Time
Job specializations:
- IT/Tech: Data Engineer, Big Data
Job Description
Roles & Responsibilities:
Design, develop, and maintain data solutions for data generation, collection, and processing
Be a key team member who assists in the design and development of the data pipeline
Create data pipelines and ensure data quality by implementing ETL processes to migrate and deploy data across systems
Contribute to the design, development, and implementation of data pipelines, ETL/ELT processes, and data integration solutions
Take ownership of data pipeline projects from inception to deployment, managing scope, timelines, and risks
Collaborate with multi-functional teams to understand data requirements and design solutions that meet business needs
Develop and maintain data models, data dictionaries, and other documentation to ensure data accuracy and consistency
Implement data security and privacy measures to protect sensitive data
Leverage cloud platforms (AWS preferred) to build scalable and efficient data solutions
Collaborate and communicate effectively with product teams
Collaborate with Data Architects, Business SMEs, and Data Scientists to design and develop end-to-end data pipelines to meet fast-paced business needs across geographic regions
Identify and resolve complex data-related challenges
Adhere to standard methodologies for coding, testing, and designing reusable code/components
Explore new tools and technologies that will help to improve ETL platform performance
Participate in sprint planning meetings and provide estimations on technical implementation
Work with data engineers on data quality assessment, data cleansing and data analytics
Share and discuss findings with team members practicing SAFe Agile delivery model
Work as a Data Engineer on a team that uses Cloud and Big Data technologies to design, develop, implement, and maintain solutions supporting the R&D functional area
Manage the Enterprise Data Lake in the AWS environment to ensure that service delivery is cost-effective and that business SLAs for uptime, performance, and capacity are met
Proactively work on challenging data integration problems by implementing optimal ETL patterns and frameworks for structured and unstructured data
Automate and optimize data pipelines and frameworks for an easier, more cost-effective development process
Advise and support project teams (project managers, architects, business analysts, and developers) on cloud platforms (AWS, Databricks preferred), tools, technology, and methodology for designing, building, and maintaining scalable, efficient Data Lake and other Big Data solutions
Develop in an Agile environment and be comfortable with Agile terminology and ceremonies
Stay up to date with the latest data technologies and trends
What we expect of you
We are all different, yet we all use our unique contributions to serve patients.
Basic Qualifications:
Master's degree and 1 to 3 years of Computer Science, IT, or related field experience OR
Bachelor's degree and 3 to 5 years of Computer Science, IT, or related field experience OR
Diploma and 7 to 9 years of Computer Science, IT, or related field experience
3-5 years of experience in the pharmaceutical industry
3-5 years of experience in MuleSoft development
Hands-on experience with big data technologies and platforms, such as Databricks and Apache Spark (PySpark, Spark SQL), including workflow orchestration and performance tuning on big data processing
Hands-on experience with various Python/R packages for EDA, feature engineering, and machine learning model training
Proficiency in data analysis tools (e.g., SQL) and experience with data visualization tools
Excellent problem-solving skills and the ability to work with large, complex datasets
Solid understanding of data governance frameworks, tools, and standard methodologies.
Knowledge of data protection and pharmaceutical regulations and compliance requirements (e.g., GxP, GDPR, CCPA)
Demonstrated hands-on experience with the AWS cloud platform and services such as EC2, RDS, S3, Redshift, and IAM roles.
Extensive hands-on experience with data ingestion methods such as batch, API, and streaming.
Demonstrated experience performing data integrations using MuleSoft.
Solid understanding of ETL, Data Modeling and Data Warehousing concepts.
Ability to work independently with little supervision
Ability to effectively present information to collaborators and respond to their questions
Preferred Qualifications:
Knowledge of clinical data in the pharmaceutical industry
Knowledge of CT.gov and EUCTR.gov portals
Knowledge of the Disclose application from Citeline
Experience with ETL tools such as Apache Spark and various Python packages for data processing and machine learning model development
Solid understanding of data modeling, data warehousing, and data integration concepts
Knowledge of Python/R, Databricks, SageMaker, and cloud data platforms
Proficiency with containerization and orchestration tools like Kubernetes, Docker…