Expert Data Scientist
Listed on 2026-01-12
IT/Tech
Data Scientist, Data Analyst, Machine Learning/ML Engineer, AI Engineer
Department Overview
Electric T&D Engineering is responsible for electric system engineering and planning, asset strategy, and risk management across transmission, distribution, and substation asset families. This centralized, risk-informed approach allows PG&E to manage electric risk, asset and system health, interconnections, and performance by using consistent standards, work methods, prioritization, and program sponsorship, while leveraging lessons learned from inspections and asset data to inform asset management decisions.
The department is accountable for asset planning and strategy, standards and work methods, and asset data management for Electric.
PG&E is seeking an experienced data science professional to join the Electric Compliance Assurance, Analysis & Intake (A&I) Team as a Data Scientist, Expert. This role will focus on developing advanced machine learning (ML) and predictive modeling solutions to improve electric and safety incident classification, reduce CPUC late reporting violations, predict compliance issues, and accelerate root cause evaluations.
The position will design and implement models for automated incident classification, predictive violation risk, root-cause clustering, and compliance monitoring. A strong candidate will be highly analytical, innovative, and collaborative, with expertise in data science techniques and a passion for applying AI/ML to real-world compliance challenges.
The position is headquartered at the PG&E Oakland General Office and may require occasional travel across PG&E's service territory.
A reasonable salary range is:
- Minimum Base Salary (Bay Area) $
- Mid Base Salary (Bay Area) $
- Maximum Base Salary (Bay Area) $
- Automated Incident Classification: Develop NLP-based and supervised learning models (e.g., logistic regression, Naive Bayes) to classify electric incidents as reportable vs. non-reportable; implement multiclass classification to identify safety incident type, severity, and other attributes.
- Predictive Violation Risk: Build predictive models to assess the risk of a CPUC Notice of Violation (NOV) for new incidents; analyze historical violations and identify patterns by asset type, region, and reporting type; apply multi-label classification to assign CPUC General Order (GO) rules to incidents.
- Root-Cause Clustering & Archetypes: Use unsupervised learning to discover recurring incident archetypes that inform preventative programs and corrective actions; generate cluster profiles, stability scores, and mappings from clusters to corrective actions.
- Cause Evaluation Modeling: Apply decision trees, random forests, and gradient boosting to recommend evidence-based root causes.
- Compliance Drift & Model Monitoring: Establish continuous monitoring for deployed models to detect input data drift, label shift, performance decay, and regulatory changes.
- Technical Development & Collaboration: Extract, transform, and load data from multiple PG&E systems for feature engineering; write modular, reusable Python code for data science workflows; partner with sponsor departments and subject matter experts to ensure models align with business needs; present findings and recommendations to senior leadership and act as peer reviewer for complex models.
- Active participation in the external data science/AI/ML community of practice (e.g., volunteering in professional organizations, conference presentations, publications, or similar activities).
- Competency with data science standards and processes (model evaluation, optimization, feature engineering) and best practices for implementation.
- Knowledge of industry trends and current issues in data science as demonstrated through peer-reviewed publications, conference presentations, or open-source contributions.
- Proficiency with commonly used data science and/or operations research programming languages, packages, and tools for building ML models and algorithms.
- Ability to explain technical concepts in breadth and depth, including statistical inference, ML algorithms, software engineering, and model deployment pipelines.
- Mastery in clearly communicating complex technical details and insights to colleagues and stakeholders.
- Strong…