Data Technician
Lancaster, Lancashire, LA1, England, UK
Listed on 2025-12-25
IT/Tech
Data Engineer, Data Analyst, Data Scientist, Data Entry
Data Technician
Department: Data Management
Employment Type: Permanent - Full Time
Location: Lancaster UK
Description
Within Yordas Group, we store a lot of data about chemical substances. Our main database (Hive) contains over 294,000 substances, which have been included in over 2,300 individual lists and regulations. Many of the sources for this data are government and industry bodies, which are required to make this sort of data public. They do so in a variety of ways - often PDF documents, HTML tables, and searchable databases - with inconsistent standards of quality, usability, and implicit structure.
The first stage of the Extract, Transform and Load pipeline is to get all of the relevant substance data out of these documents in a way that can be handed off for transformation and loading onto our substances database. It is the role of Regulatory Data Technicians to carry out this vital first stage accurately and with a good level of insight into the data they are presented with.
We are also in the process of developing a much more sophisticated regulatory database to manage this data, with greatly improved scope to capture regulatory and substance data in a structured way. The ambition is to have a system that can service both our customer-facing Helix software, and integration projects that require a more granular, ordered approach to regulatory data. We will require Data Technicians to work within and give feedback on an enhanced Extract, Transform, and Load pipeline for new data, as we populate a new database to new standards.
At its core, the role will involve the extraction of data from regulatory documents and the interpretation and representation of its structure. Although what we need from a particular source and general guidelines on the output will be set, it will be up to the Technician to determine what is in the document and to present the data as they think best.
Support and advice will be available within the data team, but Technicians will be encouraged to use any tools and methods at their disposal to achieve their task.
Within this role, you will work alongside regulatory experts to interpret and understand the scope of regulations. Although knowledge of substance regulation is not required for this role, it will be important to be able to rapidly acquire broad knowledge of particular areas of the industry as they arise, and make decisions about approaches to data‑handling based on that knowledge.
This role would be particularly suitable for candidates with data processing experience, an appetite for experimenting with new methods and packages, and a talent for being able to ‘see through’ complex data structures.
Role and Responsibilities
The core duties of the role centre around supporting the management of the pipeline for new regulatory/substance data, in particular:
- Working with new data sources, following rules for the extraction of the data, including output format and handling requirements
- Collaborating with data project owners to establish job priorities, and supporting them in ensuring that the input and output documents associated with extraction jobs are organised and managed correctly
- Working with Subject Matter Experts to understand the key data present in new source documents
- Performing quality control of the output of new extractions
- Performing transformation of raw structured data extractions to established upload formats and standards
- Working with systems developers to refine the data load process and contribute to product development
Other tasks
- Assisting colleagues in the wider team with data manipulation tasks
- Contributing to discussions about approaches and best practices within the data team
- Designing reusable tools and automated methods to assist all aspects of data management
Essential qualities and skills
- Experience with Python, R, or other languages suitable for data work
- Use of data extraction and data handling packages (e.g. pandas, NumPy)
- Experience with Excel or Google Sheets
- Logical and analytical skills
- Strong attention to detail and accuracy
- Ability to work to specifications with a questioning attitude
- A Grade 4/C in Maths and English GCSE or equivalent
- Interest in sustainability and chemical regulations
Desirable skills and qualities
- Experience with ‘relational’ structured data sets (e.g. SQL, MariaDB, PostgreSQL, etc. databases)
- Experience with web scraping methods and packages
- Advanced Level qualifications in any of the following: Chemistry; Mathematics; Computer Science
- Working towards, or having achieved, an undergraduate degree in a STEM field (desirable, though not essential)
- Remote working experience
Dependent on location, we offer an excellent range of staff benefits, including:
- Pension Scheme and Medical Benefits
- Generous holidays
- Professional Development
- Social Culture
- Flexible working