Data Engineer – ETL
Ref: BBBH66854_1778770553
Whitehall Resources are currently looking for a Data Engineer – ETL.
This role will be inside IR35, so you will be required to use an FCSA-accredited umbrella company.
Key Requirements:
– This is a greenfield opportunity to establish and lead the quality engineering approach for business-critical data pipelines (ETL/ELT) within a fast-moving, data-driven environment.
– The organization operates in an industry where data accuracy, consistency, and timeliness are essential to analytics, reporting, and downstream systems, making this role highly impactful.
– You will work on end-to-end data projects, covering data ingestion, transformation, and delivery into outputs such as APIs, Excel reports, and XML files.
– Your role will ensure that these pipelines are robust, reliable, and production-ready, enabling confident business decision-making.
– The position provides hands-on access to modern data technologies, including Databricks for large-scale data processing, Python-based automation (Pytest, Pandas), structured validation tools (OpenPyXL, lxml, xmlschema), and CI/CD implementation using GitHub Actions.
– You will define best practices and build scalable automation frameworks, rather than inheriting existing processes.
– Overall, you will play a key role in building trust in data, reducing manual validation effort, and enabling faster, high-quality data delivery across the organization.
Key responsibilities:
– Take full ownership of data pipeline testing strategy, defining scope, priorities, and standards.
– Design and implement a scalable Pytest-based automation framework for ETL/data validation.
– Develop robust SQL-based validation checks (reconciliation, duplicates, nulls, business rules).
– Automate validation of Excel outputs using Pandas and OpenPyXL, ensuring structural and data accuracy.
– Validate XML outputs using lxml and xmlschema, including schema compliance and business-level rules.
– Own and automate API testing workflows using Postman and Newman.
– Integrate all testing workflows into GitHub Actions, enabling CI/CD-driven quality gates.
– Build clear, actionable reporting and logging, making failures easy to diagnose and debug.
– Collaborate closely with data engineers to improve testability, catch defects early, and accelerate resolution.
– Mentor and guide junior testers, establishing best practices, code standards, and team workflows.
Key Experience:
– Strong hands-on experience in testing data pipelines and ETL/ELT systems in complex data environments.
– Advanced SQL skills, with the ability to write complex queries for data validation, debugging, and reconciliation.
– Proven expertise in test automation using Python with Pytest, including building reusable frameworks (fixtures, utilities, modular design).
– Experience using Pandas and OpenPyXL for validating data outputs such as Excel files (structure, values, comparisons).
– Hands-on experience validating XML data using lxml and xmlschema, including schema (XSD) validation and business rules.
– Solid experience in API testing using Postman and automation using Newman.
– Practical experience implementing CI/CD pipelines using GitHub Actions for automated test execution and quality gates.
– Familiarity with Databricks or similar modern data platforms for large-scale data processing and validation.
– Strong understanding of data quality principles, including data integrity, transformation validation, and reconciliation techniques.
– Experience in building QA processes from scratch or leading data testing initiatives, with the ability to define standards and mentor others.
Desirable skills:
– Exposure to modern data stacks or cloud environments supporting ETL/ELT pipelines.
– Experience with data quality tools (e.g., Great Expectations).
– Knowledge of data orchestration tools (e.g., Airflow, ADF).
– Experience in performance and large-volume data testing.
– Familiarity with end-to-end data validation across APIs, files (Excel/XML), and pipelines.
– Prior experience in setting up or scaling QA practices for data platforms.
All of our opportunities require that applicants are eligible to work in the specified country/location, unless otherwise stated in the job description.
Whitehall Resources are an equal opportunities employer who value a diverse and inclusive working environment. All qualified applicants will receive consideration for employment without regard to race, religion, gender identity or expression, sexual orientation, national origin, pregnancy, disability, age, veteran status, or other characteristics.
