Data Catalogue Technical Lead
AstraZeneca is a global biopharmaceutical business that focuses on the discovery, development and commercialisation of prescription medicines for some of the world's most serious diseases. At AstraZeneca, we're proud to have a workplace culture that encourages innovation and collaboration. Here, you would express diverse perspectives, contribute to an energised environment and provide creative ideas - and be rewarded for this!
We are recruiting for a Data Catalogue Technical Lead to be based in one of our hub sites (Cambridge UK, Gaithersburg MD, Gothenburg Sweden).
As part of AstraZeneca’s Science Data Foundation (SDF) program we are building out a data catalogue for our Research & Development teams. SDF exists to give our scientists access to data and tools at pace, accelerating their work on life-saving medicines. Through SDF we are making our data Findable, Accessible, Interoperable and Re-usable (FAIR). This is being achieved through the creation of a distributed data architecture, with our data catalogue as the unifying architectural component. The data catalogue will be a registry of internal and external data products, digitise access governance and automate access provision. As such, the data catalogue will be foundational in making our data products findable, accessible and re-usable.
It is our desire to build a scalable solution to support the development of data communities and valuable data products. To make this solution scalable we need to provide the services and tools to our partners in R&D in a secure, compliant, stable and balanced way.
As the Data Catalogue Technical Lead you will own the engineering effort to implement a data catalogue as a capability for R&D and IT. This will involve owning the technical design and development processes to meet product owner requirements and architectural strategy.
We have chosen Collibra as our data catalogue technology. Our teams will be making use of a range of data engineering products to acquire, ingest and curate metadata into the data catalogue (including Talend, Mulesoft and AWS Glue). You will be heavily involved in leading the work of a team of Collibra developers and data engineers across the whole SDLC. It would be advantageous if you have previous team leadership experience.
The team will be cataloguing a wide variety of data sources including: Omics, Imaging, clinical studies, DMTA cycle systems, AI/ML model outputs, literature, sensor data, and external data sources. This will include implementing metadata models, building governance workflows, automating granting of access and building out APIs.
You will need a collaborative delivery approach to be successful. We prefer to use Agile but choose the appropriate approach for the project. So, experience of a variety of delivery management methodologies would come in useful. You will provide technical leadership throughout our software development lifecycle, from the initial development of a technical design based on a blueprint, right through to hypercare. Do you have a real passion for delivering well engineered data and analytics solutions that can help improve patient lives? If you do, this will make you stand out from other applicants.
Essential skills and experience
- You will have experience of building and running a data catalogue,
- Technical leadership in a data domain,
- You will be able to demonstrate an ability to understand business needs and translate them into a solution,
- You will be able to craft and document development best practices,
- You will need great interpersonal skills & a collaborative approach to delivery.
Desirable skills and experience
- It is highly desirable that you have experience developing and managing a Collibra solution,
- Experience configuring and managing a SaaS system,
- Technical team leadership,
- Experience running a highly available system,
- Experience of metadata best practices and design principles,
- Awareness of legal issues surrounding data re-use, especially in a pharmaceutical organisation (e.g. PII, GxP, primary & secondary use of data),
- Experience of big data, ETL & cloud techniques and tools (we currently use Talend, Mulesoft, Redshift (inc. Spectrum), Glue, EMR, Hive, Pig, Spark, S3, SQS, SNS),
- You have experience of technical leadership in data and analytics,
- Building and maintaining APIs over data services,
- Experience working with systems integrators,
- You are likely to have experience of Agile practices, potentially having been a Scrum Master.
AstraZeneca is an equal opportunity employer. AstraZeneca will consider all qualified applicants for employment without discrimination on grounds of disability, sex or sexual orientation, pregnancy or maternity leave status, race or national or ethnic origin, age, religion or belief, gender identity or re-assignment, marital or civil partnership status, protected veteran status (if applicable) or any other characteristic protected by law.
We will ensure that individuals with disabilities are provided reasonable accommodation to participate in the job application or interview process, to perform essential job functions, and to receive other benefits and privileges of employment. Please contact us to request accommodation.
AstraZeneca embraces diversity and equality of opportunity. We are committed to building an inclusive and diverse team representing all backgrounds, with as wide a range of perspectives as possible, and harnessing industry-leading skills. We believe that the more inclusive we are, the better our work will be. We welcome and consider applications to join our team from all qualified candidates, regardless of their characteristics. We comply with all applicable laws and regulations on non-discrimination in employment (and recruitment), as well as work authorisation and employment eligibility verification requirements.