Data Lake Platform Lead

Location: Gothenburg, Västra Götaland County, Sweden; Cambridge, England, United Kingdom
Job ID: R-048460
Date posted: 07/19/2019

AstraZeneca is a global, innovation-driven biopharmaceutical business that focuses on the discovery, development and commercialization of prescription medicines for some of the world's most serious diseases. But we're more than one of the world's leading pharmaceutical companies. At AstraZeneca, we're proud to have a unique workplace culture that encourages innovation and teamwork. Here, employees are empowered to express different perspectives and are made to feel valued, energized and rewarded for their ideas and creativity.

Science and Enabling Units IT is a global IT capability supporting Drug Research, Drug Development, Product & Portfolio Strategy, Medical Affairs, Finance, HR, Compliance, Legal and Global Business Services. We are organized around seven key capability areas: Business Partnering, Solution Delivery, Architecture, Application Support, Data & Analytics, Change & Operations, operating out of sites across the US, UK, Sweden, India and Mexico.

Data & Analytics provides analytics and data insight services and solutions critical to the Data & AI/ML emerging strategy and mission of S&EUIT and AZ. D&A is organized into teams specializing in Information Architecture, Data Engineering, Visual Engineering, Knowledge Management, Data Science, Data Analysis and Information Governance.

As a senior Information Architect, you will be responsible for developing the IA strategy and roadmap for Data Catalogs and Metadata, integrating with wider architecture roadmaps. In addition, you will be accountable for ensuring all data and technology investments within the function comply with the roadmap for Data Catalogs and Metadata.


AstraZeneca has made a significant investment in the design and delivery of a Data Lake environment. The environment has been very successful at integrating data from across the organization and integrating it for reporting within ODSs and Data Marts. A data catalogue that describes the data sources and organizes metadata describing the content and lineage of data is essential to the ongoing management and evolution of this environment.

This role is focused on ensuring that we have the right strategies, designs and processes in place to collect, manage and exploit metadata describing the content of the Data Lake.

This includes metadata and standards that support data management and data governance, monitoring compliance with metadata standards and driving the exploitation of the metadata within IT and business solutions. You will be specifically responsible for developing Information Architecture standard methodology – tooling, methods, standards, skills, documentation and training – within the area of Data Catalogs and metadata management.

Whilst predominately focusing on the Data Lake, you should also provide leadership to the definition and management of metadata gained elsewhere, including supporting the definition of scientific metadata standards, which will be used to define the content of scientific data sets.

You will deliver tangible value to partners through the optimal design and collection of metadata. A key feature will be fully exploiting the metadata to ensure AZ data is Findable, Accessible, Interoperable and Reusable (FAIR) across our systems, ready for direct use by either consumers or AI machines.

Key Responsibilities

  • You will be instrumental in the implementation of strategies for the collection and exploitation of metadata supporting the Data Lake.
  • New ways of establishing an ongoing population of the Data Catalogue describing data sources ingested. Ensuring that metadata, describing the content of the data lake, is current, complete and has integrity.
  • Collection and reporting of Data Lineage and Provenance metadata, showing the source and destination for all data held within the data lake environment, and when and how the data was loaded.
  • Providing a definition of solutions to link Data Lake content with business data governance, IT governance and engineering activities, including the use of business data glossaries.
  • New ways to load and fully utilise conceptual, logical and physical data models defined within IDERA Data Architect.
  • Standards for the identification of sensitive content, such as personal information, and definition of governance processes to ensure such content is used and managed appropriately.
  • Security – ensuring that the security status of all data in the lake is clear and that access processes are adhered to, in line with AZ and data owner specifications.
  • Ensuring data load (batch/delta) is functioning and running at an appropriate update frequency.
  • Developing new solutions and processes to ensure that the Data Catalog and associated metadata are fully searchable and, where appropriate, accessible via APIs.
  • Defining the framework to describe scientific metadata – including the identification of individual metadata terms via data glossaries and supporting metadata information.
  • Defining the format for metadata standards and the process for reviewing and approving them with the appropriate architecture and business data governance teams.
  • New ways to exploit externally defined semantic ontologies, to provide external identity for data concepts.
  • Identification of how to utilise Data Catalogs, and metadata, to simplify and optimise the design and architecture of new solutions.
  • Establishment of IT architecture processes, to make sure that metadata capture is considered as part of the design of new solutions.
  • Set the strategy and roadmap for Data Catalogs and Metadata Management, aligned with business strategy and data requirements. Ensure alignment of such IA strategies and roadmaps with Enterprise IA standards and roadmaps.
  • You will lead projects that focus on the use or extension of Data Catalogs or Metadata solutions, delivering IA blueprints and designs in line with functional IA roadmaps.
  • Being an active member of architecture and data governance boards - representing metadata management.
  • Establish strong working relationships with business groups to develop an in-depth understanding of business priorities, data requirements and use, as well as early insight into changing needs.

Essential Knowledge, Skills and Experience

  • You will have key experience in data/information architecture or a related discipline (solution architecture, data engineering).
  • You will ideally have domain knowledge (processes & data): Pharma R&D, Finance, HR, Compliance etc.
  • BSc / MSc in Computer Science
  • Key experience and knowledge of a range of BI & analytics architectures: traditional warehousing, distributed computing (Hadoop), NoSQL, virtualization, data streaming etc.
  • Confirmed experience with metadata architectures and Data Cataloging tools.
  • Experience of working with data scientists and their methods: understanding of how data needs to be prepared for use by data scientists.
  • Experience of delivering IA within IT projects delivered through Agile and Waterfall methodologies.
  • Critical Thinking – seeing opportunities through evolving technology/methodology and formulating into tangible roadmaps with clear business value.
  • Business Leadership – ability to partner and collaborate across both IT and Science teams to inform on the best Information Architecture.
  • Influencing and innovation skills
  • Superb communication and facilitation skills


AstraZeneca is an equal opportunity employer. AstraZeneca will consider all qualified applicants for employment without discrimination on grounds of disability, sex or sexual orientation, pregnancy or maternity leave status, race or national or ethnic origin, age, religion or belief, gender identity or re-assignment, marital or civil partnership status, protected veteran status (if applicable) or any other characteristic protected by law. AstraZeneca only employs individuals with the right to work in the country/ies where the role is advertised.

AstraZeneca embraces diversity and equality of opportunity. We are committed to building an inclusive and diverse team representing all backgrounds, with as wide a range of perspectives as possible, and harnessing industry-leading skills. We believe that the more inclusive we are, the better our work will be. We welcome and consider applications to join our team from all qualified candidates, regardless of their characteristics. We comply with all applicable laws and regulations on non-discrimination in employment (and recruitment), as well as work authorisation and employment eligibility verification requirements.
