Projects

We are currently advertising DTP and CASE studentships for October 2019 start. The deadline for applications is the 26th November 2018.

27/09/2018

Open Science needs Open Data: Automated semantic description of biological data through Machine Learning (DAVEY_E19DTP)

Biological research is undergoing a data revolution, with huge amounts of data being generated every day. This is happening alongside increasing demands from funders and publishers to make these data available. Making data Findable, Accessible, Interoperable, and Reusable (FAIR) requires a great deal of effort from researchers to ensure that their data are described well enough for others to search for and reuse them. Although more and more metadata standards are being produced to help researchers describe their data, these standards are neither widely adopted nor easy to use. Biological data need to be liberated.

This project gives the applicant an exciting opportunity to address the problem of poorly described life science datasets by developing new machine learning algorithms that automate the annotation of life science data with ontology terms. The aim is to give biologists modern computational methods for describing their data better. This should improve data quality, making FAIR data easier for other researchers to access and providing richer metadata that can power meaningful data integration.

The student will learn a wide variety of scientific approaches and methodologies in machine learning, data management, ontologies, and community software development. No prior machine learning experience is necessary, but we are looking for a highly motivated applicant who has a strong interest in computer science and would enjoy working with large datasets in state-of-the-art high-performance computing environments, and with the wider ontology community. We welcome applicants from all backgrounds, particularly those from under-represented groups, to join a diverse, inclusive, and friendly group of computational researchers.