Projects & Participants

Andrew West (CC BY-NC-SA 2.0)

Participants join this program with a project that they either are already working on or want to develop during this program.

For this round of the Openseeds program, we are happy to have 71 participants with 34 projects.

Projects

Multilingual Open Science: Creating Open Educational Resources in Central Asian Languages

By: Zarena Syrgak

Mentored by: Alejandro Coca Castro, Stephen Klusza

Status: not graduated

Keywords: equity, diversity, inclusion, open scholarship, multilingual open science, epistemic/linguistic justice

The project Multilingual Open Science aims to create an open educational resource about FAIR research in Central Asian (CA) languages. The goal is to popularise open science practices and platforms and raise local researchers’ awareness of them, and thus to tackle the existing structural barriers (e.g., linguistic and epistemic) to accessing information and disseminating research from Central Asia. Currently, local scholarship is located - geographically, linguistically, epistemically - in the global periphery, which hinders knowledge production, use, and dissemination in the region. As a result, many scholars become trapped in fraudulent research and publication practices. Moreover, there are cases when openly accessible educational resources are offered to researchers as a marketable product. To address these and other issues related to the inaccessibility of knowledge production infrastructure, this project aims to create an openly accessible, fair, and multilingual educational resource about open science with a particular focus on Central Asia. To this end, I am planning to use Jupyter Notebook and create a platform similar to The Turing Way.

Training Latin American scientists’ community to create high-quality open journals with good editorial standards

By: Camila Gómez, Danae Carelis Davila Espinoza, Alvaro Andre Vargas Aguilar, Nelson Franco Condori Salluco, Jose Luis Villca Villegas

Mentored by: Alexander Martinez Mendez

Status: graduated

Keywords: training and education, research community, editorial community, open journals

Our initiative is called “Training Latin American scientists community to create high-quality open journals with good editorial standards”. We intend to build a community of native Spanish-speaking scientists who teach and learn about creating editorial boards and open access scientific journals that meet high international standards of editorial quality. We intend to organise hands-on workshops on the different steps in the editorial process of creating open access scientific journals, with experienced trainers and open source tools. Some of the topics we would like to cover are:

  • Acquiring ISSN code for online journals,
  • Acquiring plug-ins for displaying online journal view statistics,
  • Implementing DOIs for scientific articles,
  • Implementing iThenticate to stop plagiarism in articles submitted to the OJS system.

These workshops would combine synchronous and asynchronous activities over a period of 6 weeks, so that trainees learn all the necessary concepts, familiarize themselves with the required software, and conceptualize and brainstorm solutions to the local needs of the journals they intend to develop, with peer support and mentoring.

Building a Cloud-SPAN community of practice

By: Evelyn Greeves

Mentored by: Anne Fouilloux

Status: graduated

Keywords: Training and education, Environmental biotechnology, Omics, Community

Cloud-SPAN trains researchers, and the research software engineers who support them, to run specialised analyses for environmental omics datasets on cloud-based high-performance computing infrastructure. It is a collaboration between the University of York and the Software Sustainability Institute, funded by the UKRI Innovation Scholars award (Project Reference: MR/V038680/1). The primary objective of the project is to generate training materials and opportunities which are open, accessible and reusable (FAIR) for all researchers. We aim to build an engaged community of practice around the participation in, and the development and maintenance of, the materials. The community-building element would be the main focus of the OLS scheme project.

Developing a carbon footprint database for buildings in Ghana

By: Sitsofe Morgah

Mentored by: Anne Treasure

Keywords: Sustainability, Open science, Carbon footprint, Buildings

There is a considerable gap in sub-Saharan African countries regarding the assessment of the embodied energy of building materials and the operational energy of various building types. Lack of data remains a critical barrier to closing this gap. Creating a database of the embodied energy of building materials and the operational energy of building typologies will be key in establishing the carbon footprint of buildings in Ghana. The development of an online platform will also allow interested groups, individuals and corporations to submit key information needed for the computation of energy outputs in buildings. The aim of this project is to explore the scope and path towards establishing an open database and an inclusive community.

Publishing development plan for the Open access journals of TU Delft OPEN Publishing

By: Frédérique Belliard

Mentored by: Arielle Bennett, Julien Colomb

Status: graduated

Keywords: open publishing, open access, open peer review, scholarly communication

TU Delft OPEN Publishing, the open access academic publisher of TU Delft, publishes open access (OA) journals and open (text)books. The OA journals currently do not receive enough support for their development. This project aims to bring all OA journals to the same high level of quality expected of any TU Delft product. This project will establish the identity of TU Delft OPEN Publishing as a trustworthy academic publisher within and beyond TU Delft. The success of the plan is linked to the open collaboration of the journal editorial boards. The publishing plan will help journals:

  • Identify their readership by targeting the correct authors,
  • Evaluate the quality of their publications (impact),
  • Determine their publishing priorities,
  • Consider other publishing options (special issues, new article types, publishing peer review comments, cascading),
  • Grow,
  • Improve communications and networking,
  • Diversify the editorial board,
  • Take part in the education of young researchers.

The plan needs to integrate open science principles such as open data, open peer review, or authorship transparency into the journals’ publishing processes as much as possible. While the plan will benefit all, it should consider the specificity of each journal. Overall, the plan has to demonstrate its value to the editorial board.

By: Biandri Joubert

Mentored by: Batool Almarzouq

Status: graduated

Keywords: RStudio, qualitative research, quantitative research, legal research, law

This project is envisioned as a resource that explains the “why” when encouraging people from legal backgrounds to learn R and some data science: a case for open and reproducible research in a field that does not typically use R and qualitative methods. The idea is to get to that point by creating different data sets derived from commonly used legal sources that law students or graduates would be familiar with and incorporating them into a single platform, with a few examples of practical use and application as well as code, to encourage such persons to see the “why”. On this platform, I would also like to share links to places where introductory R courses are available. At this stage, this is an idea I have, and I hope to develop it as the weeks progress.

NGS-HuB: An open educational resource for NGS data analysis

By: Nihan Sultan Milat, Faruk Üstünel, Esra Büşra Işık, Birgül Çolak-Al

Mentored by: Beatriz Serrano-Solano, Fotis Psomopoulos

Status: graduated

Keywords: Next-generation sequencing, data analysis, bioinformatics, genomics, open education, open resource, Training and education

Next-generation sequencing (NGS) is a revolutionary technique with wide applications in biology. It is widely used by researchers ranging from early-career scientists to pioneers of the field. There are various approaches to NGS data analysis, and many resources explain these methods. However, it can be difficult for beginners in this field to quickly understand and apply them. Beginner researchers might also not have enough knowledge to overcome the most common errors encountered during these analyses. Because of these challenges, in this project we aimed to create a comprehensive open educational resource that explains NGS data analysis and its different methods, offers an example of how the methodology is applied, and provides solutions to the most common errors. We believe that this project will be enlightening for anyone interested in NGS data analysis by providing a necessary roadmap.

Community engagement: Building an open online learning community

By: Sarah Nietopski

Mentored by: Caleb Kibet

Status: graduated

Keywords: community, community engagement, open learning, Data Science and AI education, training and education

This project aims to establish and nurture an active (virtual) community of learners with the Turing. The Institute is in the process of creating an open online learning platform to host training and learning materials, covering a range of subjects related to Data Science and AI. In order to ensure that the offering is useful to our audience, that it is responsive to their needs, and that it continues to grow, it is key to involve them in the creation and direction of the resources. Having real user voices and input will help to create an open learning space that is truly useful and valuable. Not only that, but the platform can serve as a central meeting point where learners can come together over shared interests and issues, and collaborate on projects that may have a wider impact. In order to do this, learners will need a way to connect and communicate effectively.

Finding paths through the FAIR forest; linking metadata from forestry models and datasets to assist analysis and hypothesis generation

By: Kim Martin

Mentored by: Deepak Unni

Status: graduated

Keywords: forestry, xylogenesis, wood formation, ecophysiology, FAIR, metadata, models, data, ontologies, knowledge graph

This project aims to assist researchers in the field of wood formation and ecophysiology to explore datasets and computational models in a flexible and integrative way. The goal is to provide a platform - in the form of an open knowledge graph and associated interface - that will allow researchers (even those with minimal familiarity with the underlying technology) to explore linked representations of metadata for the included datasets and models. Researchers should be able to survey a variety of models that target phenomena at different scales, ranging from process-based models of the cellular determinants of wood formation to empirical models of gross tree growth in different environmental contexts. The linked information should allow complex questions to be asked, including: how similar models differ; which datasets can be repurposed to test different model outputs; and whether and how different models can be composed together. This will promote open scientific practices in this research area (through the use of common metadata standards and terms), and may serve as a valuable framework for collaborative knowledge capture and exploration.
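As a rough illustration of the kind of question such linked metadata could answer ("which datasets can be repurposed to test different model outputs"), here is a minimal Python sketch. It is not the project's implementation: the triples, the predicate names `outputs` and `measures`, and every identifier below are hypothetical and chosen only for illustration.

```python
# Hypothetical metadata triples (subject, predicate, object).
# None of these names come from the project; they are illustrative only.
triples = [
    ("model:cell_growth", "outputs", "var:ring_width"),
    ("model:stand_growth", "outputs", "var:stem_diameter"),
    ("dataset:dendrometer_2019", "measures", "var:stem_diameter"),
    ("dataset:microcore_2020", "measures", "var:ring_width"),
    ("dataset:microcore_2020", "measures", "var:cell_count"),
]


def objects(subject, predicate):
    """Return every object linked to `subject` by `predicate`."""
    return {o for s, p, o in triples if s == subject and p == predicate}


def subjects(predicate, obj):
    """Return every subject linked to `obj` by `predicate`."""
    return {s for s, p, o in triples if p == predicate and o == obj}


def datasets_for_model(model):
    """Datasets that measure at least one variable the model outputs."""
    found = set()
    for variable in objects(model, "outputs"):
        found |= subjects("measures", variable)
    return sorted(found)


print(datasets_for_model("model:cell_growth"))
```

A real knowledge graph would more likely use shared ontology terms and a graph query language, but the two-hop lookup (model to variable, variable to dataset) is the same idea.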

SciHack: Promoting the Open Source and DIY movements in Peru

By: Maria Andrea Gonzales Castillo, Nadia Odaliz Chamana Chura, Piero Beraun, Darwin Diaz, Sandra Mirella Larriega Cruz, Jhon Anderson Pérez Silva, Rodrigo Gallegos

Mentored by: Diego Onna

Status: not graduated

Keywords: DIY, Open Source, Biohacking, Computational Biology, Python, R, Programming

We propose SciHack as the first Open Community Lab in Peru. Our principal aim is to make available the tools and resources necessary for anyone, including non-professionals, to conduct biological engineering research and learning. As part of our activities, we will focus on democratizing science and biotechnology knowledge in low-income populations of Peru where there is little to no science and technology education. To address these issues, we will conduct workshops about DIY lab equipment, open source bioinformatics projects and educational resources, and molecular biology tools, to train teachers and teach students how to carry out and analyze scientific experiments. Furthermore, by the end of our activities, we would like to set up a space for bio-makers of all ages and backgrounds to conduct research projects and build prototypes. In this way, we would be fighting misinformation about scientific methodology and results, which are essential not only for the current COVID-19 situation but also for the progress of science.

Data Science and AI Educators’ Programme AND Tools, Practices & Systems Peer Mentorship Programme

By: Ayesha Dunk, Bridget Nea, Andrea Kocsis

Mentored by: Emmy Tsang

Status: not graduated

Keywords: peer-mentoring, open source, collaboration, academic, Turing Way, ethics, communication

I would like to work with the Skills Team at the Alan Turing Institute to create a peer-mentoring training programme based on the main topics of The Turing Way. Our aim is to turn the five main areas in The Turing Way, namely research reproducibility, project design, collaboration, communication, and ethics, into modules which the participants of the peer-mentoring programme can work on together. They would apply them to their own research/projects, so they could put the knowledge they gain to use immediately. The aim of The Turing Way has been to be an open source, collaborative, applicable, and practical tool, and with this programme we would like to facilitate its use. We would like to see how people at different stages of their careers (PhD, Post-Doc, Researcher, Admin) can collaborate to deepen their knowledge of the contents of The Turing Way, while understanding each other’s perspectives on the subject. If the prototype is successful, we would like to open it up to any applicants interested in applying the practices of The Turing Way to their own projects, as part of the other trainings offered by the Institute.

Increasing the usability of ChemSpaX, a Python tool for chemical space exploration

By: Adarsh Kalikadien

Mentored by: Esther Plomp

Status: graduated

Keywords: software, homogeneous catalysis, data-driven chemistry, chemical space, transition metal complexes, open source

Together with co-workers, I have developed a Python-based tool (Source code, publication 1, publication 2) which can be used to explore the local chemical space of an existing molecular scaffold. Homogeneous catalysts are important in many of our daily processes, but also in our fight against climate change. Our goal was to create a tool that can automatically generate large datasets that can be used in research for data-driven catalyst discovery.

The issue was that a simple SMILES representation of a molecular scaffold does not work when it contains a transition metal complex. With this tool we use the 3D coordinates of a molecular scaffold and let the user place molecular fragments to create many variations of this scaffold. Other inorganic chemistry fields that use transition-metal containing molecules might benefit from this tool as well. A first prototype is published, but several things can be done to increase the usability of our tool.

The Ersilia Model Hub: encouraging deployment of computer-based research tools for non-expert usage

By: Gemma Turon Rodrigo

Mentored by: Fotis Psomopoulos

Status: graduated

Keywords: reproducibility, sustainability, accessibility, artificial intelligence, community engagement, community

We have created the Ersilia Model Hub, a FLOSS platform containing AI/ML models for infectious and neglected disease research. These models can be accessed with little to no coding expertise, solving a major roadblock in the applicability of such technologies to day-to-day research. The Hub currently includes a hundred open source models, both from the literature and developed by the Ersilia organization. With this project, we aim to open the Hub to the whole computer science community, encouraging third-author model depositions so that the code they develop is not simply open source but also deployed in a user-friendly manner. By leveraging the Hub architecture, computer scientists can reach more users, interact with them and further the impact of the assets they have developed by facilitating their implementation in real case scenarios. To this end, the project will focus on establishing clear guidelines on the quality of the software and its reproducibility (as it will not necessarily have been peer-reviewed yet) and creating a standard model deposition form and minimum required documentation. The ultimate goal of the project is to build a community of contributors by facilitating their access.

Easy Access Autism Resources for Rural Parents

By: Robert Schreiber

Mentored by: Georgia Aitkenhead, Stephan Heunis

Status: not graduated

Keywords: free resources, autism, translation

Living in South Africa, and in much of the world, you can see a large gap in the accessibility of healthcare resources based on a person’s income and education level. I am privileged enough to have had access to good quality healthcare resources throughout my life, which resulted in me being seen by numerous different therapists, counsellors and psychologists and being diagnosed with Autism Spectrum Disorder, more specifically Asperger’s syndrome. I am also privileged in that I am able to comfortably speak and read English. Unfortunately, a large percentage of our country has a very poor level of education, often not being able to read or speak English very well. Most South African resources I have found for individuals with, and parents of children with, autism are only available in English. My project aims to create a database of resources in a variety of South African languages, in both written and video formats (because literacy rates are poor, especially among older generations, owing to historical discrimination), in easy to understand language and with concepts that are easy to grasp. This is also useful as more people will be able to access online resources than resources at a healthcare facility.

Exploring “Governance Models” for Open Science community projects as per their maturity stage

By: Malvika Sharan, Anne Lee Steele

Mentored by: Gracielle Higino, Emma Karoune

Status: graduated

Keywords: Open Source, Community, Governance, Research

As open science projects mature, they attract, engage and retain members who actively participate and contribute to the project. They build Communities of Practice around the project through knowledge exchange, maintenance or development practices and ultimately guide the future directions for both the projects and their communities. To build a more equitable, resilient and sustainable open infrastructure for a diverse community, it is important to select governance models that give voices to people (users, contributors and wider society) from different socio-technical and socio-cultural backgrounds, identities, career stages, contextual needs and research communities. The Turing Way is a guide for reproducible, ethical and collaborative research and data science. Open Life Science is a training and mentoring program to help researchers learn and apply open principles in their work. They are mission-aligned open science projects that involve participants from around the world to create something deeply meaningful for them. The Turing Way has grown exponentially in the last three years and now offers more than 200 pages co-created by more than 300 contributors. Open Life Science has offered 4 cohorts in the last 2 years and currently supports a community of over 300 members (present and past mentees, mentors and experts). To support the governance work in these projects in 2022, I would like to carry out a systematic study of governance models suitable for decentralised and distributed communities such as these. By creating a portfolio of governance models suitable for projects at different maturity levels, members from these projects will be able to identify the right model for their respective projects. The aim is to establish norms, workflows and processes that ensure a democratic structure for decision-making and leadership in a way that contributes to projects’ own visions while collaborating on the shared mission for global open science.

International Committee on Open Phytolith Science: community building initiative and open science training for the Phytolith Community

By: Carla Lancelotti, Javier Ruiz-Pérez, Maria Gabriela Musaubach, Abraham Dabengwa, Emma Karoune, Celine Kerfant, Zachary Dunseth, Juan José García-Granero

Mentored by: Gracielle Higino, Malvika Sharan

Status: graduated

Keywords: Community Building, Open Science, Open Data, Open Access, Phytolith, Archaeology, Palaeoecology

The International Committee on Open Phytolith Science (ICOPS) was initiated as a new committee within the International Phytolith Society in September 2021 and the first committee meeting took place in December 2021. This committee aims to increase the knowledge of and implementation of open science practices in phytolith research. We are embracing an open source approach to our work in this committee so that the work of our committee is transparent. This will include open documentation, regularly communicating with our community, and providing guidelines and communication channels to enable our community to engage with us. Therefore, we want to establish a solid base for this work going forward by further developing our GitHub repository - adding clear contributing guidelines and documentation on how the committee is to be run. All members of the committee will also benefit from training in all aspects of open science practices. This will allow us to gain further insight into the training and initiatives that we want to work on with our community. We will also start to develop training packages specific to phytolith research such as in open publishing and open and FAIR data.

Building an open community around the Turing-Roche Strategic Partnership

By: Vicky Hellon

Mentored by: Katharina Lauer

Status: graduated

Keywords: treatment heterogeneity, missing health data, academia-industry partnership, open science

The Turing-Roche strategic partnership was established in June 2021 with the goal of establishing a collaboration in advanced analytics between the two organisations to develop new data science methods to investigate large, complex, clinical and healthcare datasets to better understand how and why patients respond differently to treatment, and how treatment can be improved. As Community Manager for the project, I am developing a collaborative and open community between both organisations and beyond. As the partnership is just beginning and is flexible in nature, there are opportunities to embed open practices such as open publication, reproducibility, open data, training and co-working, as well as to establish networks such as one for early career researchers.

Art as a means to open science

By: Eirini Botsari

Mentored by: Lena Karvovskaya

Status: not graduated

Keywords: art, open science

One of my goals in my current position, as a community manager at the Open Science Community Rotterdam, is to engage and include as many people as possible; I want to sustain and grow the community. One of my ambitions is to curate events and engage the public in open discussions around open science. One of the events that I really want to arrange would use art (as a means) to raise awareness of open science practices and create the floor for an open discussion around open science. It is still a general idea, so I am aware that I still need to go over all the details, and I am still not sure what obstacles will arise on this journey. But I am really positive and I truly believe that art is open and can act as a mediator towards connection, expression, openness, and understanding.

An Incomplete History of Research Ethics

By: Ismael Kherroubi Garcia

Mentored by: Lisanna Paladin

Status: graduated

Keywords: History of Science, Philosophy of Science, Research Ethics

A History of Research Ethics is a free, online resource for researchers, governance professionals, and even college students to learn about science and ethics, and be inspired to develop practical tools for the assurance of adequately conducted research. A key purpose of A History of Research Ethics is to demonstrate the variety of disciplines and backgrounds that research ethics can draw on. In other words, interdisciplinarity is critical for its success. This means both interdisciplinary contributions and adapting to audiences from diverse fields and sectors. By embracing the collaborative nature of GitHub and OLS’ open science community, I expect to take A History of Research Ethics to its next stage of development.

Developing thermally stable loop mediated isothermal amplification kit for a non invasive detection of malaria

By: Cavin Mgawe

Mentored by: Luis Pedro Coelho

Status: graduated

Keywords: Non invasive diagnostics, molecular kit development, molecular assays, molecular diagnostics, R programming

This study aims to develop a diagnostic LAMP kit that is sensitive and robust in detecting Plasmodium falciparum from saliva and urine, to support simple and easy non-invasive molecular testing. The first phase of this study involved target validation and primer and probe design using open-access tools. Here, a principal component analysis (PCA) from the R/adegenet package (Jombart et al., 2010) and a phylogenetic tree (neighbour-joining) built with the R/ape package (Paradis et al., 2004) were used to cluster repeats of the chosen amplification target. These clusters were aligned to generate a consensus sequence for designing primers and probes for establishing the LAMP assay. I have designed an incorporated strand displacement probe using the engineering guideline of Juan et al., 2015 and the open-access NUPACK software tool. The assay has and the master mix is being lyophilized for further experimental evaluation.

The sensitivity of this kit will be evaluated on DNA extracted from three sample types (saliva, urine, and clinical blood) and on crude samples lysed using lysis buffer. Correlations generated from these results will indicate which approach gives the most sensitive amplification. Further, possible decay of the lyophilised master mix will be evaluated over six months to ascertain its likely shelf-life.

Building Pathways for Onboarding to Research Software Engineering (RSE) Asia Association and Adoption of Code of Conduct

By: Jyoti Bhogal

Mentored by: Malvika Sharan

Status: graduated

Keywords: Community Building, Community, Creating Pathway, Onboarding, Code of Conduct

In October 2021, the RSE Asia Association was launched to create awareness in the Asia region about the field of Research Software Engineering. The digital infrastructure for the association was built during the Open Life Science Cohort 4 (OLS-4) program. The webpage is in place, and contact addresses have been created for communicating with people. A small community has also started emerging. It is now time for the community to expand. With the expanding community, we need to create well-defined pathways for people to onboard to the association. This project aims at building such pathways. Also, a basic Code of Conduct is already present on the RSE Asia webpage. It is to be modified to make it more appropriate for the Asian region.

Transcriptomics profiling of bladder cancer using publicly available datasets

By: Umar Ahmad

Mentored by: Malvika Sharan, Yo Yehudi

Status: graduated

Keywords: RNA-Seq, Bladder Cancer, Transcriptomics, Bioinformatics

We will process publicly available bladder cancer RNA-Seq datasets. The co-authors of this manuscript will apply bioinformatics methods to analyse this large-scale, heterogeneous RNA-sequencing dataset (20+ samples, 3Gb), which will be downloaded from any of the following databases: The Cancer Genome Atlas (TCGA) database, Genotype-Tissue Expression (GTEx), the cBio Cancer Genomics Portal (cBioPortal) database, and the SRA NCBI database (choose any suitable database you are familiar with). The biological questions of particular interest include:

  1. identification of differentially expressed transcripts (DEGs)
  2. pathways and gene networks
  3. hub genes associated with cancer progression and recurrence
  4. small molecule identification
  5. survival analysis

Open source software such as R/Bioconductor (DESeq2), Unix/Linux, Python and Jupyter Notebook will be mainly used for the analyses.

Bioinformatics Secondary school Outreach in Nigeria

By: Emmanuel Adamolekun

Mentored by: Meag Doherty

Keywords: Bioinformatics, Students, Data analysis, Training and education

Bioinformatics Secondary School Outreach (BSSO) is an initiative to develop bioinformatics capacity among high school students in Nigeria. It will create early interest in genomics data analysis among the students and equip them with the relevant skills and knowledge in bioinformatics. Bioinformatics Hub Nigeria will train these students in how to use bioinformatics tools and pipelines, and will establish bioinformatics research clubs in the visited schools to facilitate the trainings. We will be working alongside other sister organizations to achieve this goal.

OpenGHG - a cloud platform for greenhouse gas data analysis and collaboration

By: Gareth Jones

Mentored by: Michael Addy

Status: graduated

Keywords: Greenhouse gases, repeatable science, data science, data sharing, data analysis, open source

The OpenGHG project is a NERC-funded project that aims to be a community platform for greenhouse gas data science. There is currently no central platform for greenhouse gas / atmospheric chemistry researchers to access standardised data / workflows, or easily share and analyse their measurements. Our prototype service currently processes and standardises the raw measurement data taken from sensor networks worldwide (such as the DECC and AGAGE networks), records associated metadata and makes this data searchable. We are in the process of adding the ability to process data from other sources, such as satellites and meteorological models.

Visualisation of participants by their countries

By: Akanksha Chaudhari

Mentored by: Muhammet Celik, Burce Elbasan

Status: not graduated

Keywords: Data Visualization, database, web application

I would like to work on the project ‘Visualization of participants by their countries’ from the projects listed here.

With this project, I would try to represent the participants and mentors taking part in OLS on a map by their countries and year of participation. I am planning to achieve this using Mapbox and OpenStreetMap. When hovering over a particular flag, we would see all the information for the corresponding participants/mentors, which is listed here or this one.

I would like to make something like this, but would first do a lot of brainstorming about design and representation. We could also add views such as mentors vs OLS-1/2/3/4 participants, the number per country, or anything else creative. The next step would be to implement this on the official OLS site. I want to practice my coding and visualization skills through this project and would appreciate the opportunity to meet, interact and work with like-minded people.
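The "number per country" and "mentors vs participants" views mentioned above come down to a grouped count that a map layer could then display. The following is only a minimal Python sketch of that aggregation step: the records and the field names (`role`, `country`, `cohort`) are hypothetical placeholders, not the actual OLS data format, and the Mapbox/OpenStreetMap rendering is not shown.

```python
from collections import Counter

# Hypothetical records; the real OLS data and its field names may differ.
people = [
    {"name": "A", "role": "participant", "country": "India", "cohort": "OLS-4"},
    {"name": "B", "role": "mentor", "country": "India", "cohort": "OLS-4"},
    {"name": "C", "role": "participant", "country": "Peru", "cohort": "OLS-5"},
    {"name": "D", "role": "participant", "country": "India", "cohort": "OLS-5"},
]


def count_by_country(records, role=None, cohort=None):
    """Count records per country, optionally filtered by role and/or cohort."""
    selected = (
        r for r in records
        if (role is None or r["role"] == role)
        and (cohort is None or r["cohort"] == cohort)
    )
    return Counter(r["country"] for r in selected)


all_counts = count_by_country(people)
participant_counts = count_by_country(people, role="participant")
ols5_counts = count_by_country(people, cohort="OLS-5")

# Each Counter maps a country name to the number to show on its map marker.
print(all_counts.most_common())
```

Keeping the filters as optional arguments means the same function can feed each proposed view (all people, one role, one cohort) without duplicating the counting logic.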

Build a community around the TU Delft Open Science MOOC

By: Alessandra Candian, Lisanne Walma

Mentored by: Patricia Herterich

Status: graduated

Keywords: Open Science, Community Building, Open Education, Training and education, Engagement

In this project we want to develop and implement new ways of building, engaging and maintaining the community around the TU Delft Open Science MOOC. We are part of the teaching team of the TU Delft Open Science MOOC called ‘Open Science: sharing your research with the world’. The MOOC’s next run starts in May 2022.

The course runs for 6 weeks and discusses a variety of open science topics; course materials are also available on TU OpenCourseWare.

The first Open Science MOOC started in 2018 and on average the course attracts about 1000 participants from an international environment. As the course runs, participants engage with the teachers and each other through discussion forums. Here they introduce themselves, post assignments, and share and reply to each other’s thoughts. While a few participants actively contribute and respond in the forums, engagement is still quite limited. Moreover, there is not yet a strategy in place to maintain the community after the course has finished. We would like to strengthen the community building taking place during and after this course by implementing additional strategies to engage participants during the course run and to keep in touch with participants after the course has finished.

Hub23: An open source community and infrastructure for Turing’s BinderHub

By: Callum Mole, Lydia France, Luke Hare

Mentored by: Renato Alves

Status: graduated

Keywords: Open Source, Reproducibility, Community, Open Infrastructure, Research

BinderHub is a service that allows users to share reproducible interactive computing environments through public code repositories. The subject of our project, Hub23, is an organisational deployment of BinderHub, designed to allow Turing researchers to use Binder (the user interface) to collaborate on repositories internal to Turing. This is sometimes necessary if the underlying repository cannot be shared for some reason, or is not yet ready to publish openly. During the OLS program, we aim to build an open community around Hub23 to help guide future technical developments, and to encourage use and contributions from the wider Turing community. We will host a series of Zero-to-Binder workshops aimed at introducing Turing researchers to regular Binder, followed by structured discussion of what the ideal features of a collaborative reproducible environment for research would be. Any conclusions and subsequent technical development will be fed upstream to BinderHub, and we also aim to open source the methodologies used to create an internal BinderHub deployment, allowing other organisations to do the same.

Development of an Open Source Platform for the Storage, Sharing, Synthesis and Meta-Analysis of Clinical Data

By: Valentina Borghesani, Isil Poyraz Bilgin, Pedro Pinheiro-Chagas, Sladjana Lukic

Mentored by: Sara El-Gebali

Status: graduated

Keywords: neuropsychology, neuroimaging, data sharing, cognitive neuroscience, clinical neuroscience, data visualization, meta-analysis

We aim to build an online platform and community that allows open sharing, storage, and synthesis of clinical (meta)data, crucial for the development of modern, transdiagnostic, FAIR neuropsychology. First, published peer-reviewed papers will be scraped to collect already available (meta)data. Second, our platform will allow direct uploading of clinical brain maps and their corresponding metadata.

A basic automated preprocessing and data-quality check pipeline will be implemented. Key data will be automatically extracted, synthesized, and made available alongside the data uploaded directly. All the available demographic, behavioral, clinical, and cognitive data will be properly organized and mapped onto the neural data to allow statistical analysis (i.e., data-driven lesion-symptom mapping). Ultimately, probabilistic maps synthesizing transdiagnostic information on lesion-symptom mapping would be constantly updated as more data are gathered. To this end, data visualization will be critical (e.g. https://speechbrainviewer.com/). Overall, the platform will:

  1. enable sharing of FAIR neuropsychological datasets across research centres and groups;
  2. foster understanding of the topographical distribution and morphological characteristics of brain lesions;
  3. allow large-scale, data-driven exploration of the associations between behavior and cognitive symptoms and brain regions.

Open Science, Open Future

By: Mariangela Panniello

Mentored by: Sara Villa

Status: graduated

Keywords: open education, student training, community building, educational resources, neuroscience, Training and education

Open science is vital for reproducible, fair, and rigorous research. For its principles to thrive, we need OS practices to be shared and adopted by as many scientists as possible, from the earliest stages of their career. A 2017 survey by the European Commission reports that, among 1277 researchers at all career stages, the majority were unaware of the OS concept, and had never attended an OS initiative (from the Open Science Skills Working Group Report, July 2017).

For open research to become an established reality, those who are taking their first steps into research must have the opportunity to develop the necessary skillset to apply and disseminate the OS framework. “Open Science, Open Future” aims to be an educational resource to be used online by young scientists: undergraduates, MSc students, and potentially high-school pupils. The curriculum will consist of several modules explaining why each aspect of OS practice can improve research and make it fairer (e.g. best practices in sharing protocols, storing data, publishing, collaborating). I’m the co-founder of a pan-European collective of scientists, https://biotop.co/, which aims to rethink the way we do science. Fellow members are willing to take part in the project.

Open Science for Improve diagnostics of Cancer through Artificial Intelligence and Digital Pathology

By: Nodira Ibrogimova, Elisee Jafsia, Wapouo Fadanka Stephane, Agossou Bidossessi Emmanuel

Mentored by: Andres Sebastian Ayala Ruano

Status: graduated

Keywords: AI, Cancer, Diagnostics, Digital pathology

Cancer is becoming increasingly prevalent among the group of treatable diseases in African countries. In sub-Saharan Africa, only 10% of histopathology needs are met and this is a major barrier to comprehensive management of cancers (1). There is a shortage of clinicians and pathologists available for cancer diagnosis and treatment. One of the critical factors in treatment efficiency is the correct and timely diagnosis of specimens by pathologists. However, there is currently a significant shortage of cancer care clinicians in Africa and an even more considerable shortage of pathologists. In Cameroon, there are 19 pathologists currently in practice for 22,179,707 inhabitants (2). The absolute number of patients with cancer in Cameroon was estimated to be 25,000 cases a year.

Diagnosis of cancer relies on histology in nearly 80% of cases, cytology in 10%, and clinical diagnosis in 10% (1). There is, therefore, an urgent need to develop a rapid and highly sensitive diagnostic tool for cancers, to increase cancer treatment efficacy and reduce overtreatment of tumors clinically suspicious for malignancy. We propose a hybrid diagnosis method with a deep learning algorithm applied to hematoxylin and eosin histology slides. Digital microscopy and telepathology have already been used successfully to mitigate the lack of pathologists in Cameroon, thus confirming the availability of a robust dataset for our project (1). After splitting the collected images into training, validation and test sets, we will train convolutional neural networks (CNNs) on them before deployment and testing. In addition to automated diagnosis, the developed program will have specific features such as sample information storage and tracking software as well as image optimization and analysis tools.
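The split into training, validation and test sets described above can be illustrated with a short Python sketch. This is not the project's pipeline: the function name `split_dataset`, the 70/15/15 proportions, the fixed seed and the slide file names are all assumptions made for illustration.

```python
import random


def split_dataset(items, train_frac=0.70, val_frac=0.15, seed=0):
    """Shuffle items reproducibly and split them into train/validation/test lists.

    Whatever is left after the train and validation shares goes to the test set.
    """
    if train_frac <= 0 or val_frac < 0 or train_frac + val_frac >= 1:
        raise ValueError("fractions must leave a non-empty share for the test set")
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is repeatable
    n_train = round(len(shuffled) * train_frac)
    n_val = round(len(shuffled) * val_frac)
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]
    return train, val, test


# Hypothetical file names standing in for digitised histology slides.
slides = [f"slide_{i:03d}.png" for i in range(100)]
train, val, test = split_dataset(slides)
```

In practice, images from the same patient would usually need to stay in the same subset to avoid leakage between training and test data; this sketch splits by image only and does not handle that.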

  1. Gruber-Mösenbacher, U., Katzell, L., McNeely, M., Neier, E., Jean, B., Kuran, A., & Chamala, S. (2021). Digital pathology in Cameroon. JCO Global Oncology, 7, 1380-1389.
  2. Ministry of Public Health (2017) Health analytical profile 2016 Cameroon. Ministry of Public Health, Cameroon, Yaounde.

Cultivating a Community of Practice of AI researchers

By: Achintya Rao

Mentored by: Yo Yehudi

Status: graduated

Keywords: community management, Community, Reproducible science, Open data, Data science, AI, White paper

The “AI for Science and Government” (ASG) programme at The Alan Turing Institute seeks to produce three community-led white papers that will capture the outcomes of research into deploying AI and data science in priority areas to support the UK’s economy. The papers will also highlight advances in practices towards open and reproducible research in the fields of AI and data science. The process of authoring the white papers will itself be collaborative, open and transparent, soliciting contributions from the wider ASG community at every step of the way.

Argentinean Public Health Research on Data Science and Artificial Intelligence for Epidemic Prevention (ARPHAI)

By: Verónica Xhardez, Sabrina López, Victoria Dumas, Federico Cestares, Laura Ación

Mentored by: Mayya Sundukova

Status: graduated

Keywords: Public health, Data science, AI, Epidemiology

ARPHAI is an interdisciplinary research consortium, whose mission is to develop technological tools and recommendations to anticipate and manage epidemiological events. ARPHAI pilots data-driven open source tools using artificial intelligence and data science towards upgrading Argentina’s electronic health record (EHR) system. ARPHAI is part of the Global South AI4COVID Program.

ARPHAI includes people from 20 institutions. ARPHAI started in October 2020 and has grown very fast from scratch. More specifically, ARPHAI is piloting three EHR-based components in parallel to anticipate and detect potential epidemic outbreaks:

  1. The extraction of computable phenotypes of diseases, symptoms, and syndromes using natural language processing to analyze EHR structured and free-text;
  2. Models for understanding and prediction of relevant epidemiological variables using computable phenotypes and open data information as input; and
  3. Dashboard visualization of the results from both points above, along with additional open data sources to inform decisions made by the public sector epidemiological authorities.

There are two additional lines of work ARPHAI undertakes that are transversal to these three research developments, which include a) diversity, equity, and inclusion (DEI) with a focus on gender and b) responsible use of health data.

FarawayFermi - A platform for open source bioinformatic tools to detect biosignatures in astrobiology

By: Sagarika Valluri, Sairaj R Dillikar

Mentored by:

Status: not graduated

Keywords: Bioinformatics, data, astrobiology, biosignatures, life evolution, game theory

The project focuses on building tools to understand the evolution of life. We develop bioinformatic tools to determine evolutionary processes and detect early-stage life development. The project looks at data from two specific parts of detecting life: the co-evolution of life and environment, and biosignature assessment within the context of habitability. We use data from current experimental projects and develop new models to aid the growth of the astrobiology search for life. The platform will cater to multiple sections, such as data management for all astrobiology projects, experiments, research labs and conferences; new tools to analyse data; a predictive model section to run simulations from the data set; and a collaborative forum to encourage citizen science. The platform will help create open source bioinformatic tools to detect biosignatures, assess habitability, and promote involvement within astrobiology. In addition to using bioinformatic tools, a part of the platform will use game theory and gamification to test the citizen science component. Using the platform as a destination for both tool testing and science education, we hope to advance research in astrobiology.

Open collaborative network and incentive system for brain health

By: Juyeon Kim

Mentored by:

Status: graduated

Keywords: Collaboration, building trust, intersectoral collaboration, Emotional/Physical/Nutritional diet for brain, global connectivity, Digitalization, Incentives to public and scientists

  1. Scientific perspectives. Depression is an increasingly common mental illness caused by a complicated network of extrinsic and intrinsic stimulators. For a healthier and happier brain, three key diets (emotional, physical, and nutritional) should be considered. Scientific research mostly focuses on unraveling the key molecules associated with the mechanism of depression. However, we need to pull and process a more extensive dataset from individuals, public health, and professionals in psychology, nutrition science, physical science, and neuroscience to keep the brain healthy or help it recover to a healthy state.
  2. Open science perspectives. For comprehensive data, we need to build a trusted collaborative digital network that benefits each participant, in order to accelerate innovation from the open dataset. An effective collaboration tool or co-creation platform for intersectoral collaboration would be required to put this project into practice.

Participants

Sitsofe Morgah

Pronouns: He
@0sahene

Carbon Footprint Development For Buildings In Ghana. Phase 2

Expertise:
Civil engineer, Construction, Sustainability

More about Sitsofe

Abraham Dabengwa

Pronouns: he/ him
@Nqabutho

School Of Animal, Plant And Environmental Sciences, University Of The Witwatersrand

Expertise:
Palaeoecology, Savanna and grassland ecology, Landscape ecology

More about Abraham

Agossou Bidossessi Emmanuel

Pronouns: He/Him/His

Mboalab

Expertise:
Software design, Data analysis, Machine learning, Artificial intelligence, IoT, Icd4d

More about Agossou

Adarsh Kalikadien


TU Delft

Expertise:
Catalyst discovery, Cheminformatics, Quantum chemistry, Chemical space exploration, Data-driven catalysis, Data science, Python

More about Adarsh

Alessandra Candian

Pronouns: she/her/hers
@donnainfiorino

University Of Amsterdam

Expertise:
Astrochemistry, Diversity, Inclusion & equity, Information literacy, Mentoring, Education, Training

More about Alessandra

Anne Lee Steele

Pronouns: she/her
@aleesteele

Alan Turing Institute

Expertise:
Community management, Facilitation, Ethnography, Anthropology, Sociology, Interdisciplinary collaboration, Art, Open science, Event organisation

More about Anne Lee

Alvaro Andre Vargas Aguilar

@AndreVargasAgu1

Universidad Mayor De San Simón

Expertise:
Editorial management

Akanksha Chaudhari

Pronouns: She/ Her
@Akanksh27024516

Expertise:
Machine learning, Data science, Visualization, Git/GitHub, Software development, Community science, Flutter

More about Akanksha

Andrea Kocsis

Pronouns: she/her
@aurigandrea

Alan Turing Institute

Expertise:
Digital humanities

More about Andrea

Ayesha Dunk

Pronouns: She/her

The Alan Turing Institute

Expertise:
Community building, Open research, Train the trainer, Education, Project development, Data science and AI education, Open source, Scalability, Sustainability

More about Ayesha

Biandri Joubert

Pronouns: She/Her
@biandri

Independent Researcher

Expertise:
Law, International trade, Legal education, Mixed methodology research

More about Biandri

Umar Ahmad

Pronouns: Data Science
@babasaraki1

Bauchi State University

Expertise:
Anatomics, Genetics, Cancer, Computational genomics, Bioinformatics, R, Git/GitHub, Unix/Linux, Open science, Preprints, Data science, Python (dataviz), Mentoring, High-throughput sequencing

More about Umar

Birgül Çolak-Al


Bezmialem Vakif University

Callum Mole

Pronouns: He/Him
@CallumDMole

The Alan Turing Institute

Expertise:
Open infrastructure, Open science, Reproducibility

More about Callum

Camila Gómez

Expertise:
Health, Pediatrics

More about Camila

Celine Kerfant


University Pompeu Fabra

Expertise:
Archaeobotany, Plant comparative anatomy, Phytolith studies, Ethnobotany

More about Celine

Carla Lancelotti

Pronouns: she-her
@cl379

Universitat Pompeu Fabra And ICREA

Expertise:
Archaeobotany

More about Carla

Rodrigo Gallegos

More about Rodrigo

Isil Poyraz Bilgin

Pronouns: Dr. or Miss
@complexbrains

University Of Reading, UK

Expertise:
Interdisciplinary neuroscience research (e.g. fMRI, EEG, human language processing), Software development (e.g. data preprocessing, analysis, AI, DL), Open science (community engagement, management, material and guideline development)

More about Isil

Danae Carelis Davila Espinoza

@CarelisDavila

More about Danae Carelis

Darwin Diaz

Pronouns: She/Her

Utec

Expertise:
Synthetic biology, Biomaterials.

More about Darwin

Eirini Botsari

Pronouns: she/her
@eirini_botsari

More about Eirini

Emma Karoune

Pronouns: she/her
@ekaroune

The Alan Turing Institute And Historic England

Expertise:
Open data, FAIR data, Open research, Sensitive data, Community building, Open publishing, Phytoliths, Environmental archaeology, Palaeoecology

More about Emma

Emmanuel Adamolekun

Pronouns: He/Him
@EAdamolekun

Helix Biogen Institute

Expertise:
Bioinformatics, Genomics, Community building

More about Emmanuel

Esra Büşra Işık


Bezmialem Vakif University

More about Esra Büşra

Evelyn Greeves

Pronouns: she/her

University Of York; Software Sustainability Institute

Expertise:
Omics, Environmental biotechnology, Training, Community building

Wapouo Fadanka Stephane

Pronouns: He, Him
@StephaneFadanka

Mboalab

Expertise:
Biotechnology, Open science advocacy, Local innovation, Bio-entrepreneurship, Equitable research practices

More about Wapouo Fadanka

Faruk Üstünel


Bezmiâlem Vakıf University

Expertise:
Bioinformatics, Viral informatics, Viral immunology, Immunoinformatics

Federico Cestares

Pronouns: He- él
@fecestares

CIECTI - ARPHAI

Expertise:
International relations, Development economics

More about Federico

Frédérique Belliard

Pronouns: she/her
@fredbelliard

TU Delft Open Publishing

Expertise:
Open publishing, Open access, Open peer review, Scholarly communication

More about Frédérique

Maria Gabriela Musaubach

@Gabi Musa

National University Of Jujuy. National Scientific And Technical Research Council - Argentina

Expertise:
Phytolith analysis, Pre-hispanic archaeology, Archaeobotany, Ancient starch analysis

More about Maria Gabriela

Gareth Jones

Pronouns: He/ Him

University Of Bristol

Expertise:
Research software engineer

More about Gareth

Gemma Turon Rodrigo

Pronouns: she/her
@TuronGemma

Ersilia Open Source Initiative

Expertise:
AI, Molecular biology, Drug discovery, Infectious disease

More about Gemma

Ismael Kherroubi Garcia

Pronouns: He/him
@hermeneuticist

The Alan Turing Institute

Expertise:
History of science, Philosophy of the social sciences, Philosophy of science

More about Ismael

Elisee Jafsia

Pronouns: He/Him
@euclude

Mboalab

Expertise:
Data science

More about Elisee

Jhon Anderson Pérez Silva

Pronouns: He/him
@JhonPrz6

Scihack Community Bio Lab

Expertise:
Computational biology, Bioinformatics, Python, R, Structural biology

More about Jhon Anderson

Javier Ruiz-Pérez

@J_Ruiz_Perez

Texas A&M University

Expertise:
Archaeobotany, Palaeoecology, Phytolith analysis

Juan José García-Granero

Pronouns: He/him/his

Spanish National Research Council

Expertise:
Archaeology

More about Juan José

Jose Luis Villca Villegas

Pronouns: He/Him
@villca_villegas

Universidad Mayor De San Simon

Expertise:
Health sciences, Public health, Open science

More about Jose Luis

Jyoti Bhogal

Pronouns: she/her
@JyotiBhogal7

Expertise:
Statistics, Computation, Software quality testing

More about Jyoti

Kim Martin


Stellenbosch University

Expertise:
Reproducible research, Research software engineering (RSE), Linked data, Knowledge graphs

More about Kim

Laura Ación

Pronouns: she/ella
@_lacion_

MetaDocencia

Expertise:
Ethics in artificial intelligence, Health data science, Responsible use of data, Latin america, Open education, Community building

More about Laura

Luke Hare

Pronouns: he/him

The Alan Turing Institute

Lisanne Walma

Pronouns: she/her
@lisannewalma

TU Delft

Expertise:
Information literacy, Teaching, History of medicine, Basic text-mining

More about Lisanne

Lydia France


The Alan Turing Institute

Expertise:
Life sciences, Research software engineering, Binder, Biology, Zoology, Data science

More about Lydia

Malvika Sharan

Pronouns: she/her
@malvikasharan

The Alan Turing Institute

Role in OLS: Director of Partnerships and Strategy

Expertise:
Community building, Mentoring, Data Science best practices, Reproducibility, Inclusive and collaborative practices, Python, Version Control, Funding Proposals, Bioinformatics, Algorithm design

More about Malvika

Maria Andrea Gonzales Castillo

Pronouns: She/Her

Expertise:
Synthetic biology, Biotechnology

More about Maria Andrea

Mariangela Panniello

Pronouns: she/her
@marpamondo

Italian Institute Of Technology

Expertise:
Neuroscience, Educational resources, Community engagement

More about Mariangela

Cavin Mgawe

Pronouns: He
@CavinMgawe

Expertise:
Molecular biology, LAMP assay, Strand displacement probes, Multiple sequence alignment, Principal component analysis using R, Primer design, Multiple sequence alignment using R package msa, In vitro culture of Pf3D7, Open access publication

More about Cavin

Nadia Odaliz Chamana Chura

Pronouns: She/her/hers

Scihack

Expertise:
Synthetic biology, Environmental biotechnology, Food industry, Scientific communication

More about Nadia Odaliz

Bridget Nea

Pronouns: she/her/hers

The Alan Turing Institute

Expertise:
Learning and development, Project management, Training and skills

More about Bridget

Nelson Franco Condori Salluco

Pronouns: Nelson
@Nelson Franco Condori Salluco

Universidad Mayor De San Simón Facultad De Medicina: Cochabamba, Bolivia

Expertise:
Research area: health sciences and editorial management. Topics I can help with: health promotion and journal publishing in Bolivia

More about Nelson Franco

Nihan Sultan Milat

@MilatNihan

Bezmialem Vakif University, Beykoz Institute Of Life Science And Biotechnology

Expertise:
Life Sciences, Molecular biology, Developmental Genetics

More about Nihan Sultan

Nodira Ibrogimova

Pronouns: She/Hers

Expertise:
Python, Machine learning, Angular

More about Nodira

Juyeon Kim


Binnobase Llc.

Expertise:
Open innovation, Research translation, Innovation management, Women in life science, Neuroinflammation, Depression

More about Juyeon

Piero Beraun

Pronouns: He/Him
@BeraunPiero

Scihack

Expertise:
Synthetic biology, Systems biology, Environmental biotechnology, Astrobiology

More about Piero

Pedro Pinheiro-Chagas

Pronouns: He / him / his
@ppinheirochagas

Stanford University

Expertise:
Cognitive neuroscience

More about Pedro

Achintya Rao

Pronouns: he/him
@RaoOfPhysics

The Alan Turing Institute

Expertise:
Science communication, Public engagement, Version control, A bit of R, Project Binder

More about Achintya

Robert Schreiber

Pronouns: He/him
@schrob25

Expertise:
Affordable and easy access to healthcare services

More about Robert

Sairaj R Dillikar

@SairajDillikar

Expertise:
SETI

Sandra Mirella Larriega Cruz

Pronouns: She, her
@SandraLarriega

Scihack

Expertise:
Biotechnology, Biomaterials, Synbio

Sarah Nietopski

Pronouns: she/her

The Alan Turing Institute

Expertise:
Course design, Online learning, Blended learning

More about Sarah

Zarena Syrgak

Pronouns: she/they

Nazarbayev University

Expertise:
Research ethics, Epistemic injustice, Decolonial research, Political economy of knowledge production

More about Zarena

Sladjana Lukic

Pronouns: she/her
@NoLaB_Lukic

Adelphi University

Expertise:
Lexical system, Psycholinguistics, Aphasia, Emotions, Lesion-symptom mapping

More about Sladjana

Sabrina López

Pronouns: she/her/ella
@SLLDeC

ARPHAI / Instituto De Cálculo (UBA-CONICET)

Expertise:
Health data science, Responsible use of data, Neuroscience

More about Sabrina

Sagarika Valluri

Pronouns: She/Her
@thatspacegirl__

Expertise:
Astrobiology, Planetary science, ML, Data science

More about Sagarika

Valentina Borghesani

Pronouns: she/her/hers
@vborghesani

University Of Montreal

Expertise:
Cognitive neuroscience, Clinical neuroscience, Neuroimaging

More about Valentina

Verónica Xhardez

Pronouns: She/ella
@xhardez

ARPHAI/CIECTI

Expertise:
Free software, Communities of practice, Co-production of knowledge, Interdisciplinary projects

More about Verónica

Vicky Hellon

Pronouns: she/her
@vickyhellon

The Alan Turing Institute

Expertise:
Personalised medicine, Open access publishing

More about Vicky

Victoria Dumas

Pronouns: She, Her
@VickyDumas

Fundación Sadosky - ARPHAI

Expertise:
Data science - data visualization

More about Victoria

Zachary Dunseth

Pronouns: he/his

Icops

Expertise:
Phytolith, Geoarchaeology