Publications


Abstract

Research Object Crate (RO-Crate) is a lightweight method to package research outputs along with their metadata. Signposting provides a simple yet powerful approach to navigate scholarly objects on the Web. Combining these technologies forms a "webby" implementation of the FAIR Digital Object principles which is suitable for retrofitting to existing data infrastructures or even for ad-hoc research objects using regular Web hosting platforms. Here we give an update of recent community development and adoption of RO-Crate and Signposting. It is notable that programmatic access and more detailed profiles have received high attention, as well as several FDO implementations that use RO-Crate.

Authors: Stian Soiland-Reyes, Peter Sefton, Simone Leo, Leyla Jael Castro, Claus Weiland, Herbert Van de Sompel

Date Published: 18th Mar 2025

Publication Type: Journal Article
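
To illustrate the Signposting approach mentioned in the abstract above, here is a minimal sketch of how a client might read typed links from an HTTP Link header. The header value, the example.org URLs and the DOI are placeholders invented for illustration, and the parser is a simplification rather than a complete RFC 8288 implementation.

```python
import re

# Hypothetical Link header, as a Signposting-enabled landing page might return it.
# Every URL below is a placeholder, not a real resource.
LINK_HEADER = (
    '<https://doi.org/10.1234/example>; rel="cite-as", '
    '<https://example.org/crate/ro-crate-metadata.json>; rel="describedby"; '
    'type="application/ld+json", '
    '<https://example.org/crate/data.zip>; rel="item"; type="application/zip"'
)

# One link: <target> followed by zero or more ;key=value parameters.
LINK_RE = re.compile(r'<([^>]*)>((?:\s*;\s*[\w*-]+\s*=\s*(?:"[^"]*"|[^;,\s]+))*)')
PARAM_RE = re.compile(r'([\w*-]+)\s*=\s*(?:"([^"]*)"|([^;,\s]+))')


def parse_link_header(header):
    """Group link targets by relation type (simplified parser)."""
    links = {}
    for target, params in LINK_RE.findall(header):
        attrs = {}
        for key, quoted, bare in PARAM_RE.findall(params):
            attrs[key.lower()] = quoted or bare
        # A single rel attribute may carry several space-separated relation types.
        for rel in attrs.get("rel", "").split():
            links.setdefault(rel, []).append(
                {"href": target, "type": attrs.get("type")}
            )
    return links


signposts = parse_link_header(LINK_HEADER)
for rel, targets in signposts.items():
    print(rel, [t["href"] for t in targets])
```

In this sketch the `describedby` link is where a client would look for the RO-Crate metadata file, while `cite-as` gives the persistent identifier to use in citations.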

Abstract

Description: Documentation of the cross-domain adoption of the EuroScienceGateway (ESG) project, showcasing how Galaxy was used and extended to meet the data analysis needs of researchers across biodiversity, climate science, astrophysics, materials science, and biomedical domains. This record outlines ESG’s impact on the onboarding of diverse scientific communities, enabling scalable, reproducible, and FAIR-compliant workflows. Through targeted outreach, infrastructure integration, and community-driven tool development, the project successfully onboarded new user groups and demonstrated Galaxy’s adaptability across multiple scientific verticals. Over 800 tools were integrated into Galaxy during the past 3 years, and dozens of reusable workflows were published to support sensitive data handling, high-throughput image analysis, simulation environments, and federated compute. The deliverable documents use cases, domain-specific onboarding models, training efforts, and collaborative success stories, including the development of the Galaxy Codex and strategic alignment with EOSC, ELIXIR, and NFDI initiatives.
Project: EuroScienceGateway was funded by the European Union’s Horizon Europe programme (HORIZON-INFRA-2021-EOSC-01) under grant agreement number 101057388.
Document: D5.2 Publication of the usage of EuroScienceGateway by multiple communities
Work Package: Work Package 5. Community engagement, adoption and onboarding
Tasks:
- Task 5.1 Biodiversity and Climate Science
- Task 5.2 Materials Science
- Task 5.3 Astrophysics
- Task 5.4 Mentoring and onboarding new communities
Lead Beneficiary: University of Oslo (UiO)
Contributing Beneficiaries: UiO, ALU-FR, CNRS, UNIFI, UKRI, EPFL, UP, BSC

Authors: Armin Dadras, Denys Savchenko, Andrii Neronov, Volodymyr Savchenko, Nikolay Vazov, Jean Iaquinta, Eva Alloza, María Chavero Díez, Anthony Bretaudeau

Date Published: 18th Aug 2025

Publication Type: Report

Abstract

Description: Documentation of the design, deployment, and operationalization of the European Pulsar Network, developed within the EuroScienceGateway (ESG) project. This deliverable outlines how the Pulsar Network enables scalable, federated, and interoperable remote job execution across European Galaxy servers and compute infrastructures. This record showcases the technical architecture, automation strategies, and monitoring solutions behind the distributed execution framework, supporting reproducible workflows and efficient resource sharing. The network connects 13 Pulsar endpoints across 10 countries, integrated with six national Galaxy servers and the European Galaxy server. Deployments span public clouds, institutional HPCs, and EOSC resources, unified under a secure, open-source infrastructure stack using Terraform, Ansible, RabbitMQ, CVMFS, and SABER. The deliverable demonstrates how ESG addressed interoperability and scalability challenges through open infrastructure tooling, cross-institutional coordination, and continuous monitoring. It provides a replicable model for distributed compute resource integration and highlights Galaxy's extensibility in federated scientific computing.
Project: EuroScienceGateway, funded by the European Union’s Horizon Europe programme (HORIZON-INFRA-2021-EOSC-01) under grant agreement number 101057388.
Document: D3.2 Publication on the Pulsar Network, integrated in workflow management systems
Work Package: Work Package 3. Pulsar Network: Distributed heterogeneous compute
Tasks:
- Task 3.1 Develop and maintain an Open Infrastructure-based deployment model for Pulsar endpoints
- Task 3.2 Add GA4GH Task Execution Service (TES) API to Pulsar
- Task 3.3 Build a European-wide network of Pulsar sites
- Task 3.4 Add TES support to WfExS (Workflow Execution Service)
- Task 3.5 Developing and maintaining national or domain-driven Galaxy servers
Lead Beneficiary: CNR
Contributing Beneficiaries: CNR, INFN, ALU-FR, CNRS, CESNET, UiO, UB, EPFL, AGH/AGH-UST, BSC, VIB, IISAS, TUBITAK, UNIMAN

Authors: Armin Dadras, Marco Antonio Tangaro

Date Published: 1st Aug 2025

Publication Type: Report

Abstract

Project: EuroScienceGateway was funded by the European Union programme Horizon Europe (HORIZON-INFRA-2021-EOSC-01-04) under grant agreement number 101057388 and by UK Research and Innovation (UKRI) under the UK government’s Horizon Europe funding guarantee grant number 10038963.
Document: D4.2 Publication on the smart job scheduler implementation
Work Package: Work Package 4. Building blocks for a sustainable operating model.
Task:
- Task 4.3 Implement a smart job-scheduling system across Europe
Lead Beneficiary: EGI
Contributing Beneficiaries: ALU-FR, CESNET, EGI, UiO, and VIB
Executive Summary: Galaxy is currently using the Total Perspective Vortex (TPV) to schedule millions of jobs for hundreds of thousands of users globally. While TPV has proven to be a robust meta-scheduling tool for Galaxy in recent years, there are areas for improvement that have been addressed in the EuroScienceGateway project:
- Gathering live usage metrics from across the distributed computing endpoints connected to Galaxy in order to distribute the load across all sites.
- Adding latitude and longitude attributes to data stores and computing endpoints to allocate jobs as close as possible to the location of the data.
- Visualizing job distribution across sites with an intuitive dashboard.
As a result, the EuroScienceGateway project has developed two new tools:
- TPV Broker, for the efficient meta-scheduling of jobs taking into account real-time usage metrics and data-locality information
- Galaxy Job Radar, a web dashboard to easily visualize the allocation of jobs across all sites
The EuroScienceGateway project has significantly improved the meta-scheduling of jobs for Galaxy, resulting in shorter waiting times for users to see their jobs completed and better resource utilization across all sites.

Authors: Abdulrahman Azab, Sanjay Kumar Srikakulam, Paul De Geest, Tomáš Vondrák, Björn Grüning, Mira Kuntz, Enol Fernandez-del-Castillo, Sebastian Luna-Valero

Date Published: 27th Feb 2025

Publication Type: Report
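
The data-locality idea in the D4.2 summary above (latitude and longitude attributes combined with live usage metrics) can be sketched roughly as follows. This is an illustrative assumption, not the TPV Broker implementation: the `Endpoint` class, the `load` field, the `max_load` threshold, the site names and the coordinates are all hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class Endpoint:
    """Hypothetical compute endpoint description (not a TPV data structure)."""
    name: str
    lat: float   # degrees
    lon: float   # degrees
    load: float  # fraction of capacity in use, 0.0 to 1.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in kilometres."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * earth_radius_km * math.asin(math.sqrt(a))


def pick_endpoint(data_lat, data_lon, endpoints, max_load=0.9):
    """Pick the endpoint closest to the data among those below max_load.

    If every endpoint is above the threshold, fall back to the closest one.
    """
    candidates = [e for e in endpoints if e.load < max_load] or list(endpoints)
    return min(candidates, key=lambda e: haversine_km(data_lat, data_lon, e.lat, e.lon))


# Made-up sites and loads for illustration only.
sites = [
    Endpoint("site-a", 48.0, 7.85, load=0.95),   # closest to the data, but busy
    Endpoint("site-b", 59.9, 10.75, load=0.20),
    Endpoint("site-c", 41.1, 16.9, load=0.40),
]
chosen = pick_endpoint(47.4, 8.5, sites)
print(chosen.name)
```

A real meta-scheduler would weigh more signals (queue length, tool requirements, data size); the sketch only shows how distance and load can be combined in a single selection step.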

Abstract

Recording the provenance of scientific computation results is key to the support of traceability, reproducibility and quality assessment of data products. Several data models have been explored to address this need, providing representations of workflow plans and their executions as well as means of packaging the resulting information for archiving and sharing. However, existing approaches tend to lack interoperable adoption across workflow management systems. In this work we present Workflow Run RO-Crate, an extension of RO-Crate (Research Object Crate) and Schema.org to capture the provenance of the execution of computational workflows at different levels of granularity and bundle together all their associated objects (inputs, outputs, code, etc.). The model is supported by a diverse, open community that runs regular meetings, discussing development, maintenance and adoption aspects. Workflow Run RO-Crate is already implemented by several workflow management systems, allowing interoperable comparisons between workflow runs from heterogeneous systems. We describe the model, its alignment to standards such as W3C PROV, and its implementation in six workflow systems. Finally, we illustrate the application of Workflow Run RO-Crate in two use cases of machine learning in the digital image analysis domain.

Authors: Simone Leo, Michael R. Crusoe, Laura Rodríguez-Navas, Raül Sirvent, Alexander Kanitz, Paul De Geest, Rudolf Wittner, Luca Pireddu, Daniel Garijo, José M. Fernández, Iacopo Colonnelli, Matej Gallo, Tazro Ohta, Hirotaka Suetake, Salvador Capella-Gutierrez, Renske de Wit, Bruno P. Kinoshita, Stian Soiland-Reyes

Date Published: 10th Sep 2024

Publication Type: Journal Article
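
As a rough illustration of the kind of provenance record described in the Workflow Run RO-Crate abstract above, the sketch below builds a small JSON-LD graph in which a Schema.org `CreateAction` links a tool to its inputs and outputs. It is a simplified assumption-based example: the `build_run_crate` helper, the `#run-1` and `#tool` identifiers and the file names are invented here, and descriptive properties that a complete crate would carry (name, license, dates, profile conformance) are omitted for brevity.

```python
import json


def build_run_crate(tool_name, inputs, outputs):
    """Return a minimal RO-Crate-style metadata document for one tool run.

    Hypothetical helper for illustration; not part of any RO-Crate library.
    """
    files = inputs + outputs
    graph = [
        # Metadata descriptor pointing at the root dataset.
        {
            "@id": "ro-crate-metadata.json",
            "@type": "CreativeWork",
            "conformsTo": {"@id": "https://w3id.org/ro/crate/1.1"},
            "about": {"@id": "./"},
        },
        # Root dataset listing the packaged files and mentioning the run.
        {
            "@id": "./",
            "@type": "Dataset",
            "hasPart": [{"@id": f} for f in files],
            "mentions": [{"@id": "#run-1"}],
        },
        {"@id": "#tool", "@type": "SoftwareApplication", "name": tool_name},
        # The execution itself: instrument = tool, object = inputs, result = outputs.
        {
            "@id": "#run-1",
            "@type": "CreateAction",
            "instrument": {"@id": "#tool"},
            "object": [{"@id": f} for f in inputs],
            "result": [{"@id": f} for f in outputs],
        },
    ]
    graph += [{"@id": f, "@type": "File"} for f in files]
    return {"@context": "https://w3id.org/ro/crate/1.1/context", "@graph": graph}


crate = build_run_crate("example-tool", ["input.txt"], ["output.txt"])
print(json.dumps(crate, indent=2))
```

Keeping the run as its own entity, separate from the files and the tool, is what allows runs from different workflow systems to be compared using the same vocabulary.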

Abstract

Description: This preprint outlines the development and deployment of the European Pulsar Network (EPN), a federated, scalable architecture enabling distributed job execution across national and European Galaxy instances. Built within the Horizon Europe EuroScienceGateway project, the EPN leverages the Galaxy workflow system and the Pulsar job execution service to offload computational workloads to remote endpoints seamlessly and securely. The work introduces an Open Infrastructure (OI) framework that automates provisioning, deployment, and monitoring using Terraform, Ansible, and Jenkins. The preprint highlights deployments across thirteen Pulsar nodes and six national Galaxy portals, illustrating how the EPN supports reproducible, FAIR-aligned data analysis while abstracting infrastructure complexity for researchers.

Authors: Marco Antonio Tangaro, Stefano Nicotri, Björn Grüning, Sanjay Kumar Srikakulam, Armin Dadras, Oana Kaiser, Mira Kuntz, Anthony Bretaudeau, Paul De Geest, Sebastian Luna-Valero, María Chavero Díez, José María Fernández González, Salvador Capella-Gutierrez, Josep Lluís Gelpí, Jan Astalos, Boris Jurič, Miroslav Ruda, Łukasz Opioła, Hakan Bayındır, Silvia Gioiosa, Gaetanomaria De Sanctis, Federico Zambelli

Date Published: 7th Aug 2025

Publication Type: Preprint

Abstract

Background: The COVID-19 pandemic had negative impacts in almost every country in the world. These impacts were observed mainly in the public health sphere, with a rapid rise and spread of the disease and failed attempts to restrain it while there was no treatment. However, in developing countries, the impacts were also severe in other respects, such as the intensification of social inequality, poverty and food insecurity. Specifically in Brazil, miscommunication among the layers of government led the control measures into complete chaos in a country of continental dimensions. Brazil made an effort to register granular, informative data about the case reports and their outcomes. While this data is available and can be consumed freely, there are issues concerning its integrity and inconsistencies between the real number of cases and the number of notifications in this dataset.
Results: We designed and implemented four types of analysis to explore the Brazilian public dataset of Severe Acute Respiratory Syndrome notifications (srag dataset) and the Google dataset of community mobility change (mobility dataset). These analyses provide a diagnosis of data integration issues, strategies to integrate data, and experimentation with surveillance analysis. The first type of analysis aims at describing and exploring the data contained in both datasets, starting by assessing data quality with respect to missing data, then summarizing the patterns found in these datasets. The second type concerns a statistical experiment to estimate the cases from mobility patterns organized in periods of time. As the third analysis type, we also developed an algorithm to help understand the disease waves by detecting them and comparing the time periods across cities. Lastly, we built time series datasets considering deaths, overall cases and residential mobility change in regular time periods and used them as features to group cities with similar behavior.
Conclusion: The exploratory data analysis showed the under-representation of COVID-19 cases in many small cities in Brazil that were either absent from the srag dataset or had a number of cases much lower than real projections. We also assessed the availability of data for Brazilian cities in the mobility dataset in each state, finding that not all states were represented and that the best coverage occurred in Rio de Janeiro state. We compared the capacity of combinations of mobility change across place categories to estimate the number of cases, measuring the errors and identifying the best mobility components that could affect the cases. In order to target specific strategies for groups of cities, we compared strategies to cluster cities with similar outcome behavior over time, highlighting the divergence in handling the disease.

Authors: Yasmmin Côrtes Martins, Ronaldo Francisco da Silva

Date Published: 27th Sep 2023

Publication Type: Journal Article

Abstract

Not specified

Authors: Michael J. Roach, N. Tessa Pierce-Ward, Radoslaw Suchecki, Vijini Mallawaarachchi, Bhavya Papudeshi, Scott A. Handley, C. Titus Brown, Nathan S. Watson-Haigh, Robert A. Edwards

Date Published: 15th Dec 2022

Publication Type: Journal Article

Abstract

Workflows have become a core part of computational scientific analysis in recent years. Automated computational workflows multiply the power of researchers, potentially turning “hand-cranked” data processing by informaticians into robust factories for complex research output. However, in order for a piece of software to be usable as a workflow-ready tool, it may require alteration from its likely origin as a standalone tool. Research software is often created in response to the need to answer a research question with the minimum expenditure of time and money in resource-constrained projects. The level of quality might range from “it works on my computer” to mature and robust projects with support across multiple operating systems. Despite a significant increase in the uptake of workflow tools, there is little specific guidance for writing software intended to slot in as a tool within a workflow, or on converting an existing standalone research-quality software tool into a reusable, composable, well-behaved citizen within a larger workflow. In this paper we present 10 simple rules for how a software tool can be prepared for workflow use.

Authors: Paul Brack, Peter Crowther, Stian Soiland-Reyes, Stuart Owen, Douglas Lowe, Alan R. Williams, Quentin Groom, Mathias Dillen, Frederik Coppens, Björn Grüning, Ignacio Eguinoa, Philip Ewels, Carole Goble

Date Published: 24th Mar 2022

Publication Type: Journal Article

Abstract

Motivation: Identifying the most important mutations that lead to structural and functional changes in highly transmissible virus variants is essential to understand their impacts and the possible chances of vaccine and antibody escape. Strategies to rapidly associate mutations with functional and conformational properties are needed to quickly analyze mutations in proteins and their impacts on antibodies and human binding proteins.
Results: Comparative analysis showed the main structural characteristics of the essential mutations found for each variant of concern in relation to the reference proteins. The paper presented a series of methodologies to track and associate conformational changes and the impacts promoted by the mutations.

Authors: Yasmmin Martins, Ronaldo Francisco da Silva

Date Published: 22nd Jun 2023

Publication Type: Journal Article
