05/05/2026

From 14 hours to under 5: faster whole-genome analysis with EuroHPC’s MeluXina

By Latvian Biomedical Research and Study Centre

 

 

Large-scale genome sequencing projects are opening doors to earlier diagnoses, more effective treatments, and new insights into disease. Yet analysing thousands of whole genomes remains a major computational challenge.

 

The Latvian genome reference project explores scalable solutions to make fast whole-genome sequencing (WGS) analysis accessible and practical, to the benefit of patients and clinicians. With EPICURE’s support, the project reduced whole-genome analysis runtime from over 14 hours to under 5, a 3.1× speed-up that enables more scalable genomic processing.

 

The Latvian Biomedical Research and Study Centre has already sequenced around 4,000 human genomes but faces limitations in local HPC capacity that slow down secondary analyses. To solve this, they benchmarked and optimised their containerised variant-calling pipeline (nf-core/sarek with Nextflow) on the MeluXina supercomputer to identify configurations suitable for efficient large-scale genomic processing.
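For context, a containerised nf-core/sarek run of this kind is typically launched through Nextflow along the following lines; the release, profile, and file paths below are illustrative placeholders, not the project’s actual configuration:

```shell
# Sketch: launching the nf-core/sarek variant-calling pipeline via Nextflow.
#   -r         pins a pipeline release for reproducibility
#   -profile   runs every step inside Singularity/Apptainer containers
#   --input    sample sheet listing the FASTQ files for each genome
nextflow run nf-core/sarek \
    -r 3.4.0 \
    -profile singularity \
    --input samplesheet.csv \
    --genome GATK.GRCh38 \
    --tools haplotypecaller \
    --outdir results/
```

Because each step runs in its own container, the same workflow definition is portable between the local cluster and a system such as MeluXina.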

 

 

 

Scientific and technical challenges

 

WGS analysis at this scale requires efficient orchestration of complex, resource-intensive workflows. The initial pipeline setup faced several limitations, such as restricted parallelism at the node level and inefficient scheduling caused by large numbers of small jobs.

 

In addition, specific steps such as duplicate marking introduced significant computational overhead. Performance varied depending on storage tiers, executor configuration, and hardware choices. The researchers needed to understand how to balance CPU and GPU resources, optimise data placement, and ensure efficient execution across nodes, all of which were essential to improving scalability.

 

 

 

Image of a supercomputer with a blue filter (@LuxProvide)

 

 

EPICURE support and EuroHPC resources

 

To overcome these challenges, EPICURE support focused on redesigning and optimising the existing Nextflow/Sarek pipeline for execution on the MeluXina system.

 

The original workflow was reconfigured to use the HyperQueue executor on pre-allocated nodes, avoiding inefficient job scheduling and enabling effective multi-node execution. EPICURE also guided the application of MeluXina’s best practices for data placement and supported the installation of NVIDIA’s GPU-accelerated genomics suite, Parabricks.
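The pre-allocation pattern can be sketched as a single Slurm batch script; the node count, wall time, and placement of the `hq` commands are assumptions based on a typical HyperQueue setup, not the project’s exact script:

```shell
#!/bin/bash
# Sketch: run the whole workflow inside one Slurm allocation and let
# HyperQueue schedule the many small pipeline tasks within it, instead
# of submitting each task to Slurm individually.
#SBATCH --nodes=3
#SBATCH --time=08:00:00

hq server start &                      # HyperQueue server on the head node
sleep 5                                # give the server time to come up

srun --ntasks-per-node=1 hq worker start &   # one worker per allocated node

# nextflow.config selects the HyperQueue executor:  process.executor = 'hq'
nextflow run nf-core/sarek -profile singularity \
    --input samplesheet.csv --outdir results/

hq worker stop all                     # tear down workers and server
hq server stop
```

Nextflow then hands tasks to HyperQueue, which packs them onto the already-reserved nodes, so the cluster scheduler sees one large job rather than thousands of small ones.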

 

The result was a robust and scalable pipeline that runs efficiently across both CPU and GPU environments and supports large-scale genomic analyses.
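As an illustration of the GPU path, Parabricks collapses alignment, sorting, and duplicate marking, one of the bottlenecks noted above, into a single GPU-accelerated step via its `fq2bam` tool; the reference and sample paths here are placeholders:

```shell
# Sketch: GPU-accelerated alignment, sorting, and duplicate marking with
# NVIDIA Parabricks. fq2bam fuses these CPU-heavy steps into one pass.
pbrun fq2bam \
    --ref GRCh38.fasta \
    --in-fq sample_R1.fastq.gz sample_R2.fastq.gz \
    --out-bam sample.markdup.bam
```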

 

 

Results and impact

 

The optimised configurations delivered a significant improvement in performance and scalability. The best-performing setup, which used GPU nodes, Parabricks, and HyperQueue, reduced runtime from approximately 14.6 hours in the initial CPU-only configuration to around 4.7 hours using three GPU nodes. This corresponds to a speed-up of about 3.1×.
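The quoted figure is simply the ratio of the two runtimes:

```python
# Speed-up of the best GPU configuration over the initial CPU-only run,
# using the runtimes reported above (in hours).
cpu_hours = 14.6   # initial CPU-only configuration
gpu_hours = 4.7    # three GPU nodes with Parabricks + HyperQueue

speedup = cpu_hours / gpu_hours
print(f"speed-up: {speedup:.1f}x")   # speed-up: 3.1x
```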

 

Beyond raw performance gains, the project defined practical strategies for scaling containerised genomics workflows on HPC systems. The results showed that the most substantial improvements came from GPU-accelerated tools combined with appropriate workflow orchestration, rather than relying only on hardware upgrades.

 

These advances enable faster and more reliable processing of large genomic datasets. They also support the completion of the Latvian genome reference project and contribute to broader initiatives such as the Genome of Europe project. In the long term, this work by the Latvian Biomedical Research and Study Centre will support the development of allele frequency databases for clinical and scientific use.

 

 

 

Close-up 3D illustration of a DNA double helix composed of clustered particles against a light background.

 

 

Next steps

 

The next phase of the project will focus on applying these configurations and best practices to large-scale production analyses of Latvian genomic data. This will ensure more efficient use of both CPU and GPU resources at each stage of the pipeline.

 

Future work will address the optimisation of resource allocation and the development of additional components, such as joint-calling workflows, to support more advanced genomic analysis.

 

 

To learn more about the “Exploring additional computational resources for Latvian genome reference” project, visit its project page on the European HPC application support portal.
