The last 18 months with EPICURE: why ask for help?
by João Barbosa (IT4Innovations National Supercomputing Center)
If you’re about to submit a request for computational resources, or have just received access, select Application Support (EPICURE) in your request and you will receive the EPICURE link with your award; you can also apply later via our portal. You may find that the quickest progress comes from not tackling the most complex parts alone.
If you’re running on EuroHPC systems and something feels harder than it should (porting to GPUs, scaling past a stubborn node count, keeping ML training stable at scale), EPICURE was built for exactly that moment. Over the last 18 months we treated advanced support as a collaboration, not a queue, and in many cases the results spoke for themselves: faster starts, reproducible runs, and measurable improvements that carried beyond the engagement.
EPICURE in practice: speed and outcomes
Our approach is straightforward. From the instant a request arrives, the clock starts ticking. On average, projects move from “submitted” to a named lead partner in about 2.9 days, and into their first working meeting roughly a week later. Engagements typically last around three months, with approximately two person-months of focused effort applied where it matters most: profiling first, changing code second, and validating continuously.
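To make “profiling first” concrete, here is a minimal sketch in Python using the standard library’s cProfile; the run_step function in the usage comment is a hypothetical stand-in for whatever a project’s hot loop actually is.

    import cProfile
    import io
    import pstats

    def profile_hotspots(func, *args, top=10, **kwargs):
        """Run func under cProfile and print the top cumulative-time hotspots."""
        prof = cProfile.Profile()
        result = prof.runcall(func, *args, **kwargs)
        report = io.StringIO()
        pstats.Stats(prof, stream=report).sort_stats("cumulative").print_stats(top)
        print(report.getvalue())
        return result

    # Hypothetical usage: profile one step before changing anything.
    # profile_hotspots(run_step, batch)

The point is the ordering: a report like this, captured before any edit, turns an engagement’s first week into evidence rather than guesswork.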
Users ultimately gave the assistance a 4.7/5 rating for helpfulness and responsiveness and a 4.8/5 rating for recommendation. In practice, those scores translate into fewer lost weeks, fewer enigmatic regressions, and code that keeps working when the system or toolchain inevitably changes.
What changed for EPICURE-supported projects?
What changed for the projects we touched was disciplined engineering delivered quickly. Fragile builds that only worked on a lucky node were transformed into portable, version-locked environments, often containerized, allowing teams to iterate without chasing ABI gremlins.
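As an illustration of what “version-locked” buys you in practice, the sketch below fails fast when the runtime environment drifts from the pinned one. The package names and versions are invented for the example; a real project would generate the pins from its lock file.

    from importlib.metadata import PackageNotFoundError, version

    # Illustrative pins only; real values come from the project's lock file.
    PINNED = {"numpy": "1.26.4", "mpi4py": "3.1.6"}

    def check_environment(pins=PINNED):
        """Raise immediately if any installed version drifts from its pin."""
        drift = {}
        for name, wanted in pins.items():
            try:
                found = version(name)
            except PackageNotFoundError:
                found = None
            if found != wanted:
                drift[name] = (wanted, found)
        if drift:
            raise RuntimeError(f"Environment drift (wanted, found): {drift}")

Run at job start, a guard like this converts a mysterious mid-run ABI failure into a one-line error before any node-hours are spent.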
Jobs that quietly left GPUs idle were re-launched with topology-aware bindings and telemetry that made under-utilisation obvious. Memory blowups that initially appeared to be “bad luck at scale” became tractable once we mapped batch size, staging buffers, and host–device transfers to the realities of each machine. And for groups pushing into AI, the gains often came from everything surrounding the kernels: staging data so the filesystem kept up, launching in ways that respected the node layout, and choosing precision modes that preserved scientific validity while still allowing throughput to climb.
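As a sketch of what topology-aware launching and cheap telemetry can look like, the snippet below assumes a Slurm launch and PyTorch; SLURM_LOCALID is Slurm’s per-node task index, and everything else is illustrative rather than a recipe for any particular machine.

    import os
    import torch

    def bind_local_gpu():
        """Pin each Slurm task to the GPU matching its local rank on the node."""
        local_rank = int(os.environ.get("SLURM_LOCALID", "0"))
        torch.cuda.set_device(local_rank)
        return local_rank

    def log_gpu_memory(tag):
        """Cheap telemetry that makes idle or leaking GPUs visible in the logs."""
        dev = torch.cuda.current_device()
        print(f"[{tag}] GPU {dev}: "
              f"{torch.cuda.memory_allocated(dev) / 2**30:.2f} GiB allocated, "
              f"{torch.cuda.memory_reserved(dev) / 2**30:.2f} GiB reserved")

    # Precision modes: bfloat16 autocast often lifts throughput, but any such
    # change should be validated against a full-precision baseline, e.g.:
    # with torch.autocast("cuda", dtype=torch.bfloat16):
    #     loss = model(batch)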
Across 135 support requests, that rhythm repeated. Not every project finishes within an evaluation window, but the cadence remains consistent: days to get started, weeks to turn measurements into changes, and months to lock in improvements and hand over artifacts the team can keep.
Just as important, fixes stopped being one-offs. As patterns emerged (collectives stalling on a particular fabric, GPU occupancy sinking for a known reason, I/O saturating in predictable ways), we wrote them down: Slurm templates for actual node topologies, MPI and NCCL settings that help far more often than they hurt, memory-tuning checklists that prevent the classic accidental out-of-memory, and known-good container stacks pairing CUDA or ROCm with specific framework versions. The next project began further ahead because the last one shared what it had learned.
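To give a flavour of those shared settings, here is a hedged Python sketch that applies conservative NCCL-related defaults before joining a process group. The values shown are placeholders, the right ones are machine-specific, and the sketch assumes the launcher has already exported the usual rendezvous variables (MASTER_ADDR and friends).

    import os
    import torch.distributed as dist

    # Placeholder defaults; a site's known-good template supplies real values.
    NCCL_DEFAULTS = {
        "NCCL_DEBUG": "WARN",            # raise to INFO when diagnosing stalls
        # "NCCL_SOCKET_IFNAME": "hsn0",  # the interface name is site-specific
    }

    def init_distributed(defaults=NCCL_DEFAULTS):
        """Apply conservative NCCL defaults, then join the process group."""
        for key, value in defaults.items():
            os.environ.setdefault(key, value)  # never override site settings
        dist.init_process_group(backend="nccl")

The setdefault is the design choice: a template should lose to anything the site or the user has already set deliberately.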
Application support for projects of all sizes
If you’re wondering whether your problem is big enough, that’s the point: you don’t need to be a lighthouse code to benefit. We’ve worked shoulder-to-shoulder with teams modernising legacy CPU-only solvers, with SMEs who need reliable speedups without a six-month detour into tooling, and with researchers whose ML pipelines are brilliant but brittle at supercomputing scale. The heterogeneity across EuroHPC systems isn’t going away, and it shouldn’t; EPICURE can offer a more straightforward path through it, guided by people who have already hit the same walls this year.
How to engage with EPICURE
The best time to involve us is during your allocation request: choose EPICURE Support so that you receive the EPICURE link with your award. If you decide to apply later, you can do so via our support portal. A short, focused performance clinic can turn “it’s slow” into a specific hotspot with concrete next steps. And if your work spans multiple centres or architectures, that’s fine too; collaboration across sites is normal for us, and it’s part of why our solutions generalise.
If you take one message from the last eighteen months, make it this: advanced support isn’t a last resort; it can be an accelerator. Ask for it when you’re planning a port, when you’re designing a scaling experiment, or when you’ve hit the kind of problem that feels like it will eat the next month. We bring measurement, playbooks, and guardrails, and we aim to leave you with artifacts you can reuse after the engagement ends.
Partners
We gratefully acknowledge the collaboration of the EPICURE partners (alphabetical order): Academic Computer Centre Cyfronet AGH (CYFRONET); Barcelona Supercomputing Center – Centro Nacional de Supercomputación (BSC); CINECA Consorzio Interuniversitario; CINES – Centre Informatique National de l’Enseignement Supérieur; CSC – Tieteen tietotekniikan keskus Oy; Danish e-Infrastructure Consortium (DeiC/DTU); Forschungszentrum Jülich GmbH (FZJ); GENCI – Grand Équipement National de Calcul Intensif; INESC TEC – Instituto de Engenharia de Sistemas e Computadores, Tecnologia e Ciência; Institut informacijskih znanosti (IZUM); Jožef Stefan Institute (JSI); Kungliga Tekniska högskolan (KTH); LuxProvide S.A.; Sofia Tech Park JSC; Universiteit Antwerpen (UAntwerpen); and VSB – Technical University of Ostrava (IT4I@VSB).