Joined: 30th Jun 2023
Expertise: Not specified
Tools: Not specified
Related items
The eFlows4HPC project aims to provide a workflow software stack and an additional set of services that enable the integration of HPC simulations and modelling with big data analytics and machine learning in scientific and industrial applications. The project is also developing the HPC Workflows as a Service (HPCWaaS) methodology, which aims to provide tools that simplify the development, deployment, execution and reuse of workflows. The project demonstrates its advances through three application Pillars ...
Teams: Cluster Emergent del Cervell Humà, Workflows and Distributed Computing, Pillar I: Manufacturing, Pillar II: Climate, Pillar III: Urgent computing for natural hazards, eFlows4HPC general, COMPSs Tutorials
Distributed computing aims to offer tools and mechanisms that enable the sharing, selection, and aggregation of a wide variety of geographically distributed computational resources in a transparent way. The team's research builds on the group's past expertise and extends it to the aspects of distributed computing that can benefit from it. The team at BSC has a strong focus on programming models and on resource management and scheduling in distributed computing ...
Space: eFlows4HPC
Organisms: Not specified
Project that aims to create the NeuroPlat portal for neurodrug design
Space: eFlows4HPC
Organisms: Not specified
Name: TruncatedSVD (Randomized SVD) Contact Person: [email protected] Access Level: public License Agreement: Apache2 Platform: COMPSs Machine: MareNostrum5
TruncatedSVD (Randomized SVD) computing just 456 singular values of a (4.5M x 850) matrix. The input matrix represents a CFD transient simulation of air moving past a cylinder. This application used dislib-0.9.0
Name: Matmul GPU Case 1 Cache-ON Contact Person: [email protected] Access Level: public License Agreement: Apache2 Platform: COMPSs Machine: Minotauro-MN4
Matmul running on the GPU, leveraging the COMPSs GPU Cache for deserialization speedup. Launched using 32 GPUs (16 nodes). Performs C = A @ B, where A has shape (320, 56_900_000) and block_size (10, 11_380_000), B has shape (56_900_000, 10) and block_size (11_380_000, 10), and C has shape (320, 10) and block_size ...
Type: COMPSs
Creators: Cristian Tatu, The Workflows and Distributed Computing Team
Submitter: Cristian Tatu
Name: Matmul GPU Case 1 Cache-OFF Contact Person: [email protected] Access Level: public License Agreement: Apache2 Platform: COMPSs 3.3 Machine: Minotauro-MN4
Matmul running on the GPU without the GPU Cache. Launched using 32 GPUs (16 nodes). Performs C = A @ B, where A has shape (320, 56_900_000) and block_size (10, 11_380_000), B has shape (56_900_000, 10) and block_size (11_380_000, 10), and C has shape (320, 10) and block_size (10, 10). Total dataset size 291 ...
Type: COMPSs
Creators: Cristian Tatu, The Workflows and Distributed Computing Team
Submitter: Cristian Tatu
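The block sizes quoted in the two Matmul entries imply a block grid of 32 x 5 for A, 5 x 1 for B and 32 x 1 for C. The following is a minimal pure-Python sketch of a blocked product over such a grid, run here on a tiny matrix. The helper names (grid_shape, split_blocks, blocked_matmul) are hypothetical and are not taken from dislib or COMPSs; in the real workflow each block product would presumably be a GPU task.

```python
def grid_shape(shape, block_size):
    """Number of blocks along each axis (integer ceiling division)."""
    return tuple(-(-s // b) for s, b in zip(shape, block_size))

def split_blocks(matrix, block_rows, block_cols):
    """Split a list-of-lists matrix into a 2-D grid of blocks."""
    n_rows, n_cols = len(matrix), len(matrix[0])
    return [
        [
            [row[c:c + block_cols] for row in matrix[r:r + block_rows]]
            for c in range(0, n_cols, block_cols)
        ]
        for r in range(0, n_rows, block_rows)
    ]

def matmul(a, b):
    """Plain dense product of two small list-of-lists matrices."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def add(a, b):
    """Element-wise sum of two equally shaped matrices."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def blocked_matmul(a_blocks, b_blocks):
    """C[i][j] = sum over k of A[i][k] @ B[k][j], one block product per (i, j, k)."""
    result = []
    for a_row in a_blocks:
        c_row = []
        for j in range(len(b_blocks[0])):
            acc = None
            for k, a_blk in enumerate(a_row):
                partial = matmul(a_blk, b_blocks[k][j])
                acc = partial if acc is None else add(acc, partial)
            c_row.append(acc)
        result.append(c_row)
    return result

# Block grids implied by the shapes quoted in the entries above.
a_grid = grid_shape((320, 56_900_000), (10, 11_380_000))
b_grid = grid_shape((56_900_000, 10), (11_380_000, 10))
c_grid = grid_shape((320, 10), (10, 10))

# Tiny stand-in matrices (illustrative only): A is 4 x 6, B is 6 x 2.
a = [[i * 6 + j for j in range(6)] for i in range(4)]
b = [[i - j for j in range(2)] for i in range(6)]
c_blocks = blocked_matmul(split_blocks(a, 2, 3), split_blocks(b, 3, 2))
```

With these shapes C has 32 row blocks, which matches the 32 GPUs used in the runs; whether the workflow maps exactly one row block to each GPU is an assumption.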
Name: K-Means GPU Cache OFF Contact Person: [email protected] Access Level: public License Agreement: Apache2 Platform: COMPSs Machine: Minotauro-MN4
K-Means running on GPUs. Launched using 32 GPUs (16 nodes). Parameters used: K=40 and 32 blocks of size (1_000_000, 1200). It creates a block for each GPU. Total dataset shape is (32_000_000, 1200). Version dislib-0.9
Average task execution time: 194 seconds
Type: COMPSs
Creators: Cristian Tatu, The Workflows and Distributed Computing Team
Submitter: Cristian Tatu
Name: K-Means GPU Cache ON Contact Person: [email protected] Access Level: public License Agreement: Apache2 Platform: COMPSs Machine: Minotauro-MN4
K-Means running on the GPU leveraging COMPSs GPU Cache for deserialization speedup. Launched using 32 GPUs (16 nodes). Parameters used: K=40 and 32 blocks of size (1_000_000, 1200). It creates a block for each GPU. Total dataset shape is (32_000_000, 1200). Version dislib-0.9
Average task execution time: 16 seconds
Type: COMPSs
Creators: Cristian Tatu, The Workflows and Distributed Computing Team
Submitter: Cristian Tatu
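The K-Means entries split a (32_000_000, 1200) dataset into 32 row blocks of 1_000_000 samples, one per GPU. A common way to run K-Means over such blocks is to let each block compute per-center partial sums and counts, then reduce the partials into new centers. The sketch below illustrates that pattern in plain Python on toy data; it is an assumption about the general approach, not dislib's actual implementation, and all function names are hypothetical.

```python
def nearest(point, centers):
    """Index of the center closest to point (squared Euclidean distance)."""
    return min(
        range(len(centers)),
        key=lambda c: sum((p - q) ** 2 for p, q in zip(point, centers[c])),
    )

def partial_sums(block, centers):
    """Per-block task: coordinate sums and sample counts for each center."""
    dim = len(centers[0])
    sums = [[0.0] * dim for _ in centers]
    counts = [0] * len(centers)
    for point in block:
        c = nearest(point, centers)
        counts[c] += 1
        for d, value in enumerate(point):
            sums[c][d] += value
    return sums, counts

def update_centers(blocks, centers):
    """One K-Means iteration: map over blocks, then reduce the partials."""
    partials = [partial_sums(block, centers) for block in blocks]  # one task per block
    new_centers = []
    for c, old in enumerate(centers):
        total = sum(counts[c] for _, counts in partials)
        if total == 0:
            new_centers.append(list(old))  # empty cluster: keep the old center
            continue
        new_centers.append(
            [sum(sums[c][d] for sums, _ in partials) / total for d in range(len(old))]
        )
    return new_centers

# Toy data: two blocks of two 2-D points each, and K=2 initial centers.
blocks = [[[0.0, 0.0], [0.0, 2.0]], [[10.0, 10.0], [10.0, 12.0]]]
centers = update_centers(blocks, [[1.0, 1.0], [9.0, 9.0]])
```

Only the small per-center sums and counts cross block boundaries in this scheme, so each large block can stay on the device that processes it.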
Name: Dislib Distributed Training - Cache ON Contact Person: [email protected] Access Level: public License Agreement: Apache2 Platform: COMPSs Machine: Minotauro-MN4
PyTorch distributed training of a CNN on GPUs, leveraging the COMPSs GPU Cache for deserialization speedup. Launched using 32 GPUs (16 nodes). Dataset: ImageNet. Versions: dislib-0.9, PyTorch 1.7.1+cu101
Average task execution time: 36 seconds
Type: COMPSs
Creators: Cristian Tatu, The Workflows and Distributed Computing Team
Submitter: Cristian Tatu
Name: Dislib Distributed Training - Cache OFF Contact Person: [email protected] Access Level: public License Agreement: Apache2 Platform: COMPSs Machine: Minotauro-MN4
PyTorch distributed training of a CNN on GPUs. Launched using 32 GPUs (16 nodes). Dataset: ImageNet. Versions: dislib-0.9, PyTorch 1.7.1+cu101
Average task execution time: 84 seconds
Type: COMPSs
Creators: Cristian Tatu, The Workflows and Distributed Computing Team
Submitter: Cristian Tatu
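The average task execution times reported for the Cache ON and Cache OFF runs can be compared directly. The short sketch below computes the per-task speedup from the figures quoted in the K-Means and Dislib Distributed Training entries (roughly 12x and 2.3x). These are ratios of average task times only; they say nothing about end-to-end workflow time, which the entries do not report.

```python
def speedup(cache_off_seconds, cache_on_seconds):
    """Ratio of the Cache OFF average task time to the Cache ON one."""
    return cache_off_seconds / cache_on_seconds

# (Cache OFF, Cache ON) average task execution times in seconds, as listed.
reported = {
    "K-Means": (194, 16),
    "Dislib Distributed Training": (84, 36),
}

for name, (off, on) in reported.items():
    print(f"{name}: {speedup(off, on):.1f}x faster per task with the GPU Cache")
```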
Name: TruncatedSVD (Randomized SVD) Contact Person: [email protected] Access Level: public License Agreement: Apache2 Platform: COMPSs Machine: MareNostrum4
TruncatedSVD (Randomized SVD) computing just 456 singular values of a (3.6M x 1200) matrix. The input matrix represents a CFD transient simulation of air moving past a cylinder. This application used dislib-0.9.0
Type: COMPSs
Creators: Cristian Tatu, The Workflows and Distributed Computing Team
Submitter: Cristian Tatu
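For readers unfamiliar with randomized SVD, the textbook scheme can be outlined as follows for a matrix A with m rows and n columns, a target rank k, and a small oversampling parameter p. In the entry above m is about 3.6M, n = 1200 and k = 456; the value of p and the exact variant that dislib implements are not stated in the listing, so this is only a sketch of the standard method.

```latex
\begin{align*}
\Omega &\in \mathbb{R}^{n \times (k+p)}, \quad \Omega_{ij} \sim \mathcal{N}(0, 1)
  && \text{(random test matrix)} \\
Y &= A\,\Omega \in \mathbb{R}^{m \times (k+p)}
  && \text{(sample the range of } A\text{)} \\
Y &= Q R, \quad Q^{\top} Q = I
  && \text{(orthonormal basis via QR)} \\
B &= Q^{\top} A \in \mathbb{R}^{(k+p) \times n}
  && \text{(project onto the basis)} \\
B &= \tilde{U}\,\Sigma\,V^{\top}
  && \text{(small dense SVD)} \\
A &\approx (Q\,\tilde{U}_k)\,\Sigma_k\,V_k^{\top}
  && \text{(keep the leading } k \text{ triplets)}
\end{align*}
```

The only operations that touch the tall matrix A are the two products A Omega and Q^T A, which is what makes the method attractive when m is in the millions and only a few hundred singular values are needed.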