Name: Word Count
Contact Person: [email protected]
Access Level: public
License Agreement: Apache2
Platform: COMPSs
Wordcount is an application that counts the number of words for a given set of files.
To allow parallelism, the file is divided into blocks that are processed separately and merged afterwards.
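The split-and-merge pattern described above can be sketched in plain Python. This is a minimal sequential illustration, not the application's actual source: the function names `read_blocks`, `count_block`, `merge_counts` and `word_count` are hypothetical, the block size is assumed to be a number of characters, and words are assumed to be separated by whitespace. In a COMPSs run, the per-block counting would be the natural unit of parallel work.

```python
from collections import Counter
from functools import reduce


def read_blocks(text, block_size):
    """Yield consecutive slices of `text` of roughly `block_size` characters.

    Each cut is pushed forward to the next whitespace character so that no
    word is split across two blocks (an assumption of this sketch).
    """
    start = 0
    while start < len(text):
        end = min(start + block_size, len(text))
        # Extend the block until whitespace or the end of the text.
        while end < len(text) and not text[end].isspace():
            end += 1
        yield text[start:end]
        start = end


def count_block(block):
    """Count the words in one block (the independent, parallelisable step)."""
    return Counter(block.split())


def merge_counts(left, right):
    """Merge two partial results by summing the counts per word."""
    return left + right


def word_count(text, block_size):
    """Count words block by block, then merge the partial results."""
    partials = [count_block(block) for block in read_blocks(text, block_size)]
    return reduce(merge_counts, partials, Counter())
```

Because each block is counted independently, the only shared step is the final merge, which is why a smaller block size yields more independent units of work.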
Results are written to a binary Pickle file, so they can be checked using: python -m pickle result.txt
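Besides the command-line check, the result can also be loaded from Python with the standard `pickle` module. The sketch below assumes the file holds a single pickled object (for example a mapping from words to counts; the exact structure is not stated here), and `load_result` is a hypothetical helper name.

```python
import pickle


def load_result(path):
    """Load the single pickled object stored at `path`."""
    # Pickle files are binary, so the file must be opened in "rb" mode.
    with open(path, "rb") as handle:
        return pickle.load(handle)
```

For example, `load_result("result.txt")` would return the stored object for inspection. Only unpickle files from a trusted source, since loading a pickle can execute arbitrary code.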
This example also shows how to manually add input or output datasets to the workflow provenance recording (using the 'input' and 'output' terms in the ro-crate-info.yaml file).
Execution instructions
runcompss --lang=python $(pwd)/application_sources/src/ filePath resultPath blockSize
- filePath: Absolute path of the file to parse
- resultPath: Absolute path to the result file
- blockSize: Size of each block. The lower the number, the more tasks will be generated in the workflow
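The relationship between blockSize and the amount of work can be estimated with a ceiling division. This is only a rough sketch: it assumes one counting task per block and that the file size and block size are expressed in the same unit, neither of which is confirmed here, and it ignores any additional merge tasks. The name `estimate_blocks` is hypothetical.

```python
def estimate_blocks(file_size, block_size):
    """Estimate how many blocks a file of `file_size` splits into."""
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    # Ceiling division using integers only: a partial last block still counts.
    return -(-file_size // block_size)
```

Under these assumptions, a file of 3000 units with a block size of 300 gives 10 blocks, while halving the block size to 150 doubles that to 20.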
Execution Examples
runcompss --lang=python $(pwd)/application_sources/src/ $(pwd)/dataset/data/compss.txt result.txt 300
runcompss $(pwd)/application_sources/src/ $(pwd)/dataset/data/compss.txt result.txt 300
python -m pycompss $(pwd)/application_sources/src/ $(pwd)/dataset/data/compss.txt result.txt 300
No build is required
Version History
COMPSs 3.3 (earliest) Created 15th Dec 2023 at 14:57 by Raül Sirvent
Run with COMPSs 3.3 on the MareNostrum IV supercomputer, using 1 node (48 cores).

Additional credit
The Workflows and Distributed Computing Team (
Created: 15th Dec 2023 at 14:57
