Single node pipeline processing #58

Open
manuparra opened this issue Jun 2, 2023 · 2 comments

@manuparra

Hi all, I've been reviewing the code and I see that the pipeline scripts are tightly coupled to their execution under SLURM (which makes sense, since the pipeline is designed for that kind of cluster). We are running some tests to execute the pipeline without SLURM, directly using Singularity, but the scripts reference SLURM-specific variables and procedures, which makes this complicated.

Do you think it could be ported to a model without SLURM?

@Jordatious (Collaborator) commented Jun 2, 2023

Hi @manuparra, supporting other non-SLURM platforms is generally on the roadmap, although I must admit that single-node/VM support isn't a strong part of that, as multi-node processing is a core part of the pipeline's design. However, I believe one can use MPI on a single node/VM anyway, assuming there are enough cores to make it worthwhile. A few tasks like tclean can also make use of multiple cores through OpenMP.
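For reference, a rough sketch of what either of those looks like on a single node (the core counts and the .py script names below are just placeholders, not the pipeline's actual calls):

# Option 1: MPI parallelism on a single node via mpicasa (ships with CASA);
# core count and script name are placeholders
mpicasa -n 8 casa --nogui --log2term -c partition.py

# Option 2: OpenMP threads for tasks that support them (e.g. tclean),
# when running plain (non-MPI) CASA
export OMP_NUM_THREADS=8
casa --nogui --log2term -c quick_tclean.py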

So, with a few tweaks, I have used the pipeline successfully on a single VM. However, that was before version 1.1, in which we introduced SPW splitting.

To do this, one can take the sbatch scripts the pipeline writes out and simply run them as bash scripts, since the #SBATCH lines are commented out. The tweaks include removing the SLURM srun wrapper, which is easily done within the code, or otherwise making srun a script/alias or something the system understands (see the shim sketch just after the command chain below). Another tweak is to write your own submit_pipeline.sh script where, instead of using SLURM dependencies, you use the bash && operator; this achieves a similar effect by running each job only after the previous one finishes successfully, and stopping the chain if a job crashes. So, for example, if you first make all the sbatch scripts executable (e.g. with chmod +x validate_input.sbatch) and use the default scripts in the scripts config parameter, you could write something like this:

./partition.sbatch && ./validate_input.sbatch && ./flag_round_1.sbatch && ./calc_refant.sbatch && ./setjy.sbatch && ./xx_yy_solve.sbatch && ./xx_yy_apply.sbatch && ./flag_round_2.sbatch && ./xx_yy_solve.sbatch && ./xx_yy_apply.sbatch && ./split.sbatch && ./quick_tclean.sbatch
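As for the srun tweak, the crudest possible stand-in is just a pass-through script placed somewhere on your PATH. This is only a sketch: it assumes srun is invoked as srun <command> [args...] with no srun-specific flags, which would otherwise need stripping.

#!/bin/bash
# Minimal srun stand-in: ignore SLURM entirely and just run the wrapped command.
# Save it as e.g. ~/bin/srun, make it executable, and put ~/bin early in PATH.
exec "$@"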

And you might also want to look at redirecting the output to a log within your sbatch scripts, with something like this:

1> logs/validate_input.out 2> logs/validate_input.err
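Putting those pieces together, a hand-rolled submit_pipeline.sh could look something like the sketch below (the step names assume the default scripts config parameter, so adjust them to your own setup):

#!/bin/bash
# Hand-rolled submit_pipeline.sh sketch: run each step in order, stop on the
# first failure (set -e, equivalent to chaining with &&), and keep per-step logs.
set -e
mkdir -p logs
for step in partition validate_input flag_round_1 calc_refant setjy \
            xx_yy_solve xx_yy_apply flag_round_2 xx_yy_solve xx_yy_apply \
            split quick_tclean; do
    echo "Running ${step}..."
    # Note: repeated steps (e.g. xx_yy_solve) will overwrite their earlier logs
    ./${step}.sbatch 1> "logs/${step}.out" 2> "logs/${step}.err"
done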

However, doing this with SPW splitting (where nspw > 1) would require a bit more thought. Writing a script to launch the custom submit_pipeline.sh scripts inside each SPW directory wouldn't be difficult, but the trick after that would be automatically running the post-cross-calibration scripts, such as concatenation, further selfcal and science imaging. It would be easy enough to split that off into a separate step following the previous example; it would just require further intervention by the user after the first cross-calibration steps have run over all SPWs.

Another trick that might be useful is the [-l --local] option, which bypasses SLURM/srun and builds the pipeline without it.

Are you thinking of doing some of this development yourself? What's the platform and software you're using? I'd be happy to walk you through doing some of these things if it's useful.

@Jordatious (Collaborator)

Hi @manuparra, I suppose you could consider a high-level script that runs the custom submit_pipeline.sh scripts inside the SPW directories, and then also uses the && and || operators (e.g. where you wish to run concat, selfcal and science imaging even if some SPWs failed) to run the final imaging steps in a similar way.
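As a rough sketch (the SPW directory glob and the post-calibration script names below are assumptions, so substitute whatever your setup actually writes out):

#!/bin/bash
# Top-level wrapper sketch for nspw > 1: run cross-calibration in every SPW
# directory, then run the post-cross-calibration steps even if some SPWs failed.
for spw in */; do
    [ -x "${spw}submit_pipeline.sh" ] || continue
    ( cd "$spw" && ./submit_pipeline.sh ) \
        || echo "Cross-calibration failed in ${spw}, continuing anyway"
done
# Post-cross-calibration steps (concatenation, selfcal, science imaging);
# these script names are placeholders for whatever the pipeline generates.
./concat.sbatch && ./selfcal_part1.sbatch && ./science_image.sbatch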

Of course, beyond that, you could abandon the bash approach altogether and use Python or something else, but that may require quite a bit of development.
