Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

JNB:BUG: Updated Suite2P example workflow for new Suite2p API and multiprocessing bug #181

Closed
wants to merge 164 commits into from

Conversation

swkeemink
Copy link
Member

@swkeemink swkeemink commented Jun 16, 2021

The Suite2p notebook had stopped working due to an API update in Suite2p, as well as an undetermined change in either Suite2p or some dependency that broke multiprocessing in FISSA (see #147), but only after Suite2p was first run. This PR fixes both.

A bit of background and summary fixing the multiprocessing issue:

First, the basics of spawn and fork processes --
https://docs.python.org/3/library/multiprocessing.html
The 'spawn' context starts a fresh python interpreter process, and is slower to initialize. This is default in macOS and Windows.
The 'fork' context inherets from another process, and is faster to initialize. This is default on Unix.

The 'fork' context can get in trouble with improper attribution of Pools (as per here https://pythonspeed.com/articles/python-multiprocessing/). This is normally not a problem, but is a problem if we just ran Suite2p in the same Python instance before. Indeed, by using the spawn context within FISSA we can fix this problem. It does make initializing Pools a lot slower however, and for example as a result on my system running pytest once goes from 4-5 seconds to ~40 seconds.

By setting

from multiprocessing import set_start_method
set_start_method("spawn") 

at the start of any code that wants to run both Suite2p and FISSA in the same Python instance we can make sure they work nicely together.

This fixes #150 and #147.

@swkeemink
Copy link
Member Author

@scottclowe : it would be very useful if you could try running the new notebook to check if this fix works across systems. I'll see if I can get a Windows test-run also.

@codecov
Copy link

codecov bot commented Jun 16, 2021

Codecov Report

Merging #181 (4380376) into master (fd9d282) will not change coverage.
The diff coverage is n/a.

❗ Current head 4380376 differs from pull request most recent head fa5a1f5. Consider uploading reports for the commit fa5a1f5 to get more accurate results
Impacted file tree graph

@@           Coverage Diff           @@
##           master     #181   +/-   ##
=======================================
  Coverage   92.65%   92.65%           
=======================================
  Files           8        8           
  Lines         831      831           
  Branches      165      165           
=======================================
  Hits          770      770           
  Misses         31       31           
  Partials       30       30           
Flag Coverage Δ
unittests 92.53% <ø> (ø)

Flags with carried forward coverage won't be shown. Click here to find out more.


Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update fd9d282...fa5a1f5. Read the comment docs.

@scottclowe
Copy link
Member

Can you please update .binder/environment.yml to be the output of:

conda activate your_fissa_suite2p_env_name
conda env export --from-history -f environment.yml

@swkeemink
Copy link
Member Author

conda env export --from-history -f environment.yml
doesn't give anything (just a barebones enviornment file), as probably I pip installed the requirements? Secondly, the environment is in Python 3.9, but environment.yml is for Python 3.6. Should we update the environment to be Python 3.9 or should I try to get it running in Python 3.6?

@scottclowe
Copy link
Member

I think the latest release of Suite2p requires python 3.7, but it is hard to be sure.

We need a working environment for Binder to run. At the moment Binder can't initialise the environment at all. If you can get it running in Python 3.6, that would be great as that is compatible with SIMA. If not, any other version of Python is fine. I think we need to specify more things in the environment.yml to prevent it from breaking in the future, but the problem is a complete export of everything into environment.yml includes system specific things in site-packages, which shouldn't be included. The command I gave above was the instruction from Binder on how to export an environment that didn't include those, but apparently it doesn't include enough.

If we can't get an environment which works with Suite2p for Binder, we can remove Suite2p from the Binder. the link to Suite2p on Binder and just use a streamlined environment which runs the fissa-only notebooks.

@scottclowe
Copy link
Member

An alternative solution for the multiprocessing issue (#147) would be to use joblib, an alternative to the multiprocessing built into Python, which was recommended to me. Apparently it is used by sklearn.

@swkeemink
Copy link
Member Author

I think the latest release of Suite2p requires python 3.7, but it is hard to be sure.

We need a working environment for Binder to run. At the moment Binder can't initialise the environment at all. If you can get it running in Python 3.6, that would be great as that is compatible with SIMA. If not, any other version of Python is fine. I think we need to specify more things in the environment.yml to prevent it from breaking in the future, but the problem is a complete export of everything into environment.yml includes system specific things in site-packages, which shouldn't be included. The command I gave above was the instruction from Binder on how to export an environment that didn't include those, but apparently it doesn't include enough.

Yes Suite2p will only work with Python >3.7 because of importing annotation from future (https://www.python.org/dev/peps/pep-0563/#enabling-the-future-behavior-in-python-3-7).

So basically we can't get SIMA and Suite2p running on the same environment.

If we can't get an environment which works with Suite2p for Binder, we can remove Suite2p from the Binder. the link to Suite2p on Binder and just use a streamlined environment which runs the fissa-only notebooks.
But we could still run SIMA since it works anyway?

An alternative solution for the multiprocessing issue (#147) would be to use joblib, an alternative to the multiprocessing built into Python, which was recommended to me. Apparently it is used by sklearn.
That looks very good actually, but would want to do that as a separate PR, as it is a significant backend-change. (also want to do a bit more reading on joblib)

@scottclowe
Copy link
Member

If we can't get an environment which works with Suite2p for Binder, we can remove Suite2p from the Binder. the link to Suite2p on Binder and just use a streamlined environment which runs the fissa-only notebooks.

But we could still run SIMA since it works anyway?

When picking between the two, I would rather Binder had a Python 3.7+ environment that runs Suite2p than a Python 3.6 environment that runs SIMA. Because SIMA is no longer maintained, whereas Suite2p is, and I think Suite2p is more popular. But the environment needs to include a locked Suite2p version, to prevent the notebook breaking in the future.

BTW, you will need to rebase onto master to fix the merge conflicts. The conflict should just be about blackening the notebooks. Now that we have cleared versions of the notebooks, it should be much more practical to resolve the merge conflict.

@swkeemink
Copy link
Member Author

swkeemink commented Jun 23, 2021

BTW, you will need to rebase onto master to fix the merge conflicts. The conflict should just be about blackening the notebooks. Now that we have cleared versions of the notebooks, it should be much more practical to resolve the merge conflict.

Yes absolutely amazing update for tracking changes to notebooks, I found it very useful when doing #200.

@swkeemink
Copy link
Member Author

swkeemink commented Jun 23, 2021

When picking between the two, I would rather Binder had a Python 3.7+ environment that runs Suite2p than a Python 3.6 environment that runs SIMA. Because SIMA is no longer maintained, whereas Suite2p is, and I think Suite2p is more popular. But the environment needs to include a locked Suite2p version, to prevent the notebook breaking in the future.

Yes Suite2p is a lot more popular. In that case I'll make a 3.9 environment in which all the notebooks except SIMA work.

@swkeemink
Copy link
Member Author

Updating the yaml is taking a while because I'm having trouble getting the updated environment to work as a separate install.

Additionally, should we just put at the top of the sima notebook 'sima will only run on python 3.6, and thus you cannot run this notebook in the binder'? Or can we just not make it show up on binder (to start with, removing the link from the readme).

@scottclowe
Copy link
Member

Additionally, should we just put at the top of the sima notebook 'sima will only run on python 3.6, and thus you cannot run this notebook in the binder'? Or can we just not make it show up on binder (to start with, removing the link from the readme).

Yes, we should do both of these.

scottclowe and others added 28 commits June 28, 2021 14:13
Too many functions and methods named "separate".
We no longer support the SIMA notebook on Binder, as it requires Python
3.6.
@swkeemink swkeemink closed this Jun 29, 2021
@swkeemink swkeemink deleted the suite2p_notebook branch June 29, 2021 15:55
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Update Suite2P implementation
2 participants