
Pytest-cov leaves .coverage.hostname.number.number data files if running tests against multiprocessing.Pool #250

Closed
manycoding opened this issue Dec 24, 2018 · 22 comments · Fixed by #265


manycoding commented Dec 24, 2018

Running pytest with coverage for tests that use multiprocessing.Pool generates several data files, and not all of them are cleaned up afterwards. It looks similar to #100.
Pool.join() doesn't seem to change anything.
Example filename: .coverage.myhost.local.7582.948066

Here's the repository to reproduce: https://github.com/manycoding/pytest_cov_pool_datafiles_250

Your operating system name and version:
Mac OS Mojave 10.14 (18A391)

Any details about your local setup that might be helpful in troubleshooting:

pipenv shell; pipenv install

Pipfile

pytest = "*"
pytest-cov = "*"
pytest-mock = "*"
tox-pipenv = "*"
pytest-pythonpath = "*"

Detailed steps to reproduce the bug:

pytest --cov=src --cov-report=term-missing  tests/test_cov_pool.py
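
For illustration, here is a minimal, hypothetical sketch of the kind of test that triggers this; the actual reproduction lives in the repository linked above:

# test_cov_pool.py -- hypothetical sketch, not the repository's actual code.
from multiprocessing import Pool

def square(x):
    return x * x

def get_items_with_pool(count):
    # Pool used as a context manager -- the pattern discussed later in this thread.
    with Pool(2) as p:
        return p.map(square, range(count))

def test_get_items_with_pool():
    assert get_items_with_pool(4) == [0, 1, 4, 9]
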
manycoding (Author)

@ionelmc Hi, any thoughts on this?

blueyed (Contributor) commented Feb 14, 2019

Try running coverage combine manually afterwards.
Maybe it gets created after pytest-cov finishes? (check the timestamps)

manycoding (Author) commented Feb 14, 2019

It gets created during pytest execution and I see warnings on these files:

Coverage.py warning: Data file '.coverage.myhost.local.26934.807659' doesn't seem to be a coverage data file: Couldn't use data file '.coverage.myhost.local.26934.807659': no such table: coverage_schema

If I execute coverage report separately, no files are created.
coverage combine throws the same errors as above.

blueyed (Contributor) commented Feb 14, 2019

no such table

That looks like it is using the new format from coveragepy 5.
And that might be caused by using different configs, or rather by something not using the config (e.g. when run via tox / in a temporary dir).

You can try exporting COVERAGE_DEBUG (see the coveragepy docs) to get more information.

manycoding (Author) commented Feb 14, 2019

It happens with both coverage 5 and coverage 4.5.2.
Here is the log I got with export COVERAGE_DEBUG=config,multiproc,sys,plugin:
https://gist.github.com/manycoding/6e92cdb3d994085c7660f7f2127a5dbe

blueyed (Contributor) commented Feb 14, 2019

Looking at "config_reads:" indicates that it always uses your tox.ini, that looks fine.
I would also add "process" to COVERAGE_DEBUG.

What I've meant is that the error looks like coverage 4 might read a file created by 5.

manycoding (Author)

@blueyed I use pipenv and clean envs, so there's no way that coverage 4 and 5 files can be mixed.

manycoding (Author)

Updated the gist above with the debug output that includes process.

blueyed (Contributor) commented Feb 15, 2019

So then you were using coverage 5 with the following error:

Coverage.py warning: Data file '.coverage.myhost.local.26934.807659' doesn't seem to be a coverage data file: Couldn't use data file '.coverage.myhost.local.26934.807659': no such table: coverage_schema

Have you inspected the file? Is it maybe just empty? Maybe coveragepy is crashing after all.

Anyway, I guess I cannot really help further.

manycoding (Author) commented Feb 15, 2019

Yep, it's empty, and I think it's the same problem as in #100, appearing for no particular reason.

I tested it with both coverage 4 and 5, and the problem persists in both versions.

blueyed (Contributor) commented Feb 15, 2019

You could try to figure out where the file gets created: either from the logs (search them for the pid in the filename), and/or by bisecting your tests (i.e. run the first 50%, then the rest, etc.) to narrow down which test creates it.

ionelmc (Member) commented Feb 16, 2019

So I've traced the example a bit. It turns out that once in a while there's an unclean shutdown, e.g.:

551756 19:25:52.717097916 3 py.test (27575) < openat fd=10(<f>/code/.coverage.ec28a80d3a90.3493.150512) dirfd=-100(AT_FDCWD) name=/code/.coverage.ec28a80d3a90.3493.150512 flags=4358(O_TRUNC|O_CREAT|O_WRONLY|O_CLOEXEC) mode=0666
551757 19:25:52.717099916 3 py.test (27575) > fstat fd=10(<f>/code/.coverage.ec28a80d3a90.3493.150512)
551758 19:25:52.717101016 3 py.test (27575) < fstat res=0
551759 19:25:52.717103516 3 py.test (27575) > ioctl fd=10(<f>/code/.coverage.ec28a80d3a90.3493.150512) request=5401 argument=7FFF9CA7E740
551760 19:25:52.717104616 3 py.test (27575) < ioctl res=-25(ENOTTY)
551761 19:25:52.717108516 3 py.test (27575) > lseek fd=10(<f>/code/.coverage.ec28a80d3a90.3493.150512) offset=0 whence=1(SEEK_CUR)
551762 19:25:52.717109016 3 py.test (27575) < lseek res=0
551763 19:25:52.717111916 3 py.test (27575) > ioctl fd=10(<f>/code/.coverage.ec28a80d3a90.3493.150512) request=5401 argument=7FFF9CA7E6C0
551764 19:25:52.717112316 3 py.test (27575) < ioctl res=-25(ENOTTY)
551765 19:25:52.717133316 3 py.test (27575) > lseek fd=10(<f>/code/.coverage.ec28a80d3a90.3493.150512) offset=0 whence=1(SEEK_CUR)
551766 19:25:52.717134116 3 py.test (27575) < lseek res=0
551767 19:25:52.717173016 0 py.test (27500) > write fd=12(<p>) size=8
551768 19:25:52.717175917 0 py.test (27500) < write res=8 data=
N.
551769 19:25:52.717189917 3 py.test (27575) > lstat
551770 19:25:52.717193517 3 py.test (27575) < lstat res=0 path=/usr
551771 19:25:52.717200517 3 py.test (27575) > lstat
551772 19:25:52.717202417 3 py.test (27575) < lstat res=0 path=/usr/lib
551773 19:25:52.717204817 0 py.test (27500) > wait4
551774 19:25:52.717206917 3 py.test (27575) > lstat
551775 19:25:52.717208617 3 py.test (27575) < lstat res=0 path=/usr/lib/python3.7
551776 19:25:52.717212117 0 py.test (27500) < wait4
551777 19:25:52.717213017 3 py.test (27575) > lstat
551778 19:25:52.717214717 3 py.test (27575) < lstat res=0 path=/usr/lib/python3.7/json
551779 19:25:52.717218917 3 py.test (27575) > lstat
551780 19:25:52.717220317 0 py.test (27500) > wait4
551781 19:25:52.717220917 3 py.test (27575) < lstat res=0 path=/usr/lib/python3.7/json/encoder.py
551782 19:25:52.717221817 0 py.test (27500) < wait4
551783 19:25:52.717229717 0 py.test (27500) > kill pid=3492 sig=15(SIGTERM)
551784 19:25:52.717247717 0 py.test (27500) < kill res=0
551785 19:25:52.717251317 0 py.test (27500) > wait4
551786 19:25:52.717252017 0 py.test (27500) < wait4
551787 19:25:52.717255217 0 py.test (27500) > kill pid=3493 sig=15(SIGTERM)
551788 19:25:52.717256217 1 py.test (27574) > signaldeliver spid=0 dpid=27574(py.test) sig=9(SIGKILL)
551789 19:25:52.717266017 0 py.test (27500) < kill res=0
551790 19:25:52.717268717 3 py.test (27575) > signaldeliver spid=0 dpid=27575(py.test) sig=9(SIGKILL)

ionelmc (Member) commented Feb 16, 2019

So this really simple trick (/s) makes the problem go away:

@@ -14,7 +14,8 @@ def get_items_with_pool(count: int, start_index: int = 0):
     processes_count = os.cpu_count()

     batch_size = math.ceil(count / processes_count)
-    with Pool(processes_count) as p:
+    p = Pool(processes_count)
+    try:
         results = p.starmap(
             get_items,
             zip(
@@ -24,10 +25,7 @@ def get_items_with_pool(count: int, start_index: int = 0):
         )
         for items_batch in results:
             items.extend(items_batch)
-    p.join()
+    finally:
+        p.close()
+        p.join()
     return items

For some reason Pool.__exit__ sends KILL in addition to TERM ...
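
As a self-contained sketch of the same workaround (function names are illustrative, not the repository's actual code): close() the pool and then join() it explicitly, so the workers exit cleanly and coverage gets a chance to flush its data files, instead of relying on Pool.__exit__, which calls terminate().

import os
from multiprocessing import Pool

def get_items(start, size):
    return list(range(start, start + size))

def get_items_with_pool(count, batch_size=10):
    items = []
    p = Pool(os.cpu_count())
    try:
        starts = range(0, count, batch_size)
        results = p.starmap(get_items, ((s, min(batch_size, count - s)) for s in starts))
        for batch in results:
            items.extend(batch)
    finally:
        p.close()  # no more work will be submitted
        p.join()   # wait for the workers to exit instead of killing them
    return items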

ionelmc (Member) commented Feb 16, 2019

So my plan for handling this situation is to offer users two solutions:

ionelmc (Member) commented Feb 16, 2019

So the secondary issue, if you use pytest_cov.embed.cleanup_on_sigterm, is that it overlaps with multiprocessing_start, which also registers a cleanup handler via multiprocessing.util.Finalize.

Thus sometimes, while the finalizer runs, the signal handler will also run and leave one unfinished file around (it calls os._exit and the other file is left dangling).
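
For reference, the pattern suggested in pytest-cov's subprocess-support docs is to register the SIGTERM cleanup handler in the code that will run inside the child process; a minimal sketch:

# Register pytest-cov's SIGTERM cleanup handler in the subprocess code.
# The try/except keeps runs without pytest-cov installed working.
try:
    from pytest_cov.embed import cleanup_on_sigterm
except ImportError:
    pass
else:
    cleanup_on_sigterm()

For multiprocessing.Pool itself, the explicit close()/join() pattern shown a few comments above remains the relevant fix.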

ssbarnea (Member) commented Feb 6, 2020

I am randomly seeing something very similar to this on CI, where we wipe everything between runs. See https://dashboard.zuul.ansible.com/t/ansible/build/28f84185f22a448f92f00f47253fc931

My impression is that it happens only on py27 jobs, but it is still a PITA and I do not know what to do to avoid it.

cgnorthcutt

I also see this issue on TravisCI, running Windows Server with Python 3.6 and Python 3.7. Scroll down to the bottom of the output here: https://travis-ci.com/cgnorthcutt/cleanlab/jobs/291188046

ssbarnea (Member)

This forced me to pin the coverage version below 5 for all versions of Python. Mainly this is caused by the fact that coverage expects all data files to be generated by the same version of coverage. If you happen to still support py27, you are forced to use the older coverage for all environments, not only those using py27.

I guess that the decision to drop py27 support backfires because it forces many users to avoid the newer version completely.

ssbarnea (Member)

It seems that I keep seeing these random-looking errors: https://dashboard.zuul.ansible.com/t/ansible/build/fa4b20a11a484ee39050b11f619484b1 -- still more likely on py27.

domeniconappo

Hi,
I also see those errors when I test some functions that use Dask for multiprocessing.
Could it be a regression? It seems the issue was fixed in Feb 2019, but it is popping up again.

Thanks!

GergelyKalmar

This issue seems to have appeared for me too with coverage 5.0 and pytest 5.4.2.

ionelmc (Member) commented May 21, 2020

@GergelyKalmar please open a new issue and add details.
