
Any ideas Re: #1030096 dask.distributed intermittent autopkgtest fail ?



(Background: the pandas + dask transition broke dask.distributed and it was hence removed from testing; I didn't notice at the time that if we don't get it back in we lose Spyder.)

On 05/02/2023 21:44, Rebecca N. Palmer wrote:
I currently have this in a state where it sometimes succeeds and sometimes doesn't:
https://salsa.debian.org/rnpalmer-guest/dask.distributed/-/tree/fix1030096

Tests I've seen fail multiple times (and don't have a fix for):
test_balance_expensive_tasks[enough work to steal]
https://salsa.debian.org/rnpalmer-guest/dask.distributed/-/jobs/3902376
(This seems to be the most common problem. Using @pytest.mark.flaky to retry up to 3 times doesn't seem to have helped, which suggests that once it fails, it keeps failing for the rest of that run. Applying part of upstream pull 7253 seemed to make things worse, but I haven't tried applying the whole thing.)
test_popen_timeout
https://salsa.debian.org/rnpalmer-guest/dask.distributed/-/jobs/3902745
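
To illustrate why retrying may not help with the first test above, here is a minimal plain-Python sketch. It is not the pytest plugin itself: retry() is a hypothetical stand-in for a rerun-on-failure mark, and the two test functions are invented for illustration. It shows that a failure caused by something that stays broken for the whole run fails every attempt, while a truly transient failure passes on a retry.

```python
import functools


def retry(max_runs=3):
    """Hypothetical stand-in for a rerun-on-failure mark: call the test up to max_runs times."""
    def decorator(test_func):
        @functools.wraps(test_func)
        def wrapper(*args, **kwargs):
            last_error = None
            for _ in range(max_runs):
                try:
                    return test_func(*args, **kwargs)
                except AssertionError as exc:
                    last_error = exc  # remember the failure and try again
            raise last_error  # every attempt failed
        return wrapper
    return decorator


# Count how often each illustrative test body actually runs.
attempts = {"transient": 0, "sticky": 0}

# Assumption for illustration: some condition that stays bad for the whole run.
environment_is_bad = True


@retry(max_runs=3)
def transient_test():
    attempts["transient"] += 1
    # Fails only on the first attempt, so a retry rescues it.
    assert attempts["transient"] > 1, "failed on the first attempt only"


@retry(max_runs=3)
def sticky_test():
    attempts["sticky"] += 1
    # Fails on every attempt in this run, so retries cannot rescue it.
    assert not environment_is_bad, "fails every time in this run"


def outcome(test_func):
    """Run a test and report 'passed' or 'failed' instead of raising."""
    try:
        test_func()
        return "passed"
    except AssertionError:
        return "failed"


results = {"transient": outcome(transient_test), "sticky": outcome(sticky_test)}
print(results, attempts)
```

If test_balance_expensive_tasks behaves like sticky_test here, a larger retry count would not be expected to change the result; that is only a guess consistent with what the CI jobs show.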

Tests I've seen fail once:
test_stress_scatter_death
https://salsa.debian.org/rnpalmer-guest/dask.distributed/-/jobs/3902040
test_tcp_many_listeners[asyncio]
https://salsa.debian.org/rnpalmer-guest/dask.distributed/-/jobs/3896327

And now test_failing_task_increments_suspicious (once):
https://salsa.debian.org/rnpalmer-guest/dask.distributed/-/jobs/3903956
(We don't have to pass build-i386 (as this is an arch:all package) or reprotest, but if these are effectively random failures, they could also occur in the build or in autopkgtest.)

I'm probably the wrong person to be working on this - I don't know enough about this package to say whether ignoring this kind of intermittent failure (as my 'flaky' marks do) is appropriate, or to have much idea how to actually fix it.

We could also try upgrading dask + dask.distributed to 2023.1, but that's a risky move at this point.

