Fixed random SlaveRecoveryTest.PingTimeoutDuringRecovery test failure. (#436) master
author cf-natali <cf.natali@gmail.com>
Sun, 7 Aug 2022 10:29:21 +0000 (11:29 +0100)
committer GitHub <noreply@github.com>
Sun, 7 Aug 2022 10:29:21 +0000 (11:29 +0100)
commit fb3d05b152fa0e158231cc71b629bc92b62c0a3c
tree 08ce8337dede78d9a6e9311f4c421ea21bc457ed
parent 8894191338e5e7e9a0cfb7abed6b29110eba9a31

This test would randomly fail with:
```
18:16:59 3: F0501 17:16:59.192818 19175 slave.cpp:1445] Check failed: state == DISCONNECTED || state == RUNNING || state == TERMINATING RECOVERING
```

The cause was that the test restarts the slave with the same PID, so
timers started by the previous slave process could still fire while the
new slave process was running.

In this specific case, the previous slave's ping timer fired while the
second slave instance was still recovering, which triggered the
assertion above.

Fixed by cancelling the `pingTimer` in the slave destructor.
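
The failure mode and the fix can be illustrated with a minimal standalone
sketch. This is not the Mesos or libprocess code: `Agent`, `registry`,
`timers`, `fireAll` and `staleTimerHitsNewAgent` are hypothetical names
invented for the illustration. It assumes a timer that remembers only the
target ID, so a timer left pending by a destroyed instance is delivered to
whichever instance currently owns that ID.

```cpp
#include <cassert>
#include <map>
#include <string>

enum class State { RECOVERING, RUNNING };

class Agent;

// Hypothetical registry: maps an ID to the instance that currently owns it.
static std::map<std::string, Agent*> registry;

// Hypothetical pending timers: each remembers the target ID, not the instance.
static std::map<int, std::string> timers;
static int nextTimerId = 0;

class Agent {
public:
  Agent(const std::string& pid, bool cancelInDestructor)
    : pid_(pid), cancelInDestructor_(cancelInDestructor) {
    registry[pid_] = this;
  }

  ~Agent() {
    // The fix modelled here: drop the pending ping timer on destruction so
    // it cannot reach a later instance that reuses the same ID.
    if (cancelInDestructor_ && pingTimer_ >= 0) {
      timers.erase(pingTimer_);
    }
    registry.erase(pid_);
  }

  void startPingTimer() {
    pingTimer_ = nextTimerId++;
    timers[pingTimer_] = pid_;
  }

  // Records the condition that the real code guards with a CHECK.
  void pingTimeout() {
    if (state == State::RECOVERING) {
      hitWhileRecovering = true;
    }
  }

  State state = State::RECOVERING;
  bool hitWhileRecovering = false;

private:
  std::string pid_;
  bool cancelInDestructor_;
  int pingTimer_ = -1;
};

// Delivers every pending timer to whichever instance owns the target ID now.
void fireAll() {
  for (const auto& entry : timers) {
    auto it = registry.find(entry.second);
    if (it != registry.end()) {
      it->second->pingTimeout();
    }
  }
  timers.clear();
}

// Returns true if a timer started by the first instance reaches the second
// instance while the second is still RECOVERING.
bool staleTimerHitsNewAgent(bool cancelInDestructor) {
  timers.clear();
  registry.clear();
  {
    Agent first("slave(1)", cancelInDestructor);
    first.state = State::RUNNING;
    first.startPingTimer();
  }  // `first` is destroyed here, possibly leaving its timer pending.

  Agent second("slave(1)", cancelInDestructor);  // Same ID, still RECOVERING.
  fireAll();
  return second.hitWhileRecovering;
}
```

In this model, `staleTimerHitsNewAgent(false)` returns `true` (the stale
timer reaches the recovering instance), while
`staleTimerHitsNewAgent(true)` returns `false`.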

Tested by running the test in a loop alongside a CPU-intensive
workload (`stress-ng --cpu $(nproc)0`).
src/slave/slave.cpp