[BK-GC] avoid blocking call in gc-thread
author Rajan Dhabalia <rdhabalia@apache.org>
Wed, 15 May 2019 06:35:18 +0000 (23:35 -0700)
committer Enrico Olivelli <eolivelli@gmail.com>
Wed, 15 May 2019 06:35:18 +0000 (08:35 +0200)
commit f5ddc36cc30c5bab5d2e8ebc5fa552c2ad0d6eea
tree aa8bbcefb14bee6c1c6f6523385e9bf00359edd0
parent d35aa22ade87969a4f8e932d925c5d134feb764b

### Motivation

Currently, three issues can block the GC thread forever, after which it cannot perform any further GC work. All three stem from blocking ZooKeeper operations that run without an effective timeout.

Bug fixes:
1. [GC - ScanAndCompareGarbageCollector](https://github.com/apache/bookkeeper/blob/master/bookkeeper-server/src/main/java/org/apache/bookkeeper/bookie/ScanAndCompareGarbageCollector.java#L142) passes its timeout in milliseconds to [LedgerManager](https://github.com/apache/bookkeeper/blob/master/bookkeeper-server/src/main/java/org/apache/bookkeeper/meta/LongHierarchicalLedgerManager.java#L166), which treats the value as seconds and converts it to milliseconds again, so a 30,000 ms timeout becomes a [30,000,000 ms timeout](https://github.com/apache/bookkeeper/blob/master/bookkeeper-server/src/main/java/org/apache/bookkeeper/util/ZkUtils.java#L245). So, fix the timeout unit used during GC.

2. GC makes a blocking call to list the children of the ledger znode, and sometimes the ZooKeeper callback never comes back, which blocks the GC thread forever. A timeout was recently added on the [object-waiting-lock](https://github.com/apache/bookkeeper/blob/master/bookkeeper-server/src/main/java/org/apache/bookkeeper/util/ZkUtils.java#L243-L248), but it has no effect: the wait sits inside a while loop, `object.wait(timeout)` returns without any exception when the timeout expires, and the GC thread keeps running in the loop.

3. Add a ZooKeeper timeout when deleting ledgers in the bookie; otherwise that call can also block the GC thread.
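Issues 1 and 2 can be illustrated with a minimal Java sketch. It is not the code from this patch: `BoundedWaitSketch`, `SyncObject`, `awaitDone`, and `doubleConvertedMillis` are hypothetical names chosen for illustration. The sketch shows how treating a millisecond value as seconds inflates it by a factor of 1000, and how a wait loop can be bounded by tracking a deadline, since `Object.wait(timeout)` simply returns when the timeout expires and a `while (!done)` loop would otherwise go back to waiting.

```java
import java.util.concurrent.TimeUnit;

final class BoundedWaitSketch {

    /** Hypothetical stand-in for the object a ZooKeeper callback would complete. */
    static final class SyncObject {
        boolean done = false;
    }

    /** Issue 1: a value already in milliseconds is treated as seconds and converted again. */
    static long doubleConvertedMillis(long timeoutMs) {
        return TimeUnit.SECONDS.toMillis(timeoutMs);
    }

    /**
     * Issue 2: wait until sync.done is set or the deadline passes.
     * Returns true if the callback completed, false on timeout or interruption.
     */
    static boolean awaitDone(SyncObject sync, long timeout, TimeUnit unit) {
        long deadlineNanos = System.nanoTime() + unit.toNanos(timeout);
        synchronized (sync) {
            while (!sync.done) {
                long remainingNanos = deadlineNanos - System.nanoTime();
                if (remainingNanos <= 0) {
                    // Without this check the loop would wait again after every timeout.
                    return false;
                }
                try {
                    // Timed Object.wait; it returns normally on timeout or notify.
                    TimeUnit.NANOSECONDS.timedWait(sync, remainingNanos);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt(); // preserve the interrupt status
                    return false;
                }
            }
            return true;
        }
    }
}
```

A caller such as a GC loop could treat a `false` return as "no answer from ZooKeeper in time" and move on instead of blocking.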

### Changes

Add a timeout to the ZooKeeper calls that bookie GC makes to verify deleted ledgers.
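As a rough illustration of bounding such a verification call, the sketch below waits on an asynchronous result for a limited time. It is an assumption-laden example, not the patch itself: `GcVerifySketch`, `Result`, and `verifyWithTimeout` are hypothetical names, the ZooKeeper lookup is represented by a plain `CompletableFuture<Boolean>`, and the choice to report `UNKNOWN` on timeout (so the caller can skip the ledger rather than delete it) is an assumed policy.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

final class GcVerifySketch {

    /** Outcome of a bounded existence check (hypothetical type). */
    enum Result { EXISTS, DELETED, UNKNOWN }

    /**
     * Waits at most timeoutMs for the metadata lookup to answer.
     * existsFuture is a stand-in for an asynchronous ZooKeeper read.
     */
    static Result verifyWithTimeout(CompletableFuture<Boolean> existsFuture, long timeoutMs) {
        try {
            boolean exists = existsFuture.get(timeoutMs, TimeUnit.MILLISECONDS);
            return exists ? Result.EXISTS : Result.DELETED;
        } catch (TimeoutException e) {
            // No answer in time: report UNKNOWN instead of blocking the caller.
            return Result.UNKNOWN;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt status
            return Result.UNKNOWN;
        } catch (ExecutionException e) {
            // The lookup itself failed; treat the ledger state as unknown.
            return Result.UNKNOWN;
        }
    }
}
```

With this shape, a GC pass would only act on `DELETED` and leave `UNKNOWN` ledgers for a later pass.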

Reviewers: Enrico Olivelli <eolivelli@gmail.com>, Sijie Guo <sijie@apache.org>, Rajan Dhabalia <rdhabalia@apache.org>

This closes #1940 from rdhabalia/verify_gc
13 files changed:
bookkeeper-server/src/main/java/org/apache/bookkeeper/bookie/ScanAndCompareGarbageCollector.java
bookkeeper-server/src/main/java/org/apache/bookkeeper/meta/CleanupLedgerManager.java
bookkeeper-server/src/main/java/org/apache/bookkeeper/meta/FlatLedgerManager.java
bookkeeper-server/src/main/java/org/apache/bookkeeper/meta/HierarchicalLedgerManager.java
bookkeeper-server/src/main/java/org/apache/bookkeeper/meta/LedgerManager.java
bookkeeper-server/src/main/java/org/apache/bookkeeper/meta/LegacyHierarchicalLedgerManager.java
bookkeeper-server/src/main/java/org/apache/bookkeeper/meta/LongHierarchicalLedgerManager.java
bookkeeper-server/src/main/java/org/apache/bookkeeper/meta/MSLedgerManagerFactory.java
bookkeeper-server/src/main/java/org/apache/bookkeeper/util/ZkUtils.java
bookkeeper-server/src/test/java/org/apache/bookkeeper/bookie/CompactionTest.java
bookkeeper-server/src/test/java/org/apache/bookkeeper/client/ParallelLedgerRecoveryTest.java
bookkeeper-server/src/test/java/org/apache/bookkeeper/meta/MockLedgerManager.java
metadata-drivers/etcd/src/main/java/org/apache/bookkeeper/metadata/etcd/EtcdLedgerManager.java