bookkeeper.git
2 months ago[BUILD] fix master branch broken http-core license check
ZhangJian He [Wed, 6 Apr 2022 15:37:28 +0000 (23:37 +0800)] 
[BUILD] fix master branch broken http-core license check

### Changes
- bkctl and bk-server use different `http-core` versions
- unify the `http-core` version to 4.4.15

Reviewers: Andrey Yegorov <None>, Nicolò Boschi <boschi1997@gmail.com>, Enrico Olivelli <eolivelli@gmail.com>

This closes #3183 from Shoothzj/fix-broken-license-check

2 months ago[website] add CI checks to validate the website (#3164)
Nicolò Boschi [Wed, 6 Apr 2022 09:17:07 +0000 (11:17 +0200)] 
[website] add CI checks to validate the website (#3164)

2 months agoBump minimist from 1.2.5 to 1.2.6 in /site3/website (#3179)
dependabot[bot] [Wed, 6 Apr 2022 09:07:50 +0000 (11:07 +0200)] 
Bump minimist from 1.2.5 to 1.2.6 in /site3/website (#3179)

Bumps [minimist](https://github.com/substack/minimist) from 1.2.5 to 1.2.6.
- [Release notes](https://github.com/substack/minimist/releases)
- [Commits](https://github.com/substack/minimist/compare/1.2.5...1.2.6)

---
updated-dependencies:
- dependency-name: minimist
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2 months agoBump node-forge from 1.2.1 to 1.3.1 in /site3/website (#3180)
dependabot[bot] [Wed, 6 Apr 2022 09:06:54 +0000 (11:06 +0200)] 
Bump node-forge from 1.2.1 to 1.3.1 in /site3/website (#3180)

Bumps [node-forge](https://github.com/digitalbazaar/forge) from 1.2.1 to 1.3.1.
- [Release notes](https://github.com/digitalbazaar/forge/releases)
- [Changelog](https://github.com/digitalbazaar/forge/blob/main/CHANGELOG.md)
- [Commits](https://github.com/digitalbazaar/forge/compare/v1.2.1...v1.3.1)

---
updated-dependencies:
- dependency-name: node-forge
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2 months agoUse netty maxDirectMemory instead of DirectMemoryUtils
ZhangJian He [Wed, 6 Apr 2022 08:13:13 +0000 (16:13 +0800)] 
Use netty maxDirectMemory instead of DirectMemoryUtils

### Motivation

Our `DirectMemoryUtils` has significant limitations and does not work well with other JVMs. Netty's `PlatformDependent.maxDirectMemory()` is more generic.

### Changes
Use `PlatformDependent.maxDirectMemory();` instead of `DirectMemoryUtils`

Reviewers: Enrico Olivelli <eolivelli@gmail.com>, Andrey Yegorov <None>, Matteo Merli <mmerli@apache.org>, Nicolò Boschi <boschi1997@gmail.com>

This closes #2989 from Shoothzj/direct-memory

2 months ago[build] support apple m1 build (#3175)
ZhangJian He [Wed, 6 Apr 2022 05:46:17 +0000 (13:46 +0800)] 
[build] support apple m1 build (#3175)

2 months ago[netty] remove no longer used properties io.netty.recycler.linkCapacity and io.netty...
Nicolò Boschi [Sun, 3 Apr 2022 16:09:49 +0000 (18:09 +0200)] 
[netty] remove no longer used properties io.netty.recycler.linkCapacity and io.netty.recycler.maxCapacity.default (#3172)

* [netty] remove no longer used properties io.netty.recycler.linkCapacity and io.netty.recycler.maxCapacity.default

* gradle

2 months agoBump netty version to 4.1.75.Final, grpc to 1.45.1 (#3163)
ZhangJian He [Sun, 3 Apr 2022 08:00:21 +0000 (16:00 +0800)] 
Bump netty version to 4.1.75.Final, grpc to 1.45.1 (#3163)

2 months ago[build] Complement missing maven plugin version (#3166)
ZhangJian He [Sat, 2 Apr 2022 19:22:13 +0000 (03:22 +0800)] 
[build] Complement missing maven plugin version (#3166)

2 months ago[security] Bump bc fips version from 1.0.2.1 to 1.0.2.3 (#3087)
ZhangJian He [Sat, 2 Apr 2022 19:21:18 +0000 (03:21 +0800)] 
[security] Bump bc fips version from 1.0.2.1 to 1.0.2.3 (#3087)

2 months ago[WEBSITE] Update current stable version to 4.14.4
Nicolò Boschi [Fri, 1 Apr 2022 19:54:24 +0000 (21:54 +0200)] 
[WEBSITE] Update current stable version to 4.14.4

### Motivation

The current stable version is 4.14.4, but the website still shows 4.11.1

### Changes

* Edit variables to point to 4.14.4
* Cleanup staging website "download" page

Reviewers: Yong Zhang <zhangyong1025.zy@gmail.com>

This closes #3154 from nicoloboschi/update-stable-version

3 months ago[build] Fix various spotbugs warnings (#3160)
Nicolò Boschi [Fri, 1 Apr 2022 07:37:57 +0000 (09:37 +0200)] 
[build] Fix various spotbugs warnings  (#3160)

3 months agoFix region/rack aware placement police replace bookie bug (#2642)
Hang Chen [Thu, 31 Mar 2022 13:39:47 +0000 (21:39 +0800)] 
Fix region/rack aware placement police replace bookie bug (#2642)

3 months ago[website] Update committers page
Nicolò Boschi [Thu, 31 Mar 2022 11:45:12 +0000 (13:45 +0200)] 
[website] Update committers page

### Changes
Update committers info

Reviewers: Andrey Yegorov <None>

This closes #3161 from nicoloboschi/update-committers-nicoloboschi

3 months agoRevert rocksdb compaction on checkpoint to reduce cpu intensive (#3144)
Hang Chen [Thu, 31 Mar 2022 07:08:13 +0000 (15:08 +0800)] 
Revert rocksdb compaction on checkpoint to reduce cpu intensive (#3144)

3 months agocatch onBookieRackChange exception (#3060)
Hang Chen [Thu, 31 Mar 2022 05:42:34 +0000 (13:42 +0800)] 
catch onBookieRackChange exception (#3060)

### Motivation
When we update the bookie rack info, the whole bookie list is used to update the rack topology. However, if the update fails for one bookie and throws an exception, the exception propagates and the remaining bookies' info is not updated in the rack topology, which affects ledger ensemble selection.

### Changes

Catch the bookie topology update exception to ensure the remaining bookies' info can be updated into the rack topology.
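
The pattern described above can be sketched as follows. This is a minimal illustration, not the actual BookKeeper code: `RackUpdateSketch`, `updateAll`, and the string bookie ids are hypothetical names chosen for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical stand-in for the rack topology update; not BookKeeper's classes.
class RackUpdateSketch {
    /**
     * Applies updateOne to every bookie. A failure for one bookie is recorded
     * and the loop continues, so the remaining bookies are still updated.
     * Returns the bookies whose update failed.
     */
    static List<String> updateAll(List<String> bookies, Consumer<String> updateOne) {
        List<String> failed = new ArrayList<>();
        for (String bookie : bookies) {
            try {
                updateOne.accept(bookie);
            } catch (RuntimeException e) {
                // Real code would log here; collecting keeps the sketch testable.
                failed.add(bookie);
            }
        }
        return failed;
    }

    public static void main(String[] args) {
        List<String> updated = new ArrayList<>();
        List<String> failed = updateAll(List.of("bk1", "bk2", "bk3"), b -> {
            if (b.equals("bk2")) {
                throw new IllegalStateException("rack lookup failed for " + b);
            }
            updated.add(b);
        });
        System.out.println("updated=" + updated + " failed=" + failed);
    }
}
```

With the try/catch inside the loop, `bk3` is still updated even though `bk2` throws.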

3 months agouse mockito.any instead of deprecated mockito.anyObject
ZhangJian He [Wed, 30 Mar 2022 23:10:25 +0000 (07:10 +0800)] 
use mockito.any instead of deprecated mockito.anyObject

### Changes
use mockito.any instead of deprecated mockito.anyObject

Reviewers: Andrey Yegorov <None>, Nicolò Boschi <boschi1997@gmail.com>

This closes #3152 from Shoothzj/use-mockito-any-instead-of-anyObject

3 months agoUpgrade ZooKeeper to 3.8.0 (#3145)
Enrico Olivelli [Wed, 30 Mar 2022 10:41:33 +0000 (12:41 +0200)] 
Upgrade ZooKeeper to 3.8.0 (#3145)

3 months agofix duplicate typeline for prometheus type (#3137)
ZhangJian He [Tue, 29 Mar 2022 06:29:00 +0000 (14:29 +0800)] 
fix duplicate typeline for prometheus type (#3137)

3 months ago[maven-release-plugin] prepare for next development iteration
Andrey Yegorov [Mon, 28 Mar 2022 23:01:51 +0000 (16:01 -0700)] 
[maven-release-plugin] prepare for next development iteration

3 months ago[maven-release-plugin] prepare branch branch-4.15
Andrey Yegorov [Mon, 28 Mar 2022 23:01:48 +0000 (16:01 -0700)] 
[maven-release-plugin] prepare branch branch-4.15

3 months agoFix NPE while reordering read-sequence for local-bookie ensemble policy
Rajan Dhabalia [Mon, 28 Mar 2022 22:54:35 +0000 (15:54 -0700)] 
Fix NPE while reordering read-sequence for local-bookie ensemble policy

### Motivation

When bookie sanity and autorecovery use the same conf file with the flag `reorderReadSequenceEnabled=true`, the bookie-sanity command throws an NPE because `LocalBookieEnsemblePlacementPolicy::reorderReadLACSequence` returns a null write set, which causes the sanity check to fail.

```
00:46:46.202 [BookKeeperClientWorker-OrderedExecutor-11-0] ERROR o.a.b.common.util.SafeRunnable       - Unexpected throwable caught
java.lang.NullPointerException: null
at org.apache.bookkeeper.client.PendingReadOp$SequenceReadRequest.sendNextRead(PendingReadOp.java:399)
at org.apache.bookkeeper.client.PendingReadOp$SequenceReadRequest.read(PendingReadOp.java:385)
at org.apache.bookkeeper.client.PendingReadOp.initiate(PendingReadOp.java:529)
at org.apache.bookkeeper.client.LedgerRecoveryOp.doRecoveryRead(LedgerRecoveryOp.java:148)
at org.apache.bookkeeper.client.LedgerRecoveryOp.access$000(LedgerRecoveryOp.java:37)
at org.apache.bookkeeper.client.LedgerRecoveryOp$1.readLastConfirmedDataComplete(LedgerRecoveryOp.java:109)
at org.apache.bookkeeper.client.ReadLastConfirmedOp.readEntryComplete(ReadLastConfirmedOp.java:135)
at org.apache.bookkeeper.proto.PerChannelBookieClient$ReadCompletion$1.readEntryComplete(PerChannelBookieClient.java:1829)
at org.apache.bookkeeper.proto.PerChannelBookieClient$ReadCompletion.handleReadResponse(PerChannelBookieClient.java:1910)
at org.apache.bookkeeper.proto.PerChannelBookieClient$ReadCompletion.handleV3Response(PerChannelBookieClient.java:1885)
at org.apache.bookkeeper.proto.PerChannelBookieClient$3.safeRun(PerChannelBookieClient.java:1446)
at org.apache.bookkeeper.common.util.SafeRunnable.run(SafeRunnable.java:36)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
at java.lang.Thread.run(Thread.java:748)

```

### Modification
Fix NPE for local ensemble policy while reading entry with `reorderReadSequenceEnabled` flag enabled.
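
The commit message does not say how the fix is implemented. One common way to avoid this class of NPE is sketched below under that caveat: a policy with no ordering preference returns its input unchanged, and the caller treats a null result as "keep the original order". `ReorderSketch`, `ReadOrderPolicy`, and `readOrder` are hypothetical names, and the write set is modeled as a plain list of bookie indexes.

```java
import java.util.List;

// Hypothetical types for illustration; not BookKeeper's placement policy API.
class ReorderSketch {
    /** Policy hook: a policy with no preference should return the input, never null. */
    interface ReadOrderPolicy {
        List<Integer> reorderReadSequence(List<Integer> writeSet);
    }

    /** Policy that applies no reordering and hands the write set back unchanged. */
    static final ReadOrderPolicy NO_REORDER = writeSet -> writeSet;

    /** Defensive caller: a null result means "keep the original order". */
    static List<Integer> readOrder(ReadOrderPolicy policy, List<Integer> writeSet) {
        List<Integer> reordered = policy.reorderReadSequence(writeSet);
        return reordered != null ? reordered : writeSet;
    }

    public static void main(String[] args) {
        List<Integer> writeSet = List.of(0, 1, 2);
        System.out.println(readOrder(NO_REORDER, writeSet));
        System.out.println(readOrder(ws -> null, writeSet)); // falls back to the input
    }
}
```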

Reviewers: Andrey Yegorov <None>, Enrico Olivelli <eolivelli@gmail.com>, Nicolò Boschi <boschi1997@gmail.com>

This closes #3127 from rdhabalia/repl_seq

3 months agoFix doc code problem.
赵延 [Mon, 28 Mar 2022 22:23:14 +0000 (06:23 +0800)] 
Fix doc code problem.

fix docs code demo problem.

Reviewers: Nicolò Boschi <boschi1997@gmail.com>, Andrey Yegorov <None>

This closes #3142 from horizonzy/fix-docs-code-problem

3 months agoBookieAutoRecoveryTest.testEmptyLedgerLosesQuorumEventually fix flaky test, ensure...
Enrico Olivelli [Mon, 28 Mar 2022 15:53:04 +0000 (17:53 +0200)] 
BookieAutoRecoveryTest.testEmptyLedgerLosesQuorumEventually fix flaky test, ensure that the Auditor is alive (#3149)

3 months ago[security] Upgrade jackson-databind to get rid of CVE-2020-36518 (#3140)
Nicolò Boschi [Sun, 27 Mar 2022 12:37:16 +0000 (14:37 +0200)] 
[security] Upgrade jackson-databind to get rid of CVE-2020-36518 (#3140)

* [security] Upgrade jackson-databind to get rid of CVE-2020-36518

3 months agoUpgrade rocksdb to 6.29.4.1 (#3143)
Hang Chen [Sun, 27 Mar 2022 09:53:36 +0000 (17:53 +0800)] 
Upgrade rocksdb to 6.29.4.1 (#3143)

3 months agoBringing back maven build (#3130)
Andrey Yegorov [Thu, 24 Mar 2022 16:15:09 +0000 (09:15 -0700)] 
Bringing back maven build (#3130)

* Revert "[build] remove Maven POM files (#3009)"

This reverts commit e089b51ab5e1cf5f061c81463d27f33a21198271.

* rxjava: add maven dependency

(cherry picked from commit ac73541ce79953141a08c60642cd39c7984ade1e)

* Bring guava to the same version as gradle

* ignore deprecation warnings in tests

* mockito-inline, as in gradle + suppress warnings

* suppressed warning

* Exclude site3/ from RAT check

* CI to use (mostly) maven

* OWASP check with maven

* Up'd versions to match gradle, corrected license files: looks like gradle build didn't force versions consistently

* Removed current-version-image to match https://github.com/apache/bookkeeper/pull/3027

* Shading pattern to match gradle

* Fixed/suppressed CVEs

* Attempt to fix failing tests in CompactionByEntriesWithMetadataCacheTest

Co-authored-by: lushiji <lushiji@didiglobal.com>
3 months agoPendingReadOp: Fix ledgerEntryImpl reuse problem (#3110)
congbo [Thu, 24 Mar 2022 08:22:02 +0000 (16:22 +0800)] 
PendingReadOp: Fix ledgerEntryImpl reuse problem (#3110)

3 months agoSet BOOKIE_HTTP_PORT to make it optional in docker run (#3096)
Kezhu Wang [Wed, 23 Mar 2022 08:01:53 +0000 (16:01 +0800)] 
Set BOOKIE_HTTP_PORT to make it optional in docker run (#3096)

Fixes #3075.

3 months agoIssue #3105: Optimize OrderedExecutor performance by using GrowableArrayBlockingQueue...
Lari Hotari [Wed, 23 Mar 2022 08:01:00 +0000 (10:01 +0200)] 
Issue #3105: Optimize OrderedExecutor performance by using GrowableArrayBlockingQueue (#3108)

Fixes #3105

3 months agoRemove unused site2 directory (#3116)
Nicolò Boschi [Wed, 23 Mar 2022 08:00:34 +0000 (09:00 +0100)] 
Remove unused site2 directory (#3116)

3 months agoLog NoLedgerException on debug level (#3117)
Andras Beni [Mon, 21 Mar 2022 15:06:01 +0000 (16:06 +0100)] 
Log NoLedgerException on debug level (#3117)

NoLedgerException does not signify an error in the Bookie that needs
to be fixed. Instead it is - at most - a user error that the user is
notified about via the status code ENOLEDGER.
Logging this problem at error level introduces an odd difference
between the behavior of readLac using v2 versus v3 protocol version.
In the former case ReadEntryProcessor logs the same problem at debug
level. As a result, changing the protocol version appears to introduce
an error.

3 months ago[WEBSITE] Staging website: images are not visible
Nicolò Boschi [Fri, 18 Mar 2022 16:40:43 +0000 (17:40 +0100)] 
[WEBSITE] Staging website: images are not visible

### Motivation

In the staging website image links are broken

### Changes

* Fixed the links with the correct syntax (I checked every `img` reference)

Reviewers: Andrey Yegorov <None>

This closes #3126 from nicoloboschi/website/fix-imgs

3 months ago[website] fix GITHUB_TOKEN used for deployment (#3125)
Nicolò Boschi [Fri, 18 Mar 2022 10:43:42 +0000 (11:43 +0100)] 
[website] fix GITHUB_TOKEN used for deployment (#3125)

3 months ago[website] use GH_TOKEN to push to asf-site and asf-staging branches (#3124)
Nicolò Boschi [Fri, 18 Mar 2022 08:59:11 +0000 (09:59 +0100)] 
[website] use GH_TOKEN to push to asf-site and asf-staging branches (#3124)

3 months ago[website] Fix deploy push to git - use ssh (#3123)
Nicolò Boschi [Thu, 17 Mar 2022 16:27:50 +0000 (17:27 +0100)] 
[website] Fix deploy push to git - use ssh (#3123)

3 months ago[website] fix deploy staging script (#3122)
Nicolò Boschi [Thu, 17 Mar 2022 16:09:04 +0000 (17:09 +0100)] 
[website] fix deploy staging script (#3122)

3 months ago[website] fix current site deployment (#3120)
Nicolò Boschi [Thu, 17 Mar 2022 16:08:09 +0000 (17:08 +0100)] 
[website] fix current site deployment (#3120)

3 months ago[website] deploy staging after every change (#3118)
Nicolò Boschi [Thu, 17 Mar 2022 11:40:36 +0000 (12:40 +0100)] 
[website] deploy staging after every change (#3118)

3 months agoGradle build: add mavenLocal() repository
Enrico Olivelli [Wed, 16 Mar 2022 18:15:49 +0000 (19:15 +0100)] 
Gradle build: add mavenLocal() repository

### Motivation

In Gradle you need to add `mavenLocal()` repository if you want to test local versions of third party libraries built with Maven (like ZooKeeper, Curator...)

### Changes
Add `mavenLocal()`  repository

Reviewers: Matteo Merli <mmerli@apache.org>, Nicolò Boschi <boschi1997@gmail.com>, Andrey Yegorov <None>

This closes #3114 from eolivelli/impl/gradle-maven-local

3 months ago[website] New Website built with Docusaurus v2 (#3088)
Nicolò Boschi [Wed, 16 Mar 2022 16:06:24 +0000 (17:06 +0100)] 
[website] New Website built with Docusaurus v2 (#3088)

3 months agoadd stats for throttled-write (#3102)
StevenLuMT [Wed, 16 Mar 2022 10:10:59 +0000 (18:10 +0800)] 
add stats for throttled-write (#3102)

Descriptions of the changes in this PR:

### Motivation

The time spent in the method `triggerFlushAndAddEntry` varies, so add a stats metric focused on this method.

### Changes

1. The previous counter metric (`throttledWriteRequests`) is retained
2. Add `throttledWriteStats` to record the elapsed time and call count of the method `triggerFlushAndAddEntry`
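
Recording both elapsed time and call count around a method can be sketched like this. `TimedOpStatsSketch` is a hypothetical stand-in written with JDK classes only. It is not BookKeeper's stats API, and the lambda merely stands in for `triggerFlushAndAddEntry`.

```java
import java.util.concurrent.atomic.LongAdder;
import java.util.function.Supplier;

// Hypothetical stand-in for a stats logger entry; not BookKeeper's stats API.
class TimedOpStatsSketch {
    private final LongAdder count = new LongAdder();
    private final LongAdder totalNanos = new LongAdder();

    /** Runs op, then records one call and its elapsed time, even if op throws. */
    <T> T time(Supplier<T> op) {
        long start = System.nanoTime();
        try {
            return op.get();
        } finally {
            count.increment();
            totalNanos.add(System.nanoTime() - start);
        }
    }

    long count() {
        return count.sum();
    }

    long totalNanos() {
        return totalNanos.sum();
    }

    public static void main(String[] args) {
        TimedOpStatsSketch stats = new TimedOpStatsSketch();
        String result = stats.time(() -> "flushed"); // stand-in for the throttled write
        System.out.println(result + " calls=" + stats.count() + " nanos=" + stats.totalNanos());
    }
}
```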

3 months agoBump testcontainers version to 1.16.3
ZhangJian He [Tue, 15 Mar 2022 23:42:56 +0000 (07:42 +0800)] 
Bump testcontainers version to 1.16.3

### Motivation
- Bump the testcontainers version so that tests can run on the latest Docker on Intel-based Macs
- Bump `docker-java` version to `3.2.13`

Reviewers: Nicolò Boschi <boschi1997@gmail.com>, Andrey Yegorov <None>

This closes #3101 from Shoothzj/bump-test-container-version

3 months agobump lombok from 1.18.20 to 1.18.22 to support java17 compile
ZhangJian He [Tue, 15 Mar 2022 23:32:24 +0000 (07:32 +0800)] 
bump lombok from 1.18.20 to 1.18.22 to support java17 compile

### Motivation
- required for compilation on JDK 17
- see https://projectlombok.org/changelog

Reviewers: Andrey Yegorov <None>, Nicolò Boschi <boschi1997@gmail.com>

This closes #3097 from Shoothzj/bump-lombok-1-18-22

3 months agoRun protobuf code generation automatically in IntelliJ and fix config (#3107)
Lari Hotari [Tue, 15 Mar 2022 16:42:49 +0000 (18:42 +0200)] 
Run protobuf code generation automatically in IntelliJ and fix config (#3107)

3 months agoConcurrentLong map and set": add unit tests for reduce unnecessary expansions (...
LinChen [Tue, 15 Mar 2022 10:55:42 +0000 (18:55 +0800)] 
ConcurrentLong map and set": add unit tests for reduce unnecessary expansions  (#3092)

3 months agoConcurrentOpenHashSet: fix reduce unnecessary expansions (#3082)
LinChen [Tue, 15 Mar 2022 10:28:29 +0000 (18:28 +0800)] 
ConcurrentOpenHashSet: fix reduce  unnecessary expansions (#3082)

3 months ago[website] update website every time there is a change (#3090)
Nicolò Boschi [Mon, 14 Mar 2022 11:10:35 +0000 (12:10 +0100)] 
[website] update website every time there is a change (#3090)

3 months agofix bkperf message rate limit to 2GB/s (#3100)
Hang Chen [Sun, 13 Mar 2022 12:50:17 +0000 (20:50 +0800)] 
fix bkperf message rate limit to 2GB/s (#3100)

3 months agoPreparing for the release 4.15
Andrey Yegorov [Fri, 11 Mar 2022 23:07:12 +0000 (15:07 -0800)] 
Preparing for the release 4.15

Descriptions of the changes in this PR:

Updated py client's version to 4.15

Motivation

Preparing for the release
https://bookkeeper.apache.org/community/release_guide/#change-python-client-version

Reviewers: Enrico Olivelli <eolivelli@gmail.com>

This closes #3095 from dlg99/python_client_rel_4.15

3 months agofix a metric error in bookieStats
ken [Fri, 11 Mar 2022 20:13:16 +0000 (04:13 +0800)] 
fix a metric error in bookieStats

Descriptions of the changes in this PR:

### Motivation

fix a metric error in bookieStats

### Changes

getReadEntryStats().registerFailedValue(entrySize) -> getReadBytesStats().registerFailedValue(entrySize)

Reviewers: Andrey Yegorov <None>, Nicolò Boschi <boschi1997@gmail.com>

This closes #3083 from TakaHiro0208/fix_bookieStats_metric_error

3 months agoAdd a REST API to get or update bookie readOnly state
Yang Yang [Fri, 11 Mar 2022 19:57:51 +0000 (03:57 +0800)] 
Add a REST API to get or update bookie readOnly state

### Motivation

This PR is a part of the work to improve the process of removing bookies from the cluster. Specifically, it implements the `readOnly` API described in the [mail](http://mail-archives.apache.org/mod_mbox/bookkeeper-dev/202109.mbox/raw/%3CCAJdLeK03g8K0h6swn%3D9yVP1Ze2zHxe8TDobK6a-zpTdABkeQEA%40mail.gmail.com%3E).

### Changes

- Add a REST API at `/api/v1/bookie/state/readonly`
  - The `GET` method returns the current `readOnly` status
  - The `PUT` method updates the `readOnly` status if needed.
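
A client for this endpoint could build its requests roughly as follows, using the JDK's `java.net.http` API (Java 11+). Only the path and the two HTTP methods come from the commit message. The base URL, the port, and the JSON body shape `{"readOnly": true}` are assumptions made for illustration.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.net.http.HttpRequest.BodyPublishers;

class ReadOnlyApiSketch {
    static final String PATH = "/api/v1/bookie/state/readonly";

    /** Builds the GET request that asks for the current readOnly status. */
    static HttpRequest getState(String baseUrl) {
        return HttpRequest.newBuilder(URI.create(baseUrl + PATH)).GET().build();
    }

    /** Builds the PUT request that asks the bookie to change its readOnly status. */
    static HttpRequest setState(String baseUrl, boolean readOnly) {
        // Assumed body format; check the bookie HTTP documentation before relying on it.
        String body = "{\"readOnly\": " + readOnly + "}";
        return HttpRequest.newBuilder(URI.create(baseUrl + PATH))
                .header("Content-Type", "application/json")
                .PUT(BodyPublishers.ofString(body))
                .build();
    }

    public static void main(String[] args) {
        // "http://localhost:8000" is a placeholder address for a bookie HTTP server.
        HttpRequest get = getState("http://localhost:8000");
        HttpRequest put = setState("http://localhost:8000", true);
        System.out.println(get.method() + " " + get.uri());
        System.out.println(put.method() + " " + put.uri());
    }
}
```

The requests are only built here. Sending them would go through `HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString())`.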

### TODOs

- Update the document once the PR is accepted.
- Update the `BookieStateManager` & `BookieImpl` to persist the information that the state change is triggered by the external API request and do not change the state based on the notification from the dirs monitoring service.

Reviewers: Yong Zhang <zhangyong1025.zy@gmail.com>, Enrico Olivelli <eolivelli@gmail.com>, Andrey Yegorov <None>

This closes #2799 from fantapsody/readonly-api

3 months ago[website] Fix deploy action (#3089)
Nicolò Boschi [Thu, 10 Mar 2022 07:32:58 +0000 (08:32 +0100)] 
[website] Fix deploy action (#3089)

3 months agoFix Journal.ForceWriteThread.forceWriteRequests.put deadlock
Kezhu Wang [Thu, 10 Mar 2022 00:09:25 +0000 (08:09 +0800)] 
Fix Journal.ForceWriteThread.forceWriteRequests.put deadlock

Descriptions of the changes in this PR:

### Motivation
`Journal.ForceWriteThread` could deadlock because it is the sole consumer of `Journal.forceWriteRequests`, yet it sends the group marker with the blocking `BlockingQueue.put`.

This PR tries to fix this.

### Changes
* Add test code that reproduces the `Journal.ForceWriteThread` deadlock on `forceWriteRequests.put`.
* Send the force write group marker without blocking to avoid deadlocking `ForceWriteThread`.
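
The hazard can be illustrated with a bounded `BlockingQueue` from the JDK. If the thread that is the queue's only consumer calls `put` while the queue is full, nothing will ever drain it and the thread blocks forever. `offer` returns `false` immediately instead. `ForceWriteQueueSketch` and `enqueueMarker` are hypothetical names, and what the real code does when the marker cannot be enqueued is not stated in the commit message.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

class ForceWriteQueueSketch {
    /**
     * Meant to be called from the thread that is also the queue's only consumer.
     * put() on a full queue would block forever here, because nobody else
     * drains the queue; offer() returns false right away instead.
     */
    static boolean enqueueMarker(BlockingQueue<String> queue, String marker) {
        return queue.offer(marker);
    }

    public static void main(String[] args) {
        BlockingQueue<String> queue = new ArrayBlockingQueue<>(2);
        queue.add("req-1");
        queue.add("req-2"); // the queue is now full

        System.out.println(enqueueMarker(queue, "marker")); // false: no room, no blocking
        queue.poll();                                        // the consumer drains one request
        System.out.println(enqueueMarker(queue, "marker")); // true: there is room again
    }
}
```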

Master Issue: #2948

Reviewers: Andrey Yegorov <None>

This closes #2962 from kezhuw/fix-Journal.ForceWriteThread.forceWriteRequests.put-deadlock

3 months agochange rocksdb init: use OptionsUtil
StevenLuMT [Thu, 10 Mar 2022 00:05:59 +0000 (08:05 +0800)] 
change rocksdb init: use OptionsUtil

Descriptions of the changes in this PR:

### Motivation

1. Some old RocksDB parameters are not configurable
2. Future RocksDB tuning should not require code changes or new bookie configuration options

### Changes

1. Make all the old RocksDB parameters configurable
2. Use OptionsUtil to initialize all RocksDB parameters

The old PR #3006 had rebase errors, so a new PR was opened.

Reviewers: Andrey Yegorov <None>, LinChen <None>

This closes #3056 from StevenLuMT/master_improveRocksDB

3 months agoAdd throttle for rebuild entryMetadataMap
Hang Chen [Thu, 10 Mar 2022 00:04:20 +0000 (08:04 +0800)] 
Add throttle for rebuild entryMetadataMap

### Motivation
When a bookie restarts, the garbageCollectorThread rebuilds entryMetadataMap from all the entry log files in the ledger directory. Normally, it extracts the EntryLogMetadata from the index in the entry log file. However, if there is no index, it falls back to scanning the entry log file.

In one user's production environment, about 4% of the entry log files had no index: 3000 out of a total of 80000. The default entry log file size is 2GB, so the garbageCollectorThread reads 3000 * 2GB = 6TB of data without a speed limit, which drives ledger disk IO utilization high for dozens of minutes and affects ledger read and write latency.

### Modification
1. Add a read rate limiter for scanning entry log files during the entryMetadataMap rebuild.
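
A byte-rate throttle of this kind can be sketched with the JDK alone. `ScanThrottleSketch` is a hypothetical class, not the limiter used in the PR, and the 100 MiB/s rate in `main` is an arbitrary example value. The clock is passed in explicitly so the arithmetic is easy to check, and the class is not thread-safe.

```java
class ScanThrottleSketch {
    private final long bytesPerSecond;
    private final long startNanos;
    private long bytesRead;

    ScanThrottleSketch(long bytesPerSecond, long startNanos) {
        this.bytesPerSecond = bytesPerSecond;
        this.startNanos = startNanos;
    }

    /**
     * Accounts for bytes just read and returns how long (in nanoseconds) the
     * caller should pause so the average rate stays at or below the limit.
     */
    long pauseNanosAfterRead(long bytes, long nowNanos) {
        bytesRead += bytes;
        // Double arithmetic avoids long overflow for multi-terabyte totals.
        long dueNanos = (long) ((double) bytesRead / bytesPerSecond * 1_000_000_000L);
        return Math.max(0L, startNanos + dueNanos - nowNanos);
    }

    /** Lower bound, in whole seconds, on scanning totalBytes at the given rate. */
    static long minScanSeconds(long totalBytes, long bytesPerSecond) {
        return (totalBytes + bytesPerSecond - 1) / bytesPerSecond;
    }

    public static void main(String[] args) {
        long total = 3000L * (2L << 30); // 3000 files of 2 GiB each, as in the motivation
        long rate = 100L << 20;          // 100 MiB/s, an arbitrary example limit
        System.out.println(minScanSeconds(total, rate) + " seconds"); // prints "61440 seconds"
    }
}
```

At the example rate the scan takes about 17 hours, so the limiter trades a longer rebuild for steadier disk latency.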

Reviewers: Nicolò Boschi <boschi1997@gmail.com>, Enrico Olivelli <eolivelli@gmail.com>

This closes #2963 from hangc0276/chenhang/add_throttle_for_build_entryMetadataMap

3 months agocheck all bookies of writeset are writable
wuYin [Thu, 10 Mar 2022 00:03:16 +0000 (08:03 +0800)] 
check all bookies of writeset are writable

### Motivation

#1088 introduced a writability check on the ensemble before sending requests, but the check should cover the bookies of the write set rather than the first few bookies in the current ensemble.

### Changes

Get the bookies of the write set from the ensemble and check that they are writable.
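
The difference between the two checks can be sketched as follows, assuming round-robin striping in which entry `e` is written to ensemble indexes `(e + i) % ensembleSize` for `i` in `[0, writeQuorum)`. `WriteSetSketch` and its methods are hypothetical names, and bookies are modeled as plain strings.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

class WriteSetSketch {
    /** Ensemble indexes that entryId is written to under round-robin striping. */
    static List<Integer> writeSet(long entryId, int ensembleSize, int writeQuorum) {
        List<Integer> indexes = new ArrayList<>();
        for (int i = 0; i < writeQuorum; i++) {
            indexes.add((int) ((entryId + i) % ensembleSize));
        }
        return indexes;
    }

    /** True only if every bookie in the entry's write set is currently writable. */
    static boolean writeSetWritable(long entryId, List<String> ensemble,
                                    int writeQuorum, Set<String> writable) {
        for (int idx : writeSet(entryId, ensemble.size(), writeQuorum)) {
            if (!writable.contains(ensemble.get(idx))) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        List<String> ensemble = List.of("bk0", "bk1", "bk2");
        Set<String> writable = Set.of("bk0", "bk1"); // bk2 is not writable

        // Entry 0 maps to bk0 and bk1, entry 1 maps to bk1 and bk2.
        System.out.println(writeSetWritable(0, ensemble, 2, writable)); // true
        System.out.println(writeSetWritable(1, ensemble, 2, writable)); // false
    }
}
```

Checking only the first two bookies of the ensemble would report entry 1 as writable even though `bk2` is in its write set.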

Related change: https://github.com/apache/bookkeeper/pull/1088/files#diff-1d893bb31553b5e1f55c8301d04ae15f38e0d35f531f9dd22475128b7972ddf9R1108

Reviewers: Andrey Yegorov <None>

This closes #3055 from wuYin/writeset-writable

3 months agoAdd sizeInBytes interface for ConcurrentLong map and set
Hang Chen [Thu, 10 Mar 2022 00:02:00 +0000 (08:02 +0800)] 
Add sizeInBytes interface for ConcurrentLong map and set

### Motivation
We provide some concurrent maps and sets for specific usages, with `size()` and `capacity()` methods that return the actual item count and the maximum item count.

However, if users want to monitor how much memory those concurrent maps and sets have allocated, there is no interface that exposes this metric.

### Changes
Add a `sizeInBytes()` interface to expose the memory allocated by those concurrent maps and sets.

Reviewers: Enrico Olivelli <eolivelli@gmail.com>, Nicolò Boschi <boschi1997@gmail.com>

This closes #3068 from hangc0276/chenhang/add_sizeInBytes_interface_for_concurrent_maps

3 months agoFix publish do not include test jar
Hang Chen [Thu, 10 Mar 2022 00:00:08 +0000 (08:00 +0800)] 
Fix publish do not include test jar

### Motivation
When using the `./gradlew publishToMavenLocal` command to publish jars to the local Maven repository, the test jars are not included.

### Changes
Include the test jars when publishing.

Reviewers: Nicolò Boschi <boschi1997@gmail.com>, Enrico Olivelli <eolivelli@gmail.com>, ZhangJian He <shoothzj@gmail.com>, Yong Zhang <zhangyong1025.zy@gmail.com>

This closes #3071 from hangc0276/chenhang/fix_publish_do_not_include_test_jar

3 months ago[website] restore javadoc generation and automate website deployment (#3081)
Nicolò Boschi [Tue, 8 Mar 2022 17:22:55 +0000 (18:22 +0100)] 
[website] restore javadoc generation and automate website deployment (#3081)

* [website] restore javadoc generation and automate website deployment

3 months agosupport shrink for ConcurrentLong map or set (#3074)
lin chen [Thu, 3 Mar 2022 17:42:10 +0000 (01:42 +0800)] 
support shrink for ConcurrentLong map or set (#3074)

* support shrink for ConcurrentLong map or set

* fix unit test

* check style

* add shrink unit test.

* fix unit test

3 months ago[ci] Move CI to JDK11 (#3027)
Nicolò Boschi [Wed, 2 Mar 2022 08:05:11 +0000 (09:05 +0100)] 
[ci] Move CI to JDK11 (#3027)

4 months ago[tests] Replace powermockito usages with mockito-inline - 'tools' submodule (#3077)
Nicolò Boschi [Tue, 1 Mar 2022 21:04:56 +0000 (22:04 +0100)] 
[tests] Replace powermockito usages with mockito-inline - 'tools' submodule (#3077)

4 months agoreduce unnecessary expansions for ConcurrentLong map and set (#3072)
lin chen [Sun, 27 Feb 2022 21:14:48 +0000 (05:14 +0800)] 
reduce unnecessary expansions for  ConcurrentLong map and set (#3072)

4 months agoISSUE #2898: DistributedLogManager can skip over a segment on read.
Andrey Yegorov [Fri, 25 Feb 2022 17:26:37 +0000 (09:26 -0800)] 
ISSUE #2898: DistributedLogManager can skip over a segment on read.

Descriptions of the changes in this PR:
### Motivation

DLM test suite was flaky.
Repro/troubleshooting shows that DLM can skip over a data segment on read.

### Changes

Test + fix (don't move segment if it moved already).

Master Issue: #2898

Reviewers: Enrico Olivelli <eolivelli@gmail.com>, Nicolò Boschi <boschi1997@gmail.com>

This closes #3064 from dlg99/fix/dlm-issue2898, closes #2898

4 months ago[CI] remaining-tests is running all the tests
Nicolò Boschi [Fri, 25 Feb 2022 17:22:41 +0000 (18:22 +0100)] 
[CI] remaining-tests is running all the tests

### Motivation

The 'remaining-tests' is supposed to exclude a lot of tests covered by the other checks but the syntax used is wrong

### Changes

* Correctly excluded all the tests not needed in 'remaining-tests' suite

Reviewers: Andrey Yegorov <None>, Enrico Olivelli <eolivelli@gmail.com>

This closes #3078 from nicoloboschi/ci/remove-duplicated-tests

4 months agoAllow enabling http tls (#2995)
ZhangJian He [Fri, 25 Feb 2022 06:36:05 +0000 (14:36 +0800)] 
Allow enabling http tls (#2995)

4 months agoIssue 2974: better thread selection for the Ordered Executor (#3023)
Andrey Yegorov [Wed, 23 Feb 2022 21:03:11 +0000 (13:03 -0800)] 
Issue 2974: better thread selection for the Ordered Executor (#3023)

4 months ago[ISSUE 3031] fixed test; chooseThread() uses orderingKey as a param, not a thread...
Andrey Yegorov [Wed, 23 Feb 2022 21:01:36 +0000 (13:01 -0800)] 
[ISSUE 3031] fixed test; chooseThread() uses orderingKey as a param, not a thread index (#3032)

4 months agofix checkAllLedgersDuration compute (#2970)
lin chen [Wed, 23 Feb 2022 10:16:56 +0000 (18:16 +0800)] 
fix checkAllLedgersDuration compute (#2970)

4 months agoupdate doc RecoveryBookieService: bookie_dest has been removed (#2961)
lin chen [Wed, 23 Feb 2022 10:15:14 +0000 (18:15 +0800)] 
update doc RecoveryBookieService: bookie_dest has been removed (#2961)

4 months agofix gradle publishToMavenLocal failed (#3069)
Hang Chen [Tue, 22 Feb 2022 19:24:57 +0000 (03:24 +0800)] 
fix gradle publishToMavenLocal failed (#3069)

4 months agoAvoiding call fileChannelProvider init multiple times (#3046)
Hang Chen [Tue, 22 Feb 2022 07:00:48 +0000 (15:00 +0800)] 
Avoiding call fileChannelProvider init multiple times (#3046)

4 months agoOptimize memory:Support shrinking in ConcurrentLongLongPairHashMap (#3061)
lin chen [Tue, 22 Feb 2022 05:01:03 +0000 (13:01 +0800)] 
Optimize memory:Support shrinking in ConcurrentLongLongPairHashMap (#3061)

* support shrink

* Reduce unnecessary rehash

* check style

* fix: unnecessary rehash

* add unit test: testExpandAndShrink

* fix unit test: testExpandAndShrink

* fix test:
1.verify that the map is able to expand after shrink;
2.does not keep shrinking at every remove() operation;

* 1.add builder;
2.add config:
  ①MapFillFactor;②MapIdleFactor;③autoShrink;④expandFactor;⑤shrinkFactor

* check style

* 1.check style;
2.add check :
shrinkFactor>1
expandFactor>1

* check style

* keep and Deprecate all the public constructors.

* add final for  autoShrink

* fix unit test testExpandAndShrink, set autoShrink true

* add method for update parameters value:
  ①setMapFillFactor
  ②setMapIdleFactor
  ③setExpandFactor
  ④setShrinkFactor
  ⑤setAutoShrink

* use lombok.Setter replace lombok.Data

* use public for getUsedBucketCount

* 1.check parameters;
2.fix the shrinkage condition:
  ①newCapacity > size: in order to prevent the infinite loop of rehash, newCapacity should be larger than the currently used size;
  ②newCapacity > resizeThresholdUp: in order to prevent continuous expansion and contraction, newCapacity should be greater than the expansion threshold;

* 1.update  parameters check;
2.fix newCapacity calculation when shrinking :
  rehash((int) Math.max(size / mapFillFactor, capacity / shrinkFactor));

* remove set methods:
①setMapFillFactor
②setMapIdleFactor
③setExpandFactor
④setShrinkFactor
⑤setAutoShrink

* Repair shrinkage conditions: ①newCapacity must be the nth power of 2; ②reduce unnecessary shrinkage;

* Repair shrinkage conditions

* add shrinkage when clear

* 1.add test for clear shrink
2. fix initCapacity value
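
The shrink calculation quoted in the list above, combined with the power-of-two requirement, can be sketched as follows. `ShrinkSketch` is a hypothetical helper, not the code in `ConcurrentLongLongPairHashMap`. It only shows how `Math.max(size / mapFillFactor, capacity / shrinkFactor)` and the alignment step interact, and the factor values in `main` are example values.

```java
class ShrinkSketch {
    /** Smallest power of two that is greater than or equal to n. */
    static int alignToPowerOfTwo(int n) {
        return n <= 1 ? 1 : Integer.highestOneBit(n - 1) << 1;
    }

    /**
     * Capacity to rehash to when shrinking: large enough to hold the current
     * items at the fill factor, never less than capacity / shrinkFactor, and
     * rounded up to a power of two.
     */
    static int shrunkCapacity(int size, int capacity, float mapFillFactor, float shrinkFactor) {
        int target = (int) Math.max(size / mapFillFactor, capacity / shrinkFactor);
        return alignToPowerOfTwo(Math.max(target, 1));
    }

    public static void main(String[] args) {
        // Sparse map: 100 items in 1024 buckets shrinks to 512.
        System.out.println(shrunkCapacity(100, 1024, 0.66f, 2.0f));
        // Dense map: 600 items need about 909 buckets, which rounds back up to 1024.
        System.out.println(shrunkCapacity(600, 1024, 0.66f, 2.0f));
    }
}
```

In the second case the result equals the current capacity, so a caller would skip the rehash, which matches the goal of reducing unnecessary shrinkage.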

4 months agoISSUE 3044: ETCD tests hang. Added global timeout, fork tests jvm, fixed noop slf4j...
Andrey Yegorov [Tue, 22 Feb 2022 02:17:38 +0000 (18:17 -0800)] 
ISSUE 3044: ETCD tests hang. Added global timeout, fork tests jvm, fixed noop slf4j to see log in case of hang (#3051)

Descriptions of the changes in this PR:

### Motivation

ETCD tests flake or hang occasionally, causing CI job timeouts.

### Changes

Added a global timeout - kill the test early if it hangs
Fork the tests' JVM - I think it helped locally (no repro) but it possibly just reduced the frequency of hangs
Fixed the noop slf4j warning, also to see the log in case of a hang

Master Issue: #3044

4 months agoset Mod initial Delay time to simply avoid GarbageCollectorThread working at the...
StevenLuMT [Tue, 22 Feb 2022 00:42:13 +0000 (08:42 +0800)] 
set Mod initial Delay time to simply avoid GarbageCollectorThread working at the same time (#3012)

Descriptions of the changes in this PR:

### Motivation

When there is more than one ledger directory, each directory's GarbageCollectorThread does the same work at the same time,
especially:
  1) deleting ledgers, after which SyncThread is scheduled to run a RocksDB compaction
  2) compacting entries, which costs CPU.

### Changes

set a Mod initial Delay time to simply avoid GarbageCollectorThread working at the same time

4 months agoISSUE 3034: Fixed flaky test (#3065)
Andrey Yegorov [Mon, 21 Feb 2022 01:54:24 +0000 (17:54 -0800)] 
ISSUE 3034: Fixed flaky test (#3065)

Descriptions of the changes in this PR:

### Motivation

MockExecutorControllerWithSchedulerTest is flaky

### Changes

testExecute flakes because the runnable actually runs asynchronously in this case; the test was modified accordingly.

Master Issue: #3034

4 months agoISSUE #3034: Add extra checks in the mock to help with error troubleshooting on CI
Andrey Yegorov [Tue, 15 Feb 2022 20:05:44 +0000 (12:05 -0800)] 
ISSUE #3034: Add extra checks in the mock to help with error troubleshooting on CI

### Motivation

Flaky test on CI

### Changes

Added extra check & logging to simplify troubleshooting of the flaky test on CI.
Cannot repro the failure locally after running 100+ times in a loop.

Master Issue: #3034

Reviewers: Yong Zhang <zhangyong1025.zy@gmail.com>, Enrico Olivelli <eolivelli@gmail.com>

This closes #3049 from dlg99/fix/issue3034, closes #3034

4 months agoISSUE #3040: RocksDB segfaulted during CompactionTest
Andrey Yegorov [Tue, 15 Feb 2022 17:05:17 +0000 (09:05 -0800)] 
ISSUE #3040: RocksDB segfaulted during CompactionTest

Descriptions of the changes in this PR:

### Motivation

RocksDB segfaulted during CompactionTest

### Changes

RocksDB can segfault if one tries to use it after close.
The [shutdown/compaction sequence](https://github.com/apache/bookkeeper/issues/3040#issuecomment-1036508397) can lead to such a situation. The fix prevents the segfault.
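
One common way to guard a native handle against use after close is a closed flag protected by a read/write lock. The sketch below shows that pattern only; the commit message does not describe how the actual fix is implemented, and `GuardedStore` and its methods are hypothetical names with a placeholder instead of a real RocksDB call.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

final class GuardedStore implements AutoCloseable {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    private boolean closed = false; // guarded by lock

    // Stand-in for any operation that would call into the native library.
    long count() {
        lock.readLock().lock();
        try {
            if (closed) {
                // Fail in Java instead of touching freed native memory.
                throw new IllegalStateException("store is already closed");
            }
            return 42L; // placeholder for the native call
        } finally {
            lock.readLock().unlock();
        }
    }

    @Override
    public void close() {
        // The write lock waits for in-flight readers before releasing resources.
        lock.writeLock().lock();
        try {
            if (closed) {
                return; // closing twice is a no-op
            }
            closed = true;
            // native resources would be released here
        } finally {
            lock.writeLock().unlock();
        }
    }
}
```

A late caller, such as a compaction step that runs during shutdown, then gets an `IllegalStateException` rather than crashing the JVM.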

CompactionTests were updated at some point to use the metadata cache, so the non-cached case was not tested.
I added test suites for this case.

Master Issue: #3040

Reviewers: Yong Zhang <zhangyong1025.zy@gmail.com>, Nicolò Boschi <boschi1997@gmail.com>

This closes #3043 from dlg99/fix/issue3040, closes #3040

4 months agoBump netty version to 4.1.74.Final (#3045)
ZhangJian He [Tue, 15 Feb 2022 09:01:14 +0000 (17:01 +0800)] 
Bump netty version to 4.1.74.Final (#3045)

### Motivation

Changelog: https://netty.io/news/2022/02/08/4-1-74-Final.html

Netty 4.1.74 fixes several DNS resolver bugs

### Modifications

* Upgrade Netty from 4.1.73.Final to 4.1.74.Final
* Netty 4.1.74.Final depends on netty-tcnative 2.0.48, so that dependency is updated as well

4 months agoupdate metrics (#2999)
StevenLuMT [Tue, 15 Feb 2022 08:55:57 +0000 (16:55 +0800)] 
update metrics (#2999)

Descriptions of the changes in this PR:

### Motivation

Some metric values are incorrect, so this PR updates them.
The current change is problem-driven; a comprehensive review will be done later.

### Changes

Update 2 metrics:
1. Bookie: ReadBytes uses entrySize
2. Journal: report the journal write error metric

4 months ago[CI] Dump stacktrace when a job is cancelled
Nicolò Boschi [Tue, 15 Feb 2022 01:43:28 +0000 (02:43 +0100)] 
[CI] Dump stacktrace when a job is cancelled

### Motivation

Sometimes CI jobs fail due to timeout. It would be useful to understand what the latest test was doing before being interrupted.

### Changes

* Added a new script for dumping stacktrace.
* Added the step to all jobs; it runs when `cancelled()` is true.

Reviewers: Andrey Yegorov <None>

This closes #3042 from nicoloboschi/ci-thread-dump

4 months agoUse OutOfMemoryPolicy when the direct memory is insufficient when reading the entry...
wenbingshen [Mon, 14 Feb 2022 22:50:25 +0000 (06:50 +0800)] 
Use OutOfMemoryPolicy when the direct memory is insufficient when reading the entry in ReadCache

### Motivation

Original PR: https://github.com/apache/bookkeeper/pull/1755.
That PR appears to have missed updating the memory allocation method.

When direct memory is insufficient, the allocation does not fall back to JVM heap memory, and the bookie hangs.

![image](https://user-images.githubusercontent.com/35599757/137859349-f145bb88-7d1c-4739-b6d1-6f8987831cc0.png)

![image](https://user-images.githubusercontent.com/35599757/137859462-4e2b3dc5-3287-4bf7-8dad-048ad8a7723f.png)

### Changes

Use `OutOfMemoryPolicy` when the direct memory is insufficient when reading the entry in `ReadCache`.
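
The fallback behaviour described above can be sketched with plain JDK buffers. This is an illustration under stated assumptions: the `Policy` enum and `FallbackAllocator` class are local stand-ins written for this example, not BookKeeper's `OutOfMemoryPolicy` or its allocator API.

```java
import java.nio.ByteBuffer;
import java.util.function.IntFunction;

final class FallbackAllocator {
    // Local stand-in for an out-of-memory policy (hypothetical, for illustration).
    enum Policy { THROW_EXCEPTION, FALLBACK_TO_HEAP }

    private final Policy policy;
    private final IntFunction<ByteBuffer> directAllocator;

    FallbackAllocator(Policy policy, IntFunction<ByteBuffer> directAllocator) {
        this.policy = policy;
        this.directAllocator = directAllocator;
    }

    ByteBuffer allocate(int size) {
        try {
            return directAllocator.apply(size);
        } catch (OutOfMemoryError e) {
            if (policy == Policy.FALLBACK_TO_HEAP) {
                // Direct memory is exhausted: serve the request from the JVM heap.
                return ByteBuffer.allocate(size);
            }
            throw e;
        }
    }
}
```

In normal use the direct allocator would be `ByteBuffer::allocateDirect`; passing it in as a function makes the out-of-memory path easy to exercise in a test.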

Reviewers: Enrico Olivelli <eolivelli@gmail.com>, Andrey Yegorov <None>

This closes #2836 from wenbingshen/useOutOfMemoryPolicyInReadCache

4 months agofix gradle implicit dependency (#3029)
mauricebarnum [Mon, 14 Feb 2022 20:19:54 +0000 (12:19 -0800)] 
fix gradle implicit dependency (#3029)

```
> Task :bookkeeper-tools-framework:compileTestJava
Execution optimizations have been disabled for task ':bookkeeper-tools-framework:compileTestJava' to ensure correctness due to the following reasons:
  - Gradle detected a problem with the following location: '/Users/mbarnum/src/bookkeeper/tools/framework/build/classes/java/main'. Reason: Task ':bookkeeper-tools-framework:compileTestJava' uses this output of task ':tools:framework:compileJava' without declaring an explicit or implicit dependency. This can lead to incorrect results being produced, depending on what order the tasks are executed. Please refer to https://docs.gradle.org/7.3.3/userguide/validation_problems.html#implicit_dependency for more details about this problem.
```

4 months agoAdd flaky-test template to track many flaky-test.
Qiang Zhao [Mon, 14 Feb 2022 19:36:06 +0000 (03:36 +0800)] 
Add flaky-test template to track many flaky-test.

### Motivation

I found many flaky tests, such as #3031, #3034 and #3033.
Many flaky tests are actually caused by production code issues, so I think adding a flaky-test template is a good way to track them.

### Changes

- Add flaky-test template.

Reviewers: Andrey Yegorov <None>

This closes #3035 from mattisonchao/template_flaky_test

4 months agofix(cli): incorrect description for autodiscovery
Eric Shen [Mon, 14 Feb 2022 19:32:55 +0000 (03:32 +0800)] 
fix(cli): incorrect description for autodiscovery

Signed-off-by: Eric Shen <ericshenyuhaooutlook.com>
Descriptions of the changes in this PR:

### Motivation

The description of `bin/bookkeeper autorecovery` is wrong: it does not start as a daemon.

### Changes

* Changed the description in bookkeeper shell
* Update the doc

Reviewers: Yong Zhang <zhangyong1025.zy@gmail.com>

This closes #2910 from ericsyh/fix-bk-cli

4 months agoExplicit error message if an exception other than BKNoSuchLedgerExistsOnMetadataServe...
shustsud [Mon, 14 Feb 2022 19:20:49 +0000 (04:20 +0900)] 
Explicit error message if an exception other than BKNoSuchLedgerExistsOnMetadataServerException occurs in over-replicated ledger GC

### Motivation
- If an exception other than BKNoSuchLedgerExistsOnMetadataServerException occurs in readLedgerMetadata during over-replicated ledger GC, nothing is written to the log.
(https://github.com/apache/bookkeeper/pull/2844#discussion_r735219876)

### Changes
- If an exception other than BKNoSuchLedgerExistsOnMetadataServerException occurs in readLedgerMetadata, output information to the log.

Reviewers: Andrey Yegorov <None>, Nicolò Boschi <boschi1997@gmail.com>

This closes #2873 from shustsud/improved_error_handling

4 months agoReplication stat num-under-replicated-ledgers changed as with the process of replication
gaozhangmin [Mon, 14 Feb 2022 19:17:28 +0000 (03:17 +0800)] 
Replication stat num-under-replicated-ledgers changed as with the process of replication

### Motivation
Currently the ReplicationStats numUnderReplicatedLedger stat is registered in `publishSuspectedLedgersAsync`, but its value does not decrease as ledgers are replicated successfully, so we cannot tell the progress of replication from the stat.

### Changes
Registers notifyUnderReplicationLedgerChanged when the auditor starts. The numUnderReplicatedLedger value decreases when the under-replicated path of a ledger is deleted.

Reviewers: Nicolò Boschi <boschi1997@gmail.com>, Enrico Olivelli <eolivelli@gmail.com>, Andrey Yegorov <None>

This closes #2805 from gaozhangmin/replication-stats-num-under-replicated-ledgers

4 months agoBP-46: Running without journal proposal
Jack Vanlightly [Mon, 14 Feb 2022 19:11:38 +0000 (20:11 +0100)] 
BP-46: Running without journal proposal

Includes the BP-46 design proposal markdown document.

Master Issue: #2705

Reviewers: Andrey Yegorov <None>, Enrico Olivelli <eolivelli@gmail.com>

This closes #2706 from Vanlightly/bp-44

4 months agodelete duplicated semicolon
gaozhangmin [Mon, 14 Feb 2022 19:01:22 +0000 (03:01 +0800)] 
delete duplicated semicolon

As title, delete duplicated semicolon

Reviewers: Andrey Yegorov <None>

This closes #2810 from gaozhangmin/remove-duplicated-semicolon

4 months agomake rocksdb format version configurable
Hang Chen [Mon, 14 Feb 2022 19:00:06 +0000 (03:00 +0800)] 
make rocksdb format version configurable

### Motivation
Fix #2823
RocksDB supports several format versions, which use different data structures to implement key-value indexes and differ greatly in performance. https://rocksdb.org/blog/2019/03/08/format-version-4.html

https://github.com/facebook/rocksdb/blob/d52b520d5168de6be5f1494b2035b61ff0958c11/include/rocksdb/table.h#L368-L394

```C++
  // We currently have five versions:
  // 0 -- This version is currently written out by all RocksDB's versions by
  // default.  Can be read by really old RocksDB's. Doesn't support changing
  // checksum (default is CRC32).
  // 1 -- Can be read by RocksDB's versions since 3.0. Supports non-default
  // checksum, like xxHash. It is written by RocksDB when
  // BlockBasedTableOptions::checksum is something other than kCRC32c. (version
  // 0 is silently upconverted)
  // 2 -- Can be read by RocksDB's versions since 3.10. Changes the way we
  // encode compressed blocks with LZ4, BZip2 and Zlib compression. If you
  // don't plan to run RocksDB before version 3.10, you should probably use
  // this.
  // 3 -- Can be read by RocksDB's versions since 5.15. Changes the way we
  // encode the keys in index blocks. If you don't plan to run RocksDB before
  // version 5.15, you should probably use this.
  // This option only affects newly written tables. When reading existing
  // tables, the information about version is read from the footer.
  // 4 -- Can be read by RocksDB's versions since 5.16. Changes the way we
  // encode the values in index blocks. If you don't plan to run RocksDB before
  // version 5.16 and you are using index_block_restart_interval > 1, you should
  // probably use this as it would reduce the index size.
  // This option only affects newly written tables. When reading existing
  // tables, the information about version is read from the footer.
  // 5 -- Can be read by RocksDB's versions since 6.6.0. Full and partitioned
  // filters use a generally faster and more accurate Bloom filter
  // implementation, with a different schema.
  uint32_t format_version = 5;
```
Different format versions require different RocksDB versions, and it is not possible to roll back after upgrading to a new format version.

In our current RocksDB storage code, format_version is hard-coded to 2, which makes it hard to upgrade format_version and benefit from the higher performance of newer RocksDB versions.

### Changes

1. Make the format_version configurable.
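
A minimal sketch of what a configurable format version could look like. The property key `rocksdb.format_version` and the class `RocksDbFormatVersion` are hypothetical; the default of 2 is the previously hard-coded value mentioned above, and the upper bound of 5 comes from the quoted RocksDB comment.

```java
import java.util.Properties;

final class RocksDbFormatVersion {
    // Hypothetical configuration key, chosen for this example only.
    static final String KEY = "rocksdb.format_version";
    // The value that was previously hard-coded, kept as the default.
    static final int DEFAULT_VERSION = 2;
    // Highest version listed in the RocksDB comment quoted above.
    static final int MAX_KNOWN_VERSION = 5;

    static int resolve(Properties conf) {
        String raw = conf.getProperty(KEY);
        if (raw == null) {
            return DEFAULT_VERSION;
        }
        int version = Integer.parseInt(raw.trim());
        if (version < 0 || version > MAX_KNOWN_VERSION) {
            throw new IllegalArgumentException("unsupported format_version: " + version);
        }
        return version;
    }
}
```

Keeping the old value as the default matters here because, as noted above, data written with a newer format version cannot be read after rolling back to an older RocksDB.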

Reviewers: Matteo Merli <mmerli@apache.org>, Enrico Olivelli <eolivelli@gmail.com>

This closes #2824 from hangc0276/chenhang/make_rocksdb_format_version_configurable

4 months agoEnsure BookKeeper process receives sigterm in docker container
Jack Vanlightly [Mon, 14 Feb 2022 18:56:19 +0000 (19:56 +0100)] 
Ensure BookKeeper process receives sigterm in docker container

### Motivation

Current official docker images do not handle the SIGTERM sent by the docker runtime and so get killed after the timeout. No graceful shutdown occurs.

The reason is that the entrypoint does not use `exec` when executing the `bin/bookkeeper` shell script and so the BookKeeper process cannot receive signals from the docker runtime.

### Changes

Use `exec` when calling the `bin/bookkeeper` shell script.

Reviewers: Nicolò Boschi <boschi1997@gmail.com>, Enrico Olivelli <eolivelli@gmail.com>, Lari Hotari <None>, Matteo Merli <mmerli@apache.org>

This closes #2857 from Vanlightly/docker-image-handle-sigterm

4 months agochange log level from error to warn when dns resolver initialize failed (#2856)
Hang Chen [Mon, 14 Feb 2022 02:39:23 +0000 (10:39 +0800)] 
change log level from error to warn when dns resolver initialize failed (#2856)

Descriptions of the changes in this PR:

### Motivation
When a bookie starts and the DNS resolver fails to initialize, it logs the following error message.
```
[main] ERROR org.apache.bookkeeper.client.RackawareEnsemblePlacementPolicyImpl - Failed to initialize DNS Resolver org.apache.bookkeeper.net.ScriptBasedMapping, used default subnet resolver : java.lang.RuntimeException: No network topology script is found when using script based DNS resolver.
```
It is confusing for users.

### Modification
1. Change the log level from error to warn.

4 months ago[ISSUE 3038] Fixed flaky CompactionTest.testMinorCompactionWithMaxTimeMillis (#3039)
Andrey Yegorov [Fri, 11 Feb 2022 07:16:30 +0000 (23:16 -0800)] 
[ISSUE 3038] Fixed flaky CompactionTest.testMinorCompactionWithMaxTimeMillis (#3039)

4 months agoSupport multi ledger directories for rocksdb backend entryMetadataMap (#2965)
Hang Chen [Fri, 11 Feb 2022 03:12:27 +0000 (11:12 +0800)] 
Support multi ledger directories for rocksdb backend entryMetadataMap (#2965)

### Motivation
When the RocksDB-backed entryMetadataMap is used with multiple ledger directories configured, the bookie fails to start and throws the following exception.
```
12:24:28.530 [main] ERROR org.apache.pulsar.PulsarStandaloneStarter - Failed to start pulsar service.
java.io.IOException: Error open RocksDB database
        at org.apache.bookkeeper.bookie.storage.ldb.KeyValueStorageRocksDB.<init>(KeyValueStorageRocksDB.java:202) ~[org.apache.bookkeeper-bookkeeper-server-4.15.0-SNAPSHOT.jar:4.15.0-SNAPSHOT]
        at org.apache.bookkeeper.bookie.storage.ldb.KeyValueStorageRocksDB.<init>(KeyValueStorageRocksDB.java:89) ~[org.apache.bookkeeper-bookkeeper-server-4.15.0-SNAPSHOT.jar:4.15.0-SNAPSHOT]
        at org.apache.bookkeeper.bookie.storage.ldb.KeyValueStorageRocksDB.lambda$static$0(KeyValueStorageRocksDB.java:62) ~[org.apache.bookkeeper-bookkeeper-server-4.15.0-SNAPSHOT.jar:4.15.0-SNAPSHOT]
        at org.apache.bookkeeper.bookie.storage.ldb.PersistentEntryLogMetadataMap.<init>(PersistentEntryLogMetadataMap.java:87) ~[org.apache.bookkeeper-bookkeeper-server-4.15.0-SNAPSHOT.jar:4.15.0-SNAPSHOT]
        at org.apache.bookkeeper.bookie.GarbageCollectorThread.createEntryLogMetadataMap(GarbageCollectorThread.java:265) ~[org.apache.bookkeeper-bookkeeper-server-4.15.0-SNAPSHOT.jar:4.15.0-SNAPSHOT]
        at org.apache.bookkeeper.bookie.GarbageCollectorThread.<init>(GarbageCollectorThread.java:154) ~[org.apache.bookkeeper-bookkeeper-server-4.15.0-SNAPSHOT.jar:4.15.0-SNAPSHOT]
        at org.apache.bookkeeper.bookie.GarbageCollectorThread.<init>(GarbageCollectorThread.java:133) ~[org.apache.bookkeeper-bookkeeper-server-4.15.0-SNAPSHOT.jar:4.15.0-SNAPSHOT]
        at org.apache.bookkeeper.bookie.storage.ldb.SingleDirectoryDbLedgerStorage.<init>(SingleDirectoryDbLedgerStorage.java:182) ~[org.apache.bookkeeper-bookkeeper-server-4.15.0-SNAPSHOT.jar:4.15.0-SNAPSHOT]
        at org.apache.bookkeeper.bookie.storage.ldb.DbLedgerStorage.newSingleDirectoryDbLedgerStorage(DbLedgerStorage.java:190) ~[org.apache.bookkeeper-bookkeeper-server-4.15.0-SNAPSHOT.jar:4.15.0-SNAPSHOT]
        at org.apache.bookkeeper.bookie.storage.ldb.DbLedgerStorage.initialize(DbLedgerStorage.java:150) ~[org.apache.bookkeeper-bookkeeper-server-4.15.0-SNAPSHOT.jar:4.15.0-SNAPSHOT]
        at org.apache.bookkeeper.bookie.BookieResources.createLedgerStorage(BookieResources.java:110) ~[org.apache.bookkeeper-bookkeeper-server-4.15.0-SNAPSHOT.jar:4.15.0-SNAPSHOT]
        at org.apache.pulsar.zookeeper.LocalBookkeeperEnsemble.buildBookie(LocalBookkeeperEnsemble.java:328) ~[org.apache.pulsar-pulsar-zookeeper-utils-2.8.1.jar:2.8.1]
        at org.apache.pulsar.zookeeper.LocalBookkeeperEnsemble.runBookies(LocalBookkeeperEnsemble.java:391) ~[org.apache.pulsar-pulsar-zookeeper-utils-2.8.1.jar:2.8.1]
        at org.apache.pulsar.zookeeper.LocalBookkeeperEnsemble.startStandalone(LocalBookkeeperEnsemble.java:521) ~[org.apache.pulsar-pulsar-zookeeper-utils-2.8.1.jar:2.8.1]
        at org.apache.pulsar.PulsarStandalone.start(PulsarStandalone.java:264) ~[org.apache.pulsar-pulsar-broker-2.8.1.jar:2.8.1]
        at org.apache.pulsar.PulsarStandaloneStarter.main(PulsarStandaloneStarter.java:121) [org.apache.pulsar-pulsar-broker-2.8.1.jar:2.8.1]
Caused by: org.rocksdb.RocksDBException: lock hold by current process, acquire time 1640492668 acquiring thread 123145515651072: data/standalone/bookkeeper00/entrylogIndexCache/metadata-cache/LOCK: No locks available
        at org.rocksdb.RocksDB.open(Native Method) ~[org.rocksdb-rocksdbjni-6.10.2.jar:?]
        at org.rocksdb.RocksDB.open(RocksDB.java:239) ~[org.rocksdb-rocksdbjni-6.10.2.jar:?]
        at org.apache.bookkeeper.bookie.storage.ldb.KeyValueStorageRocksDB.<init>(KeyValueStorageRocksDB.java:199) ~[org.apache.bookkeeper-bookkeeper-server-4.15.0-SNAPSHOT.jar:4.15.0-SNAPSHOT]
        ... 15 more
```

The reason is that multiple GarbageCollectorThread instances try to open the same RocksDB database, whose LOCK is already held, which results in the above exception.

### Modification
1. Change the default GcEntryLogMetadataCachePath from `getLedgerDirNames()[0] + "/" + ENTRYLOG_INDEX_CACHE` to `null`. If it is `null`, each ledger directory uses its own directory.
2. Remove the internal directory `entrylogIndexCache`. The resulting directory layout looks like:
```
   └── current
       ├── lastMark
       ├── ledgers
       │   ├── 000003.log
       │   ├── CURRENT
       │   ├── IDENTITY
       │   ├── LOCK
       │   ├── LOG
       │   ├── MANIFEST-000001
       │   └── OPTIONS-000005
       ├── locations
       │   ├── 000003.log
       │   ├── CURRENT
       │   ├── IDENTITY
       │   ├── LOCK
       │   ├── LOG
       │   ├── MANIFEST-000001
       │   └── OPTIONS-000005
       └── metadata-cache
           ├── 000003.log
           ├── CURRENT
           ├── IDENTITY
           ├── LOCK
           ├── LOG
           ├── MANIFEST-000001
           └── OPTIONS-000005
```
3. If the user configures `GcEntryLogMetadataCachePath` in `bk_server.conf`, only one ledger directory is supported in `ledgerDirectories`. Otherwise, the best practice is to keep the default.
4. This PR is best released together with #1949
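
The path selection in items 1 and 3 can be sketched as follows. `MetadataCachePaths` and its parameters are hypothetical names; the `metadata-cache` directory name is taken from the layout above, and rejecting an explicit path when several ledger directories are configured is an assumption of this sketch, not confirmed behaviour of the PR.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

final class MetadataCachePaths {
    // configuredPath: the explicitly configured cache path, or null for the default.
    // ledgerCurrentDir: the "current" directory of one ledger directory.
    static Path resolve(String configuredPath, String ledgerCurrentDir, int ledgerDirCount) {
        if (configuredPath == null) {
            // Default: every ledger directory gets its own cache, so each
            // GarbageCollectorThread opens a separate RocksDB instance.
            return Paths.get(ledgerCurrentDir, "metadata-cache");
        }
        if (ledgerDirCount > 1) {
            // Assumption: a single shared path cannot serve several directories,
            // because only one process-wide LOCK can be held on it.
            throw new IllegalArgumentException(
                    "an explicit cache path supports only one ledger directory");
        }
        return Paths.get(configuredPath);
    }
}
```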

4 months agoAdd rack name invalid check (#2980)
Hang Chen [Fri, 11 Feb 2022 03:11:03 +0000 (11:11 +0800)] 
Add rack name invalid check (#2980)

### Motivation
When a region- or rack-aware placement policy is set but the region or rack name is `/` or an empty string, the following exception is thrown while handling bookies that join.
```
java.lang.StringIndexOutOfBoundsException: String index out of range: -1
        at java.lang.String.substring(String.java:1841) ~[?:?]
        at org.apache.bookkeeper.net.NetworkTopologyImpl$InnerNode.getNextAncestorName(NetworkTopologyImpl.java:144) ~[io.streamnative-bookkeeper-server-4.14.3.1.jar:4.14.3.1]
        at org.apache.bookkeeper.net.NetworkTopologyImpl$InnerNode.add(NetworkTopologyImpl.java:180) ~[io.streamnative-bookkeeper-server-4.14.3.1.jar:4.14.3.1]
        at org.apache.bookkeeper.net.NetworkTopologyImpl.add(NetworkTopologyImpl.java:425) ~[io.streamnative-bookkeeper-server-4.14.3.1.jar:4.14.3.1]
        at org.apache.bookkeeper.client.TopologyAwareEnsemblePlacementPolicy.handleBookiesThatJoined(TopologyAwareEnsemblePlacementPolicy.java:717) ~[io.streamnative-bookkeeper-server-4.14.3.1.jar:4.14.3.1]
        at org.apache.bookkeeper.client.RackawareEnsemblePlacementPolicyImpl.handleBookiesThatJoined(RackawareEnsemblePlacementPolicyImpl.java:80) ~[io.streamnative-bookkeeper-server-4.14.3.1.jar:4.14.3.1]
        at org.apache.bookkeeper.client.RackawareEnsemblePlacementPolicy.handleBookiesThatJoined(RackawareEnsemblePlacementPolicy.java:249) ~[io.streamnative-bookkeeper-server-4.14.3.1.jar:4.14.3.1]
        at org.apache.bookkeeper.client.TopologyAwareEnsemblePlacementPolicy.onClusterChanged(TopologyAwareEnsemblePlacementPolicy.java:663) ~[io.streamnative-bookkeeper-server-4.14.3.1.jar:4.14.3.1]
        at org.apache.bookkeeper.client.RackawareEnsemblePlacementPolicyImpl.onClusterChanged(RackawareEnsemblePlacementPolicyImpl.java:80) ~[io.streamnative-bookkeeper-server-4.14.3.1.jar:4.14.3.1]
        at org.apache.bookkeeper.client.RackawareEnsemblePlacementPolicy.onClusterChanged(RackawareEnsemblePlacementPolicy.java:92) ~[io.streamnative-bookkeeper-server-4.14.3.1.jar:4.14.3.1]
        at org.apache.bookkeeper.client.BookieWatcherImpl.processWritableBookiesChanged(BookieWatcherImpl.java:197) ~[io.streamnative-bookkeeper-server-4.14.3.1.jar:4.14.3.1]
        at org.apache.bookkeeper.client.BookieWatcherImpl.lambda$initialBlockingBookieRead$1(BookieWatcherImpl.java:233) ~[io.streamnative-bookkeeper-server-4.14.3.1.jar:4.14.3.1]
        at org.apache.bookkeeper.discover.ZKRegistrationClient$WatchTask.accept(ZKRegistrationClient.java:147) [io.streamnative-bookkeeper-server-4.14.3.1.jar:4.14.3.1]
        at org.apache.bookkeeper.discover.ZKRegistrationClient$WatchTask.accept(ZKRegistrationClient.java:70) [io.streamnative-bookkeeper-server-4.14.3.1.jar:4.14.3.1]
        at java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:859) [?:?]
        at java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:837) [?:?]
        at java.util.concurrent.CompletableFuture$Completion.run(CompletableFuture.java:478) [?:?]
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515) [?:?]
        at java.util.concurrent.FutureTask.run(FutureTask.java:264) [?:?]
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:304) [?:?]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [?:?]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [?:?]
        at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) [io.netty-netty-common-4.1.72.Final.jar:4.1.72.Final]
        at java.lang.Thread.run(Thread.java:829) [?:?]
```
The root cause is that the node's networkLocation is an empty string, and calling `substring(1)` on it leads to `StringIndexOutOfBoundsException`.

### Modification
1. Add a check for an empty `n.getNetworkLocation()` in the `isAncestor` method to make the exception clearer.
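
The failure mode and the kind of up-front check described above can be reproduced with plain strings. `NetworkLocationCheck` and its methods are hypothetical helpers written for this example, not the `NetworkTopologyImpl` code.

```java
final class NetworkLocationCheck {
    // Reject locations that would otherwise fail later with an opaque
    // StringIndexOutOfBoundsException.
    static String validate(String networkLocation) {
        if (networkLocation == null
                || networkLocation.isEmpty()
                || networkLocation.equals("/")
                || !networkLocation.startsWith("/")) {
            throw new IllegalArgumentException(
                    "Invalid network location: '" + networkLocation + "'");
        }
        return networkLocation;
    }

    // Returns the first path component, e.g. "region-a" for "/region-a/rack-1".
    static String firstComponent(String networkLocation) {
        String rest = validate(networkLocation).substring(1); // drop the leading "/"
        int idx = rest.indexOf('/');
        return idx < 0 ? rest : rest.substring(0, idx);
    }
}
```

Without the check, `"".substring(1)` throws `StringIndexOutOfBoundsException`, which is the exception at the top of the stack trace above.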

4 months agoSkip update entryLogMetaMap if not modified (#2964)
Hang Chen [Fri, 11 Feb 2022 03:10:28 +0000 (11:10 +0800)] 
Skip update entryLogMetaMap if not modified (#2964)

### Motivation
Now that the RocksDB-backed entryMetaMap is supported, we should avoid updating the entryMetaMap unnecessarily.

The `doGcEntryLogs` method iterates through the entryLogMetaMap and updates the meta if a ledger no longer exists. We should check whether the meta has been modified in `removeIfLedgerNotExists`. If it has not been modified, we can skip updating the entryLogMetaMap.

### Modification
 1. Add a flag indicating whether the meta was modified in the `removeIfLedgerNotExists` method. If it was not, skip updating the entryLogMetaMap.
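
A minimal sketch of the "write back only if modified" idea, using in-memory maps in place of the RocksDB-backed store. All names here (`SkipUnmodifiedSketch`, `Meta`, `gcEntryLogs`) are hypothetical and simplified; only `removeIfLedgerNotExists` echoes a method name from the text above.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.function.LongPredicate;

final class SkipUnmodifiedSketch {
    // Simplified entry log metadata: the set of ledger ids stored in one entry log.
    static final class Meta {
        final Set<Long> ledgers = new HashSet<>();

        // Returns true only if at least one ledger was actually removed.
        boolean removeIfLedgerNotExists(LongPredicate ledgerExists) {
            return ledgers.removeIf(id -> !ledgerExists.test(id));
        }
    }

    // Returns the number of write-backs to the (potentially expensive) store.
    static int gcEntryLogs(Map<Long, Meta> metaMap, LongPredicate ledgerExists,
                           Map<Long, Meta> store) {
        int writes = 0;
        for (Map.Entry<Long, Meta> e : metaMap.entrySet()) {
            boolean modified = e.getValue().removeIfLedgerNotExists(ledgerExists);
            if (modified) {
                store.put(e.getKey(), e.getValue());
                writes++;
            }
        }
        return writes;
    }
}
```

Entry logs whose ledgers all still exist cause no write at all, which is the saving the flag is meant to provide.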

4 months agoUpgrade RocksDB
Andrey Yegorov [Wed, 9 Feb 2022 21:13:44 +0000 (13:13 -0800)] 
Upgrade RocksDB

Descriptions of the changes in this PR:

Dependency change

### Motivation

I encountered https://github.com/apache/bookkeeper/issues/3024 and noticed that newer versions of RocksDB include multiple fixes for concurrency issues with various side effects, as well as fixes for a few crashes.
I upgraded, ran `org.apache.bookkeeper.bookie.BookieJournalTest` test in a loop and didn't repro the crash so far.
It is hard to say 100% if it is fixed given it was not happening all the time.

### Changes

Upgraded RocksDB
Master Issue: #3024

Reviewers: Enrico Olivelli <eolivelli@gmail.com>, Nicolò Boschi <boschi1997@gmail.com>

This closes #3026 from dlg99/rocksdb-upgrade

4 months agoFix performance issue to avoid unnecessary loop. (#3030)
Qiang Zhao [Wed, 9 Feb 2022 19:24:21 +0000 (03:24 +0800)] 
Fix performance issue to avoid unnecessary loop. (#3030)

4 months agoSupport specifying bookie http port as a command argument (#2769)
Yang Yang [Wed, 9 Feb 2022 01:54:54 +0000 (09:54 +0800)] 
Support specifying bookie http port as a command argument (#2769)

### Motivation

I was trying to start multiple bookies locally and found it's a bit inconvenient to specify different http ports for different bookies.

### Changes

Add a command-line argument `httpport` to the bookie command to support specifying bookie http port from the command line.
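
A minimal sketch of how a command-line port can override a configured one. The `--httpport <port>` spelling and the `HttpPortArg` class are assumptions made for this example; the commit message only names the argument `httpport`, and the real bookie command may parse it differently.

```java
final class HttpPortArg {
    // Returns the port given on the command line, or configuredPort if absent.
    static int resolve(String[] args, int configuredPort) {
        for (int i = 0; i < args.length - 1; i++) {
            if (args[i].equals("--httpport")) { // hypothetical flag spelling
                int port = Integer.parseInt(args[i + 1]);
                if (port < 1 || port > 65535) {
                    throw new IllegalArgumentException("invalid http port: " + port);
                }
                return port;
            }
        }
        return configuredPort;
    }
}
```

Starting several local bookies then only requires a different argument per process instead of a separate configuration file for each.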