Use binary search to improve DimensionRangeShardSpec lookup (#12417)
author hqx871 <hqx871@gmail.com>
Fri, 15 Apr 2022 16:07:06 +0000 (00:07 +0800)
committer GitHub <noreply@github.com>
Fri, 15 Apr 2022 16:07:06 +0000 (21:37 +0530)
commit a22d4137250b7f074fe596396755494905617e99
tree 6a027a5c8f55d42cee8b3aa14e661939bd5a19a9
parent cd6fba2f6cf6e189ba26097c73b9dbe4f1d0edbc

When there are many shards, the mapper of IndexGeneratorJob appears to spend much of its time
calling DimensionRangeShardSpec.isInChunk to look up the target shard. Using binary search
instead of comparing each input row against every shardSpec can improve this significantly.

Changes:
* Add `BaseDimensionRangeShardSpec` which provides a binary-search-based
   implementation for `createLookup`
* `DimensionRangeShardSpec`, `SingleDimensionShardSpec`, and
   `DimensionRangeBucketShardSpec` now extend `BaseDimensionRangeShardSpec`
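
The idea behind the change can be illustrated with a minimal, self-contained sketch. This is not the code from this commit: `RangeShardLookupSketch`, `RangeShard`, `linearLookup`, and `binaryLookup` are hypothetical names, and the sketch assumes a single string dimension, non-null keys, and non-overlapping `[start, end)` ranges where `null` means unbounded.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch; none of these names come from the Druid code base.
class RangeShardLookupSketch {
    /** Stand-in for a range shard covering [start, end); null means unbounded. */
    static final class RangeShard {
        final String start;
        final String end;
        final int partitionNum;

        RangeShard(String start, String end, int partitionNum) {
            this.start = start;
            this.end = end;
            this.partitionNum = partitionNum;
        }

        boolean isInChunk(String key) {
            return (start == null || start.compareTo(key) <= 0)
                && (end == null || key.compareTo(end) < 0);
        }
    }

    /** Baseline: test the key against every shard, O(n) per row. */
    static RangeShard linearLookup(List<RangeShard> shards, String key) {
        for (RangeShard shard : shards) {
            if (shard.isInChunk(key)) {
                return shard;
            }
        }
        return null;
    }

    /** Orders start boundaries with null (unbounded) first. */
    static int compareNullFirst(String a, String b) {
        if (a == null) {
            return b == null ? 0 : -1;
        }
        return b == null ? 1 : a.compareTo(b);
    }

    /** One-time preparation: copy the shards and sort them by start boundary. */
    static RangeShard[] sortByStart(List<RangeShard> shards) {
        RangeShard[] sorted = shards.toArray(new RangeShard[0]);
        Arrays.sort(sorted, (a, b) -> compareNullFirst(a.start, b.start));
        return sorted;
    }

    /** Binary search for the last shard whose start <= key, O(log n) per row. */
    static RangeShard binaryLookup(RangeShard[] sorted, String key) {
        int lo = 0;
        int hi = sorted.length - 1;
        int candidate = -1;
        while (lo <= hi) {
            int mid = (lo + hi) >>> 1;
            String start = sorted[mid].start;
            if (start == null || start.compareTo(key) <= 0) {
                candidate = mid;   // start <= key, so a later shard may still match
                lo = mid + 1;
            } else {
                hi = mid - 1;      // start > key, so look in the lower half
            }
        }
        // The candidate may still miss if the key falls in a gap between ranges.
        if (candidate >= 0 && sorted[candidate].isInChunk(key)) {
            return sorted[candidate];
        }
        return null;
    }

    public static void main(String[] args) {
        List<RangeShard> shards = Arrays.asList(
            new RangeShard("p", null, 2),
            new RangeShard(null, "g", 0),
            new RangeShard("g", "p", 1));
        RangeShard[] sorted = sortByStart(shards);
        System.out.println(binaryLookup(sorted, "kiwi").partitionNum);
    }
}
```

The sort happens once when the lookup is created, so each row then costs a logarithmic number of comparisons instead of one `isInChunk` call per shard. A multi-dimension range spec would compare tuples of values rather than single strings, which this sketch leaves out.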
core/src/main/java/org/apache/druid/timeline/partition/BaseDimensionRangeShardSpec.java [new file with mode: 0644]
core/src/main/java/org/apache/druid/timeline/partition/DimensionRangeBucketShardSpec.java
core/src/main/java/org/apache/druid/timeline/partition/DimensionRangeShardSpec.java
core/src/main/java/org/apache/druid/timeline/partition/SingleDimensionRangeBucketShardSpec.java
core/src/main/java/org/apache/druid/timeline/partition/SingleDimensionShardSpec.java
core/src/test/java/org/apache/druid/timeline/partition/DimensionRangeShardSpecTest.java
core/src/test/java/org/apache/druid/timeline/partition/SingleDimensionRangeBucketShardSpecTest.java
core/src/test/java/org/apache/druid/timeline/partition/SingleDimensionShardSpecTest.java