Simplify bloom filters (#258)
author Claude Warren <claude@apache.org>
Wed, 15 Jun 2022 17:24:22 +0000 (18:24 +0100)
committer GitHub <noreply@github.com>
Wed, 15 Jun 2022 17:24:22 +0000 (13:24 -0400)
commit 87647d0812f3ec8f10ead3d2059e60801f7538b7
tree 8b356b8b23ac379d9df6462554002dcb95b9de09
parent 1e6ccae88426aa8c2d257fe0f42c382ea9b9c14e

* Fixed some unit tests

* First set with complete test cases.

* Cleaned up hasher collection processing

* cleaned up code

* added license headers

* Refactored and cleaned up

Moved to dependency on BitMapProducer, IndexProducer and
BitCountProducer to retrieve internal representations of the data.

* Added license header.

* Updated documentation

* Fixed bug and added tests

* Added "@since 4.5" where necessary

* Added BitMapProducer constructor to SimpleBloomFilter

* added BitMapProducer.fromLongArray() and Hasher.isEmpty()

* Changes to speed up Simple filter processing

* Null hasher used when a hasher is required but no values are available.

* Added Hasher.Filter and Hasher.FilteredIntConsumer

* Updated documentation + formatted.

* Added license

* fixed checkstyle issues

* fixed javadoc issues

* fixed test issue

* fixed javadoc issues

* Reduced the acceptable delta for p tests

* Updated docs and test cases

* Updated docs and test cases

* fixed issue with Shape javadoc

* Added more test coverage.

* fixed formatting issues

* Updated tests to use assertThrows.

* fixed indents

* Added constructor with IndexProducer

* Fixed issue with compare and different length bitMap arrays

* fixed formatting issues

* Efficiency changes

cleaned up asIndexArray

BitMapProducer to IndexProducer conversion

* changed XProviders to use XPredicates

* Removed NoMatchException

* Removed unneeded BitMap funcs

Moves isSparse() to Shape.

* fixed checkstyle issues

* Fixed javadoc errors

* simplified parameter in BitMapProducer.fromIndexProducer

* fixed tests

* added BitMapping verification

* Added more tests

* Added more tests

* Fixed typos

* Changes requested by aherbert

* fixed "bit map" in documentation

* Renamed tests

* Removed blank lines

* changed new X<foo> to new X<>

* updated documentation

* Added BloomFilter.copy()

* changed ArrayCountingBloomFilter to use copy() method

* cleaned up numberOfBitMaps()

* added asBitMapArray() and makePredicate() to BitMapProducer

* Moved asIndexArray() to IndexProducer

* harmonized Simple and Sparse Bloom filter constructors

* Implemented AbstractCountingBloomFilter.asIndexArray()

* updated documentation

* fixed up NullHasher

* implemented hasher filter

* Fixed style issues

* added default SimpleHasher increment.

* Added modulus calculation to SimpleHasher

* fixed Hashing issues

* moved hasher/filter/* to /hasher

* moved bloomfilter/hasher to bloomfilter

* fixed up checkstyle issues

* Made Filter -> IndexFilter with factory

* moved IndexFilter into Hasher

* updated hashing tests & fixed checkstyle

* removed SingleItemHasherCollection and associated methods

* fixed javadoc issues

* fixed javadoc issues

* added checks for BitMapProducer limits and index limits

* updated tests

* added tests

* fixed checkstyle issues

* fixed formatting and test coverage

* fixed javadoc issue

* put back checkstyle.xml

* switched to forEachBitMapPair

* updated BitMap and Index array production

* fixed merge with BitMapProducer

* Cleaned up formatting

* fixed checkstyle issues

* fixed coding issues

* updated documentation

* simplified test

* removed unwanted merge files

* removed duplicate entry

* put back test that was incorrectly removed

* fixed asIndexArray error

* fixed checkstyle errors

* Changes for last review
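
Illustration (not part of the commit): several entries above refer to retrieving a filter's internal representation either as bit indices or as an array of 64-bit bit maps, and to converting between the two. The sketch below shows one way such a conversion could look. The names BitMapSketch, IndexSource, fromIndices, toBitMaps and contains are hypothetical and chosen only for this example; they are not the signatures of the commons-collections classes named in this commit.

```java
import java.util.function.IntPredicate;

// Hypothetical sketch: converts a stream of bit indices into 64-bit bit maps.
class BitMapSketch {

    /** Hypothetical stand-in for an index producer: passes each enabled bit index to a consumer. */
    @FunctionalInterface
    interface IndexSource {
        /** Returns false if the consumer stopped the iteration early. */
        boolean forEachIndex(IntPredicate consumer);
    }

    /** Wraps a fixed list of indices as an IndexSource. */
    static IndexSource fromIndices(int... indices) {
        return consumer -> {
            for (int i : indices) {
                if (!consumer.test(i)) {
                    return false;
                }
            }
            return true;
        };
    }

    /** Number of longs needed to hold numberOfBits bits (ceiling of numberOfBits / 64). */
    static int numberOfBitMaps(int numberOfBits) {
        return (numberOfBits + 63) / 64;
    }

    /** Builds the bit map array, rejecting indices outside [0, numberOfBits). */
    static long[] toBitMaps(IndexSource source, int numberOfBits) {
        long[] bitMaps = new long[numberOfBitMaps(numberOfBits)];
        source.forEachIndex(i -> {
            if (i < 0 || i >= numberOfBits) {
                throw new IndexOutOfBoundsException("index out of range: " + i);
            }
            // i >> 6 selects the long; the shift 1L << i uses only the low 6 bits of i.
            bitMaps[i >> 6] |= 1L << i;
            return true;
        });
        return bitMaps;
    }

    /** Tests whether the bit at the given index is enabled. */
    static boolean contains(long[] bitMaps, int index) {
        return (bitMaps[index >> 6] & (1L << index)) != 0;
    }

    public static void main(String[] args) {
        long[] maps = toBitMaps(fromIndices(1, 64, 70), 128);
        System.out.println(maps.length + " bit maps, contains(70) = " + contains(maps, 70));
    }
}
```

Passing a predicate rather than returning an array lets the caller stop the iteration early, which is the general idea behind predicate-style consumers; whether the library uses it in exactly this way is an assumption here.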
88 files changed:
pom.xml
src/main/java/org/apache/commons/collections4/bloomfilter/AbstractBloomFilter.java [deleted file]
src/main/java/org/apache/commons/collections4/bloomfilter/ArrayCountingBloomFilter.java
src/main/java/org/apache/commons/collections4/bloomfilter/BitCountProducer.java [new file with mode: 0644]
src/main/java/org/apache/commons/collections4/bloomfilter/BitMap.java [moved from src/main/java/org/apache/commons/collections4/bloomfilter/BloomFilterIndexer.java with 53% similarity]
src/main/java/org/apache/commons/collections4/bloomfilter/BitMapProducer.java [new file with mode: 0644]
src/main/java/org/apache/commons/collections4/bloomfilter/BitSetBloomFilter.java [deleted file]
src/main/java/org/apache/commons/collections4/bloomfilter/BloomFilter.java
src/main/java/org/apache/commons/collections4/bloomfilter/CountingBloomFilter.java
src/main/java/org/apache/commons/collections4/bloomfilter/Hasher.java [new file with mode: 0644]
src/main/java/org/apache/commons/collections4/bloomfilter/HasherBloomFilter.java [deleted file]
src/main/java/org/apache/commons/collections4/bloomfilter/HasherCollection.java [new file with mode: 0644]
src/main/java/org/apache/commons/collections4/bloomfilter/IndexFilters.java [deleted file]
src/main/java/org/apache/commons/collections4/bloomfilter/IndexProducer.java [new file with mode: 0644]
src/main/java/org/apache/commons/collections4/bloomfilter/LongBiPredicate.java [new file with mode: 0644]
src/main/java/org/apache/commons/collections4/bloomfilter/SetOperations.java
src/main/java/org/apache/commons/collections4/bloomfilter/Shape.java [moved from src/main/java/org/apache/commons/collections4/bloomfilter/hasher/Shape.java with 59% similarity]
src/main/java/org/apache/commons/collections4/bloomfilter/SimpleBloomFilter.java [new file with mode: 0644]
src/main/java/org/apache/commons/collections4/bloomfilter/SimpleHasher.java [new file with mode: 0644]
src/main/java/org/apache/commons/collections4/bloomfilter/SparseBloomFilter.java [new file with mode: 0644]
src/main/java/org/apache/commons/collections4/bloomfilter/hasher/DynamicHasher.java [deleted file]
src/main/java/org/apache/commons/collections4/bloomfilter/hasher/HashFunction.java [deleted file]
src/main/java/org/apache/commons/collections4/bloomfilter/hasher/HashFunctionIdentity.java [deleted file]
src/main/java/org/apache/commons/collections4/bloomfilter/hasher/HashFunctionIdentityImpl.java [deleted file]
src/main/java/org/apache/commons/collections4/bloomfilter/hasher/HashFunctionValidator.java [deleted file]
src/main/java/org/apache/commons/collections4/bloomfilter/hasher/Hasher.java [deleted file]
src/main/java/org/apache/commons/collections4/bloomfilter/hasher/StaticHasher.java [deleted file]
src/main/java/org/apache/commons/collections4/bloomfilter/hasher/function/MD5Cyclic.java [deleted file]
src/main/java/org/apache/commons/collections4/bloomfilter/hasher/function/Murmur128x64Cyclic.java [deleted file]
src/main/java/org/apache/commons/collections4/bloomfilter/hasher/function/Murmur32x86Iterative.java [deleted file]
src/main/java/org/apache/commons/collections4/bloomfilter/hasher/function/ObjectsHashIterative.java [deleted file]
src/main/java/org/apache/commons/collections4/bloomfilter/hasher/function/Signatures.java [deleted file]
src/main/java/org/apache/commons/collections4/bloomfilter/package-info.java
src/test/java/org/apache/commons/collections4/bloomfilter/AbstractBitCountProducerTest.java [new file with mode: 0644]
src/test/java/org/apache/commons/collections4/bloomfilter/AbstractBitMapProducerTest.java [new file with mode: 0644]
src/test/java/org/apache/commons/collections4/bloomfilter/AbstractBloomFilterTest.java
src/test/java/org/apache/commons/collections4/bloomfilter/AbstractCountingBloomFilterTest.java [new file with mode: 0644]
src/test/java/org/apache/commons/collections4/bloomfilter/AbstractHasherTest.java [new file with mode: 0644]
src/test/java/org/apache/commons/collections4/bloomfilter/AbstractIndexProducerTest.java [new file with mode: 0644]
src/test/java/org/apache/commons/collections4/bloomfilter/ArrayCountingBloomFilterTest.java
src/test/java/org/apache/commons/collections4/bloomfilter/ArrayTrackerTest.java [new file with mode: 0644]
src/test/java/org/apache/commons/collections4/bloomfilter/BitCountProducerFromArrayCountingBloomFilterTest.java [new file with mode: 0644]
src/test/java/org/apache/commons/collections4/bloomfilter/BitCountProducerFromIndexProducerTest.java [new file with mode: 0644]
src/test/java/org/apache/commons/collections4/bloomfilter/BitMapProducerFromArrayCountingBloomFilterTest.java [new file with mode: 0644]
src/test/java/org/apache/commons/collections4/bloomfilter/BitMapProducerFromIndexProducerTest.java [new file with mode: 0644]
src/test/java/org/apache/commons/collections4/bloomfilter/BitMapProducerFromLongArrayTest.java [new file with mode: 0644]
src/test/java/org/apache/commons/collections4/bloomfilter/BitMapProducerFromSimpleBloomFilterTest.java [new file with mode: 0644]
src/test/java/org/apache/commons/collections4/bloomfilter/BitMapProducerFromSparseBloomFilterTest.java [new file with mode: 0644]
src/test/java/org/apache/commons/collections4/bloomfilter/BitMapTest.java [new file with mode: 0644]
src/test/java/org/apache/commons/collections4/bloomfilter/BitMapTrackerTest.java [new file with mode: 0644]
src/test/java/org/apache/commons/collections4/bloomfilter/BloomFilterIndexerTest.java [deleted file]
src/test/java/org/apache/commons/collections4/bloomfilter/DefaultBitMapProducerTest.java [new file with mode: 0644]
src/test/java/org/apache/commons/collections4/bloomfilter/DefaultBloomFilterMethodsTest.java [deleted file]
src/test/java/org/apache/commons/collections4/bloomfilter/DefaultBloomFilterTest.java [new file with mode: 0644]
src/test/java/org/apache/commons/collections4/bloomfilter/FixedIndexesTestHasher.java [deleted file]
src/test/java/org/apache/commons/collections4/bloomfilter/HasherBloomFilterTest.java [deleted file]
src/test/java/org/apache/commons/collections4/bloomfilter/HasherCollectionTest.java [new file with mode: 0644]
src/test/java/org/apache/commons/collections4/bloomfilter/IndexFilterTest.java
src/test/java/org/apache/commons/collections4/bloomfilter/IndexProducerFromArrayCountingBloomFilterTest.java [new file with mode: 0644]
src/test/java/org/apache/commons/collections4/bloomfilter/IndexProducerFromBitmapProducerTest.java [new file with mode: 0644]
src/test/java/org/apache/commons/collections4/bloomfilter/IndexProducerFromHasherCollectionTest.java [new file with mode: 0644]
src/test/java/org/apache/commons/collections4/bloomfilter/IndexProducerFromHasherTest.java [moved from src/main/java/org/apache/commons/collections4/bloomfilter/hasher/function/package-info.java with 66% similarity]
src/test/java/org/apache/commons/collections4/bloomfilter/IndexProducerFromIntArrayTest.java [moved from src/main/java/org/apache/commons/collections4/bloomfilter/hasher/package-info.java with 66% similarity]
src/test/java/org/apache/commons/collections4/bloomfilter/IndexProducerFromSimpleBloomFilterTest.java [new file with mode: 0644]
src/test/java/org/apache/commons/collections4/bloomfilter/IndexProducerFromSparseBloomFilterTest.java [new file with mode: 0644]
src/test/java/org/apache/commons/collections4/bloomfilter/IndexProducerTest.java [new file with mode: 0644]
src/test/java/org/apache/commons/collections4/bloomfilter/NullHasher.java [new file with mode: 0644]
src/test/java/org/apache/commons/collections4/bloomfilter/SetOperationsTest.java
src/test/java/org/apache/commons/collections4/bloomfilter/ShapeTest.java [new file with mode: 0644]
src/test/java/org/apache/commons/collections4/bloomfilter/SimpleBloomFilterTest.java [new file with mode: 0644]
src/test/java/org/apache/commons/collections4/bloomfilter/SimpleHasherTest.java [new file with mode: 0644]
src/test/java/org/apache/commons/collections4/bloomfilter/SparseBloomFilterTest.java [new file with mode: 0644]
src/test/java/org/apache/commons/collections4/bloomfilter/UniqueIndexProducerFromHasherCollectionTest.java [moved from src/test/java/org/apache/commons/collections4/bloomfilter/BitSetBloomFilterTest.java with 63% similarity]
src/test/java/org/apache/commons/collections4/bloomfilter/UniqueIndexProducerFromHasherTest.java [new file with mode: 0644]
src/test/java/org/apache/commons/collections4/bloomfilter/checkstyle.xml [new file with mode: 0644]
src/test/java/org/apache/commons/collections4/bloomfilter/hasher/DynamicHasherBuilderTest.java [deleted file]
src/test/java/org/apache/commons/collections4/bloomfilter/hasher/DynamicHasherTest.java [deleted file]
src/test/java/org/apache/commons/collections4/bloomfilter/hasher/HashFunctionIdentityImplTest.java [deleted file]
src/test/java/org/apache/commons/collections4/bloomfilter/hasher/HashFunctionValidatorTest.java [deleted file]
src/test/java/org/apache/commons/collections4/bloomfilter/hasher/HasherBuilderTest.java [deleted file]
src/test/java/org/apache/commons/collections4/bloomfilter/hasher/ShapeTest.java [deleted file]
src/test/java/org/apache/commons/collections4/bloomfilter/hasher/StaticHasherTest.java [deleted file]
src/test/java/org/apache/commons/collections4/bloomfilter/hasher/function/AbstractHashFunctionTest.java [deleted file]
src/test/java/org/apache/commons/collections4/bloomfilter/hasher/function/MD5CyclicTest.java [deleted file]
src/test/java/org/apache/commons/collections4/bloomfilter/hasher/function/Murmur128x64CyclicTest.java [deleted file]
src/test/java/org/apache/commons/collections4/bloomfilter/hasher/function/Murmur32x86IterativeTest.java [deleted file]
src/test/java/org/apache/commons/collections4/bloomfilter/hasher/function/ObjectsHashIterativeTest.java [deleted file]
src/test/java/org/apache/commons/collections4/map/AbstractMapTest.java