aherbert [Thu, 5 May 2022 10:02:40 +0000 (11:02 +0100)]
Javadoc
aherbert [Thu, 5 May 2022 10:01:08 +0000 (11:01 +0100)]
Avoid BigDecimal computing constants on class initialisation
The constants are precomputed and verified in a unit test.
aherbert [Wed, 4 May 2022 16:38:21 +0000 (17:38 +0100)]
Correct user guide table
aherbert [Wed, 4 May 2022 16:37:41 +0000 (17:37 +0100)]
Sonar fix: Compute the value 2^-53 for the assertion
aherbert [Wed, 4 May 2022 16:30:36 +0000 (17:30 +0100)]
Sonar fix: rename variable to avoid restricted identifier
aherbert [Wed, 4 May 2022 16:26:56 +0000 (17:26 +0100)]
Remove incomplete test method
Arturo Bernal [Sun, 17 Apr 2022 18:51:38 +0000 (20:51 +0200)]
Fix malformed format string
Arturo Bernal [Sun, 17 Apr 2022 18:44:55 +0000 (20:44 +0200)]
* Use lambdas.
* Fix javadoc.
aherbert [Tue, 12 Apr 2022 10:43:01 +0000 (11:43 +0100)]
Use https link
aherbert [Tue, 12 Apr 2022 10:40:04 +0000 (11:40 +0100)]
Remove ordered lists from inside paragraph block elements
Alex Herbert [Sun, 6 Mar 2022 22:36:41 +0000 (22:36 +0000)]
Update issue tracking guide
Replace reference to subversion with git.
Alex Herbert [Sun, 6 Mar 2022 22:33:29 +0000 (22:33 +0000)]
Update developer guide
Remove reference to building using ant.
Remove reference to non-existent wiki.
Add sections about GitHub PRs.
Remove git config for setting core.autocrlf. The project uses a
.gitattributes file to define this setting for the text files in the
repository.
Use https links. Fix broken/outdated links.
Alex Herbert [Sat, 5 Mar 2022 08:38:20 +0000 (08:38 +0000)]
Expand the user guide
Create a single user guide document based on the template from Commons
Geometry.
Add more entries to the site 'User Guide' menu.
Added a simple test class to verify the code examples used in the user
guide.
Alex Herbert [Thu, 3 Mar 2022 21:43:20 +0000 (21:43 +0000)]
Correct example code in the user guide
Alex Herbert [Thu, 3 Mar 2022 21:42:56 +0000 (21:42 +0000)]
Add skeleton site documentation for distribution module
Alex Herbert [Thu, 3 Mar 2022 19:13:58 +0000 (19:13 +0000)]
Update site.xml using Commons RNG as template
Gary Gregory [Fri, 25 Feb 2022 13:24:22 +0000 (08:24 -0500)]
Add SECURITY.md
Gary Gregory [Fri, 25 Feb 2022 13:20:22 +0000 (08:20 -0500)]
Add CODE_OF_CONDUCT.md
Alex Herbert [Sat, 29 Jan 2022 08:58:07 +0000 (08:58 +0000)]
Update f distribution pdf function
Greater accuracy is obtained using log(pdf) than using the log
computation. However, the regular pdf may suffer intermediate overflow
with extreme parameters even when the result is finite.
Extract the pdf and logpdf computation to a single method. Try the
regular pdf and switch to the log computation when the density is
sub-normal or overflows.
Alex Herbert [Wed, 26 Jan 2022 19:41:25 +0000 (19:41 +0000)]
Add sample size = 0 test case
Alex Herbert [Wed, 26 Jan 2022 19:41:07 +0000 (19:41 +0000)]
Add special cases for binomial distribution PMF
When x=0 or x=n the power function can be used to evaluate the PMF to 1
ULP. This is more accurate than exp(logPMF) when the expected result is
very small.
Increased precision of reference test values.
Added specific bounds test using exact pmf computation.
Updated test tolerance from default 1e-12 to 1e-15.
Added special case tests for p=0, p=1, n=0, n=1.
Updated code to handle p=-0.0.
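A minimal sketch of the x=0 and x=n special cases, assuming the PMF
reduces to (1-p)^n and p^n at the bounds of the support. The class and
method names are hypothetical; this is not the library implementation.

```java
/** Hypothetical helper used for illustration only. */
final class BinomialPmfBoundSketch {
    /**
     * PMF of Binomial(n, p) evaluated at a bound of the support.
     * Only x = 0 and x = n are handled in this sketch.
     */
    static double pmfAtBound(int n, double p, int x) {
        if (x == 0) {
            // P(X = 0) = (1 - p)^n; also returns 1 when n = 0
            return Math.pow(1 - p, n);
        }
        if (x == n) {
            // P(X = n) = p^n
            return Math.pow(p, n);
        }
        throw new IllegalArgumentException("x must be 0 or n: " + x);
    }

    public static void main(String[] args) {
        // A very small probability that is evaluated directly
        // rather than via exp(logPMF).
        System.out.println(pmfAtBound(1000, 0.5, 0));
    }
}
```

Calling pmfAtBound(1000, 0.5, 0) evaluates 0.5^1000 with a single call
to the power function and no logarithm or exponential round trip.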
Alex Herbert [Wed, 26 Jan 2022 10:00:05 +0000 (10:00 +0000)]
Reuse temp n result
Remove invalid code comment
Alex Herbert [Wed, 26 Jan 2022 01:21:44 +0000 (01:21 +0000)]
Fix typo
Alex Herbert [Wed, 26 Jan 2022 01:21:30 +0000 (01:21 +0000)]
Updated PMF vs log PMF assertion at the support bound
Alex Herbert [Tue, 25 Jan 2022 21:31:04 +0000 (21:31 +0000)]
Test file formatting
aherbert [Wed, 26 Jan 2022 18:05:11 +0000 (18:05 +0000)]
Update F distribution with the beta function
Added second PDF form to the class javadoc.
Use the beta function following the Boost documentation for the F
distribution.
The CDF and SF can use the regularized beta or its complement. This is
chosen depending on the x argument.
The PDF can use the beta derivative using the same logic for choosing
the value passed to the derivative function.
Updated the log PDF to reduce the number of terms.
Added SF values to test files.
Large degrees of freedom data (n=100, m=100) requires a more accurate
source implementation.
aherbert [Wed, 26 Jan 2022 14:20:33 +0000 (14:20 +0000)]
Fix test javadoc
aherbert [Mon, 24 Jan 2022 12:53:10 +0000 (12:53 +0000)]
Add a t-distribution sampler
aherbert [Mon, 24 Jan 2022 13:38:53 +0000 (13:38 +0000)]
Remove trailing whitespace
aherbert [Mon, 24 Jan 2022 13:37:19 +0000 (13:37 +0000)]
Add t distribution test cases for more degrees of freedom
Add df=0.25, 0.5, 10, 1e2, 1e4, 1e6, 1e10, 1e14
Alex Herbert [Sun, 23 Jan 2022 22:47:06 +0000 (22:47 +0000)]
STATISTICS-25: Use beta function for the PDF
Added PDF test data for large degrees of freedom.
Alex Herbert [Sun, 23 Jan 2022 09:13:27 +0000 (09:13 +0000)]
Remove RegularizedBetaUtils
The complement function is provided by Commons Numbers.
Alex Herbert [Sun, 23 Jan 2022 09:07:57 +0000 (09:07 +0000)]
Beta distribution to use RegularizedBeta.complement function
Lower test tolerance from 1e-14 to 8e-15.
Alex Herbert [Sun, 23 Jan 2022 08:28:31 +0000 (08:28 +0000)]
Beta distribution to use beta functions from Commons Numbers
Use RegularizedBeta.derivative.
Use LogBeta in place of 3 LogGamma calls.
Test data has been updated to 17 significant figures. Test tolerance
lowered from 5e-14 to 1e-14.
Alex Herbert [Sun, 23 Jan 2022 07:14:54 +0000 (07:14 +0000)]
Use ParameterizedTest
Alex Herbert [Sat, 22 Jan 2022 14:32:20 +0000 (14:32 +0000)]
STATISTICS-25: T-dist to use the complement of the regularized beta
NUMBERS-181 improved the regularized beta function. This allows
increasing the threshold for switching to the normal approximation.
Alex Herbert [Sat, 22 Jan 2022 09:37:10 +0000 (09:37 +0000)]
Drop redundant test.
aherbert [Fri, 21 Jan 2022 16:59:10 +0000 (16:59 +0000)]
Remove checkedProbability method from the inverse
This method is only of use for bug reporting an extremely unlikely
event. It will negatively impact performance for all distributions
during computation of the inverse.
The extreme bounds of discrete distributions are tested in unit tests.
Any distributions where NaN values are potentially expected (e.g. zero
divide by zero; infinity - infinity) should use unit tests to check
extreme parameterisations.
aherbert [Fri, 21 Jan 2022 16:42:45 +0000 (16:42 +0000)]
Remove HTML for the PDF/CDF in the javadoc
The PDF is defined in the class header using mathjax.
No CDFs are defined for any distribution and removing this is more
consistent with the current javadoc.
aherbert [Fri, 21 Jan 2022 16:37:18 +0000 (16:37 +0000)]
Consistent javadoc for property getters
Use 'Gets the X parameter of this distribution'.
Removes the use of 'Access the X parameter' for some of the javadocs.
Using 'Gets' matches the same usage in the interface for the mean,
variance and lower/upper bounds.
Removes the use of parameter names within @code tags unless these are
defined in the same javadoc, e.g. LogNormalDistribution mu parameter.
aherbert [Fri, 21 Jan 2022 15:01:00 +0000 (15:01 +0000)]
STATISTICS-52: Add a high precision PDF to the normal distribution
Exploit information in the round-off from x*x to increase the precision
of exp(-0.5*x*x) when x is large.
Add a benchmark to demonstrate this has minor impact on the runtime
performance. Accuracy is increased to within 3 ULP (down from hundreds)
for large x values.
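One way to exploit the round-off from x*x is sketched below. It assumes
Java 9+ for Math.fma and is not necessarily how the library does it; the
class and method names are hypothetical. The exact square is written as
hi + lo, so exp(-0.5*x*x) = exp(-0.5*hi) * exp(-0.5*lo).

```java
/** Hypothetical helper used for illustration only. */
final class ExpMinusHalfXXSketch {
    /** Computes exp(-0.5 * x * x) using the round-off of x * x. */
    static double expmhxx(double x) {
        // Rounded square
        final double hi = x * x;
        if (Double.isInfinite(hi)) {
            // x*x overflows: the result underflows to zero
            return 0;
        }
        // Round-off of the square: x*x = hi + lo (requires Java 9+)
        final double lo = Math.fma(x, x, -hi);
        // The second factor is very close to 1 and acts as a correction
        return Math.exp(-0.5 * hi) * Math.exp(-0.5 * lo);
    }

    public static void main(String[] args) {
        final double x = 30.125;
        System.out.println(expmhxx(x));
        System.out.println(Math.exp(-0.5 * x * x));
    }
}
```

The correction factor only matters when x is large, since the absolute
round-off in x*x grows with the magnitude of the square.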
aherbert [Mon, 17 Jan 2022 15:38:38 +0000 (15:38 +0000)]
Fix maxima script to generate normpdf.csv
The constant for sqrt(2*pi) was not correctly declared as a big float.
The regenerated data is different from the 17th significant figure.
Alex Herbert [Mon, 3 Jan 2022 22:00:11 +0000 (22:00 +0000)]
Update NOTICE to 2022
Alex Herbert [Mon, 3 Jan 2022 21:57:33 +0000 (21:57 +0000)]
Add second reference data for additional moments
Update computation of variance to avoid the use of the mean squared.
This avoids loss of precision in (omega/mu) by not having to square the
sqrt(omega/mu).
Alex Herbert [Fri, 24 Dec 2021 15:08:39 +0000 (15:08 +0000)]
Increase Exponential PDF test tolerance
The Math.exp function is platform dependent and has lower accuracy than
1 ULP on the given test data depending on the JDK and OS.
Alex Herbert [Fri, 24 Dec 2021 14:36:09 +0000 (14:36 +0000)]
Javadoc parameter range for mu
Alex Herbert [Fri, 24 Dec 2021 14:29:14 +0000 (14:29 +0000)]
Sample Nakagami using a related gamma distribution
Alex Herbert [Fri, 24 Dec 2021 14:23:53 +0000 (14:23 +0000)]
Update to use GammaRatio to compute the moments
Added additional test cases where the simple computation of
gamma(mu + 0.5) / gamma(mu) will overflow the gamma function.
Alex Herbert [Tue, 21 Dec 2021 12:26:25 +0000 (12:26 +0000)]
Cache mean and variance for the inverse probability implementation
Alex Herbert [Mon, 20 Dec 2021 22:22:30 +0000 (22:22 +0000)]
Javadoc mean and variance using abstract methods
Alex Herbert [Mon, 20 Dec 2021 22:15:43 +0000 (22:15 +0000)]
STATISTICS-36: Update accuracy of inverse probability function
Changed the accuracy tolerances used by the BrentSolver. Inversion
should be accurate to close to 1 ULP.
Update test tolerances for the following distributions using the default
inverse probability method:
Beta
ChiSquared
F
Gamma
Nakagami
T
Fix the Beta distribution sampling test. Extreme parameterisations
cannot be sampled using uniform expected bins due to low precision of
values close to 1.
More accurate test tolerances required:
- SF values to be added for beta, chi-squared, gamma and t
distributions.
- Log density for F distribution test.
- Extended precision of the test data for Nakagami(1.5, 2) including SF
values.
- Updated T distribution test for infinite degrees of freedom. The
normal distribution from matlab more closely agrees with the current
implementation than scipy or R.
Alex Herbert [Mon, 20 Dec 2021 09:55:28 +0000 (09:55 +0000)]
Add JMH benchmarking module
Add benchmark for distribution inverse probability functions.
Alex Herbert [Sun, 19 Dec 2021 13:01:24 +0000 (13:01 +0000)]
Remove redundant licenses from LICENSE
The RNG components do not apply in statistics.
Alex Herbert [Fri, 17 Dec 2021 21:41:08 +0000 (21:41 +0000)]
Consistent javadoc for the lower/upper bounds/mean/variance
If the bounds are defined for all parameters then put this value in the
return statement.
Alex Herbert [Fri, 17 Dec 2021 20:31:25 +0000 (20:31 +0000)]
Update javadoc for mean and variance
Remove the statement 'or {@code Double.NaN} if it is not defined'.
This only applies to a few distributions. These distributions have been
updated to state they return NaN if the mean/variance is undefined.
This change prevents all distributions inheriting the javadoc and having
to override the return tag to state the mean/variance is finite.
Alex Herbert [Fri, 17 Dec 2021 19:26:14 +0000 (19:26 +0000)]
Create consistent web links in distribution class javadoc.
Update web links to https where applicable.
Alex Herbert [Fri, 17 Dec 2021 00:50:14 +0000 (00:50 +0000)]
Update exponential distribution PDF
Directly implement the PDF using exp and do not use exp(logDensity(x)).
This is more accurate on extended precision test data.
Remove check that x >= +infinity in the log density. The result is the
same (-infinity) without this edge case check.
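A minimal sketch of the two computations, assuming the distribution is
parameterised by its mean. The class and method names are hypothetical;
this is not the library implementation.

```java
/** Hypothetical helper used for illustration only. */
final class ExponentialDensitySketch {
    /** PDF computed directly, not as exp(logDensity(x)). */
    static double density(double x, double mean) {
        if (x < 0) {
            return 0;
        }
        return Math.exp(-x / mean) / mean;
    }

    /** Log PDF with no special case for x = +infinity. */
    static double logDensity(double x, double mean) {
        if (x < 0) {
            return Double.NEGATIVE_INFINITY;
        }
        // For x = +infinity this evaluates to -infinity - log(mean),
        // which is -infinity for any finite positive mean.
        return -x / mean - Math.log(mean);
    }

    public static void main(String[] args) {
        System.out.println(density(1.5, 2));
        System.out.println(logDensity(Double.POSITIVE_INFINITY, 2));
    }
}
```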
Alex Herbert [Wed, 15 Dec 2021 22:36:15 +0000 (22:36 +0000)]
Add PDF and PMF to distribution class javadoc
aherbert [Thu, 16 Dec 2021 12:18:46 +0000 (12:18 +0000)]
Correct javadoc for isSupportConnected
aherbert [Thu, 16 Dec 2021 12:16:30 +0000 (12:16 +0000)]
Reduce isSupportConnected to package-private
It is a package level implementation detail.
aherbert [Thu, 16 Dec 2021 11:44:49 +0000 (11:44 +0000)]
Reduce getMedian to package-private
If protected this is exposed in javadoc. It is a package level
implementation detail.
aherbert [Thu, 16 Dec 2021 11:18:25 +0000 (11:18 +0000)]
Avoid overflow in the continuous uniform distribution
The range (upper - lower) must be finite for the current implementation
to function. If infinite (or NaN) then throw an exception.
Ensure the mean is robust to overflow if (lower + upper) is infinite.
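A minimal sketch of both checks, assuming the mean is formed from the
halved bounds so that the sum cannot overflow. The class and method
names are hypothetical; this is not the library implementation.

```java
/** Hypothetical helper used for illustration only. */
final class UniformBoundsSketch {
    /** Returns the range, rejecting infinite or NaN values. */
    static double checkedRange(double lower, double upper) {
        final double range = upper - lower;
        if (!Double.isFinite(range)) {
            throw new IllegalArgumentException(
                "Range is not finite: " + lower + " to " + upper);
        }
        return range;
    }

    /** Mean that does not overflow when (lower + upper) is infinite. */
    static double mean(double lower, double upper) {
        checkedRange(lower, upper);
        // Halve each bound before summing
        return 0.5 * lower + 0.5 * upper;
    }

    public static void main(String[] args) {
        final double lower = Double.MAX_VALUE / 2;
        final double upper = Double.MAX_VALUE;
        System.out.println(0.5 * (lower + upper)); // overflows
        System.out.println(mean(lower, upper));    // finite
    }
}
```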
aherbert [Thu, 16 Dec 2021 12:12:54 +0000 (12:12 +0000)]
PMD fix: Refactor adjustment of infinite bounds to methods
This reduces complexity in the inverseProbability method
aherbert [Thu, 16 Dec 2021 10:53:22 +0000 (10:53 +0000)]
Ensure default inverse probability is robust to distribution truncation
This was discovered as the F-distribution with numerator DF=1 and
denominator DF=2 computed a survival probability:
sf(infinity) = 0.0
sf(max_value) = 5.56e-309
If the inverse SF was called with a p-value lower than 5.56e-309 (e.g.
Double.MIN_VALUE) the required value of x is outside the finite range of
a double.
aherbert [Wed, 15 Dec 2021 17:44:03 +0000 (17:44 +0000)]
Fix typo
aherbert [Wed, 15 Dec 2021 17:17:46 +0000 (17:17 +0000)]
Change survival to sf in the distribution examples application
aherbert [Wed, 15 Dec 2021 17:03:42 +0000 (17:03 +0000)]
Removed redundant tolerance from beta distribution test files
The tests pass using the global tolerance for the beta distribution.
Accuracy of the RegularizedBeta function used for the CDF and SF may
have been improved by changes to the gamma functions in [NUMBERS-174].
Alex Herbert [Mon, 13 Dec 2021 23:14:43 +0000 (23:14 +0000)]
STATISTICS-39: Update chisq distribution tests
The gamma distribution density function was fixed for small shape
values. This fixes the known failing tests for the chi-squared
distribution.
This requires a change to the BaseContinuousDistributionTest to exclude
extremely steep density integrals.
Alex Herbert [Mon, 13 Dec 2021 20:34:44 +0000 (20:34 +0000)]
STATISTICS-39: Update Gamma distribution to use Number's gamma
The PDF is computed using RegularizedGamma.P.derivative.
Additional test data added for shape 0.25, 0.5, 0.75.
Existing PDF test tolerances lowered where improvements have been made.
Reinstate disabled test of a very small shape parameter.
Alex Herbert [Sat, 11 Dec 2021 08:43:29 +0000 (08:43 +0000)]
Update test tolerances.
The updated gamma function introduced by NUMBERS-174 improves the
accuracy of the Poisson distribution. Values for very large mean for the
CDF and SF agree with the R implementation to 1e-14 relative error. This
improves from the previous gamma function which required 1e-5 relative
error.
Alex Herbert [Sat, 11 Dec 2021 08:36:15 +0000 (08:36 +0000)]
STATISTICS-38: Remove configurable epsilon and iterations
These are not required since the gamma function has been updated to
avoid the series and continued fraction computations when convergence is
slow. See NUMBERS-174.
aherbert [Fri, 26 Nov 2021 18:10:06 +0000 (18:10 +0000)]
STATISTICS-46: Update the truncated normal computation of moments
The moments cannot be computed when the truncation is very close to zero
as the PDF is flat. Use a uniform distribution in this case.
Extract the logic for the computation of the mean and variance to
methods called on demand.
aherbert [Thu, 9 Dec 2021 17:56:23 +0000 (17:56 +0000)]
Remove non-applicable RNG components from the LICENSE
aherbert [Thu, 25 Nov 2021 16:16:28 +0000 (16:16 +0000)]
Update normal distribution pdf to compute directly.
Use of exp(logDensity) can suffer errors as x increases in magnitude for
the standard normal distribution (mu=0, sd=1).
Approximate errors:
 x   ulp
 3   3
 6   10
12   >40
Added extended precision test data for the standard normal distribution
computed using maxima. ULP errors are below 2 for x <= 12 when computing
the PDF directly.
Added extended precision computation of x * sqrt(2 * pi) for use in the
PDF computation for normal distributions with a standard deviation
other than 1. This computes more accurately than
x * Math.sqrt(2 * Math.PI), or x * SQRT2PI if using a pre-computed
constant.
aherbert [Mon, 22 Nov 2021 13:38:43 +0000 (13:38 +0000)]
STATISTICS-48: Clarify test usage of isSupportConnected method
aherbert [Mon, 22 Nov 2021 13:38:13 +0000 (13:38 +0000)]
STATISTICS-48: Remove connected property from example property file
aherbert [Mon, 22 Nov 2021 13:09:51 +0000 (13:09 +0000)]
[STATISTICS-48] Remove isSupportConnected from distribution interface
This method is an implementation detail only used by the
AbstractContinuousDistribution to invert the cumulative or survival
probability. It has been demoted to a protected member of that class and
removed as a public interface member.
All distributions currently only return true for this method. Removal
from the public interface has no impact on the library.
aherbert [Mon, 22 Nov 2021 12:30:38 +0000 (12:30 +0000)]
Add probability range implementation for continuous uniform distribution
aherbert [Mon, 22 Nov 2021 12:04:32 +0000 (12:04 +0000)]
Add probability range implementation for discrete uniform distribution.
Alex Herbert [Fri, 19 Nov 2021 22:12:20 +0000 (22:12 +0000)]
STATISTICS-47: Add isf command to distribution examples application
Update check command to check the isf.
Change the check command to use Precision to test relative equality.
If the inverse probability does not map to the original input then do a
forward mapping to test the value maps back to the same probability.
This accommodates function pairs that are not bijections, i.e. are
many-to-one.
Alex Herbert [Fri, 19 Nov 2021 08:15:14 +0000 (08:15 +0000)]
Use high-precision sqrt(2 * sd * sd)
Alex Herbert [Thu, 18 Nov 2021 09:29:29 +0000 (09:29 +0000)]
STATISTICS-47: Add implementations for inverse survival probability
Add tests for inverse survival probability.
Update test tolerances for cases where the test tolerance is now limited
by the inverse SF.
Add test data for properties files that have icdf.points to test the
complement of the icdf.
Alex Herbert [Thu, 18 Nov 2021 00:43:22 +0000 (00:43 +0000)]
STATISTICS-47: Add inverse survival probability
Add default methods to the distribution interface.
Add default implementations to the abstract distribution classes.
Add tests for the default implementations.
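A minimal sketch of what a default method for the inverse survival
probability could look like, assuming it delegates to the inverse CDF
with the complementary probability. The interface name is hypothetical
and the library's argument validation may differ.

```java
/** Hypothetical interface used for illustration only. */
interface InverseProbabilitySketch {
    /** Inverse CDF: returns x such that P(X <= x) = p. */
    double inverseCumulativeProbability(double p);

    /** Inverse SF: returns x such that P(X > x) = p. */
    default double inverseSurvivalProbability(double p) {
        if (!(p >= 0 && p <= 1)) {
            throw new IllegalArgumentException("Invalid p: " + p);
        }
        // Default: invert the CDF at the complement.
        // Implementations can override this when 1 - p loses
        // precision for very small p.
        return inverseCumulativeProbability(1 - p);
    }
}
```

A unit-mean exponential distribution can be plugged in as a lambda,
p -> -Math.log1p(-p), and the default then returns the inverse SF.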
aherbert [Fri, 12 Nov 2021 14:03:45 +0000 (14:03 +0000)]
Special case of CDF/SF for the Poisson distribution with x=0
The value can be computed exactly without having to call the
RegularizedGamma function.
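A minimal sketch of the x=0 case, assuming P(X <= 0) = exp(-mean) for a
Poisson distribution. The use of Math.expm1 for the survival function is
an assumption of this sketch, and the class and method names are
hypothetical.

```java
/** Hypothetical helper used for illustration only. */
final class PoissonAtZeroSketch {
    /** CDF at x = 0: P(X <= 0) = P(X = 0) = exp(-mean). */
    static double cdfAtZero(double mean) {
        return Math.exp(-mean);
    }

    /** SF at x = 0: P(X > 0) = 1 - exp(-mean). */
    static double sfAtZero(double mean) {
        // -expm1(-mean) avoids cancellation when the mean is tiny
        return -Math.expm1(-mean);
    }

    public static void main(String[] args) {
        System.out.println(cdfAtZero(2.5));
        System.out.println(sfAtZero(1e-20));
    }
}
```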
Alex Herbert [Wed, 27 Oct 2021 11:44:51 +0000 (12:44 +0100)]
Sonar fix: Comment disabled test
Alex Herbert [Wed, 27 Oct 2021 11:43:05 +0000 (12:43 +0100)]
Sonar fix: rename var for temporary variance variables
var is a restricted identifier.
Alex Herbert [Wed, 27 Oct 2021 11:32:20 +0000 (12:32 +0100)]
Remove spurious @Test annotation
Alex Herbert [Wed, 27 Oct 2021 11:31:30 +0000 (12:31 +0100)]
Sonar fix: Create double from sum of integers for power function
Alex Herbert [Wed, 27 Oct 2021 11:04:54 +0000 (12:04 +0100)]
Remove obsolete test tolerance property
tolerance.hp has been replaced with
tolerance.hp.relative
tolerance.hp.absolute
aherbert [Mon, 25 Oct 2021 13:42:47 +0000 (14:42 +0100)]
Add more reference data for Poisson distribution test
aherbert [Mon, 25 Oct 2021 11:05:59 +0000 (12:05 +0100)]
Use factory constructors
Alex Herbert [Sun, 24 Oct 2021 18:36:41 +0000 (19:36 +0100)]
STATISTICS-35: Poisson dist to use a Gaussian sampler for large mean
Alex Herbert [Sun, 24 Oct 2021 18:30:39 +0000 (19:30 +0100)]
Use survival probability to compute upper quartile
Alex Herbert [Sun, 24 Oct 2021 18:30:01 +0000 (19:30 +0100)]
Add sampling test for quartiles to discrete distributions
The test is only run if the distribution is not concentrated at a
single point.
Alex Herbert [Sat, 23 Oct 2021 11:47:09 +0000 (12:47 +0100)]
Consistent constant names and values
Alex Herbert [Fri, 22 Oct 2021 19:02:53 +0000 (20:02 +0100)]
STATISTICS-25: Specialise t-distribution for infinite degrees of freedom
If the variance of the distribution matches the standard normal
distribution then delegate to a standard normal distribution.
Alex Herbert [Fri, 22 Oct 2021 13:16:30 +0000 (14:16 +0100)]
STATISTICS-37: Update Levy distribution test
Commons Numbers 1.1 improved the implementation of erfc and its inverse.
Update the levy distribution unit tests to verify the increased
precision.
Alex Herbert [Fri, 22 Oct 2021 13:16:16 +0000 (14:16 +0100)]
STATISTICS-37: Update Normal distribution with high precision erfc
Commons Numbers 1.1 improved the implementation of erfc and its inverse.
Update the truncated normal distribution and unit tests to verify the
increased precision.
Added better use of the probability in a range for the truncated normal.
Add notes about possible cancellation in computation of the moments for
the truncated normal.
Alex Herbert [Fri, 22 Oct 2021 13:14:40 +0000 (14:14 +0100)]
STATISTICS-37: Update Normal distribution with high precision erfc
Commons Numbers 1.1 improved the implementation of erfc and its inverse.
Update the normal distribution and unit tests to verify the increased
precision.
The increase in precision of the error function is such that the
computation of sd * sqrt(2) is a source of error against reference data.
This has been updated to compute exactly and verified against high
precision data. It is a small additional computation cost during class
initialisation.
Change extended precision CDF and SF data to use matlab variable
precision arithmetic.
Alex Herbert [Thu, 21 Oct 2021 16:43:17 +0000 (17:43 +0100)]
Report relative and absolute error in assertion messages