pyinterp.Histogram2D
class pyinterp.Histogram2D(x: pyinterp.core.Axis, y: pyinterp.core.Axis, bin_counts: Optional[int] = None, dtype: Optional[numpy.dtype] = dtype('float64'))
Bases: object
Group a number of more or less continuous values into a smaller number of “bins” located on a grid.
This class builds a histogram for each pixel of the defined grid and uses that histogram to compute the statistics.
The histogram uses the algorithm described in the paper A Streaming Parallel Decision Tree Algorithm. Therefore, if the number of observations falling in a pixel exceeds the maximum number of bins, the calculated statistics are approximations of the exact values. This algorithm is useful if you want to know the statistical distribution per pixel or the value of a quantile, such as the median. Otherwise, use the pyinterp.Binning2D class.
Parameters
x – Definition of the bin centers for the X axis of the grid.
y – Definition of the bin centers for the Y axis of the grid.
bin_counts – The maximum number of bins in each pixel's histogram. Defaults to 100 if not set.
dtype – Data type of the instance to create.
Note
The axes define the centers of the different cells where the statistics will be calculated, as shown in the figure below.
In this example, to calculate the statistics in the different cells defined, the coordinates of the axes must be shifted by half a grid step, 0.5 in this example.
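The half-step shift described in this note can be sketched in plain Python. The helper name `centers_from_edges` and the sample edges are illustrative assumptions, not part of pyinterp; the sketch assumes a regular grid whose cells are bounded by the given edges.

```python
def centers_from_edges(edges):
    """Return the cell centers of a regular grid given its cell edges.

    Hypothetical helper for illustration only: each center is the
    lower edge of its cell shifted by half a grid step.
    """
    step = edges[1] - edges[0]
    return [edge + step / 2 for edge in edges[:-1]]


# Cells covering [0, 1), [1, 2) and [2, 3): the axis passed to the
# class would hold the centers, not the edges.
edges = [0.0, 1.0, 2.0, 3.0]
centers = centers_from_edges(edges)
print(centers)  # [0.5, 1.5, 2.5]
```

The resulting values are the ones that would be used to define the X or Y axis, so that each cell is centered on an axis coordinate.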
Note
Yael Ben-Haim and Elad Tom-Tov, A Streaming Parallel Decision Tree Algorithm, Journal of Machine Learning Research, 11(28), 849-872, 2010. http://jmlr.org/papers/v11/ben-haim10a.html
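To illustrate why statistics become approximate once the number of observations exceeds the number of bins, here is a minimal pure-Python sketch of the bin-merging update step from the cited paper. The class `StreamingHistogram` and its methods are hypothetical names written for this illustration; this is not the pyinterp implementation, and it covers only the update step, not the paper's quantile estimation.

```python
import bisect


class StreamingHistogram:
    """Toy streaming histogram holding at most `max_bins` bins.

    Each bin is a [centroid, count] pair, kept sorted by centroid.
    """

    def __init__(self, max_bins=100):
        self.max_bins = max_bins
        self.bins = []

    def push(self, value):
        # Insert the sample as its own bin, or increment a bin that
        # already sits exactly at this value.
        centroids = [b[0] for b in self.bins]
        i = bisect.bisect_left(centroids, value)
        if i < len(self.bins) and self.bins[i][0] == value:
            self.bins[i][1] += 1
        else:
            self.bins.insert(i, [value, 1])
        # Too many bins: merge the two closest neighbours into a
        # single bin located at their count-weighted centroid.
        if len(self.bins) > self.max_bins:
            gaps = [
                self.bins[j + 1][0] - self.bins[j][0]
                for j in range(len(self.bins) - 1)
            ]
            j = gaps.index(min(gaps))
            (q1, k1), (q2, k2) = self.bins[j], self.bins[j + 1]
            merged = [(q1 * k1 + q2 * k2) / (k1 + k2), k1 + k2]
            self.bins[j:j + 2] = [merged]

    def count(self):
        return sum(k for _, k in self.bins)

    def mean(self):
        return sum(q * k for q, k in self.bins) / self.count()


hist = StreamingHistogram(max_bins=4)
for sample in [1.0, 2.0, 3.0, 4.0, 10.0]:
    hist.push(sample)

# The fifth sample forces a merge of the two closest bins (1.0 and 2.0).
print(hist.bins)  # [[1.5, 2], [3.0, 1], [4.0, 1], [10.0, 1]]
```

After the merge, the individual samples 1.0 and 2.0 are no longer recoverable, so any statistic that depends on the shape of the distribution, such as a quantile, can only be estimated from the remaining centroids and counts.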
Attributes
x – Gets the bin centers for the X axis of the grid.
y – Gets the bin centers for the Y axis of the grid.
Public Methods
clear() – Clears the data inside each bin.
push(x, y, z) – Pushes new samples into the defined bins.
push_delayed(x, y, z) – Pushes new samples into the defined bins from a dask array.
variable([statistics]) – Gets the regular grid containing the calculated statistics.
Special Methods