Releases: histogrammar/histogrammar-scala
v1.0.30
v1.0.20
1.0.4
Added features to sparksql so that it can be called like this:
val result = myDataFrame.Bin(100, -5, 5, myColumn)
and a Py4J-friendly interface so that it can be called from PySpark as well. Running SparkSQL Histogrammar in PySpark will actually call Histogrammar-Scala for better performance.
1.0.3
1.0.2
Fixes a bug triggered by NaN/Inf values in ASCII plots.
Adds scalar multiplication to reweight filled aggregators. (Multiplying an aggregation tree by a scalar factor has the same effect as filling with weight * factor would have. In particular, multiplying by a non-positive number is equivalent to zero.)
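The reweighting rule above can be illustrated with a small stand-alone sketch. It does not use Histogrammar's own classes: ToyBins, its fill and zero methods, and the sample values are all hypothetical names made up for this example, and the choice to skip NaN and out-of-range values is a simplification of the toy, not a statement about the library.

```scala
// Toy stand-in for a filled binned aggregator. This is NOT Histogrammar's API;
// every name here is hypothetical and exists only to illustrate the rule.
final case class ToyBins(low: Double, high: Double, counts: Vector[Double]) {
  private val width = (high - low) / counts.length

  // Add `weight` to the bin containing x. Non-positive weights contribute nothing.
  // NaN and out-of-range values are simply skipped in this toy.
  def fill(x: Double, weight: Double = 1.0): ToyBins =
    if (!(weight > 0.0) || x.isNaN || x < low || x >= high) this
    else {
      val i = math.min(((x - low) / width).toInt, counts.length - 1)
      copy(counts = counts.updated(i, counts(i) + weight))
    }

  // An empty aggregator with the same binning.
  def zero: ToyBins = copy(counts = Vector.fill(counts.length)(0.0))

  // Scalar multiplication: scale every bin; a non-positive factor gives zero.
  def *(factor: Double): ToyBins =
    if (!(factor > 0.0)) zero else copy(counts = counts.map(_ * factor))
}

val empty = ToyBins(-5.0, 5.0, Vector.fill(10)(0.0))
val data = Seq(-1.5, 0.2, 0.7, 3.3)

// Fill with unit weight, then rescale the whole aggregator.
val filledThenScaled = data.foldLeft(empty)((h, x) => h.fill(x)) * 2.5

// Fill with weight * factor directly.
val filledWithScaledWeights = data.foldLeft(empty)((h, x) => h.fill(x, 1.0 * 2.5))
```

In this toy, filledThenScaled and filledWithScaledWeights hold the same bin contents, and multiplying by -1.0 gives back an empty aggregator.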
1.0.1
This version changes only the POM files so that the version on Maven Central exactly matches the version on GitHub. Until bokeh-scala issue #28 is fixed, we do not support Bokeh plotting in Scala 2.11.
1.0.0
Histogrammar is a suite of data aggregation primitives designed for use in parallel processing. In the simplest case, you can use this to compute histograms in distributed processors like Apache Spark, but the generality of the primitives allows much more.
This Scala implementation of Histogrammar adheres to version 1.0 of the specification and has been tested to guarantee compatibility with the Python implementation. The test suite includes empty datasets, NaN/infinity handling, associativity tests, and numerical agreement at the level of one part in a trillion (double precision). Several common histogram types can be plotted in Bokeh with a single method call.
It is the first version to be distributed in Maven Central for easy inclusion in Maven, sbt, and Spark jobs. See http://histogrammar.org/ for more.
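As a rough picture of what an associativity check with a one-part-in-a-trillion tolerance might look like, here is a minimal stand-alone sketch. It is not taken from Histogrammar's test suite: ToySum, aggregate, closeEnough, and the sample data are hypothetical names and values chosen for illustration only.

```scala
// Toy aggregator that tracks total weight and weighted sum.
// This is NOT one of Histogrammar's classes; all names are hypothetical.
final case class ToySum(entries: Double, sum: Double) {
  // Non-positive weights contribute nothing.
  def fill(x: Double, weight: Double = 1.0): ToySum =
    if (!(weight > 0.0)) this else ToySum(entries + weight, sum + x * weight)

  // Combine two partial aggregations, e.g. from two partitions.
  def +(that: ToySum): ToySum = ToySum(entries + that.entries, sum + that.sum)
}

// Aggregate one partition of data starting from an empty aggregator.
def aggregate(xs: Seq[Double]): ToySum =
  xs.foldLeft(ToySum(0.0, 0.0))((acc, x) => acc.fill(x))

// Relative comparison at one part in a trillion.
def closeEnough(a: Double, b: Double, relTol: Double = 1e-12): Boolean =
  a == b || math.abs(a - b) <= relTol * math.max(math.abs(a), math.abs(b))

val a = aggregate(Seq(0.1, 0.2, 0.3))
val b = aggregate(Seq(1.5, 2.5))
val c = aggregate(Seq(-0.7, 4.0, 9.9))

// The grouping of the merges should not matter beyond floating-point rounding.
val leftFirst = (a + b) + c
val rightFirst = a + (b + c)

// Merging with the aggregation of an empty dataset should change nothing.
val withEmpty = aggregate(Seq.empty[Double]) + a
```

Comparing leftFirst.sum with rightFirst.sum through closeEnough, rather than with ==, allows for rounding differences between merge orders while still catching real disagreements.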