Jason Baldridge edited this page Mar 26, 2014 · 9 revisions


  • added binomial logistic regression
  • updated to Breeze 0.6


  • breeze-learn pulled into Nak
  • Merged the K-means implementations from breeze-learn and Nak.
  • Added locality sensitive hashing


  • Enabled the hashing trick to be used for linear models. See and
  • PCA is now supported directly in Nak for dimensionality reduction with k-means.
  • All OpenNLP Maxent code purged from Nak.
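For readers unfamiliar with the hashing trick mentioned above, here is a minimal sketch of the general idea in plain Scala. It is not Nak's implementation: the object name `HashingTrickSketch`, the methods `bucket` and `hashFeatures`, and the choice of MurmurHash3 are all assumptions made for illustration. The point is that feature names are mapped straight to a fixed number of buckets, so no feature-to-index dictionary has to be built or stored.

```scala
import scala.util.hashing.MurmurHash3

// Illustrative only: these names are hypothetical and not part of Nak's API.
object HashingTrickSketch {

  // Map a feature name to a bucket in [0, numBuckets).
  def bucket(feature: String, numBuckets: Int): Int = {
    val h = MurmurHash3.stringHash(feature)
    // floorMod keeps the result non-negative even when the hash is negative.
    java.lang.Math.floorMod(h, numBuckets)
  }

  // Turn (feature name, value) pairs into a sparse vector keyed by bucket.
  // Features that collide in the same bucket have their values summed.
  def hashFeatures(features: Seq[(String, Double)], numBuckets: Int): Map[Int, Double] =
    features
      .groupBy { case (name, _) => bucket(name, numBuckets) }
      .map { case (idx, fs) => idx -> fs.map(_._2).sum }
}
```

A call such as `HashingTrickSketch.hashFeatures(Seq("word=cat" -> 1.0, "word=dog" -> 2.0), 1024)` yields a sparse map with at most two entries. Collisions trade a little accuracy for a fixed, predictable model size.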


  • Incorporated classes into and updated liblinear training and model use to work with these rather than the legacy opennlp classes.
  • Added and nak.core.FeaturizedClassifier, which handle turning raw data into collections of features. When combined with IndexedClassifier, feature indexing is handled automatically, so users of the API never need to deal with the low-level details and can focus on the data and feature extraction.
  • Added NakContext object (inspired by SparkContext), which provides a number of utility methods for getting classifiers up and running.
  • Added nak.example package, with example implementations for prepositional phrase attachment (PpaExample) and text classification (TwentyNewsGroupsExample).
  • Added nak.util.ConfusionMatrix class, which provides detailed error output.
  • Added ScalaTest and started writing BDD tests.
  • Refactored a lot of code to get rid of duplication.
  • Added code documentation to many classes and functions.
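The featurize-then-index flow described in the list above can be sketched in plain Scala as follows. This is only an illustration of the idea under stated assumptions, not Nak's code: `FeatureIndexingSketch`, `featurize`, `buildIndex`, and `toIndexed` are hypothetical names, and the whitespace tokenizer stands in for whatever feature extraction a real application would use.

```scala
// Illustrative only: these names are hypothetical and not part of Nak's API.
object FeatureIndexingSketch {

  // Stand-in featurizer: raw text -> one "word=..." feature per token.
  def featurize(text: String): Seq[String] =
    text.toLowerCase.split("\\s+").filter(_.nonEmpty).map(w => "word=" + w).toSeq

  // Assign each distinct feature a dense integer index,
  // in order of first appearance across the training documents.
  def buildIndex(docs: Seq[String]): Map[String, Int] =
    docs.flatMap(featurize).distinct.zipWithIndex.toMap

  // Convert raw text to a sparse count vector keyed by feature index.
  // Features that were not seen when the index was built are dropped.
  def toIndexed(text: String, index: Map[String, Int]): Map[Int, Double] =
    featurize(text)
      .flatMap(index.get)
      .groupBy(identity)
      .map { case (i, occurrences) => i -> occurrences.size.toDouble }
}
```

The caller only writes `featurize`; the mapping from feature strings to integer positions stays hidden behind `buildIndex` and `toIndexed`, which is the separation of concerns the changelog entry describes.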


  • Massive reorganization of the sub-packages.
  • Added nak.liblinear (using the Java liblinear package) and added that uses liblinear logistic regression solvers as well as GIS.


  • The classification code from the OpenNLP Maxent package, slightly reorganized.
  • The k-means clustering code from Scalabha.
  • Changed project id from com.jasonbaldridge to org.scalanlp.


  • The original OpenNLP Maxent code, pulled out of Chalk and then renamed to nak.*