Hyperparameters estimation for LDA #193

Open: wants to merge 36 commits into base: develop

alex2304 commented Dec 4, 2017

Hello, we would like to contribute several new features to MeTa: implementations of three methods for estimating the constant mu used by the current implementation of Dirichlet prior smoothing.

The ranker based on Dirichlet prior smoothing implemented in MeTa uses the parameter mu for smoothing. Currently, the only options are to pass a custom value for the parameter or to use the default mu = 2000. However, it is possible to find an optimal value of the parameter for a particular set of documents (see H. Wallach, 2008, p. 18), which provides the most effective smoothing. In our contribution, we implemented three methods for estimating this optimal value of mu from the given parameters of the document set.
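For readers unfamiliar with the role of mu, here is a minimal Python sketch of the standard Dirichlet prior smoothing formula, p(w | d) = (c(w, d) + mu * p(w | C)) / (|d| + mu). It is an illustration only, not MeTa's C++ code; the function and parameter names are hypothetical.

```python
def dirichlet_smoothed_prob(term_count, doc_length, collection_prob, mu=2000.0):
    """Smoothed p(w | d) = (c(w, d) + mu * p(w | C)) / (|d| + mu).

    term_count      -- c(w, d), occurrences of the term in the document
    doc_length      -- |d|, total number of terms in the document
    collection_prob -- p(w | C), probability of the term in the whole collection
    mu              -- smoothing constant (hypothetical default mirroring mu = 2000)
    """
    return (term_count + mu * collection_prob) / (doc_length + mu)


# A term seen 3 times in a 100-term document, with collection probability 0.001.
p_seen = dirichlet_smoothed_prob(3, 100, 0.001)
# An unseen term still receives a non-zero probability thanks to the prior.
p_unseen = dirichlet_smoothed_prob(0, 100, 0.001)
```

A larger mu pulls every document model closer to the collection model, which is why a value tuned to the document set can matter.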

The implemented methods were originally introduced in (H. Wallach, 2008, pages 26-30). They are based on several modifications of the fixed-point iteration method that improve its performance.

Given the project architecture, we implemented each new method as a separate ranker (see the picture of the class hierarchy). We also added the ability to select the new rankers by specifying the following in the .toml config file:

[ranker]
method = "dirichlet-digamma-rec"

Full list of available methods:

  • dirichlet-digamma-rec - Fixed-Point Iteration by (Minka, 2003) using the digamma recurrence relation
  • digamma-log-approx - Fixed-Point Iteration by (Minka, 2003) using a logarithmic approximation of digamma differences
  • digamma-mackay-peto - Fixed-Point Iteration by (MacKay and Peto, 1995) with efficient computation of an inner parameter
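To make the first variant more concrete, below is a rough Python sketch of a fixed-point iteration for a Dirichlet-multinomial prior, in which each digamma difference is evaluated with the recurrence digamma(n + a) - digamma(a) = 1/a + 1/(a + 1) + ... + 1/(a + n - 1). This is an assumption about the general shape of the method, not a copy of the C++ code in this pull request. The names `digamma_diff` and `estimate_alpha` are hypothetical, and mu is taken to be the sum of the estimated pseudo-counts.

```python
def digamma_diff(n, a):
    """Return digamma(n + a) - digamma(a) for an integer n >= 0 via the recurrence."""
    return sum(1.0 / (f + a) for f in range(n))


def estimate_alpha(docs, alpha_init=1.0, max_iter=500, tol=1e-8):
    """Fixed-point estimate of Dirichlet pseudo-counts from per-document term counts.

    docs is a list of count vectors, one per document, all of the same length.
    Returns the list of pseudo-counts; their sum plays the role of mu here.
    """
    vocab_size = len(docs[0])
    alpha = [alpha_init] * vocab_size
    doc_lengths = [sum(doc) for doc in docs]
    for _ in range(max_iter):
        total = sum(alpha)
        # The denominator is shared by every term in this iteration.
        denom = sum(digamma_diff(n, total) for n in doc_lengths)
        new_alpha = [
            alpha[k] * sum(digamma_diff(doc[k], alpha[k]) for doc in docs) / denom
            for k in range(vocab_size)
        ]
        converged = max(abs(x - y) for x, y in zip(new_alpha, alpha)) < tol
        alpha = new_alpha
        if converged:
            break
    return alpha


# Toy corpus: four documents over a three-term vocabulary.
toy_docs = [[5, 1, 0], [4, 2, 0], [0, 1, 6], [1, 0, 5]]
toy_alpha = estimate_alpha(toy_docs)
toy_mu = sum(toy_alpha)
```

The recurrence avoids calling a digamma routine at all, since term counts and document lengths are integers; the cost is a loop whose length grows with the counts.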

We also verified that the methods work as expected, i.e. that the estimated parameter mu is indeed optimal. To do this, we generated synthetic data from a Dirichlet distribution with predefined parameters and then compared the estimated values with the predefined ones, as was done in H. Wallach, 2008. Following Wallach, we used three metrics to evaluate the methods' performance:

  • Execution time
  • Kullback-Leibler Divergence between "true" and computed distributions
  • Relative error of mu
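As a rough illustration of the last two metrics, the following Python sketch computes the Kullback-Leibler divergence between two discrete distributions and the relative error of an estimate. Exactly which distributions were compared in our experiments is not spelled out here, so treat the inputs as placeholders; the function names are hypothetical.

```python
import math


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions given as equal-length lists.

    Terms with p_i == 0 contribute nothing; q_i must be positive wherever p_i > 0.
    """
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def relative_error(estimated, true_value):
    """Relative error |estimated - true| / |true|."""
    return abs(estimated - true_value) / abs(true_value)


# Placeholder values chosen only to show the calls.
kl_example = kl_divergence([0.5, 0.5], [0.25, 0.75])
err_example = relative_error(2200.0, 2000.0)
```

Execution time can be measured around the estimation call with `time.perf_counter()` from the Python standard library.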

The parameters of the synthetic data we used and the results of the method comparison are presented here.
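For anyone who wants to reproduce a similar check, here is a minimal Python sketch of generating synthetic documents from a Dirichlet distribution with known parameters. It is not the script we used: the function names, corpus sizes, and parameter values are all hypothetical, and only the standard library is assumed.

```python
import random


def sample_dirichlet(alpha, rng):
    """Draw one probability vector from Dirichlet(alpha) via normalized gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [x / total for x in draws]


def generate_documents(alpha, num_docs, doc_length, seed=0):
    """Generate term-count vectors: one Dirichlet draw per document, then term samples."""
    rng = random.Random(seed)
    vocab = range(len(alpha))
    docs = []
    for _ in range(num_docs):
        theta = sample_dirichlet(alpha, rng)
        counts = [0] * len(alpha)
        for term in rng.choices(vocab, weights=theta, k=doc_length):
            counts[term] += 1
        docs.append(counts)
    return docs


# Hypothetical "true" parameters; their sum (3.5) is the value an estimator should recover.
true_alpha = [0.5, 1.0, 2.0]
synthetic_docs = generate_documents(true_alpha, num_docs=10, doc_length=50)
```

Because the true parameters are known, the estimate returned by any of the methods can be compared against them directly.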
