variable names in importance plots #6

Open
rdiaz02 opened this issue Nov 24, 2014 · 0 comments

rdiaz02 commented Nov 24, 2014

Asked for by Xiaowei Guan: "there are not variable names only the dots".
My answer back then:
Right now, no, there is no direct way to get those names in the plot.
Let me elaborate:

a) In the second plot (OOB error vs. number of variables) that would
not make sense, since it is not individual variables that are plotted.

b) In the first plot, we are plotting just the importances of the very
first forest. That is something you could get from the usual random
forest, as you show.

c) b) is really not a very satisfactory answer. It should be easy to
modify the code of plot.varSelRF, because the first plot is just a
simple call to dotchart:
> varSelRF:::plot.varSelRF
function (x, nvar = NULL, which = c(1, 2), ...)
{
   if (length(which) == 2 && dev.interactive()) {
       op <- par(ask = TRUE, las = 1)
   }
   else {
       op <- par(las = 1)
   }
   on.exit(par(op))
   if (is.null(nvar))
       nvar <- min(30, length(x$initialOrderedImportances))
   show <- c(FALSE, FALSE)
   show[which] <- TRUE
   if (show[1]) {
       dotchart(rev(x$initialOrderedImportances[1:nvar]), 
                           main = "Initial importances",
                           xlab = "Importances (unscaled)")
   }
   if (show[2]) {
       ylim <- c(0, max(0.5, x$selec.history$OOB))
       plot(x$selec.history$Number.Variables, x$selec.history$OOB,
           type = "b", xlab = "Number of variables used", ylab = "OOB error",
           log = "x", ylim = ylim, ...)
       lines(x$selec.history$Number.Variables, x$selec.history$OOB +
           2 * x$selec.history$sd.OOB, lty = 2)
       lines(x$selec.history$Number.Variables, x$selec.history$OOB -
           2 * x$selec.history$sd.OOB, lty = 2)
   }
}

So we could just modify the call to dotchart to add labels. For now, however, you
can either use random forest directly or modify the code.
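
A minimal sketch of what "add labels" could look like, using only base R. The helper name plot_importances_labeled and its arguments are hypothetical and not part of varSelRF. It assumes you can supply the importances as a numeric vector together with the matching variable names; base R's dotchart() accepts these through its labels argument.

```r
# Hypothetical helper (NOT part of varSelRF): a dot chart of importances
# that always shows variable names.
#   imp       numeric vector of importances
#   var_names character vector of the same length (defaults to names(imp))
#   nvar      how many of the top variables to show (default: at most 30)
plot_importances_labeled <- function(imp, var_names = names(imp), nvar = NULL) {
    # Fails early if no names are available (e.g. names(imp) is NULL)
    stopifnot(is.numeric(imp), length(var_names) == length(imp))
    # Order by decreasing importance, keeping the names aligned
    ord <- order(imp, decreasing = TRUE)
    imp <- imp[ord]
    var_names <- var_names[ord]
    if (is.null(nvar))
        nvar <- min(30, length(imp))
    nvar <- min(nvar, length(imp))
    # Reverse so that the most important variable is drawn at the top
    top <- rev(seq_len(nvar))
    dotchart(imp[top], labels = var_names[top],
             main = "Initial importances",
             xlab = "Importances (unscaled)")
    # Return the plotted values, named, so they can be inspected
    invisible(stats::setNames(imp[top], var_names[top]))
}
```

With made-up values, plot_importances_labeled(c(geneA = 0.02, geneB = 0.10, geneC = 0.05)) draws geneB at the top and geneA at the bottom. To use it with a fitted varSelRF object you need to find where the variable names are stored. If x$initialOrderedImportances turns out to be unnamed, another component of the object may still carry the names, but that is an assumption to check with str(x) before relying on it.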
