Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Issue with checkConsistency when coding a probabilistic PCA #24

Open
balglave opened this issue Apr 10, 2024 · 2 comments
Open

Issue with checkConsistency when coding a probabilistic PCA #24

balglave opened this issue Apr 10, 2024 · 2 comments

Comments

@balglave
Copy link

balglave commented Apr 10, 2024

I want to code a probabilistic PCA in RTMB and when I run the function checkconsistency it looks that I have huge biases.
Here is the code. If anyone can explain me what is happending, I'd be interested.

library(MASS)
library(RTMB)


n_ind <- 125 # number f individuals
n_var <- 5 # number of variables
Y <- mvrnorm(n_ind,mu = rep(0,n_var),Sigma = diag(1,n_var))

data <- list(Y = Y)

parameters <- list(b = rep(0,n_var)) # define intercepts
maps_param <- list()

parameters[["E"]] <- diag(1,nrow = n_var,ncol = n_var)
m2 <- diag(1:n_var,nrow = n_var,ncol = n_var) # make it a diagonal matrix
m2[which(m2==0)] <- NA
maps_param[["E"]] <- as.factor(m2)

parameters[["Lambda"]] <- matrix(runif(ncol(Y)*n_var,0,2),nrow = ncol(Y),ncol = n_var)
parameters[["Lambda"]][!lower.tri(parameters[["Lambda"]],diag = T)] <- 0 # make it triangular inferior

map_lambda <- matrix(1:(ncol(Y)*n_var),ncol(Y),n_var) # Constraints on loadings
lower.tri(map_lambda)
map_lambda[!lower.tri(map_lambda,diag = T)] <- NA
maps_param[["Lambda"]] <- as.factor(map_lambda)

parameters[["delta"]] <- matrix(rnorm(nrow(Y)*n_var,0,1),nrow = nrow(Y),ncol = n_var)
random_vec <- "delta"

f <- function(parms) {
  
  nll <- 0
  
  Y <- data$Y

  b <- parms$b
  L <- parms$L
  E <- parms$E
  Lambda <- parms$Lambda
  delta <- parms$delta

  mu <- matrix(0,nrow = nrow(Y),ncol = ncol(Y)) # Trend/intercept
  for(i in 1:nrow(mu)){
    
    nll <- nll - sum(RTMB::dmvnorm(x=delta[i,], mu=rep(0,n_var), diag(1,ncol(delta)), TRUE))
    mu[i,] <- b + Lambda %*% delta[i,] # PCA
    
  }
  
  nll = nll - sum(RTMB::dmvnorm(Y, mu, E, TRUE))
  
  REPORT(Lambda)
  
  return(nll)
  
}

obj <- MakeADFun(f, parameters,map = maps_param,random = random_vec)

opt = nlminb( start=obj$par, objective=obj$fn, gradient=obj$gr)
rep <- sdreport(obj)
report <- obj$report()

checkConsistency(obj)

@kaskr
Copy link
Owner

kaskr commented Apr 11, 2024

To have all the automated stuff working (simulation, checking, etc) you should currently (to be relaxed) follow the introduction vignette fairly closely. I.e. re-code the data/parameter fetching by replacing

  Y <- data$Y
  b <- parms$b
  L <- parms$L
  E <- parms$E
  Lambda <- parms$Lambda
  delta <- parms$delta

by

  getAll(data, parms, warn=FALSE)
  Y <- OBS(Y)

By making this change it should almost work, except you hit an edge case here because RTMB simply redirects mvnorm simulation to MASS::mvrnorm which doesn't allow matrix mu (to be fixed in RTMB) (EDIT: The RTMB documentation states that mu must be a vector, so an error should be thrown):

nll = nll - sum(RTMB::dmvnorm(Y, mu, E, TRUE))

Simple fix is to be replace by

nll = nll - sum(RTMB::dmvnorm(Y - mu, 0, E, TRUE))

BTW, the consistency check should point you to the fact that your model is over-parameterized...

@balglave
Copy link
Author

@kaskr It worked thanks !

Do you have any suggestions on how I could avoid over-parameterization?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants