Skip to content

Commit

Permalink
Merge branch 'develop'
Browse files Browse the repository at this point in the history
  • Loading branch information
svkucheryavski committed Sep 13, 2021
2 parents 1418b8a + b5a274f commit 815c4ed
Show file tree
Hide file tree
Showing 44 changed files with 991 additions and 166 deletions.
4 changes: 2 additions & 2 deletions DESCRIPTION
Original file line number Diff line number Diff line change
@@ -1,7 +1,7 @@
Package: mdatools
Title: Multivariate Data Analysis for Chemometrics
Version: 0.11.5
Date: 2021-04-26
Version: 0.12.0
Date: 2021-09-12
Author: Sergey Kucheryavskiy (<https://orcid.org/0000-0002-3145-7244>)
Maintainer: Sergey Kucheryavskiy <[email protected]>
Description: Projection based methods for preprocessing,
Expand Down
9 changes: 7 additions & 2 deletions NAMESPACE
Original file line number Diff line number Diff line change
Expand Up @@ -11,7 +11,6 @@ S3method(as.matrix,simcares)
S3method(categorize,pca)
S3method(categorize,pls)
S3method(confint,regcoeffs)
S3method(employ,constraint)
S3method(getCalibrationData,simcam)
S3method(getConfusionMatrix,classres)
S3method(getProbabilities,pca)
Expand Down Expand Up @@ -171,12 +170,14 @@ export(dd.crit)
export(ddmoments.param)
export(ddrobust.param)
export(ellipse)
export(employ)
export(employ.constraint)
export(employ.prep)
export(eye)
export(fprintf)
export(getCalibrationData)
export(getConfusionMatrix)
export(getImplementedConstraints)
export(getImplementedPrepMethods)
export(getProbabilities)
export(getPureVariables)
export(getRegcoeffs)
Expand Down Expand Up @@ -295,13 +296,17 @@ export(pls.run)
export(plsda)
export(plsdares)
export(plsres)
export(prep)
export(prep.alsbasecorr)
export(prep.autoscale)
export(prep.list)
export(prep.msc)
export(prep.norm)
export(prep.ref2km)
export(prep.savgol)
export(prep.snv)
export(prep.transform)
export(prep.varsel)
export(prepCalData)
export(randtest)
export(regcoeffs)
Expand Down
34 changes: 33 additions & 1 deletion NEWS.md
Original file line number Diff line number Diff line change
@@ -1,3 +1,35 @@
v.0.12.0
========
This release is mostly about preprocessing - added some new methods, improved the existent once and implemented a possibility to combine preprocessing methods together (including parameter values) and apply them all together in a correct sequence. See [preprocessing section](https://mda.tools/docs/preprocessing.html) in the tutorials for details

## New features and improvements

* method `prep.norm()` for normalization of spectra (or any other signals) is more versatile now and supports normalization to unit sum, length, area, to height or area under internal standard peak, and SNV. SNV via `prop.snv()` is still supported for compatibility.

* `prep.savgol()` has been rewritten to fix a minor bug when first derivative was inverted, but also to make the handling of the edge points better. See details in help text for the function and in the tutorial.

* added a new method `prep.transform()` which can be used for transformation of values of e.g. response variable to handle non-linearity.

* added a new method `prep.varsel()` which makes possible to select particular variables as a part of preprocessing framework. For example you can apply baseline correction, normalization and noise suppression to the whole spectra and after that select only a particular part for modelling.

* added new method `prep()` which let you to combine several preprocessing methods and their parameters into a list and use e.g. it as a part of model.


## Bug fixes

* fixed a bug in `mcrals()` which in rare occasions could lead to a wrong error message.

* fixed a bug when attribute `yaxis.value` was used as `ylab` when creating line and bar plots.

* fixed an earlier reported issue with plotXYResiduals ([#100](https://github.com/svkucheryavski/mdatools/issues/100))


## Other changes

* function `employ()` which was used to employ constraints in MCR-ALS has been renamed to `employ.constraint()`. The function is for internal use and this change should not give any issues in your code.

* the user guides have been revised and improved.

v.0.11.5
========

Expand Down Expand Up @@ -108,7 +140,7 @@ Finally, all model results (calibration, cross-validation and test set validatio
into a single list, `model$res`. This makes a lot of things easier. However, the old way of
accessing the result objects (e.g. `model$calres` or `model$cvres`) still works, you can access e.g. calibration results both using `model$res$cal` and `model$calres`, so this change will not break the compatibility.

Below is more detailed list of changes. The [tutorial](https://mdatools.com/docs/) has been updated accordingly.
Below is more detailed list of changes. The [tutorial](https://mda.tools/docs/) has been updated accordingly.

## Breaking changes

Expand Down
4 changes: 3 additions & 1 deletion R/constraints.R
Original file line number Diff line number Diff line change
Expand Up @@ -276,9 +276,11 @@ constraint <- function(name, params = NULL, method = NULL) {
#' matrix with pure spectra or contributions
#' @param d
#' matrix with original spectral values
#' @param ...
#' other arguments
#'
#' @export
employ.constraint <- function(obj, x, d) {
employ.constraint <- function(obj, x, d, ...) {
return(do.call(obj$method, c(list(x = x, d = d), obj$params)))
}

13 changes: 0 additions & 13 deletions R/defaults.R
Original file line number Diff line number Diff line change
@@ -1,16 +1,3 @@
#' Generic function to apply a method (e.g. constraint)
#'
#' @param obj
#' constraint object
#' @param x
#' data in question (e.g. resolved spectra)
#' @param d
#' matrix with original data
#'
#' @export
employ <- function(obj, x, d) {
UseMethod("employ")
}

#' Plot purity spectra
#' @param obj
Expand Down
11 changes: 6 additions & 5 deletions R/ldecomp.R
Original file line number Diff line number Diff line change
Expand Up @@ -107,17 +107,19 @@ plotCumVariance.ldecomp <- function(obj, type = "b", labels = "values", show.plo
#' vector with ticks for x-axis
#' @param show.plot
#' logical, shall plot be created or just plot series object is needed
#' @param ylab
#' label for y-axis
#' @param ...
#' most of graphical parameters from \code{\link{mdaplot}} function can be used.
#'
#' @export
plotVariance.ldecomp <- function(obj, type = "b", variance = "expvar", labels = "values",
xticks = seq_len(obj$ncomp), show.plot = TRUE, ...) {
xticks = seq_len(obj$ncomp), show.plot = TRUE, ylab = "Explained variance, %", ...) {

if (!show.plot) return(obj[[variance]])

return(
mdaplot(obj[[variance]], xticks = xticks, labels = labels, type = type, ...)
mdaplot(obj[[variance]], xticks = xticks, labels = labels, type = type, ylab = ylab, ...)
)
}

Expand Down Expand Up @@ -387,7 +389,6 @@ ldecomp.getVariances <- function(scores, loadings, residuals, Q) {
attr(expvar, "name") <- "Variance"
attr(cumexpvar, "name") <- "Cumulative variance"
attr(expvar, "xaxis.name") <- attr(cumexpvar, "xaxis.name") <- "Components"
attr(expvar, "yaxis.name") <- attr(cumexpvar, "yaxis.name") <- "Explained variance, %"

return(list(expvar = expvar, cumexpvar = cumexpvar))
}
Expand Down Expand Up @@ -826,7 +827,7 @@ ldecomp.getT2Limits <- function(lim.type, alpha, gamma, params) {
if (lim.type %in% c("jm", "chisq")) DoF <- pT2$nobj

lim <- switch(lim.type,
"jm" = ,
"jm" = hotelling.crit(pT2$nobj, seq_len(ncomp), alpha, gamma),
"chisq" = hotelling.crit(pT2$nobj, seq_len(ncomp), alpha, gamma),
"ddmoments" = scale(dd.crit(pQ, pT2, alpha, gamma), center = FALSE, scale = DoF / pT2$u0),
"ddrobust" = scale(dd.crit(pQ, pT2, alpha, gamma), center = FALSE, scale = DoF / pT2$u0),
Expand Down Expand Up @@ -1001,7 +1002,7 @@ ldecomp.plotResiduals <- function(res, Qlim, T2lim, ncomp, log = FALSE, norm = F
getPlotLim <- function(lim, pd, ld, dim) {
if (!is.null(lim) || all(!show.limits)) return(lim)
limits <- if (show.limits[[2]]) max(ld$outliers[, dim]) else max(ld$extremes[, dim])
return( c(0, max(sapply(pd, function(x) { max(c(getValues(x, dim), limits)) * 1.05}))) )
return(c(0, max(sapply(pd, function(x) max(c(getValues(x, dim), limits)) * 1.05))))
}

# check that show.limits is logical
Expand Down
14 changes: 11 additions & 3 deletions R/mcrals.R
Original file line number Diff line number Diff line change
Expand Up @@ -462,7 +462,7 @@ mcrals.cal <- function(D, ncomp, cont.constraints, spec.constraints, spec.ini,

## apply constraints to the resolved contributions
for (cc in cont.constraints) {
Ct <- employ(cc, x = Ct, d = D)
Ct <- employ.constraint(cc, x = Ct, d = D)
}

## resolve spectra
Expand All @@ -481,15 +481,23 @@ mcrals.cal <- function(D, ncomp, cont.constraints, spec.constraints, spec.ini,

## apply constraints to the resolved spectra
for (sc in spec.constraints) {
St <- employ(sc, x = St, d = D)
St <- employ.constraint(sc, x = St, d = D)
}

if (verbose) {
cat(sprintf("Iteration %4d, R2 = %.6f\n", i, var))
}

var_old <- var
var <- 1 - sum((D - tcrossprod(cont.solver(D, St), St))^2) / totvar
var <- tryCatch(
1 - sum((D - tcrossprod(cont.solver(D, St), St))^2) / totvar,
error = function(e) {
print(e)
stop("Unable to resolve the components, perhaps 'ncomp' is too large.\n
or initial estimates for spectra are not good enough.", call. = FALSE)
}
)

if ( (var - var_old) < tol) {
if (verbose) cat("No more improvements.\n")
break
Expand Down
20 changes: 12 additions & 8 deletions R/pca.R
Original file line number Diff line number Diff line change
Expand Up @@ -1023,6 +1023,8 @@ pca.cal <- function(x, ncomp, center, scale, method, rand = NULL) {
#' vector with ticks for x-axis
#' @param res
#' list with result objects to show the variance for
#' @param ylab
#' label for y-axis
#' @param ...
#' other plot parameters (see \code{mdaplotg} for details)
#'
Expand All @@ -1031,11 +1033,11 @@ pca.cal <- function(x, ncomp, center, scale, method, rand = NULL) {
#'
#' @export
plotVariance.pca <- function(obj, type = "b", labels = "values", variance = "expvar",
xticks = seq_len(obj$ncomp), res = obj$res, ...) {
xticks = seq_len(obj$ncomp), res = obj$res, ylab = "Explained variance, %", ...) {

res <- getRes(res, "ldecomp")
plot_data <- lapply(res, plotVariance, variance = variance, show.plot = FALSE)
mdaplotg(plot_data, xticks = xticks, labels = labels, type = type, ...)
mdaplotg(plot_data, xticks = xticks, labels = labels, type = type, ylab = ylab, ...)
}

#' Cumulative explained variance plot for PCA model
Expand Down Expand Up @@ -1302,14 +1304,16 @@ plotBiplot.pca <- function(obj, comp = c(1, 2), pch = c(16, NA), col = mdaplot.g
#' what to show as data points labels
#' @param xticks
#' vector with tick values for x-axis
#' @param ylab
#' label for y-axis
#' @param ...
#' other plot parameters (see \code{mdaplotg} for details)
#'
#' @details
#' Work only if parameter \code{lim.type} equal to "ddmoments" or "ddrobust".
#'
#' @export
plotT2DoF <- function(obj, type = "b", labels = "values", xticks = seq_len(obj$ncomp), ...) {
plotT2DoF <- function(obj, type = "b", labels = "values", xticks = seq_len(obj$ncomp), ylab = "Nh", ...) {

if (!(obj$lim.type %in% c("ddrobust", "ddmoments", "chisq"))) {
stop("This plot can not be made for selected 'lim.type' method.")
Expand All @@ -1318,8 +1322,7 @@ plotT2DoF <- function(obj, type = "b", labels = "values", xticks = seq_len(obj$n
plot_data <- mda.subset(obj$T2lim, subset = 4)
attr(plot_data, "name") <- "Degrees of freedom"
attr(plot_data, "xaxis.name") <- attr(obj$loadings, "xaxis.name")
attr(plot_data, "yaxis.name") <- "Nh"
mdaplot(plot_data, xticks = xticks, labels = labels, type = type, ...)
mdaplot(plot_data, xticks = xticks, labels = labels, type = type, ylab = ylab, ...)
}

#' Degrees of freedom plot for orthogonal distance (Nh)
Expand All @@ -1336,14 +1339,16 @@ plotT2DoF <- function(obj, type = "b", labels = "values", xticks = seq_len(obj$n
#' what to show as data points labels
#' @param xticks
#' vector with tick values for x-axis
#' @param ylab
#' label for y-axis
#' @param ...
#' other plot parameters (see \code{mdaplotg} for details)
#'
#' @details
#' Work only if parameter \code{lim.type} equal to "ddmoments" or "ddrobust".
#'
#' @export
plotQDoF <- function(obj, type = "b", labels = "values", xticks = seq_len(obj$ncomp), ...) {
plotQDoF <- function(obj, type = "b", labels = "values", xticks = seq_len(obj$ncomp), ylab = "Nq", ...) {

if (!(obj$lim.type %in% c("ddrobust", "ddmoments", "chisq"))) {
stop("This plot can not be made for selected 'lim.type' method.")
Expand All @@ -1352,8 +1357,7 @@ plotQDoF <- function(obj, type = "b", labels = "values", xticks = seq_len(obj$nc
plot_data <- mda.subset(obj$Qlim, subset = 4)
attr(plot_data, "name") <- "Degrees of freedom"
attr(plot_data, "xaxis.name") <- attr(obj$loadings, "xaxis.name")
attr(plot_data, "yaxis.name") <- "Nq"
mdaplot(plot_data, type = type, labels = labels, xticks = xticks, ...)
mdaplot(plot_data, type = type, labels = labels, xticks = xticks, ylab = ylab, ...)
}

#' Degrees of freedom plot for both distances
Expand Down
13 changes: 10 additions & 3 deletions R/plotseries.R
Original file line number Diff line number Diff line change
Expand Up @@ -71,6 +71,7 @@ plotseries <- function(data, type, cgroup = NULL, col = NULL, opacity = 1,
ps$xlim <- range(ps$x_values)
ps$ylim <- range(ps$y_values)
class(ps) <- "plotseries"

return(ps)
}

Expand Down Expand Up @@ -126,7 +127,7 @@ preparePlotData <- function(data) {
excluded_cols <- NULL
if (length(attrs$exclcols) > 0) {
excluded_cols <- mda.getexclind(attrs$exclcols, colnames(data), ncol(data))
data <- data[, -excluded_cols, drop = F]
data <- data[, -excluded_cols, drop = FALSE]
attrs$xaxis.values <- attrs$xaxis.values[-excluded_cols]
}

Expand All @@ -140,10 +141,10 @@ preparePlotData <- function(data) {
excluded_rows <- NULL
if (length(attrs$exclrows > 0)) {
excluded_rows <- mda.getexclind(attrs$exclrows, rownames(data), nrow(data))
excluded_data <- data[excluded_rows, , drop = F]
excluded_data <- data[excluded_rows, , drop = FALSE]
excluded_yaxis_values <- attrs$yaxis.values[excluded_rows]

data <- data[-excluded_rows, , drop = F]
data <- data[-excluded_rows, , drop = FALSE]
attrs$yaxis.values <- attrs$yaxis.values[-excluded_rows]
}

Expand Down Expand Up @@ -206,6 +207,12 @@ splitPlotData <- function(data, type) {
)
}


# 0.12.0: yaxis.name must not be used as axis label in line and bar plots
if (type %in% c("b", "l", "e", "h")) {
attrs$yaxis.name = NULL
}

# prepare x-axis values for other types of plots
y_values <- data
x_values <- attrs$xaxis.values
Expand Down
10 changes: 6 additions & 4 deletions R/pls.R
Original file line number Diff line number Diff line change
Expand Up @@ -946,6 +946,8 @@ plotYCumVariance.pls <- function(obj, type = "b", main = "Cumulative variance (Y
#' what to show as labels for plot objects.
#' @param res
#' list with result objects to show the plot for (by defaul, model results are used)
#' @param ylab
#' label for y-axis
#' @param ...
#' other plot parameters (see \code{mdaplotg} for details)
#'
Expand All @@ -954,10 +956,10 @@ plotYCumVariance.pls <- function(obj, type = "b", main = "Cumulative variance (Y
#'
#' @export
plotVariance.pls <- function(obj, decomp = "xdecomp", variance = "expvar", type = "b",
labels = "values", res = obj$res, ...) {
labels = "values", res = obj$res, ylab = "Explained variance, %", ...) {

plot_data <- lapply(res, plotVariance, decomp = decomp, variance = variance, show.plot = FALSE)
mdaplotg(plot_data, labels = labels, type = type, ...)
mdaplotg(plot_data, labels = labels, type = type, ylab = ylab, ...)
}

#' X scores plot for PLS
Expand Down Expand Up @@ -1178,9 +1180,9 @@ plotXYResiduals.pls <- function(obj, ncomp = obj$ncomp.selected, norm = TRUE, lo

# make plot
if (length(plot_data) == 1) {
mdaplot(plot_data[[1]], type = "p", xlim = xlim, ylim = ylim, cgroup = cgroup, ...)
mdaplot(plot_data[[1]], type = "p", xlim = xlim, ylim = ylim, cgroup = cgroup, main = main, ...)
} else {
mdaplotg(plot_data, type = "p", xlim = xlim, ylim = ylim, show.legend = show.legend,
mdaplotg(plot_data, type = "p", xlim = xlim, ylim = ylim, show.legend = show.legend, main = main,
legend.position = legend.position, ...)
}

Expand Down
9 changes: 3 additions & 6 deletions R/plsres.R
Original file line number Diff line number Diff line change
Expand Up @@ -466,7 +466,7 @@ plotXYScores.plsres <- function(obj, ncomp = 1, show.plot = TRUE, ...) {
#'
#' @export
plotXResiduals.plsres <- function(obj, ncomp = obj$ncomp.selected, norm = TRUE, log = FALSE,
main = sprintf("X-residuals (ncomp = %d)", ncomp), ...) {
main = sprintf("X-distances (ncomp = %d)", ncomp), ...) {

if (is.null(obj$xdecomp)) return(invisible(NULL))

Expand All @@ -487,20 +487,17 @@ plotXResiduals.plsres <- function(obj, ncomp = obj$ncomp.selected, norm = TRUE,
#' PLS results (object of class \code{plsres})
#' @param ncomp
#' how many components to use (if NULL - user selected optimal value will be used)
#' @param main
#' main title for the plot
#' @param ...
#' other plot parameters (see \code{mdaplot} for details)
#'
#' @details
#' Proxy for \code{\link{plotResiduals.regres}} function.
#'
#' @export
plotYResiduals.plsres <- function(obj, ncomp = obj$ncomp.selected,
main = sprintf("Y-residuals (ncomp = %d)", ncomp), ...) {
plotYResiduals.plsres <- function(obj, ncomp = obj$ncomp.selected, ...) {

if (is.null(obj$y.ref)) return(invisible(NULL))
return(plotResiduals.regres(obj, ncomp = ncomp, main = main, ...))
return(plotResiduals.regres(obj, ncomp = ncomp, ...))
}


Expand Down
Loading

0 comments on commit 815c4ed

Please sign in to comment.