generate_taxa_test_pair() issue with time.var = NULL in random effect #33

16svale · 2024-02-18T17:59:47Z

Dear developer,
I am encountering an issue when using the generate_taxa_test_pair() function to test for random effect.

When I use the basic linda function as below it works:

linda.obj.all <- linda(asv_table[, ind], meta[ind, ], formula = '~SampleType1 + (1|SSF)',
feature.dat.type = 'count',
prev.filter = 0.1,
mean.abund.filter = 0.004 ,
is.winsor = TRUE, outlier.pct = 0.03,
p.adj.method = "BH", alpha = 0.05)

However, the generate_taxa_test_pair() that should work the same but used as below gives me an error:
#create test.list
test.list <- generate_taxa_test_pair(
data.obj = data.obj,
group.var = "sampletype1", adj.vars = NULL,
subject.var = "SSF", time.var = NULL,
feature.dat.type = "count",
feature.level = "Genus",
prev.filter = 0.1,
abund.filter = 0.004)

Output:
Rule 1 passed: data.obj is a list.
Rule 2 passed: meta.dat is a data.frame.
Rule 3 passed: The row names of feature.tab match the row names of feature.ann.
Rule 4 passed: The column names of feature.tab match the row names of meta.dat.
Rule 5 passed: feature.tab is a matrix.
Rule 6 passed: feature.ann is a matrix.
Please note: The data components should follow base R data.frame and matrix structures, not phyloseq's formal class.
Validation passed.
Note: Passing validation does not guarantee the absence of all data issues. Further data exploration may be needed.
Your data is in raw format ('Raw'). Normalization is crucial for further analyses. Now, 'mStat_normalize_data' function is automatically applying 'TSS' transformation.
Data has been successfully normalized using TSS method.
0 features are filtered!

The filtered data has 98 samples and 49 features that will be tested!

Fit linear mixed effects models ...
Completed.
Error in grepl(pattern = time.var, x = df_name) :
invalid 'pattern' argument

The generate_taxa_test_pair() function seems designed to optionally handle longitudinal data, given it accepts a time.var parameter, which I set to NULL. This design choice suggests it should be capable of running analyses without a time component, but the implementation does not gracefully handle a NULL time.var.

I write here a piece of the core code of the generate_taxa_test_pair() function:

Case where time.var is NULL

  if (is.null(time.var)) {
    if (is.null(group.var)) {
      fixed_effects <- adj.vars_str
      if (is.null(fixed_effects)) {
        fixed_effects <- "1"  # Intercept-only model
      }
    } else {
      if (!is.null(adj.vars_str)) {
        fixed_effects <- paste(adj.vars_str, "+", group.var)
      } else {
        fixed_effects <- group.var
      }
    }
    random_effects <- paste("(1 |", subject.var, ")")

Do you have any solution for this issue?

Thank you in advance I hope I can improve to implement the use of this package.
Best Regards,
Valentina

The text was updated successfully, but these errors were encountered:

16svale · 2024-02-18T18:27:40Z

Is it possible that you should modify the part of the generate_taxa_test_pair function that handles time.var to ensure it only attempts to use time.var in grepl if time.var is not NULL?

A suggestion foe the core code but not sure if it works:

Before using time.var in grepl, check it's not NULL

if (!is.null(time.var) && !is.null(df_name)) {
if (grepl(pattern = time.var, x = df_name)) {
#rest of the code
}
}

cafferychen777 · 2024-02-19T08:40:49Z

Dear Valentina,

Thank you for your detailed description of the issue you encountered while using the generate_taxa_test_pair() function.

Indeed, the generate_taxa_test_pair() function is designed primarily for paired samples, which inherently involves a time component, such as "pre" and "post" measurements. Therefore, the parameter time.var cannot be set to NULL as the function expects this information to appropriately handle the paired nature of the data.

While I understand your suggestion regarding modifying the function to handle NULL time.var gracefully, it contradicts the fundamental design principle of the function, which requires a time component for paired sample analysis.

Thank you for your understanding, and I hope this clarifies the situation. If you have any further questions or need assistance with alternative analyses, please don't hesitate to ask.

Best Regards,

Chen YANG

16svale · 2024-02-19T09:33:52Z

Dear @cafferychen777 ,
I understand your answer.
I am wondering if there is another function within the package that you developed that can be used for this purpose.
I mean to get the same results as if I use the linda function and take into consideration the random effect of only one variable:

linda.obj.all <- linda(asv_table[, ind], meta[ind, ], formula = '~SampleType1 + (1|SSF)',
feature.dat.type = 'count',
prev.filter = 0.1,
mean.abund.filter = 0.004 ,
is.winsor = TRUE, outlier.pct = 0.03,
p.adj.method = "BH", alpha = 0.05)

or as the LinDa documentation example report:
linda.obj <- linda(otu.tab, meta, formula = '~Smoke + (1|SubjectID)', alpha = 0.1,
prev.cut = 0.1, lib.cut = 1000, winsor.quan = 0.97)

I believe it must be possible to implement a random effect variable even if is not a timepoint.

Best,
Valentina

cafferychen777 · 2024-02-19T12:17:39Z

Dear Valentina,

Thank you for your follow-up question.

I understand your desire to replicate the functionality of the linda function while considering a random effect that is not explicitly a timepoint. However, as the linda function is maintained by other members of our team, any modifications to its functionality would require coordination and discussion among team members.

I apologize for any inconvenience this may cause. I will communicate your request to the relevant team members and discuss the possibility of implementing such functionality in future updates of the package.

Thank you for your understanding and patience in this matter. If you have any further questions or concerns, please feel free to reach out.

Best Regards,

Chen YANG

16svale · 2024-02-19T14:26:53Z

Dear Chen Yang,

I understand. However, the linda function already allows the inclusion of a random effect variable that is not explicitly a timepoint (as shown in the documentation of linda() function). Is it correct?

Best Regards.

16svale added the bug Something isn't working label Feb 18, 2024

16svale changed the title ~~Issue with random effect in the~~ Issue with random effect in the generate_taxa_test_paired() Feb 18, 2024

16svale changed the title ~~Issue with random effect in the generate_taxa_test_paired()~~ Issue with random effect in the generate_taxa_test_pair() Feb 18, 2024

16svale changed the title ~~Issue with random effect in the generate_taxa_test_pair()~~ generate_taxa_test_pair() issue with time.var = NULL in random effect Feb 18, 2024

cafferychen777 added Not a Bug - Solution Provided This behavior is intentional as part of our design. However, we've provided a solution to accommodat and removed bug Something isn't working labels Feb 19, 2024

cafferychen777 added the By Design This behavior is intentional and is part of the package's design. It's not considered a bug or an is label Feb 19, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

generate_taxa_test_pair() issue with time.var = NULL in random effect #33

generate_taxa_test_pair() issue with time.var = NULL in random effect #33

16svale commented Feb 18, 2024 •

edited

Loading

16svale commented Feb 18, 2024

cafferychen777 commented Feb 19, 2024

16svale commented Feb 19, 2024 •

edited

Loading

cafferychen777 commented Feb 19, 2024

16svale commented Feb 19, 2024

generate_taxa_test_pair() issue with time.var = NULL in random effect #33

generate_taxa_test_pair() issue with time.var = NULL in random effect #33

Comments

16svale commented Feb 18, 2024 • edited Loading

Case where time.var is NULL

16svale commented Feb 18, 2024

Before using time.var in grepl, check it's not NULL

cafferychen777 commented Feb 19, 2024

16svale commented Feb 19, 2024 • edited Loading

cafferychen777 commented Feb 19, 2024

16svale commented Feb 19, 2024

16svale commented Feb 18, 2024 •

edited

Loading

16svale commented Feb 19, 2024 •

edited

Loading