
Using BatchtoolsParam() in fitGAM() #261

Open
Alexis-Varin opened this issue Jul 4, 2024 · 0 comments

Alexis-Varin commented Jul 4, 2024

Hello, I am opening this issue as a follow-up to #43, as I am encountering exactly the same problems mentioned there with BatchtoolsParam() in fitGAM() on a Slurm cluster, and I can add some insight after a few tests.

The error at names(res) <- nms occurs because res has one element per cell while nms has one element per gene. To test this, I used an sds object and a counts matrix of 100 genes and 37 cells and got the error:

Error in names(res) <- nms :
'names' attribute [100] must be the same length as the vector [37]

When I increased the number of genes to 200, it became:

Error in names(res) <- nms :
'names' attribute [200] must be the same length as the vector [37]
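
The length mismatch behind this message can be reproduced in base R without any cluster. The sketch below is only an illustration: `res` and `nms` mirror the names in the error message, but their contents here are made up (a 37-element list standing in for per-cell results, and 100 hypothetical gene names); it is not the tradeSeq code itself.

```r
# Stand-in for the collected results: one element per cell (37 cells).
res <- vector("list", 37)

# Stand-in for the expected names: one per gene (100 hypothetical genes).
nms <- paste0("gene", seq_len(100))

# Assigning more names than elements fails; capture the message instead of stopping.
msg <- tryCatch(
  {
    names(res) <- nms
    "no error"
  },
  error = function(e) conditionMessage(e)
)
print(msg)
```

The two bracketed numbers in the message are the length of the names vector and the length of the result, which is why changing the gene count changes only the first number.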

Then I tried 37 cells and 37 genes, and it gave the same second error that trebbiano reported:

Error: BiocParallel errors
4 remote errors, element index: 1, 3, 4, 12
33 unevaluated and other errors
first remote error:
Error in FUN(...): unused arguments (AL627309.1 = 0, AL669831.5 = 0, FAM41C = 0, AL645608.1 = 0, NOC2L = 0, KLHL17 = 0, PLEKHN1 = 0, AL645608.8 = 0, HES4 = 0, ISG15 = 1, AL645608.2 = 0, AGRN = 0, C1orf159 = 0, AL390719.2 = 0, TNFRSF18 = 0, TNFRSF4 = 0, SDF4 = 0, B3GALT6 = 0, C1QTNF12 = 0, UBE2J2 = 0, SCNN1D = 0, ACAP3 = 0, PUSL1 = 0, INTS11 = 0, CPTP = 0, TAS1R3 = 0, DVL1 = 0, AURKAIP1 = 1, CCNL2 = 0, MRPL20 = 2, ANKRD65 = 0, VWA1 = 0, ATAD3C = 0, ATAD3B = 0, ATAD3A = 0, TMEM240 = 0, SSU72 = 0)

I am now fairly sure this is purely a problem with how batchtools sends and collects jobs, since any other backend, such as MulticoreParam(), runs without error. I think it may come from the fact that, in the bplapply() call, .fitGAM() passes as.data.frame(t(as.matrix(counts)[id, ])) as the first argument, whereas in most uses of bplapply() the first argument is a simple integer vector such as 1:6. I suspect this completely throws off batchtools when it is called by BiocParallel.
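
A minimal base R sketch of this suspicion follows. It does not use batchtools, BiocParallel, or tradeSeq; the toy matrix and the names `fit_one`, `geneA`, `cell1`, and so on are hypothetical. It only shows that a cells-by-genes data frame iterated by column yields one call per gene, while the same data frame iterated by row yields one call per cell with gene names as argument names, which is the shape of the remote error above. Whether batchtools actually maps over rows in this situation is an assumption, not something confirmed here.

```r
# Toy counts matrix: 3 genes (rows) x 2 cells (columns); all names are made up.
counts <- matrix(
  c(0, 1, 2, 0, 3, 1),
  nrow = 3,
  dimnames = list(c("geneA", "geneB", "geneC"), c("cell1", "cell2"))
)

# Same shape of transformation as quoted in the issue: rows = cells, columns = genes.
df <- as.data.frame(t(counts))

# Hypothetical per-gene worker taking a single vector of counts.
fit_one <- function(y) sum(y)

# lapply() treats a data frame as a list of columns: one call per gene.
per_gene <- lapply(df, fit_one)
length(per_gene)  # 3, one result per gene

# A row-wise mapper makes one call per cell and passes every gene as a
# named argument, which fit_one() does not accept.
row_call <- tryCatch(
  do.call(fit_one, as.list(df[1, ])),
  error = function(e) conditionMessage(e)
)
print(row_call)
```

With real data, the row-wise case would give as many calls as there are cells, each carrying one named argument per gene.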

Unfortunately, I am entirely dependent on bioconda for all package versions. The versions I use are:
r-batchtools 0.9.17
bioconductor-biocparallel 1.36.0
bioconductor-tradeseq 1.16.0

These appear to be the latest, as the versions are the same in RStudio.

What seems very strange is that, with an object of for example 100 genes and 37 cells, I would expect 100 jobs to be submitted to the cluster, one per GAM fit. Instead, 37 jobs are submitted, one per cell.

Alexis-Varin changed the title from "Using BatchtoolsParam in fitGAM()" to "Using BatchtoolsParam() in fitGAM()" on Jul 4, 2024