Distributed inoperable in Docker Hub version #65

Open
dimalvovs opened this issue Nov 27, 2023 · 1 comment
Labels
bug Something isn't working

Comments

@dimalvovs
Contributor

Distributed mode does not work in the Docker Hub version of pycogaps. A newer Docker image is available on ghcr, so it would make sense to maintain only one of the two.

Steps to reproduce:

  1. Pull the image: docker pull fertiglab/pycogaps
  2. Run the image: docker run -it --entrypoint /bin/bash fertiglab/pycogaps
  3. Verify that a standard (non-distributed) run works:
echo "if __name__ == '__main__':
    from PyCoGAPS.parameters import *
    from PyCoGAPS.pycogaps_main import CoGAPS
    import scanpy as sc

    modsimpath = 'data/ModSimData.txt'
    modsim = sc.read_text(modsimpath)

    params = CoParams(path=modsimpath)
    params.printParams()

    setParams(params, {
        'nIterations':10000,
        'seed': 42,
        'nPatterns': 3
    })

    params.printParams()
    start = time.time()
    result = CoGAPS(modsimpath, params)
    end = time.time()
    print('TIME:', end - start)

    result.write('data/dist_modsim.h5ad')" > test2.py

python3 test2.py 

______      _____       _____   ___  ______  _____ 
| ___ \    /  __ \     |  __ \ / _ \ | ___ \/  ___|
| |_/ /   _| /  \/ ___ | |  \// /_\ \| |_/ /\ `--. 
|  __/ | | | |    / _ \| | __ |  _  ||  __/  `--. |
| |  | |_| | \__/\ (_) | |_\ \| | | || |    /\__/ /
\_|   \__, |\____/\___/ \____/\_| |_/\_|    \____/ 
       __/ |                                       
      |___/             
                                 
                    

-- Standard Parameters --
nPatterns:  3
nIterations:  1000
seed:  0
sparseOptimization:  False


-- Sparsity Parameters --
alpha: 0.01
maxGibbsMass:  100.0

setting distributed parameters - call this again if you change nPatterns
if you wish to perform genome-wide distributed cogaps, please run setParams(params, "distributed", "genome-wide")

-- Standard Parameters --
nPatterns:  3
nIterations:  10000
seed:  42
sparseOptimization:  False


-- Sparsity Parameters --
alpha: 0.01
maxGibbsMass:  100.0

This is pycogaps version  0.0.1
Running Standard CoGAPS on ModSimData.txt ( 25 genes and 20 samples) with parameters: 

-- Standard Parameters --
nPatterns:  3
nIterations:  10000
seed:  42
sparseOptimization:  False


-- Sparsity Parameters --
alpha: 0.01
maxGibbsMass:  100.0

Data Model: Dense, Normal
Sampler Type: Sequential
Loading Data...Done! (00:00:00)
-- Equilibration Phase --
1000 of 10000, Atoms: 64(A), 45(P), ChiSq: 1830, Time: 00:00:00 / 00:00:00
2000 of 10000, Atoms: 69(A), 42(P), ChiSq: 1466, Time: 00:00:00 / 00:00:00
3000 of 10000, Atoms: 80(A), 50(P), ChiSq: 1229, Time: 00:00:00 / 00:00:00
4000 of 10000, Atoms: 73(A), 54(P), ChiSq: 1212, Time: 00:00:00 / 00:00:00
5000 of 10000, Atoms: 86(A), 52(P), ChiSq: 1151, Time: 00:00:00 / 00:00:00
6000 of 10000, Atoms: 81(A), 52(P), ChiSq: 1151, Time: 00:00:00 / 00:00:00
7000 of 10000, Atoms: 75(A), 48(P), ChiSq: 1178, Time: 00:00:00 / 00:00:00
8000 of 10000, Atoms: 70(A), 57(P), ChiSq: 1155, Time: 00:00:00 / 00:00:00
9000 of 10000, Atoms: 73(A), 54(P), ChiSq: 1173, Time: 00:00:00 / 00:00:00
10000 of 10000, Atoms: 79(A), 58(P), ChiSq: 1159, Time: 00:00:00 / 00:00:00
-- Sampling Phase --
1000 of 10000, Atoms: 74(A), 51(P), ChiSq: 1125, Time: 00:00:00 / 00:00:00
2000 of 10000, Atoms: 78(A), 56(P), ChiSq: 1161, Time: 00:00:00 / 00:00:00
3000 of 10000, Atoms: 79(A), 57(P), ChiSq: 1166, Time: 00:00:00 / 00:00:00
4000 of 10000, Atoms: 69(A), 55(P), ChiSq: 1176, Time: 00:00:00 / 00:00:00
5000 of 10000, Atoms: 80(A), 55(P), ChiSq: 1175, Time: 00:00:00 / 00:00:00
6000 of 10000, Atoms: 81(A), 48(P), ChiSq: 1168, Time: 00:00:00 / 00:00:00
7000 of 10000, Atoms: 73(A), 56(P), ChiSq: 1151, Time: 00:00:00 / 00:00:00
8000 of 10000, Atoms: 72(A), 51(P), ChiSq: 1156, Time: 00:00:00 / 00:00:00
9000 of 10000, Atoms: 75(A), 60(P), ChiSq: 1155, Time: 00:00:00 / 00:00:00
10000 of 10000, Atoms: 80(A), 50(P), ChiSq: 1179, Time: 00:00:00 / 00:00:00

GapsResult result object with 25 features and 20 samples
3 patterns were learned

TIME: 0.9086663722991943
  4. Create the script for the distributed run:
echo "if __name__ == '__main__':
    from PyCoGAPS.parameters import *
    from PyCoGAPS.pycogaps_main import CoGAPS
    import scanpy as sc

    modsimpath = 'data/ModSimData.txt'
    modsim = sc.read_text(modsimpath)

    params = CoParams(path=modsimpath)
    params.printParams()

    setParams(params, {
        'nIterations':10000,
        'seed': 42,
        'nPatterns': 3,
        'useSparseOptimization': True,
        'distributed': 'genome-wide'
    })

    params.setDistributedParams(nSets=2)
    params.printParams()
    start = time.time()
    result = CoGAPS(modsimpath, params)
    end = time.time()
    print('TIME:', end - start)

    result.write('data/dist_modsim.h5ad')" > test.py
  5. Run the program: python3 test.py
  6. The output contains an error:
multiprocessing.pool.RemoteTraceback: 
"""
Traceback (most recent call last):
  File "/usr/local/lib/python3.8/multiprocessing/pool.py", line 125, in worker
    result = (True, func(*args, **kwds))
  File "/home/user/pycogaps-docker/PyCoGAPS/pycogaps_main.py", line 313, in callInternalCoGAPS
    gapsresult = standardCoGAPS(adata, params, uncertainty, transposeData=params.coparams["transposeData"])
  File "/home/user/pycogaps-docker/PyCoGAPS/pycogaps_main.py", line 166, in standardCoGAPS
    result = GapsResultToAnnData(gapsresultobj, adata, prm)
  File "/home/user/pycogaps-docker/PyCoGAPS/helper_functions.py", line 434, in GapsResultToAnnData
    Pmean = toNumpy(gapsresult.Pmean)[prm.coparams["subsetIndices"], :]
IndexError: index 22 is out of bounds for axis 0 with size 20
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "test.py", line 23, in <module>
    result = CoGAPS(modsimpath, params)
  File "/home/user/pycogaps-docker/PyCoGAPS/pycogaps_main.py", line 44, in CoGAPS
    result = distributedCoGAPS(path, params, uncertainty=None)
  File "/home/user/pycogaps-docker/PyCoGAPS/pycogaps_main.py", line 197, in distributedCoGAPS
    result = list(result)
  File "/usr/local/lib/python3.8/multiprocessing/pool.py", line 868, in next
    raise value
IndexError: index 22 is out of bounds for axis 0 with size 20
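
The size-20 axis matches the 20 samples reported for ModSimData.txt, while index 22 would be valid for its 25 genes. One possible reading (an assumption, not checked against the PyCoGAPS source) is that gene subset indices are being applied to a matrix with one row per sample. The sketch below only illustrates that kind of mismatch in plain Python; take_rows, pmean and subset_indices are hypothetical names, not part of PyCoGAPS.

```python
# Hypothetical illustration of the suspected mismatch; this is not PyCoGAPS code.
N_GENES, N_SAMPLES = 25, 20  # ModSimData.txt dimensions from the log above


def take_rows(matrix, indices):
    """Select rows by index, failing early with an explicit size message."""
    n_rows = len(matrix)
    bad = [i for i in indices if not 0 <= i < n_rows]
    if bad:
        raise IndexError(
            f"indices {bad} out of bounds for axis 0 with size {n_rows}"
        )
    return [matrix[i] for i in indices]


# Stand-in for a pattern matrix with one row per sample and 3 patterns.
pmean = [[0.0] * 3 for _ in range(N_SAMPLES)]

# Assumed gene subset: every index is a valid gene index (< 25),
# but 22 is not a valid row of a 20-row, sample-indexed matrix.
subset_indices = [1, 5, 22]

try:
    take_rows(pmean, subset_indices)
except IndexError as err:
    message = str(err)
    print(message)
```

If this reading is right, a check of this kind before the indexing step would at least turn the failure into a clearer message about which axis the subset was meant for.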
@dimalvovs dimalvovs added the bug Something isn't working label Nov 27, 2023
@tomsing1

@dimalvovs Thank you for flagging this bug, and for pointing out the ghcr image. It seems like that image doesn't have a functional vignette_from_args.py script, though:

Traceback (most recent call last):
  File "/pycogaps/vignette_from_args.py", line 45, in <module>
    setParams(params, prm['run_params'])
  File "/pycogaps/PyCoGAPS/parameters.py", line 168, in setParams
    setParam(paramobj, k, v)
  File "/pycogaps/PyCoGAPS/parameters.py", line 247, in setParam
    setattr(paramobj.gaps, whichParam, value)
AttributeError: 'pycogaps.GapsParameters' object has no attribute 'uncertainty'

Do you have advice on how to build or obtain a Docker image that can process jobs based on a custom YAML parameter file? Thanks for any pointers!
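
The AttributeError suggests that run_params contains a key ("uncertainty") that the parameters object in that image does not accept. As a possible stopgap, the keys could be filtered before they are applied. The sketch below is a minimal illustration under that assumption: FakeGapsParameters is a hypothetical stand-in for pycogaps.GapsParameters, and split_known_params is a hypothetical helper, not a PyCoGAPS function.

```python
class FakeGapsParameters:
    """Hypothetical stand-in for an object that rejects unknown attributes."""

    __slots__ = ("nPatterns", "nIterations", "seed")

    def __init__(self):
        self.nPatterns = 3
        self.nIterations = 1000
        self.seed = 0


def split_known_params(target, requested):
    """Separate requested settings into those the target exposes and the rest."""
    known = {k: v for k, v in requested.items() if hasattr(target, k)}
    unknown = sorted(set(requested) - set(known))
    return known, unknown


gaps = FakeGapsParameters()
# Example settings as they might be loaded from a YAML file (assumed values).
run_params = {"nPatterns": 3, "nIterations": 10000, "seed": 42, "uncertainty": None}

known, unknown = split_known_params(gaps, run_params)
for name, value in known.items():
    setattr(gaps, name, value)  # only attributes the object already exposes

print("applied:", sorted(known))
print("skipped:", unknown)
```

This only avoids the crash. Whether "uncertainty" should be handled elsewhere, for example passed to CoGAPS directly, would need to be confirmed by the maintainers.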
