SparseOptimization pattern discrepancy #77
Comments
Are you filtering genes with zero expression? It’s notable that ChiSq is negative.
From: rpalaganas ***@***.***>
Sent: Wednesday, January 10, 2024 12:22:27 PM
To: FertigLab/CoGAPS ***@***.***>
Cc: Subscribed ***@***.***>
Subject: [FertigLab/CoGAPS] SparseOptimization pattern discrepancy (Issue #77)
Good afternoon! I recently ran into an issue where there is a pattern discrepancy between runs with sparseOptimization set to TRUE versus FALSE. The code I ran and the output are below. With sparseOptimization set to TRUE, I noticed that the ChiSq value was -nan and, during the equilibration phase, the P matrix was 0. With sparseOptimization set to FALSE there seemed to be no problems; however, the number of patterns learned differed between the two cases, i.e. sparseOptimization = TRUE gave 5 patterns while sparseOptimization = FALSE gave 6 patterns. This was true for the whole range of pattern numbers that I ran (5-50).
SPARSE OPTIMIZATION ENABLED
params <- CogapsParams(nPatterns=5, nIterations=30000, seed=42,
sparseOptimization=TRUE,
distributed="genome-wide")
params <- setDistributedParams(params, nSets=6)
Hoxd10_matnp5 <- CoGAPS(Hoxd10_mat, params)
This is CoGAPS version 3.19.1
Running genome-wide CoGAPS on Hoxd10_mat (30407 genes and 380 samples) with parameters:
-- Standard Parameters --
nPatterns 5
nIterations 30000
seed 42
sparseOptimization TRUE
distributed genome-wide
-- Sparsity Parameters --
alpha 0.01
maxGibbsMass 100
-- Distributed CoGAPS Parameters --
nSets 6
cut 5
minNS 3
maxNS 9
Creating subsets...
set sizes (min, mean, max): (5067, 5067.833, 5072)
Running Across Subsets...
Data Model: Sparse, Normal
Sampler Type: Sequential
Loading Data...Done! (00:00:00)
worker 1 is starting!
worker 2 is starting!
worker 4 is starting!
worker 6 is starting!
worker 3 is starting!
worker 5 is starting!
-- Equilibration Phase --
1000 of 30000, Atoms: 13376(A), 1242(P), ChiSq: -nan, Time: 00:00:45 / 01:16:13
2000 of 30000, Atoms: 16106(A), 1260(P), ChiSq: -nan, Time: 00:01:49 / 01:22:36
3000 of 30000, Atoms: 17128(A), 1283(P), ChiSq: -nan, Time: 00:02:55 / 01:23:17
4000 of 30000, Atoms: 17792(A), 1349(P), ChiSq: -nan, Time: 00:04:02 / 01:22:58
5000 of 30000, Atoms: 18330(A), 1347(P), ChiSq: -nan, Time: 00:05:11 / 01:22:46
6000 of 30000, Atoms: 18639(A), 1351(P), ChiSq: -nan, Time: 00:06:20 / 01:22:16
7000 of 30000, Atoms: 19052(A), 1371(P), ChiSq: -nan, Time: 00:07:30 / 01:21:52
8000 of 30000, Atoms: 19157(A), 1356(P), ChiSq: -nan, Time: 00:08:42 / 01:21:42
9000 of 30000, Atoms: 19484(A), 1406(P), ChiSq: -nan, Time: 00:09:53 / 01:21:18
10000 of 30000, Atoms: 19616(A), 1439(P), ChiSq: -nan, Time: 00:11:05 / 01:21:01
11000 of 30000, Atoms: 19909(A), 1431(P), ChiSq: -nan, Time: 00:12:18 / 01:20:47
12000 of 30000, Atoms: 20044(A), 1442(P), ChiSq: -nan, Time: 00:13:31 / 01:20:32
13000 of 30000, Atoms: 20212(A), 1431(P), ChiSq: -nan, Time: 00:14:44 / 01:20:16
14000 of 30000, Atoms: 20338(A), 1433(P), ChiSq: -nan, Time: 00:15:57 / 01:19:59
15000 of 30000, Atoms: 20592(A), 1425(P), ChiSq: -nan, Time: 00:17:10 / 01:19:43
16000 of 30000, Atoms: 20532(A), 1439(P), ChiSq: -nan, Time: 00:18:24 / 01:19:30
17000 of 30000, Atoms: 20538(A), 1411(P), ChiSq: -nan, Time: 00:19:39 / 01:19:21
18000 of 30000, Atoms: 20627(A), 1413(P), ChiSq: -nan, Time: 00:20:54 / 01:19:12
19000 of 30000, Atoms: 20681(A), 1416(P), ChiSq: -nan, Time: 00:22:08 / 01:18:58
20000 of 30000, Atoms: 20596(A), 1430(P), ChiSq: -nan, Time: 00:23:21 / 01:18:41
21000 of 30000, Atoms: 20505(A), 1448(P), ChiSq: -nan, Time: 00:24:35 / 01:18:28
22000 of 30000, Atoms: 20471(A), 1451(P), ChiSq: -nan, Time: 00:25:49 / 01:18:15
23000 of 30000, Atoms: 20642(A), 1431(P), ChiSq: -nan, Time: 00:27:03 / 01:18:02
24000 of 30000, Atoms: 20576(A), 1432(P), ChiSq: -nan, Time: 00:28:16 / 01:17:47
25000 of 30000, Atoms: 20688(A), 1430(P), ChiSq: -nan, Time: 00:29:30 / 01:17:35
26000 of 30000, Atoms: 20671(A), 1434(P), ChiSq: -nan, Time: 00:30:44 / 01:17:23
27000 of 30000, Atoms: 20618(A), 1447(P), ChiSq: -nan, Time: 00:31:58 / 01:17:12
28000 of 30000, Atoms: 20643(A), 1434(P), ChiSq: -nan, Time: 00:33:12 / 01:17:00
29000 of 30000, Atoms: 20711(A), 1422(P), ChiSq: -nan, Time: 00:34:26 / 01:16:49
30000 of 30000, Atoms: 20636(A), 1461(P), ChiSq: -nan, Time: 00:35:40 / 01:16:38
-- Sampling Phase --
1000 of 30000, Atoms: 20671(A), 1460(P), ChiSq: -nan, Time: 00:36:54 / 01:16:28
2000 of 30000, Atoms: 20618(A), 1465(P), ChiSq: -nan, Time: 00:38:09 / 01:16:19
3000 of 30000, Atoms: 20494(A), 1442(P), ChiSq: -nan, Time: 00:39:24 / 01:16:11
4000 of 30000, Atoms: 20716(A), 1466(P), ChiSq: -nan, Time: 00:40:38 / 01:16:01
5000 of 30000, Atoms: 20628(A), 1434(P), ChiSq: -nan, Time: 00:41:55 / 01:15:57
6000 of 30000, Atoms: 20625(A), 1449(P), ChiSq: -nan, Time: 00:43:13 / 01:15:54
7000 of 30000, Atoms: 20637(A), 1447(P), ChiSq: -nan, Time: 00:44:31 / 01:15:51
8000 of 30000, Atoms: 20557(A), 1478(P), ChiSq: -nan, Time: 00:45:50 / 01:15:49
9000 of 30000, Atoms: 20707(A), 1485(P), ChiSq: -nan, Time: 00:47:08 / 01:15:46
10000 of 30000, Atoms: 20689(A), 1438(P), ChiSq: -nan, Time: 00:48:25 / 01:15:41
11000 of 30000, Atoms: 20825(A), 1465(P), ChiSq: -nan, Time: 00:49:40 / 01:15:33
12000 of 30000, Atoms: 20607(A), 1460(P), ChiSq: -nan, Time: 00:50:55 / 01:15:25
13000 of 30000, Atoms: 20588(A), 1446(P), ChiSq: -nan, Time: 00:52:10 / 01:15:18
14000 of 30000, Atoms: 20595(A), 1443(P), ChiSq: -nan, Time: 00:53:24 / 01:15:09
15000 of 30000, Atoms: 20624(A), 1428(P), ChiSq: -nan, Time: 00:54:39 / 01:15:01
16000 of 30000, Atoms: 20586(A), 1435(P), ChiSq: -nan, Time: 00:55:53 / 01:14:52
17000 of 30000, Atoms: 20684(A), 1440(P), ChiSq: -nan, Time: 00:57:08 / 01:14:45
18000 of 30000, Atoms: 20730(A), 1456(P), ChiSq: -nan, Time: 00:58:23 / 01:14:38
19000 of 30000, Atoms: 20796(A), 1470(P), ChiSq: -nan, Time: 00:59:39 / 01:14:33
20000 of 30000, Atoms: 20701(A), 1493(P), ChiSq: -nan, Time: 01:00:53 / 01:14:25
21000 of 30000, Atoms: 20613(A), 1461(P), ChiSq: -nan, Time: 01:02:08 / 01:14:18
22000 of 30000, Atoms: 20701(A), 1486(P), ChiSq: -nan, Time: 01:03:24 / 01:14:13
23000 of 30000, Atoms: 20688(A), 1463(P), ChiSq: -nan, Time: 01:04:39 / 01:14:06
24000 of 30000, Atoms: 20581(A), 1466(P), ChiSq: -nan, Time: 01:05:53 / 01:13:59
25000 of 30000, Atoms: 20649(A), 1463(P), ChiSq: -nan, Time: 01:07:08 / 01:13:52
26000 of 30000, Atoms: 20539(A), 1469(P), ChiSq: -nan, Time: 01:08:23 / 01:13:46
27000 of 30000, Atoms: 20712(A), 1462(P), ChiSq: -nan, Time: 01:09:37 / 01:13:39
28000 of 30000, Atoms: 20668(A), 1479(P), ChiSq: -nan, Time: 01:10:52 / 01:13:33
29000 of 30000, Atoms: 20645(A), 1469(P), ChiSq: -nan, Time: 01:12:07 / 01:13:27
worker 2 is finished! Time: 01:12:22
30000 of 30000, Atoms: 20670(A), 1484(P), ChiSq: -nan, Time: 01:13:21 / 01:13:21
worker 1 is finished! Time: 01:13:21
worker 3 is finished! Time: 01:13:24
worker 5 is finished! Time: 01:15:26
worker 4 is finished! Time: 01:15:26
worker 6 is finished! Time: 01:19:08
Matching Patterns Across Subsets...
Running Final Stage...
Data Model: Sparse, Normal
Sampler Type: Sequential
Loading Data...Done! (00:00:00)
worker 1 is starting!
worker 2 is starting!
worker 6 is starting!
worker 4 is starting!
worker 3 is starting!
worker 5 is starting!
-- Equilibration Phase --
1000 of 30000, Atoms: 10022(A), 0(P), ChiSq: -nan, Time: 00:00:27 / 00:45:43
2000 of 30000, Atoms: 11479(A), 0(P), ChiSq: -nan, Time: 00:01:03 / 00:47:44
3000 of 30000, Atoms: 12276(A), 0(P), ChiSq: -nan, Time: 00:01:41 / 00:48:04
4000 of 30000, Atoms: 12769(A), 0(P), ChiSq: -nan, Time: 00:02:20 / 00:48:00
5000 of 30000, Atoms: 13197(A), 0(P), ChiSq: -nan, Time: 00:03:00 / 00:47:54
6000 of 30000, Atoms: 13532(A), 0(P), ChiSq: -nan, Time: 00:03:41 / 00:47:51
7000 of 30000, Atoms: 13666(A), 0(P), ChiSq: -nan, Time: 00:04:22 / 00:47:40
8000 of 30000, Atoms: 13951(A), 0(P), ChiSq: -nan, Time: 00:05:04 / 00:47:35
9000 of 30000, Atoms: 14232(A), 0(P), ChiSq: -nan, Time: 00:05:47 / 00:47:34
10000 of 30000, Atoms: 14359(A), 0(P), ChiSq: -nan, Time: 00:06:29 / 00:47:23
11000 of 30000, Atoms: 14621(A), 0(P), ChiSq: -nan, Time: 00:07:12 / 00:47:17
12000 of 30000, Atoms: 14662(A), 0(P), ChiSq: -nan, Time: 00:07:56 / 00:47:16
13000 of 30000, Atoms: 14816(A), 0(P), ChiSq: -nan, Time: 00:08:39 / 00:47:07
14000 of 30000, Atoms: 14864(A), 0(P), ChiSq: -nan, Time: 00:09:25 / 00:47:13
15000 of 30000, Atoms: 15042(A), 0(P), ChiSq: -nan, Time: 00:10:18 / 00:47:49
16000 of 30000, Atoms: 15118(A), 0(P), ChiSq: -nan, Time: 00:11:12 / 00:48:23
17000 of 30000, Atoms: 15167(A), 0(P), ChiSq: -nan, Time: 00:12:05 / 00:48:48
18000 of 30000, Atoms: 15174(A), 0(P), ChiSq: -nan, Time: 00:12:59 / 00:49:12
19000 of 30000, Atoms: 15163(A), 0(P), ChiSq: -nan, Time: 00:13:52 / 00:49:28
20000 of 30000, Atoms: 15057(A), 0(P), ChiSq: -nan, Time: 00:14:45 / 00:49:42
21000 of 30000, Atoms: 15151(A), 0(P), ChiSq: -nan, Time: 00:15:37 / 00:49:51
22000 of 30000, Atoms: 15116(A), 0(P), ChiSq: -nan, Time: 00:16:29 / 00:49:58
23000 of 30000, Atoms: 14997(A), 0(P), ChiSq: -nan, Time: 00:17:20 / 00:50:00
24000 of 30000, Atoms: 15199(A), 0(P), ChiSq: -nan, Time: 00:18:11 / 00:50:02
25000 of 30000, Atoms: 15141(A), 0(P), ChiSq: -nan, Time: 00:19:02 / 00:50:03
26000 of 30000, Atoms: 15071(A), 0(P), ChiSq: -nan, Time: 00:19:46 / 00:49:46
27000 of 30000, Atoms: 15179(A), 0(P), ChiSq: -nan, Time: 00:20:31 / 00:49:32
28000 of 30000, Atoms: 15099(A), 0(P), ChiSq: -nan, Time: 00:21:15 / 00:49:17
29000 of 30000, Atoms: 15177(A), 0(P), ChiSq: -nan, Time: 00:22:00 / 00:49:05
30000 of 30000, Atoms: 15126(A), 0(P), ChiSq: -nan, Time: 00:22:44 / 00:48:51
-- Sampling Phase --
1000 of 30000, Atoms: 15203(A), 0(P), ChiSq: -nan, Time: 00:23:29 / 00:48:39
2000 of 30000, Atoms: 15156(A), 0(P), ChiSq: -nan, Time: 00:24:14 / 00:48:29
3000 of 30000, Atoms: 15221(A), 0(P), ChiSq: -nan, Time: 00:24:58 / 00:48:16
4000 of 30000, Atoms: 15172(A), 0(P), ChiSq: -nan, Time: 00:25:43 / 00:48:06
5000 of 30000, Atoms: 15299(A), 0(P), ChiSq: -nan, Time: 00:26:28 / 00:47:57
6000 of 30000, Atoms: 15111(A), 0(P), ChiSq: -nan, Time: 00:27:13 / 00:47:48
7000 of 30000, Atoms: 15172(A), 0(P), ChiSq: -nan, Time: 00:27:58 / 00:47:39
8000 of 30000, Atoms: 15091(A), 0(P), ChiSq: -nan, Time: 00:28:42 / 00:47:29
9000 of 30000, Atoms: 15083(A), 0(P), ChiSq: -nan, Time: 00:29:27 / 00:47:20
10000 of 30000, Atoms: 15126(A), 0(P), ChiSq: -nan, Time: 00:30:12 / 00:47:12
11000 of 30000, Atoms: 15115(A), 0(P), ChiSq: -nan, Time: 00:30:56 / 00:47:03
12000 of 30000, Atoms: 15152(A), 0(P), ChiSq: -nan, Time: 00:31:46 / 00:47:03
13000 of 30000, Atoms: 15181(A), 0(P), ChiSq: -nan, Time: 00:32:40 / 00:47:09
14000 of 30000, Atoms: 15125(A), 0(P), ChiSq: -nan, Time: 00:33:34 / 00:47:14
15000 of 30000, Atoms: 15193(A), 0(P), ChiSq: -nan, Time: 00:34:28 / 00:47:19
16000 of 30000, Atoms: 15146(A), 0(P), ChiSq: -nan, Time: 00:35:21 / 00:47:22
17000 of 30000, Atoms: 15143(A), 0(P), ChiSq: -nan, Time: 00:36:15 / 00:47:26
18000 of 30000, Atoms: 15155(A), 0(P), ChiSq: -nan, Time: 00:37:07 / 00:47:27
19000 of 30000, Atoms: 15201(A), 0(P), ChiSq: -nan, Time: 00:38:00 / 00:47:29
20000 of 30000, Atoms: 15142(A), 0(P), ChiSq: -nan, Time: 00:38:52 / 00:47:30
21000 of 30000, Atoms: 15243(A), 0(P), ChiSq: -nan, Time: 00:39:43 / 00:47:29
22000 of 30000, Atoms: 15220(A), 0(P), ChiSq: -nan, Time: 00:40:35 / 00:47:30
23000 of 30000, Atoms: 15173(A), 0(P), ChiSq: -nan, Time: 00:41:26 / 00:47:29
24000 of 30000, Atoms: 15192(A), 0(P), ChiSq: -nan, Time: 00:42:16 / 00:47:27
25000 of 30000, Atoms: 15186(A), 0(P), ChiSq: -nan, Time: 00:43:06 / 00:47:25
26000 of 30000, Atoms: 15160(A), 0(P), ChiSq: -nan, Time: 00:43:55 / 00:47:22
27000 of 30000, Atoms: 15284(A), 0(P), ChiSq: -nan, Time: 00:44:45 / 00:47:20
worker 3 is finished! Time: 00:45:34
28000 of 30000, Atoms: 15238(A), 0(P), ChiSq: -nan, Time: 00:45:35 / 00:47:18
worker 4 is finished! Time: 00:46:23
29000 of 30000, Atoms: 15219(A), 0(P), ChiSq: -nan, Time: 00:46:24 / 00:47:16
worker 6 is finished! Time: 00:47:10
30000 of 30000, Atoms: 15174(A), 0(P), ChiSq: -nan, Time: 00:47:13 / 00:47:13
worker 1 is finished! Time: 00:47:13
worker 2 is finished! Time: 00:47:28
worker 5 is finished! Time: 00:47:34
Warning message:
In checkInputs(data, uncertainty, allParams) :
running distributed cogaps without mtx/tsv/csv/gct data
SPARSE OPTIMIZATION DISABLED
params <- CogapsParams(nPatterns=5, nIterations=30000, seed=42,
distributed="genome-wide")
params <- setDistributedParams(params, nSets=6)
Hoxd10_matnp5 <- CoGAPS(Hoxd10_mat, params)
This is CoGAPS version 3.19.1
Running genome-wide CoGAPS on Hoxd10_mat (30407 genes and 380 samples) with parameters:
-- Standard Parameters --
nPatterns 5
nIterations 30000
seed 42
sparseOptimization FALSE
distributed genome-wide
-- Sparsity Parameters --
alpha 0.01
maxGibbsMass 100
-- Distributed CoGAPS Parameters --
nSets 6
cut 5
minNS 3
maxNS 9
Creating subsets...
set sizes (min, mean, max): (5067, 5067.833, 5072)
Running Across Subsets...
worker 2 is starting!
worker 3 is starting!
Data Model: Dense, Normal
Sampler Type: Sequential
Loading Data...Done! (00:00:00)
worker 1 is starting!
worker 4 is starting!
worker 5 is starting!
worker 6 is starting!
-- Equilibration Phase --
1000 of 30000, Atoms: 4665(A), 966(P), ChiSq: 5137063, Time: 00:01:16 / 02:08:43
2000 of 30000, Atoms: 5336(A), 1572(P), ChiSq: 4995452, Time: 00:02:48 / 02:07:18
3000 of 30000, Atoms: 6053(A), 1918(P), ChiSq: 4954336, Time: 00:04:33 / 02:09:55
4000 of 30000, Atoms: 6773(A), 2085(P), ChiSq: 4934452, Time: 00:06:21 / 02:10:37
5000 of 30000, Atoms: 7298(A), 2185(P), ChiSq: 4922336, Time: 00:07:57 / 02:06:56
6000 of 30000, Atoms: 7738(A), 2254(P), ChiSq: 4913117, Time: 00:09:34 / 02:04:17
7000 of 30000, Atoms: 7975(A), 2267(P), ChiSq: 4907556, Time: 00:11:13 / 02:02:27
8000 of 30000, Atoms: 8336(A), 2298(P), ChiSq: 4902574, Time: 00:12:52 / 02:00:51
9000 of 30000, Atoms: 8594(A), 2319(P), ChiSq: 4899279, Time: 00:14:31 / 01:59:26
10000 of 30000, Atoms: 8896(A), 2391(P), ChiSq: 4896176, Time: 00:16:12 / 01:58:25
11000 of 30000, Atoms: 9176(A), 2420(P), ChiSq: 4893982, Time: 00:17:54 / 01:57:35
12000 of 30000, Atoms: 9445(A), 2430(P), ChiSq: 4891372, Time: 00:19:36 / 01:56:47
13000 of 30000, Atoms: 9672(A), 2456(P), ChiSq: 4890006, Time: 00:21:18 / 01:56:03
14000 of 30000, Atoms: 9830(A), 2482(P), ChiSq: 4888700, Time: 00:23:02 / 01:55:31
15000 of 30000, Atoms: 9974(A), 2487(P), ChiSq: 4886906, Time: 00:24:45 / 01:54:56
16000 of 30000, Atoms: 10077(A), 2447(P), ChiSq: 4887154, Time: 00:26:29 / 01:54:26
17000 of 30000, Atoms: 10078(A), 2473(P), ChiSq: 4886879, Time: 00:28:12 / 01:53:53
18000 of 30000, Atoms: 10051(A), 2493(P), ChiSq: 4886432, Time: 00:29:55 / 01:53:22
19000 of 30000, Atoms: 10066(A), 2448(P), ChiSq: 4886908, Time: 00:31:39 / 01:52:56
20000 of 30000, Atoms: 10114(A), 2509(P), ChiSq: 4886625, Time: 00:33:23 / 01:52:30
21000 of 30000, Atoms: 10015(A), 2499(P), ChiSq: 4887112, Time: 00:35:04 / 01:51:56
22000 of 30000, Atoms: 10140(A), 2471(P), ChiSq: 4886580, Time: 00:36:42 / 01:51:15
23000 of 30000, Atoms: 10087(A), 2486(P), ChiSq: 4886636, Time: 00:38:21 / 01:50:39
24000 of 30000, Atoms: 10067(A), 2510(P), ChiSq: 4887080, Time: 00:40:00 / 01:50:05
25000 of 30000, Atoms: 10029(A), 2531(P), ChiSq: 4886377, Time: 00:41:38 / 01:49:30
26000 of 30000, Atoms: 10049(A), 2488(P), ChiSq: 4887044, Time: 00:43:17 / 01:49:00
27000 of 30000, Atoms: 9991(A), 2494(P), ChiSq: 4886824, Time: 00:44:56 / 01:48:31
28000 of 30000, Atoms: 10019(A), 2502(P), ChiSq: 4887262, Time: 00:46:34 / 01:48:01
29000 of 30000, Atoms: 10085(A), 2506(P), ChiSq: 4886958, Time: 00:48:13 / 01:47:34
30000 of 30000, Atoms: 9933(A), 2460(P), ChiSq: 4886798, Time: 00:49:52 / 01:47:09
-- Sampling Phase --
1000 of 30000, Atoms: 10033(A), 2514(P), ChiSq: 4886740, Time: 00:51:31 / 01:46:45
2000 of 30000, Atoms: 9989(A), 2494(P), ChiSq: 4886868, Time: 00:53:10 / 01:46:22
3000 of 30000, Atoms: 10105(A), 2526(P), ChiSq: 4886859, Time: 00:54:49 / 01:46:00
4000 of 30000, Atoms: 10055(A), 2479(P), ChiSq: 4886471, Time: 00:56:28 / 01:45:38
5000 of 30000, Atoms: 10075(A), 2534(P), ChiSq: 4886635, Time: 00:58:07 / 01:45:18
6000 of 30000, Atoms: 10086(A), 2499(P), ChiSq: 4887080, Time: 00:59:46 / 01:44:58
7000 of 30000, Atoms: 10015(A), 2535(P), ChiSq: 4886512, Time: 01:01:25 / 01:44:39
8000 of 30000, Atoms: 10083(A), 2539(P), ChiSq: 4886850, Time: 01:02:47 / 01:43:52
9000 of 30000, Atoms: 10084(A), 2491(P), ChiSq: 4887106, Time: 01:04:08 / 01:43:06
10000 of 30000, Atoms: 9993(A), 2546(P), ChiSq: 4887135, Time: 01:05:29 / 01:42:22
11000 of 30000, Atoms: 10005(A), 2534(P), ChiSq: 4887056, Time: 01:06:50 / 01:41:40
12000 of 30000, Atoms: 10041(A), 2547(P), ChiSq: 4887020, Time: 01:08:11 / 01:41:00
13000 of 30000, Atoms: 10045(A), 2481(P), ChiSq: 4887188, Time: 01:09:32 / 01:40:22
14000 of 30000, Atoms: 10055(A), 2539(P), ChiSq: 4886859, Time: 01:10:53 / 01:39:45
15000 of 30000, Atoms: 10036(A), 2567(P), ChiSq: 4887087, Time: 01:12:14 / 01:39:09
16000 of 30000, Atoms: 9985(A), 2498(P), ChiSq: 4886410, Time: 01:13:36 / 01:38:37
17000 of 30000, Atoms: 10057(A), 2519(P), ChiSq: 4886582, Time: 01:15:04 / 01:38:13
18000 of 30000, Atoms: 10071(A), 2527(P), ChiSq: 4887043, Time: 01:16:32 / 01:37:51
19000 of 30000, Atoms: 10106(A), 2556(P), ChiSq: 4886944, Time: 01:18:00 / 01:37:29
20000 of 30000, Atoms: 10112(A), 2527(P), ChiSq: 4887032, Time: 01:19:28 / 01:37:07
21000 of 30000, Atoms: 10021(A), 2532(P), ChiSq: 4887194, Time: 01:20:56 / 01:36:47
22000 of 30000, Atoms: 10107(A), 2564(P), ChiSq: 4886800, Time: 01:22:24 / 01:36:27
23000 of 30000, Atoms: 10078(A), 2541(P), ChiSq: 4886892, Time: 01:23:52 / 01:36:08
24000 of 30000, Atoms: 10044(A), 2533(P), ChiSq: 4887120, Time: 01:25:19 / 01:35:48
25000 of 30000, Atoms: 10116(A), 2498(P), ChiSq: 4886992, Time: 01:26:47 / 01:35:30
26000 of 30000, Atoms: 10056(A), 2490(P), ChiSq: 4886794, Time: 01:28:15 / 01:35:12
27000 of 30000, Atoms: 10134(A), 2494(P), ChiSq: 4886833, Time: 01:29:42 / 01:34:54
28000 of 30000, Atoms: 9968(A), 2510(P), ChiSq: 4887242, Time: 01:31:10 / 01:34:37
29000 of 30000, Atoms: 10069(A), 2502(P), ChiSq: 4886577, Time: 01:32:37 / 01:34:20
30000 of 30000, Atoms: 9953(A), 2489(P), ChiSq: 4886819, Time: 01:34:05 / 01:34:05
worker 1 is finished! Time: 01:34:05
worker 5 is finished! Time: 01:44:52
worker 4 is finished! Time: 01:54:06
worker 2 is finished! Time: 01:54:29
worker 6 is finished! Time: 01:54:31
worker 3 is finished! Time: 01:54:38
Matching Patterns Across Subsets...
Running Final Stage...
worker 5 is starting!
worker 4 is starting!
worker 3 is starting!
worker 2 is starting!
worker 6 is starting!
Data Model: Dense, Normal
Sampler Type: Sequential
Loading Data...Done! (00:00:00)
worker 1 is starting!
-- Equilibration Phase --
1000 of 30000, Atoms: 5928(A), 0(P), ChiSq: 14908930, Time: 00:00:10 / 00:16:56
2000 of 30000, Atoms: 7023(A), 0(P), ChiSq: 14908930, Time: 00:00:25 / 00:18:56
3000 of 30000, Atoms: 7726(A), 0(P), ChiSq: 14908930, Time: 00:00:41 / 00:19:30
4000 of 30000, Atoms: 8082(A), 0(P), ChiSq: 14908930, Time: 00:00:58 / 00:19:53
5000 of 30000, Atoms: 8496(A), 0(P), ChiSq: 14908930, Time: 00:01:16 / 00:20:13
6000 of 30000, Atoms: 8718(A), 0(P), ChiSq: 14908930, Time: 00:01:34 / 00:20:21
7000 of 30000, Atoms: 8994(A), 0(P), ChiSq: 14908930, Time: 00:01:53 / 00:20:33
8000 of 30000, Atoms: 9211(A), 0(P), ChiSq: 14908930, Time: 00:02:13 / 00:20:49
9000 of 30000, Atoms: 9400(A), 0(P), ChiSq: 14908930, Time: 00:02:33 / 00:20:58
10000 of 30000, Atoms: 9600(A), 0(P), ChiSq: 14908930, Time: 00:02:53 / 00:21:04
11000 of 30000, Atoms: 9735(A), 0(P), ChiSq: 14908930, Time: 00:03:14 / 00:21:14
12000 of 30000, Atoms: 9853(A), 0(P), ChiSq: 14908930, Time: 00:03:35 / 00:21:21
13000 of 30000, Atoms: 10025(A), 0(P), ChiSq: 14908930, Time: 00:03:57 / 00:21:31
14000 of 30000, Atoms: 10229(A), 0(P), ChiSq: 14908930, Time: 00:04:18 / 00:21:34
15000 of 30000, Atoms: 10315(A), 0(P), ChiSq: 14908930, Time: 00:04:40 / 00:21:40
16000 of 30000, Atoms: 10331(A), 0(P), ChiSq: 14908930, Time: 00:05:03 / 00:21:49
17000 of 30000, Atoms: 10359(A), 0(P), ChiSq: 14908930, Time: 00:05:25 / 00:21:52
18000 of 30000, Atoms: 10353(A), 0(P), ChiSq: 14908930, Time: 00:05:47 / 00:21:54
19000 of 30000, Atoms: 10302(A), 0(P), ChiSq: 14908930, Time: 00:06:09 / 00:21:56
20000 of 30000, Atoms: 10407(A), 0(P), ChiSq: 14908930, Time: 00:06:31 / 00:21:57
21000 of 30000, Atoms: 10354(A), 0(P), ChiSq: 14908930, Time: 00:06:53 / 00:21:58
22000 of 30000, Atoms: 10263(A), 0(P), ChiSq: 14908930, Time: 00:07:08 / 00:21:37
23000 of 30000, Atoms: 10294(A), 0(P), ChiSq: 14908930, Time: 00:07:22 / 00:21:15
24000 of 30000, Atoms: 10435(A), 0(P), ChiSq: 14908930, Time: 00:07:34 / 00:20:49
25000 of 30000, Atoms: 10340(A), 0(P), ChiSq: 14908930, Time: 00:07:46 / 00:20:25
26000 of 30000, Atoms: 10369(A), 0(P), ChiSq: 14908930, Time: 00:07:58 / 00:20:03
27000 of 30000, Atoms: 10358(A), 0(P), ChiSq: 14908930, Time: 00:08:10 / 00:19:43
28000 of 30000, Atoms: 10344(A), 0(P), ChiSq: 14908930, Time: 00:08:23 / 00:19:26
29000 of 30000, Atoms: 10374(A), 0(P), ChiSq: 14908930, Time: 00:08:35 / 00:19:09
30000 of 30000, Atoms: 10469(A), 0(P), ChiSq: 14908930, Time: 00:08:47 / 00:18:52
-- Sampling Phase --
1000 of 30000, Atoms: 10403(A), 0(P), ChiSq: 14908930, Time: 00:09:00 / 00:18:39
2000 of 30000, Atoms: 10386(A), 0(P), ChiSq: 14908930, Time: 00:09:13 / 00:18:26
3000 of 30000, Atoms: 10370(A), 0(P), ChiSq: 14908930, Time: 00:09:26 / 00:18:14
4000 of 30000, Atoms: 10378(A), 0(P), ChiSq: 14908930, Time: 00:09:39 / 00:18:03
5000 of 30000, Atoms: 10296(A), 0(P), ChiSq: 14908930, Time: 00:09:52 / 00:17:52
6000 of 30000, Atoms: 10343(A), 0(P), ChiSq: 14908930, Time: 00:10:05 / 00:17:42
7000 of 30000, Atoms: 10357(A), 0(P), ChiSq: 14908930, Time: 00:10:19 / 00:17:34
8000 of 30000, Atoms: 10301(A), 0(P), ChiSq: 14908930, Time: 00:10:31 / 00:17:24
9000 of 30000, Atoms: 10242(A), 0(P), ChiSq: 14908930, Time: 00:10:44 / 00:17:15
10000 of 30000, Atoms: 10355(A), 0(P), ChiSq: 14908930, Time: 00:10:57 / 00:17:07
11000 of 30000, Atoms: 10280(A), 0(P), ChiSq: 14908930, Time: 00:11:10 / 00:16:59
12000 of 30000, Atoms: 10422(A), 0(P), ChiSq: 14908930, Time: 00:11:23 / 00:16:51
13000 of 30000, Atoms: 10369(A), 0(P), ChiSq: 14908930, Time: 00:11:36 / 00:16:44
14000 of 30000, Atoms: 10388(A), 0(P), ChiSq: 14908930, Time: 00:11:49 / 00:16:37
15000 of 30000, Atoms: 10250(A), 0(P), ChiSq: 14908930, Time: 00:12:02 / 00:16:31
16000 of 30000, Atoms: 10434(A), 0(P), ChiSq: 14908930, Time: 00:12:15 / 00:16:24
17000 of 30000, Atoms: 10371(A), 0(P), ChiSq: 14908930, Time: 00:12:28 / 00:16:18
18000 of 30000, Atoms: 10377(A), 0(P), ChiSq: 14908930, Time: 00:12:41 / 00:16:12
19000 of 30000, Atoms: 10382(A), 0(P), ChiSq: 14908930, Time: 00:12:54 / 00:16:07
20000 of 30000, Atoms: 10333(A), 0(P), ChiSq: 14908930, Time: 00:13:07 / 00:16:01
21000 of 30000, Atoms: 10395(A), 0(P), ChiSq: 14908930, Time: 00:13:20 / 00:15:56
22000 of 30000, Atoms: 10385(A), 0(P), ChiSq: 14908930, Time: 00:13:33 / 00:15:51
23000 of 30000, Atoms: 10361(A), 0(P), ChiSq: 14908930, Time: 00:13:46 / 00:15:46
24000 of 30000, Atoms: 10239(A), 0(P), ChiSq: 14908930, Time: 00:13:59 / 00:15:42
25000 of 30000, Atoms: 10390(A), 0(P), ChiSq: 14908930, Time: 00:14:12 / 00:15:37
26000 of 30000, Atoms: 10356(A), 0(P), ChiSq: 14908930, Time: 00:14:25 / 00:15:33
27000 of 30000, Atoms: 10397(A), 0(P), ChiSq: 14908930, Time: 00:14:38 / 00:15:28
28000 of 30000, Atoms: 10272(A), 0(P), ChiSq: 14908930, Time: 00:14:51 / 00:15:24
29000 of 30000, Atoms: 10388(A), 0(P), ChiSq: 14908930, Time: 00:15:04 / 00:15:20
30000 of 30000, Atoms: 10379(A), 0(P), ChiSq: 14908930, Time: 00:15:17 / 00:15:17
worker 1 is finished! Time: 00:15:17
worker 5 is finished! Time: 00:16:29
worker 3 is finished! Time: 00:19:47
worker 2 is finished! Time: 00:20:37
worker 4 is finished! Time: 00:20:38
worker 6 is finished! Time: 00:20:45
Warning message:
In checkInputs(data, uncertainty, allParams) :
running distributed cogaps without mtx/tsv/csv/gct data
After obtaining the patterns, I ran patternMarkers on patterns learned with sparseOptimization = TRUE. When I set threshold = “all”, I would get this error.
test <- patternMarkers_all(Hoxd10_matnp5, threshold = "all")
Error in colnames(markerScores)[apply(markerScores, 1, which.min)] :
invalid subscript type 'list'
This error would not trigger when threshold was set to “cut”.
PatternMarkers worked normally when run on patterns learned without sparseOptimization.
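One possible reading of this error, sketched below in base R only: if a row of the score matrix is entirely NaN (as could happen downstream of a -nan ChiSq run), which.min() returns integer(0) for that row, apply() then returns a list instead of a vector, and indexing colnames() with a list fails with exactly this message. The matrix values and the guard at the end are hypothetical illustrations, not the package's actual code or fix.

```r
# Hypothetical marker-score matrix: rows are genes, columns are patterns.
markerScores <- matrix(
  c(0.2, 0.9,
    NaN, NaN),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("geneA", "geneB"), c("Pattern_1", "Pattern_2"))
)

# which.min() skips NaN, so an all-NaN row yields integer(0).
# Results of unequal length make apply() return a list, not a vector.
idx <- apply(markerScores, 1, which.min)
is_list <- is.list(idx)

# Indexing with that list raises the "invalid subscript type 'list'" error.
err <- tryCatch(
  colnames(markerScores)[idx],
  error = function(e) conditionMessage(e)
)

# One possible guard (an assumption, not the upstream fix):
# map empty results to NA before indexing.
safe_idx <- vapply(
  idx,
  function(i) if (length(i) == 0) NA_integer_ else i[[1]],
  integer(1)
)
assigned <- colnames(markerScores)[safe_idx]
```

If this is what happens, checking anyNA() on the A and P matrices of the result object before calling patternMarkers would show it quickly.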
|
I did not filter genes with zero expression |
Can you let us know what happens and the difference if you do that?
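For reference, a minimal base R sketch of the kind of filtering being asked about, assuming a genes-by-samples matrix like Hoxd10_mat. The toy matrix and the names expr, keep and expr_filtered are made up for illustration.

```r
# Toy genes-by-samples matrix standing in for the real data (hypothetical values).
expr <- matrix(
  c(0, 0, 0,
    1, 0, 3,
    0, 2, 0),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("g1", "g2", "g3"), c("s1", "s2", "s3"))
)

# Keep only genes with at least one non-zero value across samples.
keep <- rowSums(expr != 0) > 0
expr_filtered <- expr[keep, , drop = FALSE]
```

The filtered matrix can then be passed to CoGAPS() with the same params object, so the two runs differ only in the input rows.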
From: rpalaganas ***@***.***>
Sent: Wednesday, January 10, 2024 1:02:53 PM
To: FertigLab/CoGAPS ***@***.***>
Cc: Elana Fertig ***@***.***>; Comment ***@***.***>
Subject: Re: [FertigLab/CoGAPS] SparseOptimization pattern discrepancy (Issue #77)
I did not filter genes with zero expression
|
Good morning,
I reran CoGAPS with and without sparse optimization. The sparse run gave a similar result, with a -nan ChiSq, while the run without sparse optimization looked normal.
This time each run generated the same number of patterns; however, the values differed.
PatternMarkers with threshold = 'all' also did not work on the CoGAPS object generated with sparseOptimization = TRUE. PatternMarkers worked on the object generated without sparse optimization. |
Thanks! We will look into this and get back to you.
UPDATE @dimalvovs deleted quoted rows for readability
|
ChiSq is still not NaN if we run on data with the exact same dimensions and parameters
|
Making data 50% sparse still runs fine. @rpalaganas what's the sparsity of your data?
|
The sparsity of the matrix that gave the -nans is 0.71. I also do not get -nan ChiSq when testing a matrix that is almost exactly as sparse.
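In case it helps compare numbers across datasets, here is a small sketch of how such a sparsity figure can be computed, assuming "sparsity" here means the fraction of entries that are exactly zero (the thread does not define it). The toy matrix and the helper name sparsity are hypothetical.

```r
# Fraction of entries that are exactly zero (assumed definition of sparsity).
sparsity <- function(mat) mean(mat == 0)

# Toy matrix with 7 zeros out of 10 entries (hypothetical values).
toy <- matrix(c(0, 0, 0, 0, 0, 0, 0, 1, 2, 3), nrow = 2)

overall  <- sparsity(toy)        # whole-matrix zero fraction
per_gene <- rowMeans(toy == 0)   # zero fraction per row (gene)
```

The per-row fractions may be more informative than the overall figure, since two matrices with the same overall sparsity can differ in how many rows are entirely zero.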
|
I thought that something is wrong in some genes' distributions, and used this fun to remove genes that would yield
afterwards, the chisq is not nan anymore, but the value itself is huge:
compared to results on the same data with sparseOptimization = FALSE:
sparse sampler cannot find a proper solution? btw is that normal for |
hey all-- a few notes regarding this issue |
Is this data more than 80% sparse? |
Slightly less, ~71% sparse |
So, we have technically addressed all the points raised in the issue report:
The unsolved problem motivated by this issue is why the ChiSq is so large for the given dataset compared to a simulated dataset with similar dimensions and sparsity parameters, as demonstrated here. |
Interestingly, the bootstrapped version of the original data also fails
|
Also sparseOptimization=TRUE fails for non-distributed mode:
|
Interestingly, sampling from a histogram does not fail:
|
The problem is that some datasets, including the dataset where the error was generated, yield a 0 in the denominator of the chi^2 calculation in the following line:
adding 1 to
|
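To illustrate the mechanism in isolation, here is a base R sketch under the assumption that the statistic has the usual form, the sum over entries of ((D - AP) / S)^2. It is not the CoGAPS source line referred to above, and the names chisq, D, AP, S_ok and S_bad are made up for the example.

```r
# Assumed form of the statistic: sum over entries of ((D - AP) / S)^2.
chisq <- function(D, AP, S) sum(((D - AP) / S)^2)

D  <- matrix(c(0, 2, 0, 4), nrow = 2)   # toy data
AP <- matrix(c(0, 1, 0, 3), nrow = 2)   # toy reconstruction
S_ok  <- matrix(1, nrow = 2, ncol = 2)  # strictly positive denominators
S_bad <- S_ok
S_bad[1, 1] <- 0   # zero denominator where the residual is also zero

chisq_ok  <- chisq(D, AP, S_ok)    # finite
chisq_bad <- chisq(D, AP, S_bad)   # 0 / 0 is NaN, and NaN propagates through sum()
```

A single 0 / 0 entry is enough to turn the whole sum into NaN, which would match ChiSq printing as -nan on every reported iteration.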
It took some time, but it looks like we're approaching a resolution. @rpalaganas would you be open to a functional test by installing from the feature branch and running on your dataset? Please do not use |
sure, see below
|
Nice, |
I don't see any issues |