clumping problems #54

Open
mortensallingolesen opened this issue Aug 10, 2023 · 3 comments

@mortensallingolesen

Dear Python_convert team
When I run sumstats.py clump, I get the following error. Do you have any idea what is going wrong?
Call:
./sumstats.py clump \
    --sumstats ASD_2023.mat.csv \
    --out result.clump \
    --force \
    --clump-p1 0.01 \
    --bfile-chr /ref_1kG_phase3_EUR/chr@ \
    --plink /cm/local/.modulefiles_cache/tools/modulefiles/plink2/1.90beta7

Beginning analysis at Thu Aug 10 11:24:06 2023 by salling, host g-12-g0025
Reading ASD_2023.mat.csv...
Traceback (most recent call last):
  File "sumstats.py", line 2290, in <module>
    args.func(args, log)
  File "sumstats.py", line 1533, in make_clump
    df_sumstats = pd.read_csv(args.sumstats, delim_whitespace=True)
  File "/services/tools/anaconda3/4.4.0/lib/python3.6/site-packages/pandas/io/parsers.py", line 688, in read_csv
    return _read(filepath_or_buffer, kwds)
  File "/services/tools/anaconda3/4.4.0/lib/python3.6/site-packages/pandas/io/parsers.py", line 460, in _read
    data = parser.read(nrows)
  File "/services/tools/anaconda3/4.4.0/lib/python3.6/site-packages/pandas/io/parsers.py", line 1198, in read
    ret = self._engine.read(nrows)
  File "/services/tools/anaconda3/4.4.0/lib/python3.6/site-packages/pandas/io/parsers.py", line 2157, in read
    data = self._reader.read(nrows)
  File "pandas/_libs/parsers.pyx", line 847, in pandas._libs.parsers.TextReader.read
  File "pandas/_libs/parsers.pyx", line 862, in pandas._libs.parsers.TextReader._read_low_memory
  File "pandas/_libs/parsers.pyx", line 918, in pandas._libs.parsers.TextReader._read_rows
  File "pandas/_libs/parsers.pyx", line 905, in pandas._libs.parsers.TextReader._tokenize_rows
  File "pandas/_libs/parsers.pyx", line 2042, in pandas._libs.parsers.raise_parser_error
pandas.errors.ParserError: Error tokenizing data. C error: Expected 14 fields in line 15562, saw 15

Analysis finished at Thu Aug 10 11:24:06 2023
Total time elapsed: 0.05s

@espenhgn

Hi @mortensallingolesen;
There could be a file issue preventing pandas from reading the .csv file, such as mixed space and tab characters on line 15562.

In Python, does the following result in the same error?

import pandas as pd
pd.read_csv('ASD_2023.mat.csv', delim_whitespace=True)

You can control this behavior by passing delimiter='\t' for tabs or delimiter=' ' for single spaces instead of delim_whitespace=True, and changing the read_csv call in the sumstats.py file accordingly; cf. https://pandas.pydata.org/docs/reference/api/pandas.read_csv.html

Or better yet, make sure that the .csv file is correctly formatted.
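
To find out which lines are malformed, a small check like the one below may help. It is only a sketch: the function name find_irregular_lines is made up for illustration, and it splits on runs of whitespace with plain str.split(), which approximates but does not exactly reproduce what pandas does (for example, it ignores quoting). It reports lines whose field count differs from the header's, counting the header as line 1.

```python
def find_irregular_lines(path, max_report=10):
    """Return (expected_fields, irregular), where irregular is a list of
    (line_number, n_fields, text) for lines whose whitespace-separated
    field count differs from the header's. Hypothetical helper, not part
    of sumstats.py."""
    irregular = []
    with open(path, "r", encoding="utf-8", errors="replace") as handle:
        # The header defines how many fields every later line should have.
        expected = len(handle.readline().split())
        for line_number, line in enumerate(handle, start=2):
            if not line.strip():
                continue  # ignore blank lines
            n_fields = len(line.split())
            if n_fields != expected:
                irregular.append((line_number, n_fields, line.rstrip("\n")))
                if len(irregular) >= max_report:
                    break  # keep the report short
    return expected, irregular


# Example use (file name taken from the report above):
# expected, bad = find_irregular_lines("ASD_2023.mat.csv")
# for line_number, n_fields, text in bad:
#     print(line_number, n_fields, repr(text))
```

Printing the offending lines with repr() makes stray tabs or extra values visible, which should show whether line 15562 really carries an extra column.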

@zainab-cpu

Hello! After successfully performing the pleioFDR analysis, I proceeded with the clumping step and subsequently converted the result.mat file into a result.mat.csv file, but the clumping process did not yield any significant SNPs. I have reviewed the data and the steps I followed, but I am unable to determine why this happened. I would greatly appreciate your help in understanding the possible reasons for this outcome and any guidance on how to address it. I also ran this step on the pleioFDR example GWAS summary statistics but did not get any significant variants. Kindly help me in this regard.

@zainab-cpu

You need to run the command below with Python. Edit the --bfile-chr and --plink paths to match your system, and it should run successfully.

python sumstats.py clump \
    --clump-field FDR \
    --force \
    --plink /home/oleksandr/plink/plink \
    --sumstats ASD_2023/result.mat.csv \
    --bfile-chr /full/path/to/ref_1kG_phase3_EUR/chr@ \
    --exclude-ranges '6:25119106-33854733' '8:7200000-12500000' \
    --clump-p1 0.01 \
    --out ASD_2023/result.clump
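
If clumping returns no SNPs, it may be worth checking whether any row passes the threshold at all before suspecting PLINK or the reference files. The sketch below is not part of sumstats.py; the function name count_passing is hypothetical, and it assumes the .csv is whitespace-delimited with a header row containing a column called FDR (as implied by --clump-field FDR). Whether the threshold is applied as "less than" or "less than or equal" is also an assumption here.

```python
import math


def count_passing(path, field="FDR", threshold=0.01):
    """Return (n_passing, smallest_value) for one column of a
    whitespace-delimited file with a header row. Hypothetical helper."""
    with open(path, "r", encoding="utf-8", errors="replace") as handle:
        header = handle.readline().split()
        if field not in header:
            raise KeyError(f"column {field!r} not found; available: {header}")
        idx = header.index(field)
        n_passing, smallest = 0, math.inf
        for line in handle:
            parts = line.split()
            if len(parts) <= idx:
                continue  # short or blank line
            try:
                value = float(parts[idx])
            except ValueError:
                continue  # non-numeric entry such as "NA"
            if math.isnan(value):
                continue
            smallest = min(smallest, value)
            # Assumption: rows at or below the threshold count as passing.
            if value <= threshold:
                n_passing += 1
    return n_passing, smallest


# Example use (path and threshold taken from the command above):
# n, smallest = count_passing("ASD_2023/result.mat.csv", "FDR", 0.01)
# print(n, smallest)
```

If the count is zero, no variant can become an index SNP at --clump-p1 0.01, so an empty clumping result would be expected rather than an error.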
