forked from duty-machine/duty-machine
-
Notifications
You must be signed in to change notification settings - Fork 32
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
- Loading branch information
Showing
1 changed file
with
15 additions
and
0 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,15 @@ | ||
--- | ||
title: "一个批量做LDblock的脚本" | ||
date: 2024-05-28T10:56:05Z | ||
draft: ["false"] | ||
tags: [ | ||
"fetched", | ||
"育种数据分析之放飞自我" | ||
] | ||
categories: ["Acdemic"] | ||
--- | ||
一个批量做LDblock的脚本 by 育种数据分析之放飞自我 | ||
------ | ||
<div><p data-mpa-powered-by="yiban.io">大家好,我是邓飞。</p><p>今天,有小伙伴在星球询问GWAS分析的显著性位点,如何批量绘制LDblock,之前写过LDblock的安装和使用说明:(<a target="_blank" href="http://mp.weixin.qq.com/s?__biz=MzI0MTIzNjYwNQ==&mid=2247492617&idx=1&sn=4a2331c0e29ec4d26f4823c545549fd8&chksm=e90c0b1bde7b820d56cb9fd9ac69f487699ab21efc46c76deca5f32ec5f1d1a70c1023c6ed0e&scene=21#wechat_redirect" textvalue="LDblock绘制连锁不平衡和单体型图" linktype="text" imgurl="" imgdata="null" data-itemshowtype="0" tab="innerlink" data-linktype="2">LDblock绘制连锁不平衡和单体型图</a>)</p><p><img data-galleryid="" data-imgfileid="100011933" data-ratio="0.4758241758241758" data-s="300,640" data-src="https://mmbiz.qpic.cn/mmbiz_png/XEicwVA08daBEojUyMqUQMWODohic09hKzibVsEjfDLbGnhSDX7oz7gUsvUYbR1aspJ3osT6CYLb0KQ7lJT4ZgibibQ/640?wx_fmt=png&from=appmsg" data-type="png" data-w="910" src="https://mmbiz.qpic.cn/mmbiz_png/XEicwVA08daBEojUyMqUQMWODohic09hKzibVsEjfDLbGnhSDX7oz7gUsvUYbR1aspJ3osT6CYLb0KQ7lJT4ZgibibQ/640?wx_fmt=png&from=appmsg"></p><p>问题有3个核心点:<br></p><p>1,snp文件总是有^M符号,怎么去除</p><p>2,如何批量对显著位点进行LDblock分析</p><p>3,有没有现成代码</p><p>今天就搞定这个问题。<br></p><p><br></p><p>首先,看一下正常LDblock分析的代码:<br></p><p>LDBlockShow -InVCF hebing1.vcf -Region 3:400986543:401186543 -OutPng -SeleVar 1 <span>--OutPut re3</span></p><p><span>有vcf文件,染色体,物理位置区间,结果文件名称。<br></span></p><p><span>来一个脚本:</span></p><p><span><br></span></p><section><ul><li><li><li><li><li><li><li><li><li><li><li><li><li><li><li><li><li><li></ul><pre data-lang="bash"><code><span>$ cat ~/bin/plot_ld_plot_vcf.sh </span></code><code><span><span>#!/bin/bash</span></span></code><code><span><br></span></code><code><span><span>if</span> [ <span>$#</span> != 4 ]</span></code><code><span><span>then</span></span></code><code><span> <span>echo</span> <span>$0</span> name.vcf chronome_number location 20000; 第一个是vcf或者vcf.gz文件,第二个是染色体位置,第三个是物理位置,第四个是上下游区间</span></code><code><span> <span>exit</span> 1</span></code><code><span><span>fi</span></span></code><code><span><br></span></code><code><span>vcf=<span>$1</span>;chr=<span>$2</span>;pos=<span>$3</span>;qujian=<span>$4</span></span></code><code><span>st=$(( <span>$pos</span> - <span>$qujian</span>))</span></code><code><span>en=$(( <span>$pos</span> + qujian))</span></code><code><span><br></span></code><code><span>ot=re_<span>${chr}</span>_<span>${pos}</span></span></code><code><span><span>echo</span> <span>$chr</span> <span>$pos</span> <span>$pos</span> >temp.txt</span></code><code><span>~/bin/LDBlockShow -InVCF <span>$vcf</span> -Region <span>${chr}</span>:<span>${st}</span>:<span>${en}</span> -SeleVar 1 -BlockType 2 -NoShowLDist 10000000 -OutPut <span>${ot}</span></span></code><code><span>~/bin/ShowLDSVG -InPreFix <span>${ot}</span> -SpeSNPName temp.txt -OutPut <span>${ot}</span>_a</span></code><code><span><br></span></code></pre></section><p><br></p><p>上面的脚本,给定vcf文件,染色体,物理位置,上下游区间,就会自动分析。<br></p><p>如何批量处理呢?<br></p><p>比如snp.txt文件,如下图所示:</p><section><ul><li><li><li><li><li><li><li><li><li><li><li><li><li><li><li><li><li><li><li><li><li></ul><pre data-lang="properties"><code><span><span>1</span> <span>280719882</span></span></code><code><span><span>7</span> <span>143547413</span></span></code><code><span><span>2</span> <span>141942814</span></span></code><code><span><span>2</span> <span>141942735</span></span></code><code><span><span>1</span> <span>175504926</span></span></code><code><span><span>10</span> <span>33619582</span></span></code><code><span><span>1</span> <span>245136244</span></span></code><code><span><span>10</span> <span>134034686</span></span></code><code><span><span>1</span> <span>54049220</span></span></code><code><span><span>1</span> <span>285273845</span></span></code><code><span><span>5</span> <span>18240536</span></span></code><code><span><span>5</span> <span>83233487</span></span></code><code><span><span>8</span> <span>123272935</span></span></code><code><span><span>6</span> <span>93263605</span></span></code><code><span><span>10</span> <span>144172316</span></span></code><code><span><span>10</span> <span>3598262</span></span></code><code><span><span>8</span> <span>4766593</span></span></code><code><span><span>5</span> <span>200813276</span></span></code><code><span><span>8</span> <span>160536300</span></span></code><code><span><span>1</span> <span>286271514</span></span></code><code><span><br></span></code></pre></section><p><br></p><p>上下游区间是100kb,那么先将每个snp生成一个命令行,然后运行就行了。<br></p><p>awk走起:<br></p><section><ul><li><li></ul><pre data-lang="nginx"><code><span><span>awk</span> <span>'{print "bash plot_ld_block_vcf.sh ../geno/genotype.vcf", <span>$1</span>,<span>$2</span>,100000}'</span> snp.txt >temp.sh</span></code><code><span><br></span></code></pre></section><p><br></p><p>生成的脚本文件:</p><section><ul><li><li><li><li><li><li><li><li><li><li><li><li><li><li><li><li><li><li><li><li><li><li></ul><pre data-lang="properties"><code><span><span>$</span> <span>cat temp.sh </span></span></code><code><span><span>bash</span> <span>plot_ld_block_vcf.sh ../geno/genotype.vcf 1 280719882 100000</span></span></code><code><span><span>bash</span> <span>plot_ld_block_vcf.sh ../geno/genotype.vcf 7 143547413 100000</span></span></code><code><span><span>bash</span> <span>plot_ld_block_vcf.sh ../geno/genotype.vcf 2 141942814 100000</span></span></code><code><span><span>bash</span> <span>plot_ld_block_vcf.sh ../geno/genotype.vcf 2 141942735 100000</span></span></code><code><span><span>bash</span> <span>plot_ld_block_vcf.sh ../geno/genotype.vcf 1 175504926 100000</span></span></code><code><span><span>bash</span> <span>plot_ld_block_vcf.sh ../geno/genotype.vcf 10 33619582 100000</span></span></code><code><span><span>bash</span> <span>plot_ld_block_vcf.sh ../geno/genotype.vcf 1 245136244 100000</span></span></code><code><span><span>bash</span> <span>plot_ld_block_vcf.sh ../geno/genotype.vcf 10 134034686 100000</span></span></code><code><span><span>bash</span> <span>plot_ld_block_vcf.sh ../geno/genotype.vcf 1 54049220 100000</span></span></code><code><span><span>bash</span> <span>plot_ld_block_vcf.sh ../geno/genotype.vcf 1 285273845 100000</span></span></code><code><span><span>bash</span> <span>plot_ld_block_vcf.sh ../geno/genotype.vcf 5 18240536 100000</span></span></code><code><span><span>bash</span> <span>plot_ld_block_vcf.sh ../geno/genotype.vcf 5 83233487 100000</span></span></code><code><span><span>bash</span> <span>plot_ld_block_vcf.sh ../geno/genotype.vcf 8 123272935 100000</span></span></code><code><span><span>bash</span> <span>plot_ld_block_vcf.sh ../geno/genotype.vcf 6 93263605 100000</span></span></code><code><span><span>bash</span> <span>plot_ld_block_vcf.sh ../geno/genotype.vcf 10 144172316 100000</span></span></code><code><span><span>bash</span> <span>plot_ld_block_vcf.sh ../geno/genotype.vcf 10 3598262 100000</span></span></code><code><span><span>bash</span> <span>plot_ld_block_vcf.sh ../geno/genotype.vcf 8 4766593 100000</span></span></code><code><span><span>bash</span> <span>plot_ld_block_vcf.sh ../geno/genotype.vcf 5 200813276 100000</span></span></code><code><span><span>bash</span> <span>plot_ld_block_vcf.sh ../geno/genotype.vcf 8 160536300 100000</span></span></code><code><span><span>bash</span> <span>plot_ld_block_vcf.sh ../geno/genotype.vcf 1 286271514 100000</span></span></code><code><span><br></span></code></pre></section><p>然后运行temp.sh文件就可以了:</p><section><ul><li></ul><pre data-lang="css"><code><span><span>bash</span> <span>temp</span><span>.sh</span></span></code></pre></section><p>静等结果就可以了。</p><p>666<br></p><p><strong>推荐阅读:</strong><br></p><p><strong><br></strong></p><p><strong><strong><span>想要更好的学习和交流,快来加入</span><a target="_blank" href="http://mp.weixin.qq.com/s?__biz=MzI0MTIzNjYwNQ==&mid=2247491476&idx=1&sn=8512962ea1264b569db4aa8d15b8712f&chksm=e90ff086de7879908a1fa151e81004993a15650bae3f68e4d518a332beb18a130373c492043e&scene=21#wechat_redirect" textvalue="飞哥的知识星球" linktype="text" imgurl="" imgdata="null" data-itemshowtype="0" tab="innerlink" data-linktype="2" hasload="1">飞哥的知识星球</a><span>,这是一个生物统计+数量遗传学+GWAS+GS的社区,在这里你可以向飞哥提问、帮你指定学习计划、跟着飞哥一起做实战项目,冲冲冲。</span><a target="_blank" href="http://mp.weixin.qq.com/s?__biz=MzI0MTIzNjYwNQ==&mid=2247491476&idx=1&sn=8512962ea1264b569db4aa8d15b8712f&chksm=e90ff086de7879908a1fa151e81004993a15650bae3f68e4d518a332beb18a130373c492043e&scene=21#wechat_redirect" textvalue="点击这里加入吧:飞哥的学习圈子" linktype="text" imgurl="" imgdata="null" data-itemshowtype="0" tab="innerlink" data-linktype="2" hasload="1"><span><strong>点击这里加入吧:</strong></span>飞哥的学习圈子</a></strong></strong></p><p><strong><img data-imgfileid="100011936" data-ratio="1.2462264150943396" data-src="https://mmbiz.qpic.cn/mmbiz_jpg/XEicwVA08daCyoYQooxoH1tCzgN5xxevlQokBib7RiaStK9nPw4qo80OfqYAuRHsPAU4DJ2StOCUnHicqZNuG4fibow/640?wx_fmt=other&wxfrom=5&wx_lazy=1&wx_co=1&tp=webp" data-w="1060" src="https://mmbiz.qpic.cn/mmbiz_jpg/XEicwVA08daCyoYQooxoH1tCzgN5xxevlQokBib7RiaStK9nPw4qo80OfqYAuRHsPAU4DJ2StOCUnHicqZNuG4fibow/640?wx_fmt=other&wxfrom=5&wx_lazy=1&wx_co=1&tp=webp"></strong></p><p><br></p><p>1,<a target="_blank" href="http://mp.weixin.qq.com/s?__biz=MzI0MTIzNjYwNQ==&mid=2247489866&idx=1&sn=e8631d343a33ae956b23d97a6b2b3fe4&chksm=e90ff658de787f4e3b52929c85adadbe8c6f54bf0943780ea94f9e19e4cc6ef8756e068761f9&scene=21#wechat_redirect" textvalue="快来领取 | 飞哥的GWAS分析教程" linktype="text" imgurl="" imgdata="null" data-itemshowtype="0" tab="innerlink" data-linktype="2" wah-hotarea="click" hasload="1">快来领取 | 飞哥的GWAS分析教程</a></p><p><br></p><p>2,<a target="_blank" href="http://mp.weixin.qq.com/s?__biz=MzI0MTIzNjYwNQ==&mid=2247489124&idx=1&sn=923b2d827f0e4b6c70494ebd458bdfa5&chksm=e90ff976de78706065f72955113db192e9f74682c503dc69c3ebf8d17e9b8726db82be1ffa0a&scene=21#wechat_redirect" textvalue="飞哥汇总 | 入门数据分析资源推荐" linktype="text" imgurl="" imgdata="null" data-itemshowtype="0" tab="innerlink" data-linktype="2" wah-hotarea="click" hasload="1">飞哥汇总 | 入门数据分析资源推荐</a><br></p><p><br></p><p>3,<a target="_blank" href="http://mp.weixin.qq.com/s?__biz=MzI0MTIzNjYwNQ==&mid=2247488460&idx=3&sn=b3acd30a2ab34e10415dacc90f596e4a&chksm=e90ffcdede7875c8ba669382e38e91f5d1e10960448a1b88dbbe60bde1e5c2880e8ab918e914&scene=21#wechat_redirect" textvalue="数量遗传学,分享几本书的电子版" linktype="text" imgurl="" imgdata="null" data-itemshowtype="0" tab="innerlink" data-linktype="2" wah-hotarea="click" hasload="1">数量遗传学,分享几本书的电子版</a></p><p><br></p><p>4,<a target="_blank" href="http://mp.weixin.qq.com/s?__biz=MzI0MTIzNjYwNQ==&mid=2247490492&idx=1&sn=24434d7598cf1993bbece938b63cc7e8&chksm=e90ff4aede787db837d5de895a24bb0cc055868d4345a5974a24534827aaff9882c86bbe6550&scene=21#wechat_redirect" textvalue="R语言学习看最新版的电子书不香嘛?" linktype="text" imgurl="" imgdata="null" data-itemshowtype="0" tab="innerlink" data-linktype="2" hasload="1">R语言学习看最新版的电子书不香嘛?</a></p><p><br></p><p><span>5,</span><a target="_blank" href="http://mp.weixin.qq.com/s?__biz=MzI0MTIzNjYwNQ==&mid=2247490897&idx=1&sn=2836fdb9bf636213d2ac9438c5d4d575&chksm=e90ff243de787b55c501cf2ecc5bf3eb23f462b303ed4c861f0ade5c9d34741d338f18db3976&scene=21#wechat_redirect" textvalue="书籍及配套代码领取--统计遗传分析导论" linktype="text" imgurl="" imgdata="null" data-itemshowtype="0" tab="innerlink" data-linktype="2" wah-hotarea="click" hasload="1"><span>书籍及配套代码领取--统计遗传分析导论</span></a></p><p><mp-style-type data-value="3"></mp-style-type></p></div> | ||
<hr> | ||
<a href="https://mp.weixin.qq.com/s/TO-Q65AA90mSQ_NM7yg2xg",target="_blank" rel="noopener noreferrer">原文链接</a> |