Skip to content

Commit

Permalink
一个批量做LDblock的脚本
Browse files Browse the repository at this point in the history
  • Loading branch information
ixxmu committed May 28, 2024
1 parent 4da76ce commit f81e8b2
Showing 1 changed file with 15 additions and 0 deletions.
15 changes: 15 additions & 0 deletions docs/2024-05/一个批量做LDblock的脚本.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,15 @@
---
title: "一个批量做LDblock的脚本"
date: 2024-05-28T10:56:05Z
draft: ["false"]
tags: [
"fetched",
"育种数据分析之放飞自我"
]
categories: ["Acdemic"]
---
一个批量做LDblock的脚本 by 育种数据分析之放飞自我
------
<div><p data-mpa-powered-by="yiban.io">大家好,我是邓飞。</p><p>今天,有小伙伴在星球询问GWAS分析的显著性位点,如何批量绘制LDblock,之前写过LDblock的安装和使用说明:(<a target="_blank" href="http://mp.weixin.qq.com/s?__biz=MzI0MTIzNjYwNQ==&amp;mid=2247492617&amp;idx=1&amp;sn=4a2331c0e29ec4d26f4823c545549fd8&amp;chksm=e90c0b1bde7b820d56cb9fd9ac69f487699ab21efc46c76deca5f32ec5f1d1a70c1023c6ed0e&amp;scene=21#wechat_redirect" textvalue="LDblock绘制连锁不平衡和单体型图" linktype="text" imgurl="" imgdata="null" data-itemshowtype="0" tab="innerlink" data-linktype="2">LDblock绘制连锁不平衡和单体型图</a>)</p><p><img data-galleryid="" data-imgfileid="100011933" data-ratio="0.4758241758241758" data-s="300,640" data-src="https://mmbiz.qpic.cn/mmbiz_png/XEicwVA08daBEojUyMqUQMWODohic09hKzibVsEjfDLbGnhSDX7oz7gUsvUYbR1aspJ3osT6CYLb0KQ7lJT4ZgibibQ/640?wx_fmt=png&amp;from=appmsg" data-type="png" data-w="910" src="https://mmbiz.qpic.cn/mmbiz_png/XEicwVA08daBEojUyMqUQMWODohic09hKzibVsEjfDLbGnhSDX7oz7gUsvUYbR1aspJ3osT6CYLb0KQ7lJT4ZgibibQ/640?wx_fmt=png&amp;from=appmsg"></p><p>问题有3个核心点:<br></p><p>1,snp文件总是有^M符号,怎么去除</p><p>2,如何批量对显著位点进行LDblock分析</p><p>3,有没有现成代码</p><p>今天就搞定这个问题。<br></p><p><br></p><p>首先,看一下正常LDblock分析的代码:<br></p><p>LDBlockShow -InVCF hebing1.vcf -Region 3:400986543:401186543 -OutPng -SeleVar 1 <span>--OutPut re3</span></p><p><span>有vcf文件,染色体,物理位置区间,结果文件名称。<br></span></p><p><span>来一个脚本:</span></p><p><span><br></span></p><section><ul><li><li><li><li><li><li><li><li><li><li><li><li><li><li><li><li><li><li></ul><pre data-lang="bash"><code><span>$ cat ~/bin/plot_ld_plot_vcf.sh </span></code><code><span><span>#!/bin/bash</span></span></code><code><span><br></span></code><code><span><span>if</span> [ <span>$#</span> != 4 ]</span></code><code><span><span>then</span></span></code><code><span> <span>echo</span> <span>$0</span> name.vcf chronome_number location 20000; 第一个是vcf或者vcf.gz文件,第二个是染色体位置,第三个是物理位置,第四个是上下游区间</span></code><code><span> <span>exit</span> 1</span></code><code><span><span>fi</span></span></code><code><span><br></span></code><code><span>vcf=<span>$1</span>;chr=<span>$2</span>;pos=<span>$3</span>;qujian=<span>$4</span></span></code><code><span>st=$(( <span>$pos</span> - <span>$qujian</span>))</span></code><code><span>en=$(( <span>$pos</span> + qujian))</span></code><code><span><br></span></code><code><span>ot=re_<span>${chr}</span>_<span>${pos}</span></span></code><code><span><span>echo</span> <span>$chr</span> <span>$pos</span> <span>$pos</span> &gt;temp.txt</span></code><code><span>~/bin/LDBlockShow -InVCF <span>$vcf</span> -Region <span>${chr}</span>:<span>${st}</span>:<span>${en}</span> -SeleVar 1 -BlockType 2 -NoShowLDist 10000000 -OutPut <span>${ot}</span></span></code><code><span>~/bin/ShowLDSVG -InPreFix <span>${ot}</span> -SpeSNPName temp.txt -OutPut <span>${ot}</span>_a</span></code><code><span><br></span></code></pre></section><p><br></p><p>上面的脚本,给定vcf文件,染色体,物理位置,上下游区间,就会自动分析。<br></p><p>如何批量处理呢?<br></p><p>比如snp.txt文件,如下图所示:</p><section><ul><li><li><li><li><li><li><li><li><li><li><li><li><li><li><li><li><li><li><li><li><li></ul><pre data-lang="properties"><code><span><span>1</span> <span>280719882</span></span></code><code><span><span>7</span> <span>143547413</span></span></code><code><span><span>2</span> <span>141942814</span></span></code><code><span><span>2</span> <span>141942735</span></span></code><code><span><span>1</span> <span>175504926</span></span></code><code><span><span>10</span> <span>33619582</span></span></code><code><span><span>1</span> <span>245136244</span></span></code><code><span><span>10</span> <span>134034686</span></span></code><code><span><span>1</span> <span>54049220</span></span></code><code><span><span>1</span> <span>285273845</span></span></code><code><span><span>5</span> <span>18240536</span></span></code><code><span><span>5</span> <span>83233487</span></span></code><code><span><span>8</span> <span>123272935</span></span></code><code><span><span>6</span> <span>93263605</span></span></code><code><span><span>10</span> <span>144172316</span></span></code><code><span><span>10</span> <span>3598262</span></span></code><code><span><span>8</span> <span>4766593</span></span></code><code><span><span>5</span> <span>200813276</span></span></code><code><span><span>8</span> <span>160536300</span></span></code><code><span><span>1</span> <span>286271514</span></span></code><code><span><br></span></code></pre></section><p><br></p><p>上下游区间是100kb,那么先将每个snp生成一个命令行,然后运行就行了。<br></p><p>awk走起:<br></p><section><ul><li><li></ul><pre data-lang="nginx"><code><span><span>awk</span> <span>'{print "bash plot_ld_block_vcf.sh ../geno/genotype.vcf", <span>$1</span>,<span>$2</span>,100000}'</span> snp.txt &gt;temp.sh</span></code><code><span><br></span></code></pre></section><p><br></p><p>生成的脚本文件:</p><section><ul><li><li><li><li><li><li><li><li><li><li><li><li><li><li><li><li><li><li><li><li><li><li></ul><pre data-lang="properties"><code><span><span>$</span> <span>cat temp.sh </span></span></code><code><span><span>bash</span> <span>plot_ld_block_vcf.sh ../geno/genotype.vcf 1 280719882 100000</span></span></code><code><span><span>bash</span> <span>plot_ld_block_vcf.sh ../geno/genotype.vcf 7 143547413 100000</span></span></code><code><span><span>bash</span> <span>plot_ld_block_vcf.sh ../geno/genotype.vcf 2 141942814 100000</span></span></code><code><span><span>bash</span> <span>plot_ld_block_vcf.sh ../geno/genotype.vcf 2 141942735 100000</span></span></code><code><span><span>bash</span> <span>plot_ld_block_vcf.sh ../geno/genotype.vcf 1 175504926 100000</span></span></code><code><span><span>bash</span> <span>plot_ld_block_vcf.sh ../geno/genotype.vcf 10 33619582 100000</span></span></code><code><span><span>bash</span> <span>plot_ld_block_vcf.sh ../geno/genotype.vcf 1 245136244 100000</span></span></code><code><span><span>bash</span> <span>plot_ld_block_vcf.sh ../geno/genotype.vcf 10 134034686 100000</span></span></code><code><span><span>bash</span> <span>plot_ld_block_vcf.sh ../geno/genotype.vcf 1 54049220 100000</span></span></code><code><span><span>bash</span> <span>plot_ld_block_vcf.sh ../geno/genotype.vcf 1 285273845 100000</span></span></code><code><span><span>bash</span> <span>plot_ld_block_vcf.sh ../geno/genotype.vcf 5 18240536 100000</span></span></code><code><span><span>bash</span> <span>plot_ld_block_vcf.sh ../geno/genotype.vcf 5 83233487 100000</span></span></code><code><span><span>bash</span> <span>plot_ld_block_vcf.sh ../geno/genotype.vcf 8 123272935 100000</span></span></code><code><span><span>bash</span> <span>plot_ld_block_vcf.sh ../geno/genotype.vcf 6 93263605 100000</span></span></code><code><span><span>bash</span> <span>plot_ld_block_vcf.sh ../geno/genotype.vcf 10 144172316 100000</span></span></code><code><span><span>bash</span> <span>plot_ld_block_vcf.sh ../geno/genotype.vcf 10 3598262 100000</span></span></code><code><span><span>bash</span> <span>plot_ld_block_vcf.sh ../geno/genotype.vcf 8 4766593 100000</span></span></code><code><span><span>bash</span> <span>plot_ld_block_vcf.sh ../geno/genotype.vcf 5 200813276 100000</span></span></code><code><span><span>bash</span> <span>plot_ld_block_vcf.sh ../geno/genotype.vcf 8 160536300 100000</span></span></code><code><span><span>bash</span> <span>plot_ld_block_vcf.sh ../geno/genotype.vcf 1 286271514 100000</span></span></code><code><span><br></span></code></pre></section><p>然后运行temp.sh文件就可以了:</p><section><ul><li></ul><pre data-lang="css"><code><span><span>bash</span> <span>temp</span><span>.sh</span></span></code></pre></section><p>静等结果就可以了。</p><p>666<br></p><p><strong>推荐阅读:</strong><br></p><p><strong><br></strong></p><p><strong><strong><span>想要更好的学习和交流,快来加入</span><a target="_blank" href="http://mp.weixin.qq.com/s?__biz=MzI0MTIzNjYwNQ==&amp;mid=2247491476&amp;idx=1&amp;sn=8512962ea1264b569db4aa8d15b8712f&amp;chksm=e90ff086de7879908a1fa151e81004993a15650bae3f68e4d518a332beb18a130373c492043e&amp;scene=21#wechat_redirect" textvalue="飞哥的知识星球" linktype="text" imgurl="" imgdata="null" data-itemshowtype="0" tab="innerlink" data-linktype="2" hasload="1">飞哥的知识星球</a><span>,这是一个生物统计+数量遗传学+GWAS+GS的社区,在这里你可以向飞哥提问、帮你指定学习计划、跟着飞哥一起做实战项目,冲冲冲。</span><a target="_blank" href="http://mp.weixin.qq.com/s?__biz=MzI0MTIzNjYwNQ==&amp;mid=2247491476&amp;idx=1&amp;sn=8512962ea1264b569db4aa8d15b8712f&amp;chksm=e90ff086de7879908a1fa151e81004993a15650bae3f68e4d518a332beb18a130373c492043e&amp;scene=21#wechat_redirect" textvalue="点击这里加入吧:飞哥的学习圈子" linktype="text" imgurl="" imgdata="null" data-itemshowtype="0" tab="innerlink" data-linktype="2" hasload="1"><span><strong>点击这里加入吧:</strong></span>飞哥的学习圈子</a></strong></strong></p><p><strong><img data-imgfileid="100011936" data-ratio="1.2462264150943396" data-src="https://mmbiz.qpic.cn/mmbiz_jpg/XEicwVA08daCyoYQooxoH1tCzgN5xxevlQokBib7RiaStK9nPw4qo80OfqYAuRHsPAU4DJ2StOCUnHicqZNuG4fibow/640?wx_fmt=other&amp;wxfrom=5&amp;wx_lazy=1&amp;wx_co=1&amp;tp=webp" data-w="1060" src="https://mmbiz.qpic.cn/mmbiz_jpg/XEicwVA08daCyoYQooxoH1tCzgN5xxevlQokBib7RiaStK9nPw4qo80OfqYAuRHsPAU4DJ2StOCUnHicqZNuG4fibow/640?wx_fmt=other&amp;wxfrom=5&amp;wx_lazy=1&amp;wx_co=1&amp;tp=webp"></strong></p><p><br></p><p>1,<a target="_blank" href="http://mp.weixin.qq.com/s?__biz=MzI0MTIzNjYwNQ==&amp;mid=2247489866&amp;idx=1&amp;sn=e8631d343a33ae956b23d97a6b2b3fe4&amp;chksm=e90ff658de787f4e3b52929c85adadbe8c6f54bf0943780ea94f9e19e4cc6ef8756e068761f9&amp;scene=21#wechat_redirect" textvalue="快来领取 | 飞哥的GWAS分析教程" linktype="text" imgurl="" imgdata="null" data-itemshowtype="0" tab="innerlink" data-linktype="2" wah-hotarea="click" hasload="1">快来领取 | 飞哥的GWAS分析教程</a></p><p><br></p><p>2,<a target="_blank" href="http://mp.weixin.qq.com/s?__biz=MzI0MTIzNjYwNQ==&amp;mid=2247489124&amp;idx=1&amp;sn=923b2d827f0e4b6c70494ebd458bdfa5&amp;chksm=e90ff976de78706065f72955113db192e9f74682c503dc69c3ebf8d17e9b8726db82be1ffa0a&amp;scene=21#wechat_redirect" textvalue="飞哥汇总 | 入门数据分析资源推荐" linktype="text" imgurl="" imgdata="null" data-itemshowtype="0" tab="innerlink" data-linktype="2" wah-hotarea="click" hasload="1">飞哥汇总 | 入门数据分析资源推荐</a><br></p><p><br></p><p>3,<a target="_blank" href="http://mp.weixin.qq.com/s?__biz=MzI0MTIzNjYwNQ==&amp;mid=2247488460&amp;idx=3&amp;sn=b3acd30a2ab34e10415dacc90f596e4a&amp;chksm=e90ffcdede7875c8ba669382e38e91f5d1e10960448a1b88dbbe60bde1e5c2880e8ab918e914&amp;scene=21#wechat_redirect" textvalue="数量遗传学,分享几本书的电子版" linktype="text" imgurl="" imgdata="null" data-itemshowtype="0" tab="innerlink" data-linktype="2" wah-hotarea="click" hasload="1">数量遗传学,分享几本书的电子版</a></p><p><br></p><p>4,<a target="_blank" href="http://mp.weixin.qq.com/s?__biz=MzI0MTIzNjYwNQ==&amp;mid=2247490492&amp;idx=1&amp;sn=24434d7598cf1993bbece938b63cc7e8&amp;chksm=e90ff4aede787db837d5de895a24bb0cc055868d4345a5974a24534827aaff9882c86bbe6550&amp;scene=21#wechat_redirect" textvalue="R语言学习看最新版的电子书不香嘛?" linktype="text" imgurl="" imgdata="null" data-itemshowtype="0" tab="innerlink" data-linktype="2" hasload="1">R语言学习看最新版的电子书不香嘛?</a></p><p><br></p><p><span>5,</span><a target="_blank" href="http://mp.weixin.qq.com/s?__biz=MzI0MTIzNjYwNQ==&amp;mid=2247490897&amp;idx=1&amp;sn=2836fdb9bf636213d2ac9438c5d4d575&amp;chksm=e90ff243de787b55c501cf2ecc5bf3eb23f462b303ed4c861f0ade5c9d34741d338f18db3976&amp;scene=21#wechat_redirect" textvalue="书籍及配套代码领取--统计遗传分析导论" linktype="text" imgurl="" imgdata="null" data-itemshowtype="0" tab="innerlink" data-linktype="2" wah-hotarea="click" hasload="1"><span>书籍及配套代码领取--统计遗传分析导论</span></a></p><p><mp-style-type data-value="3"></mp-style-type></p></div>
<hr>
<a href="https://mp.weixin.qq.com/s/TO-Q65AA90mSQ_NM7yg2xg",target="_blank" rel="noopener noreferrer">原文链接</a>

0 comments on commit f81e8b2

Please sign in to comment.