Qcxms #421

Merged
merged 21 commits into from
Feb 22, 2024
Changes from 19 commits
22 changes: 22 additions & 0 deletions tools/qcxms/.shed.yml
@@ -0,0 +1,22 @@
name: QCxMS
owner: recetox
remote_repository_url: "https://github.com/RECETOX/galaxytools/tree/master/tools/qcxms"
homepage_url: "https://github.com/grimme-lab/QCxMS"
categories:
- Computational chemistry
- Molecular Dynamics
description: "QCxMS is a quantum chemical (QC) based program that enables users to calculate mass spectra (MS) using Born-Oppenheimer Molecular Dynamics (MD)."
long_description: |
  QCxMS is a quantum chemical (QC) based program that enables users to calculate mass spectra (MS) using Born-Oppenheimer Molecular Dynamics (MD).
  It is the successor of the QCEIMS program, in which the EI part is exchanged to x (x=EI, CID) to account for the greater general applicability of the program.
  The program was originally developed to calculate Electron Ionization (EI) mass spectra,
  in which a (typically 70 eV) electron beam is focused on a molecule in order to create an open-shell radical ion (uneven number of valence electrons).
  This process not only ionizes the molecule, but simultaneously increases the internal energy of the species, which in turn leads to bond breaking,
  fragmentation, rearrangement, etc. of the ion.
auto_tool_repositories:
  name_template: "{{ tool_id }}"
  description_template: "{{ tool_name }} tool from the QCxMS package"
suite:
  name: suite_qcxms
  description: tools from QCxMS are used for molecular geometry optimization and in silico calculation of mass spectra using quantum chemistry
Review comment (Member): Tools are only for spectrum prediction

  type: repository_suite_definition
26 changes: 26 additions & 0 deletions tools/qcxms/macros.xml
@@ -0,0 +1,26 @@
<macros>
<token name="@TOOL_VERSION@">5.2.1</token>
<xml name="requirements">
<requirements>
<container type="docker">recetox/qcxms-docker:@TOOL_VERSION@</container>
</requirements>
</xml>
<xml name="edam">
<edam_topics>
<edam_topic>topic_3332</edam_topic>
</edam_topics>
<edam_operations>
<edam_operation>operation_0297</edam_operation>
</edam_operations>
</xml>

<xml name="creator">
<creator>
<yield/>
<organization
url="https://www.recetox.muni.cz/"
email="[email protected]"
name="RECETOX MUNI" />
</creator>
</xml>
</macros>
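
Note on the `@TOOL_VERSION@` token: it is substituted textually wherever it appears, both in the container tag above and in the `version` attribute of the tool files. The snippet below is only an illustrative sketch of that substitution, not Galaxy's own macro code; `expand_tokens` and the `TOKENS` table are hypothetical names.

```python
import re

# Token table as declared in macros.xml (values copied from the diff).
TOKENS = {"@TOOL_VERSION@": "5.2.1"}


def expand_tokens(text, tokens):
    """Replace every @NAME@ token with a known value; leave unknown tokens untouched."""
    return re.sub(r"@[A-Z_]+@", lambda m: tokens.get(m.group(0), m.group(0)), text)


container = expand_tokens("recetox/qcxms-docker:@TOOL_VERSION@", TOKENS)
version = expand_tokens("@TOOL_VERSION@+galaxy0", TOKENS)
print(container)  # recetox/qcxms-docker:5.2.1
print(version)    # 5.2.1+galaxy0
```

Bumping the token in macros.xml therefore changes the container tag and both tool versions together.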
11 changes: 11 additions & 0 deletions tools/qcxms/msp_out.sh
@@ -0,0 +1,11 @@
#!/bin/sh

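# Molecule name: the comment (second) line of the first trajectory's start.xyz
molname=$(sed -n '2{p;q}' TMPQCXMS/TMP.1/start.xyz)
# Number of peaks: the value after '=' on the NPOINTS line of the JCAMP-DX file
kword=$(grep 'NPOINTS' result.jdx)
num_peaks=$(echo "$kword" | sed 's/^[^=]*=//')
pwd
# Keep only the lines strictly between the PEAK and END markers
sed -n '/PEAK/,/END/{/PEAK/!{/END/!p}}' result.jdx > temp.dat
# Keep the first two columns of the peak table
awk '{print $1, $2}' temp.dat > tempa.dat
# Prepend the MSP header (the \n escapes rely on GNU sed) and append to the output
sed "1s/^/NAME: $molname\nNum Peaks: $num_peaks\n/" tempa.dat >> simulated_spectra.msp
# Terminate the record with a separator line
sed -i '$a\ ' simulated_spectra.msp
rm temp.dat tempa.dat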
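
For readers less used to sed and awk, the conversion done by msp_out.sh can be sketched in Python. This is a rough equivalent under stated assumptions, not part of the PR: `jdx_to_msp` is a hypothetical helper, the sample JCAMP-DX text is invented for illustration, and the record ends with an empty line where the script appends a line holding a single space.

```python
def jdx_to_msp(jdx_text, name):
    """Build one MSP record from the peak table of a JCAMP-DX text."""
    num_peaks = None
    peaks = []
    in_table = False
    for line in jdx_text.splitlines():
        if "NPOINTS" in line:
            # Same idea as sed 's/^[^=]*=//': keep what follows the first '='.
            num_peaks = line.split("=", 1)[1].strip()
        elif "PEAK" in line:
            in_table = True       # table starts after this marker line
        elif "END" in line:
            in_table = False      # marker line itself is not copied
        elif in_table and line.strip():
            # Like awk '{print $1, $2}': keep the first two columns.
            peaks.append(" ".join(line.split()[:2]))
    header = ["NAME: " + name, "Num Peaks: " + str(num_peaks)]
    return "\n".join(header + peaks) + "\n\n"


# Invented sample input, only for demonstration.
sample_jdx = """##TITLE=example
##NPOINTS=2
##PEAK TABLE=(XY..XY) 1
35.0 10.5
70.0 100.0
##END=
"""

msp = jdx_to_msp(sample_jdx, "Lindane")
print(msp)
```

The name passed in corresponds to `$molname`, which the script reads from the second line of start.xyz.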
140 changes: 140 additions & 0 deletions tools/qcxms/qcxms_neutral_run.xml
@@ -0,0 +1,140 @@
<tool id="qcxms_neutral_run" name="QCxMS neutral run" version="@TOOL_VERSION@+galaxy0" profile="21.05">
<description>required as first step to prepare for the production runs</description>

<macros>
<import>macros.xml</import>
</macros>

<expand macro="edam"/>
<expand macro="creator"/>
<expand macro="requirements"/>

<command detect_errors="exit_code"><![CDATA[
ln -s '$mol' molecule.xyz &&
cat qcxms.in &&
/qcxms_bin/qcxms -i molecule.xyz >> '$log' &&
/qcxms_bin/qcxms -i molecule.xyz >> '$log' &&
python3 rename.py

]]></command>

<environment_variables>
<environment_variable name="OMP_NUM_THREADS">1,2,1</environment_variable>
</environment_variables>

<configfiles>
<configfile filename="qcxms.in"><![CDATA[
${QC_Level}
#if $keywords.ntraj
ntraj ${keywords.ntraj}
#end if
tmax ${keywords.tmax}
tinit ${keywords.tinit}
ieeatm ${keywords.ieeatm}]]>
</configfile>
<configfile filename="rename.py">
import os

def rename_files_with_folder_name(folder_path):
    if not os.path.exists(folder_path):
        print(f"The folder '{folder_path}' does not exist.")
        return

    for root, _, files in os.walk(folder_path):
        for filename in files:
            folder_name = os.path.basename(root)
            new_filename = f"{folder_name}_{filename}"

            old_path = os.path.join(root, filename)
            new_path = os.path.join(root, new_filename)

            os.rename(old_path, new_path)

path = os.getcwd() + "/TMPQCXMS"
rename_files_with_folder_name(path)
</configfile>
</configfiles>

<inputs>
<param type="data" name="mol" label="Molecule 3D structure [.xyz]" format="xyz,txt" />
<param name="QC_Level" type="select" display="radio" label="QC Method">
<option value="xtb2" selected="true">GFN2-xTB</option>
<option value="xtb">GFN-xTB</option>
</param>
<section name="keywords" title="Advanced method parameters" expanded="false"
help="List of advanced keywords to specify the method - for more information see [1].">
<param name="tmax" type="float" value="20.0" label="Maximum MD time (sampling) [ps]"
help="MD time for the mean-free-path (mfp) simulation in the EI mode. In the CID mode, this sets the number of time steps for the simulation
after fragmentation during internal energy scaling (implicit run type). For the explicit run type, the time for the collision MDs is fixed at 50 fs * number_of_atoms."/>
<param name="tinit" type="float" value="500.0" label="Initial Temperature [K]"/>
<param name="ieeatm" type="float" value="0.6" label="Impact excess energy (IEE) per atom [eV/atom]" />
<param name="ntraj" type="integer" optional="true" min="2" label="Number of trajectories [#]" help="Default is 25 * no. of atoms if unspecified."/>
</section>
<param name="store_extended_output" type="boolean" value="false" label="Store additional outputs?" help="Output the logfile and generated trajectory."/>
</inputs>

<outputs>
<data name="qcxms_out" format="txt" from_work_dir="qcxms.gs" label="qcxms.gs generated by ${tool.name} on ${on_string}">
<filter>store_extended_output</filter>
</data>
<data name="trajectory" from_work_dir="trjM" format="txt" label="trajectories generated by ${tool.name} on ${on_string}">
<filter>store_extended_output</filter>
</data>
<data name="log" format="txt" label="logfile of ${tool.name} on ${on_string}">
<filter>store_extended_output</filter>
</data>

<collection name="coords1" format="txt" type="list" label="coords in files generated by ${tool.name} on ${on_string}" >
<discover_datasets pattern="(?P&lt;designation&gt;.+)\.in" format="txt" directory="TMPQCXMS" recurse="true"/>
</collection>
<collection name="coords2" format="txt" type="list" label="coords start files generated by ${tool.name} on ${on_string}" >
<discover_datasets pattern="(?P&lt;designation&gt;.+)\.start" format="txt" directory="TMPQCXMS" recurse="true"/>
</collection>
<collection name="coords3" format="txt" type="list" label="coords xyz files generated by ${tool.name} on ${on_string}" >
<discover_datasets pattern="(?P&lt;designation&gt;.+)\.xyz" format="txt" directory="TMPQCXMS" recurse="true"/>
</collection>
</outputs>

<tests>
<test expect_num_outputs="6">
<param name="mol" value="mol.xyz" ftype="txt"/>
<section name="keywords">
<param name="ntraj" value="2"/>
</section>
<param name="store_extended_output" value="true"/>
<output_collection name="coords1" type="list" count="2"/>
<output_collection name="coords2" type="list" count="2"/>
<output_collection name="coords3" type="list" count="2"/>
<output name="qcxms_out">
<assert_contents>
<has_size value="174613" delta="300"/>
</assert_contents>
</output>
<output name="trajectory">
<assert_contents>
<has_size value="22150" delta="300"/>
</assert_contents>
</output>
<output name="log">
<assert_contents>
<has_size value="10518" delta="300"/>
</assert_contents>
</output>
</test>
</tests>

<help><![CDATA[
The QCxMS neutral run tool is the first step in preparing for production runs. The tool executes neutral runs for mass
spectrometry simulations using the GFN2-xTB or GFN-xTB quantum chemistry method. For detailed information, visit the documentation
at https://xtb-docs.readthedocs.io/en/latest/qcxms_doc/qcxms_run.html#excecuting-the-production-runs
]]>
</help>

<citations>
<citation type="doi">10.1002/anie.201300158</citation>
<citation type="doi">10.1039/C4OB01668H</citation>
<citation type="doi">10.1021/jp5096618</citation>
<citation type="doi">10.1255/ejms.1313</citation>
<citation type="doi">10.1021/acs.jpca.6b02907</citation>
</citations>
</tool>
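
Note on how rename.py and the three `discover_datasets` patterns fit together: each file is prefixed with its trajectory folder name, so the collection element identifiers stay unique across folders. The sketch below rebuilds a small TMPQCXMS tree in a temporary directory and shows which designations the `.in` pattern would capture. It assumes the pattern is applied to the bare file name and uses `re.fullmatch` for clarity; Galaxy's own matching logic may differ in detail, and the folder names TMP.1 and TMP.2 are assumed for illustration.

```python
import os
import re
import tempfile


def rename_files_with_folder_name(folder_path):
    """Prefix every file with the name of the folder that contains it."""
    for root, _, files in os.walk(folder_path):
        for filename in files:
            folder_name = os.path.basename(root)
            os.rename(os.path.join(root, filename),
                      os.path.join(root, folder_name + "_" + filename))


# Same pattern as the coords1 collection in the tool XML.
pattern = re.compile(r"(?P<designation>.+)\.in")

designations = []
with tempfile.TemporaryDirectory() as workdir:
    base = os.path.join(workdir, "TMPQCXMS")
    for traj in ("TMP.1", "TMP.2"):
        os.makedirs(os.path.join(base, traj))
        for fname in ("qcxms.in", "qcxms.start", "start.xyz"):
            open(os.path.join(base, traj, fname), "w").close()

    rename_files_with_folder_name(base)

    for _, _, files in os.walk(base):
        for fname in files:
            match = pattern.fullmatch(fname)
            if match:
                designations.append(match.group("designation"))

designations.sort()
print(designations)  # ['TMP.1_qcxms', 'TMP.2_qcxms']
```

The production run tool later recovers the folder name from such identifiers by splitting on the first underscore.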
86 changes: 86 additions & 0 deletions tools/qcxms/qcxms_prod_run.xml
@@ -0,0 +1,86 @@
<tool id="qcxms_production_run" name="QCxMS production run" version="@TOOL_VERSION@+galaxy0" profile="21.05">
<description>Production run to obtain a QCxMS simulated mass spectrum</description>

<macros>
<import>macros.xml</import>
</macros>
<expand macro="edam"/>
<expand macro="creator"/>
<expand macro="requirements"/>

<command detect_errors="exit_code"><![CDATA[
python3 '${create_folder_structure}' &&
find TMPQCXMS/*/ -type d | xargs -I {} -P 4 sh -c 'cd {} && /qcxms_bin/qcxms --prod >> $log' &&
/qcxms_bin/getres &&
/plotms_bin/PlotMS.v.6.2.0/plotms &&
sh '${__tool_directory__}/msp_out.sh'
]]></command>

<environment_variables>
<environment_variable name="OMP_NUM_THREADS">1,2,1</environment_variable>
</environment_variables>

<configfiles>
<configfile name="create_folder_structure">
import os
import shutil

#set in_collection = str("', '").join([str($f) for $f in $in_files])
#set start_collection = str("', '").join([str($f) for $f in $start_files])
#set xyz_collection = str("', '").join([str($f) for $f in $xyz_files])

#set names = str("', '").join([str($f.name) for $f in $xyz_files])
# Wrap each rendered value in brackets so a single-element collection still yields a list
names = ['$names']
folder_names = [x.split("_")[0] for x in names]

in_collection = ['$in_collection']
start_collection = ['$start_collection']
xyz_collection = ['$xyz_collection']

# Create a new output folder to store the result
output_path = 'TMPQCXMS'
os.makedirs(output_path, exist_ok=True)

for folder_name, in_file, start_file, xyz_file in zip(folder_names, in_collection, start_collection, xyz_collection):
    new_folder_path = os.path.join(output_path, folder_name)
    os.makedirs(new_folder_path, exist_ok=True)

    shutil.copy2(os.path.join(os.path.dirname(in_collection[0]), in_file), os.path.join(new_folder_path, 'qcxms.in'))
    shutil.copy2(os.path.join(os.path.dirname(start_collection[0]), start_file), os.path.join(new_folder_path, 'qcxms.start'))
    shutil.copy2(os.path.join(os.path.dirname(xyz_collection[0]), xyz_file), os.path.join(new_folder_path, 'start.xyz'))


</configfile>
</configfiles>

<inputs>
<param type="data_collection" collection_type="list" name="in_files" label="in files [.in]" format="in,txt,text"/>
<param type="data_collection" collection_type="list" name="start_files" label="start files [.start]" format="start,txt,text"/>
<param type="data_collection" collection_type="list" name="xyz_files" label="xyz files [.xyz]" format="xyz,txt,text"/>
<param name="store_extended_output" type="boolean" value="false" label="Store additional outputs" help="Output the logfile."/>
</inputs>

<outputs>
<data name="msp_output" format="msp" from_work_dir="simulated_spectra.msp" label="simulated_spectra.msp generated by ${tool.name} on ${on_string}"/>
<data name="log" format="txt" label="logfile of ${tool.name} on ${on_string}">
<filter>store_extended_output</filter>
</data>
</outputs>

<tests>
</tests>

<help><![CDATA[
The QCxMS production run tool simulates the mass spectrum of a given molecule using QCxMS, a quantum chemical (QC) based program for calculating mass spectra (MS).
It runs the QCxMS production trajectories prepared by the neutral run, starting from the equilibrium structure of the molecule, and writes the simulated spectrum in MSP format.
For detailed information, visit the documentation at https://xtb-docs.readthedocs.io/en/latest/qcxms_doc/qcxms_run.html#excecuting-the-production-runs
]]>
</help>

<citations>
<citation type="doi">10.1002/anie.201300158</citation>
<citation type="doi">10.1039/C4OB01668H</citation>
<citation type="doi">10.1021/jp5096618</citation>
<citation type="doi">10.1255/ejms.1313</citation>
<citation type="doi">10.1021/acs.jpca.6b02907</citation>
</citations>
</tool>
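
Note on the `create_folder_structure` config file: the template joins the collection values with `', '` and places the result inside a quoted Python literal. The sketch below shows what such rendered text evaluates to. It is an illustration only: `render_quoted` and `render_bracketed` are hypothetical helpers standing in for the template rendering, and the element names are assumed.

```python
import ast


def render_quoted(values):
    """Mimic a template of the form '<joined>' with "', '" as separator."""
    return "'" + "', '".join(values) + "'"


def render_bracketed(values):
    """Same rendering, wrapped in list brackets."""
    return "[" + render_quoted(values) + "]"


many = ["TMP.1_start.xyz", "TMP.2_start.xyz"]
single = ["TMP.1_start.xyz"]

bare_many = ast.literal_eval(render_quoted(many))         # 'a', 'b' is a tuple
bare_single = ast.literal_eval(render_quoted(single))     # 'a' is a plain string
list_single = ast.literal_eval(render_bracketed(single))  # ['a'] is a list

# Iterating over a plain string yields characters, not file names.
folders_from_str = [x.split("_")[0] for x in bare_single]
folders_from_list = [x.split("_")[0] for x in list_single]

print(type(bare_many).__name__, type(bare_single).__name__, type(list_single).__name__)
print(folders_from_list)  # ['TMP.1']
```

A bare quoted join gives a tuple for two or more elements but a string for exactly one, in which case the loop over names would walk individual characters; the bracketed form gives a list in both cases.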
20 changes: 20 additions & 0 deletions tools/qcxms/test-data/mol.xyz
@@ -0,0 +1,20 @@
18
Lindane
CL -2.3574740887 0.2795224786 1.4453580379
C -1.5060254335 -0.0564753152 -0.1808833480
C -0.7409992814 1.1908990145 -0.6060928106
C -0.6154388189 -1.2913844585 -0.1359211653
CL -0.4997346103 1.0723730326 -2.4801609516
C 0.6426193118 1.3751488924 0.0051864041
CL -1.5938731432 -2.7512128353 0.4462303221
C 0.6645445824 -1.0841370821 0.6675278544
CL 0.4618762434 2.1580321789 1.6895099878
C 1.4344482422 0.0744384378 0.0415960811
CL 1.6999230385 -2.6236054897 0.5561774969
CL 3.0750505924 0.3596521318 0.8505522013
H -2.3252866268 -0.2277424186 -0.8540474772
H -1.3395941257 2.0737268925 -0.4863004684
H -0.3587814867 -1.5482350588 -1.1510624886
H 1.1898183823 2.1143426895 -0.5496439934
H 0.4745289683 -0.9351251125 1.7152343988
H 1.6943985224 -0.1802178770 -0.9732601047
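
Note on the test file layout: the first line holds the atom count, the second a free-text title (the line msp_out.sh uses as the spectrum name), and each following line an element symbol with three Cartesian coordinates. A minimal parser sketch under these assumptions follows; `parse_xyz` is a hypothetical helper and the sample is truncated to the first two atoms of the file above.

```python
def parse_xyz(text):
    """Parse XYZ text into (title, [(symbol, x, y, z), ...])."""
    lines = text.strip().splitlines()
    n_atoms = int(lines[0])
    title = lines[1].strip()
    atoms = []
    for line in lines[2:2 + n_atoms]:
        symbol, x, y, z = line.split()
        # Normalise symbols such as "CL" to "Cl".
        atoms.append((symbol.capitalize(), float(x), float(y), float(z)))
    if len(atoms) != n_atoms:
        raise ValueError("atom count in header does not match the coordinate lines")
    return title, atoms


# Truncated sample: header adjusted to 2 atoms for the demonstration.
sample_xyz = """2
Lindane
CL -2.3574740887 0.2795224786 1.4453580379
C -1.5060254335 -0.0564753152 -0.1808833480
"""

title, atoms = parse_xyz(sample_xyz)
print(title, len(atoms), atoms[0][0])  # Lindane 2 Cl
```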