Commit

docs: update README, guides
balajtimate committed Oct 31, 2024
1 parent 797c596 commit db6bd5c
Showing 4 changed files with 52 additions and 102 deletions.
102 changes: 14 additions & 88 deletions README.md
@@ -11,41 +11,27 @@
HTSinfer infers metadata from Illumina high-throughput sequencing (HTS) data.

## Quick start
## Installation

For a more in-depth guide please refer to the [HTSinfer documentation][docs-documentation].

### Installation

In order to use the HTSinfer, clone the repository and install the
dependencies via [Conda][conda]:
dependencies via [Conda][conda] or [Mamba][mamba]:

```sh
git clone https://github.com/zavolanlab/htsinfer
cd htsinfer
conda env create --file environment.yml
# Alternatively, to install with development dependencies,
# run the following instead
conda env create --file environment-dev.yml
```

> Note that creating the environment takes non-trivial time and it is strongly
> recommended that you install [Mamba][mamba] and replace `conda` with `mamba`
> in the previous command.
Then, activate the `htsinfer` Conda environment with:

```sh
conda activate htsinfer
```

If you have installed the development/testing dependencies, you may first want
to verify that HTSinfer was installed correctly by executing the tests shipped
with the package:

```sh
python -m pytest
```

Otherwise just go ahead and try one of the [examples](#Examples).

## General usage
### General usage

```sh
htsinfer [--output-directory PATH]
@@ -69,15 +55,15 @@ htsinfer [--output-directory PATH]
PATH [PATH]
```

## Examples
### Examples

**Single-ended library***
**Single-ended library**

```sh
htsinfer tests/files/adapter_single.fastq
```

**Paired-ended library***
**Paired-ended library**

```sh
htsinfer tests/files/adapter_1.fastq tests/files/adapter_2.fastq
@@ -146,82 +132,21 @@ example library:
```

To better understand the output, please refer to the [`Results`
model][docs-api-results] in the [API documentation][badge-url-docs]. Note that
`Results` model has several nested child models, such as enumerators of
possible outcomes. Simply follow the references in each parent model for
detailed descriptions of each child model's attributes.

## General usage

```sh
htsinfer [--output-directory PATH]
[--temporary-directory PATH]
[--cleanup-regime {DEFAULT,KEEP_ALL,KEEP_NONE,KEEP_RESULTS}]
[--records INT]
[--threads INT]
[--transcripts FASTA]
[--read-layout-adapters PATH]
[--read-layout-min-match-percentage FLOAT]
[--read-layout-min-frequency-ratio FLOAT]
[--library-source-min-match-percentage FLOAT]
[--library-source-min-frequency-ratio FLOAT]
[--library-type-max-distance INT]
[--library-type-mates-cutoff FLOAT]
[--read-orientation-min-mapped-reads INT]
[--read-orientation-min-fraction FLOAT]
[--tax-id INT]
[--verbosity {DEBUG,INFO,WARN,ERROR,CRITICAL}]
[-h] [--version]
PATH [PATH]
```

## Installation

In order to use the HTSinfer, clone the repository and install the
dependencies via [Conda][conda]:

```sh
git clone https://github.com/zavolanlab/htsinfer
cd htsinfer
conda env create --file environment.yml
# Alternatively, to install with development dependencies,
# run the following instead
conda env create --file environment-dev.yml
```

> Note that creating the environment takes non-trivial time and it is strongly
> recommended that you install [Mamba][mamba] and replace `conda` with `mamba`
> in the previous command.
Then, activate the `htsinfer` Conda environment with:

```sh
conda activate htsinfer
```

If you have installed the development/testing dependencies, you may first want
to verify that HTSinfer was installed correctly by executing the tests shipped
with the package:

```sh
python -m pytest
```

Otherwise just go ahead and try one of the [examples](#Examples).
model][docs-api-results] in the [API documentation][badge-url-docs].

## API documentation
### API documentation

Auto-built API documentation is hosted on [ReadTheDocs][badge-url-docs].

## Contributing
### Contributing

This project lives off your contributions, be it in the form of bug reports,
feature requests, discussions, or fixes and other code changes. Please refer
to the [contributing guidelines](CONTRIBUTING.md) if you are interested to
contribute. Please mind the [code of conduct](CODE_OF_CONDUCT.md) for all
interactions with the community.

## Contact
### Contact

For questions or suggestions regarding the code, please use the
[issue tracker][issue-tracker]. For any other inquiries, please contact us
@@ -245,6 +170,7 @@ by email: <[email protected]>
[badge-url-doi-zenodo]: <https://doi.org/10.5281/zenodo.13985958>
[conda]: <https://docs.conda.io/en/latest/miniconda.html>
[contact]: <https://zavolan.biozentrum.unibas.ch/>
[docs-documentation]: <https://htsinfer.readthedocs.io/>
[docs-api-results]: <https://htsinfer.readthedocs.io/en/latest/modules/htsinfer.html#htsinfer.models.Results>
[issue-tracker]: <https://github.com/zavolanlab/htsinfer/issues>
[mamba]: <https://mamba.readthedocs.io/en/latest/installation.html>
10 changes: 5 additions & 5 deletions docs/guides/examples.rst
@@ -1,13 +1,13 @@
Examples
========

HTSinfer provides easy-to-use commands for analyzing single- and paired-ended RNA-Seq libraries.
`HTSinfer` provides easy-to-use commands for analyzing single- and paired-ended RNA-Seq libraries.


Single-ended Library Example
----------------------------

To run HTSinfer on a single-ended RNA-Seq library, use the following command:
To run `HTSinfer` on a single-ended RNA-Seq library, use the following command:

.. code-block:: bash
@@ -16,13 +16,13 @@ To run HTSinfer on a single-ended RNA-Seq library, use the following command:
Paired-ended Library Example
----------------------------

To run HTSinfer on a paired-ended RNA-Seq library, use the following command:
To run `HTSinfer` on a paired-ended RNA-Seq library, use the following command:

.. code-block:: bash
htsinfer tests/files/adapter_1.fastq tests/files/adapter_2.fastq
Both commands will output the results in JSON format to `STDOUT` and the log to `STDERR`.
Both commands will output the results in JSON format to :code:`STDOUT` and the log to :code:`STDERR`.

Example Output
--------------
@@ -84,4 +84,4 @@ Here is a sample output for the paired-ended library:
}
}
For more details on the output structure, refer to the `Results` model in the API documentation.
For more details on the output structure, refer to the :code:`Results` model in the API documentation.
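
Because the JSON output is nested, it can help to flatten it into dotted paths before comparing runs or grepping for a field. The following is only a sketch: the keys in ``placeholder`` are invented for illustration and are not taken from the actual ``Results`` model.

```python
def flatten(node, prefix=""):
    """Flatten nested dictionaries into a {dotted.path: leaf value} mapping."""
    flat = {}
    for key, value in node.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            # Recurse into child objects, extending the dotted path.
            flat.update(flatten(value, path))
        else:
            flat[path] = value
    return flat


# Placeholder data with hypothetical keys (NOT real HTSinfer output).
placeholder = {"parent": {"child_a": 1, "child_b": None}, "other": "x"}
flat = flatten(placeholder)
```

Lists are treated as leaf values here; whether that is appropriate depends on the real output structure, which this sketch does not assume.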
4 changes: 2 additions & 2 deletions docs/guides/installation.rst
@@ -19,12 +19,12 @@ To install `HTSinfer`, first clone the repository and install the dependencies v
.. note::

Creating the environment may take some time. It is strongly recommended to install `Mamba <https://mamba.readthedocs.io/en/latest/installation.html>`_ and replace ``conda`` with ``mamba`` in the previous commands for faster installation.
Creating the environment may take some time. It is strongly recommended to install `Mamba <https://mamba.readthedocs.io/en/latest/installation.html>`_ and replace :code:`conda` with :code:`mamba` in the previous commands for faster installation.

Activate the Conda Environment
------------------------------

After the installation is complete, activate the `htsinfer` Conda environment with:
After the installation is complete, activate the :code:`htsinfer` Conda environment with:

.. code-block:: bash
38 changes: 31 additions & 7 deletions docs/guides/usage.rst
@@ -1,7 +1,7 @@
Usage
=====

This sections describes the general usage of `HTSinfer`.
This section describes the general usage of `HTSinfer`.

General Usage
-------------
@@ -28,20 +28,44 @@ General Usage
[-h] [--version]
PATH [PATH]
The above command allows the user to infer metadata for single- or paired-ended RNA-Seq libraries by specifying file paths and relevant parameters. The tool outputs metadata in JSON format to `STDOUT` and logs to `STDERR`.
The above command allows the user to infer metadata for single- or paired-ended RNA-Seq libraries by specifying file paths and relevant parameters. The tool outputs metadata in JSON format to :code:`STDOUT` and logs to :code:`STDERR`.
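
Since results go to STDOUT and logs to STDERR, a calling script can capture the two streams separately and parse only the former as JSON. The sketch below illustrates this with Python's standard library. The helper name ``run_and_parse`` is hypothetical, and the command it runs is a stand-in that mimics the documented stream behaviour so that the example works without HTSinfer installed.

```python
import json
import subprocess
import sys


def run_and_parse(command):
    """Run a command; return (parsed JSON from STDOUT, raw STDERR text)."""
    completed = subprocess.run(
        command,
        capture_output=True,  # keep STDOUT and STDERR as separate streams
        text=True,            # decode bytes to str
        check=True,           # raise CalledProcessError on non-zero exit
    )
    return json.loads(completed.stdout), completed.stderr


# Stand-in command (NOT HTSinfer): prints JSON to STDOUT, a log line to STDERR.
stand_in = [
    sys.executable,
    "-c",
    "import json, sys; print(json.dumps({'demo': True})); "
    "print('log line', file=sys.stderr)",
]
results, log = run_and_parse(stand_in)
```

With HTSinfer installed, the stand-in could presumably be replaced by a list such as ``["htsinfer", "tests/files/adapter_single.fastq"]``.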

Command-line Options
---------------------

Available command-line parameters are categorized as follows:

- **General Options**: These include specifying directories, verbosity level, and other global settings.
- **Library-specific Options**: These parameters allow the user to modify settings related to the input data, such as transcript references, adapter sequences, and match thresholds.
- **Output Options**: These settings control the output format, including the number of records and the output destination.
- **Meta Options**: The user can also control the behavior of the tool with meta options such as cleanup regimes, thread count, and version information.
- **General Options**:
- :code:`--output-directory`: Path where output data will be saved.
- :code:`--temporary-directory`: Path for storing temporary files generated during execution.
- :code:`--cleanup-regime`: Specifies which data should be kept after completion. Available options are :code:`DEFAULT`, :code:`KEEP_ALL`, :code:`KEEP_NONE`, and :code:`KEEP_RESULTS`.
- :code:`--verbosity`: Controls the verbosity level of log output; options are :code:`DEBUG`, :code:`INFO`, :code:`WARN`, :code:`ERROR`, and :code:`CRITICAL`.

For a complete list of all available options, use the following command:
- **Library-specific Options**:
- :code:`PATH [PATH]`: Path(s) to the RNA-Seq input data. For paired-end libraries, provide paths to both mate files.
- :code:`--transcripts`: Path to the FASTA file containing transcript sequences for reference.
- :code:`--read-layout-adapters`: Path to a file with 3' adapter sequences (one sequence per line) used to identify adapter content.
- :code:`--read-layout-min-match-percentage`: Minimum percentage of reads containing an adapter for it to be considered as the library’s 3’-end adapter.
- :code:`--read-layout-min-frequency-ratio`: Minimum frequency ratio between the most and second most frequent adapters to select the 3’-end adapter.
- :code:`--library-source-min-match-percentage`: Minimum percentage of reads aligning with a library source for it to be considered representative of the library.
- :code:`--library-source-min-frequency-ratio`: Minimum frequency ratio between primary and secondary library sources, ensuring only the most prominent source is identified.
- :code:`--library-type-max-distance`: Maximum allowable distance between read pairs to classify the library type.
- :code:`--library-type-mates-cutoff`: Ratio cutoff to determine the consistency of mate orientation in paired-end reads.
- :code:`--read-orientation-min-mapped-reads`: Minimum number of mapped reads to ensure reliable inference of read orientation.
- :code:`--read-orientation-min-fraction`: Minimum fraction (must exceed 0.5) of reads supporting a given orientation to confirm its accuracy.

- **Processing and Performance Options**:
- :code:`--records`: Limits the number of input records to process; setting this to 0 will process all records.
- :code:`--threads`: Specifies the number of threads for concurrent processing to optimize performance.
- :code:`--tax-id`: Taxonomy ID for the sample source, aiding in organism-specific analyses.
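
When calling the tool from a script, the options listed above can be assembled into an argument list programmatically. The following is a minimal sketch, not part of HTSinfer itself: ``build_command`` and ``KNOWN_OPTIONS`` are hypothetical names, and the only checks performed are those stated in this guide (one or two input paths, and a read orientation fraction above 0.5).

```python
# Long option names copied from the usage synopsis (without the leading "--").
KNOWN_OPTIONS = {
    "output-directory", "temporary-directory", "cleanup-regime",
    "records", "threads", "transcripts", "read-layout-adapters",
    "read-layout-min-match-percentage", "read-layout-min-frequency-ratio",
    "library-source-min-match-percentage",
    "library-source-min-frequency-ratio",
    "library-type-max-distance", "library-type-mates-cutoff",
    "read-orientation-min-mapped-reads", "read-orientation-min-fraction",
    "tax-id", "verbosity",
}


def build_command(paths, **options):
    """Return an argument list for `htsinfer`; hypothetical helper."""
    if len(paths) not in (1, 2):
        raise ValueError("expected one path (single) or two paths (paired)")
    command = ["htsinfer"]
    for name, value in options.items():
        flag = name.replace("_", "-")  # e.g. tax_id -> tax-id
        if flag not in KNOWN_OPTIONS:
            raise ValueError(f"unknown option: --{flag}")
        if flag == "read-orientation-min-fraction" and float(value) <= 0.5:
            raise ValueError("--read-orientation-min-fraction must exceed 0.5")
        command += [f"--{flag}", str(value)]
    return command + [str(path) for path in paths]


command = build_command(
    ["tests/files/adapter_1.fastq", "tests/files/adapter_2.fastq"],
    records=0,   # 0 processes all records
    threads=4,
)
```

The resulting list can be passed directly to ``subprocess.run``; keeping it as a list avoids shell quoting issues with paths.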

Meta Options
------------

For help or version information, use the following:

.. code-block:: bash
htsinfer --help
htsinfer --version
