Commit c0fadea: rename to gtars

nleroy917 committed Jun 11, 2024
1 parent a481311 commit c0fadea
Showing 63 changed files with 62 additions and 62 deletions.
2 changes: 1 addition & 1 deletion .vscode/settings.json
@@ -1,6 +1,6 @@
 {
     "rust-analyzer.linkedProjects": [
-        "./genimtools/Cargo.toml",
+        "./gtars/Cargo.toml",
         "./bindings/Cargo.toml",
     ]
 }
24 changes: 12 additions & 12 deletions README.md
@@ -1,10 +1,10 @@
 <h1 align="center">
-    <img src="genimtools/docs/logo.svg" alt="genimtools logo" height="100px">
+    <img src="gtars/docs/logo.svg" alt="gtars logo" height="100px">
 </h1>
 
-`genimtools` is a rust crate that provides a set of tools for working with genomic interval data. Its primary goal is to provide processors for our python package, [`geniml`](https://github.com/databio/geniml), a library for machine learning on genomic intervals. However, it can be used as a standalone library for working with genomic intervals as well.
+`gtars` is a rust crate that provides a set of tools for working with genomic interval data. Its primary goal is to provide processors for our python package, [`geniml`](https://github.com/databio/geniml), a library for machine learning on genomic intervals. However, it can be used as a standalone library for working with genomic intervals as well.
 
-`genimtools` provides three things:
+`gtars` provides three things:
 
 1. A rust library crate.
 2. A command-line interface, written in rust.
@@ -14,24 +14,24 @@
 
 This repo is organized like so:
 
-1. A rust library crate (`/genimtools/lib.rs`) that provides functions, traits, and structs for working with genomic interval data.
-2. A rust binary crate (in `/genimtools/main.rs`), a small, wrapper command-line interface for the library crate.
+1. A rust library crate (`/gtars/lib.rs`) that provides functions, traits, and structs for working with genomic interval data.
+2. A rust binary crate (in `/gtars/main.rs`), a small, wrapper command-line interface for the library crate.
 3. A rust crate (in `/bindings`) that provides Python bindings, and a resulting Python package, so that it can be used within Python.
 
 This repository is a work in progress, and still in early development.
 
 ## Installation
-To install `genimtools`, you must have the rust toolchain installed. You can install it by following the instructions [here](https://www.rust-lang.org/tools/install).
+To install `gtars`, you must have the rust toolchain installed. You can install it by following the instructions [here](https://www.rust-lang.org/tools/install).
 
-You may build the binary locally using `cargo build --release`. This will create a binary in `target/release/genimtools`. You can then add this to your path, or run it directly.
+You may build the binary locally using `cargo build --release`. This will create a binary in `target/release/gtars`. You can then add this to your path, or run it directly.
 
 ## Usage
-`genimtools` is very early in development, and as such, it does not have a lot of functionality yet. However, it does have a few useful tools. To see the available tools, run `genimtools --help`. To see the help for a specific tool, run `genimtools <tool> --help`.
+`gtars` is very early in development, and as such, it does not have a lot of functionality yet. However, it does have a few useful tools. To see the available tools, run `gtars --help`. To see the help for a specific tool, run `gtars <tool> --help`.
 
-Alternatively, you can link `genimtools` as a library in your rust project. To do so, add the following to your `Cargo.toml` file:
+Alternatively, you can link `gtars` as a library in your rust project. To do so, add the following to your `Cargo.toml` file:
 ```toml
 [dependencies]
-genimtools = { git = "https://github.com/databio/genimtools" }
+gtars = { git = "https://github.com/databio/gtars" }
 ```
 
 ## Testing
@@ -42,13 +42,13 @@ To run the tests, run `cargo test`.
 If you'd like to add a new tool, you can do so by creating a new module within the src folder.
 
 ### New public library crate tools
-If you want this to be available to users of `genimtools`, you can add it to the `genimtools` library crate as well. To do so, add the following to `src/lib.rs`:
+If you want this to be available to users of `gtars`, you can add it to the `gtars` library crate as well. To do so, add the following to `src/lib.rs`:
 ```rust
 pub mod <tool_name>;
 ```
 
 ### New binary crate tools
-Finally, if you want to have command-line functionality, you can add it to the `genimtools` binary crate. This requires two steps:
+Finally, if you want to have command-line functionality, you can add it to the `gtars` binary crate. This requires two steps:
 1. Create a new `cli` using `clap` inside the `interfaces` module of `src/cli.rs`:
 ```rust
 pub fn make_new_tool_cli() -> Command {
6 changes: 3 additions & 3 deletions bindings/Cargo.toml
@@ -1,16 +1,16 @@
 [package]
-name = "genimtools-py"
+name = "gtars-py"
 version = "0.0.13"
 edition = "2021"
 
 # See more keys and their definitions at https://doc.rust-lang.org/cargo/reference/manifest.html
 [lib]
-name = "genimtools"
+name = "gtars"
 crate-type = ["cdylib"]
 
 [dependencies]
 anyhow = "1.0.82"
-genimtools = { path = "../genimtools" }
+gtars = { path = "../gtars" }
 pyo3 = { version = "0.21", features=["anyhow", "extension-module"] }
 numpy = "0.21"
 # pyo3-tch = { git = "https://github.com/LaurentMazare/tch-rs" }
10 changes: 5 additions & 5 deletions bindings/README.md
@@ -1,16 +1,16 @@
-# genimtools
-This is a python wrapper around the `genimtools` crate. It provides an easy interface for using `genimtools` in python. It is currently in early development, and as such, it does not have a lot of functionality yet, but new tools are being worked on right now.
+# gtars
+This is a python wrapper around the `gtars` crate. It provides an easy interface for using `gtars` in python. It is currently in early development, and as such, it does not have a lot of functionality yet, but new tools are being worked on right now.
 
 ## Installation
-You can get `genimtools` from PyPI:
+You can get `gtars` from PyPI:
 ```bash
-pip install genimtools
+pip install gtars
 ```
 
 ## Usage
 Import the package, and use the tools:
 ```python
-import genimtools as gt
+import gtars as gt
 
 gt.prune_universe(...)
 ```
1 change: 0 additions & 1 deletion bindings/genimtools/__init__.py

This file was deleted.

1 change: 0 additions & 1 deletion bindings/genimtools/ailist/__init__.py

This file was deleted.

1 change: 0 additions & 1 deletion bindings/genimtools/models/__init__.py

This file was deleted.

1 change: 0 additions & 1 deletion bindings/genimtools/tokenizers/__init__.py

This file was deleted.

1 change: 0 additions & 1 deletion bindings/genimtools/utils/__init__.py

This file was deleted.

1 change: 1 addition & 0 deletions bindings/gtars/__init__.py
@@ -0,0 +1 @@
+from .gtars import * # noqa: F403
File renamed without changes.
1 change: 1 addition & 0 deletions bindings/gtars/ailist/__init__.py
@@ -0,0 +1 @@
+from .gtars.ailist import * # noqa: F403
File renamed without changes.
1 change: 1 addition & 0 deletions bindings/gtars/models/__init__.py
@@ -0,0 +1 @@
+from .gtars.models import * # noqa: F403
File renamed without changes.
1 change: 1 addition & 0 deletions bindings/gtars/tokenizers/__init__.py
@@ -0,0 +1 @@
+from .gtars.tokenizers import * # noqa: F403
File renamed without changes.
1 change: 1 addition & 0 deletions bindings/gtars/utils/__init__.py
@@ -0,0 +1 @@
+from .gtars.utils import * # noqa: F403
File renamed without changes.
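Each of the new `__init__.py` files above is a one-line shim: the compiled extension lives in a submodule, and the wildcard import lifts its public names up to the package level. A minimal stdlib-only sketch of the same pattern, using a throwaway package whose names (`demo_pkg`, `_native`, `prune_universe`) are made up for illustration:

```python
import importlib
import os
import sys
import tempfile

# Lay out a tiny package on disk; _native.py stands in for the
# compiled Rust extension module that the bindings ship.
root = tempfile.mkdtemp()
pkg = os.path.join(root, "demo_pkg")
os.makedirs(pkg)
with open(os.path.join(pkg, "_native.py"), "w") as f:
    f.write("def prune_universe():\n    return 'pruned'\n")
with open(os.path.join(pkg, "__init__.py"), "w") as f:
    # Same one-line re-export shim as the __init__.py files in the diff.
    f.write("from ._native import *  # noqa: F403\n")

sys.path.insert(0, root)
demo_pkg = importlib.import_module("demo_pkg")

# The submodule's public function is now visible at package level.
print(demo_pkg.prune_universe())  # pruned
```

The `# noqa: F403` comment only silences the linter warning about wildcard imports; the re-export itself is plain Python semantics.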
2 changes: 1 addition & 1 deletion bindings/pyproject.toml
@@ -3,7 +3,7 @@ requires = ["maturin>=1.3,<2.0"]
 build-backend = "maturin"
 
 [project]
-name = "genimtools"
+name = "gtars"
 requires-python = ">=3.8"
 classifiers = [
     "Programming Language :: Rust",
2 changes: 1 addition & 1 deletion bindings/src/ailist/mod.rs
@@ -1,4 +1,4 @@
-use genimtools::ailist::{AIList, Interval};
+use gtars::ailist::{AIList, Interval};
 use pyo3::{prelude::*, pyclass};
 
 use crate::models::PyInterval;
10 changes: 5 additions & 5 deletions bindings/src/lib.rs
@@ -9,7 +9,7 @@ mod utils;
 pub const VERSION: &str = env!("CARGO_PKG_VERSION");
 
 #[pymodule]
-fn genimtools(py: Python, m: &Bound<'_, PyModule>) -> PyResult<()> {
+fn gtars(py: Python, m: &Bound<'_, PyModule>) -> PyResult<()> {
     let tokenize_module = pyo3::wrap_pymodule!(tokenizers::tokenizers);
     let ailist_module = pyo3::wrap_pymodule!(ailist::ailist);
     let utils_module = pyo3::wrap_pymodule!(utils::utils);
@@ -25,10 +25,10 @@ fn genimtools(py: Python, m: &Bound<'_, PyModule>) -> PyResult<()> {
     let sys_modules: &Bound<'_, PyDict> = binding.downcast()?;
 
     // set names of submodules
-    sys_modules.set_item("genimtools.tokenizers", m.getattr("tokenizers")?)?;
-    sys_modules.set_item("genimtools.ailist", m.getattr("ailist")?)?;
-    sys_modules.set_item("genimtools.utils", m.getattr("utils")?)?;
-    sys_modules.set_item("genimtools.models", m.getattr("models")?)?;
+    sys_modules.set_item("gtars.tokenizers", m.getattr("tokenizers")?)?;
+    sys_modules.set_item("gtars.ailist", m.getattr("ailist")?)?;
+    sys_modules.set_item("gtars.utils", m.getattr("utils")?)?;
+    sys_modules.set_item("gtars.models", m.getattr("models")?)?;
 
     // add constants
     m.add("__version__", VERSION)?;
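The `set_item` calls above address a pyo3 quirk: wrapped submodules become attributes of the parent extension module, but dotted imports like `from gtars.tokenizers import ...` only resolve once the submodule is registered in `sys.modules` under its dotted name. A pure-Python sketch of that mechanism (module names here are hypothetical, not part of the package):

```python
import sys
import types

# Build a parent module and a child module by hand, mimicking what the
# pyo3 code does for the compiled extension and its wrapped submodules.
parent = types.ModuleType("demo_parent")
child = types.ModuleType("demo_parent.child")
child.answer = 42

# Attach the child as an attribute, then register it under its dotted
# name so the import machinery can find it directly.
parent.child = child
sys.modules["demo_parent"] = parent
sys.modules["demo_parent.child"] = child

# Both access styles now work:
import demo_parent.child
from demo_parent.child import answer
print(answer)  # 42
```

Without the second `sys.modules` entry, `demo_parent.child` would still be reachable as an attribute, but `from demo_parent.child import answer` would raise `ModuleNotFoundError`.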
2 changes: 1 addition & 1 deletion bindings/src/models/region.rs
@@ -5,7 +5,7 @@ use pyo3::exceptions::PyTypeError;
 use pyo3::prelude::*;
 
 use anyhow::Result;
-use genimtools::common::models::region::Region;
+use gtars::common::models::region::Region;
 
 use crate::models::PyUniverse;
2 changes: 1 addition & 1 deletion bindings/src/models/region_set.rs
@@ -6,7 +6,7 @@ use numpy::ndarray::Array;
 use numpy::{IntoPyArray, PyArray1};
 
 use anyhow::Result;
-use genimtools::common::utils::extract_regions_from_bed_file;
+use gtars::common::utils::extract_regions_from_bed_file;
 
 use crate::models::{PyRegion, PyTokenizedRegion, PyUniverse};
2 changes: 1 addition & 1 deletion bindings/src/models/universe.rs
@@ -5,7 +5,7 @@ use pyo3::prelude::*;
 use anyhow::Result;
 
 use crate::models::PyRegion;
-use genimtools::common::models::Universe;
+use gtars::common::models::Universe;
 
 #[pyclass(name = "Universe")]
 #[derive(Clone, Debug)]
8 changes: 4 additions & 4 deletions bindings/src/tokenizers/fragments_tokenizer.rs
@@ -1,13 +1,13 @@
-use genimtools::tokenizers::FragmentTokenizer;
-use genimtools::tokenizers::TreeTokenizer;
+use gtars::tokenizers::FragmentTokenizer;
+use gtars::tokenizers::TreeTokenizer;
 use pyo3::prelude::*;
 
 use super::PyTokenizedRegionSet;
 use super::PyUniverse;
 
 #[pyclass(name = "FragmentTokenizer")]
 pub struct PyFragmentTokenizer {
-    pub tokenizer: genimtools::tokenizers::FragmentTokenizer<TreeTokenizer>,
+    pub tokenizer: gtars::tokenizers::FragmentTokenizer<TreeTokenizer>,
     pub universe: Py<PyUniverse>, // this is a Py-wrapped version self.tokenizer.universe for performance reasons
 }
 
@@ -17,7 +17,7 @@ impl PyFragmentTokenizer {
     pub fn new(path: String) -> PyResult<Self> {
         Python::with_gil(|py| {
             let path = std::path::Path::new(&path);
-            let tokenizer = genimtools::tokenizers::TreeTokenizer::try_from(path)?;
+            let tokenizer = gtars::tokenizers::TreeTokenizer::try_from(path)?;
             let frag_tokenizer = FragmentTokenizer::new(tokenizer);
             let py_universe: PyUniverse = frag_tokenizer.tokenizer.universe.to_owned().into();
             let py_universe_bound = Py::new(py, py_universe)?;
6 changes: 3 additions & 3 deletions bindings/src/tokenizers/tree_tokenizer.rs
@@ -1,13 +1,13 @@
-use genimtools::tokenizers::traits::SpecialTokens;
+use gtars::tokenizers::traits::SpecialTokens;
 use pyo3::prelude::*;
 use pyo3::types::PyAny;
 
 use anyhow::Result;
 
 use std::path::Path;
 
-use genimtools::common::models::RegionSet;
-use genimtools::tokenizers::{Tokenizer, TreeTokenizer};
+use gtars::common::models::RegionSet;
+use gtars::tokenizers::{Tokenizer, TreeTokenizer};
 
 use crate::models::{PyRegion, PyTokenizedRegionSet, PyUniverse};
 use crate::utils::extract_regions_from_py_any;
8 changes: 4 additions & 4 deletions bindings/src/utils/mod.rs
@@ -4,7 +4,7 @@ use pyo3::prelude::*;
 use pyo3::types::{PyAny, PyIterator};
 
 use anyhow::Result;
-use genimtools::common::models::{Region, RegionSet};
+use gtars::common::models::{Region, RegionSet};
 
 // this is for internal use only
 pub fn extract_regions_from_py_any(regions: &Bound<'_, PyAny>) -> Result<RegionSet> {
@@ -20,7 +20,7 @@ pub fn extract_regions_from_py_any(regions: &Bound<'_, PyAny>) -> Result<RegionSet> {
         .into());
     }
 
-    let regions = genimtools::common::utils::extract_regions_from_bed_file(regions);
+    let regions = gtars::common::utils::extract_regions_from_bed_file(regions);
     match regions {
         Ok(regions) => return Ok(RegionSet::from(regions)),
         Err(e) => return Err(pyo3::exceptions::PyValueError::new_err(e.to_string()).into()),
@@ -55,13 +55,13 @@ pub fn extract_regions_from_py_any(regions: &Bound<'_, PyAny>) -> Result<RegionSet> {
 
 #[pyfunction]
 pub fn write_tokens_to_gtok(filename: &str, tokens: Vec<u32>) -> PyResult<()> {
-    genimtools::io::write_tokens_to_gtok(filename, &tokens)?;
+    gtars::io::write_tokens_to_gtok(filename, &tokens)?;
    Ok(())
 }
 
 #[pyfunction]
 pub fn read_tokens_from_gtok(filename: &str) -> PyResult<Vec<u32>> {
-    let tokens = genimtools::io::read_tokens_from_gtok(filename)?;
+    let tokens = gtars::io::read_tokens_from_gtok(filename)?;
     Ok(tokens)
 }
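The two `#[pyfunction]` wrappers above expose the crate's `.gtok` reader and writer to Python as plain functions taking a filename and a list of `u32` token IDs. The real on-disk format is defined by the Rust crate; as a rough, stdlib-only illustration of what round-tripping u32 token IDs through a compact binary file involves (this is not the actual `.gtok` layout), one might sketch:

```python
import struct
import tempfile

def write_tokens(filename, tokens):
    """Pack each token ID as a little-endian u32 (illustrative, not .gtok)."""
    with open(filename, "wb") as f:
        f.write(struct.pack(f"<{len(tokens)}I", *tokens))

def read_tokens(filename):
    """Read the file back into a list of u32 token IDs."""
    with open(filename, "rb") as f:
        data = f.read()
    return list(struct.unpack(f"<{len(data) // 4}I", data))

# Round-trip a few token IDs through a temporary file.
with tempfile.NamedTemporaryFile(suffix=".gtok", delete=False) as tmp:
    path = tmp.name

write_tokens(path, [42, 101, 999])
print(read_tokens(path))  # [42, 101, 999]
```

Fixed-width binary encoding like this is why a `.gtok` file is far smaller than the equivalent BED text it was tokenized from.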
2 changes: 1 addition & 1 deletion genimtools/Cargo.toml → gtars/Cargo.toml
@@ -1,5 +1,5 @@
 [package]
-name = "genimtools"
+name = "gtars"
 version = "0.0.13"
 edition = "2021"
 description = "Performance-critical tools to manipulate, analyze, and process genomic interval data. Primarily focused on building tools for geniml - our genomic machine learning python package."
4 changes: 2 additions & 2 deletions genimtools/docs/changelog.md → gtars/docs/changelog.md
@@ -39,7 +39,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
 
 ## [0.0.5]
 - add many "core utils"
-- move `gtokenizers` into this package inside `genimtools::tokenizers`
+- move `gtokenizers` into this package inside `gtars::tokenizers`
 - create `tokenize` cli
 - add tests for core utils and tokenizers
 - RegionSet is now backed by a polars DataFrame
@@ -57,4 +57,4 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
 ## [0.0.1]
 - initial setup of repository
 - two main wrappers: 1) wrapper binary crate, and 2) wrapper library crate
-- `genimtools` can be used as a library crate. or as a command line tool
+- `gtars` can be used as a library crate. or as a command line tool
2 changes: 1 addition & 1 deletion genimtools/docs/logo.svg → gtars/docs/logo.svg
(SVG file contents not displayed)
11 files renamed without changes.
4 changes: 2 additions & 2 deletions genimtools/src/lib.rs → gtars/src/lib.rs
@@ -1,6 +1,6 @@
-//! # Genimtools: *<small>Performance-critical tools to manipulate, analyze, and process genomic interval data. </small>*
+//! # gtars: *<small>Performance-critical tools to manipulate, analyze, and process genomic interval data. </small>*
 //!
-//! `genimtools` is a rust crate that provides a set of tools for working with genomic interval data. Its primary goal is to provide
+//! `gtars` is a rust crate that provides a set of tools for working with genomic interval data. Its primary goal is to provide
 //! processors for our python package, [`geniml`](https://github.com/databio/geniml), a library for machine learning on genomic intervals.
 //! However, it can be used as a standalone library for working with genomic intervals as well.
 //!
4 changes: 2 additions & 2 deletions genimtools/src/main.rs → gtars/src/main.rs
@@ -2,8 +2,8 @@ use anyhow::Result;
 use clap::Command;
 
 // go through the library crate to get the interfaces
-use genimtools::tokenizers;
-// use genimtools::uniwig;
+use gtars::tokenizers;
+// use gtars::uniwig;
 
 pub mod consts {
     pub const VERSION: &str = env!("CARGO_PKG_VERSION");
3 files renamed without changes.
@@ -11,7 +11,7 @@ pub mod config;
 
 /// constants for the tokenizer module.
 pub mod consts {
-    /// command for the `genimtools` cli
+    /// command for the `gtars` cli
     pub const TOKENIZE_CMD: &str = "tokenize";
     pub const UNIVERSE_FILE_NAME: &str = "universe.bed";
 }
13 files renamed without changes.
12 changes: 6 additions & 6 deletions genimtools/tests/test.rs → gtars/tests/test.rs
@@ -3,9 +3,9 @@ use std::path::Path;
 use rstest::*;
 use tempfile::NamedTempFile;
 
-use genimtools::common::models::{Region, RegionSet};
-use genimtools::io::{append_tokens_to_gtok_file, init_gtok_file, read_tokens_from_gtok};
-use genimtools::tokenizers::{Tokenizer, TreeTokenizer};
+use gtars::common::models::{Region, RegionSet};
+use gtars::io::{append_tokens_to_gtok_file, init_gtok_file, read_tokens_from_gtok};
+use gtars::tokenizers::{Tokenizer, TreeTokenizer};
 
 #[fixture]
 fn path_to_data() -> &'static str {
@@ -50,7 +50,7 @@ fn path_to_gtok_file() -> &'static str {
 mod tests {
     use std::io::Read;
 
-    use genimtools::common::utils::extract_regions_from_bed_file;
+    use gtars::common::utils::extract_regions_from_bed_file;
 
     use super::*;
 
@@ -205,7 +205,7 @@ mod tests {
     // let path_to_data = Path::new(path_to_data);
     // let outdir = "tests/data/out";
 
-    // let res = genimtools::tools::pre_tokenize_data(path_to_data, outdir, &tokenizer);
+    // let res = gtars::tools::pre_tokenize_data(path_to_data, outdir, &tokenizer);
     // assert!(res.is_ok());
     // }
 
@@ -215,7 +215,7 @@ mod tests {
     // let path_to_data = Path::new(path_to_tokenize_bed_file);
     // let outdir = "tests/data/out";
 
-    // let res = genimtools::tools::pre_tokenize_data(path_to_data, outdir, &tokenizer);
+    // let res = gtars::tools::pre_tokenize_data(path_to_data, outdir, &tokenizer);
     // assert!(res.is_ok());
     // }
 }
