initial upload
andr11b committed Dec 24, 2024
0 parents commit 6b2b571
Showing 26 changed files with 1,883 additions and 0 deletions.
63 changes: 63 additions & 0 deletions .github/workflows/python.yml
@@ -0,0 +1,63 @@
name: Python package

on:
  push:
    branches:
      - main
    tags:
      - 'v*'
    paths:
      - 'fls/**'
      - 'env/**'
      - '.style.yapf'
  pull_request:
    paths:
      - 'fls/**'
      - 'env/**'
      - '.style.yapf'
    types: [opened, synchronize, reopened]

jobs:
  test:
    runs-on: ubuntu-latest

    steps:
      - name: Checkout code
        uses: actions/checkout@v3

      - name: Set up Python
        uses: actions/setup-python@v4
        with:
          python-version: '3.8'

      - name: Install dependencies
        run: |
          python -m pip install --upgrade pip
          pip install pytest
      - name: Run tests
        run: |
          python env/tests/run_tests.py
  publish:
    runs-on: ubuntu-latest
    if: startsWith(github.ref, 'refs/tags/v')
    needs: test

    steps:
      - name: Checkout code
        uses: actions/checkout@v3

      - name: Set up Python
        uses: actions/setup-python@v4
        with:
          python-version: '3.x'

      - name: Install flit
        run: python -m pip install --upgrade pip flit

      - name: Publish to PyPI
        env:
          FLIT_USERNAME: ${{ secrets.PYPI_USERNAME }}
          FLIT_PASSWORD: ${{ secrets.PYPI_PASSWORD }}
        run: flit publish
4 changes: 4 additions & 0 deletions .gitignore
@@ -0,0 +1,4 @@
__pycache__
.pytest_cache
/*.code-workspace
*.bak*
36 changes: 36 additions & 0 deletions .style.yapf
@@ -0,0 +1,36 @@
[style]
BASED_ON_STYLE = pep8
CONTINUATION_ALIGN_STYLE = FIXED
CONTINUATION_INDENT_WIDTH = 4
COLUMN_LIMIT = 119
BLANK_LINE_BEFORE_NESTED_CLASS_OR_DEF = True
DEDENT_CLOSING_BRACKETS = True
EACH_DICT_ENTRY_ON_SEPARATE_LINE = True
ARITHMETIC_PRECEDENCE_INDICATION = True
ALLOW_SPLIT_BEFORE_DICT_VALUE = True
JOIN_MULTIPLE_LINES = True
INDENT_BLANK_LINES = False
INDENT_CLOSING_BRACKETS = False
INDENT_DICTIONARY_VALUE = True
INDENT_WIDTH = 4
NO_SPACES_AROUND_SELECTED_BINARY_OPERATORS = True
SPACES_AROUND_DICT_DELIMITERS = False
SPACES_AROUND_LIST_DELIMITERS = False
SPACES_AROUND_POWER_OPERATOR = False
SPACES_AROUND_SUBSCRIPT_COLON = False
SPACES_AROUND_TUPLE_DELIMITERS = False
SPACES_BEFORE_COMMENT = "2"
SPLIT_ALL_COMMA_SEPARATED_VALUES = True
SPLIT_ARGUMENTS_WHEN_COMMA_TERMINATED = True
SPLIT_BEFORE_ARITHMETIC_OPERATOR = True
SPLIT_BEFORE_BITWISE_OPERATOR = True
SPLIT_BEFORE_CLOSING_BRACKET = True
SPLIT_BEFORE_DICT_SET_GENERATOR = True
SPLIT_BEFORE_DOT = True
SPLIT_BEFORE_EXPRESSION_AFTER_OPENING_PAREN = True
SPLIT_BEFORE_FIRST_ARGUMENT = True
SPLIT_BEFORE_LOGICAL_OPERATOR = True
SPLIT_BEFORE_NAMED_ASSIGNS = True
SPLIT_COMPLEX_COMPREHENSION = True
SPACE_INSIDE_BRACKETS = False
COALESCE_BRACKETS = True
4 changes: 4 additions & 0 deletions CHANGELOG.md
@@ -0,0 +1,4 @@
# Changelog

## 2024-11-22 (v0.0.1.241122)
- First release
21 changes: 21 additions & 0 deletions LICENSE.txt
@@ -0,0 +1,21 @@
MIT License

Copyright (c) 2024, Andrii Burkatskyi aka andr11b

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
8 changes: 8 additions & 0 deletions MANIFEST.in
@@ -0,0 +1,8 @@
recursive-include fls *.py

include README.md LICENSE.txt CHANGELOG.md

recursive-exclude env *
exclude *.yapf
exclude *.pyc
exclude __pycache__
208 changes: 208 additions & 0 deletions README.md
@@ -0,0 +1,208 @@
# File List Sieve

## What is it?

**File List Sieve** or simply **FLS** is a Python module for matching file system paths against patterns.

This functionality closely resembles the behavior of `.gitignore` and `.dockerignore`, making it intuitive for developers familiar with those systems. Unlike them, however, it lets you decide what a match means: whether matched files are to be ignored or processed.

**FLS** provides a flexible way to configure match rules for any project structure by using a custom rule file (e.g., `.fls`) to determine which files and directories should be processed. The system supports features like:
* Nested directories with inherited rules.
* Pattern negation using `!`.
* Wildcard matching for patterns (`*`, `?`, `**`, `[abc]`, etc.).


## Possible Use Cases

- **Ignoring Temporary Files**: In projects where temporary files or directories are created (e.g., during compilation or testing), FLS can be used to ignore these files when creating archives or versions.
- **Selective File Processing**: If only some of a project's files need to be processed, FLS allows you to precisely define which files to include and which to ignore based on patterns.
- **Release Optimization**: When preparing a project for release, unnecessary files (e.g., logs, temporary files, or other auxiliary data) can be automatically excluded.
- **Working with Large Codebases**: In large projects with many subdirectories and files, FLS allows you to easily create and maintain rules for selective file handling.

## Syntax

As mentioned above, `.fls` rule files are similar to the well-known `.dockerignore` and `.gitignore` files, with some minor differences. As in those systems, rules are defined by specifying paths to files and directories in `.fls` files, which can be placed at any nesting level.

### Key differences and behavior

* **Context awareness**:

In `.gitignore`, patterns like an asterisk `*` or a simple one such as `foo` match every file and directory, regardless of how deeply they are nested. However, in `.fls`, the same rules only match files and directories located in the current directory. In `.fls`, all patterns are **relative to the location of the `.fls` file**, and this location is referred to as the context of a rule (or pattern).
Patterns can also explicitly define deeper levels; for example, `foo/bar` will match `bar` inside a directory named `foo` within the current context.

* **No leading slash (`/`) anchoring**:

Unlike `.gitignore`, `.fls` does not use leading slashes to anchor patterns to the root. Patterns like `/foo` or `/bar/` are treated identically to `foo` or `bar/`.

* **Trailing slash (`/`) for directories**:

A trailing slash (`/`) specifically matches directories only. For example:

- `foo/` matches a directory named `foo` but does not match `foo/bar`.
- `foo` matches either a file or a directory named `foo`.

* **Non-greedy directory matching**:

Directory matches are *non-greedy*, meaning they do not extend to the content of the directory unless explicitly specified. For instance:

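- `foo/` matches only the `foo` directory itself.
- `foo/**` matches everything inside `foo`, recursively (see the `dirA/**` row in the rule table below, which excludes the directory itself).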
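
The behaviors described in this section can be illustrated with a minimal sketch. It is not the FLS implementation: `matches_in_context` is a hypothetical helper that compares a pattern and a context-relative path segment by segment with the standard library's `fnmatch.fnmatchcase`, and it deliberately does not handle `**`, negation, or backslash escapes.

```python
from fnmatch import fnmatchcase


def matches_in_context(pattern: str, rel_path: str, is_dir: bool) -> bool:
    """Hypothetical helper (not part of FLS): match a path given relative
    to the directory that holds the `.fls` file (the rule's context)."""
    # A trailing slash restricts the pattern to directories only.
    if pattern.endswith("/") and not is_dir:
        return False
    # A leading slash carries no meaning, so strip slashes on both ends.
    pattern_parts = pattern.strip("/").split("/")
    path_parts = rel_path.strip("/").split("/")
    # Requiring the same number of segments models both context anchoring
    # (no match at deeper levels) and non-greedy directory matching
    # (no match for the contents of a matched directory).
    if len(pattern_parts) != len(path_parts):
        return False
    # Matching per segment means `*` and `?` never cross a slash.
    return all(fnmatchcase(part, pat) for part, pat in zip(path_parts, pattern_parts))


print(matches_in_context("foo", "foo", is_dir=False))       # same level as the context
print(matches_in_context("foo", "sub/foo", is_dir=False))   # deeper level
print(matches_in_context("foo/", "foo/bar", is_dir=False))  # contents of foo
```

Under these assumptions the first call returns `True` and the other two return `False`, mirroring the context-awareness and non-greedy rules above.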

### Wildcards (globbing patterns)

Standard wildcards, also known as globbing patterns, are used for working with multiple files. Globbing is the process of expanding a wildcard pattern into a list of pathnames that match it. A string qualifies as a wildcard pattern if it includes any of the characters `?`, `*`, or `[`.

- A hash (`#`) signifies a comment. Lines starting with `#` are ignored.
```
# This is just a comment.
```
- A backslash (`\`) is used as an escape character to treat a special character literally.
```
# The pattern below will match a file named "#.txt"
\#.txt
```
- An asterisk (`*`) matches zero or more characters of any kind, excluding a slash (`/`).
```
# This pattern would match "`foobar`", "`foooobar`", and anything that
# starts with `foo` also including "`foo`" itself.
foo*
```
- An exclamation mark (`!`) indicates an exception. It is used to exclude specific files or directories from being matched by previous patterns.
```
# This ruleset matches all files ending with `.txt` but excludes
# `important.txt` from the match.
*.txt
!important.txt
```
- A question mark (`?`) matches exactly one character, excluding a slash (`/`).
```
# This pattern matches `hda`, `hdb`, `hdc`, and any other one-character
# variation, excluding slashes (`/`).
hd?
```
- A double asterisk (`**`) matches zero or more files and directories, including their contents, recursively.
```
# This will match all `.txt` files in any directory or subdirectory.
**/*.txt
```
- Square brackets (`[]`) specify a set or range of characters with a logical `OR` relationship, where any character within the brackets can match. Standard ranges include `[0-9]`, `[a-z]`, and `[A-Z]`. You can define subsets like `[0-4]` or `[a-d]`, combine ranges (e.g., `[0-9a-f]`), or mix ranges and individual characters (e.g., `[024abcXYZ]`).
```
# The next pattern matches `mam`, `mum`, or `mom`.
m[aou]m
# The next pattern matches `mam`, `mbm`, `mcm`, or `mdm`.
m[a-d]m
```
- `[!]` works as a logical `NOT`, inverting the character set specified in square brackets (`[]`). Unlike `[]`, which matches any character listed inside, `[!]` matches any character not listed between the brackets.
```
# The following pattern will match files starting with `file` that are
# followed by characters other than digits (e.g., `files`, `fileA`), but
# it will exclude files like `file0`, `file4` (those with digits `0-9`).
file[!0-9]
```
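
One way to make the wildcard rules above concrete is to translate a single pattern into a regular expression. The sketch below is only an illustration under assumptions, not how FLS is implemented: `glob_to_regex` is a hypothetical function, `**` is assumed to mean "any characters, slashes included", and line-level syntax (`#` comments, leading `!` negation, trailing-slash handling) is out of scope.

```python
import re


def glob_to_regex(pattern: str) -> re.Pattern:
    """Hypothetical translator (not part of FLS) for one wildcard pattern."""
    out = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "\\" and i + 1 < len(pattern):
            # Backslash: take the next character literally.
            out.append(re.escape(pattern[i + 1]))
            i += 2
        elif pattern.startswith("**", i):
            # Assumption: `**` matches any characters, including slashes.
            out.append(".*")
            i += 2
        elif ch == "*":
            out.append("[^/]*")  # zero or more characters, no slash
            i += 1
        elif ch == "?":
            out.append("[^/]")   # exactly one character, no slash
            i += 1
        elif ch == "[":
            end = pattern.find("]", i + 1)
            chars = pattern[i + 1:end] if end != -1 else ""
            negate = chars.startswith("!")
            if negate:
                chars = chars[1:]
            if not chars:
                # Unclosed or empty set: treat the bracket as a literal.
                out.append(re.escape(ch))
                i += 1
            else:
                chars = chars.replace("\\", r"\\").replace("^", r"\^")
                out.append("[" + ("^" if negate else "") + chars + "]")
                i = end + 1
        else:
            out.append(re.escape(ch))
            i += 1
    return re.compile("".join(out))


for pat, path in [("foo*", "foobar"), ("foo*", "foo/bar"), ("file[!0-9]", "file4")]:
    print(pat, path, bool(glob_to_regex(pat).fullmatch(path)))
```

Using `fullmatch` keeps the whole path anchored to the pattern, so `foo*` accepts `foobar` but rejects `foo/bar`.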
### Rule explanation
|           Pattern           |             Example matches             | Explanation |
| ---- | --------------- | ----------- |
| `file0.txt` | ~~`dirA/file0.txt`~~<br />~~`dirA/dirA/file0.txt`~~<br />`file0.txt` | The simplest pattern to match files and directories located at the top level of the context. |
| `dirA/` | `dirA/`<br />~~`dirA/file0.txt`~~<br />~~`dirB/dirA/`~~ | A trailing slash (`/`) indicates that patterns match directories only. Note that directories are matched in a *non-greedy* manner, excluding their contents. |
| `dirA/file0.txt`<br /> | `dirA/file0.txt`<br />~~`dirA/dirA/file0.txt`~~<br />~~`dirB/dirA/file0.txt`~~ | All patterns are anchored to the context level, matching the specified file path relative to it. |
| `*` | `dirA/`<br />~~`dirA/file0.txt`~~<br />`file0.txt` | A positive match for any file or directory located at the root level of the context. |
| `*/` | `dirA/`<br />~~`dirA/file0.txt`~~<br />~~`file0.txt`~~ | A positive match for any directory located at the root level of the context, without including its contents. |
| `*`<br>`!*/` | ~~`dirA/`~~<br />`file0.txt` | A trick to positively match all files at the root level of the context while excluding directories. |
| `*/*` | `dirA/dirA`<br />`dirA/file0.txt` | A positive match for any second-level objects. |
| `*/file0.txt` | `dirA/file0.txt`<br />~~`dirA/dirB/file0.txt`~~<br />`dirB/file0.txt`<br />~~`file0.txt`~~ | A more specific case of the previous pattern. |
| `*/*/` | `dirA/dirA/`<br />`dirA/dirB/`<br />~~`dirA/file0.txt`~~ | A positive match for any second-level directory. |
| `foo/*` | ~~`foo/`~~<br />`foo/foo/`<br />`foo/bar` | A pattern to positively match any object located directly inside the `foo` directory, excluding the `foo` directory itself. |
| `*/dirA/` | ~~`dirA/`~~<br />~~`dirA/dirB/`~~<br />~~`dirA/file0.txt`~~<br />~~`dirB/`~~<br />`dirB/dirA/` | A more specific case of `*/*/`: matches only second-level directories named `dirA`. |
| `foo*` | `foo`<br />`foobar`<br />`foooobar`<br />`foo.bar` | A positive match for any file or directory starting with `foo` (at the related context level, of course). |
| `*bar` | `foobar`<br />`foooobar`<br />`foo.bar`<br />`bar` | A positive match for any file or directory ending with `bar` (again, at the related context level). |
| `**` | `dirA/`<br />`dirA/dirA/`<br />`dirA/dirA/file0.log`<br />`dirA/file0.txt`<br />`file0.txt` | A positive match for all files and directories, including their contents, recursively.|
| `**/` | `dirA/`<br />`dirA/dirA/`<br />~~`dirA/dirA/file0.log`~~<br />~~`dirA/file0.txt`~~<br />~~`file0.txt`~~ | A positive match for all directories and their subdirectories, recursively. |
| `**`<br />`!**/` | ~~`dirA/`~~<br />~~`dirA/dirA/`~~<br />`dirA/dirA/file0.log`<br />`dirA/file0.txt`<br />`file0.txt` | A trick to positively match all files recursively while excluding directories. |
| `**/**` | ~~`dirA/`~~<br />`dirA/dirA/`<br />`dirA/dirA/file0.log`<br />`dirA/file0.txt`<br />~~`file0.txt`~~ | A recursive match for all objects located at the second level and deeper. |
| `**/**/` | ~~`dirA/`~~<br />`dirA/dirA/`<br />~~`dirA/dirA/file0.log`~~<br />~~`dirA/file0.txt`~~<br />~~`file0.txt`~~ | A recursive match for all directories located at the second level and deeper. |
| `dirA/**` | ~~`dirA/`~~<br />`dirA/dirB/`<br />`dirA/dirB/.../file0.txt` | A pattern to positively match any object inside the `dirA` directory, at any nesting level, recursively, excluding the `dirA` directory itself. |
| `dirA/**/file0.txt` | `dirA/dirA/file0.txt`<br />`dirA/dirA/dirA/file0.txt`<br />~~`dirA/file0.txt`~~ | The pattern will not match `dirA/file0.txt` because `/**/` requires at least one additional level of nesting between `dirA` and `file0.txt`. |
| `dirA/**file0.txt` | `dirA/dirA/file0.txt`<br />`dirA/dirA/dirA/file0.txt`<br />`dirA/file0.txt` | In contrast, the pattern matches `dirA/file0.txt` here, as `/**` allows matching files at any depth within `dirA`, including directly inside it. The slashes make the difference! |
| `foo**bar` | `foo/foo/bar`<br />`foo/bar/`<br />`foobar`<br />`foo/foobar/`<br />`foo.bar` | A pattern to recursively match any path starting with `foo` and ending with `bar`, regardless of nesting. |
| `foo?.bar` | `foo0.bar`<br />`foo1.bar`<br />`fooA.bar`<br />`foo..bar` | A positive match for a filename in which the `?` represents exactly one character other than a slash (`/`). |
| `foo?.bar` | ~~`foo.bar`~~ | Because *the `?` represents exactly one character*. |
| `foo?bar` | `foo_bar`<br />`foo.bar`<br />~~`foo/bar`~~ | Because *other than a slash `/`*. |
| `file[0-9].txt` | `file0.txt`<br />`file1.txt`<br />...<br />`file9.txt`<br />~~`files.txt`~~ | Matches any file with the name pattern `file?.txt` where the `?` is a digit from `0` to `9`. |
| `file[!9a].txt` | `file0.txt`<br />`file1.txt`<br />...<br />~~`file9.txt`~~<br />~~`filea.txt`~~<br />`files.txt` | Matches any file with the name pattern `file?.txt` where the `?` is any character except `9` or `a`. |
| `file\*\*.txt` | `file**.txt`<br />~~`file1.txt`~~<br /> | Backslashes `\` escape the asterisks `*`, so it will handle them literally as any other characters. This means it will match a file named `file**.txt`, not any file pattern. |
| `\!file.txt`<br />`\#file.txt` | `!file.txt`<br />`#file.txt` | An escaped exclamation mark `!` or hash sign `#` is also handled literally, like any other character, so these patterns match files named `!file.txt` and `#file.txt` without treating the leading symbol as special. |
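
The way negation interacts with rule order, as in the `*` / `!*/` trick from the table, can be sketched as follows. This is an illustration only, not the FLS implementation: `parse_rules` and `is_matched` are hypothetical helpers, the "last matching rule wins" evaluation is an assumption based on `!` overriding previous patterns, and only top-level names (no slashes inside the name, no `**`) are handled.

```python
from fnmatch import fnmatchcase
from typing import List, Tuple


def parse_rules(text: str) -> List[Tuple[str, bool]]:
    """Hypothetical parser: return (pattern, negated) pairs from rule text."""
    rules = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # blank lines and comments carry no rule
        negated = line.startswith("!")
        if negated:
            line = line[1:]
        elif line.startswith(("\\!", "\\#")):
            line = line[1:]  # escaped `!` or `#` is an ordinary character
        rules.append((line, negated))
    return rules


def is_matched(rules: List[Tuple[str, bool]], name: str, is_dir: bool) -> bool:
    """Assumption: the last rule that applies to `name` decides the outcome."""
    matched = False
    for pattern, negated in rules:
        if pattern.endswith("/") and not is_dir:
            continue  # trailing slash: directories only
        if fnmatchcase(name, pattern.rstrip("/")):
            matched = not negated
    return matched


rules = parse_rules("# all files, but no directories\n*\n!*/\n")
print(is_matched(rules, "file0.txt", is_dir=False))
print(is_matched(rules, "dirA", is_dir=True))
```

With this rule set the file is reported as matched and the directory is not, because the later `!*/` rule overrides the earlier `*` for directories.
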
## Installation
### From GitHub
```bash
git clone https://github.com/codyverse/fls.git
cd fls
```
No additional dependencies are required.

### Via PIP
- Main package:
```bash
pip install fls
```
- A package with additional test dependencies:
```bash
pip install fls[dev]
```


## Usage

### Basic Setup
1. Create a `.fls` file in your project root or specific directories.
2. Add rule patterns to the `.fls` file (one per line).

Example `.fls` file:
```
# Match all `.log` files
*.log
# Match `temp/` directory
temp/
# Do not match `temp/keep.txt`
!temp/keep.txt
```

3. Use the FLS class to scan and check matched files.
```python
import os

from fls import FLS

project_root = 'path_to_your_project'

# Initialize FLS with the root directory and protocol file
fls = FLS(root=project_root, protocol='.fls')

# Check if a specific file or directory matches the given rules
print(fls.is_matched(os.path.join(project_root, 'temp', 'some_file.log')))  # True if matched, False if ignored
print(fls.is_matched(os.path.join(project_root, 'temp', 'keep.txt')))  # True if matched, False if ignored

# Retrieve and print all rules for a specific directory
for rule in fls.get_rules(os.path.join(project_root, 'temp')):
    print(rule.get_pattern)  # Prints the pattern of each rule for the specified directory

# Retrieve and print all rules for each directory in the project
for path, rules in fls.get_all_rules():
    print(os.path.relpath(path, project_root))  # Prints the path of the directory relative to the project root
    for rule in rules:
        # Format the rule details as 'key': 'value' pairs
        _r = ', '.join(f"'{key}': '{value}'" for key, value in rule.rule.items())
        print(f"  {_r}")

# Retrieve a list of matched and unmatched files and directories and print their status
for path, is_matched in fls.matched():
    status = "Matched" if is_matched else "Ignored"  # Sets the status based on whether the path is matched
    print(f"{path}: {status}")
```
**Note**: Replace `path_to_your_project` with the actual path to your project directory.

## Contributing

Feel free to contribute by submitting issues or pull requests!