first commit
Ingvarstep committed Oct 5, 2024
0 parents commit 485ad55
Showing 25 changed files with 5,004 additions and 0 deletions.
8 changes: 8 additions & 0 deletions .changeset/README.md
@@ -0,0 +1,8 @@
# Changesets

Hello and welcome! This folder has been automatically generated by `@changesets/cli`, a build tool that works
with multi-package repos, or single-package repos to help you version and publish your code. You can
find the full documentation for it [in our repository](https://github.com/changesets/changesets)

We have a quick list of common questions to get you started engaging with this project in
[our documentation](https://github.com/changesets/changesets/blob/main/docs/common-questions.md)
11 changes: 11 additions & 0 deletions .changeset/config.json
@@ -0,0 +1,11 @@
{
  "$schema": "https://unpkg.com/@changesets/[email protected]/schema.json",
  "changelog": "@changesets/cli/changelog",
  "commit": false,
  "fixed": [],
  "linked": [],
  "access": "restricted",
  "baseBranch": "main",
  "updateInternalDependencies": "patch",
  "ignore": []
}
9 changes: 9 additions & 0 deletions .editorconfig
@@ -0,0 +1,9 @@
root = true

[*]
indent_style = space
indent_size = 2
end_of_line = lf
charset = utf-8
trim_trailing_whitespace = true
insert_final_newline = true
55 changes: 55 additions & 0 deletions .github/workflows/pr.yml
@@ -0,0 +1,55 @@
name: "PR Workflow"

on:
  pull_request:
    branches:
      - main

jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      # Checkout the code
      - name: Checkout Code
        uses: actions/checkout@v3

      # Set up Node.js
      - name: Set Up Node.js
        uses: actions/setup-node@v3
        with:
          node-version: 18

      # Set up pnpm
      - name: Set Up pnpm
        uses: pnpm/action-setup@v2
        with:
          version: 6.0.2

      # Install dependencies
      - name: Install Dependencies with pnpm
        run: pnpm install

      # Run tests
      - name: Run Tests
        run: pnpm test

      # Lint the code
      - name: Lint Code
        run: pnpm run lint

      # Build the project
      - name: Build Project
        run: pnpm run build

      # Check for pending changesets before running further tests and version checks
      - name: Check for Changesets
        id: check_changesets
        run: |
          if pnpm changeset status | grep -q "No unreleased changesets"; then
            echo "No unreleased changesets found, create a changeset using pnpm changeset."
            exit 1
          else
            echo "Changesets found, version bump possible."
            exit 0
          fi
        continue-on-error: false
136 changes: 136 additions & 0 deletions .github/workflows/release.yml
@@ -0,0 +1,136 @@
name: Release

on:
  push:
    branches:
      - main # Trigger only on pushes to the main branch

  workflow_dispatch: # Allows manually triggering the workflow for testing or urgent releases

concurrency: ${{ github.workflow }}-${{ github.ref }}

jobs:
  release:
    # github.repository has the form "owner/name", so compare the bare repository name instead
    if: github.event.repository.name == 'GLiClass.js'
    permissions:
      contents: write # Required to create the release
      actions: read # For checking token permissions
      issues: write # For creating an issue
    name: Create GitHub Release and Publish to npm
    runs-on: ubuntu-latest

    steps:
      # Set NPM Registry to ensure the correct registry is used
      - name: Set NPM Registry
        run:
          npm config set registry https://registry.npmjs.org/

      # Set NPM registry and authentication token via .npmrc
      - name: Create .npmrc file with auth token
        run: |
          echo "//registry.npmjs.org/:_authToken=${{ secrets.NPM_TOKEN }}" > ~/.npmrc

      # Verify NPM Authentication
      - name: Verify NPM Authentication
        run: npm whoami
        env:
          NODE_AUTH_TOKEN: ${{ secrets.NPM_TOKEN }}

      # Checkout the code
      - name: Checkout code
        uses: actions/checkout@v3

      # Set up Node.js
      - name: Set up Node.js
        uses: actions/setup-node@v3
        with:
          node-version: 18

      # Set up pnpm
      - name: Set up pnpm
        uses: pnpm/action-setup@v2
        with:
          version: 6.0.2

      # Install dependencies
      - name: Install dependencies
        run: pnpm install

      # Check for pending changesets before proceeding with the release.
      # Exiting 0 would not stop the workflow, so the result is recorded as a
      # step output that gates the release steps below.
      - name: Check for changesets
        id: check_changesets
        run: |
          if pnpm changeset status | grep -q "No unreleased changesets"; then
            echo "No unreleased changesets found. Skipping version bump and release."
            echo "has_changesets=false" >> "$GITHUB_OUTPUT"
          else
            echo "has_changesets=true" >> "$GITHUB_OUTPUT"
          fi
        continue-on-error: false

      # Run Changesets to version the packages and apply changelogs
      - name: Run Changesets version
        if: steps.check_changesets.outputs.has_changesets == 'true'
        run: pnpm changeset version

      # Commit the version bump and changelog updates (if applicable)
      - name: Commit version bump
        if: steps.check_changesets.outputs.has_changesets == 'true'
        run: |
          git config --global user.name "GitHub Actions"
          git config --global user.email "[email protected]"
          git add .
          git commit -m "Version bump and changelog update"
          git push

      # Extract package name and version from package.json
      - name: Get name and version from package.json
        id: get_package_info
        run: |
          NAME=$(jq -r '.name' package.json)
          VERSION=$(jq -r '.version' package.json)
          echo "PACKAGE_NAME=$NAME" >> $GITHUB_ENV
          echo "PACKAGE_VERSION=$VERSION" >> $GITHUB_ENV

      # Create a new Git tag based on the version from package.json
      - name: Create Tag
        if: steps.check_changesets.outputs.has_changesets == 'true'
        run: |
          git tag v${{ env.PACKAGE_VERSION }}
          git push origin v${{ env.PACKAGE_VERSION }}

      # Build the package (after version bump)
      - name: Build the package
        if: steps.check_changesets.outputs.has_changesets == 'true'
        run: pnpm run build # Ensure you have a build script in your package.json

      # Create release archives (zip and gzip)
      - name: Create source code archives
        if: steps.check_changesets.outputs.has_changesets == 'true'
        run: |
          zip -r ${{ env.PACKAGE_NAME }}-${{ env.PACKAGE_VERSION }}.zip dist package.json src README.md CHANGELOG.md
          tar -czvf ${{ env.PACKAGE_NAME }}-${{ env.PACKAGE_VERSION }}.tar.gz dist package.json src README.md CHANGELOG.md

      # Create GitHub Release and Upload Release Assets (with display name "Source Code")
      - name: Create GitHub Release and Upload Assets
        if: steps.check_changesets.outputs.has_changesets == 'true'
        uses: softprops/action-gh-release@v1
        with:
          tag_name: v${{ env.PACKAGE_VERSION }} # Use the tag created in the previous step
          name: ${{ env.PACKAGE_VERSION }} # Use the version as the release name
          files: |
            ${{ env.PACKAGE_NAME }}-${{ env.PACKAGE_VERSION }}.zip#Source Code
            ${{ env.PACKAGE_NAME }}-${{ env.PACKAGE_VERSION }}.tar.gz#Source Code
        env:
          GITHUB_TOKEN:
            ${{ secrets.GITHUB_TOKEN }} # GitHub token for authentication

      # Set the NPM authentication token using pnpm
      - name: Set NPM Auth Token
        run: pnpm config set //registry.npmjs.org/:_authToken=${{ secrets.NPM_TOKEN }}

      # Publish to npm (activated)
      - name: Publish to npm
        if: steps.check_changesets.outputs.has_changesets == 'true'
        run: pnpm publish --access public
        env:
          NODE_AUTH_TOKEN: ${{ secrets.NPM_TOKEN }} # Use NPM token for publishing
29 changes: 29 additions & 0 deletions .gitignore
@@ -0,0 +1,29 @@
*.onnx

# Logs
logs
*.log
npm-debug.log*
yarn-debug.log*
yarn-error.log*
pnpm-debug.log*
lerna-debug.log*

node_modules
dist
dist-ssr
*.local

# Editor directories and files
.vscode/*
!.vscode/extensions.json
.idea
.DS_Store
*.suo
*.ntvs*
*.njsproj
*.sln
*.sw?

*.tar.gz
*.zip
10 changes: 10 additions & 0 deletions .prettierrc
@@ -0,0 +1,10 @@
{
  "printWidth": 100,
  "tabWidth": 2,
  "useTabs": false,
  "semi": true,
  "singleQuote": false,
  "trailingComma": "all",
  "bracketSpacing": true,
  "endOfLine": "lf"
}
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -0,0 +1 @@
# gliclass
120 changes: 120 additions & 0 deletions README.md
@@ -0,0 +1,120 @@

# ⭐GLiClass.js: Generalist and Lightweight Model for Sequence Classification in JavaScript

GLiClass.js is a JavaScript-based inference engine for running GLiClass (Generalist and Lightweight Model for Sequence Classification) models. GLiClass is an efficient zero-shot classifier inspired by the [GLiNER](https://github.com/urchade/GLiNER) work. It demonstrates the same performance as a cross-encoder while being more compute-efficient, because classification is done in a single forward pass.

It can be used for topic classification, sentiment analysis and as a reranker in RAG pipelines.

<p align="center">
<img src="kg.png" style="position: relative; top: 5px;">
<a href="https://www.knowledgator.com/"> Knowledgator</a>
<span>&nbsp;&nbsp;•&nbsp;&nbsp;</span>
<a href="https://www.linkedin.com/company/knowledgator/">✔️ LinkedIn</a>
<span>&nbsp;&nbsp;•&nbsp;&nbsp;</span>
<a href="https://discord.gg/NNwdHEKX">📢 Discord</a>
<span>&nbsp;&nbsp;•&nbsp;&nbsp;</span>
<a href="https://huggingface.co/spaces/knowledgator/GLiClass_SandBox">🤗 Space</a>
<span>&nbsp;&nbsp;•&nbsp;&nbsp;</span>
<a href="https://huggingface.co/collections/knowledgator/gliclass-6661838823756265f2ac3848">🤗 GliClass Collection</a>
</p>

## 🌟 Key Features

- Flexible zero-shot text classification without predefined categories
- Lightweight and fast inference
- Easy integration with web applications
- TypeScript support for better developer experience

## 🚀 Getting Started

### Installation

```bash
npm install gliclass
```

### Basic Usage

```javascript
const gliclass = new Gliclass({
  tokenizerPath: "knowledgator/gliclass-small-v1.0",
  onnxSettings: {
    modelPath: "public/model.onnx",
    executionProvider: "cpu",
    multiThread: true,
  },
  promptFirst: false,
});

await gliclass.initialize();

const input_text = "Your input text here";
const texts = [input_text];
const labels = ["business", "science", "tech"];
const threshold = 0.5;

const decoded = await gliclass.inference({ texts, labels, threshold });
console.log(decoded);
```
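
The shape of `decoded` is not documented here. As a rough sketch of what the `threshold` argument implies, assuming the classifier yields one score per label and drops labels scoring below the threshold, the filtering could look like the following. `filterByThreshold` and the sample scores are hypothetical and are not part of the `gliclass` package.

```javascript
// Hypothetical helper, not part of gliclass: pair each label with its score,
// keep the pairs that meet the threshold, and sort them best-first.
function filterByThreshold(labels, scores, threshold) {
  if (labels.length !== scores.length) {
    throw new Error("labels and scores must have the same length");
  }
  return labels
    .map((label, i) => ({ label, score: scores[i] }))
    .filter((item) => item.score >= threshold)
    .sort((a, b) => b.score - a.score);
}

// Made-up scores, for illustration only.
const sampleLabels = ["business", "science", "tech"];
const sampleScores = [0.12, 0.81, 0.64];

const kept = filterByThreshold(sampleLabels, sampleScores, 0.5);
console.log(kept);
```

With these made-up scores, only `science` and `tech` pass a threshold of 0.5.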

### Advanced Usage

#### ONNX settings API

- modelPath: either a URL or path to the model, as in the basic example, or the model itself as an array of binary data.
- executionProvider: the same providers that ONNX Runtime Web supports. Currently `webgpu` (recommended), `cpu`, `wasm`, and `webgl` are allowed, but more can be added.
- wasmPaths: path to the wasm binaries. This can be either a URL to the binaries, such as a CDN URL, or a local path to a folder containing them.
- multiThread: whether to multithread at all; only relevant for the wasm and cpu execution providers.
- multiThread: When choosing the wasm or cpu provider, multiThread will allow you to specify the number of cores you want to use.
- fetchBinary: will prefetch the binary from the default or provided wasm paths
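
The constraints in the list above can be checked before a settings object is handed to the library. The sketch below is only an illustration: `checkOnnxSettings` is a hypothetical helper, and the library itself may validate these settings differently or not at all.

```javascript
// Providers named in the settings list above.
const ALLOWED_PROVIDERS = ["webgpu", "cpu", "wasm", "webgl"];

// Hypothetical pre-flight check, not part of gliclass.
// Returns a list of human-readable problems; an empty list means no issue was found.
function checkOnnxSettings(settings) {
  const problems = [];
  if (!ALLOWED_PROVIDERS.includes(settings.executionProvider)) {
    problems.push(`unsupported executionProvider: ${settings.executionProvider}`);
  }
  const threaded =
    settings.executionProvider === "wasm" || settings.executionProvider === "cpu";
  if (settings.multiThread && !threaded) {
    problems.push("multiThread only applies to the wasm and cpu providers");
  }
  return problems;
}

const okSettings = { modelPath: "public/model.onnx", executionProvider: "cpu", multiThread: true };
const badSettings = { modelPath: "public/model.onnx", executionProvider: "cuda", multiThread: true };

console.log(checkOnnxSettings(okSettings));
console.log(checkOnnxSettings(badSettings));
```

Returning a list of problems rather than throwing on the first one lets a caller report every misconfiguration at once.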

## 🛠 Setup & Model Preparation

To use GLiClass models in a web environment, you need an ONNX format model. You can:

1. Search for pre-converted models on [HuggingFace](https://huggingface.co/onnx-community?search_models=gliclass)
2. Convert a model yourself using the [official Python script](https://github.com/Knowledgator/GLiClass.c/blob/main/ONNX_CONVERTING/convert_to_onnx.py)

### Converting to ONNX Format

Use the `convert_to_onnx.py` script with the following arguments:

- `model_path`: Location of the GLiClass model
- `save_path`: Where to save the ONNX file
- `quantize`: Set to True for UInt8 quantization (optional)

Example:

```bash
python convert_to_onnx.py --model_path /path/to/your/model --save_path /path/to/save/onnx --quantize True
```

## 🌟 Use Cases

GLiClass.js offers versatile text classification capabilities across various domains:

1. **Documents Classification**
2. **Sentiment Analysis**
3. **Reranking of Search Results**
...
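
For the reranking use case, one possible approach is to score each candidate document for relevance and then sort by that score. The sketch below assumes the scores have already been obtained, for example by classifying each document against a label that expresses the query. `rerank` and the sample data are hypothetical and are not part of the `gliclass` package.

```javascript
// Hypothetical helper, not part of gliclass: order documents by relevance
// score (highest first) and keep the top K.
function rerank(documents, scores, topK) {
  if (documents.length !== scores.length) {
    throw new Error("documents and scores must have the same length");
  }
  return documents
    .map((text, i) => ({ text, score: scores[i] }))
    .sort((a, b) => b.score - a.score)
    .slice(0, topK);
}

// Made-up documents and scores, for illustration only.
const docs = ["Quarterly earnings report", "Intro to quantum computing", "GPU benchmark roundup"];
const relevance = [0.08, 0.35, 0.92];

const top2 = rerank(docs, relevance, 2);
console.log(top2.map((d) => d.text));
```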

## 🔧 Areas for Improvement

- [ ] Further optimize inference speed
- [ ] Add support for more architectures
- [ ] Enable model training capabilities
- [ ] Provide more usage examples

## Creating a PR

- For any change, remember to run `pnpm changeset`. Otherwise there will be no version bump and the PR GitHub Action will fail.

## 🙏 Acknowledgements

- [GLiNER original authors](https://github.com/urchade/GLiNER)
- [ONNX Runtime Web](https://github.com/microsoft/onnxruntime)
- [Transformers.js](https://github.com/xenova/transformers.js)

## 📞 Support

For questions and support, please join our [Discord community](https://discord.gg/ApZvyNZU) or open an issue on GitHub.