
Commit

Merge pull request #53 from ahmetoner/add-new-large-model-v2
Add new large model v2
ahmetoner authored Dec 8, 2022
2 parents e9daea4 + f777535 commit 10b242f
Showing 4 changed files with 7 additions and 126 deletions.
4 changes: 2 additions & 2 deletions README.md
@@ -34,7 +34,7 @@ docker run -d -p 9000:9000 -e ASR_MODEL=base onerahmet/openai-whisper-asr-webser
# Interactive Swagger API documentation is available at http://localhost:9000/docs
```

-Available ASR_MODELs are `tiny`, `base`, `small`, `medium` and `large`
+Available ASR_MODELs are `tiny`, `base`, `small`, `medium`, `large`, `large-v1` and `large-v2`. Please note that `large` and `large-v2` are the same model.

For English-only applications, the `.en` models tend to perform better, especially for the `tiny.en` and `base.en` models. We observed that the difference becomes less significant for the `small.en` and `medium.en` models.
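
The model list above maps naturally onto a small validation step at startup. The following is a minimal sketch only, not code from this commit: `resolve_model_name` and the two name sets are hypothetical, and the sets simply restate the model names given in the README text.

```python
import os
from typing import Mapping

# Names taken from the README text above; these sets are illustrative,
# not something the webservice itself defines.
MULTILINGUAL_MODELS = {"tiny", "base", "small", "medium", "large", "large-v1", "large-v2"}
ENGLISH_ONLY_MODELS = {"tiny.en", "base.en", "small.en", "medium.en"}


def resolve_model_name(env: Mapping[str, str] = os.environ, default: str = "base") -> str:
    """Hypothetical helper: read ASR_MODEL and reject names the README does not list."""
    name = env.get("ASR_MODEL", default)
    if name not in MULTILINGUAL_MODELS | ENGLISH_ONLY_MODELS:
        raise ValueError(f"Unknown ASR_MODEL: {name!r}")
    return name
```

Passing the environment as a parameter keeps the helper easy to exercise without touching the real process environment, for example `resolve_model_name({"ASR_MODEL": "large-v2"})`.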

@@ -64,7 +64,7 @@ poetry install
Starting the Webservice:

```sh
-gunicorn --bind 0.0.0.0:9000 --workers 1 --timeout 0 app.webservice:app -k uvicorn.workers.UvicornWorker
+gunicorn --bind 0.0.0.0:9001 --workers 1 --timeout 0 app.webservice:app -k uvicorn.workers.UvicornWorker
```

## Quick start
120 changes: 0 additions & 120 deletions app/languages.py

This file was deleted.

5 changes: 3 additions & 2 deletions app/webservice.py
@@ -4,12 +4,12 @@
from fastapi.openapi.docs import get_swagger_ui_html
import whisper
from whisper.utils import write_srt, write_vtt
+from whisper import tokenizer
import os
from os import path
from pathlib import Path
import ffmpeg
from typing import BinaryIO, Union
-from .languages import LANGUAGES, LANGUAGE_CODES
import numpy as np
from io import StringIO
from threading import Lock
@@ -18,6 +18,7 @@
import importlib.metadata

SAMPLE_RATE=16000
+LANGUAGE_CODES=sorted(list(tokenizer.LANGUAGES.keys()))

projectMetadata = importlib.metadata.metadata('whisper-asr-webservice')
app = FastAPI(
@@ -101,7 +102,7 @@ def language_detection(
_, probs = model.detect_language(mel)
detected_lang_code = max(probs, key=probs.get)

-result = { "detected_language": LANGUAGES[detected_lang_code],
+result = { "detected_language": tokenizer.LANGUAGES[detected_lang_code],
"langauge_code" : detected_lang_code }

return result
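
The effect of this file's change is that language codes and names now come from one dictionary instead of a separately maintained `app/languages.py`. A minimal sketch of that pattern follows, under stated assumptions: the three-entry `LANGUAGES` dict is a stand-in for `whisper.tokenizer.LANGUAGES` (which maps codes to names and is much larger), and `describe_detection` is a hypothetical helper, not a function in the webservice.

```python
from typing import Dict, List, Tuple

# Stand-in for whisper.tokenizer.LANGUAGES (code -> name); assumed subset for illustration.
LANGUAGES: Dict[str, str] = {"en": "english", "tr": "turkish", "de": "german"}

# Same derivation as the added line in the diff: a sorted list of the dict's keys.
LANGUAGE_CODES: List[str] = sorted(LANGUAGES.keys())


def describe_detection(probs: Dict[str, float]) -> Tuple[str, str]:
    """Hypothetical helper: pick the most probable code and look up its name."""
    detected_code = max(probs, key=probs.get)
    return detected_code, LANGUAGES[detected_code]
```

Because both the code list and the name lookup read from the same dictionary, they cannot drift apart when the upstream language table changes.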
4 changes: 2 additions & 2 deletions pyproject.toml
@@ -1,6 +1,6 @@
[tool.poetry]
name = "whisper-asr-webservice"
-version = "1.0.4"
+version = "1.0.5"
description = "Whisper ASR Webservice is a general-purpose speech recognition webservice."
homepage = "https://github.com/ahmetoner/whisper-asr-webservice/"
license = "https://github.com/ahmetoner/whisper-asr-webservice/blob/main/LICENCE"
@@ -16,7 +16,7 @@ python = "^3.9"
unidecode = "^1.3.4"
uvicorn = { extras = ["standard"], version = "^0.18.2" }
gunicorn = "^20.1.0"
-whisper = {git = "https://github.com/openai/whisper.git", rev="eff383b27b783e280c089475852ba83f20f64998"}
+whisper = {git = "https://github.com/openai/whisper.git", rev="b9265e5796f5d80c18d1f9231ab234225676780b"}
tqdm = "^4.64.1"
transformers = "^4.22.1"
python-multipart = "^0.0.5"
