Test (#101)
* Yiddish transliteration via submodules.

* Update checkout workflow.

* Change refs for Yiddish submodules.

* Fix WORKDIR in Dockerfile

* Do not remove yiddish module.

* Manually add yiddish submodules.

* Use git clone instead of submodule.

* Move ext checkout to github actions.

* Chinese numerals (#97)

* WIP Parse Chinese numerals.

* WIP complete number parsing.

* Complete Chinese numerals:

* Use standard table override instead of pre-config hooks.
* Add few test strings.

* Complete numerals:

* Transliterate all numeric examples correctly
* Modify hook return logic for consistency
* WIP partial spacing fix.

* Some cleanup; upgrade docker OS.

* Add dependency for uwsgi.

* Squashed commit of the following: (#98)

commit 30859a5
Author: scossu <[email protected]>
Date:   Wed Feb 28 22:17:36 2024 -0500

    Move ext checkout to github actions.

commit 6d8da6d
Author: scossu <[email protected]>
Date:   Wed Feb 28 21:45:01 2024 -0500

    Use git clone instead of submodule.

commit ade9da5
Author: scossu <[email protected]>
Date:   Wed Feb 28 21:42:45 2024 -0500

    Manually add yiddish submodules.

commit 77cb9ef
Author: scossu <[email protected]>
Date:   Wed Feb 28 21:23:37 2024 -0500

    Do not remove yiddish module.

commit e405b36
Author: scossu <[email protected]>
Date:   Wed Feb 28 09:11:41 2024 -0500

    Fix WORKDIR in Dockerfile

commit 95445ba
Author: scossu <[email protected]>
Date:   Wed Feb 28 09:07:50 2024 -0500

    Change refs for Yiddish submodules.

commit 208ea09
Author: scossu <[email protected]>
Date:   Wed Feb 28 08:45:58 2024 -0500

    Update checkout workflow.

* Add debug output to /trans response.

* Split docker files and requirements.

* Add bad request debug handler.

* Add bad request debug handler.

* Adjust CI workflows.

* Fix image name typo.

* Refine triggers.

* Fix typo on test workflow trigger.

* Use JSON in POST body.

* Also use JSON in feedback request; update docs.

* Return json data in 400 debug.
scossu authored May 9, 2024
1 parent b303902 commit fa5b48d
Showing 16 changed files with 268 additions and 79 deletions.
@@ -1,8 +1,15 @@
name: Push image to Docker Hub.
name: Push app image
on:
# This runs on v *.*.0 after the base image has been
# built and pushed, or on patch version tag.
push:
tags:
- "v*.*.*"
- "v*.*.[1-9]*"
workflow_run:
workflows:
- "Push base image"
types:
- "completed"

env:
DOCKER_USER: lcnetdev
@@ -13,13 +20,15 @@ jobs:
push-image-to-docker-hub:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v3
- name: checkout repo
uses: actions/checkout@v4
with:
submodules: recursive

- name: Build the Docker image
run: >
docker build . --tag $DOCKER_USER/$REPO_NAME:${{ github.ref_name }}
docker build -f Dockerfile .
--tag $DOCKER_USER/$REPO_NAME:${{ github.ref_name }}
--tag $DOCKER_USER/$REPO_NAME:latest
- name: Login to Docker Hub
46 changes: 46 additions & 0 deletions .github/workflows/push-base-image.yml
@@ -0,0 +1,46 @@
name: Push base image
on:
push:
tags:
- "v*.*.0"

env:
DOCKER_USER: lcnetdev
DOCKER_PASSWORD: ${{secrets.DOCKER_HUB}}
REPO_NAME: scriptshifter-base

jobs:
push-image-to-docker-hub:
runs-on: ubuntu-latest
steps:
- name: checkout repo
uses: actions/checkout@v4
with:
submodules: recursive

- name: checkout yiddish submodules (1/2)
uses: actions/checkout@v4
with:
repository: ibleaman/loshn-koydesh-pronunciation
path: ext/yiddish/yiddish/submodules/loshn-koydesh-pronunciation

- name: checkout yiddish submodules (2/2)
uses: actions/checkout@v4
with:
repository: ibleaman/hasidify_lexicon
path: ext/yiddish/yiddish/submodules/hasidify_lexicon

- name: Build the Docker image
run: >
docker build -f scriptshifter_base.Dockerfile .
--tag $DOCKER_USER/$REPO_NAME:${{ github.ref_name }}
--tag $DOCKER_USER/$REPO_NAME:latest
- name: Login to Docker Hub
uses: docker/login-action@v3
with:
username: lcnetdev
password: ${{ secrets.DOCKER_HUB }}

- name: Push to Docker Hub
run: docker push $DOCKER_USER/$REPO_NAME --all-tags
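The two tag filters split releases between the workflows: `v*.*.0` builds the base image, and `v*.*.[1-9]*` builds the app image directly on patch tags (for `x.y.0` tags the app image is built through the `workflow_run` trigger instead). The following is a minimal sketch of which tags each filter accepts. It uses Python's `fnmatch` as an approximation of GitHub's filter pattern syntax, which is an assumption that holds for simple tags like these but not in general; `triggered_by` is a hypothetical helper name.

```python
from fnmatch import fnmatchcase

BASE_PATTERN = "v*.*.0"       # tag filter in push-base-image.yml
APP_PATTERN = "v*.*.[1-9]*"   # tag filter in the app image workflow


def triggered_by(tag: str) -> list[str]:
    """Return which tag filters a pushed tag would satisfy (approximation)."""
    hits = []
    if fnmatchcase(tag, BASE_PATTERN):
        hits.append("base")
    if fnmatchcase(tag, APP_PATTERN):
        hits.append("app")
    return hits


for tag in ("v1.2.0", "v1.2.3", "v1.2.10"):
    print(tag, triggered_by(tag))
```

Under this approximation a tag never matches both filters, so a release does not start two competing direct builds of the app image.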
14 changes: 9 additions & 5 deletions .github/workflows/push-test-image.yml
@@ -1,8 +1,8 @@
name: Push test image to Docker Hub.
name: Push test image
on:
push:
branch:
- "main"
branches:
- "test"

env:
DOCKER_USER: lcnetdev
@@ -13,12 +13,16 @@ jobs:
push-image-to-docker-hub:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v3
- name: checkout repo
uses: actions/checkout@v4
with:
submodules: recursive

- name: Build the Docker image
run: docker build . --tag $DOCKER_USER/$REPO_NAME:test
run: >
docker build -f Dockerfile .
--tag $DOCKER_USER/$REPO_NAME:${{ github.ref_name }}
--tag $DOCKER_USER/$REPO_NAME:test
- name: Login to Docker Hub
uses: docker/login-action@v3
4 changes: 4 additions & 0 deletions .gitmodules
@@ -1,3 +1,7 @@
[submodule "ext/arabic_rom"]
path = ext/arabic_rom
url = https://github.com/fadhleryani/Arabic_ALA-LC_Romanization.git
[submodule "ext/yiddish"]
path = ext/yiddish
url = https://github.com/scossu/yiddish.git
branch = loc
28 changes: 7 additions & 21 deletions Dockerfile
@@ -1,29 +1,15 @@
FROM python:3.10-slim-bookworm

RUN apt update
RUN apt install -y build-essential tzdata gfortran libopenblas-dev libboost-all-dev libpcre2-dev

ENV TZ=America/New_York
ENV _workroot "/usr/local/scriptshifter/src"

WORKDIR ${_workroot}
COPY requirements.txt ./
RUN pip install --no-cache-dir -r requirements.txt

# Remove development packages.
RUN apt remove -y build-essential
RUN apt autoremove -y

RUN addgroup --system www
RUN adduser --system www
RUN gpasswd -a www www
FROM lcnetdev/scriptshifter-base:latest
ARG WORKROOT "/usr/local/scriptshifter/src"

# Copy core application files.
WORKDIR ${WORKROOT}
COPY entrypoint.sh uwsgi.ini wsgi.py ./
COPY ext ./ext/
COPY scriptshifter ./scriptshifter/
COPY requirements.txt ./
RUN pip install --no-cache-dir -r requirements.txt

RUN chmod +x ./entrypoint.sh
RUN chown -R www:www ${_workroot} .
#RUN chown -R www:www ${WORKROOT} .

EXPOSE 8000

7 changes: 7 additions & 0 deletions deps.txt
@@ -0,0 +1,7 @@
# External dependencies.
aksharamukha>=2.1,<3
camel-tools>=1.5
funcy>=1.15,<2
pymarc>=4.0,<5
repackage>=0.7.3
./ext/yiddish
23 changes: 23 additions & 0 deletions doc/rest_api.md
@@ -69,6 +69,10 @@ Transliterate an input string into a given language.

### POST body

MIME type: `application/json`

Content: JSON object with the following keys:

- `lang`: Language code as given by the `/languages` endpoint.
- `text`: Input text to be transliterated.
- `capitalize`: One of `first` (capitalize the first letter of the input),
@@ -92,3 +96,22 @@ Content: JSON object containing two keys: `output` containing the transliterated
string; and `warnings` containing a list of warnings. Characters not found in
the mapping are copied verbatim in the transliterated string (see
"Configuration files" section for more information).
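A request to `/trans` with a JSON body might be built as in the sketch below. It only constructs the request with the standard library and does not send it. The base URL, the `build_trans_request` helper, and the `yiddish` language code are assumptions for illustration; valid codes come from the `/languages` endpoint.

```python
import json
from urllib.request import Request

# Hypothetical base URL; adjust to wherever the service is deployed.
BASE_URL = "http://localhost:8000"


def build_trans_request(lang, text, t_dir="s2r", capitalize=None, options=None):
    """Build (but do not send) a JSON POST request for /trans."""
    payload = {"lang": lang, "text": text, "t_dir": t_dir}
    if capitalize:
        payload["capitalize"] = capitalize
    if options:
        # Options travel as a nested JSON object, not a JSON-encoded string.
        payload["options"] = options
    return Request(
        f"{BASE_URL}/trans",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_trans_request("yiddish", "שלום", capitalize="first")
print(req.get_method(), req.full_url)
print(json.loads(req.data))
```

Passing `req` to `urllib.request.urlopen` would send it to a running instance.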

## `POST /feedback`

Send a feedback form about a transliteration result.

### POST body

MIME type: `application/json`

Content: JSON object with the following keys:

- `lang`: language of the transliteration. Mandatory.
- `src`: source text. Mandatory.
- `t_dir`: transliteration direction. If omitted, it defaults to `s2r`.
- `result`: result of the transliteration. Mandatory.
- `expected`: expected result. Mandatory.
- `options`: options passed to the request, if any.
- `notes`: optional user notes.
- `contact`: contact email for feedback. Optional.
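The mandatory and optional keys listed above can be checked before a feedback payload is sent or processed. This is a sketch under assumptions, not the service's own validation: `normalize_feedback` is a hypothetical helper, and `None` as the default for `notes` and `contact` is an illustrative choice.

```python
MANDATORY_KEYS = ("lang", "src", "result", "expected")


def normalize_feedback(payload: dict) -> dict:
    """Check mandatory keys and fill in defaults for the optional ones."""
    missing = [key for key in MANDATORY_KEYS if key not in payload]
    if missing:
        raise ValueError(f"Missing mandatory keys: {', '.join(missing)}")
    # Defaults are rebuilt on every call so the `options` dict is never shared.
    defaults = {"t_dir": "s2r", "options": {}, "notes": None, "contact": None}
    return {**defaults, **payload}


feedback = normalize_feedback(
    {"lang": "yiddish", "src": "abc", "result": "xyz", "expected": "xyw"}
)
print(feedback["t_dir"], feedback["notes"])
```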
1 change: 1 addition & 0 deletions ext/yiddish
Submodule yiddish added at 9bf22c
6 changes: 1 addition & 5 deletions requirements.txt
@@ -1,9 +1,5 @@
aksharamukha>=2.1,<3
camel-tools>=1.5
# Core application dependencies.
flask>=2.3,<3
funcy>=1.15,<2
pymarc>=4.0,<5
python-dotenv>=1.0,<2
pyyaml>=6.0,<7
repackage>=0.7.3
uwsgi>=2.0,<2.1
51 changes: 51 additions & 0 deletions scriptshifter/hooks/yiddish_/__init__.py
@@ -0,0 +1,51 @@
# @package ext

__doc__ = """
Yiddish transliteration module.
Courtesy of Isaac Bleaman and Asher Lewis.
https://github.com/ibleaman/yiddish.git
Note the underscore in the module name to disambiguate with the `yiddish`
external package name.
"""


from yiddish import detransliterate, transliterate

from scriptshifter.exceptions import BREAK
from scriptshifter.tools import capitalize


def s2r_post_config(ctx):
"""
Script to Roman.
"""

rom = transliterate(
ctx.src, loc=True,
loshn_koydesh=ctx.options.get("loshn_koydesh"))

if ctx.options["capitalize"] == "all":
rom = capitalize(rom)
elif ctx.options["capitalize"] == "first":
rom = rom[0].upper() + rom[1:]

ctx.dest = rom

return BREAK


def r2s_post_config(ctx):
"""
Roman to script.
NOTE: This doesn't support the `loc` option.
"""

ctx.dest = detransliterate(
ctx.src,
loshn_koydesh=ctx.options.get("loshn_koydesh"))

return BREAK
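The `s2r_post_config` hook applies the `capitalize` option after transliteration and indexes `rom[0]`, which presupposes a non-empty result. A guarded variant of that step could look like the sketch below. `apply_capitalize` is a hypothetical helper, and treating `all` as "capitalize the first letter of every space-separated word" is an assumption; the real `scriptshifter.tools.capitalize` may behave differently.

```python
def apply_capitalize(text: str, mode) -> str:
    """Capitalize per the `capitalize` option: "first", "all", or falsy."""
    if not text or not mode:
        return text
    if mode == "first":
        return text[0].upper() + text[1:]
    if mode == "all":
        # Assumption: "all" upper-cases the first letter of every word.
        return " ".join(w[:1].upper() + w[1:] for w in text.split(" "))
    return text


print(apply_capitalize("sholem aleykhem", "first"))
print(apply_capitalize("sholem aleykhem", "all"))
```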
48 changes: 29 additions & 19 deletions scriptshifter/rest_api.py
@@ -3,11 +3,12 @@
from base64 import b64encode
from copy import deepcopy
from email.message import EmailMessage
from json import dumps, loads
from json import dumps
from os import environ, urandom
from smtplib import SMTP

from flask import Flask, jsonify, render_template, request
from werkzeug.exceptions import BadRequest

from scriptshifter import EMAIL_FROM, EMAIL_TO, SMTP_HOST, SMTP_PORT
from scriptshifter.exceptions import ApiError
@@ -46,6 +47,20 @@ def handle_exception(e: ApiError):
}, e.status_code)


@app.errorhandler(BadRequest)
def handle_400(e):
if logging.DEBUG >= logging.root.level:
body = {
"debug": {
"form_data": request.json or request.form,
}
}
else:
body = ""

return body, 400
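The handler's branching can be isolated from Flask to see what a client receives at each log level. The sketch below is a stand-in, not the handler itself: `bad_request_response` is a hypothetical function, and `payload` represents whatever request data could be recovered.

```python
import logging


def bad_request_response(payload, root_level=None):
    """Return a (body, status) pair for a 400 response.

    Debug details are included only when the root logger is at DEBUG
    level or lower.
    """
    if root_level is None:
        root_level = logging.root.level
    if logging.DEBUG >= root_level:
        body = {"debug": {"form_data": payload}}
    else:
        body = ""
    return body, 400


print(bad_request_response({"lang": "yiddish"}, logging.DEBUG))
print(bad_request_response({"lang": "yiddish"}, logging.WARNING))
```

One caveat worth checking in the real handler: if the 400 was caused by an unparseable JSON body, reading `request.json` again inside the handler may raise a second time. Flask's `request.get_json(silent=True)` returns `None` in that case and could be a safer way to populate the debug payload.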


@app.route("/", methods=["GET"])
def index():
return render_template(
@@ -91,16 +106,16 @@ def get_options(lang):

@app.route("/trans", methods=["POST"])
def transliterate_req():
lang = request.form["lang"]
in_txt = request.form["text"]
capitalize = request.form.get("capitalize", False)
t_dir = request.form.get("t_dir", "s2r")
lang = request.json["lang"]
in_txt = request.json["text"]
capitalize = request.json.get("capitalize", False)
t_dir = request.json.get("t_dir", "s2r")
if t_dir not in ("s2r", "r2s"):
return f"Invalid direction: {t_dir}", 400

if not len(in_txt):
return ("No input text provided! ", 400)
options = loads(request.form.get("options", "{}"))
options = request.json.get("options", {})
logger.debug(f"Extra options: {options}")

try:
@@ -116,14 +131,9 @@ def feedback():
"""
Allows users to provide feedback to improve a specific result.
"""
lang = request.form["lang"]
src = request.form["src"]
t_dir = request.form.get("t_dir", "s2r")
result = request.form["result"]
expected = request.form["expected"]
options = request.form.get("options", {})
notes = request.form.get("notes")
contact = request.form.get("contact")
t_dir = request.json.get("t_dir", "s2r")
options = request.json.get("options", {})
contact = request.json.get("contact")

msg = EmailMessage()
msg["subject"] = "Scriptshifter feedback report"
@@ -133,16 +143,16 @@ def feedback():
msg["cc"] = contact
msg.set_content(f"""
*Scriptshifter feedback report from {contact or 'anonymous'}*\n\n
*Language:* {lang}\n
*Language:* {request.json['lang']}\n
*Direction:* {
'Roman to Script' if t_dir == 'r2s'
else 'Script to Roman'}\n
*Source:* {src}\n
*Result:* {result}\n
*Expected result:* {expected}\n
*Source:* {request.json['src']}\n
*Result:* {request.json['result']}\n
*Expected result:* {request.json['expected']}\n
*Applied options:* {dumps(options)}\n
*Notes:*\n
{notes}""")
{request.json['notes']}""")

# TODO This uses a test SMTP server:
# python -m smtpd -n -c DebuggingServer localhost:1025
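The report body reads `notes` with `request.json['notes']`, while the API documentation describes `notes` as optional, so a request without it would likely fail with a `KeyError`. A sketch of the same report built with `.get()` for the optional keys follows. `format_report` is a hypothetical helper operating on a plain dict, and the `(none)` placeholder is an assumption.

```python
from json import dumps


def format_report(data: dict) -> str:
    """Build the feedback report text from a parsed JSON payload."""
    contact = data.get("contact")
    t_dir = data.get("t_dir", "s2r")
    direction = "Roman to Script" if t_dir == "r2s" else "Script to Roman"
    options = dumps(data.get("options", {}))
    lines = [
        f"*Scriptshifter feedback report from {contact or 'anonymous'}*",
        "",
        # Mandatory keys are indexed directly so a missing one fails loudly.
        f"*Language:* {data['lang']}",
        f"*Direction:* {direction}",
        f"*Source:* {data['src']}",
        f"*Result:* {data['result']}",
        f"*Expected result:* {data['expected']}",
        f"*Applied options:* {options}",
        "*Notes:*",
        # Optional key: fall back to a placeholder instead of raising.
        data.get("notes") or "(none)",
    ]
    return "\n".join(lines)


report = format_report(
    {"lang": "yiddish", "src": "abc", "result": "xyz", "expected": "xyw"}
)
print(report)
```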