Update to Python 3 and iRODS 4.3.3 #536

Merged on Dec 11, 2024 (27 commits)
bbfbaf0
YDA-5992: remove Python 2 lint workflow
lwesterhof Nov 19, 2024
72dcf33
YDA-5992: Python3 simplifies UTF-8 handling
lwesterhof Nov 19, 2024
3f57299
YDA-5992: as of PEP 3120, the default encoding is UTF-8
lwesterhof Nov 19, 2024
7540863
YDA-5992: remove unicode literals, percent string formatting and upda…
lwesterhof Nov 19, 2024
d67314e
YDA-5992: put underscores in access_types
lwesterhof Nov 19, 2024
1397ecf
YDA-5992: Python 3 returns iterators from map() and filter()
lwesterhof Nov 20, 2024
71ea79d
YDA-5992: range() does not return a list
lwesterhof Nov 20, 2024
99de46b
YDA-5992: write string to severLog instead of line since logs are in …
lwesterhof Nov 20, 2024
01c72a0
YDA-5992: upgrade workflows to Python 3.12
lwesterhof Nov 22, 2024
6793b4b
YDA-5992: install pysqlcipher3 with ruleset
lwesterhof Nov 22, 2024
530f787
YDA-5992: fix parsing query columns
lwesterhof Nov 22, 2024
90ed61c
YDA-5992: convert binary string
lwesterhof Nov 22, 2024
dd69f97
YDA-5992: use msiBytesBufToStr to convert iRODS bytes buffer to string
lwesterhof Nov 26, 2024
c91a0f3
YDA-5992: fix type errors
lwesterhof Nov 27, 2024
d64a41a
YDA-5992: replace itertools.imap() with map()
lwesterhof Nov 27, 2024
ee1fb91
YDA-5992: catch all exceptions when loading text data object
lwesterhof Nov 28, 2024
6f57454
YDA-5992: fix encoding for DataCite payload
lwesterhof Nov 28, 2024
dfb2ae9
YDA-5992: don't try to decode Unicode strings
lwesterhof Nov 28, 2024
ff7d99e
YDA-5992: cleanup and replace itertools.ifilter with filter
lwesterhof Nov 29, 2024
72a31af
YDA-5992: rearrange admin scripts
lwesterhof Dec 2, 2024
2943ea7
YDA-5992: fix urllib quoting
lwesterhof Dec 3, 2024
e863948
YDA-5992: init UUError message
lwesterhof Dec 3, 2024
991fc21
YDA-5992: limit search for data packages to vault
lwesterhof Dec 4, 2024
b8797bb
YDA-5992: clean up scheduled admin jobs
lwesterhof Dec 4, 2024
8893788
YDA-5992: increase waiting time for archival of deposit
lwesterhof Dec 5, 2024
23d6670
YDA-5992: add type annotations
lwesterhof Dec 10, 2024
02a658a
YDA-5992: add type annotations to utils
lwesterhof Dec 11, 2024
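
Several of the commits above (1397ecf, 71ea79d, d64a41a, ff7d99e) deal with the fact that `map()`, `filter()` and `range()` are lazy in Python 3. The following is a minimal sketch of why results need to be wrapped in `list()` before calling `len()` on them; the `rows` data and the `transform` helper are illustrative stand-ins, not code from the ruleset.

```python
# Hypothetical stand-in rows; not taken from the Yoda ruleset.
rows = [{"COLL_NAME": "/zone/home/a"}, {"COLL_NAME": "/zone/home/b"}]


def transform(row):
    """Illustrative helper: map a query row to a small result dict."""
    return {"name": row["COLL_NAME"]}


# In Python 3, map() returns a one-shot iterator rather than a list,
# so len() raises TypeError on it.
lazy = map(transform, rows)
try:
    len(lazy)
    lazy_supports_len = True
except TypeError:
    lazy_supports_len = False

# Wrapping in list() materialises the results, so len() and indexing work.
colls = list(map(transform, rows))

# range() is likewise a lazy sequence object; list() turns it into a list.
indices = list(range(3))
```

Materialising with `list()` keeps the Python 2 behaviour at call sites that count or index the results; code that only iterates once can keep the lazy form.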
2 changes: 1 addition & 1 deletion .github/workflows/api-and-integration-tests.yml
@@ -114,7 +114,7 @@ jobs:
cd tests
nohup bash -c 'while true ; do sleep 5 ; ../yoda/docker/run-cronjob.sh copytovault >> ../copytovault.log 2>&1 ; ../yoda/docker/run-cronjob.sh publication >> ../publication.log 2>&1 ; done' &
test -d mycache || mkdir -p mycache
-python3 -m pytest --skip-ui --datarequest --deposit -o cache_dir=mycache --environment environments/docker.json
+python3 -m pytest --skip-ui --deposit -o cache_dir=mycache --environment environments/docker.json
cat ../copytovault.log
cat ../publication.log

11 changes: 8 additions & 3 deletions .github/workflows/python.yml
@@ -4,10 +4,10 @@ on: [push, pull_request]

jobs:
lint:
-runs-on: ubuntu-20.04
+runs-on: ubuntu-24.04
strategy:
matrix:
-python-version: ['3.8', '3.9', '3.10', '3.11']
+python-version: ['3.11', '3.12']
steps:
- uses: actions/checkout@v4
- name: Set up Python
@@ -19,12 +19,17 @@ jobs:
- name: Install dependencies
run: |
python -m pip install --upgrade pip
-python -m pip install flake8==6.0.0 flake8-import-order==0.18.2 darglint==1.8.1 codespell types-requests
+python -m pip install flake8==6.0.0 flake8-import-order==0.18.2 darglint==1.8.1 codespell
+python -m pip install mypy types-requests types-python-dateutil types-redis

- name: Lint with flake8
run: |
flake8 --statistics

+- name: Check static typing
+run: |
+mypy . --explicit-package-bases
+
- name: Check code for common misspellings
run: |
codespell -q 3 --skip="*.r,*.xsd,*.json" || true
35 changes: 0 additions & 35 deletions .github/workflows/python2.yml

This file was deleted.

16 changes: 7 additions & 9 deletions .github/workflows/unit-tests.yml
@@ -10,31 +10,29 @@ on:

jobs:
unit-tests:
-runs-on: ubuntu-latest
+runs-on: ubuntu-24.04
strategy:
matrix:
-python-version: [2.7]
+python-version: ['3.12']
steps:
- uses: actions/checkout@v4

- name: Set up Python
-# setup-python stopped supporting Python 2.7, use https://github.com/MatteoH2O1999/setup-python
-uses: MatteoH2O1999/setup-python@…
+uses: actions/setup-python@v5
with:
python-version: ${{ matrix.python-version }}
-allow-build: info
-cache-build: true
+architecture: x64

- name: Install dependencies
run: |
python -m pip install --upgrade pip
python -m pip install -r requirements.txt
-python -m pip install coveragepy==1.6.0
+python -m pip install coverage==7.6.7

- name: Run unit tests
run: |
cd unit-tests
-coverage run --omit=test_*.py,unit_tests.py --source=$(cd .. ; pwd),$(cd ../util ; pwd) -m unittest unit_tests
+export PYTHONPATH=$(cd ../util ; pwd):$PYTHONPATH
+coverage run --omit=test_*.py,unit_tests.py -m unittest unit_tests

- name: Report code coverage
run: |
6 changes: 4 additions & 2 deletions __init__.py
@@ -1,7 +1,6 @@
-# -*- coding: utf-8 -*-
"""Yoda core ruleset containing iRODS and Python rules and policies useful for all Yoda environments."""

-__version__ = '1.10.0'
+__version__ = '2.0.0'
__copyright__ = 'Copyright (c) 2015-2024, Utrecht University'
__license__ = 'GPLv3, see LICENSE'

@@ -23,6 +22,9 @@
+ ', Jelmer Zondergeld')
# (in alphabetical order)

+import sys
+sys.path.extend([ '/etc/irods/rules_uu', '/etc/irods/rules_uu/util' ])
+
# Import all modules containing rules into the package namespace,
# so that they become visible to iRODS.

3 changes: 1 addition & 2 deletions admin.py
@@ -1,4 +1,3 @@
-# -*- coding: utf-8 -*-
"""Functions for admin module."""

__copyright__ = 'Copyright 2024, Utrecht University'
@@ -12,7 +11,7 @@


@api.make()
-def api_admin_has_access(ctx):
+def api_admin_has_access(ctx: rule.Context) -> api.Result:
"""
Checks if the user has admin access based on user rights or membership in admin-priv group.

81 changes: 42 additions & 39 deletions browse.py
@@ -1,11 +1,11 @@
-# -*- coding: utf-8 -*-
"""Functions for listing collection information."""

__copyright__ = 'Copyright (c) 2019-2024, Utrecht University'
__license__ = 'GPLv3, see LICENSE'

import re
from collections import OrderedDict
+from typing import Dict

import magic
from genquery import AS_DICT, Query
@@ -19,13 +19,13 @@


@api.make()
-def api_browse_folder(ctx,
-coll='/',
-sort_on='name',
-sort_order='asc',
-offset=0,
-limit=10,
-space=pathutil.Space.OTHER.value):
+def api_browse_folder(ctx: rule.Context,
+coll: str = '/',
+sort_on: str = 'name',
+sort_order: str = 'asc',
+offset: int = 0,
+limit: int = 10,
+space: str = pathutil.Space.OTHER.value) -> api.Result:
"""Get paginated collection contents, including size/modify date information.

:param ctx: Combined type of a callback and rei struct
@@ -38,9 +38,9 @@ def api_browse_folder(ctx,

:returns: Dict with paginated collection contents
"""
-def transform(row):
+def transform(row: Dict) -> Dict:
# Remove ORDER_BY etc. wrappers from column names.
-x = {re.sub('.*\((.*)\)', '\\1', k): v for k, v in row.items()}
+x = {re.sub(r'.*\((.*)\)', '\\1', k): v for k, v in row.items()}
if 'DATA_NAME' in x and 'META_DATA_ATTR_VALUE' in x:
return {x['DATA_NAME']: x['META_DATA_ATTR_VALUE']}
elif 'DATA_NAME' in x:
@@ -89,11 +89,11 @@ def transform(row):
qcoll = Query(ctx, ccols, "COLL_PARENT_NAME = '{}'".format(coll),
offset=offset, limit=limit, output=AS_DICT)

-colls = map(transform, [c for c in list(qcoll) if _filter_vault_deposit_index(c)])
+colls = list(map(transform, [c for c in list(qcoll) if _filter_vault_deposit_index(c)]))

qdata = Query(ctx, dcols, "COLL_NAME = '{}' AND DATA_REPL_STATUS n> '0'".format(coll),
offset=max(0, offset - qcoll.total_rows()), limit=limit - len(colls), output=AS_DICT)
-datas = map(transform, list(qdata))
+datas = list(map(transform, list(qdata)))

# No results at all? Make sure the collection actually exists.
if len(colls) + len(datas) == 0 and not collection.exists(ctx, coll):
@@ -105,13 +105,13 @@ def transform(row):


@api.make()
-def api_browse_collections(ctx,
-coll='/',
-sort_on='name',
-sort_order='asc',
-offset=0,
-limit=10,
-space=pathutil.Space.OTHER.value):
+def api_browse_collections(ctx: rule.Context,
+coll: str = '/',
+sort_on: str = 'name',
+sort_order: str = 'asc',
+offset: int = 0,
+limit: int = 10,
+space: str = pathutil.Space.OTHER.value) -> api.Result:
"""Get paginated collection contents, including size/modify date information.

This function browses a folder and only looks at the collections in it. No dataobjects.
@@ -127,9 +127,9 @@ def api_browse_collections(ctx,

:returns: Dict with paginated collection contents
"""
-def transform(row):
+def transform(row: Dict) -> Dict:
# Remove ORDER_BY etc. wrappers from column names.
-x = {re.sub('.*\((.*)\)', '\\1', k): v for k, v in row.items()}
+x = {re.sub(r'.*\((.*)\)', '\\1', k): v for k, v in row.items()}

if 'DATA_NAME' in x:
return {'name': x['DATA_NAME'],
@@ -173,7 +173,7 @@ def transform(row):
qcoll = Query(ctx, ccols, "COLL_PARENT_NAME = '{}'".format(coll),
offset=offset, limit=limit, output=AS_DICT)

-colls = map(transform, [d for d in list(qcoll) if _filter_vault_deposit_index(d)])
+colls = list(map(transform, [d for d in list(qcoll) if _filter_vault_deposit_index(d)]))

# No results at all? Make sure the collection actually exists.
if len(colls) == 0 and not collection.exists(ctx, coll):
@@ -185,13 +185,13 @@ def transform(row):


@api.make()
-def api_search(ctx,
-search_string,
-search_type='filename',
-sort_on='name',
-sort_order='asc',
-offset=0,
-limit=10):
+def api_search(ctx: rule.Context,
+search_string: str,
+search_type: str = 'filename',
+sort_on: str = 'name',
+sort_order: str = 'asc',
+offset: int = 0,
+limit: int = 10) -> api.Result:
"""Get paginated search results, including size/modify date/location information.

:param ctx: Combined type of a callback and rei struct
@@ -204,9 +204,9 @@ def api_search(ctx,

:returns: Dict with paginated search results
"""
-def transform(row):
+def transform(row: Dict) -> Dict:
# Remove ORDER_BY etc. wrappers from column names.
-x = {re.sub('.*\((.*)\)', '\\1', k): v for k, v in row.items()}
+x = {re.sub(r'.*\((.*)\)', '\\1', k): v for k, v in row.items()}

if 'DATA_NAME' in x:
_, _, path, subpath = pathutil.info(x['COLL_NAME'])
@@ -217,23 +217,24 @@ def transform(row):
'type': 'data',
'size': int(x['DATA_SIZE']),
'modify_time': int(x['DATA_MODIFY_TIME'])}
-
-if 'COLL_NAME' in x:
+elif 'COLL_NAME' in x:
_, _, path, subpath = pathutil.info(x['COLL_NAME'])
if subpath != '':
path = path + "/" + subpath

return {'name': "/{}".format(path),
'type': 'coll',
'modify_time': int(x['COLL_MODIFY_TIME'])}
+else:
+return {}

# Replace, %, _ and \ since iRODS does not handle those correctly.
# HdR this can only be done in a situation where search_type is NOT status!
# Status description must be kept in tact.
if search_type != 'status':
search_string = search_string.replace("\\", "\\\\")
-search_string = search_string.replace("%", "\%")
-search_string = search_string.replace("_", "\_")
+search_string = search_string.replace("%", r"\%")
+search_string = search_string.replace("_", r"\_")

zone = user.zone(ctx)

@@ -280,13 +281,13 @@ def transform(row):
qdata = Query(ctx, cols, where, offset=max(0, int(offset)),
limit=int(limit), case_sensitive=query_is_case_sensitive, output=AS_DICT)

-datas = map(transform, [d for d in list(qdata) if _filter_vault_deposit_index(d)])
+datas = list(map(transform, [d for d in list(qdata) if _filter_vault_deposit_index(d)]))

return OrderedDict([('total', qdata.total_rows()),
('items', datas)])


-def _filter_vault_deposit_index(row):
+def _filter_vault_deposit_index(row: Dict) -> bool:
"""This internal function filters out index collections in deposit vault collections.
These collections are used internally by Yoda for indexing data package metadata, and
should not be displayed.
@@ -296,14 +297,14 @@ def _filter_vault_deposit_index(row):
:returns: boolean value that indicates whether row should be displayed
"""
# Remove ORDER_BY etc. wrappers from column names.
-x = {re.sub('.*\((.*)\)', '\\1', k): v for k, v in row.items()}
+x = {re.sub(r'.*\((.*)\)', '\\1', k): v for k, v in row.items()}
# Filter out deposit vault index collection
return not re.match("^/[^/]+/home/vault-[^/]+/deposit-[^/]+/index$",
x['COLL_NAME'])


@api.make()
-def api_load_text_obj(ctx, file_path='/'):
+def api_load_text_obj(ctx: rule.Context, file_path: str = '/') -> api.Result:
"""Retrieve a text file (as a string) in either the research, deposit, or vault space.

:param ctx: Combined type of a callback and rei struct
@@ -345,3 +346,5 @@ def api_load_text_obj(ctx, file_path='/'):
return api.Error('large_size', 'The given text file is too large to render')
except error.UUError:
return api.Error('ReadError', 'Could not retrieve file')
+except Exception:
+return api.Error('not_valid', 'The given data object is not a text file')
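
The browse.py changes above switch the column-name regex and the search-string escapes to raw strings. Sequences such as `'\('` and `"\%"` are invalid escapes in a normal string literal and produce a `SyntaxWarning` from Python 3.12 onwards. A minimal sketch of both patterns in isolation follows; the function names `strip_wrapper` and `escape_like` are hypothetical and do not appear in the ruleset.

```python
import re


def strip_wrapper(column: str) -> str:
    """Remove an ORDER_BY-style wrapper, e.g. 'ORDER(COLL_NAME)' -> 'COLL_NAME'."""
    # Raw string: backslashes reach the regex engine unchanged.
    return re.sub(r'.*\((.*)\)', '\\1', column)


def escape_like(search_string: str) -> str:
    """Escape backslash, % and _ so they are treated as literal characters."""
    # The backslash is escaped first so later escapes are not doubled.
    search_string = search_string.replace("\\", "\\\\")
    search_string = search_string.replace("%", r"\%")
    search_string = search_string.replace("_", r"\_")
    return search_string


wrapped = strip_wrapper("ORDER(COLL_NAME)")
plain = strip_wrapper("DATA_NAME")
escaped = escape_like("50%_done")
```

Column names without a wrapper pass through `strip_wrapper` unchanged, because `re.sub` returns the input when the pattern does not match.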