Update development dependencies and improve release process #1652

Merged
merged 32 commits into from
Apr 29, 2024
Changes from 18 commits
Commits
31fc81b
Recover dependencies; merge make/hatch
roll Apr 16, 2024
955d90f
Fixed linting
roll Apr 16, 2024
b049367
Use hatch shortcuts
roll Apr 16, 2024
e0a067d
Fixed analyzer/schema/steps type errors
roll Apr 16, 2024
a79adf0
Fixed resource type errors
roll Apr 16, 2024
99e912e
Fixed lazy_fixtures usage
roll Apr 16, 2024
e65a2f5
Started moving tests closer to the codebase
roll Apr 16, 2024
35c829a
Migrated schemes tests
roll Apr 16, 2024
97141d7
Migrated resources tests
roll Apr 16, 2024
9a3bdc9
Migrated formats tests
roll Apr 16, 2024
58a7779
Migrated portals tests
roll Apr 16, 2024
6779b9e
Improved modularity
roll Apr 16, 2024
a97c773
Recover python3.12 on CI
roll Apr 16, 2024
2eda556
Skip tests failing on python3.12
roll Apr 16, 2024
781ff99
Recover python3.8/9 on CI
roll Apr 16, 2024
8e1d1aa
Skip pytest-vcr related failures on python3.8/9
roll Apr 16, 2024
67a8604
Added py3.12 into project description
roll Apr 16, 2024
0f989ad
Fixed tests
roll Apr 16, 2024
9acf0e1
Don't use __all__ in formats
roll Apr 29, 2024
3c17e77
Don't use __all__ in portals
roll Apr 29, 2024
7795b24
Don't use __all__ in schemes
roll Apr 29, 2024
6bbf202
Don't use __all__ in steps
roll Apr 29, 2024
062aa0e
Don't use __all__ in checks
roll Apr 29, 2024
09fd600
Don't use __all__ in resources
roll Apr 29, 2024
db47fb7
Don't use __all__ in fields
roll Apr 29, 2024
91a4efc
Don't use __all__ in errors
roll Apr 29, 2024
f784714
Don't use __all__ in root
roll Apr 29, 2024
54f3d61
Updated Python3.12+ test skip messages
roll Apr 29, 2024
45a8c41
Remove `jsonschema` upper version limit
roll Apr 29, 2024
c093db6
Fixed macos setup on CI
roll Apr 29, 2024
6a6c333
Downgrade macos version on CI
roll Apr 29, 2024
5c0b901
Downgrade macos on CI
roll Apr 29, 2024
11 changes: 3 additions & 8 deletions .github/workflows/general.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -20,12 +20,7 @@ jobs:
runs-on: ubuntu-latest
strategy:
matrix:
# TODO: recover 3.8 and 3.9 when pytest-vcr is fixed
# https://github.com/ktosiek/pytest-vcr/issues/53
# TODO: recover 3.12 when duck is fixed
# https://github.com/duckdb/duckdb/issues/9563
# python-version: [3.8, 3.9, "3.10", "3.11", "3.12"]
python-version: ["3.10", "3.11"]
python-version: ["3.8", "3.9", "3.10", "3.11", "3.12"]
steps:
- name: Checkout repository
uses: actions/checkout@v4
Expand Down Expand Up @@ -90,7 +85,7 @@ jobs:
run: cp .env.example .env
- name: Test software
# https://stackoverflow.com/questions/9678408/cant-install-psycopg2-with-pip-in-virtualenv-on-mac-os-x-10-7
run: LDFLAGS=`echo $(pg_config --ldflags)` make test
run: LDFLAGS=`echo $(pg_config --ldflags)` hatch run +py=3.10 ci:test

# Test (Windows)

Expand All @@ -109,7 +104,7 @@ jobs:
- name: Prepare variables
run: cp .env.example .env
- name: Test software
run: make test
run: hatch run +py=3.10 ci:test

# Deploy

Expand Down
28 changes: 18 additions & 10 deletions CONTRIBUTING.md
Original file line number Diff line number Diff line change
Expand Up @@ -49,12 +49,10 @@ hatch shell
Use the following command to build the container:

```bash tabs=CLI
make docker
hatch run image
```

This should take care of setting up everything. If the container is
built without errors, you can then run commands like `make` inside the
container to accomplish various tasks (see the next section for details).
This should take care of setting up everything. If the container is built without errors, you can then run commands like `hatch` inside the container to accomplish various tasks (see the next section for details).

To make things easier, we can create an alias:

Expand All @@ -65,7 +63,7 @@ alias "frictionless-dev=docker run --rm -v $PWD:/home/frictionless -it frictionl
Then, for example, to run the tests, we can use:

```bash tabs=CLI
frictionless-dev make test
frictionless-dev hatch run test
```

## Development
Expand All @@ -74,13 +72,11 @@ frictionless-dev make test

Frictionless is a Python3.8+ framework, and it uses some common Python tools for the development process (we recommend enabling support of these tools in your IDE):

- code linting: `ruff`
- import sorting: `isort`
- code formatting: `black`
- linting/formatting: `ruff`
- type checking: `pyright`
- code testing: `pytest`

You also need `git` to work on the project, and `make` is recommended.
You also need `git` to work on the project.

### Documentation

Expand Down Expand Up @@ -117,33 +113,44 @@ def vcr_config():
- Setup CKAN local instance: https://github.com/okfn/docker-ckan
- Create a sysadmin account and generate api token
- Set apikey token in .env file

```
CKAN_APIKEY=***************************
```

#### Regenerating cassettes for Zenodo

**Read**

- To read, we need to use live site, the api library uses it by default.
- Login to zenodo if you have an account and create an access token.
- Set access token in .env file.

```
ZENODO_ACCESS_TOKEN=***************************
```

**Write**

- To write we can use either live site or sandbox. We recommend to use sandbox (https://sandbox.zenodo.org/api/).
- Login to zenodo(sandbox) if you have an account and create an access token.
- Set access token in .env file.

```
ZENODO_SANDBOX_ACCESS_TOKEN=***************************
```

- Set base_url in the control params

```
base_url='base_url="https://sandbox.zenodo.org/api/'
```

#### Regenerating cassettes for Github

- Login to github if you have an account and create an access token(Developer settings > Personal access tokens > Tokens).
- Set access token and other details in .env file. If email/name of the user is hidden we need to provide those details as well.

```
GITHUB_NAME=FD
[email protected]
Expand All @@ -153,8 +160,9 @@ GITHUB_ACCESS_TOKEN=***************************
## Releasing

To release a new version:

- check that you have push access to the `main` branch
- run `hatch version <major|minor|micro>` to update the version
- add changes to `CHANGELOG.md` if it's not a patch release (major or minor)
- run `make release` which create a release commit and tag and push it to Github
- run `hatch run release` which create a release commit and tag and push it to Github
- an actual release will happen on the Github CI platform after running the tests
40 changes: 0 additions & 40 deletions Makefile

This file was deleted.

File renamed without changes.
File renamed without changes.
File renamed without changes.
Original file line number Diff line number Diff line change
Expand Up @@ -134,6 +134,7 @@ def test_analyze_resource_detailed_descriptive_statistics_with_outliers():
assert analysis["fieldStats"]["average_grades"]["outliers"] == [10000.0]


@pytest.mark.skipif(sys.version_info >= (3, 12), reason="Fix for Python3.12+")
def test_analyze_resource_detailed_descriptive_statistics_variables_correlation():
resource = TableResource(path="data/analysis-data.csv")
analysis = resource.analyze(detailed=True)
Expand Down
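
The version gates added in this PR (`sys.version_info >= (3, 12)` and, in later files, `sys.version_info < (3, 10)`) can be tried out without pytest. The sketch below is an illustration only: it swaps `pytest.mark.skipif` for the standard library's `unittest.skipIf`, which takes the same kind of condition and reason, and the class, test, and flag names are hypothetical rather than taken from the frictionless test suite.

```python
import sys
import unittest

# Hypothetical flags mirroring the two skip conditions used in this PR.
PY312_PLUS = sys.version_info >= (3, 12)
PY_BELOW_310 = sys.version_info < (3, 10)


class VersionGatedTests(unittest.TestCase):
    # Skipped on Python 3.12 and newer, executed on older interpreters.
    @unittest.skipIf(PY312_PLUS, "Fix for Python3.12+")
    def test_only_below_312(self):
        self.assertLess(sys.version_info[:2], (3, 12))

    # Skipped on Python 3.8/3.9, executed from 3.10 onwards.
    @unittest.skipIf(PY_BELOW_310, "pytest-vcr bug in Python3.8/9")
    def test_only_from_310(self):
        self.assertGreaterEqual(sys.version_info[:2], (3, 10))


def run_suite():
    """Run both tests and return the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(VersionGatedTests)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

A skipped test is reported as skipped rather than failed, so the run stays green on every interpreter in the CI matrix while the reason string records why the test was excluded.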
12 changes: 9 additions & 3 deletions frictionless/analyzer/analyzer.py
Original file line number Diff line number Diff line change
Expand Up @@ -84,7 +84,9 @@ def analyze_table_resource(
_statistics(rows_without_nan_values) # type: ignore
)
analysis_report["fieldStats"][field.name]["outliers"] = []
analysis_report["fieldStats"][field.name]["missingValues"] = resource.stats.rows - len(rows_without_nan_values) # type: ignore
analysis_report["fieldStats"][field.name]["missingValues"] = (
resource.stats.rows - len(rows_without_nan_values) # type: ignore
)

# calculate correlation between variables(columns/fields)
for field_y in resource.schema.fields:
Expand Down Expand Up @@ -123,10 +125,14 @@ def analyze_table_resource(
"outliers"
].append(cell)

analysis_report["notNullRows"] = resource.stats.rows - analysis_report["rowsWithNullValues"] # type: ignore
analysis_report["notNullRows"] = ( # type: ignore
resource.stats.rows - analysis_report["rowsWithNullValues"] # type: ignore
)
analysis_report["averageRecordSizeInBytes"] = 0
if resource.stats.rows and resource.stats.bytes:
analysis_report["averageRecordSizeInBytes"] = resource.stats.bytes / resource.stats.rows # type: ignore
analysis_report["averageRecordSizeInBytes"] = (
resource.stats.bytes / resource.stats.rows
) # type: ignore
analysis_report["timeTaken"] = timer.time
return {
**analysis_report,
Expand Down
File renamed without changes.
File renamed without changes.
File renamed without changes.
4 changes: 3 additions & 1 deletion tests/conftest.py → frictionless/conftest.py
Original file line number Diff line number Diff line change
Expand Up @@ -129,7 +129,9 @@ def vcr_cassette_dir(request):
def populate_db(engine):
with engine.begin() as conn:
conn.execute(sa.text('CREATE TABLE "table" (id INTEGER PRIMARY KEY, name TEXT)'))
conn.execute(sa.text("INSERT INTO \"table\" VALUES (1, 'english'), (2, '中国人')"))
conn.execute(
sa.text("INSERT INTO \"table\" VALUES (1, 'english'), (2, '中国人')")
)
conn.execute(
sa.text(
"CREATE TABLE fruits (uid INTEGER PRIMARY KEY, fruit_name TEXT, calories INTEGER)"
Expand Down
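
The first two statements of `populate_db` depend on double-quoting the identifier `table`, which is a reserved word in SQL. A minimal sketch of the same statements follows, assuming an in-memory SQLite database and the standard library's `sqlite3` module in place of the SQLAlchemy engine that the fixture receives.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for the fixture's engine.
conn = sqlite3.connect(":memory:")

# "table" is a reserved word, so the identifier must be double-quoted.
conn.execute('CREATE TABLE "table" (id INTEGER PRIMARY KEY, name TEXT)')
conn.execute("INSERT INTO \"table\" VALUES (1, 'english'), (2, '中国人')")

# Read the rows back to confirm that the non-ASCII value round-trips.
rows = conn.execute('SELECT id, name FROM "table" ORDER BY id').fetchall()
conn.close()
```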
File renamed without changes.
17 changes: 15 additions & 2 deletions frictionless/console/commands/__init__.py
Original file line number Diff line number Diff line change
@@ -1,3 +1,16 @@
# Register modules
from . import convert, describe, explore, extract, index, inspect, list, publish, query
from . import script, summary, transform, validate
from . import (
convert,
describe,
explore,
extract,
index,
inspect,
list,
publish,
query,
script,
summary,
transform,
validate,
)
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
Original file line number Diff line number Diff line change
@@ -1,3 +1,5 @@
import sys

import pytest

from frictionless import Detector, Dialect, formats, platform
Expand Down Expand Up @@ -106,6 +108,7 @@ def test_csv_parser_buffer():


@pytest.mark.vcr
@pytest.mark.skipif(sys.version_info < (3, 10), reason="pytest-vcr bug in Python3.8/9")
def test_csv_parser_remote():
with TableResource(path=BASEURL % "data/table.csv") as resource:
assert resource.header == ["id", "name"]
Expand Down
File renamed without changes.
File renamed without changes.
Original file line number Diff line number Diff line change
@@ -1,10 +1,17 @@
import io
import sys
from decimal import Decimal

import pytest

from frictionless import Detector, Dialect, FrictionlessException, Package, formats
from frictionless import platform
from frictionless import (
Detector,
Dialect,
FrictionlessException,
Package,
formats,
platform,
)
from frictionless.resources import TableResource

BASEURL = "https://raw.githubusercontent.com/frictionlessdata/frictionless-py/master/%s"
Expand All @@ -24,6 +31,7 @@ def test_xlsx_parser_table():


@pytest.mark.vcr
@pytest.mark.skipif(sys.version_info >= (3, 12), reason="Fix for Python3.12+")
def test_xlsx_parser_remote():
path = BASEURL % "data/table.xlsx"
with TableResource(path=path) as resource:
Expand Down Expand Up @@ -166,6 +174,7 @@ def test_xlsx_parser_preserve_formatting_number_multicode():


@pytest.mark.vcr
@pytest.mark.skipif(sys.version_info >= (3, 12), reason="Fix for Python3.12+")
def test_xlsx_parser_workbook_cache():
path = BASEURL % "data/sheets.xlsx"
for sheet in ["Sheet1", "Sheet2", "Sheet3"]:
Expand Down Expand Up @@ -201,6 +210,7 @@ def test_xlsx_parser_merged_cells_fill_boolean():


@pytest.mark.vcr
@pytest.mark.skipif(sys.version_info >= (3, 12), reason="Fix for Python3.12+")
def test_xlsx_parser_fix_for_2007_xls():
path = "https://ams3.digitaloceanspaces.com/budgetkey-files/spending-reports/2018-3-משרד התרבות והספורט-לשכת הפרסום הממשלתית-2018-10-22-c457.xls"
with TableResource(path=path, format="xlsx") as resource:
Expand Down
Original file line number Diff line number Diff line change
@@ -1,4 +1,5 @@
import json
import sys

import pytest

Expand Down Expand Up @@ -66,6 +67,7 @@ def test_json_parser_from_buffer_keyed():


@pytest.mark.vcr
@pytest.mark.skipif(sys.version_info < (3, 10), reason="pytest-vcr bug in Python3.8/9")
def test_json_parser_from_remote():
with TableResource(path=BASEURL % "data/table.json") as resource:
assert resource.header == ["id", "name"]
Expand All @@ -76,6 +78,7 @@ def test_json_parser_from_remote():


@pytest.mark.vcr
@pytest.mark.skipif(sys.version_info < (3, 10), reason="pytest-vcr bug in Python3.8/9")
def test_json_parser_from_remote_keyed():
with TableResource(path=BASEURL % "data/table.keyed.json") as resource:
assert resource.dialect.to_descriptor() == {"json": {"keyed": True}}
Expand Down
Original file line number Diff line number Diff line change
@@ -1,7 +1,15 @@
import pytest

from frictionless import Checklist, Dialect, Inquiry, Package, Pipeline, Report
from frictionless import Resource, Schema
from frictionless import (
Checklist,
Dialect,
Inquiry,
Package,
Pipeline,
Report,
Resource,
Schema,
)

SCHEMA = {
"fields": [
Expand Down
5 changes: 2 additions & 3 deletions frictionless/formats/sql/adapter.py
Original file line number Diff line number Diff line change
Expand Up @@ -3,11 +3,10 @@
import re
from typing import TYPE_CHECKING, Any, Callable, Dict, Generator, List, Optional

from ... import models
from ...package import Package
from ...platform import platform
from ...resource import Resource
from ...system import Adapter
from ...system import Adapter, PublishResult
from . import settings
from .control import SqlControl
from .mapper import SqlMapper
Expand Down Expand Up @@ -111,7 +110,7 @@ def write_package(self, package: Package):
resource = package.get_table_resource(table.name)
with resource:
self.write_row_stream(resource.row_stream, table_name=table.name)
return models.PublishResult(
return PublishResult(
url=self.engine.url.render_as_string(hide_password=True),
context=dict(engine=self.engine),
)
Expand Down
Original file line number Diff line number Diff line change
@@ -1,4 +1,5 @@
import os
import sys
import zipfile

import pytest
Expand Down Expand Up @@ -39,6 +40,7 @@ def test_zip_adapter_to_zip_resource_path(tmpdir):


@pytest.mark.vcr
@pytest.mark.skipif(sys.version_info < (3, 10), reason="pytest-vcr bug in Python3.8/9")
def test_zip_adapter_to_zip_resource_remote_path(tmpdir):
path = os.path.join(tmpdir, "package.zip")
source = Package(resources=[Resource(path=BASEURL % "data/table.csv")])
Expand Down
6 changes: 3 additions & 3 deletions frictionless/formats/zip/adapter.py
Original file line number Diff line number Diff line change
Expand Up @@ -5,12 +5,12 @@
import tempfile
from typing import Optional

from ... import errors, helpers, models
from ... import errors, helpers
from ...exception import FrictionlessException
from ...package import Package
from ...platform import platform
from ...resources import FileResource, TableResource
from ...system import Adapter
from ...system import Adapter, PublishResult
from .control import ZipControl

# NOTE:
Expand Down Expand Up @@ -112,4 +112,4 @@ def write_package(self, package: Package):
error = errors.PackageError(note=str(exception))
raise FrictionlessException(error) from exception

return models.PublishResult(context=dict(path=path))
return PublishResult(context=dict(path=path))
1 change: 1 addition & 0 deletions frictionless/helpers/__init__.py
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
from .general import *
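
For context on the "Don't use `__all__`" commits and the star import above: when a module defines no `__all__`, `from module import *` binds every name that does not start with an underscore, and when `__all__` is present, only the names it lists are bound. The sketch below demonstrates this with throwaway in-memory modules. The module and function names are hypothetical and are not part of frictionless.

```python
import sys
import types


def star_import_names(module_name, source):
    """Build a throwaway module from source and return what `import *` exposes."""
    module = types.ModuleType(module_name)
    exec(source, module.__dict__)
    sys.modules[module_name] = module
    try:
        namespace = {}
        exec(f"from {module_name} import *", namespace)
    finally:
        # Remove the temporary module so the demo leaves no trace behind.
        del sys.modules[module_name]
    # Drop dunder entries such as __builtins__ that exec() adds to the namespace.
    return sorted(name for name in namespace if not name.startswith("__"))


# Hypothetical module body with one public and one underscore-prefixed helper.
SOURCE = "def public_helper(): return 1\ndef _internal_helper(): return 2\n"

# Without __all__, only the name without a leading underscore is exported.
without_all = star_import_names("demo_general_a", SOURCE)

# With __all__, exactly the listed names are exported, underscore or not.
with_all = star_import_names(
    "demo_general_b", SOURCE + "__all__ = ['_internal_helper']\n"
)
```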
File renamed without changes.