Update development dependencies and improve release process #1652

Merged on Apr 29, 2024 (32 commits)
Commits
31fc81b
Recover dependencies; merge make/hatch
roll Apr 16, 2024
955d90f
Fixed linting
roll Apr 16, 2024
b049367
Use hatch shortcuts
roll Apr 16, 2024
e0a067d
Fixed analyzer/schema/steps type errors
roll Apr 16, 2024
a79adf0
Fixed resource type errors
roll Apr 16, 2024
99e912e
Fixed lazy_fixtures usage
roll Apr 16, 2024
e65a2f5
Started moving tests closer to the codebase
roll Apr 16, 2024
35c829a
Migrated schemes tests
roll Apr 16, 2024
97141d7
Migrated resources tests
roll Apr 16, 2024
9a3bdc9
Migrated formats tests
roll Apr 16, 2024
58a7779
Migrated portals tests
roll Apr 16, 2024
6779b9e
Improved modularity
roll Apr 16, 2024
a97c773
Recover python3.12 on CI
roll Apr 16, 2024
2eda556
Skip tests failing on python3.12
roll Apr 16, 2024
781ff99
Recover python3.8/9 on CI
roll Apr 16, 2024
8e1d1aa
Skip pytest-vcr related failures on python3.8/9
roll Apr 16, 2024
67a8604
Added py3.12 into project description
roll Apr 16, 2024
0f989ad
Fixed tests
roll Apr 16, 2024
9acf0e1
Don't use __all__ in formats
roll Apr 29, 2024
3c17e77
Don't use __all__ in portals
roll Apr 29, 2024
7795b24
Don't use __all__ in schemes
roll Apr 29, 2024
6bbf202
Don't use __all__ in steps
roll Apr 29, 2024
062aa0e
Don't use __all__ in checks
roll Apr 29, 2024
09fd600
Don't use __all__ in resources
roll Apr 29, 2024
db47fb7
Don't use __all__ in fields
roll Apr 29, 2024
91a4efc
Don't use __all__ in errors
roll Apr 29, 2024
f784714
Don't use __all__ in root
roll Apr 29, 2024
54f3d61
Updated Python3.12+ test skip messages
roll Apr 29, 2024
45a8c41
Remove `jsonschema` upper version limit
roll Apr 29, 2024
c093db6
Fixed macos setup on CI
roll Apr 29, 2024
6a6c333
Downgrade macos version on CI
roll Apr 29, 2024
5c0b901
Downgrade macos on CI
roll Apr 29, 2024
16 changes: 7 additions & 9 deletions .github/workflows/general.yaml
@@ -20,12 +20,7 @@ jobs:
     runs-on: ubuntu-latest
     strategy:
       matrix:
-        # TODO: recover 3.8 and 3.9 when pytest-vcr is fixed
-        # https://github.com/ktosiek/pytest-vcr/issues/53
-        # TODO: recover 3.12 when duck is fixed
-        # https://github.com/duckdb/duckdb/issues/9563
-        # python-version: [3.8, 3.9, "3.10", "3.11", "3.12"]
-        python-version: ["3.10", "3.11"]
+        python-version: ["3.8", "3.9", "3.10", "3.11", "3.12"]
     steps:
       - name: Checkout repository
         uses: actions/checkout@v4
@@ -76,7 +71,10 @@ jobs:

   test-macos:
     if: github.event_name != 'schedule' || github.repository_owner == 'frictionlessdata'
-    runs-on: macos-latest
+    # TODO: migrate to macos-latest after figuring out how to
+    # make `posgres/pg_config` works in the environment. Currently, it fails
+    # with the following error: "pg_config" not found"
+    runs-on: macos-12
     steps:
       - name: Checkout repository
         uses: actions/checkout@v4
@@ -90,7 +88,7 @@ jobs:
         run: cp .env.example .env
       - name: Test software
         # https://stackoverflow.com/questions/9678408/cant-install-psycopg2-with-pip-in-virtualenv-on-mac-os-x-10-7
-        run: LDFLAGS=`echo $(pg_config --ldflags)` make test
+        run: LDFLAGS=`echo $(pg_config --ldflags)` hatch run +py=3.10 ci:test

   # Test (Windows)

@@ -109,7 +107,7 @@ jobs:
       - name: Prepare variables
         run: cp .env.example .env
       - name: Test software
-        run: make test
+        run: hatch run +py=3.10 ci:test

   # Deploy

28 changes: 18 additions & 10 deletions CONTRIBUTING.md
@@ -49,12 +49,10 @@ hatch shell
 Use the following command to build the container:

 ```bash tabs=CLI
-make docker
+hatch run image
 ```

-This should take care of setting up everything. If the container is
-built without errors, you can then run commands like `make` inside the
-container to accomplish various tasks (see the next section for details).
+This should take care of setting up everything. If the container is built without errors, you can then run commands like `hatch` inside the container to accomplish various tasks (see the next section for details).

 To make things easier, we can create an alias:

@@ -65,7 +63,7 @@ alias "frictionless-dev=docker run --rm -v $PWD:/home/frictionless -it frictionl
 Then, for example, to run the tests, we can use:

 ```bash tabs=CLI
-frictionless-dev make test
+frictionless-dev hatch run test
 ```

 ## Development
@@ -74,13 +72,11 @@ frictionless-dev make test

 Frictionless is a Python3.8+ framework, and it uses some common Python tools for the development process (we recommend enabling support of these tools in your IDE):

-- code linting: `ruff`
-- import sorting: `isort`
-- code formatting: `black`
+- linting/formatting: `ruff`
 - type checking: `pyright`
 - code testing: `pytest`

-You also need `git` to work on the project, and `make` is recommended.
+You also need `git` to work on the project.

 ### Documentation

@@ -117,33 +113,44 @@ def vcr_config():
- Setup CKAN local instance: https://github.com/okfn/docker-ckan
- Create a sysadmin account and generate api token
- Set apikey token in .env file

```
CKAN_APIKEY=***************************
```

#### Regenerating cassettes for Zenodo

**Read**

- To read, we need to use live site, the api library uses it by default.
- Login to zenodo if you have an account and create an access token.
- Set access token in .env file.

```
ZENODO_ACCESS_TOKEN=***************************
```

**Write**

- To write we can use either live site or sandbox. We recommend to use sandbox (https://sandbox.zenodo.org/api/).
- Login to zenodo(sandbox) if you have an account and create an access token.
- Set access token in .env file.

```
ZENODO_SANDBOX_ACCESS_TOKEN=***************************
```

- Set base_url in the control params

```
base_url='base_url="https://sandbox.zenodo.org/api/'
```

#### Regenerating cassettes for Github

- Login to github if you have an account and create an access token(Developer settings > Personal access tokens > Tokens).
- Set access token and other details in .env file. If email/name of the user is hidden we need to provide those details as well.

```
GITHUB_NAME=FD
[email protected]
@@ -153,8 +160,9 @@ GITHUB_ACCESS_TOKEN=***************************
 ## Releasing

 To release a new version:
+
 - check that you have push access to the `main` branch
 - run `hatch version <major|minor|micro>` to update the version
 - add changes to `CHANGELOG.md` if it's not a patch release (major or minor)
-- run `make release` which create a release commit and tag and push it to Github
+- run `hatch run release` which create a release commit and tag and push it to Github
 - an actual release will happen on the Github CI platform after running the tests
40 changes: 0 additions & 40 deletions Makefile

This file was deleted.

109 changes: 42 additions & 67 deletions frictionless/__init__.py
@@ -1,68 +1,43 @@
-from .actions import convert, describe, extract, index, list, transform, validate
-from .analyzer import Analyzer
-from .catalog import Catalog, Dataset
-from .checklist import Check, Checklist
-from .detector import Detector
-from .dialect import Control, Dialect
-from .error import Error
-from .exception import FrictionlessException
-from .indexer import Indexer
-from .inquiry import Inquiry, InquiryTask
-from .metadata import Metadata
-from .package import Package
-from .pipeline import Pipeline, Step
-from .platform import Platform, platform
-from .report import Report, ReportTask
-from .resource import Resource
-from .schema import Field, Schema
+from .actions import convert as convert
+from .actions import describe as describe
+from .actions import extract as extract
+from .actions import list as list
+from .actions import transform as transform
+from .actions import validate as validate
+from .analyzer import Analyzer as Analyzer
+from .catalog import Catalog as Catalog
+from .catalog import Dataset as Dataset
+from .checklist import Check as Check
+from .checklist import Checklist as Checklist
+from .detector import Detector as Detector
+from .dialect import Control as Control
+from .dialect import Dialect as Dialect
+from .error import Error as Error
+from .exception import FrictionlessException as FrictionlessException
+from .indexer import Indexer as Indexer
+from .inquiry import Inquiry as Inquiry
+from .inquiry import InquiryTask as InquiryTask
+from .metadata import Metadata as Metadata
+from .package import Package as Package
+from .pipeline import Pipeline as Pipeline
+from .pipeline import Step as Step
+from .platform import Platform as Platform
+from .platform import platform as platform
+from .report import Report as Report
+from .report import ReportTask as ReportTask
+from .resource import Resource as Resource
+from .schema import Field as Field
+from .schema import Schema as Schema
 from .settings import VERSION as __version__
-from .system import Adapter, Loader, Mapper, Parser, Plugin, System, system
-from .table import Header, Lookup, Row
-from .transformer import Transformer
-from .validator import Validator
-
-__all__ = [
-    "Adapter",
-    "Analyzer",
-    "Catalog",
-    "Check",
-    "Checklist",
-    "Control",
-    "Dataset",
-    "Detector",
-    "Dialect",
-    "Error",
-    "Field",
-    "FrictionlessException",
-    "Header",
-    "Indexer",
-    "Inquiry",
-    "InquiryTask",
-    "Loader",
-    "Lookup",
-    "Mapper",
-    "Metadata",
-    "Package",
-    "Parser",
-    "Pipeline",
-    "Platform",
-    "Plugin",
-    "Report",
-    "ReportTask",
-    "Resource",
-    "Row",
-    "Schema",
-    "Step",
-    "System",
-    "Transformer",
-    "Validator",
-    "convert",
-    "describe",
-    "extract",
-    "index",
-    "list",
-    "platform",
-    "system",
-    "transform",
-    "validate",
-]
+from .system import Adapter as Adapter
+from .system import Loader as Loader
+from .system import Mapper as Mapper
+from .system import Parser as Parser
+from .system import Plugin as Plugin
+from .system import System as System
+from .system import system as system
+from .table import Header as Header
+from .table import Lookup as Lookup
+from .table import Row as Row
+from .transformer import Transformer as Transformer
+from .validator import Validator as Validator
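
Note on the `import X as X` form used in `frictionless/__init__.py`: the sketch below is a minimal illustration, assuming a throwaway package named `demo_pkg` with a `core` module (both hypothetical, not part of frictionless). At runtime the redundant alias behaves like a plain import; as far as I understand, its purpose is to let type checkers such as pyright treat the name as an explicit re-export, so the package no longer needs an `__all__` list.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Build a tiny package on disk. `demo_pkg` and `core` are hypothetical names
# used only for this illustration.
with tempfile.TemporaryDirectory() as tmp:
    pkg = Path(tmp) / "demo_pkg"
    pkg.mkdir()
    (pkg / "core.py").write_text("class Resource:\n    pass\n")
    # Redundant-alias import: marks `Resource` as re-exported without `__all__`.
    (pkg / "__init__.py").write_text("from .core import Resource as Resource\n")

    sys.path.insert(0, tmp)
    try:
        demo_pkg = importlib.import_module("demo_pkg")
    finally:
        sys.path.remove(tmp)

# The name is reachable from the package root, and no `__all__` is defined.
has_all = hasattr(demo_pkg, "__all__")
exported = demo_pkg.Resource.__name__
print(has_all, exported)
```

One trade-off worth keeping in mind: without `__all__`, a wildcard import of the package picks up every public name in its namespace, so any helper that should stay private needs a leading underscore.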
File renamed without changes.
File renamed without changes.
File renamed without changes.
@@ -134,6 +134,10 @@ def test_analyze_resource_detailed_descriptive_statistics_with_outliers():
     assert analysis["fieldStats"]["average_grades"]["outliers"] == [10000.0]


+@pytest.mark.skipif(
+    sys.version_info >= (3, 12),
+    reason="Fix for Python3.12+ (possible bug to investigate)",
+)
 def test_analyze_resource_detailed_descriptive_statistics_variables_correlation():
     resource = TableResource(path="data/analysis-data.csv")
     analysis = resource.analyze(detailed=True)
12 changes: 9 additions & 3 deletions frictionless/analyzer/analyzer.py
@@ -84,7 +84,9 @@ def analyze_table_resource(
     _statistics(rows_without_nan_values) # type: ignore
 )
 analysis_report["fieldStats"][field.name]["outliers"] = []
-analysis_report["fieldStats"][field.name]["missingValues"] = resource.stats.rows - len(rows_without_nan_values) # type: ignore
+analysis_report["fieldStats"][field.name]["missingValues"] = (
+    resource.stats.rows - len(rows_without_nan_values) # type: ignore
+)

 # calculate correlation between variables(columns/fields)
 for field_y in resource.schema.fields:
@@ -123,10 +125,14 @@ def analyze_table_resource(
     "outliers"
 ].append(cell)

-analysis_report["notNullRows"] = resource.stats.rows - analysis_report["rowsWithNullValues"] # type: ignore
+analysis_report["notNullRows"] = ( # type: ignore
+    resource.stats.rows - analysis_report["rowsWithNullValues"] # type: ignore
+)
 analysis_report["averageRecordSizeInBytes"] = 0
 if resource.stats.rows and resource.stats.bytes:
-    analysis_report["averageRecordSizeInBytes"] = resource.stats.bytes / resource.stats.rows # type: ignore
+    analysis_report["averageRecordSizeInBytes"] = (
+        resource.stats.bytes / resource.stats.rows
+    ) # type: ignore
 analysis_report["timeTaken"] = timer.time
 return {
     **analysis_report,
File renamed without changes.
File renamed without changes.
File renamed without changes.
16 changes: 1 addition & 15 deletions frictionless/checks/__init__.py
@@ -1,18 +1,4 @@
-from .baseline import baseline
+from .baseline import baseline as baseline
 from .cell import *
 from .row import *
 from .table import *
-
-__all__ = [
-    "ascii_value",
-    "baseline",
-    "deviated_cell",
-    "deviated_value",
-    "duplicate_row",
-    "forbidden_value",
-    "required_value",
-    "row_constraint",
-    "sequential_value",
-    "table_dimensions",
-    "truncated_value",
-]
4 changes: 3 additions & 1 deletion tests/conftest.py → frictionless/conftest.py
@@ -129,7 +129,9 @@ def vcr_cassette_dir(request):
 def populate_db(engine):
     with engine.begin() as conn:
         conn.execute(sa.text('CREATE TABLE "table" (id INTEGER PRIMARY KEY, name TEXT)'))
-        conn.execute(sa.text("INSERT INTO \"table\" VALUES (1, 'english'), (2, '中国人')"))
+        conn.execute(
+            sa.text("INSERT INTO \"table\" VALUES (1, 'english'), (2, '中国人')")
+        )
         conn.execute(
             sa.text(
                 "CREATE TABLE fruits (uid INTEGER PRIMARY KEY, fruit_name TEXT, calories INTEGER)"
File renamed without changes.
17 changes: 15 additions & 2 deletions frictionless/console/commands/__init__.py
@@ -1,3 +1,16 @@
 # Register modules
-from . import convert, describe, explore, extract, index, inspect, list, publish, query
-from . import script, summary, transform, validate
+from . import (
+    convert,
+    describe,
+    explore,
+    extract,
+    index,
+    inspect,
+    list,
+    publish,
+    query,
+    script,
+    summary,
+    transform,
+    validate,
+)
File renamed without changes.
File renamed without changes.
File renamed without changes.