Commit

doms_databasen -> domsdatabasen
oliverkinch committed Jul 5, 2024
1 parent 13e4df0 commit 25844b5
Showing 24 changed files with 60 additions and 98 deletions.
2 changes: 1 addition & 1 deletion .github/workflows/docs.yaml
@@ -29,7 +29,7 @@ jobs:
pip install -r requirements.txt
- name: Build documentation
run: pip install pdoc==7.3.0 && pdoc --docformat google src/doms_databasen -o docs
run: pip install pdoc==7.3.0 && pdoc --docformat google src/domsdatabasen -o docs

- name: Compress documentation
run: tar --directory docs/ -hcf artifact.tar .
4 changes: 2 additions & 2 deletions .gitignore
@@ -94,8 +94,8 @@ outputs/
multirun/

# Documentation
docs/doms_databasen/
docs/doms_databasen.html
docs/domsdatabasen/
docs/domsdatabasen.html
docs/index.html
docs/search.js

2 changes: 1 addition & 1 deletion .pre-commit-config.yaml
@@ -7,7 +7,7 @@ repos:
rev: v0.0.290
hooks:
- id: ruff
exclude: src/doms_databasen/constants.py
exclude: src/domsdatabasen/constants.py
args: [--fix, --exit-non-zero-on-fix]
types_or: [python, pyi, jupyter]
- repo: https://github.com/kynan/nbstripout
8 changes: 4 additions & 4 deletions CONTRIBUTING.md
@@ -1,4 +1,4 @@
# Welcome to doms_databasen contributing guide
# Welcome to domsdatabasen contributing guide

Thank you for investing your time in contributing to our project! :sparkles:.

@@ -29,11 +29,11 @@ resources to help you get started with open source contributions:
If you spot a problem with the package, [search if an issue already
exists](https://docs.github.com/en/github/searching-for-information-on-github/searching-on-github/searching-issues-and-pull-requests#search-by-the-title-body-or-comments).
If a related issue doesn't exist, you can open a new issue using a relevant [issue
form](https://github.com/alexandrainst/doms_databasen/issues).
form](https://github.com/alexandrainst/domsdatabasen/issues).

#### Solve an issue

Scan through our [existing issues](https://github.com/alexandrainst/doms_databasen/issues)
Scan through our [existing issues](https://github.com/alexandrainst/domsdatabasen/issues)
to find one that interests you. You can narrow down the search using `labels` as
filters. See [Labels](/contributing/how-to-use-labels.md) for more information. If you
find an issue to work on, you are welcome to open a PR with a fix.
@@ -87,4 +87,4 @@ questions or request for additional information.

### Your PR is merged!

Congratulations :tada::tada: The doms_databasen team thanks you :sparkles:.
Congratulations :tada::tada: The domsdatabasen team thanks you :sparkles:.
14 changes: 7 additions & 7 deletions Makefile
@@ -35,7 +35,7 @@ help:
@grep -E '^[0-9a-zA-Z_-]+:.*?## .*$$' makefile | sort | awk 'BEGIN {FS = ":.*?## "}; {printf "\033[36m%-30s\033[0m %s\n", $$1, $$2}'

install: ## Install dependencies
@echo "Installing the 'doms_databasen' project..."
@echo "Installing the 'domsdatabasen' project..."
@$(MAKE) --quiet install-brew
@$(MAKE) --quiet install-gpg
@$(MAKE) --quiet generate-gpg-key
@@ -45,7 +45,7 @@ install: ## Install dependencies
@$(MAKE) --quiet setup-environment-variables
@$(MAKE) --quiet setup-git
@$(MAKE) --quiet add-repo-to-git
@echo "Installed the 'doms_databasen' project. You can now activate your virtual environment with 'source .venv/bin/activate'."
@echo "Installed the 'domsdatabasen' project. You can now activate your virtual environment with 'source .venv/bin/activate'."
@echo "Note that this is a Poetry project. Use 'poetry add <package>' to install new dependencies and 'poetry remove <package>' to remove them."

install-brew:
@@ -130,11 +130,11 @@ add-repo-to-git:
git commit --quiet -m "Initial commit"; \
fi
@if [ "$(shell git remote)" = "" ]; then \
git remote add origin git@github.com:alexandrainst/doms_databasen.git; \
git remote add origin git@github.com:alexandrainst/domsdatabasen.git; \
fi

docs: ## Generate documentation
@poetry run pdoc --docformat google src/doms_databasen -o docs
@poetry run pdoc --docformat google src/domsdatabasen -o docs
@echo "Saved documentation."

view-docs: ## View documentation
@@ -146,15 +146,15 @@ view-docs: ## View documentation
(*CYGWIN*) openCmd='cygstart'; ;; \
(*) echo 'Error: Unsupported platform: $${uname}'; exit 2; ;; \
esac; \
"$${openCmd}" docs/doms_databasen.html
"$${openCmd}" docs/domsdatabasen.html

test:
@poetry run pytest tests/scraper ; \
poetry run pytest tests/processor ; \

docker: ## Build Docker image and run container
@docker build -t doms_databasen .
@docker run -it --rm doms_databasen
@docker build -t domsdatabasen .
@docker run -it --rm domsdatabasen

tree: ## Print directory tree
@tree -a --gitignore -I .git .
22 changes: 11 additions & 11 deletions README.md
@@ -1,4 +1,4 @@
<a href="https://github.com/alexandrainst/doms_databasen"><img src="gfx/alexandra_logo.png" width="239" height="175" align="right" /></a>
<a href="https://github.com/alexandrainst/domsdatabasen"><img src="gfx/alexandra_logo.png" width="239" height="175" align="right" /></a>
# Domsdatabasen

Scraping og processering af [domsdatabasen](https://domsdatabasen.dk/#).
@@ -17,11 +17,11 @@ Se `src/scripts/process.py`.
Se `src/scripts/finalize.py`.

______________________________________________________________________
[![Documentation](https://img.shields.io/badge/docs-passing-green)](https://alexandrainst.github.io/doms_databasen/doms_databasen.html)
[![License](https://img.shields.io/github/license/oliverkinch/doms_databasen)](https://github.com/alexandrainst/doms_databasen/blob/master/LICENSE)
[![LastCommit](https://img.shields.io/github/last-commit/oliverkinch/doms_databasen)](https://github.com/alexandrainst/doms_databasen/commits/master)
[![Code Coverage](https://img.shields.io/badge/Coverage-100%25-brightgreen.svg)](https://github.com/alexandrainst/doms_databasen/tree/master/tests)
[![Contributor Covenant](https://img.shields.io/badge/Contributor%20Covenant-2.0-4baaaa.svg)](https://github.com/alexandrainst/doms_databasen/blob/master/CODE_OF_CONDUCT.md)
[![Documentation](https://img.shields.io/badge/docs-passing-green)](https://alexandrainst.github.io/domsdatabasen/domsdatabasen.html)
[![License](https://img.shields.io/github/license/oliverkinch/domsdatabasen)](https://github.com/alexandrainst/domsdatabasen/blob/master/LICENSE)
[![LastCommit](https://img.shields.io/github/last-commit/oliverkinch/domsdatabasen)](https://github.com/alexandrainst/domsdatabasen/commits/master)
[![Code Coverage](https://img.shields.io/badge/Coverage-100%25-brightgreen.svg)](https://github.com/alexandrainst/domsdatabasen/tree/master/tests)
[![Contributor Covenant](https://img.shields.io/badge/Contributor%20Covenant-2.0-4baaaa.svg)](https://github.com/alexandrainst/domsdatabasen/blob/master/CODE_OF_CONDUCT.md)


Developers:
@@ -38,11 +38,11 @@ Developers:


## A Word on Modules and Scripts
In the `src` directory there are two subdirectories, `doms_databasen`
In the `src` directory there are two subdirectories, `domsdatabasen`
and `scripts`. This is a brief explanation of the differences between the two.

### Modules
All Python files in the `doms_databasen` directory are _modules_
All Python files in the `domsdatabasen` directory are _modules_
internal to the project package. Examples here could be a general data loading script,
a definition of a model, or a training function. Think of modules as all the building
blocks of a project.
@@ -65,7 +65,7 @@ When importing module functions/classes when you're in a script, you do it like
would normally import from any other package:

```
from doms_databasen import some_function
from domsdatabasen import some_function
```

Note that this is also how we import functions/classes in tests, since each test Python
@@ -90,7 +90,7 @@ for the repository (can be enabled on Github in the repository settings).
Code Spaces is a new feature on Github, that allows you to develop on a project
completely in the cloud, without having to do any local setup at all. This repo comes
included with a configuration file for running code spaces on Github. When hosted on
`alexandrainst/doms_databasen` then simply press the `<> Code` button
`alexandrainst/domsdatabasen` then simply press the `<> Code` button
and add a code space to get started, which will open a VSCode window directly in your
browser.

@@ -130,7 +130,7 @@ browser.
│   ├── scripts
│   │   ├── fix_dot_env_file.py
│   │   └── your_script.py
│   └── doms_databasen
│   └── domsdatabasen
│   ├── __init__.py
│   └── your_module.py
└── tests
79 changes: 20 additions & 59 deletions notebooks/dataset_card.ipynb

Large diffs are not rendered by default.

6 changes: 3 additions & 3 deletions pyproject.toml
@@ -1,5 +1,5 @@
[tool.poetry]
name = "doms_databasen"
name = "domsdatabasen"
description = "Scraper and PDF text processor for domsdatabasen.dk"
version = "0.2.0"
authors = [
@@ -66,7 +66,7 @@ extend-select = [
"D",
]
exclude = [
"src/doms_databasen/_xpaths.py",
"src/domsdatabasen/_xpaths.py",
]

[tool.ruff.pydocstyle]
@@ -100,5 +100,5 @@ filterwarnings = [
log_cli_level = "info"
testpaths = [
"tests",
"src/doms_databasen",
"src/domsdatabasen",
]
Empty file removed src/doms_databasen/__init__.py
Empty file.
1 change: 1 addition & 0 deletions src/domsdatabasen/__init__.py
@@ -0,0 +1 @@
"""__init__.py file for the domsdatabasen package."""
File renamed without changes.
@@ -1,4 +1,4 @@
"""Exceptions for the doms_databasen package."""
"""Exceptions for the domsdatabasen package."""


class PDFDownloadException(Exception):
File renamed without changes.
@@ -1,4 +1,4 @@
"""Utility function for the doms_databasen package."""
"""Utility function for the domsdatabasen package."""

import json
from typing import List
File renamed without changes.
@@ -8,7 +8,7 @@

from omegaconf import DictConfig

from doms_databasen._utils import append_jsonl, init_jsonl, read_json
from domsdatabasen._utils import append_jsonl, init_jsonl, read_json

logger = getLogger(__name__)

@@ -173,7 +173,7 @@ def _raw_data_exists(self, case_dir) -> bool:
and contains two files: the PDF document and the tabular data.
Same code as the method `_already_scraped` from class `Scraper`
(src/doms_databasen/scraper.py).
(src/domsdatabasen/scraper.py).
Args:
case_dir (Path):
File renamed without changes.
2 changes: 1 addition & 1 deletion src/scripts/finalize.py
@@ -11,7 +11,7 @@


import hydra
from doms_databasen.dataset_builder import DatasetBuilder
from domsdatabasen.dataset_builder import DatasetBuilder
from omegaconf import DictConfig


2 changes: 1 addition & 1 deletion src/scripts/process.py
@@ -17,7 +17,7 @@
import logging

import hydra
from doms_databasen.processor import Processor
from domsdatabasen.processor import Processor
from omegaconf import DictConfig

logger = logging.getLogger(__name__)
2 changes: 1 addition & 1 deletion src/scripts/scrape.py
@@ -17,7 +17,7 @@
import logging

import hydra
from doms_databasen.scraper import Scraper
from domsdatabasen.scraper import Scraper
from omegaconf import DictConfig

logger = logging.getLogger(__name__)
2 changes: 1 addition & 1 deletion tests/processor/test_processor.py
@@ -2,7 +2,7 @@


import pytest
from doms_databasen.processor import Processor
from domsdatabasen.processor import Processor


@pytest.fixture(scope="module")
2 changes: 1 addition & 1 deletion tests/processor/test_text_extraction.py
@@ -3,7 +3,7 @@
import cv2
import numpy as np
import pytest
from doms_databasen._text_extraction import PDFTextReader
from domsdatabasen._text_extraction import PDFTextReader
from PIL import Image


2 changes: 1 addition & 1 deletion tests/scraper/conftest.py
@@ -3,7 +3,7 @@
import shutil

import pytest
from doms_databasen.scraper import Scraper
from domsdatabasen.scraper import Scraper
from hydra import compose, initialize

# Initialise Hydra
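
Review note: because this commit renames the package in 24 files (imports, docs paths, CI config, badges), a quick scan for leftover mentions of the old name may be worth running afterwards. The following is a minimal sketch, not part of the commit; the helper `find_stale_references` and the `SKIP_DIRS` set are hypothetical names introduced here for illustration. It only checks file contents, so a directory that still carries the old name would have to be checked separately.

```python
from pathlib import Path
from typing import Iterator, Tuple

OLD_NAME = "doms_databasen"  # the package name this commit replaces
# Assumption: directories that are not worth scanning in a typical checkout.
SKIP_DIRS = {".git", ".venv", "__pycache__"}


def find_stale_references(
    root: Path, old_name: str = OLD_NAME
) -> Iterator[Tuple[Path, int, str]]:
    """Yield (path, line number, stripped line) for each line mentioning old_name."""
    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue
        # Skip anything that lives under one of the ignored directories.
        if SKIP_DIRS.intersection(path.relative_to(root).parts):
            continue
        try:
            text = path.read_text(encoding="utf-8")
        except (UnicodeDecodeError, OSError):
            continue  # binary or unreadable file, nothing to report
        for lineno, line in enumerate(text.splitlines(), start=1):
            if old_name in line:
                yield path, lineno, line.strip()
```

Calling `list(find_stale_references(Path(".")))` from the repository root and getting an empty list would suggest that no file content still refers to the old name.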
