Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WIP] DEV-1849: Add yaml linter, formatter and unix-script for replacing chars. #984

Open
wants to merge 12 commits into
base: develop
Choose a base branch
from
65 changes: 53 additions & 12 deletions .pre-commit-config.yaml
Original file line number Diff line number Diff line change
@@ -1,23 +1,64 @@
---
repos:
- repo: [email protected]:Yelp/detect-secrets
- repo: [email protected]:Yelp/detect-secrets
rev: v0.13.0
hooks:
- id: detect-secrets
args: ['--baseline', '.secrets.baseline']
- id: detect-secrets
args: [--baseline, .secrets.baseline]

- repo: https://github.com/pre-commit/pre-commit-hooks
- repo: https://github.com/pre-commit/pre-commit-hooks
rev: v2.3.0
hooks:
- id: check-json
- id: check-toml
- id: check-yaml
- id: check-json
- id: check-toml
- id: check-yaml
exclude: .gitlab-ci.yml
- id: end-of-file-fixer
- id: fix-encoding-pragma
- id: end-of-file-fixer
- id: fix-encoding-pragma
args: [--remove]
- id: no-commit-to-branch
- id: no-commit-to-branch
args: [--branch, develop, --branch, master, --pattern, release/.*]
- id: pretty-format-json
- id: pretty-format-json
args: [--autofix, --no-sort-keys]
- id: trailing-whitespace
- id: trailing-whitespace
args: [--markdown-linebreak-ext=md]

- repo: local
hooks:
- id: replacer
# More chars can be added
name: Replace unicode dashes with ascii dashes.
language: system
pass_filenames: false
entry: |
sh -c "
# echo -n '–' | hexdump -C # > e2 80 93
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is a great comment 👍

LC_ALL=C
for f in $(ls src/gdcdictionary/schemas/*.yaml); do
# echo 'Converting ' $f
sed -i '' 's/\xe2\x80\x93/-/g' $f
done
"
Comment on lines +26 to +41
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. This isn't going to work for everyone, such as Windows users.
  2. If we are going to enforce ascii, we could use require-ascii. It's catch much more than just the hyphen. We'll need to exclude these files, as described in this comment.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. This isn't going to work for everyone, such as Windows users.

Good point. Should it be a deal breaker?

  1. If we are going to enforce ascii, we could use require-ascii. It's catch much more than just the hyphen. We'll need to exclude these files, as described in this comment.

The catch is that unicode chars are allowed. Depending on the version of the yaml file (which we do NOT explicitly define) and the version of the parser (somehow loose atm), one should either quote them or do whatever applies.

When I first saw the failure and the traceback, I was inclined to remove the whole "ascii" enforcement and instead specify yaml versions and straighten up all the versions (yaml files and parser).

I think we should pursue that end. It might take some bit effort but as long as we can diff the current json schema with the new one and make sure the match, we should be ok. IMHO.


- repo: https://github.com/jumanjihouse/pre-commit-hook-yamlfmt
rev: 0.2.1
hooks:
- id: yamlfmt
args: [--mapping, '2', --sequence, '4', --offset, '2', --width, '130']
# TODO: Remove stages so it always run. Do it once all yamls are properly formatted.
stages: [manual]

- repo: https://github.com/adrienverge/yamllint.git
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When adding libraries, check the license. This code is GPL'd. We cannot use it.

We need more permissible licenses, such as MIT or Apache 2.0.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Even when we neither distribute the tool, nor modify it, nor include it in our distribution?

(wouldn't using this tool be equivalent to using any other gnu-tool in our build system?)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that's a fair question and, to my knowledge, a grey area. as an interpreted language, python (mostly) does not have the concept of "linking". so, there's little distinction between a library and source code. are you aware of any legal precedents that have clarified this? if not, we should err on the side of caution and not use GPL'd python code.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point.

@stilesj-uchicago and I had the same understanding => "we're good in this case".

We're using yamlfmt (mit) in gdcapi but not yamllint (gplv3).

Maybe I can remove yamllint until we find an alternative?

rev: v1.31.0
hooks:
- id: yamllint
name: Lint YAML files
args: [--strict]
files: ^src/.*\.(yaml|yml)$
exclude: |
(?x)^(
sandbox/.*\.(yaml|yml)|
\.yamllint|
\.pre-commit-config\.yaml|
docker-compose-ci\.yaml
)$
11 changes: 11 additions & 0 deletions .yamllint
Original file line number Diff line number Diff line change
@@ -0,0 +1,11 @@
---
# SEE https://yamllint.readthedocs.io/en/stable/configuration.html#default-configuration

extends: default

rules:
document-start: disable
line-length:
max: 130
indentation:
indent-sequences: consistent
7 changes: 4 additions & 3 deletions docker-compose-ci.yaml
Original file line number Diff line number Diff line change
@@ -1,4 +1,5 @@
version: "3.3"
---
version: '3.3'
services:
app:
image: quay.io/ncigdc/jenkins-agent:multipython
Expand All @@ -7,6 +8,6 @@ services:
HTTPS_PROXY: http://cloud-proxy:3128
HTTP_PROXY: http://cloud-proxy:3128
volumes:
- .:/home/jenkins
- $SSH_AUTH_SOCK:$SSH_AUTH_SOCK
- .:/home/jenkins
- $SSH_AUTH_SOCK:$SSH_AUTH_SOCK
command: bash -c "tox --recreate"
2 changes: 1 addition & 1 deletion tests/conftest.py
Original file line number Diff line number Diff line change
Expand Up @@ -4,7 +4,7 @@
import os

import pytest
from src.gdcdictionary import ROOT_DIR, GDCDictionary
from gdcdictionary import ROOT_DIR, GDCDictionary
from tests.utils import load_yaml


Expand Down