Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Changes to support CDM data import #1

Merged
merged 20 commits into from
Aug 14, 2024
Merged

Changes to support CDM data import #1

merged 20 commits into from
Aug 14, 2024

Conversation

jeff-cohere
Copy link
Collaborator

@jeff-cohere jeff-cohere commented Jun 14, 2024

This PR pulls in changes made to support our ad-hoc ("heroic?") import of data into CDM.

This is a pretty new repo, so I don't have anything CI-ish in place yet. @ialarmedalien , we were talking about getting dtspy "up to spec" with some GitHub Actions, style checks, and what-not. Maybe we can use this PR to get things in shape. Please feel free to invite whatever other reviewer(s) you think would be good representatives of the KBase Python Way.

Right now, we just use a basic pip -r requirements.txt install process. I'd prefer not to wander into the bog of what passes for Python package management tools, since they're all horrible. :-)

Update

I implemented some unit tests, though only for search and metadata features (for now).

@jeff-cohere jeff-cohere added enhancement New feature or request help wanted Extra attention is needed testing labels Jun 14, 2024
- name: Running tests (${{ matrix.os }})
run: coverage run -m unittest discover
env:
DTS_KBASE_DEV_TOKEN: ${{ secrets.DTS_KBASE_DEV_TOKEN }}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there should already be some kbase dev tokens in the environment if you check in the settings for this repo.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried KBASE_CI_TOKEN, but it doesn't work. The GitHub documentation is entering the Enterprise Software Heat Death stage of its lifecycle, so it's hard to figure out the difference between org secrets and repo secrets, or even whether anyone at GitHub knows what the difference is supposed to be. So I'm reverting my change here for the time being. I'll revisit when we have this other stuff sorted.

dts/client.py Outdated Show resolved Hide resolved
dts/client.py Outdated Show resolved Hide resolved
dts/client.py Outdated Show resolved Hide resolved
dts/client.py Outdated Show resolved Hide resolved
dts/client.py Outdated Show resolved Hide resolved
dts/client.py Outdated Show resolved Hide resolved
"""
`client.deleteTransfer(id) -> None
`client.cancel_transfer(id) -> None
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's probably worth using one of the recognised python documentation formats -- depending on what IDE you use, you may be able to automatically generate documentation skeletons for functions and if you add a linter, you won't get a bunch of errors because the docs don't follow pydocstyle.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I've been hesitant to strap myself all the way into Python dev mode, but it's probably a good time to address some of this stuff. I don't use an IDE, though, because I find they make it easier to produce mountains of rubbish with almost no effort.

"one of the recognised python documentation formats" gets at the key issue -- there's 10M ways to do everything in Python. In this it takes after my other "favorite" language, C++, in which everyone strives to prove how clever they are by trotting out esoterically cute ways of doing trivial things that don't pertain to any other language.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Issue: #7

Copy link
Collaborator

@ialarmedalien ialarmedalien Aug 13, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW I use the sphinx documentation style, and ruff for python code linting/formatting. I use VSCode for working with python -- there's an extension for generating python documentation for functions and a nice ruff integration, so you can handle all that stuff whilst you're working on the code. I'm sure that there are also emacs / vi / etc. integrations for ruff, too.

import os
import unittest

class TestClient(unittest.TestCase):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would switch to using pytest for tests -- it is a lot more flexible that unittest as you can easily parametrise your tests (run the same test on a bunch of different inputs) and the test output reporting is much more helpful.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, yes, one of the 10M testing packages. Thanks for the recommendation!

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I made an issue for this.

dts/client.py Outdated Show resolved Hide resolved
dts/client.py Outdated Show resolved Hide resolved
dts/client.py Outdated Show resolved Hide resolved
dts/client.py Outdated Show resolved Hide resolved
dts/client.py Outdated Show resolved Hide resolved
dts/client.py Outdated Show resolved Hide resolved
dts/client.py Outdated Show resolved Hide resolved
Comment on lines +18 to +20
self.assertTrue(client.uri)
self.assertTrue(client.name)
self.assertTrue(client.version)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

these aren't really worthwhile tests -- maybe they can be improved when you move to pytest, but I would expect the tests to check the values of these attributes, not just whether they are truthy or falsy.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I can definitely address these when we move to pytest.

Copy link
Collaborator

@ialarmedalien ialarmedalien Aug 14, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pytest uses assert so these self.assertTrue and self.assertFalse statements would turn into something like:

assert client.uri == "some_client_uri"
assert client.name
assert isinstance(client.version, str)
assert len(client.name) > 15

@jeff-cohere jeff-cohere merged commit db00daa into main Aug 14, 2024
2 checks passed
@jeff-cohere jeff-cohere deleted the db-specific-search branch August 14, 2024 17:36
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request help wanted Extra attention is needed testing
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants