Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Deprecated get_s3fs_session #768

Merged
merged 7 commits into from
Jul 27, 2024
Merged

Deprecated get_s3fs_session #768

merged 7 commits into from
Jul 27, 2024

Conversation

Sherwin-14
Copy link
Contributor

@Sherwin-14 Sherwin-14 commented Jul 19, 2024

I have made the changes. Do we update all the instances of get_s3fs_session now or is that something to be decided later?


📚 Documentation preview 📚: https://earthaccess--768.org.readthedocs.build/en/768/

@Sherwin-14 Sherwin-14 changed the title Renamed get_s3fs_session with get_s3_filesystem" Renamed get_s3fs_session with get_s3_filesystem Jul 19, 2024
Copy link
Collaborator

@chuckwondo chuckwondo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Sherwin-14, please read through the conversation again: #766

tl;dr: (1) you must deprecate get_s3fs_session, (2) the new name should be get_s3_filesystem

@Sherwin-14
Copy link
Contributor Author

Sherwin-14 commented Jul 19, 2024

@Sherwin-14, please read through the conversation again: #766

tl;dr: (1) you must deprecate get_s3fs_session, (2) the new name should be get_s3_filesystem

Oops I missed that! There are two methods of deprecation(if I am not wrong) which one should we choose here? Should we make a change in docstrings or should we add a deprecated decorator

@chuckwondo
Copy link
Collaborator

chuckwondo commented Jul 19, 2024

For full details on use of the @deprecated decorator, see https://peps.python.org/pep-0702/

In summary, here's what you need to do (with an example afterward):

  1. In the imports section at the top of the file, add from typing_extensions import deprecated (or insert deprecated into an existing from typing_extensions import statement in the file)
  2. copy each existing function signature and paste it above the existing one
  3. rename the existing function
  4. from the newly copied function (with the "old" name), call the renamed function
  5. decorate the newly copied function with @deprecated("Use get_s3_filesystem instead")

Example:

In store.py we have the Store class with the following method (I've left out the implementation for brevity):

    def get_s3fs_session(
        self,
        daac: Optional[str] = None,
        concept_id: Optional[str] = None,
        provider: Optional[str] = None,
        endpoint: Optional[str] = None,
    ) -> s3fs.S3FileSystem:
        # Existing implementation

Here's how we want to change things:

    @deprecated("Use get_s3_filesystem instead")
    def get_s3fs_session(
        self,
        daac: Optional[str] = None,
        concept_id: Optional[str] = None,
        provider: Optional[str] = None,
        endpoint: Optional[str] = None,
    ) -> s3fs.S3FileSystem:
        # COPY OVER THE DOCSTRING HERE
        return self.get_s3_filesystem(daac, concept_id, provider, endpoint)

    def get_s3_filesystem(
        self,
        daac: Optional[str] = None,
        concept_id: Optional[str] = None,
        provider: Optional[str] = None,
        endpoint: Optional[str] = None,
    ) -> s3fs.S3FileSystem:
        # Existing implementation

You also need to add this import to store.py:

from typing_extensions import deprecated

@mfisher87
Copy link
Collaborator

mfisher87 commented Jul 19, 2024

Neat, TIL! But I don't think @deprecated drops until 3.13?

EDIT: Nevermind :) Should have read this part:

Creating a new decorator poses no backwards compatibility concerns. As with all new typing functionality, the @deprecated decorator will be added to the typing_extensions module, enabling its use in older versions of Python.

@Sherwin-14 Sherwin-14 force-pushed the renaming branch 2 times, most recently from ddcfb58 to 1a57e19 Compare July 20, 2024 05:41
@Sherwin-14 Sherwin-14 changed the title Renamed get_s3fs_session with get_s3_filesystem Deprecated get_s3fs_session Jul 20, 2024
@Sherwin-14 Sherwin-14 requested a review from chuckwondo July 20, 2024 15:07
Copy link
Collaborator

@chuckwondo chuckwondo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Sherwin-14, there's a bit more to do than only deprecating and renaming the Store.get_s3fs_session method.

Please do a search throughout the entire codebase for occurrences of get_s3fs_session and determine the appropriate changes as follows:

  • Change all references to the Store.get_s3fs_session method to get_s3_filesystem so that within the codebase we're not using the newly deprecated method, but are instead using the newly introduced method that replaces it.
  • There is at least one other place where you must do the same thing you did in store.py. For example, in api.py, there is a top-level get_s3fs_session function. You must do the same thing with that function: deprecate it, copy it to a new function named get_s3_filesystem, and modify all of its references to earthaccess.__store__.get_s3fs_session to earthaccess.__store__.get_s3_filesystem.
  • Update references in kerchunk.py, test_auth.py, and test_store.py (and see if there are any other places that I may have missed listing here)

Please also add an appropriate entry to CHANGELOG.md, but before doing so, please make sure you have merged the latest changes from the main branch, as Luis just released v0.10.0, so he has updated CHANGELOG.md, so you want to pull in those changes before adding your own entry.

Once you have merged in the change made by Luis, you should add an entry above the new v0.10.0 heading that reads like so:

### Changed

- Deprecate `earthaccess.get_s3fs_session` and `Store.get_s3fs_session`. Use
  `earthaccess.get_s3_filesystem` and `Store.get_s3_filesystem`, respectively,
  instead ([#766](https://github.com/nsidc/earthaccess/issues/766))

In general, before opening a PR, you should locally run unit tests to make sure they pass. Further, after pushing up your work to your fork, but before opening a PR, you should check the success of the automated build. See https://github.com/Sherwin-14/earthaccess/actions/workflows/test.yml. If the build fails (because unit tests fail, for example), you should attempt to fix the failure.

Once you get things to work, or if you cannot determine how to fix things, you can then open a PR. In the case of opening the PR even when you cannot determine how to fix the broken build, you can ask for help to get things working.

earthaccess/store.py Outdated Show resolved Hide resolved
earthaccess/store.py Show resolved Hide resolved
@Sherwin-14
Copy link
Contributor Author

@chuckwondo Oh I guess there is some confusion here. Maybe I wasn't able to communicate what I felt after making a PR, all I was asking was that should we replace all the occurrences of get_s3fs_session at this point in time or maybe it something that we reserve for later. I am quite new to all of this, plus I didn't know that immediately after deprecation we have to replace all the instances ( maybe for some weird reason I thought this is something to be done later). Also I will try to follow the best practices from now on.

@chuckwondo
Copy link
Collaborator

@chuckwondo Oh I guess there is some confusion here. Maybe I wasn't able to communicate what I felt after making a PR, all I was asking was that should we replace all the occurrences of get_s3fs_session at this point in time or maybe it something that we reserve for later. I am quite new to all of this, plus I didn't know that immediately after deprecation we have to replace all the instances ( maybe for some weird reason I thought this is something to be done later). Also I will try to follow the best practices from now on.

No worries. I'm happy to help you through the process. I'm also happy for us to collaborate on this during the next earthaccess hackday, which is tomorrow, if you're able to attend. If you're not able to attend tomorrow, I'm happy to schedule a pairing session with you some other time this week.

@chuckwondo
Copy link
Collaborator

@Sherwin-14, we'll also want to add a unit test to make sure the deprecation warning is actually emitted. For reference: https://docs.python.org/3/library/warnings.html#testing-warnings

@Sherwin-14
Copy link
Contributor Author

Did all the changes in the respective files. I am also working on coming up with the unit test for emitting deprecation warning as well, would probably amend in the same commit.

@Sherwin-14
Copy link
Contributor Author

Sherwin-14 commented Jul 23, 2024

This is something I came up with for the unit test.

import warnings
import earthaccess 
from earthaccess import api 
from earthaccess.store import Store

def test_deprecation_warning():
    store = earthaccess.store.Store()
    with warnings.catch_warnings(record=True) as w:
        # Cause all warnings to always be triggered.
        warnings.simplefilter("always")
        # Trigger a warning.
        api.get_s3fs_session()
        store.get_s3fs_session()
        # Verify some things
        assert len(w) == 2
        assert issubclass(w[0].category, DeprecationWarning)
        assert "Use get_s3_filesystem instead" in str(w[0].message)
        assert issubclass(w[1].category, DeprecationWarning)
        assert "Use get_s3_filesystem instead" in str(w[1].message)

I ran into an error while running this test on my local machine which goes something like this FAILED tests/unit/test_deprecations.py::test_deprecation_warning - TypeError: Store.__init__() missing 1 required positional argument: 'auth'. I guess we need some access token here to be passed into the Store Class. What can be the next steps from here?

Copy link
Collaborator

@chuckwondo chuckwondo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking good! Just a couple of suggestions for adjusting the docstrings for the new functions. I'll reply separately to your question about the unit test you're working on.

earthaccess/api.py Outdated Show resolved Hide resolved
earthaccess/store.py Outdated Show resolved Hide resolved
@chuckwondo
Copy link
Collaborator

@Sherwin-14, nice work so far! The unit test all pass now.

@mfisher87, any idea why the integration tests are failing? I see that authentication is failing:

INFO     test_api:test_api.py:17 Current username: 
INFO     test_api:test_api.py:18 earthaccess version: 0.10.0
INFO     earthaccess.auth:auth.py:312 Authentication with Earthdata Login failed with:
{"error":"invalid_header","error_description":"Invalid header"}

The workflow pulls secrets from the github environment, but that first line of output above shows that the username is an empty string. I don't have access to the environments in the repo, and I don't know whose EDL credentials are being used in the secrets. Do you know?

@chuckwondo
Copy link
Collaborator

This is something I came up with for the unit test.

import warnings
import earthaccess 
from earthaccess import api 
from earthaccess.store import Store

def test_deprecation_warning():
    store = earthaccess.store.Store()
    with warnings.catch_warnings(record=True) as w:
        # Cause all warnings to always be triggered.
        warnings.simplefilter("always")
        # Trigger a warning.
        api.get_s3fs_session()
        store.get_s3fs_session()
        # Verify some things
        assert len(w) == 2
        assert issubclass(w[0].category, DeprecationWarning)
        assert "Use get_s3_filesystem instead" in str(w[0].message)
        assert issubclass(w[1].category, DeprecationWarning)
        assert "Use get_s3_filesystem instead" in str(w[1].message)

I ran into an error while running this test on my local machine which goes something like this FAILED tests/unit/test_deprecations.py::test_deprecation_warning - TypeError: Store.__init__() missing 1 required positional argument: 'auth'. I guess we need some access token here to be passed into the Store Class. What can be the next steps from here?

@Sherwin-14, nice job on coming up with the initial unit test for this. Will you be at today's hackday? If so, we can continue with this then.

@Sherwin-14
Copy link
Contributor Author

Sherwin-14 commented Jul 23, 2024

This is something I came up with for the unit test.

import warnings
import earthaccess 
from earthaccess import api 
from earthaccess.store import Store

def test_deprecation_warning():
    store = earthaccess.store.Store()
    with warnings.catch_warnings(record=True) as w:
        # Cause all warnings to always be triggered.
        warnings.simplefilter("always")
        # Trigger a warning.
        api.get_s3fs_session()
        store.get_s3fs_session()
        # Verify some things
        assert len(w) == 2
        assert issubclass(w[0].category, DeprecationWarning)
        assert "Use get_s3_filesystem instead" in str(w[0].message)
        assert issubclass(w[1].category, DeprecationWarning)
        assert "Use get_s3_filesystem instead" in str(w[1].message)

I ran into an error while running this test on my local machine which goes something like this FAILED tests/unit/test_deprecations.py::test_deprecation_warning - TypeError: Store.__init__() missing 1 required positional argument: 'auth'. I guess we need some access token here to be passed into the Store Class. What can be the next steps from here?

@Sherwin-14, nice job on coming up with the initial unit test for this. Will you be at today's hackday? If so, we can continue with this then.

Yes I would be there :)

CHANGELOG.md Outdated Show resolved Hide resolved
@chuckwondo
Copy link
Collaborator

This is something I came up with for the unit test.

import warnings
import earthaccess 
from earthaccess import api 
from earthaccess.store import Store

def test_deprecation_warning():
    store = earthaccess.store.Store()
    with warnings.catch_warnings(record=True) as w:
        # Cause all warnings to always be triggered.
        warnings.simplefilter("always")
        # Trigger a warning.
        api.get_s3fs_session()
        store.get_s3fs_session()
        # Verify some things
        assert len(w) == 2
        assert issubclass(w[0].category, DeprecationWarning)
        assert "Use get_s3_filesystem instead" in str(w[0].message)
        assert issubclass(w[1].category, DeprecationWarning)
        assert "Use get_s3_filesystem instead" in str(w[1].message)

I ran into an error while running this test on my local machine which goes something like this FAILED tests/unit/test_deprecations.py::test_deprecation_warning - TypeError: Store.__init__() missing 1 required positional argument: 'auth'. I guess we need some access token here to be passed into the Store Class. What can be the next steps from here?

@Sherwin-14, thanks for joining today's hackday! Here's a summary of what we covered today:

  1. You will attempt to resolve the rebasing issue you encountered. If this proves tricky, please reach out for help.
  2. You will either parametrize the test function above, or simply split it into 2 test functions (since there are very few lines of code, if parametrization proves tricky, splitting into 2 test functions is perfectly acceptable)
  3. You will resolve the TypeError shown above. After doing a bit more digging, I realized that we were looking at the wrong test_auth.py file. There are 2 of them: one in the unit tests and one in the integration tests. We were looking at the one in integration tests. To resolve the TypeError, do the following:

Within the same file where you put your new test code shown above, also add this:

@pytest.fixture(scope="session")
@responses.activate
@mock.patch("getpass.getpass")
@mock.patch("builtins.input")
def auth(user_input, user_password):
    user_input.return_value = "user"
    user_password.return_value = "password"
    json_response = [
        {"access_token": "EDL-token-1", "expiration_date": "12/15/2021"},
        {"access_token": "EDL-token-2", "expiration_date": "12/16/2021"},
    ]
    responses.add(
        responses.GET,
        "https://urs.earthdata.nasa.gov/api/users/tokens",
        json=json_response,
        status=200,
    )
    responses.add(
        responses.GET,
        "https://urs.earthdata.nasa.gov/profile",
        json={"email_address": "[email protected]"},
        status=200,
    )

    earthdata.login(strategy="interactive")

    return earthdata.__auth__

Then you should be able to modify your test like so:

def test_deprecation_warning(auth):
    store = earthaccess.store.Store(auth)
    ...

I think that should work, or at least get you much closer.

earthaccess/api.py Outdated Show resolved Hide resolved
@chuckwondo
Copy link
Collaborator

@mfisher87, can you tell if the failing integration tests are related to the recent integration test troubles you've been working on recently?

@Sherwin-14
Copy link
Contributor Author

Hey just a small doubt. What is earthdata in the above code for the authentication. I am running into an attribute error and I am not sure what is earthdata in this case?

@chuckwondo
Copy link
Collaborator

Hey just a small doubt. What is earthdata in the above code for the authentication. I am running into an attribute error and I am not sure what is earthdata in this case?

It's a typo. Apologies. It should be earthaccess.

@Sherwin-14
Copy link
Contributor Author

Hey just a small doubt. What is earthdata in the above code for the authentication. I am running into an attribute error and I am not sure what is earthdata in this case?

It's a typo. Apologies. It should be earthaccess.

Thanks this helps! But then should'nt it be earthacess.api since we are using the login function from the api.py file?

@chuckwondo
Copy link
Collaborator

Hey just a small doubt. What is earthdata in the above code for the authentication. I am running into an attribute error and I am not sure what is earthdata in this case?

It's a typo. Apologies. It should be earthaccess.

Thanks this helps! But then should'nt it be earthacess.api since we are using the login function from the api.py file?

Not necessary, as the functions in api.py are exported at the top level.

@Sherwin-14
Copy link
Contributor Author

Sherwin-14 commented Jul 24, 2024

Hey just a small doubt. What is earthdata in the above code for the authentication. I am running into an attribute error and I am not sure what is earthdata in this case?

It's a typo. Apologies. It should be earthaccess.

Thanks this helps! But then should'nt it be earthacess.api since we are using the login function from the api.py file?

Not necessary, as the functions in api.py are exported at the top level.

I ran into this error while using earthacess instead of earthdata FAILED tests/unit/test_deprecations.py::test_deprecation_warning_for_store - ValueError: At least one of the concept_id, daac, provider or endpointparam...

This something different now where can we access daac and provider?

@mfisher87
Copy link
Collaborator

@mfisher87, can you tell if the failing integration tests are related to the recent integration test troubles you've been working on recently?

Just closing the loop -- we think we need to set up credentials in the fork repo. More details in #773 #736

@chuckwondo
Copy link
Collaborator

chuckwondo commented Jul 24, 2024

Hey just a small doubt. What is earthdata in the above code for the authentication. I am running into an attribute error and I am not sure what is earthdata in this case?

It's a typo. Apologies. It should be earthaccess.

Thanks this helps! But then should'nt it be earthacess.api since we are using the login function from the api.py file?

Not necessary, as the functions in api.py are exported at the top level.

I ran into this error while using earthacess instead of earthdata FAILED tests/unit/test_deprecations.py::test_deprecation_warning_for_store - ValueError: At least one of the concept_id, daac, provider or endpointparam...

This something different now where can we access daac and provider?

Ah, okay. I think we can ignore the ValueError and still get the deprecation warning we expect. So instead of this line:

store.get_s3fs_session()

do this to suppress the ValueError:

with contextlib.suppress(ValueError):
    store.get_s3fs_session()

and at the top of the file, add this import:

import contextlib

@chuckwondo
Copy link
Collaborator

Just linking this PR to it's issue #766

@Sherwin-14
Copy link
Contributor Author

I have added unit test for deprecations

Copy link
Collaborator

@chuckwondo chuckwondo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Sherwin-14, nicely done!

@mfisher87, regarding the failing integration tests, do you think that once this PR is merged that perhaps they will succeed because they should end up pulling the creds from the nsidc/earthaccess repo? Any harm in merging and trying?

@mfisher87
Copy link
Collaborator

Awesome job @Sherwin-14 !

I do think they should pass on the main branch after a merge, because our repository credentials will be used. Would you be able to keep an eye on the tests after merge, and if they don't pass, open an issue?

Remember, the "10% rule" is still in place, so even if the integration tests say they passed, up to 10% may have failed. We really need to get on top of fixing that!

@Sherwin-14
Copy link
Contributor Author

Awesome job @Sherwin-14 !

I do think they should pass on the main branch after a merge, because our repository credentials will be used. Would you be able to keep an eye on the tests after merge, and if they don't pass, open an issue?

Remember, the "10% rule" is still in place, so even if the integration tests say they passed, up to 10% may have failed. We really need to get on top of fixing that!

So was my BASE repo theory right? I am kind of still confused this is quite complicated

@mfisher87
Copy link
Collaborator

I agree it is quite complicated. I think at this point we really don't know the cause, but we expect things to work after merging to main because the complexity of the fork rules are out of the picture :)

@chuckwondo
Copy link
Collaborator

I agree it is quite complicated. I think at this point we really don't know the cause, but we expect things to work after merging to main because the complexity of the fork rules are out of the picture :)

@mfisher87, shall I merge? @Sherwin-14 are you okay if I do a squash merge?

@mfisher87
Copy link
Collaborator

I'm good with a merge, but for whoever clicks merge: can you please keep an eye on the tests on the main branch to ensure they do pass? I am concerned that if they fail we may not notice right away.

@Sherwin-14
Copy link
Contributor Author

I agree it is quite complicated. I think at this point we really don't know the cause, but we expect things to work after merging to main because the complexity of the fork rules are out of the picture :)

@mfisher87, shall I merge? @Sherwin-14 are you okay if I do a squash merge?

Yeah I am okay with that

@chuckwondo chuckwondo merged commit 0945db5 into nsidc:main Jul 27, 2024
7 of 11 checks passed
@chuckwondo
Copy link
Collaborator

Fixes #766

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants