Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(api): Fix bug where releases with slashes in their name failed to load #10940

Closed
wants to merge 2 commits into from

Conversation

wedamija
Copy link
Member

@wedamija wedamija commented Dec 5, 2018

Not sure this is actually practical in production, because it relies on REQUEST_URI, which I'm not sure will always be available. Also unsure whether using it to overwrite PATH_INFO could have other negative effects.

Possibly a safer option option could be to include the encoded project version in ReleaseSerializer as something like safeVersion. Then the FE could just include that as a parameter when making requests to the api, and it'd get double encoded when it encodes the url. Then we can just decode the version like in this diff and it'll work as expected.

Fixes #16012.

@mattrobenolt
Copy link
Contributor

I'd much rather we lean on some sort of double encoding instead of mutating the wsgi environ dict.

@wedamija
Copy link
Member Author

wedamija commented Dec 5, 2018

I'd much rather we lean on some sort of double encoding instead of mutating the wsgi environ dict.

Yeah, I think I agree. Downside is that the URL will be really ugly, but release urls are generally going to be ugly in most cases.

@mattrobenolt
Copy link
Contributor

I think the aesthetic of the URL is the least of the concerns imo. :)

@benvinegar
Copy link
Contributor

I think the aesthetic of the URL is the least of the concerns imo. :)

Agreed.

@markstory
Copy link
Member

Double encoding could work well, how would customers know that they need to double encode? Exposing the doubly encoded version as a property in the serializer would make this and our own front-end easier.

We will also need to update sentry-cli to do additional encoding.

@wedamija wedamija force-pushed the ProjectIssuesSlashHack branch from 1cd3ee5 to fdd270c Compare December 7, 2018 00:53
@wedamija wedamija changed the title (WIP) Hacky POC for supporting slashes in release urls fix(api): Fix bug where releases with slashes in their name failed to load Dec 7, 2018
@wedamija
Copy link
Member Author

wedamija commented Dec 7, 2018

We'll need follow-up FE diff to start using the new safeVersion value when we're passing version as part of the path. If it's in the querystring or a POST payload then the original version should still be used.

Also need to update sentry-cli

@@ -256,6 +257,7 @@ def serialize(self, obj, attrs, user, *args, **kwargs):
d = {
'version': obj.version,
'shortVersion': obj.short_version,
'safeVersion': quote(obj.version, safe=''),
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Open to other names for this if anyone has opinions

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some other options could be:

  • slug
  • urlVersion
  • pathName

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Definitely not safeVersion

Slug is what we use elsewhere (e.g. project.slug).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

slug sounds like the best option

@@ -256,6 +257,7 @@ def serialize(self, obj, attrs, user, *args, **kwargs):
d = {
'version': obj.version,
'shortVersion': obj.short_version,
'safeVersion': quote(obj.version, safe=''),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some other options could be:

  • slug
  • urlVersion
  • pathName

@@ -40,3 +42,18 @@ def get_date_range_from_params(params):
raise InvalidParams('start must be before end')

return (start, end)


def get_release(organization, version, projects=None):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be a class/static method on the Release model? 🚲🏠

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was thinking about this, but I don't think so. It's really just an API concern, the rest of the codebase doesn't need to know about the double encoding

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That said, I don't know that utils is a great place either...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Personally I lean towards 'utils' should never be used, but we're way past that point.


@fixture
def release_2(self):
release = Release.objects.create(organization_id=self.organization.id, version='12345678')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should one of these releases have a / in the version?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, these aren't used in the slash tests. I was mostly refactoring to remove a bunch of duplication here

Testing when a repo gets deleted leaving dangling last commit id and author_ids
Made the decision that the Serializer must handle the data even in the case that the
commit_id or the author_ids point to records that do not exist.
"""
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this related to this test?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops, thanks!

… load

Releases with slashes in their names failed to load, even when encoded. This is because Django decodes the path before matching any urls, and so `%2F` is converted back to `/`, and then our url regexes are unable to properly match.

To resolve this we're adding a `safeVersion` to the `ReleaseSerializer`. This encodes the version before being returned, so that when the FE passes it back to us it'll be double encoded, and thus we'll preserve the encoding in the path. We then need to unencode before using the version to retrieve a `Release`.

Added a fallback in to handle the case where a version looks like it's encoded and we inadvertently cause problems with loading it.

Fixes ISSUE-203
@wedamija wedamija force-pushed the ProjectIssuesSlashHack branch from fdd270c to be709fc Compare December 7, 2018 18:40
@benvinegar
Copy link
Contributor

I'm a little nervous that this is a departure from what we've done before. For example, if this is slug, why encode it at all and just strip scary characters and call it a day?

@wedamija
Copy link
Member Author

wedamija commented Dec 7, 2018

I'm a little nervous that this is a departure from what we've done before. For example, if this is slug, why encode it at all and just strip scary characters and call it a day?

Slug is probably not a perfect name, we could go with something like pathName to be more specific.

One reason to not just strip those characters is so that we can show the release name as the user expects, rather than some munged name. I don't really have enough context to know whether that's important though.

We could just go down the path of adding an actual slug column and doing a data migration as well.

@mattrobenolt
Copy link
Contributor

If we strip characters, you can't reverse it back into the database column. So it becomes a lossy process that can't be reversed. Double encoding is reversable at least.

Ideally, you're right, we should store a real slug, but that's gonna be more work than it sounds on the surface. :)

@wedamija
Copy link
Member Author

@mattrobenolt @markstory Should we move forward with this then? As is, it's backwards compatible so we can patch the FE and sentry-cli once this is live.

@markstory
Copy link
Member

I think this moves us towards something better. If this solves the issues customers are facing we could revisit adding a column and filling it with 'clean' data in the future.

@BYK BYK self-assigned this Oct 28, 2020
@github-actions
Copy link
Contributor

github-actions bot commented Jan 8, 2021

This pull request has gone three weeks without activity. In another week, I will close it.

But! If you comment or otherwise update it, I will reset the clock, and if you label it Status: Accepted, I will leave it alone ... forever!


"A weed is but an unloved flower." ― Ella Wheeler Wilcox 🥀

@vercel
Copy link

vercel bot commented Jan 15, 2021

@wedamija is attempting to deploy a commit to the Sentry Team on Vercel.

A member of the Team first needs to authorize it.

@wedamija wedamija closed this Aug 31, 2021
@github-actions github-actions bot locked and limited conversation to collaborators Sep 15, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Decode(?) Sentry release name
5 participants