Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix unexpected "Vary" #735

Open
wants to merge 4 commits into
base: main
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
21 changes: 20 additions & 1 deletion django_structlog/middlewares/request.py
Original file line number Diff line number Diff line change
Expand Up @@ -21,6 +21,7 @@
from django.core.exceptions import PermissionDenied
from django.core.signals import got_request_exception
from django.http import Http404, StreamingHttpResponse
from django.utils.functional import SimpleLazyObject

from .. import signals
from ..app_settings import app_settings
Expand All @@ -39,6 +40,7 @@
if TYPE_CHECKING: # pragma: no cover
from types import TracebackType

from django.contrib.auth.base_user import AbstractBaseUser
from django.http import HttpRequest, HttpResponse

logger = structlog.getLogger(__name__)
Expand Down Expand Up @@ -207,13 +209,30 @@ def format_request(request: "HttpRequest") -> str:
@staticmethod
def bind_user_id(request: "HttpRequest") -> None:
user_id_field = app_settings.USER_ID_FIELD
if hasattr(request, "user") and request.user is not None and user_id_field:
if not user_id_field:
return

session_was_accessed = (
request.session.accessed if hasattr(request, "session") else None
)

if hasattr(request, "user") and request.user is not None:
user_id = None
if hasattr(request.user, user_id_field):
user_id = getattr(request.user, user_id_field)
if isinstance(user_id, uuid.UUID):
user_id = str(user_id)
structlog.contextvars.bind_contextvars(user_id=user_id)
if session_was_accessed is not None and not session_was_accessed:
"""using SessionMiddleware but user was never accessed, must reset accessed state"""
user = request.user

def get_user() -> Any:
request.session.accessed = True
return user

request.user = cast("AbstractBaseUser", SimpleLazyObject(get_user))
request.session.accessed = False
Comment on lines -210 to +235
Copy link

@last-partizan last-partizan Jan 14, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While this is working as expected, I'm not sure this is the best way to go, because we're accessing user and then lying to SessionMiddleware basically.

I see two possible alternatives here.

1. Use truly lazy user_id, something along these lines:

    @staticmethod
    def bind_user_id(request: "HttpRequest") -> None:
        class LazyUserId:
            def __repr__(self):
                user_id_field = app_settings.USER_ID_FIELD
                user_id = None
                if hasattr(request, "user") and request.user is not None and user_id_field:
                    if hasattr(request.user, user_id_field):
                        user_id = getattr(request.user, user_id_field)
                return str(user_id)

        structlog.contextvars.bind_contextvars(user_id=LazyUserId())

If log level is higher than info, these variables (logger.info("request_started", **log_kwargs)) aren't used, so we do not access user_id at all. No Vary: Cookie.

But, with log level <= INFO, those logs are written, and so Vary: Cookie added.

2. Check if request.session was already accessed, and add user_id only in this case.

    @staticmethod
    def bind_user_id(request: "HttpRequest") -> None:
        user_id_field = app_settings.USER_ID_FIELD
        if (
            hasattr(request, "session")
            and request.session.accessed
            and hasattr(request, "user")
            and request.user is not None
            and user_id_field
        ):
            user_id = None
            if hasattr(request.user, user_id_field):
                user_id = getattr(request.user, user_id_field)
                if isinstance(user_id, uuid.UUID):
                    user_id = str(user_id)
            structlog.contextvars.bind_contextvars(user_id=user_id)

That's also lying but from another angle - request was authorized, but it does not matter because user was not accessed.


I would prefer first option, because it's truly lazy and reflects reality as close as possible.

But, if you think modifying request.session.accessed is safe approach - it's good enough for me. Request has no vary: Cookie and that is what matters most.

After we agree on the best approach here, let me check this one more time in real app scenario, and I'll report back how it works.

Copy link
Owner Author

@jrobichaud jrobichaud Jan 14, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While this is working as expected, I'm not sure this is the best way to go, because we're accessing user and then lying to SessionMiddleware basically.

The purpose of Vary is to tell if the content returned may change for caching purpose.

Normally checking if the user has been accessed is a good way to tell if the content may change. Ex: a condition to render something based on the user, checking permissions, etc.

However django-structlog accessing the user won't affect the template at all. django-structlog needs to tell which user accessed a route even if the user was not accessed in the view.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fair point. I have tested this code and it works as expected.

I had to make some minor changes, to catch a case when this was included before AuthenticationMiddleware, and my final version looks like this:

    @staticmethod
    def bind_user_id(request: "HttpRequest") -> None:
        user_id_field = app_settings.USER_ID_FIELD
        if not user_id_field:
            return

        if not hasattr(request, "user"):
            return

        session_was_accessed = (
            request.session.accessed if hasattr(request, "session") else None
        )

        if request.user is not None:
            user_id = None
            if hasattr(request.user, user_id_field):
                user_id = getattr(request.user, user_id_field)
                if isinstance(user_id, uuid.UUID):
                    user_id = str(user_id)
            structlog.contextvars.bind_contextvars(user_id=user_id)

        if session_was_accessed is False:
            """using SessionMiddleware but user was never accessed, must reset accessed state"""

            user = request.user

            def get_user() -> Any:
                request.session.accessed = True
                return user

            request.user = cast("AbstractBaseUser", SimpleLazyObject(get_user))
            request.session.accessed = False


def process_got_request_exception(
self, sender: Type[Any], request: "HttpRequest", **kwargs: Any
Expand Down
93 changes: 93 additions & 0 deletions test_app/tests/middlewares/test_request.py
Original file line number Diff line number Diff line change
Expand Up @@ -7,7 +7,9 @@
from unittest.mock import AsyncMock, Mock, patch

import structlog
from django.contrib.auth.middleware import AuthenticationMiddleware
from django.contrib.auth.models import AnonymousUser, User
from django.contrib.sessions.middleware import SessionMiddleware
from django.contrib.sites.models import Site
from django.contrib.sites.shortcuts import get_current_site
from django.core.exceptions import PermissionDenied
Expand Down Expand Up @@ -236,6 +238,97 @@ class SimpleUser:
self.assertIn("user_id", record.msg)
self.assertIsNone(record.msg["user_id"])

@override_settings(
SECRET_KEY="00000000000000000000000000000000",
)
def test_process_request_session_middleware_without_vary(self) -> None:
def get_response(_request: HttpRequest) -> HttpResponse:
with self.assertLogs(__name__, logging.INFO) as log_results:
self.logger.info("hello")
self.log_results = log_results
return HttpResponse()

request = self.factory.get("/foo")

# simulate SessionMiddleware, AuthenticationMiddleware, and RequestMiddleware called in that order
request_middleware = RequestMiddleware(get_response)
authentication_middleware = AuthenticationMiddleware(
cast(
Any,
lambda r: request_middleware(r),
)
)
session_middleware = SessionMiddleware(
cast(Any, lambda r: authentication_middleware(r))
)
response = session_middleware(request)

self.assertEqual(1, len(self.log_results.records))
record = self.log_results.records[0]
self.assertIsNone(cast(HttpResponse, response).headers.get("Vary"))

self.assertEqual("INFO", record.levelname)

self.assertIn("user_id", record.msg)
self.assertIsNone(record.msg["user_id"])

@override_settings(
SECRET_KEY="00000000000000000000000000000000",
)
def test_process_request_session_middleware_with_vary(self) -> None:
def get_response(_request: HttpRequest) -> HttpResponse:
assert isinstance(
request.user, AnonymousUser
) # force evaluate user to trigger session middleware
with self.assertLogs(__name__, logging.INFO) as log_results:
self.logger.info("hello")
self.log_results = log_results
return HttpResponse()

request = self.factory.get("/foo")

# simulate SessionMiddleware, AuthenticationMiddleware, and RequestMiddleware called in that order
request_middleware = RequestMiddleware(get_response)
authentication_middleware = AuthenticationMiddleware(
cast(Any, lambda r: request_middleware(r))
)
session_middleware = SessionMiddleware(
cast(Any, lambda r: authentication_middleware(r))
)
response = session_middleware(request)

self.assertEqual(1, len(self.log_results.records))
record = self.log_results.records[0]
self.assertIsNotNone(cast(HttpResponse, response).headers.get("Vary"))

self.assertEqual("INFO", record.levelname)

self.assertIn("user_id", record.msg)
self.assertIsNone(record.msg["user_id"])

@override_settings(
DJANGO_STRUCTLOG_USER_ID_FIELD=None,
)
def test_process_request_no_user_id_field(self) -> None:
def get_response(_request: HttpRequest) -> HttpResponse:
with self.assertLogs(__name__, logging.INFO) as log_results:
self.logger.info("hello")
self.log_results = log_results
return HttpResponse()

request = self.factory.get("/foo")

middleware = RequestMiddleware(get_response)
response = middleware(request)
self.assertEqual(200, cast(HttpResponse, response).status_code)

self.assertEqual(1, len(self.log_results.records))
record = self.log_results.records[0]

self.assertEqual("INFO", record.levelname)

self.assertNotIn("user_id", record.msg)

def test_log_user_in_request_finished(self) -> None:
mock_response = Mock()
mock_response.status_code = 200
Expand Down
Loading