Enable Cloudflare bad bot protection #4802

Open
5 tasks
lucyb opened this issue Jan 13, 2025 · 4 comments


@lucyb
Contributor

lucyb commented Jan 13, 2025

We believe that Job Server is seeing an increase in bot-related traffic and that this is having an impact on page load times, which is affecting users (thread 1, 2, 3).

Moving Job Server behind Cloudflare and enabling its bad bot protection would demonstrate whether or not that's the case, and if so, it will resolve the issue.

We need to:

  • Set the user-agent for the OpenSAFELY CLI to something sensible. Currently it's set to Python/Requests and that will be falsely picked up as a "bad" bot. We found this was the case for OpenCodelists and recorded our experience in this issue.
  • Add custom nginx rules to allow traffic to the API endpoints directly from the secure backends, as they are only able to talk out to that single IP address.
  • Enable the Cloudflare firewall to count users, as with OpenCodelists in this issue, and monitor for a short period.
  • Enable Cloudflare's "Bot Fight Mode".
  • Return to review the impact, using Honeycomb, after a few weeks.
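
For illustration, the first task could look roughly like the sketch below. It only builds a descriptive User-Agent string and attaches it to a request without sending anything. The tool name `opensafely-cli`, the version `0.0.0`, and the string format are assumptions for the example, not what the CLI actually uses, and the standard library's `urllib` stands in for whichever HTTP client the CLI really uses.

```python
import platform
import urllib.request

# Hypothetical tool name and version, for illustration only.
TOOL_NAME = "opensafely-cli"
TOOL_VERSION = "0.0.0"


def build_user_agent(name: str, version: str) -> str:
    """Return a descriptive User-Agent of the form 'name/version (Python x.y.z)'."""
    return f"{name}/{version} (Python {platform.python_version()})"


def build_request(url: str) -> urllib.request.Request:
    """Build (but do not send) a request carrying the custom User-Agent."""
    headers = {"User-Agent": build_user_agent(TOOL_NAME, TOOL_VERSION)}
    return urllib.request.Request(url, headers=headers)


request = build_request("https://jobs.opensafely.org/")
# urllib normalises header names, so the key is read back as "User-agent".
print(request.get_header("User-agent"))
```

If the CLI uses the Requests library, the same string could be set once on a session with `session.headers["User-Agent"] = ...` so that every call sends it.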
@bloodearnest
Member

We cannot currently put job-server behind Cloudflare, for security reasons.

Cloudflare uses shared IPs, and our firewall rules in TPP allow egress to the job-server IP. If that became a shared Cloudflare IP, an attacker could put their own hosting behind Cloudflare, get that same IP, and then egress from TPP to their hosting.

This is true for all services hosted on dokku4.

We used to have Cloudflare in front of job-server, and we disabled it for this reason.

@evansd
Contributor

evansd commented Jan 13, 2025

I think we can do this, but the backends will need to continue to use the original IP to talk directly to dokku4 and we will need to configure things so that the backends (and only the backends) are allowed to do direct connections and bypass Cloudflare.
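
As a rough sketch of the "backends and only the backends" rule, the check comes down to an IP allowlist. The real thing would live in nginx config rather than Python, and the networks below are placeholders from the RFC 5737 documentation ranges, since the actual backend egress addresses aren't given here.

```python
import ipaddress

# Placeholder networks (RFC 5737 documentation ranges); the real backend
# egress addresses are an assumption and are not stated in this thread.
BACKEND_NETWORKS = [
    ipaddress.ip_network("192.0.2.10/32"),
    ipaddress.ip_network("198.51.100.0/28"),
]


def may_bypass_cloudflare(client_ip: str) -> bool:
    """Return True only if the client address belongs to a known backend."""
    address = ipaddress.ip_address(client_ip)
    return any(address in network for network in BACKEND_NETWORKS)


print(may_bypass_cloudflare("192.0.2.10"))   # inside the first network
print(may_bypass_cloudflare("203.0.113.5"))  # not in any listed network
```

Anything that fails this check and arrives directly at dokku4 (rather than via Cloudflare) would be rejected.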

For our own services on the backend we use hardcoded /etc/hosts entries and bypass DNS completely so it doesn't matter to them where the public DNS entry for jobs.opensafely.org points.

The only question will be whether TPP use the hostname or the IP in their firewall config. I'm pretty sure it's the IP but obviously we'd need to confirm that first.

There was a tiny bit of discussion about this ages ago, but it fizzled out:
https://bennettoxford.slack.com/archives/C069YDR4NCA/p1723533439182359?thread_ts=1723462393.112779&cid=C069YDR4NCA

@bloodearnest
Member

> I think we can do this, but the backends will need to continue to use the original IP to talk directly to dokku4 and we will need to configure things so that the backends (and only the backends) are allowed to do direct connections and bypass Cloudflare.
>
> For our own services on the backend we use hardcoded /etc/hosts entries and bypass DNS completely so it doesn't matter to them where the public DNS entry for jobs.opensafely.org points.

Yes, we do have control of backend DNS now, so this is possible, good point.

> The only question will be whether TPP use the hostname or the IP in their firewall config. I'm pretty sure it's the IP but obviously we'd need to confirm that first.

So I thought it was IP-based, but I think it may be domain-based. When the DNS for archive.ubuntu.com changed recently, the TPP firewall tracked the change, and because our hardcoded DNS entry still pointed at the old IP address, we lost access. We'd need to check with TPP whether it's IP-based or domain-based.
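
To make the archive.ubuntu.com failure mode concrete, here is a small sketch that flags pinned hosts entries whose IP no longer appears in public DNS. The function names and the addresses are made up for illustration. Note that `resolve_public` would have to run somewhere without the pin, because the system resolver on a backend would just return the hardcoded entry.

```python
import socket


def parse_hosts(text: str) -> dict[str, str]:
    """Map each hostname in hosts-file text to its pinned IP address."""
    pinned: dict[str, str] = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        ip, *names = line.split()
        for name in names:
            pinned.setdefault(name, ip)  # keep the first entry seen
    return pinned


def stale_pins(pinned: dict[str, str], public: dict[str, set[str]]) -> list[str]:
    """Return hostnames whose pinned IP is absent from the public DNS answers."""
    return sorted(
        name for name, ip in pinned.items()
        if name in public and ip not in public[name]
    )


def resolve_public(hostname: str) -> set[str]:
    """Look up current IPv4 addresses for a hostname (needs network access)."""
    return set(socket.gethostbyname_ex(hostname)[2])


# Placeholder addresses from documentation ranges, not real entries.
hosts_text = """
192.0.2.10  archive.ubuntu.com
192.0.2.20  jobs.opensafely.org
"""
public = {
    "archive.ubuntu.com": {"198.51.100.7"},
    "jobs.opensafely.org": {"192.0.2.20"},
}
print(stale_pins(parse_hosts(hosts_text), public))  # ['archive.ubuntu.com']
```

If the TPP firewall is domain-based, any hostname this reports would be one where the backend is pinned to an address the firewall has stopped allowing.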

And whilst this is possible, it's adding a level of network complexity, and that is generally the last place you want to add complexity.

@evansd
Contributor

evansd commented Jan 14, 2025

A slightly more involved option, but still achievable without major architectural changes, would be to change the hostname the backends use to refer to job-server e.g. to api.opensafely.org. We could configure that to resolve directly to dokku4, while jobs.opensafely.org would go via Cloudflare.

If TPP are using domains to configure their firewall then we'll have to use a different domain anyway if we want to go down that route.

(Re-opening as I'm assuming that was an inadvertent close.)

@evansd evansd reopened this Jan 14, 2025