Lots of requests to mirrorbrain from Torrent client #290

Open
rgaudin opened this issue Oct 24, 2024 · 10 comments
Labels
question Further information is requested upstream

Comments

@rgaudin
Member

rgaudin commented Oct 24, 2024

While looking at the mirrorbrain logs, I found a single client in Germany to be constantly sending requests to a ZIM file.

download.kiwix.org 80.239.140.XX - - [24/Oct/2024:11:22:54 +0000] "GET /zim/zimit/survivorlibrary.com_en_all_2024-09.zim HTTP/1.1" 302 257 "-" "Transmission/4.0.5"
download.kiwix.org 80.239.140.XX - - [24/Oct/2024:11:22:54 +0000] "GET /zim/zimit/survivorlibrary.com_en_all_2024-09.zim HTTP/1.1" 302 279 "-" "Transmission/4.0.5"
download.kiwix.org 80.239.140.XX - - [24/Oct/2024:11:22:54 +0000] "GET /zim/zimit/survivorlibrary.com_en_all_2024-09.zim HTTP/1.1" 302 279 "-" "Transmission/4.0.5"
download.kiwix.org 80.239.140.XX - - [24/Oct/2024:11:22:54 +0000] "GET /zim/zimit/survivorlibrary.com_en_all_2024-09.zim HTTP/1.1" 302 279 "-" "Transmission/4.0.5"
download.kiwix.org 80.239.140.XX - - [24/Oct/2024:11:22:54 +0000] "GET /zim/zimit/survivorlibrary.com_en_all_2024-09.zim HTTP/1.1" 302 279 "-" "Transmission/4.0.5"
download.kiwix.org 80.239.140.XX - - [24/Oct/2024:11:22:54 +0000] "GET /zim/zimit/survivorlibrary.com_en_all_2024-09.zim HTTP/1.1" 302 263 "-" "Transmission/4.0.5"
download.kiwix.org 80.239.140.XX - - [24/Oct/2024:11:22:54 +0000] "GET /zim/zimit/survivorlibrary.com_en_all_2024-09.zim HTTP/1.1" 302 257 "-" "Transmission/4.0.5"
download.kiwix.org 80.239.140.XX - - [24/Oct/2024:11:22:54 +0000] "GET /zim/zimit/survivorlibrary.com_en_all_2024-09.zim HTTP/1.1" 302 303 "-" "Transmission/4.0.5"
download.kiwix.org 80.239.140.XX - - [24/Oct/2024:11:22:55 +0000] "GET /zim/zimit/survivorlibrary.com_en_all_2024-09.zim HTTP/1.1" 302 303 "-" "Transmission/4.0.5"
download.kiwix.org 80.239.140.XX - - [24/Oct/2024:11:22:55 +0000] "GET /zim/zimit/survivorlibrary.com_en_all_2024-09.zim HTTP/1.1" 302 303 "-" "Transmission/4.0.5"
download.kiwix.org 80.239.140.XX - - [24/Oct/2024:11:22:56 +0000] "GET /zim/zimit/survivorlibrary.com_en_all_2024-09.zim HTTP/1.1" 302 257 "-" "Transmission/4.0.5"
download.kiwix.org 80.239.140.XX - - [24/Oct/2024:11:22:56 +0000] "GET /zim/zimit/survivorlibrary.com_en_all_2024-09.zim HTTP/1.1" 302 303 "-" "Transmission/4.0.5"
download.kiwix.org 80.239.140.XX - - [24/Oct/2024:11:22:56 +0000] "GET /zim/zimit/survivorlibrary.com_en_all_2024-09.zim HTTP/1.1" 302 257 "-" "Transmission/4.0.5"
download.kiwix.org 80.239.140.XX - - [24/Oct/2024:11:22:57 +0000] "GET /zim/zimit/survivorlibrary.com_en_all_2024-09.zim HTTP/1.1" 302 257 "-" "Transmission/4.0.5"
download.kiwix.org 80.239.140.XX - - [24/Oct/2024:11:22:57 +0000] "GET /zim/zimit/survivorlibrary.com_en_all_2024-09.zim HTTP/1.1" 302 257 "-" "Transmission/4.0.5"
download.kiwix.org 80.239.140.XX - - [24/Oct/2024:11:22:57 +0000] "GET /zim/zimit/survivorlibrary.com_en_all_2024-09.zim HTTP/1.1" 302 303 "-" "Transmission/4.0.5"
download.kiwix.org 80.239.140.XX - - [24/Oct/2024:11:22:57 +0000] "GET /zim/zimit/survivorlibrary.com_en_all_2024-09.zim HTTP/1.1" 302 279 "-" "Transmission/4.0.5"
download.kiwix.org 80.239.140.XX - - [24/Oct/2024:11:22:57 +0000] "GET /zim/zimit/survivorlibrary.com_en_all_2024-09.zim HTTP/1.1" 302 303 "-" "Transmission/4.0.5"
download.kiwix.org 80.239.140.XX - - [24/Oct/2024:11:22:58 +0000] "GET /zim/zimit/survivorlibrary.com_en_all_2024-09.zim HTTP/1.1" 302 303 "-" "Transmission/4.0.5"
download.kiwix.org 80.239.140.XX - - [24/Oct/2024:11:22:58 +0000] "GET /zim/zimit/survivorlibrary.com_en_all_2024-09.zim HTTP/1.1" 302 257 "-" "Transmission/4.0.5"
download.kiwix.org 80.239.140.XX - - [24/Oct/2024:11:22:58 +0000] "GET /zim/zimit/survivorlibrary.com_en_all_2024-09.zim HTTP/1.1" 302 257 "-" "Transmission/4.0.5"
download.kiwix.org 80.239.140.XX - - [24/Oct/2024:11:22:58 +0000] "GET /zim/zimit/survivorlibrary.com_en_all_2024-09.zim HTTP/1.1" 302 303 "-" "Transmission/4.0.5"
download.kiwix.org 80.239.140.XX - - [24/Oct/2024:11:22:58 +0000] "GET /zim/zimit/survivorlibrary.com_en_all_2024-09.zim HTTP/1.1" 302 257 "-" "Transmission/4.0.5"
download.kiwix.org 80.239.140.XX - - [24/Oct/2024:11:22:58 +0000] "GET /zim/zimit/survivorlibrary.com_en_all_2024-09.zim HTTP/1.1" 302 257 "-" "Transmission/4.0.5"
download.kiwix.org 80.239.140.XX - - [24/Oct/2024:11:22:58 +0000] "GET /zim/zimit/survivorlibrary.com_en_all_2024-09.zim HTTP/1.1" 302 303 "-" "Transmission/4.0.5"
download.kiwix.org 80.239.140.XX - - [24/Oct/2024:11:22:58 +0000] "GET /zim/zimit/survivorlibrary.com_en_all_2024-09.zim HTTP/1.1" 302 303 "-" "Transmission/4.0.5"
download.kiwix.org 80.239.140.XX - - [24/Oct/2024:11:22:58 +0000] "GET /zim/zimit/survivorlibrary.com_en_all_2024-09.zim HTTP/1.1" 302 257 "-" "Transmission/4.0.5"
download.kiwix.org 80.239.140.XX - - [24/Oct/2024:11:22:58 +0000] "GET /zim/zimit/survivorlibrary.com_en_all_2024-09.zim HTTP/1.1" 302 279 "-" "Transmission/4.0.5"
download.kiwix.org 80.239.140.XX - - [24/Oct/2024:11:22:59 +0000] "GET /zim/zimit/survivorlibrary.com_en_all_2024-09.zim HTTP/1.1" 302 257 "-" "Transmission/4.0.5"
download.kiwix.org 80.239.140.XX - - [24/Oct/2024:11:22:59 +0000] "GET /zim/zimit/survivorlibrary.com_en_all_2024-09.zim HTTP/1.1" 302 257 "-" "Transmission/4.0.5"
download.kiwix.org 80.239.140.XX - - [24/Oct/2024:11:22:59 +0000] "GET /zim/zimit/survivorlibrary.com_en_all_2024-09.zim HTTP/1.1" 302 303 "-" "Transmission/4.0.5"
download.kiwix.org 80.239.140.XX - - [24/Oct/2024:11:23:00 +0000] "GET /zim/zimit/survivorlibrary.com_en_all_2024-09.zim HTTP/1.1" 302 257 "-" "Transmission/4.0.5"
download.kiwix.org 80.239.140.XX - - [24/Oct/2024:11:23:00 +0000] "GET /zim/zimit/survivorlibrary.com_en_all_2024-09.zim HTTP/1.1" 302 303 "-" "Transmission/4.0.5"
download.kiwix.org 80.239.140.XX - - [24/Oct/2024:11:23:00 +0000] "GET /zim/zimit/survivorlibrary.com_en_all_2024-09.zim HTTP/1.1" 302 303 "-" "Transmission/4.0.5"
download.kiwix.org 80.239.140.XX - - [24/Oct/2024:11:23:00 +0000] "GET /zim/zimit/survivorlibrary.com_en_all_2024-09.zim HTTP/1.1" 302 303 "-" "Transmission/4.0.5"
download.kiwix.org 80.239.140.XX - - [24/Oct/2024:11:23:02 +0000] "GET /zim/zimit/survivorlibrary.com_en_all_2024-09.zim HTTP/1.1" 302 303 "-" "Transmission/4.0.5"
download.kiwix.org 80.239.140.XX - - [24/Oct/2024:11:23:02 +0000] "GET /zim/zimit/survivorlibrary.com_en_all_2024-09.zim HTTP/1.1" 302 257 "-" "Transmission/4.0.5"
download.kiwix.org 80.239.140.XX - - [24/Oct/2024:11:23:02 +0000] "GET /zim/zimit/survivorlibrary.com_en_all_2024-09.zim HTTP/1.1" 302 279 "-" "Transmission/4.0.5"
download.kiwix.org 80.239.140.XX - - [24/Oct/2024:11:23:02 +0000] "GET /zim/zimit/survivorlibrary.com_en_all_2024-09.zim HTTP/1.1" 302 257 "-" "Transmission/4.0.5"
download.kiwix.org 80.239.140.XX - - [24/Oct/2024:11:23:02 +0000] "GET /zim/zimit/survivorlibrary.com_en_all_2024-09.zim HTTP/1.1" 302 303 "-" "Transmission/4.0.5"
download.kiwix.org 80.239.140.XX - - [24/Oct/2024:11:23:02 +0000] "GET /zim/zimit/survivorlibrary.com_en_all_2024-09.zim HTTP/1.1" 302 279 "-" "Transmission/4.0.5"
download.kiwix.org 80.239.140.XX - - [24/Oct/2024:11:23:02 +0000] "GET /zim/zimit/survivorlibrary.com_en_all_2024-09.zim HTTP/1.1" 302 263 "-" "Transmission/4.0.5"
download.kiwix.org 80.239.140.XX - - [24/Oct/2024:11:23:04 +0000] "GET /zim/zimit/survivorlibrary.com_en_all_2024-09.zim HTTP/1.1" 302 257 "-" "Transmission/4.0.5"
download.kiwix.org 80.239.140.XX - - [24/Oct/2024:11:23:05 +0000] "GET /zim/zimit/survivorlibrary.com_en_all_2024-09.zim HTTP/1.1" 302 257 "-" "Transmission/4.0.5"
download.kiwix.org 80.239.140.XX - - [24/Oct/2024:11:23:05 +0000] "GET /zim/zimit/survivorlibrary.com_en_all_2024-09.zim HTTP/1.1" 302 279 "-" "Transmission/4.0.5"
download.kiwix.org 80.239.140.XX - - [24/Oct/2024:11:23:05 +0000] "GET /zim/zimit/survivorlibrary.com_en_all_2024-09.zim HTTP/1.1" 302 279 "-" "Transmission/4.0.5"
download.kiwix.org 80.239.140.XX - - [24/Oct/2024:11:23:05 +0000] "GET /zim/zimit/survivorlibrary.com_en_all_2024-09.zim HTTP/1.1" 302 279 "-" "Transmission/4.0.5"
download.kiwix.org 80.239.140.XX - - [24/Oct/2024:11:23:05 +0000] "GET /zim/zimit/survivorlibrary.com_en_all_2024-09.zim HTTP/1.1" 302 303 "-" "Transmission/4.0.5"
download.kiwix.org 80.239.140.XX - - [24/Oct/2024:11:23:05 +0000] "GET /zim/zimit/survivorlibrary.com_en_all_2024-09.zim HTTP/1.1" 302 279 "-" "Transmission/4.0.5"
download.kiwix.org 80.239.140.XX - - [24/Oct/2024:11:23:06 +0000] "GET /zim/zimit/survivorlibrary.com_en_all_2024-09.zim HTTP/1.1" 302 263 "-" "Transmission/4.0.5"
download.kiwix.org 80.239.140.XX - - [24/Oct/2024:11:23:06 +0000] "GET /zim/zimit/survivorlibrary.com_en_all_2024-09.zim HTTP/1.1" 302 263 "-" "Transmission/4.0.5"
download.kiwix.org 80.239.140.XX - - [24/Oct/2024:11:23:06 +0000] "GET /zim/zimit/survivorlibrary.com_en_all_2024-09.zim HTTP/1.1" 302 257 "-" "Transmission/4.0.5"
download.kiwix.org 80.239.140.XX - - [24/Oct/2024:11:23:06 +0000] "GET /zim/zimit/survivorlibrary.com_en_all_2024-09.zim HTTP/1.1" 302 279 "-" "Transmission/4.0.5"
download.kiwix.org 80.239.140.XX - - [24/Oct/2024:11:23:07 +0000] "GET /zim/zimit/survivorlibrary.com_en_all_2024-09.zim HTTP/1.1" 302 279 "-" "Transmission/4.0.5"
download.kiwix.org 80.239.140.XX - - [24/Oct/2024:11:23:07 +0000] "GET /zim/zimit/survivorlibrary.com_en_all_2024-09.zim HTTP/1.1" 302 257 "-" "Transmission/4.0.5"
download.kiwix.org 80.239.140.XX - - [24/Oct/2024:11:23:07 +0000] "GET /zim/zimit/survivorlibrary.com_en_all_2024-09.zim HTTP/1.1" 302 257 "-" "Transmission/4.0.5"
download.kiwix.org 80.239.140.XX - - [24/Oct/2024:11:23:07 +0000] "GET /zim/zimit/survivorlibrary.com_en_all_2024-09.zim HTTP/1.1" 302 279 "-" "Transmission/4.0.5"
download.kiwix.org 80.239.140.XX - - [24/Oct/2024:11:23:08 +0000] "GET /zim/zimit/survivorlibrary.com_en_all_2024-09.zim HTTP/1.1" 302 279 "-" "Transmission/4.0.5"
download.kiwix.org 80.239.140.XX - - [24/Oct/2024:11:23:08 +0000] "GET /zim/zimit/survivorlibrary.com_en_all_2024-09.zim HTTP/1.1" 302 303 "-" "Transmission/4.0.5"
download.kiwix.org 80.239.140.XX - - [24/Oct/2024:11:23:08 +0000] "GET /zim/zimit/survivorlibrary.com_en_all_2024-09.zim HTTP/1.1" 302 257 "-" "Transmission/4.0.5"
download.kiwix.org 80.239.140.XX - - [24/Oct/2024:11:23:08 +0000] "GET /zim/zimit/survivorlibrary.com_en_all_2024-09.zim HTTP/1.1" 302 257 "-" "Transmission/4.0.5"
download.kiwix.org 80.239.140.XX - - [24/Oct/2024:11:23:08 +0000] "GET /zim/zimit/survivorlibrary.com_en_all_2024-09.zim HTTP/1.1" 302 303 "-" "Transmission/4.0.5"
download.kiwix.org 80.239.140.XX - - [24/Oct/2024:11:23:08 +0000] "GET /zim/zimit/survivorlibrary.com_en_all_2024-09.zim HTTP/1.1" 302 257 "-" "Transmission/4.0.5"
download.kiwix.org 80.239.140.XX - - [24/Oct/2024:11:23:09 +0000] "GET /zim/zimit/survivorlibrary.com_en_all_2024-09.zim HTTP/1.1" 302 279 "-" "Transmission/4.0.5"
download.kiwix.org 80.239.140.XX - - [24/Oct/2024:11:23:09 +0000] "GET /zim/zimit/survivorlibrary.com_en_all_2024-09.zim HTTP/1.1" 302 303 "-" "Transmission/4.0.5"
download.kiwix.org 80.239.140.XX - - [24/Oct/2024:11:23:09 +0000] "GET /zim/zimit/survivorlibrary.com_en_all_2024-09.zim HTTP/1.1" 302 279 "-" "Transmission/4.0.5"
download.kiwix.org 80.239.140.XX - - [24/Oct/2024:11:23:09 +0000] "GET /zim/zimit/survivorlibrary.com_en_all_2024-09.zim HTTP/1.1" 302 257 "-" "Transmission/4.0.5"
download.kiwix.org 80.239.140.XX - - [24/Oct/2024:11:23:09 +0000] "GET /zim/zimit/survivorlibrary.com_en_all_2024-09.zim HTTP/1.1" 302 279 "-" "Transmission/4.0.5"
download.kiwix.org 80.239.140.XX - - [24/Oct/2024:11:23:10 +0000] "GET /zim/zimit/survivorlibrary.com_en_all_2024-09.zim HTTP/1.1" 302 303 "-" "Transmission/4.0.5"
download.kiwix.org 80.239.140.XX - - [24/Oct/2024:11:23:10 +0000] "GET /zim/zimit/survivorlibrary.com_en_all_2024-09.zim HTTP/1.1" 302 303 "-" "Transmission/4.0.5"
download.kiwix.org 80.239.140.XX - - [24/Oct/2024:11:23:10 +0000] "GET /zim/zimit/survivorlibrary.com_en_all_2024-09.zim HTTP/1.1" 302 303 "-" "Transmission/4.0.5"
download.kiwix.org 80.239.140.XX - - [24/Oct/2024:11:23:10 +0000] "GET /zim/zimit/survivorlibrary.com_en_all_2024-09.zim HTTP/1.1" 302 257 "-" "Transmission/4.0.5"
download.kiwix.org 80.239.140.XX - - [24/Oct/2024:11:23:10 +0000] "GET /zim/zimit/survivorlibrary.com_en_all_2024-09.zim HTTP/1.1" 302 303 "-" "Transmission/4.0.5"
  • A 302 response to a ZIM request (as in this case) is normal. MB sends a Location header pointing to the most appropriate mirror.
  • Transmission/4.0.5 is a BitTorrent client.
  • Our torrent does not include this URL. It contains a list of the mirrors links instead (see below)
  • Our magnet link (generated by mirrorbrain) includes only that URL in the Acceptable Source as field (fallback).
  • Testing with Transmission/4.0.6, I don't seem to trigger any request, while it reports downloading from the webseed (and the peers).
  • Transmission changelog does not indicate any fix in this regard
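
To quantify how often a single client hits the redirector, the log excerpt above can be grouped per second. The following is a minimal sketch, assuming the log format is exactly the one shown in the excerpt (vhost first, then the usual combined fields); the names `LOG_RE` and `requests_per_second` are illustrative and not part of any existing script.

```python
import re
from collections import Counter

# Matches the format seen in the excerpt: vhost, client IP, two "-" fields,
# [timestamp], "request line", status, size, "referer", "user agent".
LOG_RE = re.compile(
    r'^(?P<vhost>\S+) (?P<ip>\S+) \S+ \S+ \[(?P<ts>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<proto>[^"]+)" '
    r'(?P<status>\d{3}) (?P<size>\S+) "(?P<referer>[^"]*)" "(?P<agent>[^"]*)"$'
)


def requests_per_second(lines):
    """Count requests per (client IP, user agent, timestamp) triple."""
    counts = Counter()
    for line in lines:
        m = LOG_RE.match(line.strip())
        if m is None:
            continue  # skip lines that do not follow the expected format
        counts[(m["ip"], m["agent"], m["ts"])] += 1
    return counts


# Three lines copied from the excerpt above.
SAMPLE = [
    'download.kiwix.org 80.239.140.XX - - [24/Oct/2024:11:22:54 +0000] "GET /zim/zimit/survivorlibrary.com_en_all_2024-09.zim HTTP/1.1" 302 257 "-" "Transmission/4.0.5"',
    'download.kiwix.org 80.239.140.XX - - [24/Oct/2024:11:22:54 +0000] "GET /zim/zimit/survivorlibrary.com_en_all_2024-09.zim HTTP/1.1" 302 279 "-" "Transmission/4.0.5"',
    'download.kiwix.org 80.239.140.XX - - [24/Oct/2024:11:22:55 +0000] "GET /zim/zimit/survivorlibrary.com_en_all_2024-09.zim HTTP/1.1" 302 303 "-" "Transmission/4.0.5"',
]

counts = requests_per_second(SAMPLE)
for (ip, agent, ts), n in sorted(counts.items()):
    print(ip, agent, ts, n)
```

Grouping by timestamp keeps the sketch simple; a real analysis would probably bucket per minute or per hour instead.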

Image

{
   "announce": "http://tracker.openzim.org:6969/announce",
   "announce-list": [
      [
         "http://tracker.openzim.org:6969/announce",
         "udp://tracker.openzim.org:6969/announce"
      ]
   ],
   "comment": "survivorlibrary.com_en_all_2024-09.zim",
   "created by": "MirrorBrain/2.18.1",
   "creation date": 1726598029,
   "info": {
      "length": 249495052834,
      "md5sum": "c585c41542a6af8ec25e6c819a03766e",
      "name": "survivorlibrary.com_en_all_2024-09.zim",
      "piece length": 4194304,
      "pieces": "<hex>69 2F 54 B5 B9 [snip]</hex>",
      "sha1": "<hex>95 72 01 CF F3 F3 0B 0C 2D C2 26 9E 1E 2C 79 0E 80 C6 2D 67</hex>",
      "sha256": "<hex>9A 82 28 3F AD E1 07 CB B8 C4 85 92 50 32 D6 2C 41 C3 F4 74 F4 7C 97 21 E9 02 70 52 71 B2 A4 0E</hex>"
   },
   "nodes": [
      [
         "router.bittorrent.com",
         6881
      ],
      [
         "router.utorrent.com",
         6881
      ]
   ],
   "sources": [
      "https://mirror.download.kiwix.org/zim/zimit/survivorlibrary.com_en_all_2024-09.zim",
      "https://mirror-sites-in.mblibrary.info/mirror-sites/download.kiwix.org/zim/zimit/survivorlibrary.com_en_all_2024-09.zim",
      "https://mirrors.dotsrc.org/kiwix/zim/zimit/survivorlibrary.com_en_all_2024-09.zim"
   ],
   "url-list": [
      "https://mirror.download.kiwix.org/zim/zimit/survivorlibrary.com_en_all_2024-09.zim",
      "https://mirror-sites-in.mblibrary.info/mirror-sites/download.kiwix.org/zim/zimit/survivorlibrary.com_en_all_2024-09.zim",
      "https://mirrors.dotsrc.org/kiwix/zim/zimit/survivorlibrary.com_en_all_2024-09.zim"
   ]
}

I don't know what to do. @benoit74 @kelson42 ?

@rgaudin rgaudin added the question Further information is requested label Oct 24, 2024
@kelson42
Contributor

@rgaudin Tested with magnet link or .torrent file?

@rgaudin
Member Author

rgaudin commented Oct 24, 2024

@rgaudin Tested with magnet link or .torrent file?

Both. The .torrent file cannot exhibit this, since it does not contain this URL. With the magnet link, I could understand it if the client were not following the redirection.

@rgaudin
Member Author

rgaudin commented Oct 24, 2024

Screenshot is magnet

@benoit74
Collaborator

Not much idea, sorry ...

@rgaudin
Member Author

rgaudin commented Oct 24, 2024

I see other occurrences with Transmission 4.0.6, so this is definitely not a 4.0.5-specific bug. I couldn't reproduce it when downloading or seeding, though. Not sure how they get there.

@benoit74
Collaborator

I have some more food for thought.

Our magnet link (generated by mirrorbrain) includes only that URL in the Acceptable Source as field (fallback).

This assertion seems wrong: this URL is also used as xs and ws (in addition to as):

magnet:?&xt=urn:btih:2db99105d8a93d2d79be696b295ae7945f336935&xt=urn:md5:c585c41542a6af8ec25e6c819a03766e&xl=249495052834&dn=survivorlibrary.com_en_all_2024-09.zim&as=http%3A%2F%2Fdownload.kiwix.org%2Fzim%2Fzimit%2Fsurvivorlibrary.com_en_all_2024-09.zim&tr=http%3A%2F%2Ftracker.openzim.org%3A6969%2Fannounce&tr=udp%3A%2F%2Ftracker.openzim.org%3A6969%2Fannounce%0A&ws=http%3A%2F%2Fdownload.kiwix.org%2Fzim%2Fzimit%2Fsurvivorlibrary.com_en_all_2024-09.zim&xs=http%3A%2F%2Fdownload.kiwix.org%2Fzim%2Fzimit%2Fsurvivorlibrary.com_en_all_2024-09.zim.torrent

2 more questions:

  • is it really correct to have all 3 of these properties redirecting with a 302?
  • I suspect the scenario might be that, for some reason, the mirror returned by MB for this source IP is down (low probability) or simply not reachable from this source IP (for whatever reason, possibly a proxy or something else on their side). Transmission would then retry starting the download over and over, maybe even looping between ws and as.

Do we have a way to test which mirror is returned by MB for this IP?

@benoit74
Collaborator

I can confirm that I also cannot reproduce the problem with Transmission 4.0.6 on Mac, even when tweaking my /etc/hosts to redirect mirror.download.kiwix.org (the mirror MB assigns me) and tracker.openzim.org to an IP that does not respond on the required protocols.

@rgaudin
Member Author

rgaudin commented Oct 25, 2024

OK, I found the discrepancy: I was using curl to retrieve the magnet, and there is only as there. It's in libkiwix that we extend it.

MB original (but broken) version

magnet:?xt=urn:btih:2db99105d8a93d2d79be696b295ae7945f336935
&amp;xt=urn:md5:c585c41542a6af8ec25e6c819a03766e
&amp;xl=249495052834
&amp;dn=survivorlibrary.com_en_all_2024-09.zim
&amp;as=http://download.kiwix.org/zim/zimit/survivorlibrary.com_en_all_2024-09.zim
&amp;tr=http://tracker.openzim.org:6969/announce
&amp;tr=udp://tracker.openzim.org:6969/announce

libkiwix fixed version

magnet:?
&xt=urn:btih:2db99105d8a93d2d79be696b295ae7945f336935
&xt=urn:md5:c585c41542a6af8ec25e6c819a03766e
&xl=249495052834
&dn=survivorlibrary.com_en_all_2024-09.zim
&as=http%3A%2F%2Fdownload.kiwix.org%2Fzim%2Fzimit%2Fsurvivorlibrary.com_en_all_2024-09.zim
&tr=http%3A%2F%2Ftracker.openzim.org%3A6969%2Fannounce
&tr=udp%3A%2F%2Ftracker.openzim.org%3A6969%2Fannounce%0A
&ws=http%3A%2F%2Fdownload.kiwix.org%2Fzim%2Fzimit%2Fsurvivorlibrary.com_en_all_2024-09.zim
&xs=http%3A%2F%2Fdownload.kiwix.org%2Fzim%2Fzimit%2Fsurvivorlibrary.com_en_all_2024-09.zim.torrent

So the URL is also included as ws (but not as xs, which points to the .torrent file; you misread).
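
Which magnet fields carry the load-balancer URL can be checked mechanically. The sketch below decodes the libkiwix version of the magnet link shown above using only the standard library; the helper name `magnet_params` and the `LB_URL` constant are illustrative names introduced here.

```python
from urllib.parse import parse_qs

# The libkiwix magnet link from this thread, split across lines for readability.
MAGNET = (
    "magnet:?&xt=urn:btih:2db99105d8a93d2d79be696b295ae7945f336935"
    "&xt=urn:md5:c585c41542a6af8ec25e6c819a03766e"
    "&xl=249495052834"
    "&dn=survivorlibrary.com_en_all_2024-09.zim"
    "&as=http%3A%2F%2Fdownload.kiwix.org%2Fzim%2Fzimit%2Fsurvivorlibrary.com_en_all_2024-09.zim"
    "&tr=http%3A%2F%2Ftracker.openzim.org%3A6969%2Fannounce"
    "&tr=udp%3A%2F%2Ftracker.openzim.org%3A6969%2Fannounce%0A"
    "&ws=http%3A%2F%2Fdownload.kiwix.org%2Fzim%2Fzimit%2Fsurvivorlibrary.com_en_all_2024-09.zim"
    "&xs=http%3A%2F%2Fdownload.kiwix.org%2Fzim%2Fzimit%2Fsurvivorlibrary.com_en_all_2024-09.zim.torrent"
)

LB_URL = "http://download.kiwix.org/zim/zimit/survivorlibrary.com_en_all_2024-09.zim"


def magnet_params(magnet):
    """Return the magnet link's query parameters as a dict of decoded value lists."""
    _, _, query = magnet.partition("?")
    return parse_qs(query)


params = magnet_params(MAGNET)
# Fields whose value is exactly the load-balancer URL (xs has ".torrent" appended).
fields_with_lb_url = sorted(k for k, values in params.items() if LB_URL in values)
print(fields_with_lb_url)
print(params["xs"])
```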

This doesn't change the situation though:

  • for .torrent, we include direct links to mirrors and as expected, those are hit directly.
  • for magnet, we include only the LB URL so that's what's being used.

The problem being that under certain undetermined conditions, the URL is not followed and apparently retried over and over.

It is also important to note that I saw MB respond with 206 in the logs, meaning that the client sometimes attempts to download directly from it (and probably receives a 206 in MB's response, and not actual content).

@rgaudin
Member Author

rgaudin commented Nov 4, 2024

I was eventually able to reproduce this and understand what happens. See transmission/transmission#7227

So Transmission only stores the original webseed URL (which makes sense) and then makes all its range requests to it (following the redirection), so the mirror gets the range request and answers it properly.

So from the User's POV, it works as expected.

The problem is that we (our mirrorbrain) stay in the middle of this conversation, and it pollutes our logs (because it's a lot of requests, and the number scales linearly with the file size).
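
To give a rough sense of that scale, here is a back-of-the-envelope sketch. It assumes, purely for illustration, that a client issues one range request per torrent piece; the actual chunking used by Transmission is not established in this thread. The file length and piece length are the values from the torrent metadata above, and `range_request_count` is a hypothetical helper.

```python
FILE_LENGTH = 249_495_052_834  # "length" from the torrent metadata
PIECE_LENGTH = 4_194_304       # "piece length" from the torrent metadata


def range_request_count(file_length, chunk_size):
    """Number of chunk-sized range requests needed to cover the file (ceiling division)."""
    return -(-file_length // chunk_size)


# ASSUMPTION: one request per piece; the real client may use a different chunk size.
requests_for_full_download = range_request_count(FILE_LENGTH, PIECE_LENGTH)
print(requests_for_full_download)
```

Under that assumption, every one of those requests would pass through MB as a separate 302 before reaching the mirror.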

Main issue with this is that we count 302 requests to MB as downloads on stats.kiwix.org and we cannot distinguish those range requests from a single download-to-mirror request.

Unless Transmission agrees that the current behavior is not the proper one, we should include a flag in the logs if there is a Range request header and exclude those requests from matomo.
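
One possible shape for that flag, as a sketch only: Apache's mod_log_config can log a request header with `%{Range}i`, so the header could be appended as a final quoted field. The extended log format, the sample lines (including the `bytes=0-4194303` value), and the `counts_as_download` helper below are all assumptions made for illustration; they do not describe the current configuration or the import-log script.

```python
import re

# ASSUMED format: the existing fields followed by one extra quoted field holding
# the Range request header ("-" when the header is absent).
RANGE_LOG_RE = re.compile(
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)" "(?P<range>[^"]*)"$'
)


def counts_as_download(line):
    """True for a 302 redirect that carries no Range header, False otherwise."""
    m = RANGE_LOG_RE.search(line.strip())
    if m is None:
        return False  # unknown format: do not count it
    return m["status"] == "302" and m["range"] in ("", "-")


# Hypothetical lines in the assumed extended format.
plain = ('download.kiwix.org 80.239.140.XX - - [24/Oct/2024:11:22:54 +0000] '
         '"GET /zim/zimit/survivorlibrary.com_en_all_2024-09.zim HTTP/1.1" '
         '302 257 "-" "Transmission/4.0.5" "-"')
ranged = ('download.kiwix.org 80.239.140.XX - - [24/Oct/2024:11:22:54 +0000] '
          '"GET /zim/zimit/survivorlibrary.com_en_all_2024-09.zim HTTP/1.1" '
          '302 303 "-" "Transmission/4.0.5" "bytes=0-4194303"')

print(counts_as_download(plain), counts_as_download(ranged))
```

Filtering before the import would keep range requests out of matomo entirely, at the cost of also hiding legitimate resumed downloads that start with a Range header.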

@rgaudin rgaudin self-assigned this Nov 4, 2024
@rgaudin
Member Author

rgaudin commented Nov 11, 2024

After some investigation, it appears that the current behavior is the following:

  • we specifically mark downloads made via the period-less permalinks as 301 in the apache config.
  • those permalinks are not used anymore: neither library.kiwix.org nor any reader uses them, and we don't use the Wiki. Those who know them (integrators) might still use them, though.
  • regular (mirrorbrain) downloads generate 302 records in the logs.
  • those are marked as redirects (not downloads) by the import-log script but not marked for exclusion. They are thus sent to matomo.
  • in matomo, those are visible as exits (in behavior).

Image

  • in matomo, those are marked as Page Views but not as Unique Page Views

Image

@kelson42 WDYT ?
