Segment downloads 403 after 30s, requiring frequent re-extraction #221

Open
fren-archive opened this issue Sep 28, 2024 · 8 comments

Comments

@fren-archive

Several others on Discord and I are seeing frequent 403s while downloading segments, though some users report that they do not see this behavior. Specifically, it happens 30s after each page extraction, so each instance of ytarchive has to extract the URLs 120 times per hour instead of the intended once. With --debug set, the output looks like this:

...
2024/09/25 03:45:33 DEBUG: audio3: HTTP Error for fragment 685: 403 Forbidden
2024/09/25 03:45:33 DEBUG: audio: Attempting to retrieve a new download URL
2024/09/25 03:45:33 DEBUG: audio2: HTTP Error for fragment 686: 403 Forbidden
2024/09/25 03:45:33 DEBUG: audio1: HTTP Error for fragment 687: 403 Forbidden
2024/09/25 03:45:34 DEBUG: Retrieving URLs from web DASH manifest
2024/09/25 03:45:34 DEBUG: Retrieving URLs from web adaptive formats
2024/09/25 03:46:03 DEBUG: video3: HTTP Error for fragment 770: 403 Forbidden
2024/09/25 03:46:03 DEBUG: video: Attempting to retrieve a new download URL
2024/09/25 03:46:04 DEBUG: Retrieving URLs from web DASH manifest
2024/09/25 03:46:04 DEBUG: Retrieving URLs from web adaptive formats
2024/09/25 03:46:33 DEBUG: video4: HTTP Error for fragment 1025: 403 Forbidden
2024/09/25 03:46:33 DEBUG: video: Attempting to retrieve a new download URL
2024/09/25 03:46:33 DEBUG: video3: HTTP Error for fragment 1024: 403 Forbidden
2024/09/25 03:46:34 DEBUG: Retrieving URLs from web DASH manifest
2024/09/25 03:46:34 DEBUG: Retrieving URLs from web adaptive formats
2024/09/25 03:47:03 DEBUG: video4: HTTP Error for fragment 1276: 403 Forbidden
2024/09/25 03:47:03 DEBUG: video: Attempting to retrieve a new download URL
2024/09/25 03:47:04 DEBUG: Retrieving URLs from web DASH manifest
2024/09/25 03:47:04 DEBUG: Retrieving URLs from web adaptive formats
...
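
A quick way to check the 30s figure against a saved --debug log is to measure the gap between consecutive "Attempting to retrieve a new download URL" lines. This is only a sketch in Python; refresh_intervals and extractions_per_hour are hypothetical helper names, and the sample is copied from the log above.

```python
import re
from datetime import datetime

# Matches the ytarchive debug line that marks a URL re-extraction.
REFRESH_RE = re.compile(
    r"^(\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}) DEBUG: \w+: "
    r"Attempting to retrieve a new download URL$"
)

def refresh_intervals(log_text):
    """Seconds between consecutive re-extraction attempts in a debug log."""
    times = []
    for line in log_text.splitlines():
        match = REFRESH_RE.match(line.strip())
        if match:
            times.append(datetime.strptime(match.group(1), "%Y/%m/%d %H:%M:%S"))
    return [(later - earlier).total_seconds() for earlier, later in zip(times, times[1:])]

def extractions_per_hour(interval_seconds):
    """How many re-extractions one instance performs per hour."""
    return 3600 / interval_seconds

SAMPLE = """\
2024/09/25 03:45:33 DEBUG: audio3: HTTP Error for fragment 685: 403 Forbidden
2024/09/25 03:45:33 DEBUG: audio: Attempting to retrieve a new download URL
2024/09/25 03:46:03 DEBUG: video3: HTTP Error for fragment 770: 403 Forbidden
2024/09/25 03:46:03 DEBUG: video: Attempting to retrieve a new download URL
2024/09/25 03:46:33 DEBUG: video: Attempting to retrieve a new download URL
2024/09/25 03:47:03 DEBUG: video: Attempting to retrieve a new download URL
"""

print(refresh_intervals(SAMPLE))   # [30.0, 30.0, 30.0]
print(extractions_per_hour(30))    # 120.0
```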

With just 3-5 instances I was able to trigger bot detection measures, which caused all of the recordings to fail nearly simultaneously. There are other effects as well: content that is removed or set to members-only now fails almost immediately instead of being able to continue for 6h. Passing cookies did not change the behavior.

As far as I know, this is tied to the android client and potoken. When potoken is enabled (which it is on the android client in most cases), URLs fail after 30s unless they carry the proper value of pot. This is similar to the nsig calculation but substantially worse, because it relies on fingerprinting methods rather than a simple JavaScript check and probably cannot reasonably be emulated outside a browser.

I have the same issue with yt-dlp if I try to use the android or web clients, but the ios and web_creator clients are not (yet) affected. web_creator also provides DASH manifest URLs that do not need nsig calculation, so that seems like the easiest path to avoid the issue. Alternatively, an option for the user to offload URL extraction to an external script (presumably yt-dlp) might be more robust long-term.
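
To illustrate the "external script" idea, here is a rough Python sketch that shells out to yt-dlp for the URLs. It assumes yt-dlp is on PATH and that the chosen client still works at the time of use; build_ytdlp_command and extract_urls are hypothetical names, and ytarchive has no such hook as far as this thread shows.

```python
import subprocess

def build_ytdlp_command(url, client="web_creator", fmt=None):
    """Build a yt-dlp invocation that prints media URLs instead of downloading."""
    cmd = [
        "yt-dlp",
        "--get-url",                       # print the resolved URL(s) only
        "--extractor-args", f"youtube:player_client={client}",
    ]
    if fmt:
        cmd += ["-f", fmt]                 # optional format selector
    cmd.append(url)
    return cmd

def extract_urls(url, client="web_creator", fmt=None, timeout=60):
    """Run yt-dlp and return the printed URLs, one per non-empty output line."""
    result = subprocess.run(
        build_ytdlp_command(url, client, fmt),
        capture_output=True, text=True, check=True, timeout=timeout,
    )
    return [line for line in result.stdout.splitlines() if line.strip()]

# Only the command is shown here; extract_urls() needs yt-dlp and network access.
print(build_ytdlp_command("https://www.youtube.com/watch?v=VIDEO_ID"))
```

A downloader could call extract_urls() whenever it hits a 403, so client selection and token handling stay in yt-dlp.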

@Kethsar
Owner

Kethsar commented Sep 28, 2024

I knew this would start eventually. I don't have the drive the yt-dlp people do to fight it much either. I might try updating which clients are used for requests, but once they inevitably all get hit, I'll probably just stop bothering.
Honestly surprised this has worked as long as it has anyway.

@nosoop

nosoop commented Oct 31, 2024

It seems like an issue somewhere else: my application currently makes practically the same Android DASH manifest request, and to my knowledge that's still working as it has. (That's on the same machine that's seeing the issue with ytarchive, though that happens inconsistently. I was using 0.4.0, so yeah, it's just outdated; I had to backport the relevant patches for interop.) This doesn't seem like a difficult fix, so I'll try to investigate.

Might be worth noting that I had to add some retry handling since I'll occasionally get failures in the Android player response with a missing streamingData. I haven't bothered inspecting what YouTube is sending back in those situations.
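
For anyone wanting similar retry handling, a minimal Python sketch follows. It is not the code from my application or from ytarchive; fetch_player_response is a hypothetical name, and fetch stands in for whatever function performs the player request and returns the parsed JSON.

```python
import time

def fetch_player_response(fetch, attempts=3, delay=1.0, sleep=time.sleep):
    """Call fetch() until the response contains streamingData or attempts run out."""
    for attempt in range(1, attempts + 1):
        response = fetch()
        if isinstance(response, dict) and response.get("streamingData"):
            return response
        if attempt < attempts:
            sleep(delay * attempt)  # simple linear backoff between tries
    raise RuntimeError(f"no streamingData after {attempts} attempts")

# Demo with a fake fetcher that fails twice before succeeding.
_fake_responses = iter([{}, {"playabilityStatus": {}}, {"streamingData": {"formats": []}}])
demo = fetch_player_response(lambda: next(_fake_responses), sleep=lambda seconds: None)
print(demo)  # {'streamingData': {'formats': []}}
```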

@nosoop

nosoop commented Nov 1, 2024

@fren-archive Just to confirm, are you using 0.5.0? That's the first tagged release with d6a4be6 and 1671114; anything prior to those will only pull the web formats.

I'm not sure how ytarchive handles streams that require authentication since I rarely ever use that. If it only pulls the web format then you would need a valid POToken for sure, but beyond that it should work as it did before (i.e. failure on next player fetch if the cookies were rotated but should still continue using the valid manifest until it expires).

@fren-archive
Author

fren-archive commented Nov 19, 2024

I updated to 0.5.0 a couple days after posting this and added cookies and the issue went away for a while (forgot to update here). However, as of today it is happening even on the most recent version with valid cookies.

@Kethsar
Owner

Kethsar commented Nov 20, 2024

I've added support for the po token in commit 99a510d.
It requires grabbing the token as outlined at https://github.com/yt-dlp/yt-dlp/wiki/Extractors#po-token-guide

It's been working for me. I only just started getting hit with the 30 second 403s around 24 hours ago, so finally had reason to implement it (and a way to verify since without it affecting me I'd have no idea if it was working).

@nosoop

nosoop commented Nov 24, 2024

Yeah; the Android client that was used no longer returns streaming data. Not sure which clients are currently working if any.

Can confirm that change works with a token and cookies on a fresh profile.

In place of cookies for logged-out users, you can pass the token's corresponding visitorData in the client context of the player request, here. That would allow using the values from an iv-org/youtube-trusted-session-generator instance on the LAN.

Reference.
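
As a rough illustration of where visitorData would go, here is a Python sketch of the request body. The layout mirrors the WEB player request that ytarchive's --trace output prints later in this thread; build_player_request is a hypothetical name, and the token and visitorData values are placeholders.

```python
import json

def build_player_request(video_id, po_token, visitor_data=None):
    """Build a player request body with optional visitorData in the client context."""
    client = {
        "clientName": "WEB",
        "clientVersion": "2.20241219.01.01",
        "hl": "en",
    }
    if visitor_data is not None:
        # Must be the visitorData that the po token was generated for.
        client["visitorData"] = visitor_data
    return {
        "context": {"client": client},
        "videoId": video_id,
        "playbackContext": {
            "contentPlaybackContext": {"html5Preference": "HTML5_PREF_WANTS"}
        },
        "serviceIntegrityDimensions": {"poToken": po_token},
    }

payload = build_player_request("rgNcb8PTvgc", "EXAMPLE_PO_TOKEN", "EXAMPLE_VISITOR_DATA")
print(json.dumps(payload, indent=2))
```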

@Patrosi73

Patrosi73 commented Dec 28, 2024

I now get 403s instantly, even after extracting the poToken and the associated cookies from the same browser session.

ytarchive --cookies cookies.firefox-private.txt --potoken MpQBsn7X63JE0HmyRoJ5Zcf_dWZbLwxUYTtf5pBVMsizOp1j2PDKU-3FoHtVf_uI0XELONZAnOgMTy0BXX0PN0OPXnvjW1QbxFC-uMMW_R31AfoqfTAjeGQ0Neku2ztEEfSFngHBCjF_pBbeJ2yVypbhsJ5CYBR0OGW9esiJbIY8kAF4gzgdhUjhDI9FE1spMAp4UMGFxQ== --debug --trace https://www.youtube.com/watch?v=rgNcb8PTvgc best
ytarchive 0.5.0
2024/12/28 13:53:48 INFO: Loaded cookie file cookies.firefox-private.txt
2024/12/28 13:53:48 Channel: IShowSpeed
2024/12/28 13:53:48 Video Title: PLAYING FORTNITE UNTIL WE WIN pt 2⛏️ ft. Kai Cenat (RANKED)
2024/12/28 13:53:48 TRACE: &{POST https://www.youtube.com/youtubei/v1/player?innertube_key=AIzaSyAO_FJ2SlqU8Q4STEHLGCilw_Y9_11qcW8 HTTP/1.1 1 1 map[Content-Type:[application/json] Origin:[https://www.youtube.com] X-Youtube-Client-Name:[1] X-Youtube-Client-Version:[2.20241219.01.01]] {{
        'context': {
                'client': {
                        'clientName': 'WEB',
                        'clientVersion': '2.20241219.01.01',
                        'hl': 'en'
                }
        },
        'videoId': 'rgNcb8PTvgc',
        'playbackContext': {
                'contentPlaybackContext': {
                        'html5Preference': 'HTML5_PREF_WANTS'
                }
        },
        'serviceIntegrityDimensions': {
                'poToken': 'MpQBsn7X63JE0HmyRoJ5Zcf_dWZbLwxUYTtf5pBVMsizOp1j2PDKU-3FoHtVf_uI0XELONZAnOgMTy0BXX0PN0OPXnvjW1QbxFC-uMMW_R31AfoqfTAjeGQ0Neku2ztEEfSFngHBCjF_pBbeJ2yVypbhsJ5CYBR0OGW9esiJbIY8kAF4gzgdhUjhDI9FE1spMAp4UMGFxQ=='
        }
}
        } 0x10919e0 503 [] false www.youtube.com map[] map[] <nil> map[]   <nil> <nil> <nil> {{}} <nil> [] map[]}
2024/12/28 13:53:48 DEBUG: Retrieving URLs from Web API DASH manifest
2024/12/28 13:53:49 TRACE: Setting itag 139 from Web API DASH manifest
2024/12/28 13:53:49 TRACE: Setting itag 140 from Web API DASH manifest
2024/12/28 13:53:49 TRACE: Setting itag 133 from Web API DASH manifest
2024/12/28 13:53:49 TRACE: Setting itag 134 from Web API DASH manifest
2024/12/28 13:53:49 TRACE: Setting itag 135 from Web API DASH manifest
2024/12/28 13:53:49 TRACE: Setting itag 160 from Web API DASH manifest
2024/12/28 13:53:49 TRACE: Setting itag 298 from Web API DASH manifest
2024/12/28 13:53:49 TRACE: Setting itag 299 from Web API DASH manifest
2024/12/28 13:53:49 DEBUG: Retrieving URLs from Web API adaptive formats
2024/12/28 13:53:49 DEBUG: Retrieving URLs from web DASH manifest
2024/12/28 13:53:50 TRACE: Setting itag 244 from web adaptive formats
2024/12/28 13:53:50 TRACE: Setting itag 278 from web adaptive formats
2024/12/28 13:53:50 TRACE: Setting itag 302 from web adaptive formats
2024/12/28 13:53:50 TRACE: Setting itag 303 from web adaptive formats
2024/12/28 13:53:50 TRACE: Setting itag 243 from web adaptive formats
2024/12/28 13:53:50 TRACE: Setting itag 242 from web adaptive formats
2024/12/28 13:53:50 DEBUG: Retrieving URLs from web adaptive formats
2024/12/28 13:53:50 Selected quality: 1080p60 (h264)
2024/12/28 13:53:50 Stream started at time 2024-12-28T08:42:09+00:00
2024/12/28 13:53:50 INFO: Starting download to H:\User\Downloads\ytarchive\rgNcb8PTvgc__3778649676\PLAYING FORTNITE UNTIL WE WIN pt 2⛏️ ft. Kai Cenat (RANKED)-rgNcb8PTvgc.f140.ts
2024/12/28 13:53:50 INFO: Starting download to H:\User\Downloads\ytarchive\rgNcb8PTvgc__3778649676\PLAYING FORTNITE UNTIL WE WIN pt 2⛏️ ft. Kai Cenat (RANKED)-rgNcb8PTvgc.f299.ts
2024/12/28 13:53:50 DEBUG: audio1: HTTP Error for fragment 0: 403 Forbidden
2024/12/28 13:53:50 DEBUG: audio: Attempting to retrieve a new download URL
2024/12/28 13:53:50 DEBUG: video1: HTTP Error for fragment 0: 403 Forbidden
2024/12/28 13:53:50 DEBUG: video: Attempting to retrieve a new download URL
2024/12/28 13:53:51 DEBUG: audio1: HTTP Error for fragment 0: 403 Forbidden
2024/12/28 13:53:51 DEBUG: audio: Attempting to retrieve a new download URL
2024/12/28 13:53:51 DEBUG: video1: HTTP Error for fragment 0: 403 Forbidden
2024/12/28 13:53:51 DEBUG: video: Attempting to retrieve a new download URL
2024/12/28 13:53:52 DEBUG: audio1: HTTP Error for fragment 0: 403 Forbidden
2024/12/28 13:53:52 DEBUG: audio: Attempting to retrieve a new download URL
2024/12/28 13:53:52 DEBUG: video1: HTTP Error for fragment 0: 403 Forbidden
2024/12/28 13:53:52 DEBUG: video: Attempting to retrieve a new download URL
2024/12/28 13:53:53 DEBUG: video1: HTTP Error for fragment 0: 403 Forbidden
2024/12/28 13:53:53 DEBUG: video: Attempting to retrieve a new download URL
2024/12/28 13:53:53 DEBUG: audio1: HTTP Error for fragment 0: 403 Forbidden
2024/12/28 13:53:53 DEBUG: audio: Attempting to retrieve a new download URL

2024/12/28 13:53:54 WARNING: User Interrupt, Stopping download...
2024/12/28 13:53:54 DEBUG: video1: exiting
2024/12/28 13:53:54 DEBUG: audio1: exiting
2024/12/28 13:53:54 DEBUG: video-download thread closing
2024/12/28 13:53:54 DEBUG: audio-download thread closing


Download stopped prematurely. Would you like to merge the currently downloaded data? [y/N]:
Would you like to save any created files? [y/N]:
Exiting...

I've tried this multiple times, even extracting the cookies and the poToken in two different browsers. The browser was playing the stream just fine the whole time. I extracted the poToken from the YouTube embedded player; I also tried getting it from the main YouTube page, no dice. The browsers are logged out of YouTube.

I also tried yt-dlp for this, and it doesn't seem to work either; I get 403s with it as well.

Maybe passing through visitor_data is required now?

@Patrosi73

Magically, yt-dlp started working on, like, the 10th attempt? That was probably my fault somehow, though I have no idea what I had done wrong. I then tried ytarchive again and managed to make it work as well: you have to add "web+" before the actual poToken. However, I am once again getting 403s every 30 seconds:

ytarchive --cookies cookies.firefox-private.txt --potoken web+MpQBo02xSrMuXXyjdSoNIvC7Xao25rHefLkfG8VFj63_1F7CAhi0yWdwtq8i7FCV59DxnE1R-iMgwFQRPcGRuTuXkl0LnqpBgxmmefWlTbu-KGNYCHyecNdLf57aEvR1hfMUMeoaSgyXLRbRX1taKW3mDQeMqBDERYGPbnwM5Jf51GlAScRg9NVsmGkQu8RklxF9kSrMkQ== --debug --trace https://www.youtube.com/watch?v=rgNcb8PTvgc best
ytarchive 0.5.0
2024/12/28 16:25:36 INFO: Loaded cookie file cookies.firefox-private.txt
2024/12/28 16:25:37 Channel: IShowSpeed
2024/12/28 16:25:37 Video Title: PLAYING FORTNITE UNTIL WE WIN pt 2⛏️ ft. Kai Cenat (RANKED)
2024/12/28 16:25:37 TRACE: &{POST https://www.youtube.com/youtubei/v1/player?innertube_key=AIzaSyAO_FJ2SlqU8Q4STEHLGCilw_Y9_11qcW8 HTTP/1.1 1 1 map[Content-Type:[application/json] Origin:[https://www.youtube.com] X-Youtube-Client-Name:[1] X-Youtube-Client-Version:[2.20241219.01.01]] {{
        'context': {
                'client': {
                        'clientName': 'WEB',
                        'clientVersion': '2.20241219.01.01',
                        'hl': 'en'
                }
        },
        'videoId': 'rgNcb8PTvgc',
        'playbackContext': {
                'contentPlaybackContext': {
                        'html5Preference': 'HTML5_PREF_WANTS'
                }
        },
        'serviceIntegrityDimensions': {
                'poToken': 'web+MpQBo02xSrMuXXyjdSoNIvC7Xao25rHefLkfG8VFj63_1F7CAhi0yWdwtq8i7FCV59DxnE1R-iMgwFQRPcGRuTuXkl0LnqpBgxmmefWlTbu-KGNYCHyecNdLf57aEvR1hfMUMeoaSgyXLRbRX1taKW3mDQeMqBDERYGPbnwM5Jf51GlAScRg9NVsmGkQu8RklxF9kSrMkQ=='
        }
}
        } 0x10919e0 507 [] false www.youtube.com map[] map[] <nil> map[]   <nil> <nil> <nil> {{}} <nil> [] map[]}
2024/12/28 16:25:37 DEBUG: Error getting Web API player response: returned non-200 status code 400
2024/12/28 16:25:37 DEBUG: Retrieving URLs from web DASH manifest
2024/12/28 16:25:38 TRACE: Setting itag 135 from web adaptive formats
2024/12/28 16:25:38 TRACE: Setting itag 160 from web adaptive formats
2024/12/28 16:25:38 TRACE: Setting itag 298 from web adaptive formats
2024/12/28 16:25:38 TRACE: Setting itag 299 from web adaptive formats
2024/12/28 16:25:38 TRACE: Setting itag 139 from web adaptive formats
2024/12/28 16:25:38 TRACE: Setting itag 140 from web adaptive formats
2024/12/28 16:25:38 TRACE: Setting itag 133 from web adaptive formats
2024/12/28 16:25:38 TRACE: Setting itag 134 from web adaptive formats
2024/12/28 16:25:38 DEBUG: Retrieving URLs from web adaptive formats
2024/12/28 16:25:38 Selected quality: 1080p60 (h264)
2024/12/28 16:25:38 Stream started at time 2024-12-28T08:42:09+00:00
2024/12/28 16:25:38 INFO: Starting download to H:\User\Downloads\ytarchive\rgNcb8PTvgc__3849644684\PLAYING FORTNITE UNTIL WE WIN pt 2⛏️ ft. Kai Cenat (RANKED)-rgNcb8PTvgc.f140.ts
2024/12/28 16:25:38 INFO: Starting download to H:\User\Downloads\ytarchive\rgNcb8PTvgc__3849644684\PLAYING FORTNITE UNTIL WE WIN pt 2⛏️ ft. Kai Cenat (RANKED)-rgNcb8PTvgc.f299.ts
2024/12/28 16:26:06 DEBUG: video1: HTTP Error for fragment 249: 403 Forbidden
2024/12/28 16:26:06 DEBUG: video: Attempting to retrieve a new download URL
2024/12/28 16:26:07 TRACE: &{POST https://www.youtube.com/youtubei/v1/player?innertube_key=AIzaSyAO_FJ2SlqU8Q4STEHLGCilw_Y9_11qcW8 HTTP/1.1 1 1 map[Content-Type:[application/json] Origin:[https://www.youtube.com] X-Youtube-Client-Name:[1] X-Youtube-Client-Version:[2.20241219.01.01]] {{
        'context': {
                'client': {
                        'clientName': 'WEB',
                        'clientVersion': '2.20241219.01.01',
                        'hl': 'en'
                }
        },
        'videoId': 'rgNcb8PTvgc',
        'playbackContext': {
                'contentPlaybackContext': {
                        'html5Preference': 'HTML5_PREF_WANTS'
                }
        },
        'serviceIntegrityDimensions': {
                'poToken': 'web+MpQBo02xSrMuXXyjdSoNIvC7Xao25rHefLkfG8VFj63_1F7CAhi0yWdwtq8i7FCV59DxnE1R-iMgwFQRPcGRuTuXkl0LnqpBgxmmefWlTbu-KGNYCHyecNdLf57aEvR1hfMUMeoaSgyXLRbRX1taKW3mDQeMqBDERYGPbnwM5Jf51GlAScRg9NVsmGkQu8RklxF9kSrMkQ=='
        }
}
        } 0x10919e0 507 [] false www.youtube.com map[] map[] <nil> map[]   <nil> <nil> <nil> {{}} <nil> [] map[]}
2024/12/28 16:26:07 DEBUG: Error getting Web API player response: returned non-200 status code 400
2024/12/28 16:26:07 DEBUG: Retrieving URLs from web DASH manifest
2024/12/28 16:26:07 TRACE: Setting itag 134 from web adaptive formats
2024/12/28 16:26:07 TRACE: Setting itag 135 from web adaptive formats
2024/12/28 16:26:07 TRACE: Setting itag 160 from web adaptive formats
2024/12/28 16:26:07 TRACE: Setting itag 298 from web adaptive formats
2024/12/28 16:26:07 TRACE: Setting itag 299 from web adaptive formats
2024/12/28 16:26:07 TRACE: Setting itag 139 from web adaptive formats
2024/12/28 16:26:07 TRACE: Setting itag 140 from web adaptive formats
2024/12/28 16:26:07 TRACE: Setting itag 133 from web adaptive formats
2024/12/28 16:26:07 DEBUG: Retrieving URLs from web adaptive formats
2024/12/28 16:26:36 DEBUG: video1: HTTP Error for fragment 483: 403 Forbidden
2024/12/28 16:26:36 DEBUG: video: Attempting to retrieve a new download URL
2024/12/28 16:26:37 TRACE: &{POST https://www.youtube.com/youtubei/v1/player?innertube_key=AIzaSyAO_FJ2SlqU8Q4STEHLGCilw_Y9_11qcW8 HTTP/1.1 1 1 map[Content-Type:[application/json] Origin:[https://www.youtube.com] X-Youtube-Client-Name:[1] X-Youtube-Client-Version:[2.20241219.01.01]] {{
        'context': {
                'client': {
                        'clientName': 'WEB',
                        'clientVersion': '2.20241219.01.01',
                        'hl': 'en'
                }
        },
        'videoId': 'rgNcb8PTvgc',
        'playbackContext': {
                'contentPlaybackContext': {
                        'html5Preference': 'HTML5_PREF_WANTS'
                }
        },
        'serviceIntegrityDimensions': {
                'poToken': 'web+MpQBo02xSrMuXXyjdSoNIvC7Xao25rHefLkfG8VFj63_1F7CAhi0yWdwtq8i7FCV59DxnE1R-iMgwFQRPcGRuTuXkl0LnqpBgxmmefWlTbu-KGNYCHyecNdLf57aEvR1hfMUMeoaSgyXLRbRX1taKW3mDQeMqBDERYGPbnwM5Jf51GlAScRg9NVsmGkQu8RklxF9kSrMkQ=='
        }
}
        } 0x10919e0 507 [] false www.youtube.com map[] map[] <nil> map[]   <nil> <nil> <nil> {{}} <nil> [] map[]}
2024/12/28 16:26:37 DEBUG: Error getting Web API player response: returned non-200 status code 400
2024/12/28 16:26:37 DEBUG: Retrieving URLs from web DASH manifest
2024/12/28 16:26:37 TRACE: Setting itag 133 from web adaptive formats
2024/12/28 16:26:37 TRACE: Setting itag 134 from web adaptive formats
2024/12/28 16:26:37 TRACE: Setting itag 135 from web adaptive formats
2024/12/28 16:26:37 TRACE: Setting itag 160 from web adaptive formats
2024/12/28 16:26:37 TRACE: Setting itag 298 from web adaptive formats
2024/12/28 16:26:37 TRACE: Setting itag 299 from web adaptive formats
2024/12/28 16:26:37 TRACE: Setting itag 139 from web adaptive formats
2024/12/28 16:26:37 TRACE: Setting itag 140 from web adaptive formats
2024/12/28 16:26:37 DEBUG: Retrieving URLs from web adaptive formats
Video Fragments: 511; Audio Fragments: 521; Max Fragments: 24275; Max Sequence: 24275; Total Downloaded: 94.23MiB
2024/12/28 16:26:42 WARNING: User Interrupt, Stopping download...
2024/12/28 16:26:42 DEBUG: audio1: exiting
2024/12/28 16:26:42 DEBUG: video1: exiting
2024/12/28 16:26:42 DEBUG: audio-download thread closing
2024/12/28 16:26:42 DEBUG: video-download thread closing
Video Fragments: 511; Audio Fragments: 521; Max Fragments: 24275; Max Sequence: 24275; Total Downloaded: 94.23MiB

Download stopped prematurely. Would you like to merge the currently downloaded data? [y/N]:
Would you like to save any created files? [y/N]:
Exiting...

Maybe the "DEBUG: Error getting Web API player response: returned non-200 status code 400" line has something to do with it...
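
The trace above does show the "web+" prefix being sent verbatim inside serviceIntegrityDimensions.poToken. Whether that is what triggers the 400 is not confirmed anywhere in this thread. If it is, one option would be to split off the client prefix before building the request. A minimal Python sketch, assuming the yt-dlp-style CLIENT+TOKEN form; split_po_token and KNOWN_CLIENTS are hypothetical names, not part of ytarchive:

```python
# Illustrative list only; not taken from ytarchive or yt-dlp.
KNOWN_CLIENTS = {"web", "web_creator", "android", "ios"}

def split_po_token(value):
    """Return (client, token); client is None when no known prefix is present."""
    prefix, separator, rest = value.partition("+")
    if separator and prefix in KNOWN_CLIENTS:
        return prefix, rest
    return None, value

print(split_po_token("web+EXAMPLE_TOKEN=="))  # ('web', 'EXAMPLE_TOKEN==')
print(split_po_token("EXAMPLE_TOKEN=="))      # (None, 'EXAMPLE_TOKEN==')
```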

Also, the 403s do not seem to happen with yt-dlp when the poToken and cookies are passed through; it doesn't show an error even with the debug flags set.
