YouTube Comments #60

EtorixDev · 2023-11-08T04:40:22Z

Hello, I noticed in #17 it's stated that getting comments is not part of the InnerTube API. I'm not sure if things have changed or if I am misunderstanding what counts as part of the InnerTube API, but by doing the following I have managed to get the comments:

1. Send a next request to https://www.youtube.com/youtubei/v1/next?key={key} with the specified video ID in the data.
2. Extract the continuation token. There's a default, a "Top" sort, and a "New" sort. I've only tried the default.
3. Send a second next request without specifying the video ID, but instead specifying the continuation in the data block.
4. This should return the first 20 or so comments in a very ugly nested way.
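
As a rough sketch of the two request bodies described in the steps above (not a definitive implementation): the helper name `build_next_payload` is hypothetical, the `context.client` layout is an assumption about what the /next endpoint expects, and the default client version is simply the one used in the scripts later in this thread. Nothing is sent over the network here; the function only builds the JSON body you would POST.

```python
import json

# Endpoint quoted in the steps above; the API key is passed as a query parameter.
NEXT_URL = "https://www.youtube.com/youtubei/v1/next"


def build_next_payload(
    *,
    video_id=None,
    continuation=None,
    client_name="WEB",
    client_version="2.20240105.01.00",
):
    """Build the JSON body for a /next request (hypothetical helper).

    Exactly one of video_id (first request) or continuation
    (second request) must be given.
    """
    if (video_id is None) == (continuation is None):
        raise ValueError("Provide exactly one of video_id or continuation")

    # Assumed request context layout: client name and version.
    payload = {
        "context": {
            "client": {
                "clientName": client_name,
                "clientVersion": client_version,
            }
        }
    }

    if video_id is not None:
        payload["videoId"] = video_id
    else:
        payload["continuation"] = continuation

    return payload


# First request: identify the video.
first_body = build_next_payload(video_id="BV1O7RR-VoA")
# Second request: only the continuation token (placeholder value here).
second_body = build_next_payload(continuation="<token from first response>")

print(json.dumps(first_body, indent=2))
```

The second body deliberately omits `videoId`, matching step 3: the continuation token alone identifies what to load.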

Something I've yet to figure out is how to get a highlighted comment to appear at the top of the json list. If you click on a YouTube comment's date, it will open a link with a "&lc=" param that has the comment's ID. And in the comments it will appear at the top as "Highlighted".
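
For reference, pulling the comment ID out of such a link needs only the standard library. This is a minimal sketch; the function name `extract_highlight_comment_id` and the comment ID in the example URL are made up for illustration.

```python
from urllib.parse import parse_qs, urlparse


def extract_highlight_comment_id(url):
    """Return the value of the "lc" query parameter, or None if absent."""
    query = parse_qs(urlparse(url).query)
    values = query.get("lc")
    return values[0] if values else None


# Hypothetical highlighted-comment link (the comment ID is a placeholder).
link = "https://www.youtube.com/watch?v=BV1O7RR-VoA&lc=UgxHYPOTHETICAL"
print(extract_highlight_comment_id(link))
```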

If I use the continuation token for the second request from the dev tools inspector when loading the highlighted comment link in the browser then the second next request properly returns the highlighted comment at the top of the json list.

However, if I use the continuation retrieved programmatically from the first next request, it always returns the comments without the highlighted comment at the top. So it seems the highlighted comment is tied to the continuation token, which appears to be generated outside the scope of the next endpoint, unless I've simply not found the correct way yet.


tombulled · 2024-01-06T13:28:36Z

Hi, apologies for the late reply. I'll take a look into this now.

tombulled · 2024-01-06T14:27:57Z

I've been able to reproduce the ability to list the first n comments (either "top" or "newest").

Here's the (admittedly lashed together) script I used:

from innertube import InnerTube

ENGAGEMENT_SECTION_COMMENTS = "engagement-panel-comments-section"
COMMENTS_TOP = "Top comments"
COMMENTS_NEWEST = "Newest first"


def parse_text(text):
    # Join the text "runs" of a formatted-text object into a plain string
    return "".join(run["text"] for run in text["runs"])


def extract_engagement_panels(next_data):
    # Index the engagement panels of a /next response by their target ID
    engagement_panels = {}
    raw_engagement_panels = next_data.get("engagementPanels", [])

    for raw_engagement_panel in raw_engagement_panels:
        engagement_panel = raw_engagement_panel.get(
            "engagementPanelSectionListRenderer", {}
        )
        target_id = engagement_panel.get("targetId")

        engagement_panels[target_id] = engagement_panel

    return engagement_panels


def parse_sort_filter_sub_menu(menu):
    # Index the sort menu items ("Top comments", "Newest first") by title
    menu_items = menu["sortFilterSubMenuRenderer"]["subMenuItems"]

    return {menu_item["title"]: menu_item for menu_item in menu_items}


def extract_comments(next_continuation_data):
    # The last continuation item is skipped (it is not a comment thread)
    return [
        continuation_item["commentThreadRenderer"]
        for continuation_item in next_continuation_data["onResponseReceivedEndpoints"][
            1
        ]["reloadContinuationItemsCommand"]["continuationItems"][:-1]
    ]


# YouTube Web Client
client = InnerTube("WEB", "2.20240105.01.00")

# ShortCircuit - Dell just DESTROYED the Surface Pro! - Dell XPS 13 2-in-1
video = client.next("BV1O7RR-VoA")

engagement_panels = extract_engagement_panels(video)
comments = engagement_panels[ENGAGEMENT_SECTION_COMMENTS]
comments_header = comments["header"]["engagementPanelTitleHeaderRenderer"]
comments_title = parse_text(comments_header["title"])
comments_context = parse_text(comments_header["contextualInfo"])
comments_menu_items = parse_sort_filter_sub_menu(comments_header["menu"])
comments_top = comments_menu_items[COMMENTS_TOP]
comments_top_continuation = comments_top["serviceEndpoint"]["continuationCommand"][
    "token"
]

print(f"{comments_title} ({comments_context})...")
print()

comments_continuation = client.next(continuation=comments_top_continuation)

comments = extract_comments(comments_continuation)

for comment in comments:
    comment_renderer = comment["comment"]["commentRenderer"]

    comment_author = comment_renderer["authorText"]["simpleText"]
    comment_content = parse_text(comment_renderer["contentText"])

    print(f"[{comment_author}]")
    print(comment_content)
    print()

$ python app.py
Comments (1.7K)...

[@ViXoZuDo]
I would 100% prefer the headphone jack over that camera...

[@ouilsen2]
As a Surface Pro user I have one observation...

...

(I'll add this to the examples/ directory in case it helps anyone else)

I'll have a fiddle with highlighting a comment now in case I can figure out what's going on there

tombulled · 2024-01-06T14:33:39Z

It looks like highlighting a comment sends off a request to the /next endpoint with some params and the videoId. I'll see if I can whip up a quick PoC for this now

tombulled · 2024-01-06T15:21:48Z

I think I've figured out what was happening with highlighting a comment not working. The continuation tokens for "top" and "newest" you can extract from engagementPanels aren't influenced by the params passed to the /next endpoint, however the continuation token for the comment-item-section does change.

The below example ignores the engagementPanels entirely and instead uses the continuation token for the comments item section:

from innertube import InnerTube

# YouTube Web Client
CLIENT = InnerTube("WEB", "2.20240105.01.00")


def parse_text(text):
    return "".join(run["text"] for run in text["runs"])


def flatten(items):
    flat_items = {}

    for item in items:
        key = next(iter(item))
        val = item[key]

        flat_items.setdefault(key, []).append(val)

    return flat_items


def flatten_item_sections(item_sections):
    return {
        item_section["sectionIdentifier"]: item_section
        for item_section in item_sections
    }


def extract_comments(next_continuation_data):
    return [
        continuation_item["commentThreadRenderer"]
        for continuation_item in next_continuation_data["onResponseReceivedEndpoints"][
            1
        ]["reloadContinuationItemsCommand"]["continuationItems"][:-1]
    ]


def extract_comments_continuation_token(next_data):
    contents = flatten(
        next_data["contents"]["twoColumnWatchNextResults"]["results"]["results"][
            "contents"
        ]
    )
    item_sections = flatten_item_sections(contents["itemSectionRenderer"])
    comment_item_section_content = item_sections["comment-item-section"]["contents"][0]
    comments_continuation_token = comment_item_section_content[
        "continuationItemRenderer"
    ]["continuationEndpoint"]["continuationCommand"]["token"]

    return comments_continuation_token


def get_comments(video_id, params=None):
    video = CLIENT.next(video_id, params=params)

    continuation_token = extract_comments_continuation_token(video)

    comments_continuation = CLIENT.next(continuation=continuation_token)

    return extract_comments(comments_continuation)


def print_comment(comment):
    comment_renderer = comment["comment"]["commentRenderer"]

    comment_author = comment_renderer["authorText"]["simpleText"]
    comment_content = parse_text(comment_renderer["contentText"])

    print(f"[{comment_author}]")
    print(comment_content)
    print()


video_id = "BV1O7RR-VoA"

# Get comments for a given video
comments = get_comments(video_id)

# Select a comment to highlight (in this case the 3rd one)
comment = comments[2]

# Print the comment we're going to highlight
print("### Highlighting Comment: ###")
print()
print_comment(comment)
print("---------------------")
print()

# Extract the 'params' to highlight this comment
params = comment["comment"]["commentRenderer"]["publishedTimeText"]["runs"][0][
    "navigationEndpoint"
]["watchEndpoint"]["params"]

# Get comments, but highlighting the selected comment
highlighted_comments = get_comments(video_id, params=params)

print("### Comments: ###")
print()

for comment in highlighted_comments:
    print_comment(comment)

$ python app.py
### Highlighting Comment: ###

[@alphacompton]
The built in mic on the 2-1 is exceptional and the camera is excellent from your video sample. Look like a better buy especially if it's cheaper than the Surface pro.

---------------------

### Comments: ###

[@alphacompton]
The built in mic on the 2-1 is exceptional and the camera is excellent from your video sample. Look like a better buy especially if it's cheaper than the Surface pro.

[@ouilsen2]
As a Surface Pro user I have one observation....

...

Hope that helps!

Please let me know if you have any further questions, or if this answers your query

Best, Tom

EtorixDev · 2024-01-06T21:29:57Z

Hi, thanks for the detailed reply.

The idea behind the highlighting was to store a reference (such as the comment ID) to it in a database and come back to it later. One such use case would be a system that checks for the existence of a membership badge on a user's message monthly. That's why it would have been ideal to have a way to programmatically jump straight to the comment in 1 request like in the browser (on the initial lookup, not just subsequent ones).

Unfortunately from your response it seems "highlighting" a comment internally is done with the comment's watchEndpoint params, so the initial request for the comment will require scraping them all until the target comment is found by checking for the comment ID, and then storing the params instead of the comment ID for future immediate lookup.

Would this work, or do you suspect the params of comments change often?
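
The lookup described above might be sketched roughly like this. It assumes each item is a `commentThreadRenderer` dict shaped like the ones in the scripts earlier in the thread, and that the renderer exposes the ID under a `commentId` key, which is an assumption not confirmed here. The function name `find_highlight_params` and the sample data are hypothetical.

```python
def find_highlight_params(comment_threads, target_comment_id):
    """Scan comment threads for a comment ID and return its watchEndpoint params.

    Returns None if the comment is not in the given threads.
    """
    for thread in comment_threads:
        renderer = thread["comment"]["commentRenderer"]

        # "commentId" is an assumed field name for the comment's ID.
        if renderer.get("commentId") != target_comment_id:
            continue

        # Same path as used in the highlighted-comment script above.
        return renderer["publishedTimeText"]["runs"][0]["navigationEndpoint"][
            "watchEndpoint"
        ]["params"]

    return None


def make_thread(comment_id, params):
    # Build a minimal fake thread, for illustration only.
    return {
        "comment": {
            "commentRenderer": {
                "commentId": comment_id,
                "publishedTimeText": {
                    "runs": [
                        {"navigationEndpoint": {"watchEndpoint": {"params": params}}}
                    ]
                },
            }
        }
    }


threads = [make_thread("id-1", "params-1"), make_thread("id-2", "params-2")]
print(find_highlight_params(threads, "id-2"))
```

In practice the list would come from paging through the comments, and the returned params would be what gets stored for later lookups.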

Thanks again.


tombulled · 2024-04-28T20:51:10Z

Hi @EtorixDev, apologies for the late turnaround on a reply to your last comment. I believe the params field contains base64-encoded protobuf data (potentially also URL-encoded). You should be able to decode the contents of the params value using a tool such as https://protobuf-decoder.netlify.app/. It is possible that the protobuf structure contains the comment ID, and that all other fields are static. If this is the case, you should be able to generate the correct params value knowing only the comment ID.
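
If you'd rather inspect the value locally, a minimal sketch along these lines may help. It assumes the params value really is URL-encoded, URL-safe base64 wrapping a protobuf message, which is unverified. The helper names are hypothetical, only top-level fields are listed (nested messages stay as raw bytes), and the demo payload is hand-built rather than a real params value.

```python
import base64
from urllib.parse import unquote


def decode_params(params):
    """Undo URL-encoding and URL-safe base64, returning raw bytes."""
    raw = unquote(params)
    padded = raw + "=" * (-len(raw) % 4)  # restore any stripped padding
    return base64.urlsafe_b64decode(padded)


def read_varint(data, pos):
    """Read a protobuf varint starting at pos; return (value, new_pos)."""
    result = 0
    shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def parse_top_level_fields(data):
    """List (field_number, wire_type, value) for each top-level field."""
    fields = []
    pos = 0
    while pos < len(data):
        tag, pos = read_varint(data, pos)
        field_number, wire_type = tag >> 3, tag & 0x07
        if wire_type == 0:  # varint
            value, pos = read_varint(data, pos)
        elif wire_type == 1:  # 64-bit
            value, pos = data[pos : pos + 8], pos + 8
        elif wire_type == 2:  # length-delimited (string, bytes, sub-message)
            length, pos = read_varint(data, pos)
            value, pos = data[pos : pos + length], pos + length
        elif wire_type == 5:  # 32-bit
            value, pos = data[pos : pos + 4], pos + 4
        else:
            raise ValueError(f"Unsupported wire type: {wire_type}")
        fields.append((field_number, wire_type, value))
    return fields


# Hand-built demo message: field 1 = b"abc", field 2 = 150.
demo_payload = bytes([0x0A, 0x03]) + b"abc" + bytes([0x10, 0x96, 0x01])
demo_params = base64.urlsafe_b64encode(demo_payload).decode().replace("=", "%3D")

for field in parse_top_level_fields(decode_params(demo_params)):
    print(field)
```

If one of the length-delimited fields turns out to hold the comment ID as plain text, that would support the idea of generating params from the ID alone.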

Unfortunately I went to test this using the examples/list-video-comments-highlighted.py example script I wrote a while back and it seems YouTube has changed their comments API around again. If I get some spare time I'll give the API another poke, however I hope this comment has at least given you a bit of a steer 🙂

tombulled added a commit that referenced this issue Jan 6, 2024

[#60] Add examples for listing comments for a video using /next endpoint

814fe0e

tombulled linked a pull request Jan 6, 2024 that will close this issue

✨ [#60] Add examples for listing comments for a video using /next endpoint #66

Merged

tombulled added a commit that referenced this issue Feb 25, 2024

[#60] Instruct MyPy to ignore examples/ directory

ae31085

tombulled closed this as completed in #66 Feb 25, 2024

tombulled added a commit that referenced this issue Feb 25, 2024

✨ [#60] Add examples for listing comments for a video using /next end…

3821e95

…point (#66)

tombulled reopened this Feb 25, 2024

tombulled closed this as completed Aug 12, 2024
