linkify.tokenize breaks string as url and text on encountering org. #477

Open
PratyushGar opened this issue Apr 15, 2024 · 1 comment

I have a text: org.org_custom_connector
When I parse it with linkify.tokenize, it is split into a URL token and a text token:
url: "org.org" and text: "_custom_connector"

[
    {
        "t": "url",
        "v": "org.org",
        "tk": [
            {
                "t": "TLD",
                "v": "org",
                "s": 0,
                "e": 3
            },
            {
                "t": "DOT",
                "v": ".",
                "s": 3,
                "e": 4
            },
            {
                "t": "TLD",
                "v": "org",
                "s": 4,
                "e": 7
            }
        ]
    },
    {
        "t": "text",
        "v": "_custom_connector",
        "tk": [
            {
                "t": "UNDERSCORE",
                "v": "_",
                "s": 7,
                "e": 8
            },
            {
                "t": "DOMAIN",
                "v": "custom",
                "s": 8,
                "e": 14
            },
            {
                "t": "UNDERSCORE",
                "v": "_",
                "s": 14,
                "e": 15
            },
            {
                "t": "DOMAIN",
                "v": "connector",
                "s": 15,
                "e": 24
            }
        ]
    }
] 

Is there any way to override this behavior so that the string is not split into a URL and a text token?
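
One possible stopgap is to post-process the token list and undo the split. The sketch below is only a heuristic and makes several assumptions: the tokens are available as plain objects in the shape printed above (`t`, `v`, `tk`), `mergeSplitTokens` and `WORD_CONTINUATION` are hypothetical names that are not part of linkifyjs, and a text token that starts with a letter, digit, underscore, or hyphen directly after a url token is taken to mean the "URL" ended mid-word. In that case the pair is demoted to a single text token.

```javascript
// Hypothetical helper, not part of linkifyjs. It works on plain objects in the
// shape printed above: { t: type, v: value, tk: [sub-tokens] }.
const WORD_CONTINUATION = /^[A-Za-z0-9_-]/;

function mergeSplitTokens(tokens) {
  const result = [];
  for (const token of tokens) {
    const prev = result[result.length - 1];
    // A url token followed immediately by word characters was cut mid-word.
    const splitsWord =
      prev !== undefined &&
      prev.t === "url" &&
      token.t === "text" &&
      WORD_CONTINUATION.test(token.v);
    if (splitsWord) {
      // Demote the pair to one text token covering the whole word.
      result[result.length - 1] = {
        t: "text",
        v: prev.v + token.v,
        tk: [...prev.tk, ...token.tk],
      };
    } else {
      result.push(token);
    }
  }
  return result;
}

// Tokens from the report, with the sub-token arrays left empty for brevity.
const reported = [
  { t: "url", v: "org.org", tk: [] },
  { t: "text", v: "_custom_connector", tk: [] },
];
const merged = mergeSplitTokens(reported);
console.log(merged.map((tok) => `${tok.t}: ${tok.v}`));
```

For the reported input this yields a single text token with the value org.org_custom_connector. A url token followed by ordinary punctuation or whitespace is left untouched, so normal links in running text should be unaffected. The helper only demotes the pair to text; it does not try to recognize the whole word as one longer URL.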


artemik commented Nov 19, 2024

Confirmed, I see the same issue. For example, backend.eu-central1.internal:8080 gets parsed as backend.eu only. These kinds of URLs are common for cloud DNS names. It would be nice to have a fix or an option to handle these.

nfrasser self-assigned this on Dec 4, 2024