add content len field #1033

Merged
merged 1 commit into from
Oct 31, 2024


dogancanbakir
Member

Closes #1032

Member

@Mzack9999 Mzack9999 left a comment


Judging from the issue, if the goal is mainly to check whether the body is empty, I think it's better to inspect the body directly with jq, since the Content-Length header can be misleading or inconsistent. For example:

katana -u https://gmail.com -j | jq 'select(.response.body != "")'

What do you think?

@dogancanbakir
Member Author

Yes, but in this case we're calculating the length from the body. Mentioning @Sab0tag3d for the discussion.

@Sab0tag3d

@Mzack9999
My goal is mostly to understand the size of the content on each page.
It's really helpful for detecting anomalous content.

Besides, to use a tool like jq we would need to store the HTML bodies in the report (in my case, a JSON file).
I don't do that because of the space it takes, both in memory and on disk.
Also, the JSON is sometimes broken by the body and can't be parsed. I suspect this is caused by the maximum-response-size-to-read option, but I haven't investigated it yet.

Member

@Mzack9999 Mzack9999 left a comment


% go run . -u https://scanme.sh -fs fqdn -j | grep content_length

   __        __                
  / /_____ _/ /____ ____  ___ _
 /  '_/ _  / __/ _  / _ \/ _  /
/_/\_\\_,_/\__/\_,_/_//_/\_,_/                                                   

                projectdiscovery.io

[INF] Current katana version v1.1.0 (outdated)
[INF] Started standard crawling for => https://scanme.sh
{"timestamp":"2024-10-31T00:14:01.134533+01:00","request":{"method":"GET","endpoint":"https://scanme.sh","raw":"GET / HTTP/1.1\r\nHost: scanme.sh\r\nUser-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/113.0.0.0 Safari/537.36\r\nAccept-Encoding: gzip\r\n\r\n"},"response":{"status_code":200,"headers":{"content-type":"text/plain; charset=utf-8","date":"Wed, 30 Oct 2024 23:14:00 GMT","content-length":"2"},"body":"ok","content_length":2,"raw":"HTTP/1.1 200 OK\r\nContent-Length: 2\r\nContent-Type: text/plain; charset=utf-8\r\nDate: Wed, 30 Oct 2024 23:14:00 GMT\r\n\r\nok"}}
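As a sanity check on a record like the one above, the following Go sketch compares `content_length` against the decoded body size and, when present, the `content-length` header. The `checkRecord` helper is hypothetical and the input is a trimmed copy of the record. Note that it compares against the length of the JSON-decoded body string, which may differ from the raw byte count for bodies that are not valid UTF-8.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strconv"
)

// record decodes only the response fields needed for the comparison.
type record struct {
	Response struct {
		Headers       map[string]string `json:"headers"`
		Body          string            `json:"body"`
		ContentLength int               `json:"content_length"`
	} `json:"response"`
}

// checkRecord reports whether content_length equals the body size and
// whether it equals the content-length header (false if the header is absent).
func checkRecord(line []byte) (matchesBody, matchesHeader bool, err error) {
	var rec record
	if err = json.Unmarshal(line, &rec); err != nil {
		return false, false, err
	}
	matchesBody = rec.Response.ContentLength == len(rec.Response.Body)
	if h, ok := rec.Response.Headers["content-length"]; ok {
		n, convErr := strconv.Atoi(h)
		matchesHeader = convErr == nil && n == rec.Response.ContentLength
	}
	return matchesBody, matchesHeader, nil
}

func main() {
	// Trimmed copy of the record printed above.
	line := []byte(`{"response":{"status_code":200,"headers":{"content-length":"2"},"body":"ok","content_length":2}}`)

	matchesBody, matchesHeader, err := checkRecord(line)
	if err != nil {
		panic(err)
	}
	// prints "matches body: true matches header: true"
	fmt.Println("matches body:", matchesBody, "matches header:", matchesHeader)
}
```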

@ehsandeep ehsandeep merged commit 6de8746 into dev Oct 31, 2024
13 checks passed
@ehsandeep ehsandeep deleted the add_content_len_field branch October 31, 2024 11:53
Development

Successfully merging this pull request may close these issues.

Bug: Missing content_length in JSON output if server response lacks Content-Length header
4 participants