Add option to filter duplicate results and deprecate --remove-extensions #1436

Open
wants to merge 1 commit into
base: master

Conversation

shelld3v
Collaborator

@shelld3v shelld3v commented Nov 8, 2024

Description

Close #1293

@shelld3v shelld3v mentioned this pull request Nov 13, 2024
2 tasks
@mikhailevtikhov

Hi @shelld3v, sorry to bother you :c It works great! But there is a scenario, which I have run into in practice, where this logic misses false positives (FP). One case is an API: if the wordlist contains API paths, a request to a non-existent endpoint returns something like "{error: ... $uri ... not found}". Another is a WAF that blocks us after 1000 words out of 10000 and, for every subsequent request, returns a block page that reflects the $uri. Because each response contains the requested path, no two responses are identical.
To reproduce, run nginx with the following configuration:

server {
    listen       80;
    listen  [::]:80;
    server_name  localhost;

    location ^~ /admin {
        return 200 "You had been blocked, because u want to check ($uri)";
    }

    location / {
        return 200 $uri;
    }

    error_page   500 502 503 504  /50x.html;
    location = /50x.html {
        root   /usr/share/nginx/html;
    }
}

In this case, the result will be:

python3 dirsearch.py --url=http://exaple.com:80/ --extensions=json,txt,configz --threads=1 --timeout=4 --wordlist=/PATH/dirserach_dict.txt --filter-threshold=1

  _|. _ _  _  _  _ _|_    v0.4.3
 (_||| _) (/_(_|| (_| )

Extensions: json, txt, configz | HTTP method: GET | Threads: 1 | Wordlist size: 25

Target: http://exaple.com/

[16:08:19] Scanning:
[16:08:21] 200 -    54B - /admin
[16:08:21] 200 -    55B - /admin/
[16:08:22] 200 -    64B - /admin/something
[16:08:22] 200 -    59B - /admin/test

Task Completed

Do you think something can be done about this, or would that be redundant functionality?
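
To make the scenario concrete, here is a minimal Python sketch. It is not dirsearch's code: `body_for`, `exact_hash` and `normalized_hash` are hypothetical helpers, and MD5 is only an assumed stand-in for whatever hash the filter uses. It mimics the `/admin` location from the nginx configuration and shows that exact hashes differ for every path, while hashing after replacing the reflected path with a placeholder makes the responses collapse into one.

```python
import hashlib

# Same text as the nginx `location ^~ /admin` block, with $uri reflected.
BLOCK_TEMPLATE = "You had been blocked, because u want to check ({uri})"


def body_for(path: str) -> str:
    """Hypothetical stand-in for the server response to `path`."""
    return BLOCK_TEMPLATE.format(uri=path)


def exact_hash(body: str) -> str:
    # Hash of the raw body: any reflected path makes it unique.
    return hashlib.md5(body.encode()).hexdigest()


def normalized_hash(body: str, path: str) -> str:
    # Replace the reflected request path with a placeholder before hashing.
    return hashlib.md5(body.replace(path, "<PATH>").encode()).hexdigest()


paths = ["/admin", "/admin/", "/admin/something", "/admin/test"]
exact = {exact_hash(body_for(p)) for p in paths}
normalized = {normalized_hash(body_for(p), p) for p in paths}

print(len(exact), len(normalized))  # 4 1
```

The placeholder approach only covers responses that reflect the path verbatim; encoded or partially reflected paths would need something more.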

@shelld3v
Collaborator Author

Hi @mikhailevtikhov, thanks for your feedback! Yes, I'm already aware of this problem, but storing and comparing anything more than a hash is just too expensive, and I don't want people to complain about memory and performance. One idea is to store only a part of each response, and to find a cheap way to identify potential duplicates before performing a high-level comparison. However, I'm avoiding any big changes before the release of v0.4.5, so I won't work on it at the moment.
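
A rough sketch of how that idea could look, purely as an illustration and not a planned implementation: the class name `PartialDuplicateFilter`, the parameters `keep`, `slack` and `threshold`, and their default values are all assumptions. It stores only the body length and the first `keep` characters of each response, uses the length as a cheap pre-check, and runs `difflib.SequenceMatcher` only on the candidates that pass it.

```python
from difflib import SequenceMatcher


class PartialDuplicateFilter:
    """Hypothetical two-stage duplicate check on partial response bodies."""

    def __init__(self, keep: int = 48, slack: int = 16, threshold: float = 0.9):
        self.keep = keep            # characters of each body to store
        self.slack = slack          # max length difference for a candidate
        self.threshold = threshold  # similarity ratio treated as duplicate
        self._seen = []             # (body length, stored part) per unique body

    def is_duplicate(self, body: str) -> bool:
        part = body[: self.keep]
        for length, seen_part in self._seen:
            # Cheap pre-check: skip entries whose size differs too much.
            if abs(length - len(body)) > self.slack:
                continue
            # Costlier comparison, run only on the surviving candidates.
            if SequenceMatcher(None, seen_part, part).ratio() >= self.threshold:
                return True
        self._seen.append((len(body), part))
        return False


flt = PartialDuplicateFilter()
bodies = [
    f"You had been blocked, because u want to check ({p})"
    for p in ("/admin", "/admin/", "/admin/something", "/admin/test")
]
results = [flt.is_duplicate(b) for b in bodies]
print(results)  # [False, True, True, True]
```

Memory stays bounded by `keep` characters per unique response, but the loop over `_seen` is linear in the number of unique responses, so the cost concern raised above still applies to large scans.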

Successfully merging this pull request may close these issues.

Suggestions for a filter flag to improvie accuracy
2 participants