Indicate duplicate string #953
Conversation
Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA). View this failed invocation of the CLA check for more information. For the most up to date status, view the checks section at the bottom of the pull request.
This is a great start. I wonder if we should leave the first instance of the string untagged and tag only the second and later instances as duplicates. Otherwise important, but duplicated, strings would all get muted every time. Do you think this is feasible @ooprathamm?
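A minimal sketch of the "tag only the second and later occurrences" idea, assuming strings are plain Python values and tags are a set per string. The function name `tag_duplicates` and the tuple layout are hypothetical and are not taken from the FLOSS code base:

```python
from typing import List, Set, Tuple


def tag_duplicates(strings: List[str]) -> List[Tuple[str, Set[str]]]:
    """Pair each string with its tags; only repeats get "#duplicate"."""
    seen: Set[str] = set()
    tagged: List[Tuple[str, Set[str]]] = []
    for s in strings:
        tags: Set[str] = set()
        if s in seen:
            # Second or later occurrence: mark it so it can be muted.
            tags.add("#duplicate")
        else:
            # First occurrence stays untagged and therefore visible.
            seen.add(s)
        tagged.append((s, tags))
    return tagged
```

With this approach the first copy of an important string is still rendered normally, and only the repeats carry the tag.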
@williballenthin Looking back at it, if we have the count of occurrences, wouldn't showing something like << duplicate{count} be better than a #duplicate tag?
I think it is good like this. How would you otherwise show the << data? In a new column?
floss/qs/main.py
Outdated
@@ -651,13 +651,21 @@ def tag_strings(self, taggers: Sequence[Tagger]):
this can be overridden, if a subclass has more ways of tagging strings,
such as a PE file and code/reloc regions.
"""
string_counts = {}
can use defaultdict(int) here
or collections.Counter
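For comparison, a short sketch of the three counting options mentioned in this thread. Only the `string_counts = {}` line is visible in the diff, so the loop bodies and the sample strings here are assumptions for illustration:

```python
from collections import Counter, defaultdict

# Hypothetical sample input; not taken from the PR.
strings = ["kernel32.dll", "GetProcAddress", "kernel32.dll"]

# Plain dict: the default of 0 has to be supplied explicitly.
string_counts = {}
for s in strings:
    string_counts[s] = string_counts.get(s, 0) + 1

# defaultdict(int): missing keys start at 0 automatically.
dd_counts = defaultdict(int)
for s in strings:
    dd_counts[s] += 1

# Counter: counts a whole iterable in one call.
counter_counts = Counter(strings)
```

All three produce the same mapping; `defaultdict(int)` and `Counter` mainly remove the need to initialize each key by hand.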
This does save the explicit int initialization. Should I open a new PR?
I was considering rendering the count on the console directly ahead of the strings, since adding a new column for duplicate strings would be excessive. However, appending the count to the strings disrupts other functionality. Hence, I suggest proceeding with the "#duplicate" tag.
Co-authored-by: Moritz <[email protected]>
thanks!
Closes #911
#duplicate tag + Mute