Indicate duplicate string #953

Merged
merged 5 commits into from
Mar 8, 2024

Conversation

ooprathamm
Contributor

Closes #911

#duplicate tag + Mute

  • Sample output
image


google-cla bot commented Feb 29, 2024

Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

View this failed invocation of the CLA check for more information.

For the most up to date status, view the checks section at the bottom of the pull request.

@williballenthin
Collaborator

This is a great start. I wonder whether we should leave the first instance of a string untagged and tag only the second and later occurrences as duplicates. Otherwise every copy of an important but duplicated string would be muted, including the first.

Do you think this is feasible @ooprathamm?
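A minimal sketch of the "second and later occurrences only" idea, not the code that was merged. The function name tag_repeats and the list-of-tag-sets return shape are hypothetical; only the "#duplicate" tag name comes from this PR.

```python
from typing import List, Set


def tag_repeats(strings: List[str]) -> List[Set[str]]:
    """Return one tag set per input string, in input order.

    Hypothetical helper: the first occurrence of a string stays untagged,
    every later occurrence gets the "#duplicate" tag.
    """
    seen: Set[str] = set()
    tags: List[Set[str]] = []
    for s in strings:
        if s in seen:
            # Second or later occurrence: mark it so it can be muted.
            tags.append({"#duplicate"})
        else:
            # First occurrence: remember it and leave it untagged.
            seen.add(s)
            tags.append(set())
    return tags
```

With this shape, an important string that appears several times is still shown once at full visibility, and only its repeats are muted.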

@ooprathamm
Contributor Author

@williballenthin Looking back at it: if we have the count of occurrences, wouldn't showing something like << duplicate{count} be better than the #duplicate tag?

  • Sample Output
image

Collaborator

@mr-tz mr-tz left a comment

I think this is good as is. How would you otherwise show the << data? In a new column?

floss/qs/main.py Outdated
@@ -651,13 +651,21 @@ def tag_strings(self, taggers: Sequence[Tagger]):
this can be overridden, if a subclass has more ways of tagging strings,
such as a PE file and code/reloc regions.
"""
string_counts = {}
Collaborator

can use defaultdict(int) here

Collaborator

or collections.Counter
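For reference, the three counting styles discussed in this thread can be sketched side by side. The function names are hypothetical and this is not the code in floss/qs/main.py; it only illustrates that a plain dict, defaultdict(int), and collections.Counter yield the same counts.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable


def count_plain(strings: Iterable[str]) -> Dict[str, int]:
    # Plain dict: the zero default has to be supplied by hand.
    counts: Dict[str, int] = {}
    for s in strings:
        counts[s] = counts.get(s, 0) + 1
    return counts


def count_defaultdict(strings: Iterable[str]) -> Dict[str, int]:
    # defaultdict(int): missing keys start at int() == 0.
    counts: Dict[str, int] = defaultdict(int)
    for s in strings:
        counts[s] += 1
    return counts


def count_counter(strings: Iterable[str]) -> Dict[str, int]:
    # Counter: counts all items of the iterable in one call.
    return Counter(strings)
```

Counter is the shortest option when all strings are available up front; defaultdict(int) fits better when counts are updated inside an existing loop.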

Contributor Author

@ooprathamm ooprathamm Mar 8, 2024

This does save the explicit int initialization. Should I open a new PR?

@ooprathamm
Contributor Author

I considered rendering the count on the console directly ahead of the strings, since adding a new column for duplicate strings would be excessive. However, appending the count to the strings disrupts other functionality. Hence, I suggest proceeding with the "#duplicate" tag.

@mr-tz mr-tz merged commit 24bf661 into mandiant:quantumstrand Mar 8, 2024
10 checks passed
@mr-tz
Collaborator

mr-tz commented Mar 8, 2024

thanks!

3 participants