Add % language distribution by LOC #140

Open
lifeguard999 opened this issue Mar 28, 2024 · 3 comments

@lifeguard999

lifeguard999 commented Mar 28, 2024

Currently in summary output I see % distribution based on file count, can you also add one based on LOC?
It feels to me this new stat would give more insight into where the largest weight of the project lies.

Example:
Today when I run this:

pygount --format=summary https://github.com/roskakori/pygount.git/v1.5.1

I get:

┏━━━━━━━━━━━━━━━━━━┳━━━━━━━┳━━━━━━━┳━━━━━━┳━━━━━━┳━━━━━━━━━┳━━━━━━┓
┃ Language         ┃ Files ┃     % ┃ Code ┃    % ┃ Comment ┃    % ┃
┡━━━━━━━━━━━━━━━━━━╇━━━━━━━╇━━━━━━━╇━━━━━━╇━━━━━━╇━━━━━━━━━╇━━━━━━┩
│ Python           │    18 │  47.4 │ 2132 │ 63.6 │     418 │ 12.5 │
(...)
│ Bash             │     2 │   5.3 │   12 │ 80.0 │       3 │ 20.0 │
(...)
├──────────────────┼───────┼───────┼──────┼──────┼─────────┼──────┤
│ Sum              │    38 │ 100.0 │ 4024 │ 68.4 │     431 │  7.3 │
└──────────────────┴───────┴───────┴──────┴──────┴─────────┴──────┘

So Python makes up 47.4% of the files and Bash makes up 5.3% of the files.
I feel it'd be good to know % distribution based on LOC too, so Python would make 53% of the code and Bash only 0.3%.
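
To make the arithmetic explicit: the LOC-based share is each language's code line count divided by the total code line count. A minimal sketch, assuming a hypothetical `loc_percentages` helper (not part of pygount) and using only the counts shown in the table above:

```python
def loc_percentages(code_by_language, total_code=None):
    """Return each language's share of all code lines, in percent (one decimal).

    `loc_percentages` is a hypothetical helper for illustration only.
    """
    if total_code is None:
        total_code = sum(code_by_language.values())
    if total_code == 0:
        # Avoid dividing by zero for projects without any code lines.
        return {language: 0.0 for language in code_by_language}
    return {
        language: round(100 * code / total_code, 1)
        for language, code in code_by_language.items()
    }


# Counts from the summary above. The total of 4024 is passed explicitly
# because it includes the languages elided by "(...)".
shares = loc_percentages({"Python": 2132, "Bash": 12}, total_code=4024)
print(shares)
```

Passing the total separately keeps the shares correct even when only some of the rows are at hand.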

Something like:

┏━━━━━━━━━━━━━━━━━━┳━━━━━━━┳━━━━━━━┳━━━━━━┳━━━━━━━┳━━━━━━┳━━━━━━━━━┳━━━━━━┓
┃ Language         ┃ Files ┃     % ┃ Code ┃ LOC % ┃    % ┃ Comment ┃    % ┃
┡━━━━━━━━━━━━━━━━━━╇━━━━━━━╇━━━━━━━╇━━━━━━╇━━━━━━━╇━━━━━━╇━━━━━━━━━╇━━━━━━┩
│ Python           │    18 │  47.4 │ 2132 │  53.0 │ 63.6 │     418 │ 12.5 │
(...)
│ Bash             │     2 │   5.3 │   12 │   0.3 │ 80.0 │       3 │ 20.0 │
(...)
├──────────────────┼───────┼───────┼──────┼───────┼──────┼─────────┼──────┤
│ Sum              │    38 │ 100.0 │ 4024 │ 100.0 │ 68.4 │     431 │  7.3 │
└──────────────────┴───────┴───────┴──────┴───────┴──────┴─────────┴──────┘

Thoughts?

@roskakori
Owner

The --format=summary has the challenge that it should be terse enough to ideally fit within 80 characters per line. The thinking is that most terminals are configured to that width by default. Of course users can resize the terminal, but --format=summary is intended to provide an out-of-the-box experience of showing something useful and interesting.

In your example, the width is already 76. Going beyond 80 isn't that far and can happen with projects that have millions of SLOC or languages with longer names.

At first thought I see the following ways to approach this:

  1. Leave things as they are.
  2. Add the "LOC %" as suggested and increase the risk of the line length exceeding 80 characters.
  3. Add the "LOC %" and remove another column to make space for it. That would raise the question what to remove.
  4. Add a command line option where the user can specify which columns they want to see in the summary format. For that we could use the existing names of the JSON format but extend them to also have the percentage values available.
    Something like --summary-columns=language,fileCount,filePercentage,....
    However, if the user is unhappy with the default, they would need to be aware of this option and dig up the desired names from the documentation, which would break the out-of-the-box-experience.
    Also, adding options always increases the complexity of the application, so I am reluctant to do that unless there is a good reason.
  5. Add another format that includes more columns like --format=summary-wide that has 120 as target instead of 80. Again, the user would need to be aware of this format, which breaks the out-of-the-box experience.
  6. Make pygount check the available width of the terminal. If there is none, assume 80. Otherwise use as many columns as possible to include as much information as reasonably possible with yet to be decided variants and priorities of columns.
    So with a smaller terminal, --format=summary would include fewer columns, and with a wider one more.
    This would make it somewhat unpredictable for the user which columns they are actually going to see. Also, it's not obvious that making the terminal window wider can change this.
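
Option 6 could be sketched roughly as follows. This is only an illustration under stated assumptions, not how pygount works: the column names, their priority order, and the helper functions are hypothetical, and the content widths are read off the example tables in this issue. The one real API involved is `shutil.get_terminal_size()`, which returns the given fallback when no terminal size can be determined.

```python
import shutil

# Hypothetical column specs as (name, content width), listed in priority
# order, which this sketch also uses as the display order.
COLUMNS_BY_PRIORITY = [
    ("Language", 16),
    ("Files", 5),
    ("File %", 5),
    ("Code", 4),
    ("LOC %", 5),
    ("Code %", 4),
    ("Comment", 7),
    ("Comment %", 4),
]


def columns_that_fit(columns, available_width):
    """Return the names of the leading columns that fit into the given width."""
    chosen = []
    used = 1  # left-most border character
    for name, content_width in columns:
        cost = content_width + 3  # two padding spaces plus one border
        if used + cost > available_width:
            break  # stop at the first column that no longer fits
        chosen.append(name)
        used += cost
    return chosen


def summary_columns(columns=COLUMNS_BY_PRIORITY):
    """Pick columns for the current terminal, assuming 80 if the width is unknown."""
    width = shutil.get_terminal_size(fallback=(80, 24)).columns
    return columns_that_fit(columns, width)


print(summary_columns())
```

Stopping at the first column that does not fit, rather than skipping it and trying narrower ones, keeps the result a strict prefix of the priority list, which softens (but does not remove) the unpredictability concern raised above.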

I'm open to change this and, if necessary, even break backwards compatibility (there's a couple of other things to warrant a pygount 2.0).

But from the variants above, none strikes me as the "obviously best" solution.

@roskakori roskakori self-assigned this Mar 29, 2024
@roskakori roskakori moved this from 🆕 New to 📋 Backlog in Open source projects Apr 9, 2024
@lifeguard999
Author

I see, having a limit of 80 chars puts this into perspective.
Thanks for checking. And thanks for supporting this great project!

If I may share my opinion, taking your arguments into account, the best-sounding options to me are either 1) leave as is, it's useful already, or 4) allow custom fields (with the default setting being as is, for backward compatibility).

5) sounds OK too, although it feels like 4) is much more powerful.

2, 3 & 6 somehow don't feel right to me at first glance.

Thanks!

@lifeguard999
Author

Just thinking about it...
For my use case, the below is what I'm ultimately after, so option 4) would come in handy :-).

┏━━━━━━━━━━━━━━━━━━┳━━━━━━━┳━━━━━━┳━━━━━━━┓
┃ Language         ┃ Files ┃ Code ┃ LOC % ┃
┡━━━━━━━━━━━━━━━━━━╇━━━━━━━╇━━━━━━╇━━━━━━━┩
│ Python           │    18 │ 2132 │  53.0 │
(...)
│ Bash             │     2 │   12 │   0.3 │
(...)
├──────────────────┼───────┼──────┼───────┤
│ Sum              │    38 │ 4024 │ 100.0 │
└──────────────────┴───────┴──────┴───────┘
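
For illustration, column selection along the lines of option 4 might look like the sketch below. Everything here is an assumption: pygount has no such option today, `language`, `fileCount` and `filePercentage` are the names mentioned in the proposal, and `codeCount` and `locPercentage` are hypothetical placeholders rather than confirmed JSON field names.

```python
import argparse

# The first three names appear in the proposal; the last two are hypothetical.
KNOWN_COLUMNS = ("language", "fileCount", "filePercentage", "codeCount", "locPercentage")
DEFAULT_COLUMNS = ("language", "fileCount", "filePercentage", "codeCount")


def parse_columns(text):
    """Split a comma-separated column list and reject unknown names."""
    names = [name.strip() for name in text.split(",") if name.strip()]
    unknown = [name for name in names if name not in KNOWN_COLUMNS]
    if unknown:
        raise argparse.ArgumentTypeError("unknown summary columns: " + ", ".join(unknown))
    return names


def build_parser():
    """Build a parser with a hypothetical --summary-columns option."""
    parser = argparse.ArgumentParser(prog="summary-sketch")
    parser.add_argument(
        "--summary-columns",
        type=parse_columns,
        default=list(DEFAULT_COLUMNS),
        help="comma-separated columns to show, from: " + ", ".join(KNOWN_COLUMNS),
    )
    return parser


def select_columns(rows, columns):
    """Reduce each row dict to the requested columns, in the requested order."""
    return [[row[column] for column in columns] for row in rows]


TOTAL_CODE = 4024  # the "Sum" row of the example in this issue
rows = [
    {"language": "Python", "fileCount": 18, "filePercentage": 47.4, "codeCount": 2132},
    {"language": "Bash", "fileCount": 2, "filePercentage": 5.3, "codeCount": 12},
]
for row in rows:
    # LOC share: this language's code lines relative to all code lines.
    row["locPercentage"] = round(100 * row["codeCount"] / TOTAL_CODE, 1)

args = build_parser().parse_args(["--summary-columns=language,fileCount,codeCount,locPercentage"])
table = select_columns(rows, args.summary_columns)
print(table)
```

Keeping the current columns as the default means the output only changes when the option is actually passed, which matches the backward-compatibility wish stated earlier in the thread.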
