Major refactor #7

sanzoghenzo · 2023-01-26T08:33:05Z

Hi there,

I couldn't help myself and I already did the refactor I promised in #5 !

This builds upon the previous pr ( #6 ) and:

uses PEP8 code style (proper naming style and formatting)
uses pytest instead of unittest
creates three subclasses of the CaptionTreeProcessor to handle the three elements separately, each class has the logic pertaining only to that type of element, and the common code is in the main class.
removes unused parameters from init, and moves the ones that are used only to store the specifics of subclasses as class attributes (overridden by the subclasses)

To make the configuration more flexible, there's a breaking change: caption_prefix is split into the three settings figure_caption_prefix, table_caption_prefix and listing_caption_prefix.

The only thing I'm not happy, and I didn't know what the plans were, is the link_process config: as of now it is only used for stripping the title from the image, so it would be better to go back to something like strip_image_title that defaults to True, and add more configs as the needs rise (so that tey can be fine-tuned separately just like the caption_prefixes).

sanzoghenzo · 2023-01-26T09:21:44Z

I just added 2 GitHub CI workflows:

the first tests the code with different versions of python
the second (not tested) publishes the package to PyPi when a release is created. It needs an access token to be saved as PYPI_API_TOKEN secret following these instructions.

flywire · 2023-01-26T13:10:37Z

Big day with lots of changes. I'm inclined to install your branch and if it works as expected on my PC just merge everything.

Did you refer to the Wiki?

To make the configuration more flexible, there's a breaking change: caption_prefix is split into the three settings figure_caption_prefix, table_caption_prefix and listing_caption_prefix.

I think you have misunderstood the intention.

There are essentially three content types, Figure, Table, and a flexible one, default to Listing. Any unrecognised content type (CaptionPrefix) would use Listing defaults and the options for that content type. For example, if Listing has a default top_caption = True, Video, Music, Widget, or whatever could have the option set to false for caption below content.

sanzoghenzo · 2023-01-26T13:54:05Z

Did you refer to the Wiki?

I didn't even know it existed 😅 I usually look for a contributing.md file or links in the readme, I must have miss it!

To make the configuration more flexible, there's a breaking change: caption_prefix is split into the three settings figure_caption_prefix, table_caption_prefix and listing_caption_prefix.

I think you have misunderstood the intention.

There are essentially three content types, Figure, Table, and a flexible one, default to Listing. Any unrecognised content type (CaptionPrefix) would use Listing defaults and the options for that content type. For example, if Listing has a default top_caption = True, Video, Music, Widget, or whatever could have the option set to false for caption below content.

I thought it was for internationalization purposes!

What you describe was not clear from the code or the tests. And it's still not totally clear to me: you say that the custom content type uses the Listing defaults, but then say the opposite, that can have the option set to false...

Regardless, you would you detect these custom content types? Should the custom_perfix be a list of strings to take into account multiple types (video, music and Widget in your example) ? If so, what if one wants to set different behavior for the different types?

(Btw, i'm still not sure how you would manage the top_caption if the caption is on an element of its own in the tree...)

flywire · 2023-01-26T21:08:18Z

OK seems like it is a good time for more discussion. Let's talk about the vision and focus on the dysfunctional ContentHack branch. If the description is conflicting then let's fix it.

First paragraph of readme:

caption manages captions by applying style and auto-numbering to content. The vision is to apply captions to any content. For now only figures are supported but the next candidate is likely to be tables.

Note:

Development stalled as discussed earlier but some of the functionality can be demonstrated by hard-coding content type in the ContentHack branch (but you have moved way past that)
There is always an applicable default (ie caption needs no options) for three unique caption styles, figure by default but defaults exist for table and listing
Any new content type, a previously undefined option, would use the listing defaults (some of the code is written around developing it this way)

This discussion is general and the detail like names or actual defaults might change in the implementation.

What you describe was not clear from the code or the tests. And it's still not totally clear to me: you say that the custom content type uses the Listing defaults, but then say the opposite, that can have the option set to false...

Check that again, the option is for a new content type setting top_caption = True to false. For example, a widget option might be used which would adopt all the same defaults as listing except caption location which might be defined with a different option.

I usually look for a contributing.md file or links in the readme, I must have miss it!

No, I missed it. A PR is accepted or not so it becomes a problem when there are many commits in the same PR and some commits are unacceptable. Significant PRs like the ones you have made justify a new feature branch.

sanzoghenzo · 2023-01-27T23:28:17Z

Thanks for taking the time to explain! English is not my native language, so it's possible that something eludes me...

1. Development stalled as discussed earlier but some of the functionality can be demonstrated by hard-coding content type in the ContentHack branch (but you have moved way past that)

I think this is why I didn't get what you were trying to do: I saw the broken unit tests and I supposed they were there to be fixed!

I apologize I still don't fully understand what you tried to explain about custom contents, defaults and so on, and I'm having a hard time imagining how a custom content can be identified in the html tree if there no rules (code) in place, but instead I propose that you provide me with unit tests, or even examples (input markdown, and output html for the given settings) of what you have in mind. That way we can do a TDD (Test driven development) and be aligned in the intents.

EDIT: I just realized that the open issues have some examples in them, I'll check them ASAP

A PR is accepted or not so it becomes a problem when there are many commits in the same PR and some commits are unacceptable. Significant PRs like the ones you have made justify a new feature branch.

I did many, small (well, except for the naming and pytest ones), commits for each "atomic" changes I made, all of which pass the tests, so it's easier to drop the changes that are not acceptable to you. If you review the code I can then rebase it to your liking.

flywire · 2023-01-28T22:21:51Z

To make the configuration more flexible, there's a breaking change: caption_prefix is split into the three settings

This doesn't look reasonable because content type inherits many settings, only some are exposed in the options.

It is intended an option may exist for each content type with suboptions modifying defaults.

flywire · 2023-01-29T02:55:54Z

The name style changes are so extensive that every file in the repository (except LICENSE) and all the configuration files have changed significantly. Referring to https://peps.python.org/pep-0008/#a-foolish-consistency-is-the-hobgoblin-of-little-minds this seems to be the wrong approach.

This is not a large project, significant parts are lifted straight from https://python-markdown.github.io/extensions/api/ which does not follow PEP8 but forms key project documentation. This issue was considered when Caption was first coded and consistency with the API style was considered the most important. Other MkDocs extensions use the same approach.

sanzoghenzo · 2023-01-29T08:53:38Z

Usually, if there are things to change, instead of merging and then provide comments, the reviewer uses the "review changes" button under the "files changed" section, and the contributor keeps adding commits (or rebases them) to fulfill the reviewer requests. In this way you could not bother about other people branches until they are ready to be merged in the main branch.
Obvously, if you want to check them in your local machine, you'd have to clone my fork and branch.
But anyway, let's keep it going 😉

This is not a large project, significant parts are lifted straight from https://python-markdown.github.io/extensions/api/ which does not follow PEP8 but forms key project documentation. This issue was considered when Caption was first coded and consistency with the API style was considered the most important. Other MkDocs extensions use the same approach.

Python-Markdown is a mixed bag of styles... and configurations options of the extensions that uses them are snake_case.

The Contributing code style guide section explicitly states to use PEP8 when it doesn't break backward compatibility.

So, as long as the API calls are intact, (namely, the makeExtension function, the extendMarkdown and getConfig methods), I think this is the right time to be PEP8 compliant.

But if you insist, I will disable the style check in my pycharm 😅

flywire · 2023-01-29T10:52:34Z

lol, breaking change is a good description.

I still haven't got my test file to do anything. Can you load some updated example files? Looks like it is mkdocs.yml and extra.css that need updating. The example should demonstrate most of the issues looking at index.html in the browser.

Since writing those comments I've had a closer look through the PEP8 guide I linked. My understanding is caption should be Caption for all the class names. As I read it, most of the mixedCase are originally methods but maybe they are used incorrectly as attributes. I note each new release of the extensions also seem to creep a bit closer to PEP8.

So I'm not insisting, if this is the right time to be PEP8 compliant then do it and anything else can be fixed with the tests.

sanzoghenzo · 2023-01-29T15:37:09Z

lol, breaking change is a good description.

I still haven't got my test file to do anything. Can you load some updated example files? Looks like it is mkdocs.yml and extra.css that need updating. The example should demonstrate most of the issues looking at index.html in the browser.

I focused on passing the unittests, so I used the python-markdown API only;
Since I had renamed the configs for PEP8, the mkdocs.yml needs to be adjusted accordingly.

I'm doing some TDD for the table captions, and that might change the API (= how do you declare options in the yaml file), so I'll update the documentation after that.

In the meantime, the basics are:

the plugin names are in setup.py's entry_points (as of now, it's only caption, but I will add image_captions and table_captions
as you know, configs for the plugin are specified in the Extension class - and the other extensions I'll create on the next iteration

sanzoghenzo added 22 commits January 25, 2023 17:35

fix: avoid exception if no title is specified

a5fd62c

fix: pass caption_class test

ff83b03

fix: pass caption_prefix test

e8be923

chore: add pycharm project folder to gitignore

0aa59d5

fix: pass caption prefix class test

103ec73

fix: pass content class test

99c09fe

fix: pass numbering test

1402818

fix: pass no strip title test

349ae66

refactor: use PEP8 style

1a7d194

refactor: move to pytest

ef55415

refactor: use static methods where applicable

098390b

refactor: split long method

19eeaa0

refactor: use subclasses

36079c3

refactor: move content_tag as class attribute

2bcd275

refactor: move defaults where they should be

fbaf772

refactor: remove unused parameter

165ec2c

refactor: move caption_tag as class attribute

2eab4cf

refactor: DRY code

c6b885f

refactor: remove unused parameters

690f9ce

docs: update readme

b41825b

build: setup github actions

fd90eea

build: rename build github action to test

7e15b54

sanzoghenzo added 2 commits January 26, 2023 10:26

fix: wrong entrypoint name

7248133

refactor: avoid nested ifs

b99384c

flywire changed the base branch from ContentHack to major-refactor January 28, 2023 12:19

flywire merged commit 6a724ba into flywire:major-refactor Jan 28, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Major refactor #7

Major refactor #7

sanzoghenzo commented Jan 26, 2023

sanzoghenzo commented Jan 26, 2023

flywire commented Jan 26, 2023 •

edited

Loading

sanzoghenzo commented Jan 26, 2023

flywire commented Jan 26, 2023

sanzoghenzo commented Jan 27, 2023 •

edited

Loading

flywire commented Jan 28, 2023 •

edited

Loading

flywire commented Jan 29, 2023

sanzoghenzo commented Jan 29, 2023

flywire commented Jan 29, 2023

sanzoghenzo commented Jan 29, 2023

Major refactor #7

Major refactor #7

Conversation

sanzoghenzo commented Jan 26, 2023

sanzoghenzo commented Jan 26, 2023

flywire commented Jan 26, 2023 • edited Loading

sanzoghenzo commented Jan 26, 2023

flywire commented Jan 26, 2023

sanzoghenzo commented Jan 27, 2023 • edited Loading

flywire commented Jan 28, 2023 • edited Loading

flywire commented Jan 29, 2023

sanzoghenzo commented Jan 29, 2023

flywire commented Jan 29, 2023

sanzoghenzo commented Jan 29, 2023

flywire commented Jan 26, 2023 •

edited

Loading

sanzoghenzo commented Jan 27, 2023 •

edited

Loading

flywire commented Jan 28, 2023 •

edited

Loading