Skip to content
This repository has been archived by the owner on Jun 8, 2024. It is now read-only.

[bug] Text wrapping: broken list case; 2x indent; length overflow #82

Open
4 of 6 tasks
Andrew15-5 opened this issue Aug 31, 2023 · 48 comments
Open
4 of 6 tasks

Comments

@Andrew15-5
Copy link
Contributor

Andrew15-5 commented Aug 31, 2023

This is a separate issue, related to the #74, since the text wrapping itself kinda works. Now it's time for bugs!

Before:
We have next things:
- thing 1;
- thing 2;
- thing 3.

#f[Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magnam aliquam quaerat voluptatem. Ut enim aeque doleamus animo, cum corpore dolemus, fieri.]

With default:

indent_space = 2
max_line_length = 50
experimental_args_breaking_consecutive = false

And version: 0.2.0#578fabc

After 1st run:
We have next things: - thing 1;
- thing 2;
- thing 3.

#f[Lorem ipsum dolor sit amet, consectetur adipiscing
  elit, sed do eiusmod tempor incididunt ut labore
  et dolore magnam aliquam quaerat voluptatem. Ut
  enim aeque doleamus animo, cum corpore dolemus,
  fieri.]
After 2nd run:
We have next things: - thing 1; - thing 2;
- thing 3.

#f[Lorem ipsum dolor sit amet, consectetur adipiscing
    elit, sed do eiusmod tempor incididunt ut labore
    et dolore magnam aliquam quaerat voluptatem. Ut
    enim aeque doleamus animo, cum corpore dolemus,
    fieri.]
After 3rd run:
We have next things: - thing 1; - thing 2; - thing 3.

#f[Lorem ipsum dolor sit amet, consectetur adipiscing
    elit, sed do eiusmod tempor incididunt ut labore
    et dolore magnam aliquam quaerat voluptatem. Ut
    enim aeque doleamus animo, cum corpore dolemus,
    fieri.]
After 4th run:
We have next things: - thing 1; - thing 2; - thing
3.

#f[Lorem ipsum dolor sit amet, consectetur adipiscing
    elit, sed do eiusmod tempor incididunt ut labore
    et dolore magnam aliquam quaerat voluptatem. Ut
    enim aeque doleamus animo, cum corpore dolemus,
    fieri.]

Next run will be as in 3rd, after as in 4th and so on.

But with max_line_length = 80 it stops formatting after 3rd run.

Note that the text lines exceed the 50 mark, which is also wrong (same with 80).

What's wrong:

  • merge of list lines with non-list lines if they are on adjacent lines (it could be //-/+/\d+\.)
  • it takes 2 runs to add 2 indents to the content block with a text (which is a function argument) — fixed at least in 0.2.1#1817538
  • it adds 2 indents to the content block with a text (which is a function argument)
  • some lines of text can exceed the limit from the config (file) — discussed in [bug] Line length checking is not accurate #86

How it supposed to look like:

We have next things:
- thing 1;
- thing 2;
- thing 3.

#f[Lorem ipsum dolor sit amet, consectetur
  adipiscing elit, sed do eiusmod tempor
  incididunt ut labore et dolore magnam aliquam
  quaerat voluptatem. Ut enim aeque doleamus
  animo, cum corpore dolemus, fieri.]

But since the function name could be very long, it should format like this (either only for long function names or for all cases):

We have next things:
- thing 1;
- thing 2;
- thing 3.

#very-long-long-long-long-long-function-name(
  [Lorem ipsum dolor sit amet, consectetur
  adipiscing elit, sed do eiusmod tempor
  incididunt ut labore et dolore magnam aliquam
  quaerat voluptatem. Ut enim aeque doleamus
  animo, cum corpore dolemus, fieri.]
)

This last one can be debatable, I assume, but the main problem, is that Typst right now can't do something like this (shell):

var="\
first line
secondline"

Typst instead of one additional space will add a new paragraph if \ is used right after [ (same with ]). So, to line up all the content block lines, we can use otherwise redundant () to wrap the content block.

Note that if the function name is very long, it could exceed the max line limit, but we can't do anything about it (we can't cut it in half).

@astrale-sharp
Copy link
Owner

Thanks again for your helpful issues and uncovering the cursed fruit of my work !

#f[Lorem ipsum dolor sit amet, consectetur adipiscing
  elit, sed do eiusmod tempor incididunt ut labore
  et dolore magnam aliquam quaerat voluptatem. Ut
  enim aeque doleamus animo, cum corpore dolemus,
  fieri.]

is stable and pretty good looking, i don't see a problem with it. The list && enum thing is bad tho

@Andrew15-5
Copy link
Contributor Author

I think you've missed some of my points, specifically:

  • you show 1 indent, which is generally ok, but in my case it adds 2 indents after 2 runs! Formatter always has to format everything on the 1st run.
  • I noted that some function names could be very long, and you have to keep [ and the content inside together, as to not introduce unwanted space. Therefore, if the function name is long enough, you just have to add () around [] to move content block on a new line (examples are in the first comment).

These are the main problems right now with version 0.2.0#578fabc when it comes to content block formatting.

@astrale-sharp
Copy link
Owner

I indeed missed that first point! I might still be missing the second one tbh

@astrale-sharp
Copy link
Owner

I can't reproduce #f[Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magnam aliquam quaerat voluptatem. Ut enim aeque doleamus animo, cum corpore dolemus, fieri.] , are you sure it's still a thing on master?

@Andrew15-5
Copy link
Contributor Author

I've left tested version, which is indeed an old version right now. I'll update and test again.

@Andrew15-5
Copy link
Contributor Author

Andrew15-5 commented Aug 31, 2023

I indeed missed that first point! I might still be missing the second one tbh

This is default behavior on 0.2.1#1817538:

#very-long-long-long-long-long-function-name[Lorem ipsum dolor sit amet, consectetur adipiscing
  elit, sed do eiusmod tempor incididunt ut labore
  et dolore magnam aliquam quaerat voluptatem. Ut
  enim aeque doleamus animo, cum corpore dolemus,
  fieri.]

As I already said, it should be something like this:

#very-long-long-long-long-long-function-name(
  [Lorem ipsum dolor sit amet, consectetur
  adipiscing elit, sed do eiusmod tempor
  incididunt ut labore et dolore magnam aliquam
  quaerat voluptatem. Ut enim aeque doleamus
  animo, cum corpore dolemus, fieri.]
)

Because next formatting style will (could) introduce a space at the beginning:

#very-long-long-long-long-long-function-name[// Here an (un)wanted whitespace will appear.
  Lorem ipsum dolor sit amet, consectetur // Even if this line is not indented.
  adipiscing elit, sed do eiusmod tempor
  incididunt ut labore et dolore magnam aliquam
  quaerat voluptatem. Ut enim aeque doleamus
  animo, cum corpore dolemus, fieri.] // (1)
] // (2)

I prefer ] to be like (2), because it looks more like a block/function call, rather than (1).

@astrale-sharp
Copy link
Owner

I finally get it, you would want me to modify the ast in that manner! I don't know How I feel about that
I would want this to happen

                                                                                     ﹀ not breaking here cause that modifies the content which is forbidden
#very-long-long-long-long-long-function-name[Lorem // breaking here cause it's over max length
  ipsum dolor sit amet, consectetur adipiscing
  elit, sed do eiusmod tempor incididunt ut labore
  et dolore magnam aliquam quaerat voluptatem. Ut
  enim aeque doleamus animo, cum corpore dolemus,
  fieri.]

@astrale-sharp
Copy link
Owner

But that's not possible cause the child is not aware of the parent, what you propose is sensible, I'll think about it

@astrale-sharp
Copy link
Owner

I fixed the other one btw

@Andrew15-5
Copy link
Contributor Author

Andrew15-5 commented Aug 31, 2023

I fixed the other one btw

I've just written a big-ish comment, and you've replied to my previous comments (probably). So you have to be more specific about "fixed the other one". I don't know what you are referring to.

Update: I've checked new commits. You were talking about list issue. I'll check it out.

@astrale-sharp
Copy link
Owner

Sorry XD

@Andrew15-5
Copy link
Contributor Author

Andrew15-5 commented Aug 31, 2023

All lists except terms are now working (in simple cases, I haven't tried anything complex).

@astrale-sharp
Copy link
Owner

I also fixed

    #very-long-long-long-long-long-function-name(
  [Lorem ipsum dolor sit amet, consectetur
  adipiscing elit, sed do eiusmod tempor]
)

becoming

    #very-long-long-long-long-long-function-name([Lorem ipsum dolor sit amet, consectetur
  adipiscing elit, sed do eiusmod tempor])

@astrale-sharp
Copy link
Owner

Let's look at terms now

@Andrew15-5
Copy link
Contributor Author

Andrew15-5 commented Aug 31, 2023

Your latest #56fbba6 broke your "1 indent" fix. Now it again adds 2 indents but in one run! This is progress with one step back.

@astrale-sharp
Copy link
Owner

I totally forgot about terms, right now they're a bit broken If I try to apply the same rules as enums and list for some unknown reason:tm:

Your latest #56fbba6 broke your "1 indent" fix. Now it again adds 2 indents but in one run! This is progress with one step back.

I don't know what you're talking about

@Andrew15-5
Copy link
Contributor Author

I don't know what you're talking about

Last commit again introduced the initial issue:

#f[Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magnam aliquam quaerat voluptatem. Ut enim aeque doleamus animo, cum corpore dolemus, fieri.]
#f[Lorem ipsum dolor sit amet, consectetur adipiscing
    elit, sed do eiusmod tempor incididunt ut labore
    et dolore magnam aliquam quaerat voluptatem. Ut
    enim aeque doleamus animo, cum corpore dolemus,
    fieri.]

2 indents (in 1 run) instead of 1.

@astrale-sharp
Copy link
Owner

I think there's one level of indent for the func call and one for the content block, wouldn't you say it makes sense?
if we wrote

#f([ .. ])

it might format as

#f(
  [
     .. 
  ]
)

would you say this deserves a special case and it should only be one level of indent?

@astrale-sharp
Copy link
Owner

I thought TermItems were broken but I discovered a bug! (I was stuck on it for a while), I'll open another issue and ping you

@Andrew15-5
Copy link
Contributor Author

I think there's one level of indent for the func call and one for the content block, wouldn't you say it makes sense? if we wrote

#f([ .. ])
it might format as

#f(
  [
     .. 
  ]
)

Yes, that would make sense, if [] wasn't \n/ sensitive. There are a lot of cases when you have to write content inside content block like this: [any content], instead of [ any content ] or [\n any content\n], because they can produce different output! So, you can't add another indent inside/for a content block, at least in Typst v0.7.

would you say this deserves a special case and it should only be one level of indent?

Since [] is sensitive to newlines and whitespaces it becomes a non-special case (like normal arguments with values, or just values). Therefore, it should be indented as any normal argument: 1 indent.

If [] were to be a {} then it's a completely different thing. Right now, you can't write #f{sym.RR}, you have to wrap {} inside () like this: #f({sym.RR}). But this already has a good formatting:

// If was written in one line
#f({ sym.RR })

// If was written differently
#f({
  sym.RR
})

@astrale-sharp
Copy link
Owner

astrale-sharp commented Aug 31, 2023

Content blocs are space sensitive but not indent sensitive 😎
[a] to [ a ] isn't okay but
[ a ] to [ a ] is
as well as the current example ;)
You may add as many indents as you wish but no Parbreak (more than two \n\n)

@Andrew15-5
Copy link
Contributor Author

[ a ] to [ a ] is

It's literally the same thing, nothing is changed. I don't understand what this can prove. Also, please use inline code formatting in Markdown, it improves readability of your examples.

@astrale-sharp
Copy link
Owner

astrale-sharp commented Aug 31, 2023

it shows as is but the left one has 4 spaces total and the right one has 2 spaces total, I don't know why it doesn't show on github
EDIT surrounded by ` it shows

@Andrew15-5
Copy link
Contributor Author

Andrew15-5 commented Aug 31, 2023

Because Markdown is the same as LaTeX/Typst, it ignores multiple whitespaces no matter how many there are of them.

@astrale-sharp
Copy link
Owner

My point exactly then, are you convinced? knowing this which do you think is the prettiest, one or two indents in this case?

@Andrew15-5
Copy link
Contributor Author

Content blocs are space sensitive but not indent sensitive

It's not indent sensitive because you already introduced the first whitespace by inserting a newline character:

a#f[some content]b

a#f[
some content]b

These 2 already will produce different output, but after formatting:

a #f[some content]b

a #f[
  some content
]b

They will output the same output, but not the one that is desired.

In the first case, it adds a new whitespace (no one asked), which is very bad! We need to make a new issue for this one.

In the second case, it, again, adds a new whitespace before the function call, then a \n and 2 more whitespaces after that for a total of 4 chars. This combined also results in 1 whitespace.

This is a pretty valid use case and the formatter alters the output, which should be fixed right away.

@astrale-sharp
Copy link
Owner

the program doesn't break the line in the cases you mentioned, it only modifies the space if it's there
#[lore eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeem]
becomes

#[lore
  eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeem]

@Andrew15-5
Copy link
Contributor Author

the program doesn't break the line in the cases you mentioned

"Doesn't break the line"? Which line are you talking about? The whole #82 (comment) was about introduction of an additional whitespace if there is some text next to a function call. Which supposed to counter your point in #82 (comment). Look at my examples in #82 (comment) more carefully and check the output in PDF.

@astrale-sharp
Copy link
Owner

To me

a#f[some content]b

becomes

a #f[some content]b

I didn't understand you were talking about that space, i thought you were talking about a linebreak elsewhere, indeed that's a problem yea, although not hard to fix

@Andrew15-5
Copy link
Contributor Author

Now, that you understood what I was talking about, think again about newline after [ and before ]:

#f([ .. ])

it might format as

#f(
  [
     .. 
  ]
)

These 2 will produce different output:

a#f(
  [..]
)b

a#f(
  [
     .. 
  ]
)b

And this is exactly why content block should be treated as any other simple argument with a value — 1 indent.

@astrale-sharp
Copy link
Owner

this

[..]

never becomes that

  [
     .. 
  ]

so I fail to see the issue

it might become that

  [..
    ..
     ..]

@astrale-sharp
Copy link
Owner

the problem I see is this

a#f() b

becoming that

a #f() b

@Andrew15-5
Copy link
Contributor Author

Exactly. See #84.

@astrale-sharp
Copy link
Owner

Yes so it's not about the two level of indentation!!?

@Andrew15-5
Copy link
Contributor Author

Andrew15-5 commented Aug 31, 2023

this

[..]

never becomes that

  [
     .. 
  ]

You contradict yourself in #82 (comment). Maybe you should rewrite a more accurate example of what you meant, because right now your thoughts are disconnected from your examples (pseudocode).

Yes so it's not about the two level of indentation!!?

It's both. Now that there is a separate #84 issue, we are left with the double indentation problem.

@astrale-sharp
Copy link
Owner

there is no double indentation problem, double indentation has no influence on the final result.
There is a double indentation choice we can make, even though it's nested twice, we can decide to indent it just once, what I'm wondering is if this is the best looking result (and I think it is)

@astrale-sharp
Copy link
Owner

I am not contradiction myself cause

#f([ .. ])

is not

#f([..])

@astrale-sharp
Copy link
Owner

I'll write an example in a few!

@Andrew15-5
Copy link
Contributor Author

there is no double indentation problem, double indentation has no influence on the final result. There is a double indentation choice we can make, even though it's nested twice, we can decide to indent it just once, what I'm wondering is if this is the best looking result (and I think it is)

Ok, I should've used double indentation discussion/choice then. Of course, the level of indentation doesn't change the output, this we can agree with.

@Andrew15-5
Copy link
Contributor Author

In #82 (comment) you've written this:

it might format as

#f(
  [
     .. 
  ]
)

In #82 (comment) you've written this:

never becomes that

  [
     .. 
  ]

First, you show why there is a 2 times indent with #f([]), then you say that this (will) never happens:

  [
     .. 
  ]

This is a contradiction.

If this doesn't happen or won't happen, then where do you get the 2nd indentation? Show me a more accurate example, if your opinion on double indentation still stands.

@astrale-sharp
Copy link
Owner

Let me repeat filling the ..
this

//    notice the spaces
#f([ lorem ipsum dolores ombrage ])

might format as this

#f(
  [
     lorem ipsum dolores ombrage
  ]
)

but this never will, that's why there is no problem with 1 indent, 2 indent etc

#f([lorem ipsum dolores ombrage])

@Andrew15-5
Copy link
Contributor Author

Andrew15-5 commented Aug 31, 2023

Ok, now I understand you too!)) This is a tough one, because the minimalist in me wants to optimize the code by removing redundant () and only adding 1 indent. This will probably also improve the readability, because the less special syntax characters are present, the better the reading experience, don't you think?

There is also a question of length. Do we really need to break 1 line into 3+? If the one-liner does not exceed the max line length, then it should stay unformatted.

So, here is what I think:

before formatting
// length <= max:
#f([content])

#f[content]

#f([ content ])

#f[ content ]
// --------------

// length > max:
#f([Lorem ipsum dolor sit amet, consectetur adipiscing]) // Example with 4 potential versions of formatting.

#f[Lorem ipsum dolor sit amet, consectetur adipiscing] // Example with 4 potential versions of formatting.

#f([ Lorem ipsum dolor sit amet, consectetur adipiscing ])

#f[ Lorem ipsum dolor sit amet, consectetur adipiscing ]
// -------------
after formatting
// length <= max:
#f[content]

#f[content]

#f[ content ]

#f[ content ]
// --------------

// length > max:
// The next 2 examples should be formatted the same way, but which one is better?

// style 1
#f[Lorem ipsum dolor sit amet, consectetur
  adipiscing]

// style 2
#f[Lorem ipsum dolor sit amet, consectetur
   adipiscing]

// style 3
#f(
  [Lorem ipsum dolor sit amet, consectetur
  adipiscing]
)

// style 4
#f(
  [Lorem ipsum dolor sit amet, consectetur
   adipiscing]
)

#f[
  Lorem ipsum dolor sit amet, consectetur
  adipiscing
]

#f[
  Lorem ipsum dolor sit amet, consectetur
  adipiscing
]
// -------------

Notes:

  • All my formatted examples use only 1 indent. I think it's kinda like KISS principle. Which is a good principle!)
  • +1 space in styles 2 and 4 should be discussed in a new issue, since they are not about double indentation.
  • If we were to use more than 1 argument, then this could complicate the things quite a bit, so we can talk about this in the new issue/discussion/Discord. I was thinking about "smart" formatting, which could move named and unnamed arguments to make code more simple looking, but this will require a lot of test and discussion. And the arguments named/unnamed could be changed to required/optional in the future Typst versions, so we shouldn't rush the more complicated cases until the Typst language syntax is in the stable release.

@astrale-sharp
Copy link
Owner

I hear you, but I'm wondering, what do we do with this then?
#f([content],[content])
this?
#f[content][content]
this (no! cursed!)?

`#f(
    [content]
)[content]`

this? (like right now)

`#f(
    [content],
    [content],
)`

It messes with the ast very much and I need to think if that's really okay or not, I'm a slow woman, I like to take my time ^^'

I'll also ask in the discord to gather opinion on if this is ok

I'm also thinking, if i was a beginner and I saw some parenthesis go away, that would be weird (whereas, a warning that the parenthesis are not needed would be nice)

@Andrew15-5
Copy link
Contributor Author

Look, this again (2 args) is not what we were discussing at first, and it's not even an issue in this issue (#82). Don't make things more complicated, you should avoid doing that. For the 2 argument question, you probably should open a new issue, or we can discuss it elsewhere.

I hear you

If you agree with my formatting (in #82 (comment)) about 1 argument content block, then you can implement that, and we will be 1 step closer to closing this issue.

The last thing is an accurate line length checking, so that lines with text don't exceed the configured limit.

P.S. I don't know how hard/easy these new feature implementations are, but you are doing a great job! I don't think I've ever seen someone implementing new features/fixing bugs left and right just like that *snap*. It feels so cool to be able to open a feature/bug issue, and it will be done in the same day or maybe a couple of days. Unbelievable.

@astrale-sharp
Copy link
Owner

astrale-sharp commented Sep 1, 2023

There are too many things in this issue, let's open a discussion about accurate line length checking cause there are things to be thought about and said, but I just got back to work so this will have to wait a bit

@astrale-sharp
Copy link
Owner

Thanks a lot for the praise 😊 😊 ❤️
Some things are harder than others, I'm really familiar with the code base which helps ofc, but I'm kind of obsessive so when an issue is opened it really motivates me to go to the end for it
I'm also going to be working only two days a week which will leave me lot of times to dev so you can expect the pace to be this one to two days a week :D
Are you fluent in rust? if so and if you're interested it would be a pleasure to guide you into contributing to the project with code, but I must say, I think we have a good rhythm right now with you testing things and me implementing things (your help is invaluable 👀

@astrale-sharp
Copy link
Owner

One other thing that helps is that this project is of limited complexity, it's only 1626 LoC right now, I mean, it's a good bit already and I'm v proud but it's v humanly comprehensible

@Andrew15-5
Copy link
Contributor Author

let's open a discussion about accurate line length checking

Done: #86.

Are you fluent in rust?

I WISH!!! I just currently don't have time, and didn't have time for a while. Learning Rust (and Go) is my priority, when I'll have more free time.

if you're interested it would be a pleasure to guide you into contributing to the project

When I'll reach level "Beginner" in Rust, I'll definitely would like to contribute the code itself. But everything could change, so no promises. ;)

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants