-
-
Notifications
You must be signed in to change notification settings - Fork 6
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
refactor: further separate CLI logic from the API related functionality (see #117) #124
Conversation
…for each subcommand
…se `style_edition` instead.
CodSpeed Performance ReportMerging #124 will not alter performanceComparing Summary
|
Hi @Rolv-Apneseth! I have quickly looked at your PR, it looks very good! I will give it a better look later, but let me answer your first questions.
Well, I think So in conclusion,
We can give it a try :-) |
…ute` for each variant of `Command`
Great!
Sorry, I meant which things in let text = std::fs::read_to_string(filename)?;
let requests = request
.clone()
.with_text(text.clone())
.split(self.max_length, self.split_pattern.as_str());
let response = server_client.check_multiple_and_join(requests).await?; And for the
I've added it in there - I think it fits nicely. |
I've pushed some changes there for this, avoiding the need to clone the input text. |
Yes,
I think using
Yes and no. Yes because
Yes, this would be great. The split-text logic isn't really trivial to solve, because we want to avoid splitting text where it would break the meaning of a sentence, but also we cannot have long-running text either. I think working on a better split-logic can be a next step. For the moment, we can just use
Yes, the end-goal would be to remove the complex logic from |
Hey again. Had a lot of trouble with lifetimes and I could only get |
Hello @Rolv-Apneseth, rapidly looking at your commit, I think you just forgot to add the lifetime parameter in However, it makes little sense to have reference in responses, because This will replicate a similar behavior to that of |
.as_ref() | ||
.ok_or(Error::InvalidRequest("missing text field".to_string()))?; | ||
pub fn try_split(mut self, n: usize, pat: &str) -> Result<Vec<Self>> { | ||
let text = mem::take(&mut self.text) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here, one goal would be that try_split
returns a reference to &self
, so the signature should be as follows:
pub fn try_split<'source>(&'source self, n: usize, pat: &str) -> Result<Vec<Self<'source>>>
Note that we should not need to mutate the actual request, because we only take references.
I can have a go again but something was complaining about needing to live longer than |
No issue. If you can't make it compile, just push with |
So this code: #[cfg_attr(feature = "cli", derive(Args))]
#[derive(Clone, Debug, PartialEq, Eq, Serialize, Hash)]
#[serde(rename_all = "camelCase")]
#[non_exhaustive]
pub struct Request<'source> {
/// The text to be checked. This or 'data' is required.
#[cfg_attr(
feature = "cli",
clap(short = 't', long, conflicts_with = "data", allow_hyphen_values(true))
)]
#[serde(skip_serializing_if = "Option::is_none")]
pub text: Option<Cow<'source, str>>, Leads to this error:
So I don't think it will work as it is. Should I take out |
Oh yeah, I remember why I struggled with Creating a separate struct would work, but that means duplicating a lot of the code. I wonder if there is still a way to allow |
Not that I can think of anyway. I'd say the options are to either stick with |
Ok so some news: at the moment, it seems impossible to do that in one struct, see clap-rs/clap#5773 and the related discussion, so I suggest we duplicated the structure: to original |
I think it would have to be a procedural macro right? Can't see how to do that with a declarative one. I have a bit of time today so I'll have a look but feel free to do it if you have a good idea of how it should be done |
I actually started some work on your branch today, I will hopefully have something working this afternoon :) |
Hey - been busy moving this weekend. If I have any time in the coming days I can have a look for any crates that could solve this problem - if not, yes, copy paste might have to do for now |
No issue! Yeah, maybe let's copy and paste (keeping the |
Yeah couldn't find any crates for this situation. I do think just copying the struct is better but to provide another option, what do you think of something like this: pub struct Request<'source> {
// HACK: keep compiler happy when enabling the cli feature flag
#[cfg(feature = "cli")]
#[clap(skip)]
_lifetime: PhantomData<&'source ()>,
/// (...docs)
#[cfg(feature = "cli")]
#[clap(short = 't', long, conflicts_with = "data", allow_hyphen_values(true))]
#[serde(skip_serializing_if = "Option::is_none")]
pub text: Option<Cow<'static, str>>,
/// (...docs)
#[cfg(not(feature = "cli"))]
#[serde(skip_serializing_if = "Option::is_none")]
pub text: Option<Cow<'source, str>>,
/// (...docs)
#[cfg(feature = "cli")]
#[clap(short = 'd', long, conflicts_with = "text")]
#[serde(skip_serializing_if = "Option::is_none")]
pub data: Option<Data<'static>>,
/// (...docs)
#[cfg(not(feature = "cli"))]
#[serde(skip_serializing_if = "Option::is_none")]
pub data: Option<Data<'source>>, As for the upstream issue - I had a look but I believe it's beyond my skills. Needs quite a bit of familiarity with how their derive macros work, and the author seems to agree having assigned it that |
This is not a wrong idea, but the issue is that compiling with This is an edge case, but we don't want to have to keep a static lifetime reference to the content of each file, where we don't need it. So copying the struct is the only we to both have Clap happy and still be able to use non-static lifetime when desired.
I agree, this is not easy, especially as it involves proc macros! |
…eature Required cloning structs and methods from `api::check` to separate out the `clap` functionality, as `clap` wouldn't support the lifetime without it being `'static`
So, it's not ideal, and I had to use the lifetime crate, but what do you think of the latest commit? Perhaps it's better to leave those fields as |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hello @Rolv-Apneseth! It looks great, I just have a few comments that I left :-)
Please let me know what you think of it!
src/api/check.rs
Outdated
@@ -963,7 +953,7 @@ impl Response { | |||
#[derive(Debug, Clone, PartialEq)] | |||
pub struct ResponseWithContext { | |||
/// Original text that was checked by LT. | |||
pub text: Cow<'static, str>, | |||
pub text: String, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why does it have to be an owned string? If I am correct, the "context" is simply a source text, so should be &'source str
, no?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just had issues with lifetimes, in particular check_multiple_and_join
in server.rs
was a pain due to those tokio tasks that are spawned. However, I'm having a look at if it can maybe work with IntoStatic
.
What is the match positions section in this code block supposed to do? It's causing a lifetime issue and I don't see the values being returned / used for anything:
impl<'source> From<ResponseWithContext<'source>> for Response {
#[allow(clippy::needless_borrow)]
fn from(mut resp: ResponseWithContext<'source>) -> Self {
let iter: MatchPositions<'_, std::slice::IterMut<'_, Match>> = (&mut resp).into();
for (line_number, line_offset, m) in iter {
m.more_context = Some(MoreContext {
line_number,
line_offset,
});
}
resp.response
}
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hum I see, probably using multiple tasks may conflict with lifetime.
The match iter
is an iterator that will automatically count lines and line offsets, so we can then add this information to the Match
object.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks that helped, so that piece of code modifies the response with some additional context before returning it.
Have a look at the latest commit and see if you're happy with the changes as I needed some workarounds for more lifetime issues due to that block of code above.
I could improve it if we drop the implementation for Iter
and just keep the one for IterMut
, so that it doesn't need to be generic. Also, maybe the modification of response
with that additional context could happen when ResponseWithContext
is first created, rather than only when converting it to Response
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @Rolv-Apneseth! Your work looks very promising, but I have a paper deadline (next Thursday) that is taking most of my time, so I will (hopefully, if the deadline is not postponed) have time to review your work on next Friday :-)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Of course, no worries, you can take your time. And good luck!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Were you alright with these changes then? If so I can move on to the de-duplication of the DataAnnotation
and CliRequest
methods
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also - hope the paper went well
impl CliDataAnnotation { | ||
/// Instantiate a new `CliDataAnnotation` with text only. | ||
#[inline] | ||
#[must_use] | ||
pub fn new_text<T: Into<String>>(text: T) -> Self { | ||
Self { | ||
text: Some(text.into()), | ||
markup: None, | ||
interpret_as: None, | ||
} | ||
} | ||
|
||
/// Instantiate a new `CliDataAnnotation` with markup only. | ||
#[inline] | ||
#[must_use] | ||
pub fn new_markup<M: Into<String>>(markup: M) -> Self { | ||
Self { | ||
text: None, | ||
markup: Some(markup.into()), | ||
interpret_as: None, | ||
} | ||
} | ||
|
||
/// Instantiate a new `CliDataAnnotation` with markup and its | ||
/// interpretation. | ||
#[inline] | ||
#[must_use] | ||
pub fn new_interpreted_markup<M: Into<String>, I: Into<String>>( | ||
markup: M, | ||
interpret_as: I, | ||
) -> Self { | ||
Self { | ||
interpret_as: Some(interpret_as.into()), | ||
markup: Some(markup.into()), | ||
text: None, | ||
} | ||
} | ||
|
||
/// Return the text or markup within the data annotation. | ||
/// | ||
/// # Errors | ||
/// | ||
/// If this data annotation does not contain text or markup. | ||
pub fn try_get_text(&self) -> Result<String> { | ||
if let Some(ref text) = self.text { | ||
Ok(text.clone()) | ||
} else if let Some(ref markup) = self.markup { | ||
Ok(markup.clone()) | ||
} else { | ||
Err(Error::InvalidDataAnnotation(format!( | ||
"missing either text or markup field in {self:?}" | ||
))) | ||
} | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same remark as for CliRequest
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hello @Rolv-Apneseth, I finally have some time to review this work properly :-)
I did a new pass, and it looks excellent! I suggested a few changes, especially for readability, and added two questions regarding two static lifetimes.
I see that most tests are passing, except a few ones, but didn't yet understand why those tests were failing.
Co-authored-by: Jérome Eertmans <[email protected]>
Co-authored-by: Jérome Eertmans <[email protected]>
Co-authored-by: Jérome Eertmans <[email protected]>
Co-authored-by: Jérome Eertmans <[email protected]>
Co-authored-by: Jérome Eertmans <[email protected]>
Hello @Rolv-Apneseth, sorry for lagging a bit behind those last weeks. I looked at failing test a bit in depth and I don't mind merging this PR. We will take care of making those tests pass in the main PR :-) |
Oh - I hadn't done the de-duplication yet - shall I do that and you can merge this branch again? I was waiting for a reply to this:
|
Oh, sorry! Well, given your large contribution to this repo, I consider adding you as a contributor so you can directly edit Should I add you? |
I mean, as long as you're comfortable with it. Personally I wouldn't trust strangers on the internet so fast haha |
I know, but branch protection rules are good to prevent issues :-) |
Fair enough yeah. Thanks for the invite, I'll push that deduplication to v3 (probably) later today |
Thanks! I plan to be more active on this repo myself, as I plan on using this combined with Typst to write my thesis, but this is unfortunately not a priority at the moment, and this is why I prefer to give you more freedom to work on this project while you are motivated, rather than to stale it forever haha |
Continuation of #117.
As discussed, some refactors. You will probably want some more done here and that's OK, just let me know what to do.
One thing for sure is the
String
values inapi/check.rs
- which ones exactly would you like me to make&str
orCow<'_, str>
?Also, for
cli/mod.rs
, I could potentially remove the need for the manual call ofcmd.execute
for each subcommand using the enum_dispatch crate if you're OK with adding a dependency? Fine as it is anyway though in my opinion, as there aren't many subcommands.