Explicit imports: `import 'Raw "sample.html"` #2036

vi · 2024-09-10T22:23:32Z

Resolves #2033.
Resolves #1000.

Shall I continue this (i.e. try to add tests, error handling, docs, etc.), or should it be done some other way?

Shall it be based on master or e.g. on #2022?

jneem

The approach looks reasonable to me. I guess we might want to bikeshed the syntax a little...

jneem · 2024-09-11T02:28:50Z

core/src/term/mod.rs

@@ -207,7 +207,7 @@ pub enum Term {

    /// An unresolved import.
    #[serde(skip)]
-    Import(OsString),
+    Import{path: OsString, typ: Option<LocIdent>},


I'd suggest having typ: InputFormat here, and handling the supported types (including the auto-detection from the filename) while constructing the Term.

How would it interact with an optionally present format like Nix? Ideally it should recognize the format and preserve it during round-tripping, and only fail when actually trying to import the file. Maybe the #[cfg] on InputFormat::Nix should be removed?

Does it move complexity to grammar.lalrpop? (This is the first time I've edited such a file.)

Good question. I think since InputFormat::Nix is experimental, it's ok to error if it isn't supported.

I'd suggest adding the actual logic in core/src/parser/utils.rs, so that the changes to grammar.lalrpop will be very simple.
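
For illustration, here is a minimal sketch of the two lookups this suggestion implies: one from an explicit tag and one from the filename extension. It assumes a simplified stand-in for `InputFormat`; the variant list, the extension table, and the helper names `from_tag` and `from_path` are assumptions made for this example, not necessarily what the PR ends up with.

```rust
use std::ffi::OsStr;
use std::path::Path;

/// Hypothetical, simplified stand-in for the real `InputFormat`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum InputFormat {
    Nickel,
    Json,
    Yaml,
    Toml,
    Raw,
}

impl InputFormat {
    /// Map an explicit tag (written `'Raw` in source, passed here without
    /// the leading quote) to a format. Unknown tags yield `None`.
    fn from_tag(tag: &str) -> Option<InputFormat> {
        match tag {
            "Nickel" => Some(InputFormat::Nickel),
            "Json" => Some(InputFormat::Json),
            "Yaml" => Some(InputFormat::Yaml),
            "Toml" => Some(InputFormat::Toml),
            "Raw" => Some(InputFormat::Raw),
            _ => None,
        }
    }

    /// Guess the format from the file extension (assumed table).
    /// Unknown or missing extensions yield `None`.
    fn from_path(path: &Path) -> Option<InputFormat> {
        match path.extension().and_then(OsStr::to_str) {
            Some("ncl") => Some(InputFormat::Nickel),
            Some("json") => Some(InputFormat::Json),
            Some("yaml") | Some("yml") => Some(InputFormat::Yaml),
            Some("toml") => Some(InputFormat::Toml),
            Some("txt") => Some(InputFormat::Raw),
            _ => None,
        }
    }
}
```

With helpers of this shape living next to the parser utilities, the grammar action would only need to call one function and store the resulting format in the term.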

vi · 2024-09-11T11:00:39Z

What do you think about the fallback to InputFormat::Nickel when the filename extension cannot be recognized? Maybe that #[default] is not a good idea?

Falling back from a passive format to a code format may affect security when untrusted data ends up near the filename: if the code that appends .txt slips somehow, Nickel would suddenly start executing the included file (using the plain import syntax) instead of just reading a config or text snippet.

github-actions · 2024-09-11T13:10:13Z

Bencher Report

Branch	2036/merge
Testbed	ubuntu-latest

⚠️ WARNING: The following Measure does not have a Threshold. Without a Threshold, no Alerts will ever be generated!
Latency

Benchmark	Latency, nanoseconds (ns)
fibonacci 10	507,460.00
pidigits 100	3,294,600.00
product 30	820,460.00
scalar 10	1,495,700.00
sum 30	820,550.00

github-actions · 2024-09-11T13:10:14Z

Bencher

Report	Wed, September 11, 2024 at 13:10:13 UTC
Project	nickel
Branch	2036/merge
Testbed	ubuntu-latest

⚠️ WARNING: The following Measure does not have a Threshold. Without a Threshold, no Alerts will ever be generated!
Latency (latency)

Benchmark	Latency, nanoseconds (ns)
fibonacci 10	496,000.00
pidigits 100	3,177,500.00
product 30	802,100.00
scalar 10	1,482,000.00
sum 30	796,270.00

vi · 2024-09-12T02:34:34Z

I'd suggest having typ: InputFormat here, and handling the supported types (including the auto-detection from the filename) while constructing the Term.

I'd suggest adding the actual logic in core/src/parser/utils.rs, so that the changes to grammar.lalrpop will be very simple.

Done.

Some questions:

Is it OK to require the tag when filename extension is unknown?
Shall it preserve whether the import was explicit or implicit in Term, so that it can pretty-print it back? Or just always pretty-print with a tag? Or auto-detect when filename extension is good and omit the tag?
What is new_from_inputs? Is it OK to hard code it to Nickel inputs or they should be auto-detected like other files?
Shall there be aliases, like 'Txt or 'Text as alternative for 'Raw, or 'Ncl as alternative for 'Nickel?

jneem · 2024-09-12T03:19:50Z

Is it OK to require the tag when filename extension is unknown?

I think for backwards-compatibility we should fall back to the default when the extension is unknown. (But because the tag is a new feature, it's fine to error for an unknown tag.)

Shall it preserve whether the import was explicit or implicit in Term, so that it can pretty-print it back? Or just always pretty-print with a tag? Or auto-detect when filename extension is good and omit the tag?

I think any of those options is ok; maybe @yannham has some input. The important thing for the pretty-printer is that it round-trips when doing (pretty-print -> parse). It doesn't need to round-trip when doing (parse -> pretty-print).

What is new_from_inputs? Is it OK to hard code it to Nickel inputs or they should be auto-detected like other files?

You mean Program::new_from_inputs? I think all the Program constructors can assume nickel format.

Shall there be aliases, like 'Txt or 'Text as alternative for 'Raw, or 'Ncl as alternative for 'Nickel?

My preference is to not have unnecessary aliases, but maybe other people feel differently...

yannham · 2024-09-12T08:50:13Z

I think for backwards-compatibility we should fall back to the default when the extension is unknown. (But because the tag is a new feature, it's fine to error for an unknown tag.)

I agree.

I think any of those options is ok; maybe @yannham has some input. The important thing for the pretty-printer is that it round-trips when doing (pretty-print -> parse). It doesn't need to round-trip when doing (parse -> pretty-print).

Same. We can always special-case later (like never showing when the format is Nickel, or/and when the extension agrees), but in general pretty-printing is used for error and result reporting, and isn't guaranteed to be stable (formatting doesn't use the pretty-printer).

My preference is to not have unnecessary aliases, but maybe other people feel differently...

I agree as well. We have a raw format for exporting stuff as a pure string, so we probably want a raw format as well for importing pure text without parsing it.

We discussed the syntax in today's weekly meeting. I'll make a separate issue soon, but this doesn't need to block this PR - we can change the syntax in a second PR.

yannham

Thanks for tackling this! Beside the syntax bikeshedding (which can be done later), and the type -> format question, this PR looks good.

yannham · 2024-09-12T08:51:48Z

core/src/error/mod.rs

+    /// Attempt to specify an import, type of which is not known at the moment of compilation.
+    /// `explicit` determines whether explicit import type annotation was used


Once again, I fear type is a bit ambiguous here (because imports can have a type as well). I suggest we use format consistently in the documentation and error reporting instead.

yannham · 2024-09-12T08:57:45Z

By the way, the CI is failing because some code in the lsp/nls crate needs to be updated now that Term::Import has changed.

vi · 2024-09-12T11:14:00Z

By the way, the CI is failing because some code in the lsp/nls crate needs to be updated now that Term::Import has changed.

The pull request is currently marked as a draft because not everything is implemented yet, so it should not be merged.

Other missing things (besides format field name, fallback and nls):

Pretty printing
Adding to the book (if it shares the source code with the main code)
More unit tests for imports

vi · 2024-09-12T11:20:41Z

I think for backwards-compatibility we should fall back to the default when the extension is unknown.

Shall a warning be issued instead of a parse error in that case?

(Does/should Nickel in general have warnings?)

yannham · 2024-09-12T11:40:13Z

(Does/should Nickel in general have warnings?)

No, it doesn't. It's something we've wanted to add for some time; this could be a good excuse, but it's separate work. For now I would say let's stick to the previous behaviour (import anything unknown as Nickel, and if it fails to parse, we show the parse error).
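
The policy agreed on in this thread (an explicit tag wins, an unknown tag is an error, and an unrecognized extension falls back to Nickel) can be sketched as follows. This is only an illustration under assumptions: `resolve_format`, `ImportError`, and the reduced `InputFormat` enum are hypothetical names for this example, not the PR's actual code.

```rust
/// Hypothetical, reduced stand-in for the real `InputFormat`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum InputFormat {
    Nickel,
    Json,
    Raw,
}

/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq, Eq)]
enum ImportError {
    /// The user wrote a tag that names no supported format.
    UnknownTag(String),
}

/// Decide which format an import uses.
///
/// * `explicit_tag`: the tag written by the user, if any (without the quote).
/// * `detected`: the format guessed from the filename extension, if any.
fn resolve_format(
    explicit_tag: Option<&str>,
    detected: Option<InputFormat>,
) -> Result<InputFormat, ImportError> {
    match explicit_tag {
        // An explicit tag always overrides extension-based detection.
        Some("Nickel") => Ok(InputFormat::Nickel),
        Some("Json") => Ok(InputFormat::Json),
        Some("Raw") => Ok(InputFormat::Raw),
        // Tags are a new feature, so rejecting unknown ones breaks nothing.
        Some(other) => Err(ImportError::UnknownTag(other.to_owned())),
        // No tag: keep the previous behaviour and fall back to Nickel
        // when the extension is not recognized.
        None => Ok(detected.unwrap_or(InputFormat::Nickel)),
    }
}
```

Under this sketch, a file with an unrecognized extension is still parsed as Nickel, so any failure surfaces as an ordinary parse error rather than as a new kind of diagnostic.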

vi · 2024-09-12T19:58:49Z

Implemented additional changes:

Renamed typ to format. Moved code that deals with the format tags closer to the code that deals with filename extensions.
Brought back the fallback to Nickel files (the function is deliberately kept one step away from implementing the error again though).
Implemented the pretty-printer part (it omits the tag when needed and adds 'Nickel if previous code relied on the fallback).
Fixed compilation of the NLS.
Added imports section to the book.
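
As a rough illustration of the pretty-printing rule described in this list, the sketch below omits the tag when the extension already implies the format and prints it otherwise, which includes printing `'Nickel` for files that previously relied on the fallback. The enum, the extension table, and the function `pretty_import` are assumptions for this example; Rust's `{:?}` string formatting is used only as an approximation of Nickel string-literal escaping.

```rust
use std::ffi::OsStr;
use std::path::Path;

/// Hypothetical, reduced stand-in for the real `InputFormat`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum InputFormat {
    Nickel,
    Json,
    Raw,
}

impl InputFormat {
    /// The tag name as written in source, without the leading quote.
    fn tag(self) -> &'static str {
        match self {
            InputFormat::Nickel => "Nickel",
            InputFormat::Json => "Json",
            InputFormat::Raw => "Raw",
        }
    }

    /// Extension-based detection (assumed table).
    fn from_path(path: &Path) -> Option<InputFormat> {
        match path.extension().and_then(OsStr::to_str) {
            Some("ncl") => Some(InputFormat::Nickel),
            Some("json") => Some(InputFormat::Json),
            Some("txt") => Some(InputFormat::Raw),
            _ => None,
        }
    }
}

/// Print an import so that parsing the output yields the same format again.
fn pretty_import(path: &str, format: InputFormat) -> String {
    if InputFormat::from_path(Path::new(path)) == Some(format) {
        // The extension alone already selects this format: omit the tag.
        format!("import {:?}", path)
    } else {
        // Otherwise spell the tag out, separated from `import` by a space
        // so that the keyword and the tag stay distinct tokens.
        format!("import '{} {:?}", format.tag(), path)
    }
}
```

Printing the tag whenever detection would disagree is what keeps the (pretty-print -> parse) direction stable, which is the round-trip property requested earlier in the thread.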

Not implemented:

Specific unit tests
Format tag completions for the NLS.

yannham

Looks good! It's just missing some tests. You can take inspiration from nickel/core/tests/integration/inputs/imports, where the test files for imports are located, and nickel/core/tests/integration/inputs/imports/imported, which hosts the files imported by the former tests. If you put a .ncl file in imported which isn't a stand-alone test but is just designed to be imported by another test, as most of them are in imported, you'll need to add a heading annotation # test.type = 'skip'.

If you're not confident or can't devote more time to this, feel free to let us know and we can also take it from here.

vi · 2024-09-13T22:04:09Z

Added some tests (without exhaustive tests for each possible input format).

Note that when the same file is imported twice with different formats, the cache remembers the first import and ignores the tag on further ones. Shall we address this potential confusion source? I see two main ways:

Enforce the consistency. Remember the format in cache and verify that the format is correct when attempting to retrieve it from cache. Prototype: vi@3ae48a9
Use input format as a part of the key for cache, allowing caching the same file multiple times (for distinct InputFormats). Prototype: vi@cfc46aa

jneem · 2024-09-16T02:53:29Z

I prefer the second option. I can imagine some use-case where you want to import something as a structured format and as raw text.
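
The second option can be pictured with a small sketch in which the cache key is the pair (path, format) rather than the path alone. Everything here is an assumption for illustration: the `Cache` struct, the `FileId` alias, and `get_or_add` are hypothetical and much simpler than the real cache in core/src/cache.rs.

```rust
use std::collections::HashMap;
use std::path::{Path, PathBuf};

/// Hypothetical, reduced stand-in for the real `InputFormat`.
/// `Hash` is derived so that the format can be part of a map key.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
enum InputFormat {
    Nickel,
    Json,
    Raw,
}

/// Hypothetical identifier for a cached source.
type FileId = usize;

/// A toy cache that keeps one entry per (path, format) pair.
#[derive(Default)]
struct Cache {
    file_ids: HashMap<(PathBuf, InputFormat), FileId>,
    next_id: FileId,
}

impl Cache {
    /// Return the entry for this path *and* format, creating it if needed.
    /// The same file imported under two formats gets two distinct entries.
    fn get_or_add(&mut self, path: &Path, format: InputFormat) -> FileId {
        let key = (path.to_path_buf(), format);
        if let Some(id) = self.file_ids.get(&key) {
            return *id;
        }
        let id = self.next_id;
        self.next_id += 1;
        self.file_ids.insert(key, id);
        id
    }
}
```

With such a key, importing `data.json` once as `'Json` and once as `'Raw` no longer lets the first import shadow the tag of the second one.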

vi · 2024-09-16T13:57:24Z

Shall I include the implementation (vi@cfc46aa) into this branch?

This is a more invasive change compared to just some explicit tag support.

yannham · 2024-09-16T14:29:40Z

@vi Yes, I think it's fine to join both here. Splitting is good usually but the second change is really motivated by the current PR and has a ~ +30/-30 diff, which IMHO is entirely reasonable (and the current PR isn't too big either).

vi · 2024-09-17T14:06:27Z

Something failed on Windows; it is not very clear from the CI logs.

Does Windows mangle \n to \r\n during checkout or file read?

yannham · 2024-09-17T14:42:29Z

Mh, that's very likely, yeah. One possibility is to remove the newlines at the end of the files and call it a day. Another possibility is to do a regexp match, for example to just check the ~~suffix~~ prefix, or by matching the end of line with a pattern that will fit both Unix and Windows \\r?\\n or something like that.

We should probably have something like an OS-dependent std.string.new_line in the future, but for this PR we'll have to do without.

vi · 2024-09-18T14:35:00Z

Is there anything more to be done here (e.g. a syntax change for #2039) ?

yannham

Nope, we can handle this in a second step. Thanks a lot for carrying this through @vi !

vi · 2024-10-12T23:36:43Z

Noticed possible regression:

$ cargo run -p  nickel-lang-cli eval q.q q.q
...
error: unbound identifier `import'Nickel`
  ┌─ <generated main>:1:2
  │
1 │ (import'Nickel "q.q") & (import'Nickel "q.q")
  │  ^^^^^^^^^^^^^ this identifier is unbound

yannham · 2024-10-14T17:28:13Z

I guess the pretty printer is at fault, for not inserting a space between import and 'Nickel. However we will end up with a different syntax, so I think we shouldn't bother fixing this issue separately.

jneem reviewed Sep 11, 2024

Explicit imports

bb23cd7

vi force-pushed the explicit_import branch from 30c76fe to bb23cd7 Compare September 12, 2024 02:29

yannham reviewed Sep 12, 2024

vi added 2 commits September 12, 2024 21:34

explicit import: more changes.

97a9616

* Rename typ to format * Bring back the fallback * Implement pretty-printer part * Fix compilation of NLP

Document imports

b3752e0

vi marked this pull request as ready for review September 12, 2024 19:59

yannham mentioned this pull request Sep 13, 2024

The Bikeshedding Casa: import with explicit format #2039

Closed

yannham reviewed Sep 13, 2024

vi and others added 3 commits September 13, 2024 14:39

Apply suggestions from code review

73eb4e6

Co-authored-by: Yann Hamdaoui

minor fix

7458e70

explicit imports: add tests

5302eb4

vi added 2 commits September 17, 2024 00:24

explicit imports: support importing the same file with different formats

a6c3533

fmt

d6f9aff

Reinforce explicit imports test against some whitespace changes.

75bd460

yannham approved these changes Sep 18, 2024

yannham added this pull request to the merge queue Sep 18, 2024

Merged via the queue into tweag:master with commit c50647d Sep 18, 2024
6 checks passed

BrewTestBot mentioned this pull request Nov 12, 2024

nickel 1.9.0 Homebrew/homebrew-core#197479

Merged
