Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add iospec #51

Draft
wants to merge 7 commits into
base: master
Choose a base branch
from
Draft

Conversation

cairomassimo
Copy link

@cairomassimo cairomassimo commented May 9, 2022

iospec is a tool to generate parsers for, validate and transform input/output files, given a specification of the format in an expressive language, as well as to generate documentation of the format itself.

Goals

  • Support a very large set of I/O formats (virtually for all IOI-style problems),
  • By writing the spec just once
    • Generate graders and templates automatically in many programming languages (incl. all those allowed in IOIs)
    • Validate automatically the format for input/output files
    • Validate automatically (a large set of) the assumptions about input/output files
    • Support the creation of I/O generators, validators and checkers, by generating I/O parsers for them automatically
    • Generate a description of the I/O format and assumptions in LaTeX, relying on user-provided macros for the natural-language parts
  • Be reasonably user-friendly to use
  • Be flexible and future-proof

What's implemented already?

  • Generating graders and templates in C and C++
  • Validating inputs/outputs against the spec
  • A large set of formats with numeric-only data (incl. multi-dimensional array with non-uniform dimensions, branches depending on input values, array dimesions that are arithemtic functions of input values, etc.)
  • rustc-like diagnostic messages for most errors when parsing a spec
  • a testing framework that compiles and runs generated graders

Proposed new API / contact points for problem developers

  • gen/IOSPEC file with description of input formats, assumptions (incl. subtask-specific ones), and solution interface,
  • extending the task-maker command (and/or adding one or more new task-maker-tools commands) to generate the following files from gen/IOSPEC (if given):
    • sol/template.{ext} for every configured and supported language,
    • sol/grader.{ext} for every configured and supported language,
    • gen/genlib.hpp, support lib for C++ generators, validators and checkers (see below),
    • statement/lib.tex, TeX macros for statements (see below),
    • statement/messages.{lang}.tex, TeX macros with translatable text (see below),
  • adding the following steps in the DAG:
    • validate iospec
      • runs task-maker-tools iospec-check (once),
      • before everything,
      • reports issues in gen/IOSPEC, and stops if any if found,
    • input pre-validation
      • for each test case, runs task-maker-tools iospec-check <input.txt> --cfg subtask_name=<name>,
      • after input generation, before validation (if any),
      • used to check input format and assumptions in gen/IOSPEC,
      • so that actual validator (if any) can ignore the input format and just check more complex assumptions (e.g. "graph is a DAG"),
    • output validation
      • for each test case, runs task-maker-tools iospec-check <input.txt> <output.txt> --cfg subtask_name=<name> on every input/output,
      • after output generation, before testing solutions,
      • used to check output format and assertions in IOSPEC (conditions on the output),
  • API provided by genlib.hpp:
    • struct IoData { ... } with public members corresponding to I/O data,
    • GENERATOR_MAIN(function1, function2, ...) macro to use in generators,
      • where function1, function2, etc. accept an arbitrary number of parameters and return IoData,
      • if only one argument function1, parameters are parsed automatically from argv,
      • else, the argv[1] is used to choose which of the functions to call, then as above,
      • can use primitive types as paramaters, as well as a random number generator Rng (API TBD) initialized from a seed,
    • VALIDATOR_MAIN(validator_function) macro to use in validators,
      • where validator_function is a void validator_function(IoData data);
      • return type TBD,
    • or, instead of the above, VALIDATOR_GRADER() macro to use in validators,
      • no arguments,
      • validators should implement the same functions as a solution, and perform their checks there,
    • CHECKER_MAIN(checker_function),
      • where checker_function is a void checker_function(IoData correct, IoData submission);
      • parameters are respectively the correct input/output data, and the input/output data for the submission,
      • note: the input is available as part of both correct and submission,
      • return type TBD,
    • std::string get_subtask_name() (or similar name, TBD) returning the current subtask name
  • TeX macros provided by statement/lib.tex (in a future PR):
    • task assumptions (TBD),
    • subtask table with scores and assumptions (TBD),
    • input format specification (TBD),
    • all macros in turn use language-specific macros defined in messages.{lang}.tex (e.g., a language-specific macro like for every #1 used to generate for every $i=0,\dots,N$:)
    • generated macros can be customized with annotations in gen/IOSPEC (TBD)

@cairomassimo cairomassimo force-pushed the iospec branch 2 times, most recently from f57dd61 to 748c66a Compare May 17, 2022 14:00
Copy link
Collaborator

@edomora97 edomora97 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From great PR comes great reviews

Overall amazing job!

I've left quite a few comments, most of which are either questions or small suggestions you can apply directly from GitHub. There are some major questions that may go unnoticed, which are:

  • Can you describe further what do you mean by "Compile", "Eval", "EvalMut" and "Run"? I know that from the spec you produce source code (is this the "compile" step?), but also read the input file for validating it (which step is this?)
  • I assume the IR is also an interpreter of the spec for parsing the input file for the validation, how does it work? (i.e. which traits are involved in doing so?)
  • How fast is this interpreter compared to the equivalent validator compiled from C++? (and also compared to the same written in Python, and with pypy)
  • Can you describe how the mem module work? How is it used and what is it used for?
  • What does State contain?

@@ -14,6 +14,8 @@ use task_maker_rust::tools::task_info::main_task_info;
use task_maker_rust::tools::typescriptify::main_typescriptify;
use task_maker_rust::tools::worker::main_worker;

use task_maker_iospec::tools::*;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't really like wildcard imports in general, maybe here it's fine, but my feelings about them are not great.
Apart from the obvious "they bring more than what is actually needed", I dislike them because it makes reading the code without an IDE with "Go to definition" support pretty much impossible (especially when more wildcards are used in the same file).

I don't want to ban them (there are still few good use cases, like for "prelude"s), but I always try to avoid them.

(I mention this here, and skip all the following instances of it)

Comment on lines +54 to +59
/// Check input/output files against a specification in the `iospec` language.
IospecCheck(iospec_check::Opt),
/// Generate graders or other I/O-related files given a specification in the `iospec` language.
IospecGen(iospec_gen::Opt),
/// Generate standard set of files given an I/O format specification in the `iospec` language.
IospecGenAll(iospec_gen_all::Opt),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

    iospec-check           Check input/output files against a specification in the `iospec`
                               language
    iospec-gen             Generate graders or other I/O-related files given a specification in
                               the `iospec` language
    iospec-gen-all         Generate standard set of files given an I/O format specification in
                               the `iospec` language

Nit: is it possible to rephrase them so that they won't get a line break with --help?

@@ -0,0 +1,22 @@
[package]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

GitHub doesn't allow me to place comments on submodules, so I write it here.

task-maker-exec/tabox has been added as a submodule. Why? Is it really needed or left from previous iterations?

Comment on lines +7 to +22
[dependencies]
syn = { version = "1.0.92", features = ["extra-traits"] }
proc-macro2 = { version = "1.0.19", features = ["span-locations"] }
annotate-snippets = { version = "0.9.0", features = ["color"] }
codemap = "0.1.3"
num-traits = "0.2.12"
by_address = "1.0.4"
anyhow = "1.0.57"
clap = { version = "3.1", features = ["derive"] }

[dev-dependencies]
assert_cmd = "2.0.4"
goldenfile = "1.1.0"
tempdir = "0.3.7"
tempfile = "3.3.0"
walkdir = "2.3.2"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I usually document (with a super short comment) what each dependency do. For example:

https://github.com/edomora97/task-maker-rust/blob/ee326ebf9d4eae3e286a9846e19ede4404396ade/task-maker-exec/Cargo.toml#L12-L42

@@ -0,0 +1,2 @@
unstable_features = true
imports_granularity = "Item"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I usually use module, but done by hand. And I don't usually force it (especially with unstable features).

.arg("solution.c")
.arg("-o")
.arg("main.c.bin")
// FIXME: missing `free` in generated C
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To do. Or maybe just ignore memory leaks in AddressSanitizer.

LangOpt::Cpp => "cpp",
LangOpt::C => "c",
LangOpt::Inspect => "inspect",
// LangOpt::Tex => "tex",
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Commented code.

|| {
for call in main.inner.calls.iter() {
gen!(ctx, {
"std::function<{}({})> {} = [](auto...) {{{}}};"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this std::function instead of a plain function pointer?

#[derive(Debug, Clone)]
pub struct CallMetaStmt {
pub kw: kw::call,
pub name: Name,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if you have two @call statements for the same function name? Does it throw a nice error? I don't think we want to support function overloading.

}
}

// @call init(a=a);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are all the @call in this spec commented out? Is this the reason why the solution files are empty?

@veluca93
Copy link
Collaborator

High level comment: why don't we leave figuring out of formatting to appropriate language-specific tooling? (i.e. clang-format, autopep8 or whatever)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants