Handle unknown output filenames #7

jachris · 2017-08-01T08:55:43Z

Some rules (e.g. Java) might output different filenames which are difficult to know before executing the rule. Currently all outputs must be known before execution – other build tools solve this by zipping all produced files to a well-defined path.

jachris · 2017-11-24T20:46:13Z

The current solution is to make the file path deterministic by renaming and/or zipping. This may not be good enough for some people, but it will do for now. Feel free to open an issue if you need this.
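To make the renaming/zipping idea concrete, here is a minimal sketch that uses only the Python standard library, not Cook's API. The function name `pack_outputs` and all paths are hypothetical. It collects whatever a tool wrote into a scratch directory and packs it into a single archive at a path known in advance. Entries are sorted and given a fixed timestamp, so repeated runs over the same contents should produce the same archive.

```python
import os
import zipfile


def pack_outputs(scratch_dir, archive_path):
    """Pack every file under scratch_dir into one zip at a known path.

    Hypothetical helper for illustration; not part of Cook.
    """
    members = []
    for root, _dirs, files in os.walk(scratch_dir):
        for name in files:
            full = os.path.join(root, name)
            members.append((os.path.relpath(full, scratch_dir), full))

    with zipfile.ZipFile(archive_path, 'w', zipfile.ZIP_DEFLATED) as archive:
        for relative, full in sorted(members):
            # Fixed timestamp so the archive depends on file contents only.
            info = zipfile.ZipInfo(relative.replace(os.sep, '/'),
                                   date_time=(1980, 1, 1, 0, 0, 0))
            info.compress_type = zipfile.ZIP_DEFLATED
            # Give extracted files ordinary read/write permissions.
            info.external_attr = 0o644 << 16
            with open(full, 'rb') as handle:
                archive.writestr(info, handle.read())
    return archive_path
```

A rule could then declare only `archive_path` as its output, no matter how many files the tool actually produced.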

spacehamster · 2018-04-02T00:44:01Z

I'm looking at Cook in the context of setting up projects. An example is a script that creates a Visual Studio project on Windows, an Xcode project on Mac, and a make project on Linux. Not being able to create rules with unknown outputs makes this really difficult. Say I try to port a bash script to Cook, so that if there is a mistake in one line, the whole thing doesn't need to be rebuilt:

mkdir build
cd build
curl -L https://github.com/commonmark/cmark/archive/0.28.3.tar.gz -o cmark.tar.gz
tar -xzvf cmark.tar.gz
mkdir cmark_build
cd cmark_build
cmake ../cmark-0.28.3 -G "Visual Studio 15 2017 Win64" -DCMAKE_INSTALL_PREFIX=../cmark_install
cmake --build . --target install
cd ..
mkdir my_project
cd my_project
cmake ../.. -G "Visual Studio 15 2017 Win64" -DCMARK_PATH=../cmark_install
#open myproject.sln

Can't use file.extract to extract cmark.tar.gz because the file contents aren't known until the tar is downloaded and extracted
The contents of cmark_build can't really be specified as they depend on cmake
The contents of cmark_install are hard to specify because while my_project depends explicitly on the libs, it depends implicitly on the header files
The contents of my_project change over time as they are used

Another example is generating documentation, such as doxygen or sphinx, which generate a doc folder full of html files.

I'm assuming the reason unspecified outputs are automatically deleted is to ensure correct, reproducible builds. Perhaps that could optionally be disabled, and rules without deterministic output could be given a target that stands in place of the output (a folder, a CMakeLists.txt file, a python function) to determine when to rebuild.
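One possible reading of the stand-in target idea is a fingerprint over a whole folder, stored in a small stamp file that the build tool tracks instead of the individual files. The sketch below assumes exactly that and uses only the standard library; `fingerprint` and `needs_rebuild` are hypothetical names, not part of Cook, and the stamp file is assumed to live outside the folder being hashed.

```python
import hashlib
import os


def fingerprint(folder):
    """Return a hex digest covering relative paths and contents in folder."""
    digest = hashlib.sha256()
    for root, dirs, files in os.walk(folder):
        dirs.sort()  # Make the traversal order deterministic.
        for name in sorted(files):
            full = os.path.join(root, name)
            digest.update(os.path.relpath(full, folder).encode('utf-8'))
            digest.update(b'\0')
            with open(full, 'rb') as handle:
                digest.update(handle.read())
            digest.update(b'\0')
    return digest.hexdigest()


def needs_rebuild(folder, stamp_path):
    """Report whether folder changed since the last call, then update the stamp."""
    current = fingerprint(folder)
    try:
        with open(stamp_path) as handle:
            previous = handle.read().strip()
    except FileNotFoundError:
        previous = None
    if current == previous:
        return False
    with open(stamp_path, 'w') as handle:
        handle.write(current)
    return True
```

Hashing contents rather than modification times keeps the check stable when a tool rewrites files without changing them, at the cost of reading every file on each check.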

jachris · 2018-04-02T15:06:50Z

If you want you can disable the file removal right now by editing cook/core/record.py. Just remove everything shown below (and make sure you installed Cook with the develop option of setup.py, or alternatively reinstall).

cook/cook/core/record.py

Lines 40 to 48 in 7461a90

    
for root, directories, files in os.walk(build('.')):
    root = os.path.normpath(root)
    if root.startswith(temporary):
        continue
    for file in files:
        path = os.path.abspath(os.path.join(root, file))
        if not path == record and not graph.has_file(path):
            log.warning('Removing non-declared file: ' + path)
            os.remove(path)

However, we both know that this is only a quickfix.

In the long run I want to improve the build process itself. Let's imagine that you want to download an archive and build its content (e.g. C++ files) using Cook itself. Doing this in a dynamic way which does not require explicitly naming the source files is currently not possible, because all tasks must be instantiated before the first one is built. What would solve this problem, and also yours, is making the process more dynamic: it should be possible to create tasks while other ones are being built. You should also be able to first download the archive and publish the output filenames after that. This may sound easy, but it has many implications down the chain (e.g. I only want to build X, but how do I know which task will create it?).
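As a toy illustration of tasks being created while others run, the sketch below lets each task return both its outputs and a list of follow-up tasks that only become known after it has executed. It is not how Cook works; `run_dynamic`, `download_archive` and `make_compile` are hypothetical names, the download is faked with a hard-coded file list, and dependency ordering and the "which task creates X?" question are deliberately left out.

```python
from collections import deque


def run_dynamic(initial_tasks):
    """Run tasks in order; a task may return follow-up tasks to enqueue."""
    queue = deque(initial_tasks)
    produced = []
    while queue:
        task = queue.popleft()
        outputs, follow_ups = task()
        produced.extend(outputs)
        queue.extend(follow_ups)
    return produced


def make_compile(source):
    """Create a task that pretends to compile one source file."""
    def compile_source():
        return [source.replace('.cpp', '.o')], []
    return compile_source


def download_archive():
    # Stand-in for downloading and extracting: the file list is only
    # known once this task has run.
    extracted = ['a.cpp', 'b.cpp']
    compile_tasks = [make_compile(source) for source in extracted]
    return ['archive.tar.gz'], compile_tasks


print(run_dynamic([download_archive]))
# → ['archive.tar.gz', 'a.o', 'b.o']
```

The scheduler never needs to know the source files up front, which is exactly why selecting a single target becomes hard: nothing records that `a.o` will exist until `download_archive` has run.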

This requires changing the whole model and cannot be done properly without some time on hand. I don't know when I will implement it, since development is currently paused. But this has a very high priority for me once I get back to working on this.

I'm sorry I can't offer you a better solution for now. 😐

spacehamster · 2018-04-03T10:30:47Z

I was thinking of something like a command to disable checking for unknown file outputs, and then letting the user manage them with custom rules. Something like:

from cook import core, misc
core.strict_mode(False) #Disable deleting unknown files
#Doxygen manages its output folders
misc.run(
    inputs=['doxyfile'],
    outputs=['doxygen'],
    command=['doxygen', 'doxyfile'],
    phony=True,
    message='Generate doc directory.'
)

As for dynamic tasks, not knowing which tasks are available until prerequisite tasks are built sounds like quite the design challenge to solve.

A simpler method I was considering, which may not be for everyone, is a system that allows folder dependencies and extends an existing build system.

I understand if any of this might conflict with current or future design considerations.

jachris · 2018-04-03T15:23:19Z

The solution you outlined above would work, but still requires some time to polish and implement. Also I'm not quite sure whether it's a good one or not (I especially dislike the global flag).

In the meantime you can use a simple hack often used in make: creating a dummy output. You can do that in Cook as well, but keep in mind that you basically throw correctness out of the window just like with make.

from cook import core


@core.rule
def dummy(
    command, inputs=None, message=None, env=None, timeout=None, cwd=None
):
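    inputs = core.resolve(inputs or [])
    command[0] = core.which(command[0])
    # Fail early, before the missing program ends up in the checksum.
    if not command[0]:
        raise RuntimeError('Could not find program')

    # The checksum of the invocation serves as the dummy output path.
    dummy = core.checksum(command, env, cwd)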

    yield core.publish(
        inputs=inputs + [command[0]],
        outputs=[dummy],
        message=message or 'Running "{}"'.format(command[0]),
        check=[env, command],
        phony=True
    )

    core.call(command, env=env, timeout=timeout, cwd=cwd)

    with open(dummy, 'w'):
        pass


out = core.resolve('docs')
docs = dummy(
    inputs=core.glob('src/**'),
    command=['touch', out],
    message='Generating documentation'
)


@core.task
def zipped_docs():
    zipped = core.build('docs.zip')
    # Avoid shadowing the builtin zip() and use the resolved path below.
    zip_program = core.which('zip')
    if not zip_program:
        raise RuntimeError('could not find "zip" program')

    yield core.publish(
        inputs=[docs.output],
        message='Zipping documentation',
        outputs=[zipped]
    )

    core.call([zip_program, zipped, out])

Doxygen (or in this case touch) may write its outputs to arbitrary folders including the build directory if you make the modifications to record.py. If you don't want to make the modifications, you can still write the outputs into the source directory. You can depend on it by mentioning docs.output in your inputs, so that you can zip it for example.

jachris added Core Feature Low Priority labels Aug 1, 2017

jachris mentioned this issue Aug 5, 2017

Investigate Rust Rules #12

Open

jachris closed this as completed Nov 24, 2017

jachris reopened this Apr 2, 2018

jachris added High Priority and removed Low Priority labels Apr 2, 2018
