Handle unknown output filenames #7

Open
jachris opened this issue Aug 1, 2017 · 5 comments

Comments

@jachris
Owner

jachris commented Aug 1, 2017

Some rules (e.g. Java) might produce output files whose names are difficult to know before executing the rule. Currently, all outputs must be known before execution. Other build tools solve this by zipping all produced files to a well-defined path.
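
The zipping approach mentioned above can be sketched in plain Python. This is only an illustration, not part of Cook: the function name `zip_outputs` and the example paths are made up, and it assumes the archive is written outside the directory being packed.

```python
import os
import zipfile


def zip_outputs(output_dir, archive_path):
    """Pack every file under output_dir into one archive at a known path.

    Hypothetical helper, not a Cook API. The rule can then declare
    archive_path as its single, predictable output.
    """
    # Collect relative paths and sort them so the archive layout does not
    # depend on the order in which the file system lists entries.
    members = []
    for root, _dirs, files in os.walk(output_dir):
        for name in files:
            full = os.path.join(root, name)
            members.append(os.path.relpath(full, output_dir))
    members.sort()

    with zipfile.ZipFile(archive_path, 'w', zipfile.ZIP_DEFLATED) as archive:
        for relative in members:
            archive.write(os.path.join(output_dir, relative), arcname=relative)
    return members
```

A Java rule could, for example, compile into a scratch directory and then call `zip_outputs('scratch/classes', 'build/classes.zip')`, so that only `build/classes.zip` has to be known before execution.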

@jachris
Owner Author

jachris commented Nov 24, 2017

The current solution is to make the file path deterministic by renaming and/or zipping. This may not be good enough for some people, but it will do for now. Feel free to open an issue if you need this.

@jachris jachris closed this as completed Nov 24, 2017
@spacehamster

I'm looking at Cook in the context of setting up projects. An example is a script that creates a Visual Studio project on Windows, an Xcode project on macOS, and a Make project on Linux. Not being able to create rules with unknown outputs makes this really difficult. Say I try to port a bash script to Cook, so that a mistake in one line doesn't force the whole thing to be rebuilt:

mkdir build
cd build
curl -L https://github.com/commonmark/cmark/archive/0.28.3.tar.gz -o cmark.tar.gz
tar -xzvf cmark.tar.gz
mkdir cmark_build
cd cmark_build
cmake ../cmark-0.28.3 -G "Visual Studio 15 2017 Win64" -DCMAKE_INSTALL_PREFIX=../cmark_install
cmake --build . --target install
cd ..
mkdir my_project
cd my_project
cmake ../.. -G "Visual Studio 15 2017 Win64" -DCMARK_PATH=../cmark_install
#open myproject.sln
  • Can't use file.extract to extract cmark.tar.gz because the file contents aren't known until the tar is downloaded and extracted
  • The contents of cmark_build can't really be specified as they depend on cmake
  • The contents of cmark_install are hard to specify because while my_project depends explicitly on the libs, it depends implicitly on the header files
  • The contents of my_project change over time as they are used

Another example is generating documentation, such as doxygen or sphinx, which generate a doc folder full of html files.

I'm assuming the reason unspecified outputs are automatically deleted is to ensure correct, reproducible builds. Perhaps that could optionally be disabled, and rules without deterministic output could be given a target that stands in place of the output (a folder, a CMakeLists.txt file, a python function) to determine when to rebuild.
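
One way to picture a stand-in target for a folder is a fingerprint over its contents that is stored in a stamp file. The sketch below is plain Python and not a Cook feature; the names `directory_fingerprint`, `needs_rebuild` and `record_build` are invented for illustration, and the stamp file is assumed to live outside the folder it describes.

```python
import hashlib
import os


def directory_fingerprint(directory):
    """Return a hex digest over the relative paths and contents of all files."""
    digest = hashlib.sha256()
    entries = []
    for root, _dirs, files in os.walk(directory):
        for name in files:
            entries.append(os.path.join(root, name))
    # Sort so the digest does not depend on directory listing order.
    for path in sorted(entries):
        digest.update(os.path.relpath(path, directory).encode('utf-8'))
        digest.update(b'\0')
        with open(path, 'rb') as handle:
            digest.update(handle.read())
        digest.update(b'\0')
    return digest.hexdigest()


def needs_rebuild(directory, stamp_path):
    """True if no stamp exists or the folder changed since it was written."""
    if not os.path.exists(stamp_path):
        return True
    with open(stamp_path) as handle:
        return handle.read().strip() != directory_fingerprint(directory)


def record_build(directory, stamp_path):
    """Store the current fingerprint after a successful build."""
    with open(stamp_path, 'w') as handle:
        handle.write(directory_fingerprint(directory))
```

Dependents would then depend on the stamp file instead of on the individual files inside the folder, at the cost of rebuilding whenever anything in the folder changes.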

@jachris jachris reopened this Apr 2, 2018
@jachris
Owner Author

jachris commented Apr 2, 2018

If you want you can disable the file removal right now by editing cook/core/record.py. Just remove everything shown below (and make sure you installed Cook with the develop option of setup.py, or alternatively reinstall).

cook/cook/core/record.py

Lines 40 to 48 in 7461a90

for root, directories, files in os.walk(build('.')):
    root = os.path.normpath(root)
    if root.startswith(temporary):
        continue
    for file in files:
        path = os.path.abspath(os.path.join(root, file))
        if not path == record and not graph.has_file(path):
            log.warning('Removing non-declared file: ' + path)
            os.remove(path)

However, we both know that this is only a quickfix.
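
A less drastic variant than deleting the whole loop would be to exempt selected directories from the check. The following is a standalone sketch of that idea, not the code in record.py: `find_undeclared` and its `exempt_dirs` parameter are hypothetical, and the function only reports files instead of removing them.

```python
import os


def find_undeclared(build_dir, declared, exempt_dirs=()):
    """List files under build_dir that are neither declared nor exempt.

    Hypothetical helper for illustration; it does not delete anything.
    """
    declared = {os.path.abspath(p) for p in declared}
    exempt = [os.path.abspath(d) for d in exempt_dirs]
    undeclared = []
    for root, _dirs, files in os.walk(build_dir):
        root = os.path.abspath(root)
        # Skip directories whose contents are managed by an external tool.
        if any(root == d or root.startswith(d + os.sep) for d in exempt):
            continue
        for name in files:
            path = os.path.join(root, name)
            if path not in declared:
                undeclared.append(path)
    return sorted(undeclared)
```

With something like this, a folder written by doxygen or cmake could be listed in `exempt_dirs` while stray files elsewhere in the build directory are still detected.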

In the long run I want to improve the build process itself: Let's imagine that you want to download an archive and build its contents (e.g. C++ files) using Cook itself. Doing this in a dynamic way that does not require explicitly naming the source files is currently not possible, because all tasks must be instantiated before the first one is built. What would solve this problem, and also yours, is making the process more dynamic: it should be possible to create tasks while other ones are being built. You should also be able to first download the archive and publish the output filenames after that. This may sound easy, but it has many implications down the chain (e.g. I only want to build X, but how do I know which task will create it?).
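
To make the idea of creating tasks during the build more concrete, here is a toy scheduler in plain Python. It is not how Cook works today and not a design proposal for it; `run_dynamic`, `extract_archive` and the file names are invented, and dependencies, outputs and parallelism are ignored entirely.

```python
from collections import deque


def run_dynamic(initial_tasks):
    """Run (name, action) tasks; each action may return follow-up tasks."""
    queue = deque(initial_tasks)
    finished = []
    while queue:
        name, action = queue.popleft()
        # Only after the action has run do we learn which tasks it spawns.
        new_tasks = action() or []
        finished.append(name)
        queue.extend(new_tasks)
    return finished


def extract_archive():
    # Stand-in for downloading and unpacking: the list of sources is only
    # known once this step has actually been executed.
    sources = ['a.cpp', 'b.cpp']
    return [('compile ' + source, lambda: None) for source in sources]


order = run_dynamic([('extract', extract_archive)])
```

The sketch also shows the implication mentioned above: before `extract` has run, nothing in the queue reveals which task will eventually produce a given file.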

This requires changing the whole model and cannot be done properly without some time on hand. I don't know when I will implement it, since development is currently paused. But this has a very high priority for me once I get back to working on this.

I'm sorry I can't offer you a better solution for now. 😐

@spacehamster

spacehamster commented Apr 3, 2018

I was thinking of something like a command to disable checking for unknown file outputs, and then letting the user manage them with custom rules. Something like:

from cook import core, misc
core.strict_mode(False) #Disable deleting unknown files
#Doxygen manages its output folders
misc.run(
    inputs=['doxyfile'],
    outputs=['doxygen'],
    command=['doxygen', 'doxyfile'],
    phony=True,
    message='Generate doc directory.'
)

As for dynamic tasks, not knowing which tasks are available until prerequisite tasks are built sounds like quite the design challenge to solve.

A simpler method I was considering, which may not be for everyone, is a system that allows folder dependencies and extends an existing build system.

I understand if any of this might conflict with current or future design considerations.

@jachris
Owner Author

jachris commented Apr 3, 2018

The solution you outlined above would work, but still requires some time to polish and implement. Also I'm not quite sure whether it's a good one or not (I especially dislike the global flag).

In the meantime you can use a simple hack often used in make: creating a dummy output. You can do that in Cook as well, but keep in mind that you basically throw correctness out of the window just like with make.

from cook import core


@core.rule
def dummy(
    command, inputs=None, message=None, env=None, timeout=None, cwd=None
):
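    inputs = core.resolve(inputs or [])
    # Resolve the program first so that a missing executable fails early,
    # before it ends up in the checksum.
    command[0] = core.which(command[0])
    if not command[0]:
        raise RuntimeError('Could not find program')

    # The checksum of the invocation is used as the dummy output file.
    dummy = core.checksum(command, env, cwd)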

    yield core.publish(
        inputs=inputs + [command[0]],
        outputs=[dummy],
        message=message or 'Running "{}"'.format(command[0]),
        check=[env, command],
        phony=True
    )

    core.call(command, env=env, timeout=timeout, cwd=cwd)

    with open(dummy, 'w'):
        pass


out = core.resolve('docs')
docs = dummy(
    inputs=core.glob('src/**'),
    command=['touch', out],
    message='Generating documentation'
)


@core.task
def zipped_docs():
    zipped = core.build('docs.zip')
    zip_program = core.which('zip')
    if not zip_program:
        raise RuntimeError('could not find "zip" program')

    yield core.publish(
        inputs=[docs.output],
        message='Zipping documentation',
        outputs=[zipped]
    )

    # Call the resolved executable instead of relying on PATH again.
    core.call([zip_program, zipped, out])

Doxygen (or in this case touch) may write its outputs to arbitrary folders including the build directory if you make the modifications to record.py. If you don't want to make the modifications, you can still write the outputs into the source directory. You can depend on it by mentioning docs.output in your inputs, so that you can zip it for example.
