Stateless Huffman #8

jimcarreer · 2015-07-08T21:26:05Z

As discussed with @Lukasa the Huffman encoder / decoder is currently initialized using two static structures defined in huffman_constants.py. We could potentially remove the need for this pseudo-stateful nature and instead have a completely stateless Huffman encoder / decoder with the addition of another static structure related to decoding (encoding is currently trivial to make stateless). This may also offer a minor performance enhancement.
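To illustrate the idea, here is a minimal sketch of decoding driven purely by static module-level structures. It uses a three-symbol toy prefix code, not the real table from huffman_constants.py, and the names `CODES`, `DECODE`, `encode`, and `decode` are assumptions made for this example rather than the library's API.

```python
# Toy prefix code for illustration only (NOT the HPACK table):
# symbol -> (code bits, bit length).
CODES = {"a": (0b0, 1), "b": (0b10, 2), "c": (0b11, 2)}

# The extra static structure for decoding: (code bits, bit length) -> symbol.
DECODE = {value: symbol for symbol, value in CODES.items()}


def encode(text):
    """Pack text into (integer, total bit count) using only CODES."""
    acc, nbits = 0, 0
    for ch in text:
        code, length = CODES[ch]
        acc = (acc << length) | code
        nbits += length
    return acc, nbits


def decode(acc, nbits):
    """Decode nbits of acc, most significant bit first, using only DECODE."""
    out = []
    code, length = 0, 0
    for shift in range(nbits - 1, -1, -1):
        code = (code << 1) | ((acc >> shift) & 1)
        length += 1
        symbol = DECODE.get((code, length))
        if symbol is not None:
            out.append(symbol)
            code, length = 0, 0
    if length:
        raise ValueError("trailing bits do not form a complete code")
    return "".join(out)
```

Both functions read only module-level constants, so no encoder or decoder object has to be constructed or initialized before use; for example, `decode(*encode("abca"))` round-trips the input.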

jimcarreer · 2015-10-13T14:51:26Z

I think I'm going to take a look at this over the weekend.

Lukasa · 2016-03-21T12:37:56Z

An FYI, I've migrated to an essentially stateless Huffman decoding implementation in #35.

Lukasa · 2016-12-12T09:30:17Z

Nope, we still have a stateful encoder.

Lukasa · 2016-12-12T10:51:58Z

I would like to avoid us having any native extensions in this module at this time.

Lukasa · 2016-12-12T11:05:01Z

Tentatively, yes, I'm opposed to having an optional native extension.

My thinking on this is a bit complex, so let me lay it out.

Generally speaking I have a policy regarding C code, which is that wherever possible I'd like to avoid having untrusted input reach C code. The risks of running C code on untrusted input are too high, and I find that generally speaking I and most software developers I work with are bad at spotting places where assumptions are made that data will be "reasonable". That makes C a great vector for bugs, and in particular for the kind of bug that does not affect a pure-Python program.

For that reason, if we were going to write a native extension I'd want it to be in Rust, not C. However, the wheel ecosystem is not quite there yet. While it's now mostly possible to ship binary wheels for most platforms, there are a number of Linux distributions that cannot use manylinux1 wheels. This is a problem if you want to ship a Rust extension, because it's very difficult to indicate the need for a Rust compile chain on a random Linux system.

This problem also affects C extensions, by the way: one of the systems that doesn't support a manylinux1 wheel is Alpine, which uses musl libc. That means that if we shipped a C extension on that platform, we'd require the installation of a C toolchain for users to install the module: not great.

So the only option is, as you said, to make the module optional: we'd ideally want to check whether we could compile it and, if we couldn't, we'd simply fall back to pure-Python code.

I have extremely mixed feelings about this, most of them not good. In particular, it's a bit of a hairy debugging issue: what code the user is executing becomes conditional on difficult-to-introspect features of their platform; we double the number of code paths that need to be tested and debugged; and we double the workload for feature addition and API changes.

The TL;DR here, I think, is that right now I'm about -0.25 on having a native extension. If we could come up with a nice Rust extension then I'd reconsider, but I'd really rather not maintain a C extension if it can possibly be avoided.

Lukasa · 2016-12-12T11:08:08Z

In fact, let me state an outline of what we'd need to make me happy here.

If any contributor can write a Rust extension that can be optionally compiled if a buildchain is present, that flags its presence or absence clearly in a module-scoped dunder variable, that uses CFFI, and that doesn't vomit a load of output into the terminal on install on systems that do not have a Rust toolchain present, I'd be willing to consider merging such an extension.

Lukasa · 2016-12-12T12:14:45Z

I've already written a Rust extension module for Python, so that's not the hard bit. The harder bits are a) making it fail gracefully, b) keeping it maintained, and c) actually writing the extension. ;)
