
Block transform stores more data than needed #11

Open
martinholters opened this issue Sep 27, 2024 · 6 comments

@martinholters

Consider

const remain = blockSize + ((data.length % blockSize) || blockSize);
const align = data.length > remain ? data.length - remain : 0;

Here, remain is the amount of data to store and combine with the next chunk once it arrives, and align is the amount of data to process in this _transform call. I don't see why it should ever be necessary to store more than one blockSize's worth of data for later. Doesn't it suffice to do

 const remain = data.length % blockSize;     
 const align = data.length - remain; 

? Not that the current code is wrong - after the final flush, everything sorts out - but it adds unnecessary latency, which may prevent this from working in a duplex communication scenario. (I.e., I'm waiting to receive the answer to my last request before sending the next one, but my last request or the last answer is "stuck" in the transform.)
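
To make the arithmetic concrete, here's a small sketch (illustrative only, my own code, not the library's) comparing how much data the two formulas hold back, for blockSize = 8:

// Illustrative only: how many bytes each formula processes (align) vs.
// holds back in the tail (len - align), for blockSize = 8.
const blockSize = 8;

function split(len: number, remainOf: (len: number) => number) {
    const remain = remainOf(len);
    const align = len > remain ? len - remain : 0;
    return { align, tail: len - align };
}

// Current formula: always reserves one extra blockSize for the tail.
const current = (len: number) => blockSize + ((len % blockSize) || blockSize);
// Proposed formula: reserve only the incomplete trailing bytes.
const proposed = (len: number) => len % blockSize;

console.log(split(24, current));  // { align: 8, tail: 16 }  -> two blocks held back
console.log(split(24, proposed)); // { align: 24, tail: 0 }  -> everything processed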

@martinholters martinholters changed the title from "Block transform store more data than needed" to "Block transform stores more data than needed" Sep 27, 2024
@martinholters
Author

Ah, no, it seems that upon decoding, the very last block has to be kept around until the stream finishes in order to apply unpadding. We can only decide whether a block is the last one when either more data arrives or the stream is finished, and we must not unpad blocks before the end. Still, just

 const remain = (data.length % blockSize) || blockSize;

should suffice, i.e. it prevents the _tail from ever becoming completely empty. And the tests still pass with this. For encryption, I still think just data.length % blockSize should work.
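
As a minimal sketch (hypothetical helper, not the actual block.ts), the decrypt-side split then guarantees the tail always holds at least one full block for the final unpadding:

// Sketch: decrypt-side split that never lets the tail become empty, so the
// final ciphertext block is still around when the stream flushes and unpads.
const blockSize = 8;

function splitForDecrypt(len: number) {
    const remain = (len % blockSize) || blockSize; // at least one block's worth
    const align = len > remain ? len - remain : 0;
    return { align, tail: len - align };
}

console.log(splitForDecrypt(8));  // { align: 0, tail: 8 }   -> a lone block is held
console.log(splitForDecrypt(16)); // { align: 8, tail: 8 }   -> only the last block is held
console.log(splitForDecrypt(20)); // { align: 16, tail: 4 }  -> partial tail; more ciphertext must follow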

@tugrul
Owner

tugrul commented Sep 27, 2024

Padding is required in block encryption, even when data perfectly fits the block size, because padding schemes ensure that decryption can distinguish between data and padding. Without padding, there would be ambiguity during decryption, especially if the data matches the block size exactly. The padding bytes indicate how much padding to remove, making the decryption process reliable.
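
For illustration, taking ISO 7816-4 as an example scheme, here is a minimal sketch (illustrative only, not cryptian's actual implementation). Note how an input that already fills its blocks exactly still gains a whole extra padding block:

// Sketch of ISO 7816-4 padding: append a 0x80 marker byte, then zeros up to
// the next block boundary. An exact-multiple input gains a full extra block,
// which is what makes unpadding unambiguous.
const blockSize = 8;

function pad(data: Buffer): Buffer {
    const padLen = blockSize - (data.length % blockSize); // 1..blockSize, never 0
    const padding = Buffer.alloc(padLen);                 // zero-filled
    padding[0] = 0x80;                                    // marker byte
    return Buffer.concat([data, padding]);
}

function unpad(data: Buffer): Buffer {
    let i = data.length - 1;
    while (i >= 0 && data[i] === 0x00) i--; // skip trailing zeros
    return data.slice(0, i);                // drop the 0x80 marker as well
}

console.log(pad(Buffer.alloc(8, 1)).length);        // 16: one data block + one padding block
console.log(unpad(pad(Buffer.alloc(8, 1))).length); // 8: original data recovered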

If you still have concerns about this issue, please write a test case to reproduce and verify the condition. The test should simulate the specific state or condition and confirm whether the issue occurs as described. This will help determine whether the error is legitimate and whether further action is needed.

@martinholters
Author

martinholters commented Sep 28, 2024

EDIT: Fixed a typo in the first version of the script.


Sorry, it appears I wasn't quite clear about what my concern is. It's not about correctness per se, but about efficiency in a certain sense - so it's not an error. Let me try to give a small example:

const {Transform} = require('node:stream');
const {default: {algorithm, mode}, padding, createDecryptStream, createEncryptStream} = require('cryptian');

const des = new algorithm.Des();
des.setKey(Buffer.alloc(8));

const encrypter = new mode.cbc.Cipher(des, Buffer.alloc(8));
const decrypter = new mode.cbc.Decipher(des, Buffer.alloc(8));

const encryptStream = createEncryptStream(encrypter, padding.Iso7816);
const decryptStream = createDecryptStream(decrypter, padding.Iso7816);

encryptStream.pipe(
  new Transform({
    transform(chunk, encoding, callback) {
      console.log('encrypted: ', chunk);
      callback(null, chunk);
    },
  }),
).pipe(decryptStream);
decryptStream.on('data', (chunk) => {
  console.log('decrypted: ', chunk);
});
decryptStream.on('finish', () => {
  console.log('finished');
});

console.log('Encrypting buffer 1');
encryptStream.write(Buffer.alloc(8, 0));
console.log('Encrypting buffer 2');
encryptStream.write(Buffer.alloc(8, 1));
console.log('Encrypting buffer 3');
encryptStream.write(Buffer.alloc(8, 2));
console.log('Encrypting buffer 4');
encryptStream.write(Buffer.alloc(8, 3));
console.log('Encrypting buffer 5');
encryptStream.write(Buffer.alloc(8, 4));
console.log('\nEnding encryption');
encryptStream.end();

This gives the following output:

Encrypting buffer 1
Encrypting buffer 2
Encrypting buffer 3
encrypted:  <Buffer 8c a6 4d e9 c1 b1 23 a7>
Encrypting buffer 4
encrypted:  <Buffer c5 b4 3c e7 b2 b1 c5 c3>
Encrypting buffer 5
encrypted:  <Buffer 39 bf bf 4d 07 65 e7 c9>
decrypted:  <Buffer 00 00 00 00 00 00 00 00>

Ending encryption
encrypted:  <Buffer 96 7b 7e f0 5e 63 bf 01 c5 b4 0f 31 05 76 45 3c d4 9a 1f 81 08 28 9e cd>
decrypted:  <Buffer 01 01 01 01 01 01 01 01 02 02 02 02 02 02 02 02 03 03 03 03 03 03 03 03>
decrypted:  <Buffer 04 04 04 04 04 04 04 04>
finished

Now with:

--- a/src/transform/block.ts
+++ b/src/transform/block.ts
@@ -30,7 +30,7 @@ export class Block extends Transform {
 
         data = Buffer.concat([this._tail, data]);
         
-        const remain = blockSize + ((data.length % blockSize) || blockSize);    
+        const remain = (data.length % blockSize) || blockSize;
         const align = data.length > remain ? data.length - remain : 0;
     
         this._tail = data.slice(align);

The output becomes:

Encrypting buffer 1
Encrypting buffer 2
encrypted:  <Buffer 8c a6 4d e9 c1 b1 23 a7>
Encrypting buffer 3
encrypted:  <Buffer c5 b4 3c e7 b2 b1 c5 c3>
decrypted:  <Buffer 00 00 00 00 00 00 00 00>
Encrypting buffer 4
encrypted:  <Buffer 39 bf bf 4d 07 65 e7 c9>
decrypted:  <Buffer 01 01 01 01 01 01 01 01>
Encrypting buffer 5
encrypted:  <Buffer 96 7b 7e f0 5e 63 bf 01>
decrypted:  <Buffer 02 02 02 02 02 02 02 02>

Ending encryption
encrypted:  <Buffer c5 b4 0f 31 05 76 45 3c d4 9a 1f 81 08 28 9e cd>
decrypted:  <Buffer 03 03 03 03 03 03 03 03 04 04 04 04 04 04 04 04>
finished

Note that the output data is exactly the same, but it arrives earlier. So IMHO, with that patch, the amount of data in flight is reduced while maintaining correctness. (At least npm test passes.)

In fact, I think one could use

        const remain = n*blockSize + ((data.length % blockSize) || blockSize);

for any non-negative integer n while maintaining correctness. So the question is: Why choose 1 over 0?

@tugrul
Owner

tugrul commented Sep 28, 2024

I think the problem is solved by the referenced branch. Block shifting is only necessary on the decrypt stream, so I separated out the implementation of block shifting.

> for any non-negative integer n while maintaining correctness. So the question is: Why choose 1 over 0?

I may not fully trust padding algorithms when the padding data is an empty buffer, but they work fine in this case because I wrote unit tests to cover this condition.

@martinholters
Author

> I think the problem is solved by the referenced branch.

Yes, that's exactly what I had been thinking of! 👍

> I may not fully trust padding algorithms when the padding data is an empty buffer,

That's a valid reason indeed 😄

And thinking about this: for the null and space padding, an empty last buffer could result either in padding with a full block of nulls (or spaces), or in no padding at all. If I read the code correctly, you're doing the former. Intuitively, I would have expected the latter, but there may be different trade-offs involved, so I'm not necessarily suggesting to change this; I just meant to point out that this case might be worth thinking about. And anyway, it's related but orthogonal to this issue.
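
To illustrate the two options (a hypothetical sketch, not the library's code):

// Sketch: two ways to null-pad an input that already fills its blocks exactly.
const blockSize = 8;
const data = Buffer.from('ABCDEFGH'); // exactly one block, no trailing nulls

// Option A (what the code appears to do): append a full block of nulls.
const optionA = Buffer.concat([data, Buffer.alloc(blockSize)]); // 16 bytes
// Option B: append no padding at all.
const optionB = data; // 8 bytes

// Unpadding (stripping trailing nulls) recovers the same plaintext either
// way, provided the real data does not itself end in a null byte.
function stripNulls(buf: Buffer): Buffer {
    let end = buf.length;
    while (end > 0 && buf[end - 1] === 0x00) end--;
    return buf.slice(0, end);
}
console.log(stripNulls(optionA).equals(stripNulls(optionB))); // true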

Similarly, for null and space padding, output could be pushed more aggressively when decrypting, as non-null (or non-space) data cannot, by definition, be part of the padding. So it would not be necessary to hold on to the last block if the corresponding plaintext doesn't end in a null (or space) byte. But I'm not sure it's worth it, as that sounds like it would require rather large changes for very limited profit.
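
The check itself would be cheap, something like (hypothetical sketch):

// Sketch: with null padding, a decrypted block whose last byte is non-null
// cannot end in padding, so it could safely be pushed downstream immediately
// instead of being held back until the stream flushes.
function canPushImmediately(plainBlock: Buffer): boolean {
    return plainBlock[plainBlock.length - 1] !== 0x00;
}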

@tugrul
Owner

tugrul commented Oct 6, 2024

I merged the branch and published it to npm.

Your suggestion could be implemented as an option in the space and null padding modules. Skipping the padding entirely for an empty buffer is not a good idea, because some implementations may expect the stream to always contain at least one null or space byte and could consider the data broken if none is present.

The best approach may be written down in an RFC somewhere, but I couldn't find any information in a quick search.
