
Block transform stores more data than needed #11

Open
martinholters opened this issue Sep 27, 2024 · 6 comments

@martinholters

Consider

const remain = blockSize + ((data.length % blockSize) || blockSize);
const align = data.length > remain ? data.length - remain : 0;

Here, remain is the amount of data to store and combine with the next chunk once it arrives, and align is the amount of data to process in this _transform call. I don't see why it should ever be necessary to store more than one blockSize's worth of data for later. Doesn't it suffice to do

 const remain = data.length % blockSize;     
 const align = data.length - remain; 

? Not that the current code is wrong - after the final flush, everything sorts out - but it adds unnecessary latency, which may prevent this from working in a duplex communication scenario. (I.e., I'm waiting to receive the answer to my last request before sending the next one, but my last request or the last answer is "stuck" in the transform.)
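
To make the arithmetic concrete, here's a small sketch (illustrative only, my own code, not the library's) comparing how much data the two formulas hold back, for blockSize = 8:

// Illustrative only: how many bytes each formula processes (align) vs.
// holds back in the tail (len - align), for blockSize = 8.
const blockSize = 8;

function split(len: number, remainOf: (len: number) => number) {
    const remain = remainOf(len);
    const align = len > remain ? len - remain : 0;
    return { align, tail: len - align };
}

// Current formula: always reserves one extra blockSize for the tail.
const current = (len: number) => blockSize + ((len % blockSize) || blockSize);
// Proposed formula: reserve only the incomplete trailing bytes.
const proposed = (len: number) => len % blockSize;

console.log(split(24, current));  // { align: 8, tail: 16 }  -> two blocks held back
console.log(split(24, proposed)); // { align: 24, tail: 0 }  -> everything processed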

@martinholters martinholters changed the title from "Block transform store more data than needed" to "Block transform stores more data than needed" Sep 27, 2024
@martinholters
Author

Ah, no, it seems that upon decoding, the very last block has to be kept around until the stream finishes in order to apply unpadding. We can only decide whether a block is the last one when either more data arrives or the stream is finished, and we must not unpad blocks before the end. Still, just

 const remain = (data.length % blockSize) || blockSize;

should suffice, i.e. it prevents the _tail from ever becoming completely empty. And the tests still pass with this. For encryption, I still think just data.length % blockSize should work.
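
As a minimal sketch (hypothetical helper, not the actual block.ts), the decrypt-side split then guarantees the tail always holds at least one full block for the final unpadding:

// Sketch: decrypt-side split that never lets the tail become empty, so the
// final ciphertext block is still around when the stream flushes and unpads.
const blockSize = 8;

function splitForDecrypt(len: number) {
    const remain = (len % blockSize) || blockSize; // at least one block's worth
    const align = len > remain ? len - remain : 0;
    return { align, tail: len - align };
}

console.log(splitForDecrypt(8));  // { align: 0, tail: 8 }   -> a lone block is held
console.log(splitForDecrypt(16)); // { align: 8, tail: 8 }   -> only the last block is held
console.log(splitForDecrypt(20)); // { align: 16, tail: 4 }  -> partial tail; more ciphertext must follow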

@tugrul
Owner

tugrul commented Sep 27, 2024

Padding is required in block encryption, even when data perfectly fits the block size, because padding schemes ensure that decryption can distinguish between data and padding. Without padding, there would be ambiguity during decryption, especially if the data matches the block size exactly. The padding bytes indicate how much padding to remove, making the decryption process reliable.
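
For illustration, taking ISO 7816-4 as an example scheme, here is a minimal sketch (illustrative only, not cryptian's actual implementation). Note how an input that already fills its blocks exactly still gains a whole extra padding block:

// Sketch of ISO 7816-4 padding: append a 0x80 marker byte, then zeros up to
// the next block boundary. An exact-multiple input gains a full extra block,
// which is what makes unpadding unambiguous.
const blockSize = 8;

function pad(data: Buffer): Buffer {
    const padLen = blockSize - (data.length % blockSize); // 1..blockSize, never 0
    const padding = Buffer.alloc(padLen);                 // zero-filled
    padding[0] = 0x80;                                    // marker byte
    return Buffer.concat([data, padding]);
}

function unpad(data: Buffer): Buffer {
    let i = data.length - 1;
    while (i >= 0 && data[i] === 0x00) i--; // skip trailing zeros
    return data.slice(0, i);                // drop the 0x80 marker as well
}

console.log(pad(Buffer.alloc(8, 1)).length);        // 16: one data block + one padding block
console.log(unpad(pad(Buffer.alloc(8, 1))).length); // 8: original data recovered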

If you still have concerns about this issue, please write a test case to reproduce and verify the condition. The test should simulate the specific state or condition and confirm whether the issue occurs as described. This will help determine whether the error is legitimate and whether further action is needed.

@martinholters
Author

martinholters commented Sep 28, 2024

EDIT: Fixed a typo in the first version of the script.


Sorry, it appears I wasn't quite clear about what my concern is. It's not about correctness per se, but about efficiency in a certain sense - so it's not an error. Let me try to give a small example:

const {Transform} = require('node:stream');
const {default: {algorithm, mode}, padding, createDecryptStream, createEncryptStream} = require('cryptian');

const des = new algorithm.Des();
des.setKey(Buffer.alloc(8));

const encrypter = new mode.cbc.Cipher(des, Buffer.alloc(8));
const decrypter = new mode.cbc.Decipher(des, Buffer.alloc(8));

const encryptStream = createEncryptStream(encrypter, padding.Iso7816);
const decryptStream = createDecryptStream(decrypter, padding.Iso7816);

encryptStream.pipe(
  new Transform({
    transform(chunk, encoding, callback) {
      console.log('encrypted: ', chunk);
      callback(null, chunk);
    },
  }),
).pipe(decryptStream);
decryptStream.on('data', (chunk) => {
  console.log('decrypted: ', chunk);
});
decryptStream.on('finish', () => {
  console.log('finished');
});

console.log('Encrypting buffer 1');
encryptStream.write(Buffer.alloc(8, 0));
console.log('Encrypting buffer 2');
encryptStream.write(Buffer.alloc(8, 1));
console.log('Encrypting buffer 3');
encryptStream.write(Buffer.alloc(8, 2));
console.log('Encrypting buffer 4');
encryptStream.write(Buffer.alloc(8, 3));
console.log('Encrypting buffer 5');
encryptStream.write(Buffer.alloc(8, 4));
console.log('\nEnding encryption');
encryptStream.end();

This gives the following output:

Encrypting buffer 1
Encrypting buffer 2
Encrypting buffer 3
encrypted:  <Buffer 8c a6 4d e9 c1 b1 23 a7>
Encrypting buffer 4
encrypted:  <Buffer c5 b4 3c e7 b2 b1 c5 c3>
Encrypting buffer 5
encrypted:  <Buffer 39 bf bf 4d 07 65 e7 c9>
decrypted:  <Buffer 00 00 00 00 00 00 00 00>

Ending encryption
encrypted:  <Buffer 96 7b 7e f0 5e 63 bf 01 c5 b4 0f 31 05 76 45 3c d4 9a 1f 81 08 28 9e cd>
decrypted:  <Buffer 01 01 01 01 01 01 01 01 02 02 02 02 02 02 02 02 03 03 03 03 03 03 03 03>
decrypted:  <Buffer 04 04 04 04 04 04 04 04>
finished

Now with:

--- a/src/transform/block.ts
+++ b/src/transform/block.ts
@@ -30,7 +30,7 @@ export class Block extends Transform {
 
         data = Buffer.concat([this._tail, data]);
         
-        const remain = blockSize + ((data.length % blockSize) || blockSize);    
+        const remain = (data.length % blockSize) || blockSize;
         const align = data.length > remain ? data.length - remain : 0;
     
         this._tail = data.slice(align);

The output becomes:

Encrypting buffer 1
Encrypting buffer 2
encrypted:  <Buffer 8c a6 4d e9 c1 b1 23 a7>
Encrypting buffer 3
encrypted:  <Buffer c5 b4 3c e7 b2 b1 c5 c3>
decrypted:  <Buffer 00 00 00 00 00 00 00 00>
Encrypting buffer 4
encrypted:  <Buffer 39 bf bf 4d 07 65 e7 c9>
decrypted:  <Buffer 01 01 01 01 01 01 01 01>
Encrypting buffer 5
encrypted:  <Buffer 96 7b 7e f0 5e 63 bf 01>
decrypted:  <Buffer 02 02 02 02 02 02 02 02>

Ending encryption
encrypted:  <Buffer c5 b4 0f 31 05 76 45 3c d4 9a 1f 81 08 28 9e cd>
decrypted:  <Buffer 03 03 03 03 03 03 03 03 04 04 04 04 04 04 04 04>
finished

Note that the output data is exactly the same, but it arrives earlier. So IMHO, with that patch, the amount of data in flight is reduced while maintaining correctness. (At least npm test passes.)

In fact, I think one could use

        const remain = n*blockSize + ((data.length % blockSize) || blockSize);

for any non-negative integer n while maintaining correctness. So the question is: Why choose 1 over 0?

@tugrul
Owner

tugrul commented Sep 28, 2024

I think the problem is solved by the referenced branch. Block shifting is only necessary on the decrypt stream, so I separated out the implementation of block shifting.

> for any non-negative integer n while maintaining correctness. So the question is: Why choose 1 over 0?

I may not fully trust padding algorithms when the padding data is an empty buffer, but they work fine in this case because I wrote unit tests to cover this condition.

@martinholters
Author

> I think the problem is solved by the referenced branch.

Yes, that's exactly what I had been thinking of! 👍

> I may not fully trust padding algorithms when the padding data is an empty buffer,

That's a valid reason indeed 😄

And thinking about this: for the null and space padding, an empty last buffer could result either in padding with a full block of nulls (or spaces), or in no padding at all. If I read the code correctly, you're doing the former. Intuitively, I would have expected the latter, but there may be different trade-offs involved, so I'm not necessarily suggesting to change this; I just meant to point out that this case might be worth thinking about. And anyway, it's related but orthogonal to this issue.
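
To illustrate the two options (a hypothetical sketch, not the library's code):

// Sketch: two ways to null-pad an input that already fills its blocks exactly.
const blockSize = 8;
const data = Buffer.from('ABCDEFGH'); // exactly one block, no trailing nulls

// Option A (what the code appears to do): append a full block of nulls.
const optionA = Buffer.concat([data, Buffer.alloc(blockSize)]); // 16 bytes
// Option B: append no padding at all.
const optionB = data; // 8 bytes

// Unpadding (stripping trailing nulls) recovers the same plaintext either
// way, provided the real data does not itself end in a null byte.
function stripNulls(buf: Buffer): Buffer {
    let end = buf.length;
    while (end > 0 && buf[end - 1] === 0x00) end--;
    return buf.slice(0, end);
}
console.log(stripNulls(optionA).equals(stripNulls(optionB))); // true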

Similarly, for null and space padding, output could be pushed more aggressively when decrypting, as non-null (or non-space) data cannot, by definition, be part of the padding. So it would not be necessary to hold on to the last block if the corresponding plaintext doesn't end in a null (or space) byte. But I'm not sure it's worth it, as that sounds like it would require rather large changes for very limited profit.
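
The check itself would be cheap, something like (hypothetical sketch):

// Sketch: with null padding, a decrypted block whose last byte is non-null
// cannot end in padding, so it could safely be pushed downstream immediately
// instead of being held back until the stream flushes.
function canPushImmediately(plainBlock: Buffer): boolean {
    return plainBlock[plainBlock.length - 1] !== 0x00;
}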

@tugrul
Owner

tugrul commented Oct 6, 2024

I merged the branch and published it to npm.

Your suggestion could be implemented as an option in the space and null padding modules. Skipping the padding entirely for an empty buffer is not a good idea, because some implementations may expect the stream to always contain at least one null or space byte and could consider the data broken if none is present.

The best approach may be written down in an RFC somewhere, but I couldn't find any information in a quick search.
