Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

decode_utf8 bug in FiSH_WriteKey10 #68

Open
maroonbells opened this issue Oct 12, 2020 · 1 comment
Open

decode_utf8 bug in FiSH_WriteKey10 #68

maroonbells opened this issue Oct 12, 2020 · 1 comment

Comments

@maroonbells
Copy link
Contributor

The 'decode_utf8' feature in FiSH_WriteKey10 and FiSH_GetKey10 is defective, in that it doesn't correctly handle the entire 1-255 range of byte values, and it doesn't reject inputs that it can't securely decode into single bytes.

The effect is that if your key contains any of $rand($chr(145),$chr(156)) it's the equivalent of using $chr(63).

On the one hand, this 'decode_utf8' feature is attractive compared to 'raw_bytes' because it allows more possible keys within the 56 length, by storing the byte values in the 128-255 range as single bytes - instead of the way UTF8 needs 2 bytes to store them. Since current mIRC versions are using UTF8 encoding, it's impossible for the raw_bytes input to receive anything that's not a valid UTF8 string, which means that using characters in the 128-255 range actually restricts the number of bytes which can fit into the key, because only 28 of those characters can fit within the 56 bytes.

I assumed that, since 0x00 is not being permitted as part of the text string, that 'decode_utf8' would at least allow each byte of the key to be any of the values 1-255. However, for 27 of the bytes in the 128-159 range, it stores them in the key as the "?" question mark, and for characters in the 256+ range it stores them in the key as single byte values which seem unrelated to the actual codepoint.

Since the purpose of 'decode_utf8' is to allow keys to be stored in a more compact form within the 56 length, I propose that using 'decode_utf8' should reject the key as invalid if it's being required to decode a codepoint above 255, and the range 128-255 should either be fixed so that it can decode that entire range correctly, or else it should reject the key instead of decoding any of those characters into the 0x3f question mark.

The following table shows the byte stored into the key when each of the codepoints in the 128-255 range is input into the key. I used the following command to obtain the results from the 1st 56 bytes, then adjusted the 127 to get the rest of the range.

//echo -a $dll(%FiSH_dll,FiSH_WriteKey10,decode_utf8 foo bar $regsubex(foo,$str(x,56),/x/g,$chr($calc(\n + 127))))

If 'decode_utf8' handled all bytes in the 128-255 range correctly, then this list of the bytes in the password would contain the hex digits from 80 to FF. Instead, most of the byte values in the 0x80-0x9f range are replaced by 0x3f questionmark in the key:

3f.81.3f.3f.3f.3f.3f.3f.3f.3f.3f.3f.3f.8d.3f.8f.
90.3f.3f.3f.3f.3f.3f.3f.3f.3f.3f.3f.3f.9d.3f.3f.
a0.a1.a2.a3.a4.a5.a6.a7.a8.a9.aa.ab.ac.ad.ae.af.
b0.b1.b2.b3.b4.b5.b6.b7.b8.b9.ba.bb.bc.bd.be.bf.
c0.c1.c2.c3.c4.c5.c6.c7.c8.c9.ca.cb.cc.cd.ce.cf.
d0.d1.d2.d3.d4.d5.d6.d7.d8.d9.da.db.dc.dd.de.df.
e0.e1.e2.e3.e4.e5.e6.e7.e8.e9.ea.eb.ec.ed.ee.ef.
f0.f1.f2.f3.f4.f5.f6.f7.f8.f9.fa.fb.fc.fd.fe.ff.

For $chr(256) and above, here's the hex encoding of how the 56 bytes in the range 256-311 are placed into the key as a single byte each:

41.61.41.61.41.61.43.63.43.63.43.63.43.63.44.64.
d0.64.45.65.45.65.45.65.45.65.45.65.47.67.47.67.
47.67.47.67.48.68.48.68.49.69.49.69.49.69.49.69.
49.69.3f.3f.4a.6a.4b.6b

When this encoding issue is fixed, it should be possible to create a channel key like this without having around 10% of all 'random' bytes being replaced by identical "?". Special handling isn't needed for excluding the 0x00 because mIRC returns $null for $chr(0).

//clipboard $left($regsubex(foo,$sha512(test),/(..)/g,$chr($base(\t,16,10))),56) | echo -a $cb

Until the bug is fixed, you would have to exclude the codepoints which are being badly handled. A lookup table could be created to identify only the 27 codepoints being handled wrong, or could exclude this range of 32 bytes from your key:

//clipboard $left($regsubex(foo,$sha512(test),/(..)/g,$iif($base(\t,16,10) !isnum 128-159,$chr($v1))),56) | echo -a $cb

@flakes
Copy link
Owner

flakes commented Oct 13, 2020

Oh the joys of legacy and backwards compat ... frankly I'm not sure if this is even worth fixing or if we should just start a new encryption plugin (e.g. based on libsodium) to avoid all these headaches ...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants