Encoding problem with accent marks #195

solpahi · 2015-09-29T15:37:35Z

http://jbovlaste.lojban.org/dict/berl%C3%83%C6%92%C3%86%E2%80%99%C3%83%E2%80%9A%C3%82%C2%ACn

This was a test to see if accent marks are supported (for stress purposes in cmevla). It was recognized as a cmevla, but the encoding is all messed up.
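For anyone trying to read the slug in that URL, here is a minimal sketch that percent-decodes it and then peels off mis-decoding layers. It assumes each layer came from UTF-8 bytes being re-read as Windows-1252, which is a guess about the site and not something confirmed in this thread. `undo_round` and `unwind` are hypothetical helper names, not part of jbovlaste.

```python
from urllib.parse import unquote

# Slug copied verbatim from the URL above.
SLUG = "berl%C3%83%C6%92%C3%86%E2%80%99%C3%83%E2%80%9A%C3%82%C2%ACn"


def undo_round(text: str) -> str:
    """Reverse one layer of 'UTF-8 bytes read as cp1252' (assumed cause)."""
    return text.encode("cp1252").decode("utf-8")


def unwind(text: str, max_rounds: int = 10):
    """Peel off layers until the text no longer looks double-encoded."""
    rounds = 0
    while rounds < max_rounds:
        try:
            candidate = undo_round(text)
        except (UnicodeEncodeError, UnicodeDecodeError):
            break  # not reversible any further, so treat this as the original
        if candidate == text:
            break  # plain ASCII: nothing left to undo
        text = candidate
        rounds += 1
    return text, rounds


garbled = unquote(SLUG)              # percent-decoding is UTF-8 by default
original, rounds = unwind(garbled)
print(ascii(garbled), ascii(original), rounds)
```

Under that assumption the slug unwinds cleanly, which suggests the name was mis-decoded more than once on its way into the URL.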

lynn · 2023-03-08T18:11:41Z

This is still an issue. If you go to https://jbovlaste.lojban.org/dict/berlín and try to add the word, it gets increasingly krakozabra as you go through the steps (berlÃn, then berlÃƒÂn, etc).
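The growth pattern described here can be reproduced with a short sketch. It assumes each step re-reads UTF-8 bytes as Windows-1252; that is an assumption about what the site does, chosen because it matches the strings quoted above. `misdecode` is a hypothetical name used only for illustration.

```python
def misdecode(text: str, rounds: int = 1) -> str:
    """Simulate `rounds` passes of UTF-8 bytes being read as cp1252.

    Assumption: this is the kind of mistake happening at each step of
    the form; the actual cause in jbovlaste is not confirmed here.
    """
    for _ in range(rounds):
        text = text.encode("utf-8").decode("cp1252")
    return text


word = "berl\u00edn"  # "berlín"
for n in range(3):
    # ascii() makes invisible characters such as U+00AD (soft hyphen) visible
    print(n, ascii(misdecode(word, n)))
```

The soft hyphen (U+00AD) that appears after the first pass is normally invisible, which would explain why the first garbled form displays as just "berlÃn". Each pass also roughly doubles the number of non-ASCII characters, so the string keeps growing as long as it is re-encoded.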

rlpowell · 2023-03-11T07:58:09Z

I poked in a few places, but I don't actually have an accented valsi I want to add, so I haven't really tested it end-to-end.

Having said that, "berlín" is now recognized as nalvla, which isn't surprising to me as I don't think any of the parsers attempted to handle this.

The actual test that's being run here is vlatai.py from https://github.com/teleological/camxes-py , as far as I can tell.

I have not committed my current changes; let me know what you think of them.

rlpowell · 2023-03-12T17:06:28Z

Went ahead and checked in what I did, as it certainly doesn't make anything worse.
