Failures with quotes #228

phajdan · 2014-12-20T16:20:47Z

Running hunspell in Emacs, or directly with "hunspell -a": single quote is taken as part of
the word, which leads to tons of bogus spelling suggestions. I'd expect something like this
to be known, but a few searches got me nowhere. This is using the version that comes with
Fedora 20: 1.3.3.

Original comment by: elibarzilay

Original Ticket: hunspell/bugs/259

tbm · 2018-04-06T16:09:40Z

I see the same issue with 1.4.1.

tbm · 2018-04-06T16:13:40Z

Still there in 1.6.2. Simple test case:

The next stable release of Debian is `buster'

hunspell will report buster' (with the trailing quote) as the misspelled word.

tbm · 2018-04-06T16:28:50Z

This sounds related: en-wl/wordlist#122

tbm · 2018-04-06T16:29:13Z

Related to #504

lmmarsano · 2018-12-07T09:01:48Z

Same issue nearly 4 years later.
According to the ChangeLog

hunspell/ChangeLog

Lines 89 to 95 in 4ddd8ed

    * better apostrophe usage:
      - WORDCHARS only with one of the Unicode or ASCII apostrophe
        results extended word tokenization: both of them will be part of
        the words (if they are inside: eg. word's, but not words').
      - convert Unicode apostrophes to ASCII ones for 8-bit dictionaries
        (eg. English dictionaries), or for UTF-8 dictionaries only
        with ASCII apostrophe supports (eg. French dictionaries).

tokenization should treat interior apostrophes as part of words and exclude boundary apostrophes.
However, the test provided in lmmarsano/hunspell@c825888 fails the assertion; please check out that commit to see:

luism@lmm-notebook:~/project/hunspell/tests$ ./test.sh apostrophe.dic
=============================================
Fail in apostrophe.good. Good words recognised as wrong:
'is'

I wish I knew enough to PR a fix.
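To make the behavior described in the ChangeLog concrete, here is a minimal Python sketch of that tokenization rule: apostrophes between letters stay in the word, apostrophes at the edges are dropped. This is not Hunspell's implementation; the names `WORD_RE` and `tokenize` are made up for illustration, and it assumes only the ASCII `'` and the Unicode `’` count as apostrophes.

```python
import re

# Assumption: a "word" is a run of letters, optionally joined by single
# interior apostrophes (ASCII ' or Unicode U+2019). Digits and underscores
# are not treated as word characters in this sketch.
WORD_RE = re.compile(r"[^\W\d_]+(?:['\u2019][^\W\d_]+)*")


def tokenize(text):
    """Return the words in text, keeping interior apostrophes only."""
    return WORD_RE.findall(text)


# Quotes around a word are dropped, the apostrophe inside "can't" is kept.
print(tokenize("He asked, 'Why can't I quote?'"))
print(tokenize("The next stable release of Debian is `buster'"))
```

With this rule, the failing 'is' case above would tokenize to plain `is`, which is what the ChangeLog entry seems to promise.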

loretoparisi · 2019-03-14T16:28:39Z

This seems to happen for Italian as well:

In "Della morte dell'amore", the tokenizer yields dell, which is considered wrong with suggestions ["del","della","dello","delle","del l"], whereas the output for dell' (note the straight apostrophe) is

{ index: 0,
  word: 'dell\'',
  stems: [],
  suggestion: [ 'della', 'dello', 'delle' ],
  correct: false }

mcepl · 2019-08-27T16:19:35Z

Is this the root of this problem:

~$ echo "And no, spellchecking doesn’t work well in vim, because exactly this sentence is marked as misspelled." | hunspell -d en_US --check-apostrophe -l
doesn
~$

Hmm, with en_GB it seems to work, so I guess it is dictionary dependent.

astoff · 2023-03-05T10:57:47Z

This issue is still present in Hunspell 1.7.0, and it affects the en_GB dictionary too:

$ echo "He asked, 'Why can't I quote?'" | hunspell -d en_GB
Hunspell 1.7.0
*
*
& 'Why 1 10: why
*
*
*
& ' 15 29: e, s, i, a, n, r, t, o, l, c, d, u, g, m, f

Atemu · 2024-04-07T09:25:58Z

This is an issue with the dictionaries. The English hunspell dicts from https://sourceforge.net/projects/wordlist/files/speller/ contain this line in their aff files:

WORDCHARS 0123456789

but this must include the ' character like this:

WORDCHARS 0123456789'

in order to detect contractions like "doesn't" as a single word.

If I manually change the dict accordingly, it works as expected.
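For anyone who wants to script that manual change, a rough Python sketch follows. The helper name `add_apostrophe_to_wordchars` is hypothetical, and it works on the already-decoded text of an aff file, so reading and writing the file with the right encoding (many aff files are not UTF-8) is left to the caller. It only touches an existing WORDCHARS line and does not add one if the file has none.

```python
def add_apostrophe_to_wordchars(aff_text):
    """Append ' to the WORDCHARS line of aff_text if it is not already there."""
    out = []
    for line in aff_text.splitlines():
        parts = line.split(None, 1)
        if parts and parts[0] == "WORDCHARS":
            # Everything after the keyword is the set of extra word characters.
            chars = parts[1].rstrip() if len(parts) > 1 else ""
            if "'" not in chars:
                line = "WORDCHARS " + chars + "'"
        out.append(line)
    return "\n".join(out)


# Example with the line quoted above; other lines pass through unchanged.
print(add_apostrophe_to_wordchars("SET UTF-8\nWORDCHARS 0123456789"))
```

Back up the original aff file first, since a dictionary package update may overwrite the change.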

phajdan added sourceforge auto-migrated labels Jul 8, 2015

dimztimz removed v1.0 (example) labels Nov 13, 2017

tbm mentioned this issue Apr 6, 2018

Problems with apostrophes #504

Open

lmmarsano linked a pull request Dec 8, 2018 that will close this issue

fix #228 (failures with quotes) #613

Open

foghawk mentioned this issue Jun 16, 2019

spell: language-inappropriate tokenization of hyphenated words leads to false positives, errors weechat/weechat#1360

Open

astoff mentioned this issue Mar 25, 2023

Jit-spell causes noticeable increase in latency astoff/jit-spell#9

Closed

Jamim mentioned this issue Jan 29, 2024

What is the correct way to get hunspell to reconize the apostrophe? #641

Open

Atemu mentioned this issue Apr 7, 2024

Update request: english hunspellDicts 2018.04.16 -> 2020.12.07 / 2024-04-01 NixOS/nixpkgs#302305

Open
