The Turkish Hunspell .aff file has more than 50,000 affixes, all of which say N (No) in the suffix header. They also include very long suffixes, for example:
SFX 3 N 1
SFX 3 0 cilerdensin .
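For anyone who wants to check the "all N" observation, here is a minimal sketch that tallies the third field of every SFX header. It assumes the header layout `SFX flag cross_product count` described in the hunspell(5) man page, with `count` rule lines following each header; the function name `count_suffix_headers` and the second sample entry are made up for illustration.

```python
from collections import Counter


def count_suffix_headers(aff_text: str) -> Counter:
    """Tally the third field (Y/N) of every SFX header line.

    Assumes each header is `SFX flag cross_product count` and is
    followed by exactly `count` rule lines, which are skipped.
    """
    counts: Counter = Counter()
    remaining_rules = 0
    for line in aff_text.splitlines():
        fields = line.split()
        if len(fields) < 4 or fields[0] != "SFX":
            continue
        if remaining_rules > 0:
            # This is a rule line belonging to the previous header.
            remaining_rules -= 1
            continue
        counts[fields[2]] += 1
        remaining_rules = int(fields[3])
    return counts


# The first entry is the one quoted above; the second is a made-up example.
sample = """\
SFX 3 N 1
SFX 3 0 cilerdensin .
SFX A Y 2
SFX A 0 s .
SFX A y ies [^aeiou]y
"""
header_counts = count_suffix_headers(sample)
print(dict(header_counts))
```

Running the same function over the full Turkish .aff file should show whether every header really carries N.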
According to this blog post, there are around 700 suffixes, by my last count. They can be combined in arbitrary ways, sometimes with more than 10 suffixes concatenated onto the base word. I would think that in principle you would store some sort of directed acyclic graph of suffixes, so that possible/theoretical words that have never been encountered before can be computed dynamically. Instead, it appears the Hunspell Turkish dictionary precompiles the possible suffix chains and lists each one as SFX ... N (no chaining). Am I reading that correctly?
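To make the graph idea concrete, here is a rough sketch of what I have in mind. It is not how Hunspell works internally. The states, the edges, and the split of `cilerdensin` into `ci` + `ler` + `den` + `sin` are my own assumptions for illustration, `stem` is a placeholder base word, and vowel harmony is ignored entirely.

```python
from typing import Dict, List, Tuple

# Hypothetical suffix graph: state -> list of (suffix, next_state) edges.
# An empty suffix is an edge that lets the word stop at that state.
# The graph has no cycles, so every walk terminates.
SUFFIX_GRAPH: Dict[str, List[Tuple[str, str]]] = {
    "STEM": [("ci", "AGENT"), ("ler", "PLURAL"), ("", "END")],
    "AGENT": [("ler", "PLURAL"), ("", "END")],
    "PLURAL": [("den", "ABLATIVE"), ("", "END")],
    "ABLATIVE": [("sin", "END"), ("", "END")],
    "END": [],
}


def accepts(word: str, stem: str,
            graph: Dict[str, List[Tuple[str, str]]] = SUFFIX_GRAPH) -> bool:
    """Return True if `word` is `stem` plus a suffix chain allowed by `graph`."""
    if not word.startswith(stem):
        return False

    def walk(rest: str, state: str) -> bool:
        if state == "END":
            return rest == ""
        return any(
            rest.startswith(suffix) and walk(rest[len(suffix):], next_state)
            for suffix, next_state in graph[state]
        )

    return walk(word[len(stem):], "STEM")


print(accepts("stemcilerdensin", "stem"))  # chain ci + ler + den + sin
print(accepts("stemdenler", "stem"))       # "den" is not allowed directly after the stem
```

With a graph like this, four short suffix entries cover every chain that would otherwise need its own precompiled SFX line.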
In newer Hunspell, is there a more idiomatic way of solving this with fewer suffixes?
I feel like I read somewhere that Hunspell can only support 2 prefixes or 2 suffixes, or 1 of each together. Is a limit like that the issue here, and the reason the Turkish dictionary is organized this way?
Thank you so much for your help!