Normalizer error due to invalid number characters #150

Open
Kensvin28 opened this issue Jun 27, 2023 · 2 comments
Labels
wontfix This will not be worked on

Comments

@Kensvin28

File C:/Users/PAVILION/AppData/Local/Programs/Python/Python310/lib/site-packages/malaya/normalizer/rules.py:165, in check_repeat(word)
    162         return word, 1
    164     if word[-1].isdigit() and not word[-2].isdigit():
--> 165         repeat = int(word[-1])
    166         word = word[:-1]
    167     else:

ValueError: invalid literal for int() with base 10: '²'

Characters such as ², ³, and some other Unicode characters, like those in the range U+2776 (❶) to U+2792 (➒), return True for str.isdigit() but cannot be converted by int(), so the call raises a ValueError.
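One possible way to avoid the crash is to test with str.isdecimal() instead of str.isdigit(), since isdecimal() rejects superscripts and circled digits. The sketch below is only an illustration: `split_trailing_repeat` is a hypothetical stand-in written for this comment, not malaya's actual `check_repeat`, and its fallback of returning a repeat count of 1 is an assumption.

```python
from typing import Tuple


def split_trailing_repeat(word: str) -> Tuple[str, int]:
    """Hypothetical helper (not malaya's check_repeat).

    Splits a single trailing decimal digit off a word and returns it
    as a repeat count. Falls back to (word, 1) when there is no such digit.
    """
    # isdecimal() is stricter than isdigit(): it is False for characters
    # like '²', which isdigit() accepts but int() cannot parse.
    if len(word) >= 2 and word[-1].isdecimal() and not word[-2].isdecimal():
        return word[:-1], int(word[-1])
    return word, 1


# The mismatch behind the reported error:
print("²".isdigit(), "²".isdecimal())

# A plain trailing digit is still split off; a superscript is left alone.
print(split_trailing_repeat("buku2"))
print(split_trailing_repeat("x²"))
```

Note that isdecimal() still accepts non-ASCII decimal digits (for example Arabic-Indic digits), which int() can convert. If only ASCII 0-9 should count, an explicit check such as `word[-1] in "0123456789"` would be a stricter alternative.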

@huseinzol05
Member

haha, nice one.

@MagusWyvern MagusWyvern added the wontfix This will not be worked on label Mar 7, 2024