Test failures in 2.5.0 with Python 3.12 #20

Open
eclipseo opened this issue Oct 26, 2023 · 4 comments

@eclipseo

Three tests (test_guess_encoding, test_petro_iso_encoded and test_predict_encoding) fail in 2.5.0 with Python 3.12:

============================= test session starts ==============================
platform linux -- Python 3.12.0, pytest-7.4.2, pluggy-1.3.0
rootdir: /builddir/build/BUILD/normality-2.5.0
collected 19 items
tests/test_normality.py .....F..F.F....                                  [ 78%]
tests/test_paths.py ..                                                   [ 89%]
tests/test_scripts.py ..                                                 [100%]
=================================== FAILURES ===================================
______________________ NormalityTest.test_guess_encoding _______________________
self = <tests.test_normality.NormalityTest testMethod=test_guess_encoding>
    def test_guess_encoding(self):
        text = u"Порошенко Петро Олексійович"
        encoded = text.encode("iso-8859-5")
        out = guess_encoding(encoded)
>       self.assertEqual("iso8859-5", out)
E       AssertionError: 'iso8859-5' != 'cp1006'
E       - iso8859-5
E       + cp1006
tests/test_normality.py:72: AssertionError
_____________________ NormalityTest.test_petro_iso_encoded _____________________
self = <tests.test_normality.NormalityTest testMethod=test_petro_iso_encoded>
    def test_petro_iso_encoded(self):
        text = u"Порошенко Петро Олексійович"
        encoded = text.encode("iso8859-5")
        out = stringify(encoded)
>       self.assertEqual(text, out)
E       AssertionError: 'Порошенко Петро Олексійович' != 'ﺟﻐﻓﻐﻟﻁﻏﻌﻐ ﺟﻁﻗﻓﻐ ﺝﻍﻁﻌﻕﺉﻋﻐﺻﻊﻝ'
E       - Порошенко Петро Олексійович
E       + ﺟﻐﻓﻐﻟﻁﻏﻌﻐ ﺟﻁﻗﻓﻐ ﺝﻍﻁﻌﻕﺉﻋﻐﺻﻊﻝ
tests/test_normality.py:94: AssertionError
_____________________ NormalityTest.test_predict_encoding ______________________
self = <tests.test_normality.NormalityTest testMethod=test_predict_encoding>
    def test_predict_encoding(self):
        text = u"Порошенко Петро Олексійович"
        encoded = text.encode("iso-8859-5")
        out = predict_encoding(encoded)
>       self.assertEqual("iso8859-5", out)
E       AssertionError: 'iso8859-5' != 'cp1006'
E       - iso8859-5
E       + cp1006
tests/test_normality.py:78: AssertionError
=============================== warnings summary ===============================
tests/test_normality.py::NormalityTest::test_guess_encoding
  /builddir/build/BUILD/normality-2.5.0/normality/encoding.py:76: DeprecationWarning: guess_encoding is now deprecated. Use predict_encoding instead
    warnings.warn(
tests/test_normality.py::NormalityTest::test_guess_file_encoding
  /builddir/build/BUILD/normality-2.5.0/normality/encoding.py:95: DeprecationWarning: guess_encoding is now deprecated. Use predict_encoding instead
    warnings.warn(
tests/test_normality.py::NormalityTest::test_guess_file_encoding
  /builddir/build/BUILD/normality-2.5.0/normality/encoding.py:41: DeprecationWarning: normalize_result is now deprecated. Use tidy_result instead
    warnings.warn(
tests/test_normality.py::NormalityTest::test_guess_file_encoding
  /builddir/build/BUILD/normality-2.5.0/normality/encoding.py:16: DeprecationWarning: normalize_encoding is now deprecated. Use tidy_encoding instead
    warnings.warn(
tests/test_normality.py::NormalityTest::test_stringify_datetime
  /builddir/build/BUILD/normality-2.5.0/tests/test_normality.py:64: DeprecationWarning: datetime.datetime.utcnow() is deprecated and scheduled for removal in a future version. Use timezone-aware objects to represent datetimes in UTC: datetime.datetime.now(datetime.UTC).
    dt = datetime.utcnow()
-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
=========================== short test summary info ============================
FAILED tests/test_normality.py::NormalityTest::test_guess_encoding - Assertio...
FAILED tests/test_normality.py::NormalityTest::test_petro_iso_encoded - Asser...
FAILED tests/test_normality.py::NormalityTest::test_predict_encoding - Assert...
=================== 3 failed, 16 passed, 5 warnings in 0.21s ===================
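The failures above can be illustrated with the standard library alone. This is only a sketch of why the wrong guess goes unnoticed, not of how the detector works: ISO-8859-5 and cp1006 are both single-byte codecs, so the same bytes decode without an error under either one, and only the resulting text differs. The helper name `decode_both` is made up for this example.

```python
def decode_both(text: str) -> tuple[str, str]:
    """Encode text as ISO-8859-5, then decode it with both candidate codecs."""
    encoded = text.encode("iso8859-5")
    # Correct codec: round-trips to the original text.
    as_iso = encoded.decode("iso8859-5")
    # Wrongly detected codec: decodes without raising, but yields different characters.
    as_cp1006 = encoded.decode("cp1006")
    return as_iso, as_cp1006


text = "Порошенко Петро Олексійович"
as_iso, as_cp1006 = decode_both(text)
print(as_iso == text)     # the round trip is expected to preserve the text
print(as_cp1006 == text)  # the cp1006 decoding is expected to differ
```

Since neither decode raises, the tests can only catch the problem by comparing the detected encoding name or the decoded string, which is what the three failing assertions do.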
@pudo
Owner

pudo commented Oct 27, 2023

Thanks for reporting this - looks super funky (cp1006 is Urdu, as far as I can tell). Can you tell me what version of charset-normalizer you have installed?
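One way to look up the installed version from inside the build environment, as a minimal sketch: it uses `importlib.metadata` from the standard library and assumes the distribution is registered under the name `charset-normalizer`. The helper name `installed_version` is made up for this example.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Prints the version string, or None when the package is not installed.
print(installed_version("charset-normalizer"))
```

Running `pip show charset-normalizer` in the same environment should report the same information.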

@eclipseo
Author

charset-normalizer has a bug with incorrect detection in 3.3.1 (jawah/charset_normalizer#371), though that report does not involve cp1006.

@Ousret

Ousret commented Nov 1, 2023

Is this fixed? I issued a release to address this.
