Update TestDeepScan.py #1357

Open: 1 commit to be merged into base branch master

@verdy-p

verdy-p commented Apr 19, 2022

Safer exclusion of *~ backup files: names starting with a dot are skipped, and the last character before ~ may also be a digit (e.g. index.php7~ or README2~).

@abitrolly
Contributor

Tests are broken. Are all commits pushed to this PR?

@az0
Member

az0 commented Apr 28, 2022

@verdy-p Yes, please address the build error that @abitrolly mentioned.

For convenience, here is the error from AppVeyor:

======================================================================
ERROR: test_DeepScan (tests.TestDeepScan.DeepScanTestCase)
Unit test for class DeepScan.  Preview real files.
----------------------------------------------------------------------
Traceback (most recent call last):
  File "C:\projects\bleachbit\tests\TestDeepScan.py", line 81, in test_DeepScan
    for cmd in ds.scan():
  File "C:\projects\bleachbit\bleachbit\DeepScan.py", line 105, in scan
    compiled_searches = [CompiledSearch(s) for s in searches]
  File "C:\projects\bleachbit\bleachbit\DeepScan.py", line 105, in <listcomp>
    compiled_searches = [CompiledSearch(s) for s in searches]
  File "C:\projects\bleachbit\bleachbit\DeepScan.py", line 67, in __init__
    self.regex = re_compile(search.regex)
  File "C:\projects\bleachbit\bleachbit\DeepScan.py", line 65, in re_compile
    return re.compile(regex, fs_scan_re_flags) if regex else None
  File "c:\python34\lib\re.py", line 223, in compile
    return _compile(pattern, flags)
  File "c:\python34\lib\re.py", line 294, in _compile
    p = sre_compile.compile(pattern, flags)
  File "c:\python34\lib\sre_compile.py", line 568, in compile
    p = sre_parse.parse(p, flags)
  File "c:\python34\lib\sre_parse.py", line 780, in parse
    p = _parse_sub(source, pattern, 0)
  File "c:\python34\lib\sre_parse.py", line 377, in _parse_sub
    itemsappend(_parse(source, state))
  File "c:\python34\lib\sre_parse.py", line 730, in _parse
    raise error("unbalanced parenthesis")
sre_constants.error: unbalanced parenthesis
----------------------------------------------------------------------

Full log: https://ci.appveyor.com/project/az0/bleachbit/builds/43277935#L2245

@verdy-p
Author

verdy-p commented Apr 28, 2022

This is a build-process error, not an invalid regexp.
The reported message, "sre_constants.error: unbalanced parenthesis", is wrong because the parentheses are correctly balanced. This is the only change I made:

- for regex in ('^Makefile$', '~$', 'bak$', '^Thumbs.db$', '^Thumbs.db:encryptable$'):
+ for regex in ('^(Makefile|Thumbs(:encryptable)?\\.db|[^.].*[0-9A-Za-z]~|bak)$'):

I think that AppVeyor is confused, maybe by the `(:encryptable)` part. But the regexp is valid in every regexp syntax I checked, for example on https://regex101.com/

No version of Python I know of reports such an error for it.

This is therefore an unexplained bug in AppVeyor, which parses the regexp incorrectly.
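
One possible reading of the traceback, offered as a minimal sketch and not confirmed anywhere in this thread: in Python, parentheses around a single string do not create a tuple, so `for regex in ('...')` iterates over the individual characters of the string. The lone character `(` is then passed to `re.compile`, which rejects it. The names `as_written`, `as_tuple`, and `first_compile_failure` are hypothetical helpers for illustration, and the compile flags used by BleachBit are omitted here.

```python
import re

# Parentheses around a single string do not make a tuple; a trailing comma does.
as_written = ('^(Makefile|Thumbs(:encryptable)?\\.db|[^.].*[0-9A-Za-z]~|bak)$')
as_tuple = ('^(Makefile|Thumbs(:encryptable)?\\.db|[^.].*[0-9A-Za-z]~|bak)$',)


def first_compile_failure(patterns):
    """Return (pattern, error) for the first item that fails to compile, else None.

    Hypothetical helper: it only mimics looping over `patterns` and compiling
    each item, as the list comprehension in the traceback appears to do.
    """
    for pattern in patterns:
        try:
            re.compile(pattern)
        except re.error as exc:
            return pattern, exc
    return None


print(type(as_written).__name__)          # str: iterating yields single characters
print(type(as_tuple).__name__)            # tuple: iterating yields the whole pattern
print(first_compile_failure(as_written))  # the lone "(" character fails to compile
print(first_compile_failure(as_tuple))    # None: the full pattern compiles
```

If this reading is right, adding a trailing comma inside the parentheses would make the loop run once over the whole pattern. The exact wording of the `re.error` message varies across Python versions.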

@abitrolly
Contributor

@verdy-p Did you run the test you modified yourself? I see at least one logical error in this modification.

@verdy-p
Author

verdy-p commented Apr 29, 2022

Yes, I have run this in BleachBit itself. It works perfectly, and it is considerably faster too. In fact, the "for" loop here is not necessary, as it will loop only once, but it has been kept.

What kind of logical error do you see? The regexp is anchored on both sides (start and end), and the two regexps '~$' and 'bak$' have been anchored too, by prefixing them with '[^.].*' (which excludes "hidden" filenames starting with a dot, which should not be "cleaned", and then matches the remaining characters of the filename). This change does not invalidate the regexp, however.
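
To make the comparison concrete, here is a minimal sketch that runs the old pattern list and the new combined pattern from the diff against a few sample names. It assumes the patterns are applied with `re.search` to the bare file name and without extra flags, which this thread does not confirm. `old_matches`, `new_matches`, and `SAMPLES` are hypothetical names chosen for illustration.

```python
import re

# Patterns copied from the diff in this thread.
OLD_PATTERNS = ('^Makefile$', '~$', 'bak$', '^Thumbs.db$', '^Thumbs.db:encryptable$')
NEW_PATTERN = '^(Makefile|Thumbs(:encryptable)?\\.db|[^.].*[0-9A-Za-z]~|bak)$'


def old_matches(name):
    """True if any of the original patterns matches the file name."""
    return any(re.search(pattern, name) for pattern in OLD_PATTERNS)


def new_matches(name):
    """True if the combined pattern matches the file name."""
    return re.search(NEW_PATTERN, name) is not None


# Hypothetical sample names, chosen to exercise each alternative.
SAMPLES = (
    'Makefile', 'Thumbs.db', 'Thumbs.db:encryptable',
    'index.php7~', 'README2~', '.hidden~', 'a~', 'notes.bak', 'bak',
)

for name in SAMPLES:
    print(f'{name!r:26} old={old_matches(name)!s:5} new={new_matches(name)}')
```

Under these assumptions, the two versions agree on `Makefile`, `Thumbs.db`, `index.php7~`, `README2~`, and `bak`. They differ on `.hidden~` (the intended exclusion), on `a~` (the new alternative needs at least two characters before `~`), on `notes.bak` (the `bak` alternative is now anchored at both ends), and on `Thumbs.db:encryptable` (the optional group sits before `.db` in the new pattern).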

This is a one-line modification, only in the regexp string, and the regexp is perfectly valid in Python (the parsing done by AppVeyor is clearly wrong; all parentheses are correctly paired).

Note that AppVeyor compiles the regexp from the source but does not show which "flags" it uses to compile the constant string containing the regexp. Does AppVeyor use Python's standard regexp engine (the same engine BleachBit itself uses)?

@az0
Member

az0 commented Jan 7, 2023

@verdy-p

Want to try with the new mkhon-python310 branch? It is a major update of the Python version.
