add: added support for bitcode-alt-1.0.0 #3506

Open
wants to merge 8 commits into
base: develop
from


35C4n0r
Collaborator

@35C4n0r 35C4n0r commented Sep 3, 2023

No description provided.

@pombredanne
Member

@35C4n0r did you check that you still have a difference in license detection results if you use the regular intbitset or the bitcode with your small test data set and index?

@pombredanne
Member

And did you try running each licensedcode/test_xxx module on its own, without running the big data-driven tests?
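A minimal sketch of running each module in isolation, one pytest process per file. The `tests/licensedcode` path and the assumption that data-driven modules have "datadriven" in their file name are guesses for illustration; adjust both to the actual checkout.

```python
import subprocess
import sys
from pathlib import Path


def select_modules(test_dir, skip_marker="datadriven"):
    """Return sorted test_*.py paths in test_dir whose name lacks skip_marker.

    skip_marker is an assumed naming convention for the big data-driven tests.
    """
    return sorted(
        path for path in Path(test_dir).glob("test_*.py")
        if skip_marker not in path.name
    )


def run_each_alone(test_dir):
    """Run pytest once per selected module; return {module name: exit code}."""
    results = {}
    for module in select_modules(test_dir):
        # A fresh interpreter per module avoids state leaking between modules.
        proc = subprocess.run([sys.executable, "-m", "pytest", "-q", str(module)])
        results[module.name] = proc.returncode
    return results


# Example use (path is an assumption):
# for name, code in run_each_alone("tests/licensedcode").items():
#     print(name, "ok" if code == 0 else f"failed (exit {code})")
```

Running the same loop once with intbitset installed and once with bitcode would show which modules, if any, behave differently.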

@pombredanne
Member

One extra thing to consider: tests may fail with different detections simply because detection is slower. Detection runs with a deadline, after which whatever has been detected so far is returned. This may affect very slow detections with the new library.
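To illustrate how a deadline alone can change results, here is a toy sketch. It is not the scancode implementation: `match_with_deadline`, `FakeClock`, `timed_matcher` and `demo` are all hypothetical names, and the "matchers" are stand-ins that only advance a fake clock.

```python
import time


def match_with_deadline(matchers, query, timeout, clock=time.monotonic):
    """Run matchers in order and stop starting new ones after the deadline.

    Returns (matches found so far, names of matchers that ran).
    """
    deadline = clock() + timeout
    matches, completed = [], []
    for name, matcher in matchers:
        if clock() >= deadline:
            break  # out of time: return whatever was found so far
        matches.extend(matcher(query))
        completed.append(name)
    return matches, completed


class FakeClock:
    """Manually advanced clock so the example is deterministic."""

    def __init__(self):
        self.now = 0.0

    def __call__(self):
        return self.now

    def advance(self, seconds):
        self.now += seconds


def timed_matcher(clock, cost, results):
    """Build a stand-in matcher that 'takes' cost seconds and returns results."""
    def matcher(query):
        clock.advance(cost)
        return list(results)
    return matcher


def demo(first_matcher_cost):
    """Same query and matchers; only the speed of the first matcher varies."""
    clock = FakeClock()
    matchers = [
        ("aho", timed_matcher(clock, first_matcher_cost, ["intel", "other-permissive"])),
        ("seq", timed_matcher(clock, 0.1, ["intel (approximate)"])),
    ]
    return match_with_deadline(matchers, "query text", timeout=1.0, clock=clock)


print(demo(0.2))  # fast first matcher: both matchers run
print(demo(1.5))  # slow first matcher: "seq" never starts
```

With a fast first matcher both stages run; with a slow one the second stage is skipped, so the same input yields fewer matches even though no matcher is wrong.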

@35C4n0r
Collaborator Author

35C4n0r commented Sep 9, 2023

> @35C4n0r did you check that you still have a difference in license detection results if you use the regular intbitset or the bitcode with your small test data set and index?

@pombredanne
I set trace = True in index.py and ran the same test with each library.
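The two trace dumps that follow are long, so a small helper can summarize where they diverge. This is a sketch that relies only on the `matched with: <matcher>: <count>` lines visible in the traces; `matcher_counts` and `diff_counts` are hypothetical helper names.

```python
import re

# Matches trace lines such as "matched with: seq: 3".
MATCHED_WITH = re.compile(r"^matched with: (\w+): (\d+)$")


def matcher_counts(trace_text):
    """Return {matcher name: match count} parsed from a trace dump."""
    counts = {}
    for line in trace_text.splitlines():
        found = MATCHED_WITH.match(line.strip())
        if found:
            counts[found.group(1)] = int(found.group(2))
    return counts


def diff_counts(left_trace, right_trace):
    """Return {matcher: (left count, right count)} where the counts differ."""
    left, right = matcher_counts(left_trace), matcher_counts(right_trace)
    return {
        name: (left.get(name), right.get(name))
        for name in sorted(set(left) | set(right))
        if left.get(name) != right.get(name)
    }


# Excerpts copied from the two traces in this comment.
intbitset_trace = """
matched with: aho: 2
matched with: spdx_lid: 0
matched with: seq: 3
"""
bitcode_trace = """
matched with: aho: 2
matched with: spdx_lid: 0
matched with: seq: 0
"""

print(diff_counts(intbitset_trace, bitcode_trace))  # {'seq': (3, 0)}
```

On these excerpts only the `seq` matcher differs (3 matches versus 0), while the final merged matches are the same two `2-aho` matches in both runs.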

Result for scancode-toolkit with the regular intbitset:

============================= test session starts =============================
collecting ... 
1 tests selected, 0 tests skipped.
collected 1 item

fails\test1.py::TestFailed::test_stable_firmwar_realtek_copyright_detailed_expected_yml 

============================== 1 passed in 1.20s ==============================
PASSED [100%]{'licensed', 'license', 'licence'}
LicenseIndex: building index.
rules_by_rid: 0 Rule(identifier='intel.LICENSE', license_expression='intel', minimum_coverage=0, is_continuous=False, relevance=100, has_stored_relevance=True, length=0)
rules_by_rid: 1 Rule(identifier='other-permissive.LICENSE', license_expression='other-permissive', minimum_coverage=0, is_continuous=False, relevance=100, has_stored_relevance=True, length=0)
rules_by_rid: 2 Rule(identifier='free-unknown_88.RULE', license_expression='free-unknown', minimum_coverage=0, is_continuous=False, relevance=50, has_stored_relevance=True, length=0)
rules_by_rid: 3 Rule(identifier='free-unknown_96.RULE', license_expression='free-unknown', minimum_coverage=0, is_continuous=False, relevance=100, has_stored_relevance=True, length=0)
rules_by_rid: 4 Rule(identifier='intel_1.RULE', license_expression='intel', minimum_coverage=0, is_continuous=False, relevance=95, has_stored_relevance=True, length=0)
rules_by_rid: 5 Rule(identifier='license-intro_2.RULE', license_expression='unknown-license-reference', minimum_coverage=0, is_continuous=False, relevance=50, has_stored_relevance=True, length=0)
rules_by_rid: 6 Rule(identifier='other-permissive_66.RULE', license_expression='other-permissive', minimum_coverage=0, is_continuous=False, relevance=100, has_stored_relevance=True, length=0)
LicenseIndex: token, frequency
LicenseIndex: built index with 7 rules in 0.024969 seconds.
Index statistics will be approximate: `pip install pympler` for correct structure sizes
Index statistics:
   dictionary             : length    : 4567
   dictionary             : size in MB: 0.14
   tokens_by_tid          : length    : 4567
   tokens_by_tid          : size in MB: 0.04
   rid_by_hash            : length    : 7
   rid_by_hash            : size in MB: 0.0
   rules_by_rid           : length    : 7
   rules_by_rid           : size in MB: 0.0
   tids_by_rid            : length    : 7
   tids_by_rid            : size in MB: 0.0
   sets_by_rid            : length    : 7
   sets_by_rid            : size in MB: 0.0
   msets_by_rid           : length    : 7
   msets_by_rid           : size in MB: 0.0
   regular_rids           : length    : 7
   regular_rids           : size in MB: 0.0
   approx_matchable_rids  : length    : 4
   approx_matchable_rids  : size in MB: 0.0
   false_positive_rids    : length    : 0
   false_positive_rids    : size in MB: 0.0
    TOTAL internals in MB: 0.18
    TOTAL real size in MB: 0.0
Index.match: for: None query: <licensedcode.query.Query object at 0x000001DAB1CE0F40>

match_query: matching with matcher: aho
matched with: aho: 2
LicenseMatch: 'other-permissive', lines=(9, 11), matcher='2-aho', rid=other-permissive_66.RULE, sc=100.0, cov=100.0, len=23, hilen=5, rlen=23, qreg=(18, 40), ireg=(0, 22)
LicenseMatch: 'intel', lines=(18, 53), matcher='2-aho', rid=intel_1.RULE, sc=93.72, cov=100.0, len=293, hilen=71, rlen=293, qreg=(47, 339), ireg=(0, 292)

match_query: matching with matcher: spdx_lid
matched with: spdx_lid: 0

match_query: matching with matcher: seq
get_query_run_approximate_matches: near dupe candidates:
1 ScoresVector(is_highly_resemblant=True, containment=1.0, resemblance=0.7, matched_length=14.7) ScoresVector(is_highly_resemblant=True, containment=1.0, resemblance=0.7426384083044983, matched_length=293) intel_1.RULE
2 ScoresVector(is_highly_resemblant=True, containment=1.0, resemblance=0.7, matched_length=14.7) ScoresVector(is_highly_resemblant=True, containment=0.9932203389830508, resemblance=0.7339779761294074, matched_length=293) intel.LICENSE
get_query_run_approximate_matches: rule_matches:: 1
get_query_run_approximate_matches: rule_matches: MATCHED TEXTS
LicenseMatch: 'intel', lines=(18, 53), matcher='3-seq', rid=intel_1.RULE, sc=93.72, cov=100.0, len=293, hilen=71, rlen=293, qreg=(47, 339), ireg=(0, 292)
  MATCHED QUERY TEXT: Redistribution. Redistribution and use in binary form, without modification, are
permitted provided that the following conditions are met: . * Redistributions
must reproduce the above copyright notice and the following disclaimer in the
documentation and/or other materials provided with the distribution. * Neither
the name of [Realtek] [Semiconductor] Corporation nor the names of its suppliers
may be used to endorse or promote products derived from this software without
specific prior written permission. * No reverse engineering, decompilation, or
disassembly of this software is permitted. . Limited patent license. [Realtek]
[Semiconductor] Corporation grants a world-wide, royalty-free, non-exclusive
license under patents it now or hereafter owns or controls to make, have made,
use, import, offer to sell and sell ("Utilize") this software, but solely to the
extent that any such patent is necessary to Utilize the software alone, or in
combination with an operating system licensed under an approved Open Source
license as listed by the Open Source Initiative at
http://opensource.org/licenses. The patent license shall not apply to any other
combinations which include this software. No hardware per se is licensed
hereunder. . DISCLAIMER. THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND
CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A
PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR
CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY,
OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING
IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY
OF SUCH DAMAGE.
  MATCHED RULE TEXT: redistribution redistribution and use in binary form without modification are
permitted provided that the following conditions are met redistributions must
reproduce the above copyright notice and the following disclaimer in the
documentation and or other materials provided with the distribution neither the
name of corporation nor the names of its suppliers may be used to endorse or
promote products derived from this software without specific prior written
permission no reverse engineering decompilation or disassembly of this software
is permitted limited patent license corporation grants world wide royalty free
non exclusive license under patents it now or hereafter owns or controls to make
have made use import offer to sell and sell utilize this software but solely to
the extent that any such patent is necessary to utilize the software alone or in
combination with an operating system licensed under an approved open source
license as listed by the open source initiative at http opensource org licenses
the patent license shall not apply to any other combinations which include this
software no hardware per se is licensed hereunder disclaimer this software is
provided by the copyright holders and contributors as is and any express or
implied warranties including but not limited to the implied warranties of
merchantability and fitness for particular purpose are disclaimed in no event
shall the copyright owner or contributors be liable for any direct indirect
incidental special exemplary or consequential damages including but not limited
to procurement of substitute goods or services loss of use data or profits or
business interruption however caused and on any theory of liability whether in
contract strict liability or tort including negligence or otherwise arising in
any way out of the use of this software even if advised of the possibility of
such damage
get_query_run_approximate_matches: rule_matches:: 3
get_query_run_approximate_matches: rule_matches: MATCHED TEXTS
LicenseMatch: 'intel', lines=(18, 25), matcher='3-seq', rid=intel.LICENSE, sc=14.92, cov=14.92, len=44, hilen=10, rlen=295, qreg=(47, 90), ireg=(0, 43)
  MATCHED QUERY TEXT: Redistribution. Redistribution and use in binary form, without modification, are
permitted provided that the following conditions are met: . * Redistributions
must reproduce the above copyright notice and the following disclaimer in the
documentation and/or other materials provided with the distribution. * Neither
the name of
  MATCHED RULE TEXT: redistribution redistribution and use in binary form without modification are
permitted provided that the following conditions are met redistributions must
reproduce the above copyright notice and the following disclaimer in the
documentation and or other materials provided with the distribution neither the
name of
LicenseMatch: 'intel', lines=(25, 31), matcher='3-seq', rid=intel.LICENSE, sc=12.88, cov=12.88, len=38, hilen=10, rlen=295, qreg=(91, 128), ireg=(45, 82)
  MATCHED QUERY TEXT: Corporation nor the names of its suppliers may be used to endorse or promote
products derived from this software without specific prior written permission. *
No reverse engineering, decompilation, or disassembly of this software is
permitted. . Limited patent license.
  MATCHED RULE TEXT: corporation nor the names of its suppliers may be used to endorse or promote
products derived from this software without specific prior written permission no
reverse engineering decompilation or disassembly of this software is permitted
limited patent license
LicenseMatch: 'intel', lines=(31, 53), matcher='3-seq', rid=intel.LICENSE, sc=71.53, cov=71.53, len=211, hilen=51, rlen=295, qreg=(129, 339), ireg=(84, 294)
  MATCHED QUERY TEXT: Corporation grants a world-wide, royalty-free, non-exclusive license under
patents it now or hereafter owns or controls to make, have made, use, import,
offer to sell and sell ("Utilize") this software, but solely to the extent that
any such patent is necessary to Utilize the software alone, or in combination
with an operating system licensed under an approved Open Source license as
listed by the Open Source Initiative at http://opensource.org/licenses. The
patent license shall not apply to any other combinations which include this
software. No hardware per se is licensed hereunder. . DISCLAIMER. THIS SOFTWARE
IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR
IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF
MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT
SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT,
INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF
LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE
OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF
ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
  MATCHED RULE TEXT: corporation grants world wide royalty free non exclusive license under patents
it now or hereafter owns or controls to make have made use import offer to sell
and sell utilize this software but solely to the extent that any such patent is
necessary to utilize the software alone or in combination with an operating
system licensed under an approved open source license as listed by the open
source initiative at http opensource org licenses the patent license shall not
apply to any other combinations which include this software no hardware per se
is licensed hereunder disclaimer this software is provided by the copyright
holders and contributors as is and any express or implied warranties including
but not limited to the implied warranties of merchantability and fitness for
particular purpose are disclaimed in no event shall the copyright owner or
contributors be liable for any direct indirect incidental special exemplary or
consequential damages including but not limited to procurement of substitute
goods or services loss of use data or profits or business interruption however
caused and on any theory of liability whether in contract strict liability or
tort including negligence or otherwise arising in any way out of the use of this
software even if advised of the possibility of such damage
get_approximate_matches: len(query.query_runs): 2
get_query_run_approximate_matches: candidates:
get_query_run_approximate_matches: query_run not matchable: QueryRun(start=0, len=12, start_line=1, end_line=4)
get_query_run_approximate_matches: candidates:
1 ScoresVector(is_highly_resemblant=False, containment=1.0, resemblance=0.4, matched_length=1.1) ScoresVector(is_highly_resemblant=False, containment=1.0, resemblance=0.43183673469387757, matched_length=23) other-permissive_66.RULE
get_query_run_approximate_matches: rule_matches:: 1
get_query_run_approximate_matches: rule_matches: MATCHED TEXTS
LicenseMatch: 'other-permissive', lines=(9, 11), matcher='3-seq', rid=other-permissive_66.RULE, sc=100.0, cov=100.0, len=23, hilen=5, rlen=23, qreg=(18, 40), ireg=(0, 22)
  MATCHED QUERY TEXT: Permission is hereby granted for the distribution of this firmware data in
hexadecimal or equivalent format, provided this copyright notice is accompanying
it.
  MATCHED RULE TEXT: permission is hereby granted for the distribution of this firmware data in
hexadecimal or equivalent format provided this copyright notice is accompanying
it
get_query_run_approximate_matches: rule_matches:: 0
get_query_run_approximate_matches: rule_matches: MATCHED TEXTS
matched with: seq: 3
LicenseMatch: 'intel', lines=(18, 25), matcher='3-seq', rid=intel.LICENSE, sc=99.32, cov=99.32, len=293, hilen=71, rlen=295, qreg=(47, 339), ireg=(0, 294)
LicenseMatch: 'intel', lines=(18, 53), matcher='3-seq', rid=intel_1.RULE, sc=93.72, cov=100.0, len=293, hilen=71, rlen=293, qreg=(47, 339), ireg=(0, 292)
LicenseMatch: 'other-permissive', lines=(9, 11), matcher='3-seq', rid=other-permissive_66.RULE, sc=100.0, cov=100.0, len=23, hilen=5, rlen=23, qreg=(18, 40), ireg=(0, 22)

matches before final merge: 2
matches before final merge MATCHED TEXTS
LicenseMatch: 'intel', lines=(18, 53), matcher='2-aho', rid=intel_1.RULE, sc=93.72, cov=100.0, len=293, hilen=71, rlen=293, qreg=(47, 339), ireg=(0, 292)
  MATCHED QUERY TEXT: Redistribution. Redistribution and use in binary form, without modification, are
permitted provided that the following conditions are met: . * Redistributions
must reproduce the above copyright notice and the following disclaimer in the
documentation and/or other materials provided with the distribution. * Neither
the name of [Realtek] [Semiconductor] Corporation nor the names of its suppliers
may be used to endorse or promote products derived from this software without
specific prior written permission. * No reverse engineering, decompilation, or
disassembly of this software is permitted. . Limited patent license. [Realtek]
[Semiconductor] Corporation grants a world-wide, royalty-free, non-exclusive
license under patents it now or hereafter owns or controls to make, have made,
use, import, offer to sell and sell ("Utilize") this software, but solely to the
extent that any such patent is necessary to Utilize the software alone, or in
combination with an operating system licensed under an approved Open Source
license as listed by the Open Source Initiative at
http://opensource.org/licenses. The patent license shall not apply to any other
combinations which include this software. No hardware per se is licensed
hereunder. . DISCLAIMER. THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND
CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A
PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR
CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY,
OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING
IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY
OF SUCH DAMAGE.
  MATCHED RULE TEXT: redistribution redistribution and use in binary form without modification are
permitted provided that the following conditions are met redistributions must
reproduce the above copyright notice and the following disclaimer in the
documentation and or other materials provided with the distribution neither the
name of corporation nor the names of its suppliers may be used to endorse or
promote products derived from this software without specific prior written
permission no reverse engineering decompilation or disassembly of this software
is permitted limited patent license corporation grants world wide royalty free
non exclusive license under patents it now or hereafter owns or controls to make
have made use import offer to sell and sell utilize this software but solely to
the extent that any such patent is necessary to utilize the software alone or in
combination with an operating system licensed under an approved open source
license as listed by the open source initiative at http opensource org licenses
the patent license shall not apply to any other combinations which include this
software no hardware per se is licensed hereunder disclaimer this software is
provided by the copyright holders and contributors as is and any express or
implied warranties including but not limited to the implied warranties of
merchantability and fitness for particular purpose are disclaimed in no event
shall the copyright owner or contributors be liable for any direct indirect
incidental special exemplary or consequential damages including but not limited
to procurement of substitute goods or services loss of use data or profits or
business interruption however caused and on any theory of liability whether in
contract strict liability or tort including negligence or otherwise arising in
any way out of the use of this software even if advised of the possibility of
such damage
LicenseMatch: 'other-permissive', lines=(9, 11), matcher='2-aho', rid=other-permissive_66.RULE, sc=100.0, cov=100.0, len=23, hilen=5, rlen=23, qreg=(18, 40), ireg=(0, 22)
  MATCHED QUERY TEXT: Permission is hereby granted for the distribution of this firmware data in
hexadecimal or equivalent format, provided this copyright notice is accompanying
it.
  MATCHED RULE TEXT: permission is hereby granted for the distribution of this firmware data in
hexadecimal or equivalent format provided this copyright notice is accompanying
it
final matches: 2
final matches MATCHED TEXTS
LicenseMatch: 'other-permissive', lines=(9, 11), matcher='2-aho', rid=other-permissive_66.RULE, sc=100.0, cov=100.0, len=23, hilen=5, rlen=23, qreg=(18, 40), ireg=(0, 22)
  MATCHED QUERY TEXT: Permission is hereby granted for the distribution of this firmware data in
hexadecimal or equivalent format, provided this copyright notice is accompanying
it.
  MATCHED RULE TEXT: permission is hereby granted for the distribution of this firmware data in
hexadecimal or equivalent format provided this copyright notice is accompanying
it
LicenseMatch: 'intel', lines=(18, 53), matcher='2-aho', rid=intel_1.RULE, sc=93.72, cov=100.0, len=293, hilen=71, rlen=293, qreg=(47, 339), ireg=(0, 292)
  MATCHED QUERY TEXT: Redistribution. Redistribution and use in binary form, without modification, are
permitted provided that the following conditions are met: . * Redistributions
must reproduce the above copyright notice and the following disclaimer in the
documentation and/or other materials provided with the distribution. * Neither
the name of [Realtek] [Semiconductor] Corporation nor the names of its suppliers
may be used to endorse or promote products derived from this software without
specific prior written permission. * No reverse engineering, decompilation, or
disassembly of this software is permitted. . Limited patent license. [Realtek]
[Semiconductor] Corporation grants a world-wide, royalty-free, non-exclusive
license under patents it now or hereafter owns or controls to make, have made,
use, import, offer to sell and sell ("Utilize") this software, but solely to the
extent that any such patent is necessary to Utilize the software alone, or in
combination with an operating system licensed under an approved Open Source
license as listed by the Open Source Initiative at
http://opensource.org/licenses. The patent license shall not apply to any other
combinations which include this software. No hardware per se is licensed
hereunder. . DISCLAIMER. THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND
CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A
PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR
CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY,
OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING
IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY
OF SUCH DAMAGE.
  MATCHED RULE TEXT: redistribution redistribution and use in binary form without modification are
permitted provided that the following conditions are met redistributions must
reproduce the above copyright notice and the following disclaimer in the
documentation and or other materials provided with the distribution neither the
name of corporation nor the names of its suppliers may be used to endorse or
promote products derived from this software without specific prior written
permission no reverse engineering decompilation or disassembly of this software
is permitted limited patent license corporation grants world wide royalty free
non exclusive license under patents it now or hereafter owns or controls to make
have made use import offer to sell and sell utilize this software but solely to
the extent that any such patent is necessary to utilize the software alone or in
combination with an operating system licensed under an approved open source
license as listed by the open source initiative at http opensource org licenses
the patent license shall not apply to any other combinations which include this
software no hardware per se is licensed hereunder disclaimer this software is
provided by the copyright holders and contributors as is and any express or
implied warranties including but not limited to the implied warranties of
merchantability and fitness for particular purpose are disclaimed in no event
shall the copyright owner or contributors be liable for any direct indirect
incidental special exemplary or consequential damages including but not limited
to procurement of substitute goods or services loss of use data or profits or
business interruption however caused and on any theory of liability whether in
contract strict liability or tort including negligence or otherwise arising in
any way out of the use of this software even if advised of the possibility of
such damage
[LicenseMatch: 'other-permissive', lines=(9, 11), matcher='2-aho', rid=other-permissive_66.RULE, sc=100.0, cov=100.0, len=23, hilen=5, rlen=23, qreg=(18, 40), ireg=(0, 22), LicenseMatch: 'intel', lines=(18, 53), matcher='2-aho', rid=intel_1.RULE, sc=93.72, cov=100.0, len=293, hilen=71, rlen=293, qreg=(47, 339), ireg=(0, 292)]

Process finished with exit code 0
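As a side note on the "near dupe candidates" lines in the trace above: the full-precision scores can be reproduced with a short calculation. This is an inference from the printed numbers, not a statement about how the library computes them. It assumes a query of 340 tokens (a value not printed in the trace) and that the reported resemblance is the squared ratio of matched length to union length; `candidate_scores` is a hypothetical helper.

```python
import math


def candidate_scores(query_len, rule_len, matched_len):
    """Return (containment, resemblance) for one candidate rule.

    Assumed formulas, inferred from the trace values:
      containment = matched / rule length
      resemblance = (matched / union length) ** 2
    """
    union_len = query_len + rule_len - matched_len
    containment = matched_len / rule_len
    resemblance = (matched_len / union_len) ** 2
    return containment, resemblance


QUERY_LEN = 340  # assumption: inferred, not shown in the trace

# intel_1.RULE: rule length 293, matched length 293
intel_rule = candidate_scores(QUERY_LEN, 293, 293)
# intel.LICENSE: rule length 295, matched length 293
intel_license = candidate_scores(QUERY_LEN, 295, 293)

# Values printed in the trace for comparison.
assert math.isclose(intel_rule[0], 1.0)
assert math.isclose(intel_rule[1], 0.7426384083044983)
assert math.isclose(intel_license[0], 0.9932203389830508)
assert math.isclose(intel_license[1], 0.7339779761294074)
```

If the scores are indeed derived from token-set intersections like this, candidate selection is one place where a difference between bitset implementations could surface, which would fit the missing near-dupe candidates in the bitcode run below.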

Result for scancode-toolkit with bitcode:

============================= test session starts =============================
collecting ... 
1 tests selected, 0 tests skipped.
collected 1 item

test1.py::TestFailed::test_stable_firmwar_realtek_copyright_detailed_expected_yml 

============================== 1 passed in 1.18s ==============================
PASSED [100%]{'license', 'licensed', 'licence'}
LicenseIndex: building index.
rules_by_rid: 0 Rule(identifier='intel.LICENSE', license_expression='intel', minimum_coverage=0, is_continuous=False, relevance=100, has_stored_relevance=True, length=0)
rules_by_rid: 1 Rule(identifier='other-permissive.LICENSE', license_expression='other-permissive', minimum_coverage=0, is_continuous=False, relevance=100, has_stored_relevance=True, length=0)
rules_by_rid: 2 Rule(identifier='free-unknown_88.RULE', license_expression='free-unknown', minimum_coverage=0, is_continuous=False, relevance=50, has_stored_relevance=True, length=0)
rules_by_rid: 3 Rule(identifier='free-unknown_96.RULE', license_expression='free-unknown', minimum_coverage=0, is_continuous=False, relevance=100, has_stored_relevance=True, length=0)
rules_by_rid: 4 Rule(identifier='intel_1.RULE', license_expression='intel', minimum_coverage=0, is_continuous=False, relevance=95, has_stored_relevance=True, length=0)
rules_by_rid: 5 Rule(identifier='license-intro_2.RULE', license_expression='unknown-license-reference', minimum_coverage=0, is_continuous=False, relevance=50, has_stored_relevance=True, length=0)
rules_by_rid: 6 Rule(identifier='other-permissive_66.RULE', license_expression='other-permissive', minimum_coverage=0, is_continuous=False, relevance=100, has_stored_relevance=True, length=0)
LicenseIndex: token, frequency
LicenseIndex: built index with 7 rules in 0.009996 seconds.
Index statistics will be approximate: `pip install pympler` for correct structure sizes
Index statistics:
   dictionary             : length    : 4567
   dictionary             : size in MB: 0.14
   tokens_by_tid          : length    : 4567
   tokens_by_tid          : size in MB: 0.04
   rid_by_hash            : length    : 7
   rid_by_hash            : size in MB: 0.0
   rules_by_rid           : length    : 7
   rules_by_rid           : size in MB: 0.0
   tids_by_rid            : length    : 7
   tids_by_rid            : size in MB: 0.0
   sets_by_rid            : length    : 7
   sets_by_rid            : size in MB: 0.0
   msets_by_rid           : length    : 7
   msets_by_rid           : size in MB: 0.0
   regular_rids           : length    : 7
   regular_rids           : size in MB: 0.0
   approx_matchable_rids  : length    : 4
   approx_matchable_rids  : size in MB: 0.0
   false_positive_rids    : length    : 0
   false_positive_rids    : size in MB: 0.0
    TOTAL internals in MB: 0.18
    TOTAL real size in MB: 0.0
Index.match: for: None query: <licensedcode.query.Query object at 0x000002304DE74940>

match_query: matching with matcher: aho
matched with: aho: 2
LicenseMatch: 'other-permissive', lines=(9, 11), matcher='2-aho', rid=other-permissive_66.RULE, sc=100.0, cov=100.0, len=23, hilen=5, rlen=23, qreg=(18, 40), ireg=(0, 22)
LicenseMatch: 'intel', lines=(18, 53), matcher='2-aho', rid=intel_1.RULE, sc=93.72, cov=100.0, len=293, hilen=71, rlen=293, qreg=(47, 339), ireg=(0, 292)

match_query: matching with matcher: spdx_lid
matched with: spdx_lid: 0

match_query: matching with matcher: seq
get_approximate_matches: len(query.query_runs): 2
get_query_run_approximate_matches: candidates:
get_query_run_approximate_matches: query_run not matchable: QueryRun(start=0, len=12, start_line=1, end_line=4)
get_query_run_approximate_matches: candidates:
1 ScoresVector(is_highly_resemblant=False, containment=1.0, resemblance=0.4, matched_length=1.1) ScoresVector(is_highly_resemblant=False, containment=1.0, resemblance=0.43183673469387757, matched_length=23) other-permissive_66.RULE
get_query_run_approximate_matches: rule_matches:: 0
get_query_run_approximate_matches: rule_matches: MATCHED TEXTS
matched with: seq: 0

matches before final merge: 2
matches before final merge MATCHED TEXTS
LicenseMatch: 'intel', lines=(18, 53), matcher='2-aho', rid=intel_1.RULE, sc=93.72, cov=100.0, len=293, hilen=71, rlen=293, qreg=(47, 339), ireg=(0, 292)
  MATCHED QUERY TEXT: Redistribution. Redistribution and use in binary form, without modification, are
permitted provided that the following conditions are met: . * Redistributions
must reproduce the above copyright notice and the following disclaimer in the
documentation and/or other materials provided with the distribution. * Neither
the name of [Realtek] [Semiconductor] Corporation nor the names of its suppliers
may be used to endorse or promote products derived from this software without
specific prior written permission. * No reverse engineering, decompilation, or
disassembly of this software is permitted. . Limited patent license. [Realtek]
[Semiconductor] Corporation grants a world-wide, royalty-free, non-exclusive
license under patents it now or hereafter owns or controls to make, have made,
use, import, offer to sell and sell ("Utilize") this software, but solely to the
extent that any such patent is necessary to Utilize the software alone, or in
combination with an operating system licensed under an approved Open Source
license as listed by the Open Source Initiative at
http://opensource.org/licenses. The patent license shall not apply to any other
combinations which include this software. No hardware per se is licensed
hereunder. . DISCLAIMER. THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND
CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A
PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR
CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY,
OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING
IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY
OF SUCH DAMAGE.
  MATCHED RULE TEXT: redistribution redistribution and use in binary form without modification are
permitted provided that the following conditions are met redistributions must
reproduce the above copyright notice and the following disclaimer in the
documentation and or other materials provided with the distribution neither the
name of corporation nor the names of its suppliers may be used to endorse or
promote products derived from this software without specific prior written
permission no reverse engineering decompilation or disassembly of this software
is permitted limited patent license corporation grants world wide royalty free
non exclusive license under patents it now or hereafter owns or controls to make
have made use import offer to sell and sell utilize this software but solely to
the extent that any such patent is necessary to utilize the software alone or in
combination with an operating system licensed under an approved open source
license as listed by the open source initiative at http opensource org licenses
the patent license shall not apply to any other combinations which include this
software no hardware per se is licensed hereunder disclaimer this software is
provided by the copyright holders and contributors as is and any express or
implied warranties including but not limited to the implied warranties of
merchantability and fitness for particular purpose are disclaimed in no event
shall the copyright owner or contributors be liable for any direct indirect
incidental special exemplary or consequential damages including but not limited
to procurement of substitute goods or services loss of use data or profits or
business interruption however caused and on any theory of liability whether in
contract strict liability or tort including negligence or otherwise arising in
any way out of the use of this software even if advised of the possibility of
such damage
LicenseMatch: 'other-permissive', lines=(9, 11), matcher='2-aho', rid=other-permissive_66.RULE, sc=100.0, cov=100.0, len=23, hilen=5, rlen=23, qreg=(18, 40), ireg=(0, 22)
  MATCHED QUERY TEXT: Permission is hereby granted for the distribution of this firmware data in
hexadecimal or equivalent format, provided this copyright notice is accompanying
it.
  MATCHED RULE TEXT: permission is hereby granted for the distribution of this firmware data in
hexadecimal or equivalent format provided this copyright notice is accompanying
it
final matches: 2
final matches MATCHED TEXTS
LicenseMatch: 'other-permissive', lines=(9, 11), matcher='2-aho', rid=other-permissive_66.RULE, sc=100.0, cov=100.0, len=23, hilen=5, rlen=23, qreg=(18, 40), ireg=(0, 22)
  MATCHED QUERY TEXT: Permission is hereby granted for the distribution of this firmware data in
hexadecimal or equivalent format, provided this copyright notice is accompanying
it.
  MATCHED RULE TEXT: permission is hereby granted for the distribution of this firmware data in
hexadecimal or equivalent format provided this copyright notice is accompanying
it
LicenseMatch: 'intel', lines=(18, 53), matcher='2-aho', rid=intel_1.RULE, sc=93.72, cov=100.0, len=293, hilen=71, rlen=293, qreg=(47, 339), ireg=(0, 292)
  MATCHED QUERY TEXT: Redistribution. Redistribution and use in binary form, without modification, are
permitted provided that the following conditions are met: . * Redistributions
must reproduce the above copyright notice and the following disclaimer in the
documentation and/or other materials provided with the distribution. * Neither
the name of [Realtek] [Semiconductor] Corporation nor the names of its suppliers
may be used to endorse or promote products derived from this software without
specific prior written permission. * No reverse engineering, decompilation, or
disassembly of this software is permitted. . Limited patent license. [Realtek]
[Semiconductor] Corporation grants a world-wide, royalty-free, non-exclusive
license under patents it now or hereafter owns or controls to make, have made,
use, import, offer to sell and sell ("Utilize") this software, but solely to the
extent that any such patent is necessary to Utilize the software alone, or in
combination with an operating system licensed under an approved Open Source
license as listed by the Open Source Initiative at
http://opensource.org/licenses. The patent license shall not apply to any other
combinations which include this software. No hardware per se is licensed
hereunder. . DISCLAIMER. THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND
CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A
PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR
CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY,
OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING
IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY
OF SUCH DAMAGE.
  MATCHED RULE TEXT: redistribution redistribution and use in binary form without modification are
permitted provided that the following conditions are met redistributions must
reproduce the above copyright notice and the following disclaimer in the
documentation and or other materials provided with the distribution neither the
name of corporation nor the names of its suppliers may be used to endorse or
promote products derived from this software without specific prior written
permission no reverse engineering decompilation or disassembly of this software
is permitted limited patent license corporation grants world wide royalty free
non exclusive license under patents it now or hereafter owns or controls to make
have made use import offer to sell and sell utilize this software but solely to
the extent that any such patent is necessary to utilize the software alone or in
combination with an operating system licensed under an approved open source
license as listed by the open source initiative at http opensource org licenses
the patent license shall not apply to any other combinations which include this
software no hardware per se is licensed hereunder disclaimer this software is
provided by the copyright holders and contributors as is and any express or
implied warranties including but not limited to the implied warranties of
merchantability and fitness for particular purpose are disclaimed in no event
shall the copyright owner or contributors be liable for any direct indirect
incidental special exemplary or consequential damages including but not limited
to procurement of substitute goods or services loss of use data or profits or
business interruption however caused and on any theory of liability whether in
contract strict liability or tort including negligence or otherwise arising in
any way out of the use of this software even if advised of the possibility of
such damage
[LicenseMatch: 'other-permissive', lines=(9, 11), matcher='2-aho', rid=other-permissive_66.RULE, sc=100.0, cov=100.0, len=23, hilen=5, rlen=23, qreg=(18, 40), ireg=(0, 22), LicenseMatch: 'intel', lines=(18, 53), matcher='2-aho', rid=intel_1.RULE, sc=93.72, cov=100.0, len=293, hilen=71, rlen=293, qreg=(47, 339), ireg=(0, 292)]

Process finished with exit code 0

Signed-off-by: Jay <jaykumar20march@gmail.com>
@35C4n0r
Collaborator Author

35C4n0r commented Sep 9, 2023

the match printed here is the same for both intbitset and bitcode.
825ce00#diff-3fc71171a288b34ad77670f5b8fb15e68b7208f1a41219cd5c4cb8756082f5a2R39

The difference is in the traces:
https://github.com/nexB/scancode-toolkit/blob/develop/src/licensedcode/index.py#L853

if TRACE_APPROX_CANDIDATES:
    self.debug_matches(

@pombredanne
Member

pombredanne commented Sep 9, 2023

@35C4n0r re:

the match printed here is the same for both intbitset and bitcode.

The goal of this test is to use a small index to reproduce the issue of different match results. If the final matches are the same, then this test may not be the best reproducer. But the intermediate extra sequence matches seem to exhibit the same issue as the test failure with the larger index, so I would debug those first. The steps in match_set.py are the likely culprit because they make extensive use of bitsets. There must be some difference in how sets are manipulated, and my hunch is that your bitcode intersection does not behave exactly the way Python set intersection and intbitset intersection do.
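
One way to check that hunch in isolation is a small differential test that runs the same random inputs through Python's built-in `set` and through the bitset type under test, then reports every operation whose result differs. The sketch below is only an illustration: `IntBitmap` and `find_mismatches` are hypothetical names, and `IntBitmap` is a pure-Python stand-in, not the bitcode or intbitset API.

```python
import random


class IntBitmap:
    """Hypothetical stand-in for the bitset type under test (not a real library API)."""

    def __init__(self, ints=()):
        # Store the members as bits of one arbitrary-precision integer.
        self.bits = 0
        for i in ints:
            self.bits |= 1 << i

    @classmethod
    def _from_bits(cls, bits):
        obj = cls()
        obj.bits = bits
        return obj

    def __and__(self, other):
        return self._from_bits(self.bits & other.bits)

    def __or__(self, other):
        return self._from_bits(self.bits | other.bits)

    def __sub__(self, other):
        return self._from_bits(self.bits & ~other.bits)

    def __len__(self):
        return bin(self.bits).count("1")

    def __iter__(self):
        # Yield the position of each set bit in increasing order.
        bits, pos = self.bits, 0
        while bits:
            if bits & 1:
                yield pos
            bits >>= 1
            pos += 1


def find_mismatches(bitset_type, rounds=200, universe=64, seed=0):
    """Compare bitset_type against the built-in set on random inputs."""
    rng = random.Random(seed)
    ops = {
        "and": lambda a, b: a & b,
        "or": lambda a, b: a | b,
        "sub": lambda a, b: a - b,
    }
    mismatches = []
    for _ in range(rounds):
        xs = rng.sample(range(universe), rng.randint(0, universe // 2))
        ys = rng.sample(range(universe), rng.randint(0, universe // 2))
        for name, op in ops.items():
            expected = op(set(xs), set(ys))
            actual = op(bitset_type(xs), bitset_type(ys))
            # Check both the members and the reported length.
            if set(actual) != expected or len(actual) != len(expected):
                mismatches.append((name, sorted(xs), sorted(ys)))
    return mismatches


mismatches = find_mismatches(IntBitmap)
print(len(mismatches))
```

To use it against a real implementation, pass that type in place of `IntBitmap`, assuming its constructor accepts an iterable of ints and it supports `&`, `|`, `-`, `len()` and iteration. Each reported tuple gives the operation name and the two inputs, which should be enough to reproduce a single failing case by hand.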
