The major features of this release
Improved PICA handling
PICA is an alternative bibliographic metadata schema used in Germany, the Netherlands and France. The PICA-related features were developed in cooperation with K10plus, the largest union catalogue in Germany. The analysis of PICA records now covers completeness, validation, subject heading and authority name analyses, plus searching and displaying individual records.
Handling union catalogues
Union catalogues cover the collections of multiple libraries. QA catalogue can now display the results of completeness, validation, searching and term lists for both the whole catalogue and any individual library.
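The idea of reporting results both for the whole catalogue and per library can be illustrated with a minimal sketch. This is not QA catalogue's implementation (the tool itself is a Java application); the record layout, the library identifiers and the `all` pseudo-group are assumptions made only for this example.

```python
from collections import defaultdict

# Hypothetical records: an id, the libraries holding the record, and the fields present.
records = [
    {"id": "r1", "libraries": ["lib-A", "lib-B"], "fields": {"title", "author"}},
    {"id": "r2", "libraries": ["lib-A"], "fields": {"title"}},
    {"id": "r3", "libraries": ["lib-B"], "fields": {"title", "author", "subject"}},
]

ALL = "all"  # assumed pseudo-group standing for the whole union catalogue


def field_counts_by_group(records):
    """Count, per group, how many records contain each field."""
    counts = defaultdict(lambda: defaultdict(int))
    totals = defaultdict(int)
    for record in records:
        # Every record counts towards the whole catalogue and towards each holding library.
        for group in [ALL] + record["libraries"]:
            totals[group] += 1
            for field in record["fields"]:
                counts[group][field] += 1
    return counts, totals


counts, totals = field_counts_by_group(records)
for group in sorted(totals):
    share = counts[group]["author"] / totals[group]
    print(f"{group}: 'author' present in {share:.0%} of {totals[group]} records")
```

A record held by several libraries is counted once in each of those groups and once in the overall group, so the per-library totals can add up to more than the catalogue total.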
SHACL4bib
Shapes Constraint Language (SHACL) has been adapted to MARC and PICA records. It provides customized analysis: a library can write a configuration file to check records against its own customs and rule sets, which are not part of the core standard. This feature was partly developed by Jean Michel Nzi Mba as part of his Bachelor thesis.
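As a rough illustration of what checking records against locally defined rules means, the sketch below applies two constraints to a flattened record. The rule keys (`path`, `min_count`, `pattern`), the rule ids and the record layout are hypothetical and do not reproduce the configuration format QA catalogue actually uses.

```python
import re

# Hypothetical rule set; the keys are illustrative, not QA catalogue's configuration format.
rules = [
    {"id": "title-present", "path": "245$a", "min_count": 1},
    {"id": "isbn-shape", "path": "020$a", "pattern": r"^[0-9Xx-]{10,17}$"},
]

# Hypothetical flattened record: field path -> list of values.
record = {
    "245$a": ["Metadata quality"],
    "020$a": ["978-3-16-148410-0", "not an isbn"],
}


def check(record, rules):
    """Return the ids of the rules that the record violates."""
    failed = []
    for rule in rules:
        values = record.get(rule["path"], [])
        # Cardinality constraint: the path must occur at least min_count times.
        if len(values) < rule.get("min_count", 0):
            failed.append(rule["id"])
            continue
        # Pattern constraint: every value on the path must match the regular expression.
        pattern = rule.get("pattern")
        if pattern and not all(re.fullmatch(pattern, value) for value in values):
            failed.append(rule["id"])
    return failed


failed = check(record, rules)
print(failed)
```

Keeping the rules as data rather than code is the point of the approach: a library can change its local policy without touching the analysis software.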
Other features
Improved command line interface and documentation. The code base has become more robust thanks to hints from the code quality assessment framework Sonar.
Contributors
In the creation of this release Jakob Voß (VZG) and Jean Michel Nzi Mba (University of Göttingen) provided important contributions. Special thanks to Verbundzentrale des GBV (VZG), GWDG and JetBrains for supporting the development.
Details
Group values by library
- #199: Group results in completeness
- #200: Group results in issues
- #246: Filter results in data tab
- #254: Fixing performance issue for grouping validation
- #253: Creation of id-groupid.csv required for validation
PICA changes
- #163: PICA: general changes
- #190: Extend PICA subject fields
- #215: Completeness: check occurrence numbers
- #232: Adding XML serialization for PICA
- #234: Making occurrence a first class citizen of PICA data fields
- #247: Uniqueness of PICA field ranges reported wrongly
- #251: PICA: fixing reading of gzipped files
- #250: Copy Avram schema to output directory
- Adjust K10plus Avram schema
SHACL4bib
Command line interface
- common-script: die if input files don't exist
- common-script: disable colors if not run via terminal
- common-script: emit DONE only for processing steps
- common-script: show UPDATE on config
- Add default settings to setdir.sh
- Add configuration variable UPDATE and summarize configuration
- Add configuration variable ANALYSES for all-analyses
- Refactor common-script
- Allow globs in MASK
- Fixing parameter removal from catalogue specific params
- Ignore default input/output also when they are symlinks
- Improve downloaders
- Improve KB downloader
- Update ONB downloader
- Improve output of common-script
- Add input directory to ONB downloader
- #223: Create a configuration file for Zentralbibliothek Zürich
- masking ZB
- #265: 'all' command should run only the selected tasks if schema is PICA
- Update catalogue scripts
- Update catalogues
- Make common-script more robust
- Make setdir.sh optional
- Make sqlite more robust
- Remove unnecessary ; chars
- Simplify bash scripts
- Simplify catalogues/k10plus_*.sh
- Remove duplicated DONE in catalog scripts
- Remove unused parts
- Support setting MASK in setdir.sh (k10plus_pica only)
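Three of the behaviours listed above (glob patterns in MASK, stopping when no input files exist, and suppressing colours outside a terminal) can be sketched as follows. The project's scripts are Bash; this Python version only illustrates the logic, and the function names `resolve_inputs` and `colorize` are invented for the example.

```python
import glob
import os
import sys


def resolve_inputs(input_dir, mask):
    """Expand a glob mask inside input_dir and fail early if nothing matches."""
    files = sorted(glob.glob(os.path.join(input_dir, mask)))
    if not files:
        # Mirrors the "die if input files don't exist" behaviour.
        raise FileNotFoundError(f"no input files match {mask!r} in {input_dir}")
    return files


def colorize(text, stream=sys.stdout):
    """Wrap text in ANSI green only when the stream is an interactive terminal."""
    if stream.isatty():
        return f"\033[32m{text}\033[0m"
    # Plain text keeps log files and piped output free of escape codes.
    return text


if __name__ == "__main__":
    try:
        for path in resolve_inputs(".", "*.xml.gz"):
            print(path)
        print(colorize("DONE"))
    except FileNotFoundError as error:
        sys.exit(str(error))
```

Failing before any processing step starts avoids producing empty reports that look like valid results.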
Documentation
- README.md: Adjust path to run helper script
- Create CONTRIBUTING
- Better definition of the tool in the README
- Adding sponsors section
- Adding Binghamton University Libraries to the list of users
- Add SonarCloud badge
- #196: Update README
- #244: Document dependencies
- Rename CONTRIBUTING to CONTRIBUTING.md
- Update test schema README file
CSV generation
- #216: Completeness: use proper CSV library to generate .csv
- #242: Validation: use proper CSV library to generate .csv
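Why a CSV library matters can be shown with a small example. It uses Python's standard `csv` module purely for illustration; the release notes do not name the library used in QA catalogue's Java code, and the sample row is invented.

```python
import csv
import io

# Invented sample row: the middle value contains both quotes and a comma.
row = ["245$a", 'Title with "quotes", and a comma', 42]

# Naive approach: join values with commas, no quoting or escaping.
naive = ",".join(str(value) for value in row)

# Library approach: the writer quotes the field and doubles embedded quotes.
buffer = io.StringIO()
csv.writer(buffer).writerow(row)
proper = buffer.getvalue()

# Read both lines back to see how many columns a consumer would get.
parsed_naive = next(csv.reader(io.StringIO(naive)))
parsed_proper = next(csv.reader(io.StringIO(proper)))

print(len(parsed_naive), parsed_naive)
print(len(parsed_proper), parsed_proper)
```

The naive line splits into four columns when read back, while the library-written line round-trips to the original three values.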
Other
- #227: The data field (without subfields) are categorized as "unknown origin" in marc-elements.csv
Dependency updates
- upgrade com.fasterxml.jackson.core from 2.13.4 to 2.15.0
- upgrade org.apache.logging.log4j from 2.19.0 to 2.20.0
- upgrade org.apache.solr from 9.1.0 to 9.2.0
- upgrade org.apache.spark from 3.3.1 to 3.3.2
- upgrade org.mongodb:bson from 4.7.2 to 4.9.1
- upgrade org.mongodb:mongo-java-driver from 3.12.11 to 3.12.13
- upgrade org.xerial:sqlite-jdbc from 3.39.3.0 to 3.41.2.1
Debugging, refactoring, performance improvement
- Implement Sonar suggestions.
- #269: Build failure: testing
- Add coveralls report integration
- Improve performance of classification analysis
- Improve test coverage
- Improving performance
- Fix a missing character from the Docker description.
Files
qa-catalogue-0.7.0-release.zip
: all the files needed to run the software. Download, unzip and go!
qa-catalogue-0.7.0-jar-with-dependencies.jar
: the Java library file with all the dependencies
qa-catalogue-0.7.0.jar
: the Java library file without dependencies