Changelog

Unreleased

v1.9.2 - 2024-05-01 Wed

Add #261: a constructor to return a pool of a given size of gnparsers.

v1.9.1 - 2023-10-13 Fri

Add: update modules.
Fix #259: allow diacritics in any UTF-8 normalization form.
Fix #258: allow authors with 2 dashes in the name.
Fix #256: fix normalization where a misplaced year changes the year of original authors.

v1.9.0 - 2023-10-12 Thu

Add: restore backward compatibility by creating a new flag --species-group-cut.

v1.8.0 - 2023-10-11 Wed

Add #255: normalize stemmed canonical of Aus bus bus to Aus bus. WARNING: this creates some backward incompatibility.
Add: sorting uses slices package.

v1.7.5 - 2023-09-26 Tue

Add: CSV and TSV files now provide verbatim authorship instead of the normalized one.
Add: a few more "termination words"
Fix #254: treat fa as forma.
Fix #253: process dem as an author word for Von dem Bush and the like.
Fix #251: do not process y as and for Rafael Arango y Molina.
Fix #249: allow cf at the end of the strings, cf for infraspecies.
Fix #248: do not escape double quotes for TSV output.
Fix #246: ignore ms at the end of the strings.

v1.7.4 - 2023-08-22 Tue

Fix #243: parse correctly Nassa pagoda var. acuta P. P. Carpenter, 1857.

v1.7.3 - 2023-06-17 Sat

Add #241: allow comma before ex authors.

v1.7.2 - 2023-03-09 Thu

Add #240: add tr. subtr. as ranks for combo-uninomials.

v1.7.1 - 2023-03-07 Tue

Add: upgrade all modules.

v1.7.0 - 2023-03-07 Tue

Add #238: stem takes into account the -ii suffix, macdonaldii -> macdonald.

v1.6.9 - 2022-11-10 Thu

Add #237: detect and normalize non-breaking hyphens. If other non-typical hyphens appear, they will be handled the same way.

v1.6.8 - 2022-10-01 Sat

Add: update all modules.

v1.6.7 - 2022-08-22 Mon

Add #231: more edge cases.
Add #230: Take into account mihi annotation.

v1.6.6 - 2022-05-15 Sun

Add #224: Creation of Nix packages for gnparser.

v1.6.5 - 2022-03-21 Mon

Add #223: Use PEG parser for preprocess instead of RegEx. This approach gives 15-17% speed increase.

v1.6.4 - 2022-03-19 Sat

Add #224: Parse correctly Italian authors with degli.

v1.6.3 - 2022-02-08 Tue

Add #222: Improve logs for NSQ, switch to zerologs library.

v1.6.2 - 2022-02-04 Fri

Fix #221: No parsing for names with cyanobacterium.
Fix #220: Crenarchaeote enrichment culture clone should stop parsing at enrichment.
Fix #219: filter out complex word during preprocessing for names like Aegla uruguayana complex.

v1.6.1 - 2022-02-01 Sat

Add: use NSQ logger from sfgrp/lognsq

v1.6.0 - 2022-01-22 Sat

Add #218: enable/disable logs for web-services, allow logs aggregation with NSQd.

v1.5.7 - 2021-11-26 Fri

Fix: parsed.NormalizeByType preserves period char.

v1.5.6 - 2021-11-21 Sun

Add #212: Set year from 'ex' authorship as a year of a name. Add 'ex' authors to list of all authors.
Add #211: PR #214 by @tobymarsden, general approach for non-... specific epithets.
Add #208: PR #210 by @tobymarsden, option to preserve diaereses.
Fix #213: Stop generating space between Mc, Mac and the rest of an author name.

v1.5.5 - 2021-11-17 Wed

Add #207: PR #209 by @tobymarsden, fix parsing of names with nudum specific epithet.

v1.5.4 - 2021-11-14 Sun

Add: different approach for normalize-by-type for words.
Add #205: allow genera starting with De-, Eu-, Le-, Ne- (by @tobymarsden).
Add #203: allow up to 2 dashes in genera (by @tobymarsden).

v1.5.3 - 2021-11-13 Sat

Add #202: add NormalizeMore function for Word.

v1.5.2 - 2021-11-10 Wed

Add #200: support for 'div.' rank in uninomial combinations.
Add #199: fixes for several names that were not parsed correctly.
Add #198: parse "Solanum tuberosum wila-k`oyu".
Add #97: do not parse "Cyanophage".
Add #85: parse names with a dagger character.
Add #84: parse "Muscicapa randi Amadon & duPont, 1970".
Add #83: parse authors like 'Laverde-R.'.

v1.5.1 - 2021-11-01 Mon

Add #191: support for ambiguous specific epithets

v1.5.0 - 2021-10-22 Fri

Add #194: support for cultivars' graft-chimeras (courtesy of @tobymarsden)

v1.4.2 - 2021-10-21 Thu

Add #196: parse authors with prefix 'ver'

v1.4.1 - 2021-10-07 Thu

Fix #195: parse multinomials where authorship is not separated by space.

v1.4.0 - 2021-09-04 Sat

Add #193: add TSV format for output.
Add #190: support prefixes do and de los for authors.
Add #187: support ter suffix for authors.
Add #186: support non-ASCII apostrophe in authors.

v1.3.3 - 2021-09-11 Wed

Add #176: refactoring of hybrid sign treatment (use PEG instead of RegEx for normalizing x, X, and ×).
Add #183: stop parsing after nec, non, fide, vide; treat ms in as in or ex for exAuthors.
Add #182: support for authors with prefixes ten, delle, dos.

v1.3.2 - 2021-08-02 Mon

Add #182: support Do, Oo, Nu 2-letter genera.
Add #53: exceptions to annotations (Bottaria nudum for example).
Fix: names where sp epithet starts with cf can be parsed now.

v1.3.1 - 2021-07-17 Sat

Add #180: Zenodo DOI.

v1.3.0 - 2021-06-29 Tue

Add #179: cultivars info to README.
Add #178: parse cultivars via REST API.
Add #177: parse botanical cultivars via web.
Add #173: cultivars parsing #174 @tobymarsden.
Add #172: authors initials with a dash like "B.-E.van Wyk".
Add: tests for cultivars (Toby Marsden)
Fix #174: Hybrid character is missed or wrong in details' Words section.

v1.2.0 - 2021-04-08 Thu

Add #169: option to capitalize first letter of name-strings.
Add #166: support 'fm.' as 'forma'.

v1.1.0 - 2021-03-21 Sun

Add #163: support bacterial Candidatus names.
Add #162: show PEG AST tree for debugging.
Add #161: add automatic tools dependency.
Add #160: use embed feature of Go v1.16.

v1.0.13 - 2021-02-23 Tue

Add: limit nightly builds to master only.
Fix #159: POST method contains wrong URL.

v1.0.12 - 2021-02-21 Sun

Add #154: parse names with ambiguous f. as forma if there is a space between the author and f.; if there is no space, parse as filius. Give an ambiguity warning in both cases.
Add: PHP example from @barotto about using pipes with gnparser.

v1.0.11 - 2021-02-20 Sat

Fix #153: flags csv=false and with_details=false trigger opposite behavior.

v1.0.10 - 2021-02-19 Fri

Add #152: change auto-prereleases from nightly to on master submit.
Add #151: do not parse names with (endo|ecto)?symbiont.
Add #150: ignore serovar/serotype in bacterial names.
Add #149: support abbreviated subgenus (Aus (B.) cus).

v1.0.9 - 2021-02-17 Wed

Add #146: unordered flag.
Add #145: better CI/build actions, add nightly binaries.
Fix #144: remove configuration file as it creates more problems than it solves.

v1.0.8 - 2021-02-15 Mon

Add: remove config message for CLI app.
Add: ldflags -s -w to decrease binary size.
Fix: header does not show in CSV format for stream.

v1.0.7 - 2021-02-14 Sun

Add #143: quiet flag to suppress showing progress output.
Fix #142: stream waits until the number of names equals the batch size.
Fix #141: config file is not created.

v1.0.6 - 2021-02-04 Thu

Add: update version handling, readme.

v1.0.5 - 2021-02-01 Mon

Add: remove gnlib package.
Add #140: remove config package.

v1.0.4 - 2021-01-23 Sat

Add: cleanup constructor methods names.

v1.0.3 - 2021-01-23 Sat

Add #139: make package names less abstract.

v1.0.2 - 2021-01-22 Fri

Fix #137: add correct VerbatimID for HTML-containing names.

v1.0.1 - 2021-01-20 Wed

Add #136: Man page
Add #100: Switch continuous integration to use GitHub Actions.
Add #129: Make c-binding usable for biodiversity parser.
Fix #135: Changes: SubGenus->Subgenus, InfraSpecies->Infraspecies

v1.0.0 - 2021-01-19 Tue

Add #127: Update documentation to v1.0.0.
Add #122: Implement parsing as a stream in addition to batch parsing.
Add #126: Update c-binding to v1.0.0.
Add #131: Add parameters "with_details" and "csv" to REST API.
Add #134: Transform "positions" section to "words" section.
Add #128: Add more examples to OpenAPI specification.
Add #125: Describe changes from v0.x to 1.x.
Add #132: Add context.Context to control lifespan of go routines.
Add #115: Migrate tests from ginkgo to plain tests.
Add #109: Move web package to io.
Add #124: Document warnings for each quality category.
Add #121: Convert package parser to use interfaces.
Add #120: CLI app for newly created functionality.
Add #119: Formatted output for output.Parsed.
Add #117: Convert failed parsing results to output.Parsed.
Add #114: Convert parsing result to output.Parsed.
Add #118: Add Verbatim and Year fields to the root of Authorship.
Add #107: Move grammar package to entity and rename to parser.
Add #110: Move stemmer to entity.
Add #113: Move str package to entity.
Add #112: Move preprocess package to entity.
Add #105: Move fs package to io.
Add #111: Move dict package to io.
Add #106: Describe main use-case via interface.
Add #104: Add configuration package.
Add #103: Create an output.Parsed object that can be used in Go and as JSON.
Add #101: Start using gnlib where it makes sense.
Add #99: Move code to GitHub and change links accordingly.
Add #95: Remove dependency on gRPC and protobuf.

v0.14.4 - 2020-12-15 Tue

Add #96: Do not parse names starting with "Candidatus".
Add #93: Parse 'y' (Spanish '&') as an author separator.

v0.14.3 - 2020-12-13 Sun

Add #95: Remove make dependency on gRPC tooling.
Add #94: Do not parse names with "bacterium" epithet.

v0.14.2 - 2020-05-12 Tue

Add #90: Allow ß in names.
Add #89: Support subspec. as a rank.
Add #82: Support authors with prefix zu.

v0.14.1 - 2020-05-07 Thu

Fix: Change web API from default to Compact format to get correct API output.

v0.14.0 - 2020-05-07 Thu

Add #81: Add year range in format "1888/89".
Add #80: Add Cardinality to parser outputs.
Add #79: Make CSV the default format for CLI.
Add #78: Take into account non-virus names that look like virus names.

v0.13.1 - 2020-03-05 Thu

Fix #77: Memory leak when used as clib.
Fix #76: Non ASCII apostrophe does not show up in canonical.

v0.13.0 - 2020-02-12 Wed

Add #74: Simple format output is now in CSV format.
Add #73: Improve speed by using ragel's FSM instead of regex.
Fix #75: Normalize subspecies to subsp. instead of ssp..
Fix #72: Surrogate detection by gnparser.ParseToObject method.

v0.12.0 - 2019-11-18 Mon

Add #71: do not parse 'Unnamed clade...'.
Add #69: gnparser as a shared C library.
Add: Make dynamic version using ldflags.
Fix #70: parse 'Remera cvancarai' correctly.

v0.11.0 - 2019-10-24 Thu

Add #68: add stemmed version of canonical form to outputs.
Add: benchmarks to gnparser_test.go

v0.10.0 - 2019-09-10 Tue

Add #67: field authorship of the name for JSON output
Add #66: remove HTML tags during parsing instead of a separate step.
Add #61: handle authors that end with a word "bis".
Add #60: handle correctly deprecated ranks with Greek letters.
Fix #62: parser breaks on Drepanolejeunea (Spruce) (Steph.).

v0.9.0 - 2019-08-16 Fri

Add #65: gRPC is now able to return a protobuf object instead of a JSON string (only for the ParseArray function so far). The same protobuf object is now also used by the gnparser.ParseToObject function.
Add #64: gRPC method ParseArray that cleans and parses an input from an array of names instead of a stream.
Add #63: abbreviation for form or forma is now f. instead of fm..

v0.8.0 - 2019-04-10 Wed

Add #51: strings like Aus (Bus) are parsed differently for ICN and ICZN names. If the string inside the parentheses matches a known ICN author name, it is parsed as Uninomial (Author); otherwise it is parsed as Aus subgen. Bus.

v0.7.5 - 2019-03-31 Sun

Add #59: method ParseToObject to avoid JSON in Go programs.
Add #58: parse Aus (Bus) as Uninomial (Author) to prevent botanical authors from appearing as subgenera. We need a better solution for this.
Add #57: warning in cases of an ambiguous filius.
Fix #56: bug Ambrysus-Stål, 1862 breaks parser.

v0.7.4 - 2019-02-12 Tue

Add #48: transliteration of diacriticals.
Add #43: notho- (hybrids) rank supported.
Add #52: genera with hyphens with lower or upper char after hyphen.
Add #49: multiple hyphens in specific epithet.

v0.7.3 - 2019-02-04 Mon

Add #54: add cleaning functions to gRPC
Add #46: add supg. rank
Add #45: add natio rank (deprecated ICZN rank)
Add #44: documentation for canonicalName fields
Add #42: tests for command line app

v0.7.2 - 2019-02-01 Fri

Add #41: parse/clean multiple names from standard input.

v0.7.1 - 2019-01-24 Thu

Add #40: add names with missing parenthesis for combination authors.
Fix: remove typo for Scala parser URL on the parser web-page.

v0.7.0 - 2019-01-23 Wed

Add #38: docker image can do gRPC, REST, CLI
Add #37: flag for cleanup HTML entities and tags, underscores are part of parsing.
Add #39: documentation for contributors.
Add #31: continuous integration.
Add #36: substitute underscores with spaces for Newick format.
Add #34: escape HTML entities, remove common tags.
Add #33: Web-based user interface and REST API.

v0.6.0 - 2019-01-16 Wed

Add #35: gRPC method to preserve order in output according to input
Add #30: write inline and README documentation.
Add #29: docker and dockerhub support.
Add #26: get all parser rules to CamelCase format.

v0.5.1 - 2019-01-15 Tue

Add: fix Makefile
Add #28: non-ASCII apostrophe support.
Add #27: agamosp. agamossp. agamovar. ranks.
Add #25: reorganize output to be more readable and logical.
Add #24: gRPC server for receiving name-strings and streaming back the parsed results.
Add #23: Remove multiple years. Now name can have only one year.
Add #22: Run the parser against 24 million names from global names index and fix found problems.
Add #21: Rebuilds tests into test_data_new.txt file. It is important for making global changes in tests.
Add #20: Pass all tests made for Scala gnparser. Tickets 1-19 are about approaching #20.

Footnotes

This document follows changelog guidelines
