Add fuzzy matching during submission processing to correct spelling mistakes #1504

Open
spwoodcock opened this issue Apr 30, 2024 · 0 comments
Labels
backend enhancement New feature or request

Is your feature request related to a problem? Please describe.

  • Users sometimes slightly misspell the name of something when typing it into ODK Collect.
  • For example, in Bali we mapped buildings inside compounds. The compound name was a free-text field, and users entered names that varied only slightly from one another.

Describe the solution you'd like

  • We could probably use a library like thefuzz (already in osm-fieldwork, I believe) for fuzzy string matching.
  • We could group similar string values together during submission processing, which would give better insights once the data is cleaned.
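
A minimal sketch of the grouping idea, not a proposed implementation. It uses the standard library's `difflib.SequenceMatcher` as a stand-in so that it runs without extra dependencies; thefuzz's `fuzz.ratio` returns a comparable 0-100 score and could be swapped in. The function names, the threshold of 85, and the compound names are all hypothetical.

```python
from collections import Counter
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Return a 0-100 similarity score, ignoring case and surrounding whitespace."""
    return 100 * SequenceMatcher(None, a.strip().lower(), b.strip().lower()).ratio()


def group_similar(values: list[str], threshold: float = 85) -> dict[str, str]:
    """Map every distinct value to a canonical spelling.

    Values are visited from most to least frequent, so the most common
    spelling in each group becomes the canonical one (an assumption: the
    majority spelling is taken to be the correct one).
    """
    canonicals: list[str] = []
    mapping: dict[str, str] = {}
    for value, _count in Counter(values).most_common():
        # Reuse the first existing canonical that is similar enough.
        match = next(
            (c for c in canonicals if similarity(value, c) >= threshold), None
        )
        if match is None:
            # Nothing close enough: this value starts its own group.
            canonicals.append(value)
            match = value
        mapping[value] = match
    return mapping


# Hypothetical free-text compound names from submissions.
submitted = ["Puri Agung", "Puri Agung", "Puri Agnug", "puri agung ", "Pura Dalem"]
mapping = group_similar(submitted)
cleaned = [mapping[v] for v in submitted]
print(cleaned)
```

Here the misspelled and differently-cased variants collapse into "Puri Agung", while "Pura Dalem" stays a separate value.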

Describe alternatives you've considered

  • Of course, there is a risk of false positives, i.e. incorrectly modifying values that were actually correct.
  • We should set a high matching threshold, so that only very similar words are matched.
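
To illustrate why the threshold matters, here is a rough sketch using `difflib.get_close_matches` from the standard library (again a stand-in for thefuzz). The helper `correct_value`, the cutoff values, and the compound names are hypothetical; a real threshold would need tuning against actual submissions.

```python
from difflib import get_close_matches


def correct_value(value: str, known_names: list[str], cutoff: float = 0.9) -> str:
    """Return the closest known name if it is similar enough, else the value unchanged.

    `cutoff` is difflib's 0-1 similarity ratio; a higher cutoff means fewer
    corrections and fewer false positives.
    """
    matches = get_close_matches(value, known_names, n=1, cutoff=cutoff)
    return matches[0] if matches else value


# Hypothetical list of known compound names.
known = ["Puri Agung", "Puri Anyar"]

# An obvious typo is corrected even with a strict cutoff.
typo_fixed = correct_value("Puri Agunng", known)

# A genuinely different name is left alone with a strict cutoff...
strict = correct_value("Puri Ayu", known)

# ...but a loose cutoff wrongly rewrites it to one of the known names.
loose = correct_value("Puri Ayu", known, cutoff=0.6)

print(typo_fixed, "|", strict, "|", loose)
```

With the strict cutoff only the typo is changed; with the loose cutoff a distinct compound name is overwritten, which is the false-positive case described above.
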
@spwoodcock spwoodcock added enhancement New feature or request backend labels Apr 30, 2024