This repository has been archived by the owner on Sep 21, 2022. It is now read-only.

Implement dedup managers #10

Open
2 tasks
jeremi opened this issue Jun 4, 2022 · 5 comments

@jeremi
Member

jeremi commented Jun 4, 2022

  • Dedup by Phone numbers
  • Dedup by ID document
@jeremi jeremi added this to the 1.0 milestone Jun 4, 2022
@jeremi jeremi modified the milestones: 1.0, 0.3 Jul 2, 2022
@emjay0921 emjay0921 self-assigned this Jul 11, 2022
@jeremi
Member Author

jeremi commented Jul 20, 2022

This doesn't seem finished:

  • It does not deduplicate individuals by ID or phone number; it only does so for group members.
  • Dedup on phone numbers should use the phone number as normalized by libphonenumber, not the raw one.
  • Dedup on ID numbers should take the ID type into account, i.e., if two people have the same ID number but for different ID types, this is not a duplicate.
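
To illustrate the last point, here is a minimal sketch of grouping on the pair (ID type, ID number) rather than on the number alone. The record shape, the function name `find_duplicate_ids`, and the trimming/upper-casing of the number are assumptions made for the example; they are not taken from the module's code.

```python
from collections import defaultdict


def find_duplicate_ids(id_records):
    """Return {(id_type, id_number): [registrant ids]} for keys shared by 2+ registrants.

    id_records is an iterable of (registrant_id, id_type, id_number) tuples.
    This tuple shape is hypothetical, chosen only to keep the sketch self-contained.
    """
    groups = defaultdict(set)
    for registrant_id, id_type, id_number in id_records:
        if not id_number:
            continue  # an empty ID number cannot identify a duplicate
        # Assumption: surrounding whitespace and letter case are not significant.
        key = (id_type, id_number.strip().upper())
        groups[key].add(registrant_id)
    # Keep only keys that more than one registrant shares.
    return {key: sorted(ids) for key, ids in groups.items() if len(ids) > 1}


records = [
    (1, "passport", "A123"),
    (2, "national_id", "A123"),  # same number, different type: not a duplicate
    (3, "passport", "a123 "),    # same type and number as registrant 1
]
duplicates = find_duplicate_ids(records)
```

With these records, only registrants 1 and 3 end up in the same group; registrant 2 shares the number but not the ID type.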

@jeremi
Member Author

jeremi commented Jul 27, 2022

Can you link to where the code is? Do you have tests?

@emjay0921
Contributor

I'll fix this now.

@emjay0921
Contributor

I updated this issue.

  • It now deduplicates individuals by phone and ID.
  • It now sanitizes the phone_number when checking for duplicates. For some reason I can't get the phone_sanitized field, so the code now sanitizes the phone_number itself.
  • ID numbers are now checked by ID type and ID number together, i.e., two persons with the same ID number but different ID types are not treated as duplicates.
  • I also tried to optimize some of the dedup code to minimize the load.
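
As a rough illustration of sanitizing the raw phone_number in code before comparing, here is a self-contained sketch. The `naive_normalize` helper is a deliberately simple stand-in and is not libphonenumber; the default country code `"63"`, the function names, and the record shape are all assumptions for the example. In real code, the normalizer could instead wrap the `phonenumbers` package (`phonenumbers.parse` followed by `phonenumbers.format_number` with `PhoneNumberFormat.E164`).

```python
import re
from collections import defaultdict


def naive_normalize(raw, default_country_code="63"):
    """Very rough E.164-like normalization; a stand-in for libphonenumber.

    The default country code is an arbitrary assumption for this sketch.
    """
    if not raw:
        return None
    digits = re.sub(r"\D", "", raw)  # drop spaces, dashes, parentheses, etc.
    if not digits:
        return None
    if raw.strip().startswith("+"):
        return "+" + digits  # already in international form
    if digits.startswith("00"):
        return "+" + digits[2:]  # 00 international prefix
    if digits.startswith("0"):
        return "+" + default_country_code + digits[1:]  # national trunk prefix
    return "+" + default_country_code + digits


def find_duplicate_phones(phone_records, normalize=naive_normalize):
    """Return {normalized number: [registrant ids]} for numbers shared by 2+ registrants.

    phone_records is an iterable of (registrant_id, raw_phone) tuples (hypothetical shape).
    """
    groups = defaultdict(set)
    for registrant_id, raw_phone in phone_records:
        number = normalize(raw_phone)
        if number is None:
            continue  # unusable numbers are skipped rather than matched to each other
        groups[number].add(registrant_id)
    return {number: sorted(ids) for number, ids in groups.items() if len(ids) > 1}


phones = [
    (1, "+63 917 123 4567"),
    (2, "0917-123-4567"),  # same number as registrant 1, written differently
    (3, "0917 765 4321"),
]
phone_duplicates = find_duplicate_phones(phones)
```

Passing the normalizer as a parameter keeps the grouping logic independent of how numbers are sanitized, so a stored sanitized field or a libphonenumber-based function could be swapped in later.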

@emjay0921
Contributor

You can see the code in this Commit
