Translation crawler #95

Open
EmilLuta opened this issue Oct 24, 2017 · 5 comments

@EmilLuta
Collaborator

@bluzi Hi there. I strongly believe that a crawler could be run from time to time to fetch translations from some official providers (for now, let's stick to Wikipedia).

Therefore, the goal would be to write a scraper that goes through all entries and tries to fetch meanings, translations, and aliases from the given sources.

Let me know what you think. I could lend a hand with Python, if that's alright.
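
To make the idea concrete, here is a minimal sketch of what one lookup could look like, assuming the crawler uses the MediaWiki API's interlanguage links (`prop=langlinks`) as a stand-in for translations. The function names, the User-Agent string, and the choice of English Wikipedia as the starting point are all assumptions for illustration, not code from this project. Note that interlanguage links are page titles in other languages, so they only approximate name translations.

```python
import json
import urllib.parse
import urllib.request

# Assumption: English Wikipedia is the starting point for every lookup.
API_URL = "https://en.wikipedia.org/w/api.php"


def build_langlinks_url(title):
    """Build a MediaWiki API URL asking for a page's interlanguage links."""
    params = {
        "action": "query",
        "prop": "langlinks",
        "titles": title,
        "lllimit": "max",
        "format": "json",
        "formatversion": "2",
    }
    return API_URL + "?" + urllib.parse.urlencode(params)


def parse_langlinks(response):
    """Map language code -> linked page title; empty dict if nothing is found."""
    translations = {}
    for page in response.get("query", {}).get("pages", []):
        for link in page.get("langlinks", []):
            # Older response formats store the title under "*" instead of "title".
            translations[link["lang"]] = link.get("title", link.get("*"))
    return translations


def fetch_translations(title):
    """Fetch and parse interlanguage links for one title (network call)."""
    request = urllib.request.Request(
        build_langlinks_url(title),
        headers={"User-Agent": "translation-crawler-sketch/0.1"},  # hypothetical
    )
    with urllib.request.urlopen(request) as reply:
        return parse_langlinks(json.load(reply))
```

Keeping the parsing separate from the network call means the extraction logic can be checked against saved responses without hitting Wikipedia on every run.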

@bluzi
Owner

bluzi commented Oct 24, 2017

Hey @EmilLuta,

Our current method of data collection is community based. I find this method very accurate; it's the same method used by Wikipedia, for instance.

However, if you think you can create a crawler that will be accurate enough, I'd love to add it to this project.

@EmilLuta
Collaborator Author

EmilLuta commented Oct 24, 2017

@bluzi I'll give it a go. The scope of this would be to enhance the current name entries, not to add new ones. I can see your point of view. Looking forward to seeing how we'll be able to validate 'accurate enough' once the crawler is done. I'll keep you posted.
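
As a rough sketch of that scope, assuming entries can be represented as a mapping from name to a `{language: translation}` dict (a hypothetical shape, not necessarily the project's actual data format), the crawler output could be merged so that it only fills gaps in existing entries:

```python
def enrich_entries(existing, crawled):
    """Return a copy of `existing` with gaps filled from `crawled`.

    Names absent from `existing` are ignored, and values that are already
    present (the community-provided data) are never overwritten.
    """
    enriched = {name: dict(translations) for name, translations in existing.items()}
    for name, found in crawled.items():
        if name not in enriched:
            continue  # out of scope: the crawler does not add new entries
        for lang, value in found.items():
            # setdefault keeps the existing value when the language is present
            enriched[name].setdefault(lang, value)
    return enriched
```

Leaving community data untouched also makes it possible to compare crawled values against existing ones later, which is one possible way to judge whether the crawler is 'accurate enough'.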

@bluzi
Owner

bluzi commented Oct 24, 2017

Good to see we're on the same page here. Can't wait to see the outcome of this.

@EmilLuta
Collaborator Author

EmilLuta commented Oct 26, 2017

@bluzi Some work has been done. Wikipedia doesn't seem to be a very reliable source (only a couple of names have translations), and even though this works, it's far from complete. My suggestion would be to create a new branch on your repo and integrate what exists so far, just to keep it for reference. From this point on, I'm going to look at https://www.behindthename.com/ and come up with a PoC ASAP.

Let me know what you think.

@bluzi
Owner

bluzi commented Oct 28, 2017

@EmilLuta I added you as a collaborator, so feel free to create a branch and push your code.
