Filename extracted as URL #43
Comments
#47 fixed
First of all, thank you for your time working on the patch. However, I am not quite sure about the fix; please have a look at my comment: #47 (comment). Let's discuss this topic a bit; maybe we can agree on something.
As a growth point: you could add some probability to the extraction algorithm. For example, to decrease the false positive rate, you could use the frequency of TLD usage as an attribute (weight), so that the probability of a detected domain (for example
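The TLD-weighting idea in the comment above could look roughly like the sketch below. This is only an illustration, not how the library works: the `TLD_WEIGHTS` table, its numbers, the `DEFAULT_WEIGHT`, the threshold, and all function names are hypothetical and made up for the example. A real implementation would need measured usage statistics.

```python
# Hypothetical usage weights per TLD. The numbers are illustrative only,
# not measured statistics.
TLD_WEIGHTS = {
    "com": 0.9,
    "org": 0.7,
    "io": 0.6,
    "zip": 0.05,  # valid TLD, but far more often a file extension
    "py": 0.1,    # valid ccTLD, but often a script extension
}
DEFAULT_WEIGHT = 0.1  # assumed weight for TLDs missing from the table


def tld_of(candidate: str) -> str:
    """Return the last dot-separated label of the host part, lowercased."""
    if "://" in candidate:
        candidate = candidate.split("://", 1)[1]  # drop the scheme
    host = candidate.split("/", 1)[0]             # drop the path
    host = host.split(":", 1)[0]                  # drop the port
    return host.rsplit(".", 1)[-1].lower()


def score(candidate: str) -> float:
    """Weight of a candidate URL, based only on its TLD."""
    return TLD_WEIGHTS.get(tld_of(candidate), DEFAULT_WEIGHT)


def filter_candidates(candidates, threshold=0.5):
    """Keep candidates whose TLD weight reaches the threshold."""
    return [c for c in candidates if score(c) >= threshold]
```

With these made-up weights, `filter_candidates(["example.com", "backup.tar.zip"])` keeps only `example.com`. The trade-off is that a genuine URL on a rarely used TLD would be dropped as well, so the threshold would probably have to be configurable.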
From the following input (which is a legit archive filename):
URL is extracted using `find_urls`: