Library disables CTRL+C when used with Fiber (context-based router) #61

Open
caner-cetin opened this issue Feb 9, 2023 · 11 comments

@caner-cetin

This is dumb. Probably the dumbest bug I have ever seen, and I have worked with JavaScript, to give you a sense of scale.

Check this:

2023-02-09.20-01-55.mp4
@caner-cetin
Author

Same with the Gin library.

2023-02-09.20-06-54.mp4

@caner-cetin
Author

Same with Chi.

There is a lot of trial and error while scraping, like "this div didn't work? fuck it, try this", and every time I want to retry I have to close the terminal, cd back into the project folder, and run it again.

Not production-breaking, not the end of the world, just a sheer annoyance if you are working on a REST API.

2023-02-09.20-13-06.mp4

@caner-cetin
Author

caner-cetin commented Feb 9, 2023

Also, Chi is not a context-based router. So in both context-based routers and routers with the standard (w http.ResponseWriter, r *http.Request) handler signature, CTRL+C is simply disabled.

@arnklint

Having the same issue. Is there a workaround, or a good way to solve this? Maybe we can open a PR?

@arnklint

Tried just exposing a Stop() func:

// Stop sets the shutdown flag so the crawler can be stopped from calling code.
func (g *Geziyor) Stop() {
	g.shutdown = true
}

@caner-cetin
Author

@arnklint The library is probably dead. If you need a scraping library with Remote JavaScript Execution (RJE), use Go-Rod.

If you don't need RJE and only scrape static webpages, go with plain HTTP requests and the GoQuery library. Or, if you need automated pagination, go with Colly.

There are a lot of alternatives. The library is long dead.

@caner-cetin
Author

Seems like the library is not dead: they merged a PR last week, but that is the only commit since February 20. You should look for a better-maintained lib. I already gave you alternatives and can give you more if you want.

@musabgultekin
Collaborator

Hey, sorry, I'm a full-time engineer now, so I don't have much time to maintain this library. Sorry for the inconvenience.

@caner-cetin
Author

@musabgultekin Can you at least note somewhere that "this library is not actively maintained", or, you know, deprecate it? At its core there are so many bugs waiting to be fixed, and in this state Geziyor should be deprecated in favor of Go Colly. Colly is a lot more stable and has been maintained for 6 years now. The only thing Colly is missing compared to Geziyor is Remote JavaScript Execution, and in my experience, RJE doesn't even work right now in Geziyor.

@caner-cetin
Author

And even if you need RJE, headless browsers are so much better anyway. Go-Rod and chromedp really excel at this.

As I said, so many people stumble upon this unmaintained scraper/crawler library, but it should be deprecated in favor of Go-Rod and Colly.

@caner-cetin
Author

We will get another answer in another month I guess.
