Callback method not called all the time while using js rendering #14

Open
pkmishra opened this issue Nov 16, 2019 · 5 comments

Comments

@pkmishra

I am facing a weird issue where data goes missing randomly for some pages. While debugging, I found that the callback method is not always called, even though a GET request is made for every link given. No error is logged, so I am not sure how to give you more data points. The only option is to attach a debugger and see whether I can replicate it.
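One way to narrow this down without a debugger is to record every URL that is requested and every URL that reaches the callback, then list the difference. The following is a minimal sketch using only the Go standard library; `CallbackTracker` and its methods are hypothetical helper names, not part of the library discussed in this issue.

```go
package main

import (
	"sort"
	"sync"
)

// CallbackTracker is a hypothetical helper that remembers which URLs were
// requested and which of them actually reached the parse callback.
type CallbackTracker struct {
	mu        sync.Mutex
	requested map[string]bool
	parsed    map[string]bool
}

// NewCallbackTracker returns an empty tracker.
func NewCallbackTracker() *CallbackTracker {
	return &CallbackTracker{
		requested: make(map[string]bool),
		parsed:    make(map[string]bool),
	}
}

// Requested marks a URL as scheduled. Call it where the request is issued.
func (t *CallbackTracker) Requested(url string) {
	t.mu.Lock()
	defer t.mu.Unlock()
	t.requested[url] = true
}

// Parsed marks a URL as having reached the callback. Call it as the first
// statement of the callback.
func (t *CallbackTracker) Parsed(url string) {
	t.mu.Lock()
	defer t.mu.Unlock()
	t.parsed[url] = true
}

// Missing returns, in sorted order, the URLs that were requested but never
// reached the callback.
func (t *CallbackTracker) Missing() []string {
	t.mu.Lock()
	defer t.mu.Unlock()
	var missing []string
	for u := range t.requested {
		if !t.parsed[u] {
			missing = append(missing, u)
		}
	}
	sort.Strings(missing)
	return missing
}
```

Printing `Missing()` once the crawl finishes gives a concrete list of affected URLs to include in a report.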

@musabgultekin
Collaborator

musabgultekin commented Nov 16, 2019

I have a guess: a middleware may have canceled your requests. Three middlewares do this:

  1. Allowed Domains (cancels requests on hosts that are not on this list)
  2. Duplicate Requests
  3. Robots.txt file

I had commented out the logging in 1 and 2, so it could be one of them. Can you check those? If it is one of them, I need to add logging so that issues like this are easier to debug in the future.
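To illustrate how the three checks above could report why a request was dropped, here is a minimal sketch that uses only the Go standard library. It is not the library's actual middleware code: `Filter`, `NewFilter`, `Check`, and the reason strings are hypothetical, and the robots.txt rule is reduced to a list of disallowed path prefixes.

```go
package main

import (
	"net/url"
	"strings"
)

// Filter is a hypothetical stand-in for the three cancelling middlewares.
// It is not safe for concurrent use.
type Filter struct {
	allowedDomains []string        // empty means every host is allowed
	disallowed     []string        // path prefixes standing in for robots.txt rules
	seen           map[string]bool // URLs that already passed the filter
}

// NewFilter builds a Filter from allowed hosts and disallowed path prefixes.
func NewFilter(allowedDomains, disallowed []string) *Filter {
	return &Filter{
		allowedDomains: allowedDomains,
		disallowed:     disallowed,
		seen:           make(map[string]bool),
	}
}

// Check reports whether a request may proceed. When it may not, the second
// return value names the check that cancelled it, so the caller can log it.
func (f *Filter) Check(rawURL string) (bool, string) {
	u, err := url.Parse(rawURL)
	if err != nil {
		return false, "invalid URL"
	}
	if len(f.allowedDomains) > 0 {
		allowed := false
		for _, d := range f.allowedDomains {
			if u.Hostname() == d {
				allowed = true
				break
			}
		}
		if !allowed {
			return false, "host not in allowed domains"
		}
	}
	if f.seen[rawURL] {
		return false, "duplicate request"
	}
	for _, prefix := range f.disallowed {
		if strings.HasPrefix(u.Path, prefix) {
			return false, "disallowed by robots.txt"
		}
	}
	f.seen[rawURL] = true
	return true, ""
}
```

Logging the returned reason for every rejected URL would show directly whether one of these checks explains the missing callbacks.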

@musabgultekin
Collaborator

I updated the Chrome module and added logging to the middlewares. Can you check again?

@pkmishra
Author

pkmishra commented Nov 17, 2019

It doesn't look like it falls under any of the categories mentioned, but let me check.

@pkmishra
Author

No luck yet. I intermittently get ERROR: encountered exception 'Uncaught' (9583:29). I also observed that JS rendering doesn't honor allowed domains. For example, the https://www.facebook.com/ pixel is rendered with the message Crawled: (200) <GET https://www.facebook.com......>

@musabgultekin
Collaborator

I cannot replicate the errors you have encountered. I need more details; a real code snippet would be awesome.
Allowed domains are only applied to the first request. If a website makes external requests while rendering, we don't block them, because blocking them could prevent pages from loading properly in Chrome.
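The behavior described above can be modeled in a few lines. This is only a sketch of the stated policy, not the library's code: `Blocked` and its `isInitial` parameter are hypothetical names, and only the Go standard library is used.

```go
package main

import "net/url"

// hostAllowed reports whether the URL's host appears in the allowed list.
func hostAllowed(rawURL string, allowed []string) bool {
	u, err := url.Parse(rawURL)
	if err != nil {
		return false
	}
	for _, d := range allowed {
		if u.Hostname() == d {
			return true
		}
	}
	return false
}

// Blocked models the policy from the comment above: the allowed-domains
// check applies only to the initial request. Requests the page itself makes
// while rendering (scripts, tracking pixels) are never blocked.
func Blocked(rawURL string, isInitial bool, allowed []string) bool {
	if !isInitial || len(allowed) == 0 {
		return false
	}
	return !hostAllowed(rawURL, allowed)
}
```

Under this model, a request to https://www.facebook.com/ issued by a page during rendering passes through even when that host is not in the allowed list, which matches the Crawled message reported earlier in the thread.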


2 participants