Create ScrapedPage object #32

Open
jeremybmerrill opened this issue Jan 15, 2014 · 1 comment

@jeremybmerrill
Contributor

This is the object that Scraper#scrape would yield, instead of the HTML, the URL, the instance page's index, etc.

This ScrapedPage object -- which might inherit from Nokogiri::HTML -- would contain the raw HTML, the parsed HTML, the URL, the index page from which the instance page was linked (if present), a reference to the index page's ScrapedPage object, and the instance page's index (i.e. ordinal count) of pages linked to from the index page.
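To make the list of fields above concrete, here is a minimal sketch. The field names (`raw_html`, `parsed`, `index_page`, `instance_index`) and the `index_url` helper are assumptions chosen for illustration, not Upton's actual implementation. The parsed document is passed in rather than built here, so the sketch does not depend on Nokogiri being loaded; in practice it would presumably come from `Nokogiri::HTML(raw_html)`.

```ruby
# Hypothetical sketch of the proposed object; all field names are assumptions.
ScrapedPage = Struct.new(
  :raw_html,        # the unparsed HTML string
  :parsed,          # the parsed document (e.g. the result of Nokogiri::HTML)
  :url,             # URL of this page
  :index_page,      # ScrapedPage for the index page that linked here, if any
  :instance_index,  # ordinal position among the links on the index page
  keyword_init: true
) do
  # URL of the index page that linked to this page, or nil if there was none.
  def index_url
    index_page && index_page.url
  end
end

index = ScrapedPage.new(raw_html: "<a href='/a'>A</a>", url: "http://example.com/")
page  = ScrapedPage.new(
  raw_html: "<h1>A</h1>",
  url: "http://example.com/a",
  index_page: index,
  instance_index: 0
)
```

Holding a reference to the index page's object, rather than only its URL, lets a block reach the index page's own HTML without a second request.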

This would be a breaking change, so it is further from being implemented in stable Upton.

@jeremybmerrill jeremybmerrill added this to the 1.0.0 milestone Feb 15, 2014
@jeremybmerrill
Contributor Author

Implemented in future (for 1.0.0) in 31cbf41

Will be minimally breaking, since missing methods on Page are passed through to Nokogiri::HTML.
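A rough sketch of that pass-through, assuming it is done with `method_missing`. The class body below is illustrative and is not the code from the commit above. A plain String stands in for the parsed Nokogiri document so the sketch runs without any gems.

```ruby
# Hypothetical wrapper; `doc` stands in for a parsed Nokogiri document.
class Page
  attr_reader :url, :instance_index

  def initialize(doc, url, instance_index = nil)
    @doc = doc
    @url = url
    @instance_index = instance_index
  end

  # Forward any method Page does not define to the wrapped document.
  def method_missing(name, *args, &block)
    if @doc.respond_to?(name)
      @doc.public_send(name, *args, &block)
    else
      super
    end
  end

  # Keep respond_to? consistent with what method_missing will accept.
  def respond_to_missing?(name, include_private = false)
    @doc.respond_to?(name, include_private) || super
  end
end

# A String is used as the wrapped object here purely for illustration.
page = Page.new("<h1>Title</h1>", "http://example.com/a", 0)
page.upcase  # forwarded to the wrapped object
page.url     # handled by Page itself
```

With this shape, existing blocks that call document methods on the yielded value keep working, while new blocks can also ask the page for its URL or index.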

Maybe I should implement this even-less-breakingly in 0.4.0 by still passing the instance_index, instance_url, etc. attrs through to blk.call?
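One way that could look, as a sketch only: yield the page object first and keep passing the old positional values after it. `PageStub` and `scrape_pages` are hypothetical names, and the argument order (page, URL, index) is an assumption based on the description at the top of this issue.

```ruby
# Hypothetical stand-in for the page object; not Upton's actual class.
PageStub = Struct.new(:html, :url, :instance_index)

# Yields the page object first, then the legacy positional values.
def scrape_pages(pages, &blk)
  pages.map { |page| blk.call(page, page.url, page.instance_index) }
end

pages = [PageStub.new("<h1>A</h1>", "http://example.com/a", 0)]

# Old-style block: still receives the URL and index as separate arguments.
old_style = scrape_pages(pages) { |html, url, index| [url, index] }

# New-style block: takes only the page; a proc silently drops the extra arguments.
new_style = scrape_pages(pages) { |page| page.url }
```

The first argument is now a page object rather than an HTML string, so old blocks would only keep working if that object behaves enough like what they expected.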
