Coldwell Banker Scraper

Description

This is a Ruby script to scrape products from https://www.coldwellbankerhomes.com.

The scraped state, region, and product data is saved as CSV files in the output/ directory.

This scraper script uses

Static HTML (DOM) parsing for links/general info
Semantic annotation recognition (product/residence microformat) for parsing estate-specific data embedded in the product pages

Workflow: how it works

Features

Service Object pattern: each service exposes a single public method, #call
Ruby executable script
All required gems installed with Bundler
curl support via Curb for fetching page HTML
Nokogiri for HTML parsing with XPath and CSS selector support
CSV export via Ruby's CSV class
Logging via Ruby's Logger class
Code style enforced with RuboCop
Code quality reports via RubyCritic
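
As an illustration of how several of these features fit together, here is a minimal sketch of a service object with a single public #call method that writes rows to CSV and logs the result. The class name `ExportRegions`, the column names, and the sample row are hypothetical and are not taken from the repository.

```ruby
require 'csv'
require 'logger'
require 'tmpdir'

# Hypothetical service object: configured in the constructor, run via #call.
class ExportRegions
  HEADERS = %w[state region url].freeze

  def initialize(rows, path, logger: Logger.new($stdout))
    @rows = rows
    @path = path
    @logger = logger
  end

  # The only public method: writes a header row plus one line per record.
  def call
    CSV.open(@path, 'w') do |csv|
      csv << HEADERS
      @rows.each { |row| csv << row.values_at(*HEADERS) }
    end
    @logger.info("Saved #{@rows.size} rows to #{@path}")
    @path
  end
end

# Usage with made-up data, written to the system temp directory.
rows = [{ 'state' => 'FL', 'region' => 'Miami', 'url' => '/fl/miami/' }]
path = File.join(Dir.tmpdir, 'regions_example.csv')
ExportRegions.new(rows, path).call
```

Keeping #call as the only public method means every service is invoked the same way, so the executable script can chain them without knowing their internals.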

Requirements

System: Linux or macOS
Git
Ruby version manager (rbenv or RVM)
Ruby 2.5.0
Bundler
Gems installed via Bundler Gemfile

Installation

Download code from repository

Clone with SSH:

$ git clone git@github.com:alex-petr/coldwell_banker_scraper.git

Or clone with HTTPS:

$ git clone https://github.com/alex-petr/coldwell_banker_scraper.git

rbenv (for macOS)

$ cd coldwell_banker_scraper/ && brew install rbenv

Ruby

$ rbenv install 2.5.0

Install `Bundler` and all required gems

$ gem install bundler && bundle

Tests

No test suite is available. To verify that the scraper works, run it and check the terminal output and the CSV files in the output/ directory.

Usage

$ bin/scraper

After it runs, the script generates a set of CSV files inside the output/ directory.
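
A quick way to inspect the generated files is to count the rows in each one. The helper below is a hypothetical sketch, not part of the repository, and it assumes every CSV file starts with a header row.

```ruby
require 'csv'

# Returns { file name => number of data rows } for each CSV file in dir.
# Assumes the first line of every file is a header row.
def csv_summary(dir)
  Dir.glob(File.join(dir, '*.csv')).sort.map do |path|
    rows = CSV.read(path, headers: true)
    [File.basename(path), rows.size]
  end.to_h
end

# Prints nothing if the output/ directory is missing or empty.
csv_summary('output').each do |name, count|
  puts "#{name}: #{count} rows"
end
```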
