comsavvy/Punch-scraping-engine

News

This code scrapes the latest Punch News (here) by crawling through the different news URLs.
The end product contains:

  • The URL of the News
  • Title of the news
  • News content

All in one file! This project has three branches:

  1. main: Stores the news in a text file.
  2. CSV: Stores the news in a CSV file.
  3. deployment: Can be deployed on the Scrapinghub platform.
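
The difference between the main and CSV branches comes down to how the scraped items are written to disk. The following is a minimal sketch of both storage styles using only the standard library. It is not taken from the repository: the field names (`url`, `title`, `content`), the sample records, and the function names `save_as_text` and `save_as_csv` are assumptions made for illustration.

```python
import csv

# Hypothetical scraped records; the keys mirror the three end products
# listed above (URL, title, content) but are assumed names.
news_items = [
    {"url": "https://example.com/news/1", "title": "First headline",
     "content": "Body of the first story."},
    {"url": "https://example.com/news/2", "title": "Second headline",
     "content": "Body of the second story."},
]


def save_as_text(items, path):
    """Write each item as a labelled block, one blank line between items."""
    with open(path, "w", encoding="utf-8") as fh:
        for item in items:
            fh.write(f"URL: {item['url']}\n")
            fh.write(f"Title: {item['title']}\n")
            fh.write(f"Content: {item['content']}\n\n")


def save_as_csv(items, path):
    """Write the items as CSV rows under a url,title,content header."""
    with open(path, "w", encoding="utf-8", newline="") as fh:
        writer = csv.DictWriter(fh, fieldnames=["url", "title", "content"])
        writer.writeheader()
        writer.writerows(items)
```

Calling `save_as_text(news_items, "news.txt")` or `save_as_csv(news_items, "news.csv")` writes everything into a single file, matching the "all in one file" goal. The CSV form is easier to load into a spreadsheet or a data-analysis tool, while the text form is easier to read directly.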

Requirement

The scrapy_engine.py module handles the installation of the necessary libraries. Worried that there are too many of them?
Don't be! Only one library is installed: Scrapy.
To install it manually,
copy and paste pip install scrapy into your console.
You can visit the Scrapy documentation if you are curious about how it works.
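
For readers wondering how a module can install its own dependency, here is a minimal sketch of one common approach: check whether the package can be imported, and call pip through the current interpreter only if it cannot. This is an illustration, not the actual contents of scrapy_engine.py; the function names `is_installed`, `pip_install`, and `ensure_installed` are hypothetical.

```python
import importlib.util
import subprocess
import sys


def is_installed(module_name):
    """Return True if a top-level module with this name can be imported."""
    return importlib.util.find_spec(module_name) is not None


def pip_install(package):
    """Run 'pip install <package>' with the interpreter running this script."""
    subprocess.check_call([sys.executable, "-m", "pip", "install", package])


def ensure_installed(module_name, package=None, installer=pip_install):
    """Install the package only when the module is missing.

    Returns True if an installation was triggered, False otherwise.
    The installer argument is injectable so the logic can be tested
    without touching the network.
    """
    if is_installed(module_name):
        return False
    installer(package or module_name)
    return True


if __name__ == "__main__":
    # Assumed usage: make sure Scrapy is available before running a spider.
    ensure_installed("scrapy")
```

Using `sys.executable -m pip` rather than a bare `pip` command helps ensure the package lands in the same environment that runs the script.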