Grand Sumo Tournament Highlights Archive

siznax/honbasho

honbasho

Archive Grand Sumo tournament highlights, as they are removed 😭 before each new tournament.

Update the config

Update basho.json with the latest source URLs and metadata, e.g.:

"201609": {
    "en": "http://www.sumo.or.jp/EnHonbashoTopicsKoTorikumi15/wrap",
    "ja": "http://www.sumo.or.jp/ResultDataKoTorikumi15/wrap",
    "date": "7 Oct 2016",
    "title": "Aki 2016 (September) Grand Sumo Highlights",
    "archive": "honbasho-201609-aki",
    "description": "<b>Aki 2016</b>\n\nTokyo, Ryogoku Kokugikan\n\nSeptember 11, 2016 - September 25, 2016\n\n"
}
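
Before crawling, it can help to confirm that every entry carries the fields shown in the example. The sketch below is not part of this repository: it assumes basho.json is a single JSON object keyed by selector, and it treats all six keys from the example as required, which the scripts themselves may not. The names `REQUIRED_KEYS`, `missing_keys`, and `check_config` are hypothetical.

```python
import json

# Keys copied from the example entry; treating all of them as
# required is an assumption, not something the scripts enforce.
REQUIRED_KEYS = {"en", "ja", "date", "title", "archive", "description"}


def missing_keys(entry):
    """Return the sorted list of expected keys absent from one entry."""
    return sorted(REQUIRED_KEYS - set(entry))


def check_config(text):
    """Parse basho.json text and map each incomplete selector to its missing keys."""
    config = json.loads(text)
    problems = {}
    for selector, entry in config.items():
        missing = missing_keys(entry)
        if missing:
            problems[selector] = missing
    return problems
```

For example, `check_config(open("basho.json").read())` returns an empty dict when every entry is complete.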

Crawl and download

Get highlights metadata:

$ mkdir {dest}
$ crawl.py {selector} > {dest}/data.json

Download movies and text:

$ download.py {dest} {dest}/data.json

Make highlights HTML index:

$ index.py {dest}/data.json {selector} > {dest}/highlights.html
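
The three steps above always run in the same order with the same two placeholders, so they can be generated from one place. This is a minimal sketch, not a script in this repository: `pipeline_commands` is a hypothetical helper that only builds the command strings shown above and does not run them.

```python
import shlex


def pipeline_commands(selector, dest):
    """Return the crawl, download, and index steps as shell command strings."""
    q = shlex.quote  # quote arguments in case a name contains shell metacharacters
    data = f"{dest}/data.json"
    index = f"{dest}/highlights.html"
    return [
        f"mkdir {q(dest)}",
        f"crawl.py {q(selector)} > {q(data)}",
        f"download.py {q(dest)} {q(data)}",
        f"index.py {q(data)} {q(selector)} > {q(index)}",
    ]


# Print the commands for one tournament so they can be reviewed or pasted.
for command in pipeline_commands("201609", "201609"):
    print(command)
```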

Upload to the Internet Archive

  • Add a description for the archive page in basho.json
  • Move crawl HTML out of {dest}/
  • Make sure {selector} and {dest} have the same name (e.g. 201607)
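
The last two items in the checklist above can be verified mechanically before running upload.py. The following is a rough sketch under stated assumptions, not code from this repository: `upload_problems` is a hypothetical helper, and it guesses that "crawl HTML" means any .html file in {dest} other than the generated highlights.html.

```python
from pathlib import Path


def upload_problems(selector, dest):
    """List reasons the {dest} directory does not look ready for upload.py."""
    dest = Path(dest)
    problems = []
    # The README asks for {selector} and {dest} to share a name.
    if dest.name != selector:
        problems.append(
            f"directory name {dest.name!r} does not match selector {selector!r}"
        )
    # Assumption: crawl HTML is every .html file except the generated index.
    leftovers = sorted(
        p.name for p in dest.glob("*.html") if p.name != "highlights.html"
    )
    if leftovers:
        problems.append("crawl HTML still present: " + ", ".join(leftovers))
    return problems
```

An empty list means both checks passed; the description in basho.json still has to be reviewed by hand.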

Review metadata changes to be made:

$ upload.py {selector}

Upload files and modify metadata:

$ upload.py {selector} -u  # upload files
$ upload.py {selector} -m  # modify metadata

Update project pages

Thanks to the Internet Archive for hosting, and to @jjjake for the excellent internetarchive Python library.

@siznax
