augusto-herrmann/ckan-tabular-and-map-counter

This repository shows how to count tabular and map datasets in a CKAN instance using the API, in the form of a Jupyter Notebook. For more information, please see the blog post.

Quick start

Make sure you have either Jupyter Notebook or JupyterLab installed. Install the Jupytext extension: it helps Jupyter Notebooks play nicely with Git's line-based version control, and it is required to open .md (or .py) files as notebooks inside Jupyter. Learn more about Jupytext in this Towards Data Science blog post.

Create a new Python environment and activate it:

$ python -m venv env
$ source env/bin/activate

Inside the environment, install ipykernel and register the new environment as a Jupyter kernel:

$ pip install ipykernel
$ python -m ipykernel install --user --name=ckanapi --display-name="CKAN API"

Now install the requirements for this specific notebook inside the new environment. The notebook needs only ckanapi, a Python wrapper for the CKAN API, and tqdm, which displays a progress bar.

$ pip install -r requirements.txt

Finally, to regenerate the .ipynb file, run:

$ jupytext --sync tabular-and-map-datasets.md 

Remember to add a rule to the .gitignore file to exclude .ipynb files if you want notebooks to keep playing nicely with Git diffs.

Now you can just open the Jupyter Notebook provided and run it. Adjust the parameters for your CKAN instance and enjoy!
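
For orientation, here is a minimal sketch of the kind of counting such a notebook might perform. It is not the notebook's actual code. The format lists, the helper names (`classify`, `count_datasets`, `iter_packages`) and the page size are assumptions made for illustration. The only calls it relies on are ckanapi's `RemoteCKAN` and the CKAN `package_search` action, plus tqdm's progress bar.

```python
from collections import Counter

# Assumed groupings of resource formats; adjust them for your CKAN instance.
TABULAR_FORMATS = {"CSV", "TSV", "XLS", "XLSX", "ODS"}
MAP_FORMATS = {"SHP", "KML", "KMZ", "GEOJSON", "WMS", "WFS"}


def classify(package):
    """Return the categories ('tabular', 'map') that a dataset dict falls into."""
    resources = package.get("resources") or []
    # Normalize formats: missing values become "", the rest are trimmed and upper-cased.
    formats = {(res.get("format") or "").strip().upper() for res in resources}
    categories = set()
    if formats & TABULAR_FORMATS:
        categories.add("tabular")
    if formats & MAP_FORMATS:
        categories.add("map")
    return categories


def count_datasets(packages):
    """Count the total, tabular and map datasets in an iterable of dataset dicts."""
    counts = Counter()
    for package in packages:
        counts["total"] += 1
        counts.update(classify(package))
    return counts


def iter_packages(url, page_size=100):
    """Yield every dataset visible through paginated package_search calls."""
    # Third-party imports stay local so the helpers above work without them.
    from ckanapi import RemoteCKAN
    from tqdm import tqdm

    ckan = RemoteCKAN(url)
    total = ckan.action.package_search(rows=0)["count"]
    with tqdm(total=total) as progress:
        start = 0
        while start < total:
            page = ckan.action.package_search(rows=page_size, start=start)
            results = page["results"]
            if not results:
                break
            yield from results
            progress.update(len(results))
            start += len(results)
```

With the requirements installed, something like `count_datasets(iter_packages("https://ckan.example.org"))` would return the three counts, where the URL is a placeholder for your own instance. Keeping the counting separate from the fetching makes it possible to check the classification rules on a handful of hand-written dataset dicts before querying a large catalog.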