Let's improve the `create-test-data` command to generate a more typical count of users, groups, orgs and datasets. It should also upload generated test data with tens of columns and thousands of rows. Next we should create a `benchmark-test-data` command to exercise the UI and APIs to display, sort and query the generated data.
These commands should have an option to generate a detailed report with the time taken by each creation or query task.
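A minimal sketch of what the generated data and the per-task timing report could look like. Everything here is hypothetical: `generate_csv`, `TimingReport` and the report layout are illustration only, not existing CKAN code, and the real commands would time uploads and API queries rather than in-memory CSV generation.

```python
import csv
import io
import json
import random
import time
from contextlib import contextmanager


def generate_csv(rows=5000, cols=30, seed=0):
    """Return CSV text: one header row plus `rows` rows of `cols` columns.

    Hypothetical helper; even-numbered columns hold integers, odd ones text.
    """
    rng = random.Random(seed)  # seeded so nightly runs see the same data
    buf = io.StringIO()
    writer = csv.writer(buf)
    writer.writerow([f"col_{i}" for i in range(cols)])
    for _ in range(rows):
        writer.writerow(
            [
                rng.randint(0, 10**6) if i % 2 == 0 else f"text-{rng.randint(0, 999)}"
                for i in range(cols)
            ]
        )
    return buf.getvalue()


class TimingReport:
    """Collects one entry per creation or query task (hypothetical layout)."""

    def __init__(self):
        self.entries = []

    @contextmanager
    def task(self, name):
        start = time.perf_counter()
        try:
            yield
        finally:
            # Record the elapsed time even if the task raised.
            self.entries.append(
                {"task": name, "seconds": time.perf_counter() - start}
            )

    def to_json(self):
        return json.dumps(self.entries, indent=2)


report = TimingReport()
with report.task("generate_csv"):
    data = generate_csv(rows=2000, cols=20)
```

Wrapping each task in a context manager keeps the timing code out of the task itself, so the same report object could be reused by both commands.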
In our nightly build job we can collect these reports and add them to a GitHub Pages static site repo, along with the commit id and `pip freeze` output, to track performance for these realistic workloads over time, similar to https://speed.pypy.org/
This automatic reporting will help us identify changes to CKAN's code, dependencies and environment that help or hurt performance.
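As a rough sketch, the nightly job could bundle the timings with the commit id and `pip freeze` output into one JSON record per run. The function names and the record fields below are assumptions for illustration; only `git rev-parse HEAD` and `python -m pip freeze` are real commands.

```python
import json
import subprocess
import sys
from datetime import datetime, timezone


def run(cmd):
    """Run a command and return its stripped stdout; raises on failure."""
    return subprocess.run(
        cmd, capture_output=True, text=True, check=True
    ).stdout.strip()


def collect_environment():
    """Read the current commit id and installed packages (needs git and pip)."""
    commit_id = run(["git", "rev-parse", "HEAD"])
    freeze_output = run([sys.executable, "-m", "pip", "freeze"])
    return commit_id, freeze_output


def build_record(timings, commit_id, freeze_output):
    """Combine timings and environment details into one record (hypothetical layout)."""
    return {
        "commit": commit_id,
        "recorded_at": datetime.now(timezone.utc).isoformat(),
        # Sorted, blank lines dropped, so records diff cleanly between nights.
        "packages": sorted(
            line for line in freeze_output.splitlines() if line.strip()
        ),
        "timings": timings,
    }


# Example with fixed inputs; a real job would call collect_environment().
record = build_record(
    [{"task": "create_datasets", "seconds": 12.5}],
    "abc123",
    "b==1\n\na==2\n",
)
record_json = json.dumps(record, indent=2)
```

Keeping the package list in every record makes it possible to tell a slowdown caused by a dependency upgrade from one caused by a code change.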
Instead of synthetic test data, we should snapshot real-world data from well-known sources, e.g.:
World Bank
UN
NYC's 311 and Taxi Data
Boston's CKAN Organizations
Canada's Open Data Portal
non-English content from other CKAN sites (Saudi Arabia, Singapore, Japan, Africa, Argentina, Finland, etc.)
The sample data snapshot should be curated so that it exercises CKAN subsystems (e.g. different data types, date formats, UTF-8 encoding, languages, etc.)