You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
This is a topic to gather some context and requirements of our initialization workflow (a.k.a make_app(...)) to refactor it and make it more efficient and maintainable.
Context
I always find CKAN's initialization workflow a little bit odd and difficult to follow. It also adds a lot of noise to the traceback when trying to interactively debug the application. Recently, I have been trying to debug and improve our Conf Declaration workflow and I ended up entangled in how our config object (and initialization) is executed.
Our current worflow:
Main application building method:
defmake_app(conf: Union[Config, CKANConfig]) ->CKANApp:
''' Initialise the Flask app and wrap it in dispatcher middleware. '''load_environment(conf)
flask_app=make_flask_stack(conf)
# Set this internal test request context with the configured environment so# it can be used when calling url_for from testsglobal_internal_test_request_context_internal_test_request_context=flask_app._wsgi_app.test_request_context()
returnflask_app
Inside load_environment
What does it mean environment in our current stack? Is it the same than config?
defload_environment(conf: Union[Config, CKANConfig]):
""" Configure the Pylons environment via the ``pylons.config`` object. This code should only need to be run once. """os.environ['CKAN_CONFIG'] =cast(str, conf['__file__'])
valid_base_public_folder_names= ['public', 'public-bs3']
static_files=conf.get('ckan.base_public_folder', 'public')
conf['ckan.base_public_folder'] =static_filesifstatic_filesnotinvalid_base_public_folder_names:
raiseCkanConfigurationException(
'You provided an invalid value for ckan.base_public_folder. ''Possible values are: "public" and "public-bs3".'
)
log.info('Loading static files from %s'%static_files)
# Initialize main CKAN config objectconfig.update(conf)
# Setup the SQLAlchemy database engine# Suppress a couple of sqlalchemy warningsmsgs= ['^Unicode type received non-unicode bind param value',
"^Did not recognize type 'BIGINT' of column 'size'",
"^Did not recognize type 'tsvector' of column 'search_vector'"
]
formsginmsgs:
warnings.filterwarnings('ignore', msg, sqlalchemy.exc.SAWarning)
# load all CKAN pluginsp.load_all()
# Check Redis availabilityifnotis_redis_available():
log.critical('Could not connect to Redis.')
app_globals.reset()
# Build JavaScript translations. Must be done after plugins have# been loaded.build_js_translations()
Inside load_all()
defload_all() ->None:
''' Load all plugins listed in the 'ckan.plugins' config directive. '''# Clear any loaded pluginsunload_all()
plugins=config.get('ckan.plugins') +find_system_plugins()
load(*plugins)
Summarized workflow
It goes something like this when adding all the stuff happening in plugins.load_all():
Our SQLAlchemy engine initialization is hidden inside update_config() method when it should be at the top level of our make app stack. (It also happens when trying to unload things that has never been loaded!) Currently, we are initializating the sqlalchemy engine each time we update plugins and I don't think this is necessary nor should be allowed. The CKAN database is a property and responsibility of CKAN, not the plugins. (Can a plugin override our main database? Is it safe? Possible? Viable?)
We are creating all our webassets library on each plugin update. This means that core base, css, vendor libraries are being created multiple times. It seems to me that each plugin should be responsible for creating/deleting it's own webassets.
site_user is also being fetched/created each time we modify plugins. (And since we are doing it using the logic layer this injects a lot of dependencies upon initialization.)
Executing pytest ckan/tests will execute all this initialization logic 4 times. (Even before starting tests)
We are initializating our CKAN instance with the assumption that might be plugins loaded. Why? I assume the reason is test but again, no run method should start with stop, that's a reset. We should have two explicit calls to load or reset depending on the scenario.
Our config object is being setup for the first time when trying to unload plugins. This is clearly a counter-intuitive logic since it shouldn't exist plugins loaded if our configuration object has not been setup.
Suggestions and Questions
We should have a more explicit separation between "loading our configuration" and "doing things with our configuration". Nowadays it is all mixed up, doing things as we load. This logically causes a lot of unnecessary re-doing when re-loading.
SQLAlchemy engine initialization should be out of the configuration handling logic. Same with the site user. I want to move this to the natural place of DB initialization in every flask application. Our make_flask_stack(...) method.
Why are we initializing all of our webassets library on each plugin load? It makes sense for plugins to register theirs, but why all of tem?
environment.update_config() is a mix of config updates and infrastructure initialization.
Given all this point, it is clear that we do not have any clear separation between configuration handling and infrastructure initialization.
reacted with thumbs up emoji reacted with thumbs down emoji reacted with laugh emoji reacted with hooray emoji reacted with confused emoji reacted with heart emoji reacted with rocket emoji reacted with eyes emoji
-
Hello!
This is a topic to gather some context and requirements of our initialization workflow (a.k.a
make_app(...)
) to refactor it and make it more efficient and maintainable.Context
I always find CKAN's initialization workflow a little bit odd and difficult to follow. It also adds a lot of noise to the traceback when trying to interactively debug the application. Recently, I have been trying to debug and improve our Conf Declaration workflow and I ended up entangled in how our config object (and initialization) is executed.
Our current worflow:
Main application building method:
Inside
load_environment
What does it mean
environment
in our current stack? Is it the same than config?Inside
load_all()
Summarized workflow
It goes something like this when adding all the stuff happening in
plugins.load_all()
:Current "problems" and pitfails
update_config()
method when it should be at the top level of our make app stack. (It also happens when trying to unload things that has never been loaded!) Currently, we are initializating the sqlalchemy engine each time we update plugins and I don't think this is necessary nor should be allowed. The CKAN database is a property and responsibility of CKAN, not the plugins. (Can a plugin override our main database? Is it safe? Possible? Viable?)base
,css
,vendor
libraries are being created multiple times. It seems to me that each plugin should be responsible for creating/deleting it's own webassets.site_user
is also being fetched/created each time we modify plugins. (And since we are doing it using the logic layer this injects a lot of dependencies upon initialization.)pytest ckan/tests
will execute all this initialization logic 4 times. (Even before starting tests)run
method should start withstop
, that's areset
. We should have two explicit calls toload
orreset
depending on the scenario.Suggestions and Questions
We should have a more explicit separation between "loading our configuration" and "doing things with our configuration". Nowadays it is all mixed up, doing things as we load. This logically causes a lot of unnecessary re-doing when re-loading.
make_flask_stack(...)
method.webassets
library on each plugin load? It makes sense for plugins to register theirs, but why all of tem?environment.update_config()
is a mix of config updates and infrastructure initialization.Given all this point, it is clear that we do not have any clear separation between configuration handling and infrastructure initialization.
Beta Was this translation helpful? Give feedback.
All reactions