Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Migration: Drop the hashes for all nodes #6350

Closed
wants to merge 1 commit into from

Conversation

sphuber
Copy link
Contributor

@sphuber sphuber commented Apr 16, 2024

The caching implementation was altered significantly recently. Most notably the following changes were committed:

  • Remove core and plugin information from hash calculation [4c60bbe]
  • NodeCaching._get_objects_to_hash return type to dict [c9c7c4b]
  • Include the node's class in objects to hash [68ce111]

This means that all existing hashes are now incompatible. A migration is added to remove all hashes such that they can be recomputed.

@sphuber sphuber added the priority/critical-blocking must be resolved before next release label Apr 16, 2024
The caching implementation was altered significantly recently. Most
notably the following changes were committed:

* Remove core and plugin information from hash calculation
  [4c60bbe]
* `NodeCaching._get_objects_to_hash` return type to `dict`
  [c9c7c4b]
* Include the node's class in objects to hash
  [68ce111]

This means that all existing hashes are now incompatible. A migration is
added to remove all hashes such that they can be recomputed.
@mbercx
Copy link
Member

mbercx commented May 8, 2024

One note while testing this. verdi status seems to still refer to main_0001:

❯ verdi status 
 ✔ version:     AiiDA v2.5.1.post0
 ✔ config:      /Users/mbercx/project/core/.aiida
 ✔ profile:     presto
 ✘ storage:     Storage schema version is incompatible with the code version 'main_0001'. Run `verdi storage migrate` to solve this.
Warning: RabbitMQ v3.13.1 is not supported and will cause unexpected problems!
Warning: It can cause long-running workflows to crash and jobs to be submitted multiple times.
Warning: See https://github.com/aiidateam/aiida-core/wiki/RabbitMQ-version-to-use for details.
 ✔ broker:      RabbitMQ v3.13.1 @ amqp://guest:guest@127.0.0.1:5672?heartbeat=600
 ⏺ daemon:      The daemon is not running.

Even though I should be at version main_0002. This is also what verdi storage info reports:

❯ verdi storage info
Critical: 
Database schema version `main_0002` is incompatible with the required schema version `main_0003`.
To migrate the database schema version to the current one, run the following command:

    verdi -p presto storage migrate

EDIT: Testing a psql_dos storage, I now read the line properly: "Storage schema version is incompatible with the code version 'main_0001'." So the sqlite_dos storage version was at main_0002 but the "code version" should be main_0001, maybe because there is no update for SQLite? For a PostgeSQL database the line reports "code version main_0003".

@mbercx
Copy link
Member

mbercx commented May 8, 2024

A second comment was related to the fact that I noticed migrations were still stored in the psql_dos directory. This made me wonder: can we migrate sqlite_dos storages? I tried and ran into the following error:

Traceback (most recent call last):
  File "/Users/mbercx/.aiida_venvs/core/lib/python3.10/site-packages/sqlalchemy/engine/base.py", line 1971, in _exec_single_context
    self.dialect.do_execute(
  File "/Users/mbercx/.aiida_venvs/core/lib/python3.10/site-packages/sqlalchemy/engine/default.py", line 919, in do_execute
    cursor.execute(statement, parameters)
sqlite3.OperationalError: unrecognized token: "#"

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/Users/mbercx/.aiida_venvs/core/bin/verdi", line 8, in <module>
    sys.exit(verdi())
  File "/Users/mbercx/.aiida_venvs/core/lib/python3.10/site-packages/click/core.py", line 1157, in __call__
    return self.main(*args, **kwargs)
  File "/Users/mbercx/.aiida_venvs/core/lib/python3.10/site-packages/click/core.py", line 1078, in main
    rv = self.invoke(ctx)
  File "/Users/mbercx/.aiida_venvs/core/lib/python3.10/site-packages/click/core.py", line 1688, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/Users/mbercx/.aiida_venvs/core/lib/python3.10/site-packages/click/core.py", line 1688, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/Users/mbercx/.aiida_venvs/core/lib/python3.10/site-packages/click/core.py", line 1434, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/Users/mbercx/.aiida_venvs/core/lib/python3.10/site-packages/click/core.py", line 783, in invoke
    return __callback(*args, **kwargs)
  File "/Users/mbercx/project/core/git/aiida-core/src/aiida/cmdline/commands/cmd_storage.py", line 78, in storage_migrate
    storage_cls.migrate(profile)
  File "/Users/mbercx/project/core/git/aiida-core/src/aiida/storage/psql_dos/backend.py", line 122, in migrate
    migrator.migrate()
  File "/Users/mbercx/project/core/git/aiida-core/src/aiida/storage/psql_dos/migrator.py", line 390, in migrate
    self.migrate_up('main@head')
  File "/Users/mbercx/project/core/git/aiida-core/src/aiida/storage/psql_dos/migrator.py", line 399, in migrate_up
    upgrade(config, version)
  File "/Users/mbercx/.aiida_venvs/core/lib/python3.10/site-packages/alembic/command.py", line 403, in upgrade
    script.run_env()
  File "/Users/mbercx/.aiida_venvs/core/lib/python3.10/site-packages/alembic/script/base.py", line 583, in run_env
    util.load_python_file(self.dir, "env.py")
  File "/Users/mbercx/.aiida_venvs/core/lib/python3.10/site-packages/alembic/util/pyfiles.py", line 95, in load_python_file
    module = load_module_py(module_id, path)
  File "/Users/mbercx/.aiida_venvs/core/lib/python3.10/site-packages/alembic/util/pyfiles.py", line 113, in load_module_py
    spec.loader.exec_module(module)  # type: ignore
  File "<frozen importlib._bootstrap_external>", line 883, in exec_module
  File "<frozen importlib._bootstrap>", line 241, in _call_with_frames_removed
  File "/Users/mbercx/project/core/git/aiida-core/src/aiida/storage/psql_dos/migrations/env.py", line 51, in <module>
    run_migrations_online()
  File "/Users/mbercx/project/core/git/aiida-core/src/aiida/storage/psql_dos/migrations/env.py", line 44, in run_migrations_online
    context.run_migrations()
  File "<string>", line 8, in run_migrations
  File "/Users/mbercx/.aiida_venvs/core/lib/python3.10/site-packages/alembic/runtime/environment.py", line 948, in run_migrations
    self.get_context().run_migrations(**kw)
  File "/Users/mbercx/.aiida_venvs/core/lib/python3.10/site-packages/alembic/runtime/migration.py", line 627, in run_migrations
    step.migration_fn(**kw)
  File "/Users/mbercx/project/core/git/aiida-core/src/aiida/storage/psql_dos/migrations/versions/main_0003_recompute_hash_all_nodes.py", line 39, in upgrade
    drop_hashes(op.get_bind(), hash_extra_key='_aiida_hash')
  File "/Users/mbercx/project/core/git/aiida-core/src/aiida/storage/psql_dos/migrations/utils/integrity.py", line 232, in drop_hashes
    conn.execute(statement_update)
  File "/Users/mbercx/.aiida_venvs/core/lib/python3.10/site-packages/sqlalchemy/engine/base.py", line 1422, in execute
    return meth(
  File "/Users/mbercx/.aiida_venvs/core/lib/python3.10/site-packages/sqlalchemy/sql/elements.py", line 514, in _execute_on_connection
    return connection._execute_clauseelement(
  File "/Users/mbercx/.aiida_venvs/core/lib/python3.10/site-packages/sqlalchemy/engine/base.py", line 1644, in _execute_clauseelement
    ret = self._execute_context(
  File "/Users/mbercx/.aiida_venvs/core/lib/python3.10/site-packages/sqlalchemy/engine/base.py", line 1850, in _execute_context
    return self._exec_single_context(
  File "/Users/mbercx/.aiida_venvs/core/lib/python3.10/site-packages/sqlalchemy/engine/base.py", line 1990, in _exec_single_context
    self._handle_dbapi_exception(
  File "/Users/mbercx/.aiida_venvs/core/lib/python3.10/site-packages/sqlalchemy/engine/base.py", line 2357, in _handle_dbapi_exception
    raise sqlalchemy_exception.with_traceback(exc_info[2]) from e
  File "/Users/mbercx/.aiida_venvs/core/lib/python3.10/site-packages/sqlalchemy/engine/base.py", line 1971, in _exec_single_context
    self.dialect.do_execute(
  File "/Users/mbercx/.aiida_venvs/core/lib/python3.10/site-packages/sqlalchemy/engine/default.py", line 919, in do_execute
    cursor.execute(statement, parameters)
sqlalchemy.exc.OperationalError: (sqlite3.OperationalError) unrecognized token: "#"
[SQL: UPDATE db_dbnode SET extras = extras #- '{_aiida_hash}'::text[];]
(Background on this error at: https://sqlalche.me/e/20/e3q8)

EDIT: Indeed, for a psql_dos storage, the migration runs smoothly.

@sphuber
Copy link
Contributor Author

sphuber commented May 8, 2024

A second comment was related to the fact that I noticed migrations were still stored in the psql_dos directory. This made me wonder: can we migrate sqlite_dos storages?

Uhm, nope, that is a eetsie-peetsie-tiny detail that we forgot about when allowing new storage backends. I am not sure how much it would be to generalize the migration infrastructure and if it is even possible. At the least we should probably look at it for the plugins that we ship with aiida-core. Unfortunately since core.sqlite_dos was already released, if we want to be really complete, we will need to already implement the migration for the sqlite based backends now... Ugh, this release will never get done...

@danielhollas
Copy link
Collaborator

If it turns out to be too difficult to fix quickly, we could theoretically postpone this migration. Is it really needed to drop the hashes? All that will happen is that users will get cache misses (barring the exceedingly rare hash collisions). But that was the case anyway before this release since the hashes were dependent on AiiDA version. And users can still initiate rehashing, even if the hashes were not dropped, right?

@sphuber
Copy link
Contributor Author

sphuber commented May 31, 2024

The problem of the migrations not working for sqlite is addressed in #6429 Nevertheless, I think we have decided not to add a migration for the dropping of the hashes this time. We will simply instruct users in the release notes to run verdi rehash. But if we will ever need to add a migration in the future, we can add separate ones for PSQL and SQLite and be sure they work.

@sphuber sphuber closed this May 31, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
priority/critical-blocking must be resolved before next release
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants