Ensure init Method call db.load to Restore leaf Instances #2065

Open
jieguangzhou opened this issue May 17, 2024 · 3 comments

Comments

@jieguangzhou
Collaborator

After refactoring the code, we noticed that many encodables are not properly transformed during db.load, so component.init() has to be called to restore the data to its correct state.

This affects the normal use of instance methods and properties. For example, with an uninitialized encodable instance, we had to write component.prop.x to reach the underlying value x instead of just component.prop.
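To make the symptom concrete, here is a minimal sketch using toy stand-ins. `Encodable` and `ToyComponent` below are hypothetical classes written for illustration only; they are not the superduperdb classes, and the real wrapper may behave differently.

```python
import dataclasses as dc
import typing as t


@dc.dataclass
class Encodable:
    """Hypothetical wrapper that still holds the raw value in `x`."""

    x: t.Any


@dc.dataclass
class ToyComponent:
    """Hypothetical component; not the superduperdb Component class."""

    prop: t.Any

    def init(self) -> None:
        # Replace the wrapper with the value it carries.
        if isinstance(self.prop, Encodable):
            self.prop = self.prop.x


component = ToyComponent(prop=Encodable(x=[1, 2, 3]))

# Before init(): the value is only reachable through the wrapper.
before = component.prop.x

component.init()

# After init(): the attribute is the value itself.
after = component.prop
```

Any code that expects `component.prop` to be the plain value would fail on the uninitialized instance, which is the confusion described above.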

To address this, we need to ensure that the init method is called after db.load or import_item, restoring all data except for those supporting LAZY loading.
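One way this could look is sketched below, under stated assumptions. `Wrapped`, `LazyWrapped`, `ToyComponent`, the `include_lazy` flag, and the `load` function are all hypothetical names for illustration; the sketch does not reflect how db.load or import_item are actually implemented.

```python
import copy
import dataclasses as dc
import typing as t


@dc.dataclass
class Wrapped:
    """Hypothetical eager wrapper around a stored value."""

    x: t.Any


@dc.dataclass
class LazyWrapped(Wrapped):
    """Hypothetical lazy wrapper; stays wrapped until explicitly requested."""


@dc.dataclass
class ToyComponent:
    identifier: str
    data: t.Any
    big_data: t.Any

    def init(self, include_lazy: bool = False) -> None:
        # Unwrap every field, leaving lazy wrappers alone unless asked.
        for field in dc.fields(self):
            value = getattr(self, field.name)
            if isinstance(value, LazyWrapped) and not include_lazy:
                continue
            if isinstance(value, Wrapped):
                setattr(self, field.name, value.x)


def load(store: t.Dict[str, ToyComponent], identifier: str) -> ToyComponent:
    """Return a copy of the stored component with init() already applied."""
    component = copy.copy(store[identifier])
    component.init()
    return component


store = {
    "test": ToyComponent(
        "test", data=Wrapped([1, 2]), big_data=LazyWrapped([3, 4])
    )
}
reloaded = load(store, "test")
```

In this sketch the caller never sees an eager wrapper, while `reloaded.big_data` stays wrapped until `reloaded.init(include_lazy=True)` is called.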

jieguangzhou changed the title from "Ensure init Method Invocation Post db.load to Restore leaf Instances" to "Ensure init Method call db.load to Restore leaf Instances" on May 17, 2024
@jieguangzhou
Collaborator Author

jieguangzhou commented May 17, 2024

related to: #2064

@jieguangzhou
Collaborator Author

jieguangzhou commented May 17, 2024

import dataclasses as dc
import typing as t

import pandas as pd

from superduperdb import superduper
from superduperdb.components.component import Component
from superduperdb.components.datatype import (
    pickle_serializer,
)


@dc.dataclass(kw_only=True)
class SpecialComponent(Component):
    type_id: t.ClassVar[str] = "special"
    my_data: pd.DataFrame
    _artifacts: t.ClassVar = (("my_data", pickle_serializer),)


df = pd.DataFrame(
    [{"a": 1, "b": 2}, {"a": 3, "b": 4}, {"a": 5, "b": 6}, {"a": 7, "b": 8}]
)

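# Connect to an in-memory MongoDB mock and store the component.
db = superduper("mongomock://test")
c = SpecialComponent("test", my_data=df)

db.add(c)

# Load the component back by type_id and identifier.
reloaded = db.load("special", c.identifier)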

c.my_data is a DataFrame.
reloaded.my_data is an Artifact.

We need to call reloaded.init() to get the actual my_data.

This is very confusing for users.

We can run this initialization automatically, except for data that uses lazy loading.

@blythed
Collaborator

blythed commented May 17, 2024

I think this sounds like a bug, not intended behaviour. It's indeed correct that everything should be unwrapped after being passed to the Component, except for Lazy*. How can we enforce this? The idea of init was for lazy loading.
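One possible way to enforce this is a check, for example in a test, that no eager wrapper survives loading. The sketch below is only an illustration: `Wrapped`, `LazyWrapped`, `ToyComponent`, and `find_unrestored_fields` are hypothetical names, not part of superduperdb.

```python
import dataclasses as dc
import typing as t


@dc.dataclass
class Wrapped:
    """Hypothetical eager wrapper that should be gone after loading."""

    x: t.Any


@dc.dataclass
class LazyWrapped(Wrapped):
    """Hypothetical lazy wrapper that is allowed to survive loading."""


def find_unrestored_fields(component: t.Any) -> t.List[str]:
    """Return names of dataclass fields that still hold an eager wrapper."""
    return [
        field.name
        for field in dc.fields(component)
        if isinstance(getattr(component, field.name), Wrapped)
        and not isinstance(getattr(component, field.name), LazyWrapped)
    ]


@dc.dataclass
class ToyComponent:
    identifier: str
    my_data: t.Any
    lazy_data: t.Any


restored = ToyComponent("ok", my_data=[1, 2], lazy_data=LazyWrapped([3, 4]))
unrestored = ToyComponent(
    "bad", my_data=Wrapped([1, 2]), lazy_data=LazyWrapped([3, 4])
)

restored_report = find_unrestored_fields(restored)
unrestored_report = find_unrestored_fields(unrestored)
```

A test could assert that the report is empty for every component returned by the loader, so a regression like the one reported here would fail fast.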
