Fix return type specification of asset loader helper functions in `load_assets_from_modules` #21808

emirkmo · 2024-05-13T14:55:09Z

Summary & Motivation

Basically Functions/Classes should take in generics like Sequence, but they should not return said generic if they actually return a concrete implementation. In this case, all of these functions actually return a list. So we shouldn't mistype.

How I Tested These Changes

Ran the docs build for latest. Ran the dev test suite from where it is actually passing on master (checked out the latest green commit). My changes are minor. No failures found before or after.

Concrete functions should specify the actual type of what they return. In this case, these functions all returns lists, NOT a generic Sequence. In general, functions and classes in a library should be as generic as possible in what they accept as input but be as specific as possible in what they return as output. These returns lists. If I try to append or manipulate the returns, they have the incorrect type of Sequence. A sequence does not have .append for instance. It cannot be concatenated with a list of AssetDefinitions, etc. If using strict type checking, which we do, I cannot even easily combine the output of these functions with assets imported "manually". I would get the error list[AssetsDefinition] is incompatiable with Sequence[...]. It is quite annoying to have to `typing.cast` the output of library code from Dagster into its actual type just to get my own type checking to work. It is made harder by the fact that CacheableAssetsDefintion is not exported by default, so I have to import it, and cast the output to `list[AssetsDefinition | SourceAsset | Cacheable...]]`. This fixes the return of these functions to the actual type, meaning that they are still `Sequence[Union[A, B, C]`, and will naturally pass the type check for any functions that works on a generic Sequence[Union[A, B ,C]`, but they are now also more concretely `List[Union[A, B, C]` so one can actually use the returned objects in their own user code. Change-Id: I13a06494c69c622dd65b4349d9538f94f1b3fc1a

Return types should be specific, not generic, (unless you take in a generic, work on it, and then return it). But these functions do not do that. They always return a list, even if they take in a Sequnce. Change-Id: I374139e50de4bfeaf908fb18699fd81fadb14475

sryza

We generally return Sequence in these cases because it gives us future flexibility to do things like caching the results. E.g. with the current typing, we could later decorate one of these functions with functools.lru_cache. With List, we wouldn't be able to do this, because we couldn't expect that the caller wouldn't modify the returned collection.

emirkmo · 2024-05-17T17:10:39Z

But Sequence is wrong. It is returning a list. I fail to see how wrongly typing the output effects runtime caching (unimplemented). The returned list is not hashable so your cache wouldn't work correctly..

As far as I can tell, you don't want to LRU cache these functions, at least in user code? Their point is to, given the same input, pull in all assets even if there are new ones defined in those locations. LRU caching is for functions that return the same thing for the same input, which these functions do not..

If you do want to cache, then you should return an object type that can be cached. Such as a tuple. (See the second answer for details as its possible to add wrappers around list returning functions to cache them by converting them to something hashable of course)

If nothing, then please do not actively mistype the return. You could leave it untyped which then at least allows users to set some settings to ignore the untyped third party type or have the type checker/LSP figure out the correct type. Right now the type is wrong and has to be manually cast to correct type to actually use the output. Please see the linked issue for the problems this causes.

sryza · 2024-05-17T17:26:01Z

If nothing, then please do not actively mistype the return

@emirkmo - all lists are sequences, so the type is accurate.

emirkmo added 2 commits May 13, 2024 16:21

gibsondan requested a review from sryza May 15, 2024 18:46

sryza reviewed May 15, 2024

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fix return type specification of asset loader helper functions in `load_assets_from_modules` #21808

Fix return type specification of asset loader helper functions in `load_assets_from_modules` #21808

emirkmo commented May 13, 2024

sryza left a comment

emirkmo commented May 17, 2024 •

edited

sryza commented May 17, 2024

Fix return type specification of asset loader helper functions in load_assets_from_modules #21808

Are you sure you want to change the base?

Fix return type specification of asset loader helper functions in load_assets_from_modules #21808

Conversation

emirkmo commented May 13, 2024

Summary & Motivation

How I Tested These Changes

sryza left a comment

Choose a reason for hiding this comment

emirkmo commented May 17, 2024 • edited

sryza commented May 17, 2024

Fix return type specification of asset loader helper functions in `load_assets_from_modules` #21808

Fix return type specification of asset loader helper functions in `load_assets_from_modules` #21808

emirkmo commented May 17, 2024 •

edited