Samplebase

Samplebase is a database that is:

in-process/server-less
file-local: all data lives in a user-defined path and can be moved around
document-based/ad-hoc: no need to define a model beforehand
thread-safe: multiple readers, one writer

The core functionality is provided by the Sample(Document) object. On top of that, there are a few utility methods that help with processing Samples in parallel.

Motivation

Suppose you have a task that you can easily solve for given input arguments:

result = solve(**args)

A Sample is a pair of such args and the corresponding result. Imagine you now have thousands of different args that you want to sample. Samplebase lets you separate three steps: the creation of Samples, the execution of the solve operation, and the analysis of the results.
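The separation described above can be illustrated with a purely in-memory sketch. This is not how Samplebase is implemented: `SketchSample` and `run_pending` are hypothetical names used only for illustration, and nothing here touches the disk or runs in parallel.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Optional


@dataclass
class SketchSample:
    """Hypothetical stand-in for a Sample: input args plus an optional result."""
    args: Dict[str, Any]
    result: Optional[Dict[str, Any]] = None  # stays None until the task has run


def solve(x=None, y=None):
    # stand-in for a lengthy calculation
    return {"product": x * y}


def run_pending(samples: List[SketchSample], func: Callable[..., Dict[str, Any]]) -> None:
    """Run func on every sample that does not have a result yet."""
    for sample in samples:
        if sample.result is None:
            sample.result = func(**sample.args)


# 1. Creation: only the arguments are recorded.
samples = [SketchSample(args={"x": 2, "y": 5}), SketchSample(args={"x": 3, "y": 7})]

# 2. Execution: can happen later, independently of creation.
run_pending(samples, solve)

# 3. Analysis: read the results back.
products = [s.result["product"] for s in samples]
print(products)  # [10, 21]
```

Because a sample without a result is simply skipped by the analysis step or picked up by a later execution step, the three phases do not need to know about each other.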

Example

First, define the task to be executed.

import samplebase

def solve(x=None, y=None):
  # a lengthy calculation
  return {"product": x * y}

Span the space of arguments that you want to sample. Here, we have two samples.

data_dir = "/my/data/dir"
samplebase.create_sample(data_dir, args={"x": 2, "y": "barbara"})
samplebase.create_sample(data_dir, args={"x": 3, "y": "og"})
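
For larger argument spaces, writing one create_sample call per sample by hand becomes tedious. A minimal sketch of building a full grid of argument dicts with the standard library follows; `arg_grid` is a hypothetical helper and not part of Samplebase.

```python
import itertools


def arg_grid(**axes):
    """Return one args dict per combination of the given axis values."""
    keys = list(axes)
    return [dict(zip(keys, values)) for values in itertools.product(*axes.values())]


grid = arg_grid(x=[2, 3], y=["barbara", "og"])
print(len(grid))  # 4
print(grid[0])    # {'x': 2, 'y': 'barbara'}

# Each dict could then be passed to create_sample as in the example above:
# for args in grid:
#     samplebase.create_sample(data_dir, args=args)
```

Unlike the two hand-written samples above, this grid contains every combination of the listed values, here four samples.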

Map the function solve on the samples, which are identified via their location on disk and their auto-generated names.

names = samplebase.names_of_samples(data_dir)
samplebase.run_parallel(func=solve, prefix=data_dir, sample_names=names)

Look at results.

samples = samplebase.list_of_samples(data_dir)
for s in samples:
  print(s.result["product"])
# prints:
# barbarabarbara
# ogogog

This last part can safely be executed in another interpreter or notebook, even while samples are still being processed.

Why another database?

Mostly because of parallel access with a server-less architecture. The motivation was to be able to look at results while some samples are still being processed.

If this does not convince you, consider tinydb.

chrisfroe/samplebase

Folders and files

Latest commit

History

Repository files navigation

Samplebase

Motivation

Example

Why another database?

About

Topics

Resources

License

Stars

Watchers

Forks

Languages