Welcome to Aqueduct

Aqueduct is open-source prediction infrastructure built for data scientists, by data scientists. With Aqueduct, data scientists can instantaneously deploy machine learning models to the cloud, connect those models to data and business systems, and gain visibility into the performance of their prediction pipelines -- all from the comfort of a Python notebook.

For more on why we're building prediction infrastructure for data scientists, see the-aqueduct-philosophy.md.

The core abstraction in Aqueduct is a Workflow, which is a sequence of Artifacts (data) that are transformed by Operators (compute). The input Artifacts for a Workflow are typically loaded from a database, and the output Artifacts are typically persisted back to a database. Each Workflow can either be run on a fixed schedule or triggered on demand.
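
To make the Workflow abstraction concrete, here is a rough toy model in plain Python. It is not the Aqueduct API: the names `Artifact`, `Operator`, `run_workflow`, and `add_strlen`, along with the sample rows, are hypothetical and exist only for illustration. The sketch assumes a purely linear workflow in which each operator consumes the artifact produced by the previous step.

```python
from dataclasses import dataclass
from typing import Any, Callable, List


@dataclass(frozen=True)
class Artifact:
    """A named piece of data flowing through a workflow (hypothetical model)."""
    name: str
    data: Any


@dataclass(frozen=True)
class Operator:
    """A named compute step that turns one artifact into another."""
    name: str
    fn: Callable[[Any], Any]

    def __call__(self, artifact: Artifact) -> Artifact:
        # Running an operator yields a new artifact; the input is left untouched.
        return Artifact(name=f"{self.name}_output", data=self.fn(artifact.data))


def run_workflow(source: Artifact, operators: List[Operator]) -> List[Artifact]:
    """Apply operators in order and return every artifact, source first."""
    artifacts = [source]
    for operator in operators:
        artifacts.append(operator(artifacts[-1]))
    return artifacts


def add_strlen(rows):
    # Build new rows with an extra "strlen" field holding the review length.
    return [{**row, "strlen": len(row["review"])} for row in rows]


# Made-up sample data standing in for rows loaded from a database.
reviews = Artifact("hotel_reviews", [{"review": "Great stay"}, {"review": "Noisy"}])
artifacts = run_workflow(reviews, [Operator("add_strlen", add_strlen)])

print([a.name for a in artifacts])
print(artifacts[-1].data)
```

In this toy model the last artifact in the list plays the role of the workflow's output, the piece that would typically be persisted back to a database.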

The 12-line code snippet below is all you need to create your first Aqueduct workflow:

from aqueduct import Client, op

# Create an Aqueduct client. If we're running on the same machine as the 
# Aqueduct server, we can create a client without providing an API key or a
# server address.
client = Client()

# The @op decorator here allows Aqueduct to run this function as 
# a part of an Aqueduct workflow. It tells Aqueduct that when 
# we execute this function, we're defining a step in the workflow.
@op
def transform_data(reviews):
    '''
    This simple Python function takes in a DataFrame with hotel reviews
    and adds a column called strlen that has the string length of the
    review.    
    '''
    reviews['strlen'] = reviews['review'].str.len()
    return reviews

# With client.resource, we can load a connection to a database.
# Here, we use the Aqueduct demo DB.
demo_db = client.resource("aqueduct_demo")
reviews_table = demo_db.sql("select * from hotel_reviews;")

# Calling .get() retrieves the underlying data from the TableArtifact and
# returns it as a Python object.
print(reviews_table.get())

# Calling a decorated function returns another Aqueduct artifact.
strlen_table = transform_data(reviews_table)

# Artifacts can be saved -- here, we save the table with the appended strlen
# back to the Aqueduct demo DB with the table name `strlen_table`.
demo_db.save(strlen_table, table_name="strlen_table", update_mode="replace")

# This publishes the logic needed to create the strlen_table
# to Aqueduct. You will receive a URL below that will take you to the
# Aqueduct UI, which will show you the status of your workflow
# runs and allow you to inspect them.
client.publish_flow(name="review_strlen", artifacts=[strlen_table])

For more on this pipeline, see our Quickstart Guide.

Core Concepts

Tutorials

Examples

MPG Regressor [Linear Regression]
Wine Ratings Predictor [Decision Tree]
Diabetes Classifier [K-Nearest Neighbors]
Sentiment Analysis [Deep Learning]
House Price Predictor [Ensemble Model]
