GN proxy & access control

Dependencies and starting the proxy

The only dependencies are “redis-rkt” and “threading”, and they can be installed using raco:

git clone git@github.com:chfi/gn-proxy.git
cd gn-proxy
raco pkg install

The REST server can then be started by running server/rest.rkt, while providing the SQL username and password as environment variables:

env SQL_USER=username SQL_PASSWORD=password racket server/rest.rkt

By default the server listens on 127.0.0.1, port 8080. The port can be changed with the PORT environment variable.

The Redis and MariaDB connections are handled in server/db.rkt, and can be configured by editing connect-redis and connect-sql. See the documentation for the respective packages for more information, if needed.

Resources

Resources can be created in Racket using the constructors (e.g. new-geno-resource) and serialized to a JSON string with serialize-resource. The resulting string can then be inserted into Redis.

This is an example of the JSON representation of a resource. The only fields that will differ between resource types are “type”, “data”, and the mask fields. The “type” must be one of the keys in the resource-type hash defined in resource.rkt, and the “data” and mask fields depend on the resource type.

{ "name": "r1",
  "owner_id": "7733c380-b83f-45de-a8b5-17e1bc3738a9",
  "data": { "path": "test1.txt",
            "metadata": "test1" },
  "type": "dataset-file",
  "default_mask" : { "metadata": "no-access",
                     "data": "no-access"},
  "group_masks": {"0": {"metadata": "edit",
                        "data": "edit"}
                 }
}
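For clients that do not use the Racket constructors, the same JSON can be assembled by hand. The following is a minimal Python sketch, not part of gn-proxy: make_resource and KNOWN_TYPES are hypothetical names, and the set of types is simply the three listed in this README (the authoritative list is the hash in resource.rkt).

```python
import json

# Type names taken from this README; resource.rkt holds the authoritative list.
KNOWN_TYPES = {"dataset-file", "dataset-geno", "dataset-publish"}


def make_resource(name, owner_id, rtype, data, default_mask, group_masks):
    """Build a dict with the same top-level fields as the JSON example above."""
    if rtype not in KNOWN_TYPES:
        raise ValueError(f"unknown resource type: {rtype}")
    return {
        "name": name,
        "owner_id": owner_id,
        "data": data,
        "type": rtype,
        "default_mask": default_mask,
        "group_masks": group_masks,
    }


resource = make_resource(
    name="r1",
    owner_id="7733c380-b83f-45de-a8b5-17e1bc3738a9",
    rtype="dataset-file",
    data={"path": "test1.txt", "metadata": "test1"},
    default_mask={"metadata": "no-access", "data": "no-access"},
    group_masks={"0": {"metadata": "edit", "data": "edit"}},
)

# The stringified JSON that would be stored in Redis.
resource_json = json.dumps(resource)
```

The Redis key under which the string is stored is not specified here, so the sketch stops at producing the string.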

For reference, these are the types currently defined:

dataset-file

data should be a hash containing two fields: path, the path to the data file, and metadata, a Redis key under which some metadata is stored.

This type was created mainly for testing, hence its simplicity.

dataset-geno

data should be a hash containing two fields: dataset, the name of a genotype dataset, and trait, the name of a trait dataset. These correspond to dataset.name and trait.name in the Python query, respectively. For example, “BXDGeno” is a dataset name and “rs3657281” is a trait name.

Currently this type has a single action branch containing, in practice, a single action: viewing data.

A JSON example:

{ "name": "some-resource",
  "owner_id": "7733c380-b83f-45de-a8b5-17e1bc3738a9",
  "data": { "dataset": "BXDGeno",
            "trait": "rs365781" },
  "type": "dataset-geno",
  "default_mask" : { "data": "view" },
  "group_masks": { "0": {"data": "view"} }
}

The query is defined in the function select-geno, in resource.rkt, and the result is provided as a string-encoded JSON array, transformed straight from the SQL result.

dataset-publish

data should be a hash containing two fields, dataset and trait. The Python equivalents are dataset.id and trait.name, respectively. The action set is essentially the same as for dataset-geno.

A JSON example:

{ "name": "some-resource",
  "owner_id": "7733c380-b83f-45de-a8b5-17e1bc3738a9",
  "data": { "dataset": "1",
            "trait": "17465" },
  "type": "dataset-publish",
  "default_mask" : { "data": "view" },
  "group_masks": { "0": {"data": "view"} }
}

The query is defined in the same module as dataset-geno, as the function select-publish. The query result is transformed into a JSON array, with SQL nulls replaced by JSON nulls.
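The null handling can be illustrated outside Racket. The sketch below is in Python rather than the proxy's own language, and it assumes each SQL row becomes one inner array; rows_to_json and the sample rows are hypothetical and only show that missing values end up as JSON null.

```python
import json


def rows_to_json(rows):
    """Encode an iterable of row tuples as a JSON array of arrays.

    Python's None (the usual stand-in for SQL NULL) is emitted as JSON null.
    """
    return json.dumps([list(row) for row in rows])


# Made-up rows, for illustration only.
rows = [("sample-a", 1.0, None), ("sample-b", None, 2.5)]
encoded = rows_to_json(rows)
```

A client decoding the response should therefore be prepared for null in any column that is nullable in the database.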

Defining new resource types

To define a new resource type, the resource-types hash in resource.rkt must be extended with an entry mapping the new resource type name (as a symbol) to the corresponding action set.

An action set is a hash of action “branches”, which are alists that map the name of each action (as a string) to the corresponding action. An action is a value of the action struct defined in privileges.rkt. It consists of a function of two arguments, together with the names of any additional parameters that the user must provide (e.g. in the request to the REST endpoint).
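To make the shape of these structures concrete, here is a rough Python model of them. It is an illustration only, not the Racket code: the Action class, its field names, the meaning of the two function arguments (here data and params), the example-type entry, and find_action are all assumptions made for this sketch. The branch and action names reuse the values that appear in the mask fields of the JSON examples.

```python
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass(frozen=True)
class Action:
    # Function of two arguments; what they hold is an assumption of this sketch.
    fun: Callable[[dict, dict], object]
    # Names of additional parameters the user must supply with the request.
    req_params: Tuple[str, ...] = ()


def no_access(data, params):
    return None


def view_file(data, params):
    return f"contents of {data['path']}"


# An action set: branch name -> ordered list of (action name, action) pairs,
# mirroring a hash of alists.
example_actions = {
    "data": [
        ("no-access", Action(no_access)),
        ("view", Action(view_file)),
    ],
}

# Resource type name -> action set (the Racket hash uses symbols as keys).
resource_types = {"example-type": example_actions}


def find_action(type_name, branch, action_name):
    """Look up an action by resource type, branch, and action name."""
    for name, action in resource_types[type_name][branch]:
        if name == action_name:
            return action
    raise KeyError(action_name)
```

Under this model, a mask such as {"data": "view"} names one action per branch, and find_action("example-type", "data", "view") returns the action to run.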

It’s probably best to look at how one of the existing resource types is defined. dataset-geno is one of the simplest that still queries the SQL database.

License

The GeneNetwork source code is released under the GNU Affero General Public License version 3 (AGPLv3). See ./LICENSE.txt.