This project is a proof of concept for building a decentralized key-value store on top of IPFS. It was built as part of the course Building Scalable Blockchain Applications with Big Data Technology at the Hasso Plattner Institute.
The main focus of this project is a consensus algorithm that lets a network of nodes agree on a consistent order of insert transactions. The consensus algorithm is based on ELASTICO by Luu et al. (2016), which promises near-linear agreement throughput with increasing node count. Akka Cluster is used for network communication. The results were evaluated using networks of different sizes with up to 200 nodes.
For development, you can run the service locally with additional dummy actors using:

```
gradle -p store appRun
```

This requires a running instance of IPFS. To run the service with a mocked IPFS instance, run:

```
MOCK_IPFS="true" gradle -p store appRun
```
You can interact with the store through the REST-API as follows:
- Insert a new value: `localhost:9090/store/api/store/my-namespace/my-key/my-value`
- Get the value for a key: `localhost:9090/store/api/store/my-namespace/my-key`
- Inspect the index: `localhost:9090/store/api/store/my-namespace/index`
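For illustration, the routes can be exercised with any HTTP client; the small helpers below merely assemble the URLs listed above (the helper names are ours, not part of the project):

```python
# Hypothetical helpers assembling the REST-API URLs listed above.
BASE = "http://localhost:9090/store/api/store"

def insert_url(namespace: str, key: str, value: str) -> str:
    """URL for inserting a new value."""
    return f"{BASE}/{namespace}/{key}/{value}"

def get_url(namespace: str, key: str) -> str:
    """URL for reading the value of a key."""
    return f"{BASE}/{namespace}/{key}"

def index_url(namespace: str) -> str:
    """URL for inspecting the index of a namespace."""
    return f"{BASE}/{namespace}/index"

print(insert_url("my-namespace", "my-key", "my-value"))
```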
- Set the `COMMUNITY_COUNT` variable in `community-assigment-server/communityAssignmentService.py` to the number of communities you want to have.
- Build the docker-compose setup by running `docker-compose -f docker-compose.yml build` in the root directory.
- Start multiple nodes with `docker-compose -f docker-compose.yml up --scale store=<number_of_nodes>`
A store node consists of two major components, the Datastore Service and the Consensus Service. The Datastore Service handles incoming transactions and manages the in-memory index of the keys as well as the Transaction Log. The Consensus Service communicates with the network and takes part in the consensus rounds.
The Datastore Service manages the high-level interactions. It handles incoming read and write requests, interacts with IPFS, and forwards new transactions to the Consensus Service. The Datastore Service maintains an index mapping keys to IPFS content hashes, which are required to address the values stored in IPFS. Besides that, it manages the Transaction Log, the immutable ordered list of all write transactions. Because all transactions go through a consensus round, the Transaction Log is eventually consistent across all nodes in the network.
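The split between index and Transaction Log can be sketched in a few lines of Python (an illustrative model with hypothetical names, not the project's actual code):

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

@dataclass(frozen=True)
class Transaction:
    key: str
    content_hash: str  # IPFS content hash addressing the stored value

@dataclass
class Datastore:
    # In-memory index: key -> IPFS content hash
    index: Dict[str, str] = field(default_factory=dict)
    # Immutable, ordered list of all agreed write transactions
    log: List[Transaction] = field(default_factory=list)

    def apply(self, tx: Transaction) -> None:
        """Integrate a transaction that reached consensus."""
        self.log.append(tx)
        self.index[tx.key] = tx.content_hash

    def lookup(self, key: str) -> Optional[str]:
        """Resolve a key to the content hash of its value."""
        return self.index.get(key)
```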
The Consensus Service starts and takes part in consensus rounds. For network communication we use Akka Cluster together with its PubSub implementation. Once the previous consensus round has finished, the Consensus Service pulls the next pending transaction from the Transaction Backlog and starts a new consensus round. The consensus algorithm is based on a simplified version of ELASTICO using PBFT internally. Due to the scope of this seminar, a couple of simplifications have been made, such as leaving out dynamic community formation, identity management, and cryptographic protocols. Once the network agrees on the next transaction, the Consensus Service notifies the Datastore Service, which integrates the new transaction into its Transaction Log and local index.
When adding a new key-value pair to a node, the following steps are performed:
- The Datastore Service adds the value to IPFS and retrieves its content hash for addressing the value.
- The tuple of the key and content hash of the value represents a new pending transaction which is added to the Transaction Backlog.
- The Consensus Service pulls a pending transaction from the Transaction Backlog and initiates a new consensus round with it. Following the ELASTICO agreement protocol, consensus happens in two rounds: first, the node's local community agrees on a next transaction; then the final community agrees on one transaction among those proposed by the local communities.
- Once the network reaches consensus on the next transaction, the Consensus Service notifies the Datastore Service. The Datastore Service integrates the key into its local index and adds the transaction to the Transaction Log.
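The four steps above can be condensed into a Python sketch. The IPFS call and the two-round agreement are simulated here: SHA-256 stands in for the real content hash, and the consensus outcome is simply assumed to be the proposed transaction.

```python
import hashlib
from collections import deque

backlog = deque()     # Transaction Backlog: pending (key, content_hash) tuples
transaction_log = []  # immutable, ordered list of agreed transactions
index = {}            # local index: key -> content hash

def ipfs_add(value: bytes) -> str:
    # Stand-in for IPFS: a real node would return a content hash (CID);
    # SHA-256 simulates content addressing for illustration only.
    return hashlib.sha256(value).hexdigest()

def write(key: str, value: bytes) -> None:
    # Steps 1 and 2: store the value, enqueue the pending transaction.
    content_hash = ipfs_add(value)
    backlog.append((key, content_hash))

def consensus_round() -> None:
    # Step 3: pull one pending transaction and run consensus on it.
    # The two ELASTICO rounds (local community, then final community)
    # are elided; we assume the proposal wins.
    key, content_hash = backlog.popleft()
    # Step 4: integrate the agreed transaction into log and index.
    transaction_log.append((key, content_hash))
    index[key] = content_hash

write("my-key", b"my-value")
consensus_round()
```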
When reading a key that has reached consistency in the network, the node only has to look up the respective content hash in its index. Using this content hash, it can return the value stored in IPFS. Note that in order to guarantee consistent reads, further measures such as quorum reads would have to be implemented.
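The read path is then a plain lookup. A toy illustration (the index contents and the IPFS accessor below are made up):

```python
def read(key, index, ipfs_get):
    """Resolve a key that reached consistency: index lookup, then IPFS fetch."""
    content_hash = index[key]      # key -> IPFS content hash
    return ipfs_get(content_hash)  # value addressed by the hash

# Toy stand-ins for the node's index and IPFS:
index = {"my-key": "hash-1"}
ipfs = {"hash-1": b"my-value"}
print(read("my-key", index, ipfs.get))  # b'my-value'
```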
For our experiments, the Community Assignment Server simplifies the dynamic community assignment by registering new nodes joining the cluster and statically assigning a community number to each node.
The following HTTP GET routes are supported:
- `/join`: get a community assignment
- `/members`: list all registered nodes and their assignment
- `/members/count`: count the currently registered nodes
- `/clear`: clear the current state of the server
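A hypothetical round-robin sketch of this logic (the actual communityAssignmentService.py may assign differently; the constructor argument plays the role of the `COMMUNITY_COUNT` variable):

```python
class CommunityAssignmentServer:
    """Registers joining nodes and statically assigns community numbers."""

    def __init__(self, community_count: int):
        self.community_count = community_count
        self.members = {}  # node id -> assigned community number

    def join(self, node_id: str) -> int:
        # /join: assign communities round-robin so nodes spread evenly;
        # a node that rejoins keeps its previous assignment.
        if node_id not in self.members:
            self.members[node_id] = len(self.members) % self.community_count
        return self.members[node_id]

    def count(self) -> int:
        # /members/count: number of currently registered nodes.
        return len(self.members)

    def clear(self) -> None:
        # /clear: reset the server state.
        self.members.clear()
```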
For our experiments, we initially set up two seed nodes. Next, we dynamically added further nodes for the different experiment setups. Nodes can be deployed on various instances, including container engines. In our case, we used DigitalOcean with one s-1vcpu-1gb instance per store node. After all nodes had joined the network, we added new random key-value pairs to all nodes in the network using round-robin scheduling. Finally, the start and end events of the consensus rounds for each transaction were pulled from the logs of all nodes. The duration of a consensus round is defined as the difference between the timestamp of the event starting the consensus round and the latest timestamp of a node agreeing on this transaction.
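That duration metric can be computed from the collected log events as sketched below; the event tuples are a hypothetical shape for the log entries, with timestamps in milliseconds.

```python
def consensus_duration(events):
    """Duration of one consensus round across all nodes' logs:
    latest 'agreed' timestamp minus the 'started' timestamp.
    Each event is a (node, kind, timestamp) tuple."""
    start = min(t for _, kind, t in events if kind == "started")
    latest_agree = max(t for _, kind, t in events if kind == "agreed")
    return latest_agree - start

events = [
    ("node-1", "started", 1000),
    ("node-1", "agreed", 1350),
    ("node-2", "agreed", 1900),
]
print(consensus_duration(events))  # 900
```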
The following table shows a selection of the consensus round durations measured with the experiment setup described above. For these experiments, the nodes were distributed equally among the communities.
Description | Median | Mean |
---|---|---|
5 Nodes, 1 Community | 0.35 | 0.5 |
25 Nodes, 2 Communities | 0.59 | 0.89 |
100 Nodes, 25 Communities | 3.04 | 9.49 |
200 Nodes, 50 Communities | 17.89 | 45.64 |
The results show that the median and mean grow at different rates. This divergence might indicate increasing network overhead and the resulting outliers in consensus durations.
More details on all experiments and the raw log files can be found here.
- Implement proof of work for joining a community to prevent a malicious user from controlling too many nodes
- Add recovery mechanisms for nodes rejoining the cluster
- Integrate the assignment service into the nodes (possibly with an approach similar to ELASTICO's)
- Create dedicated Akka PubSub Clusters for every community to reduce messaging and cluster setup time