-
Notifications
You must be signed in to change notification settings - Fork 2.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
WIP Spec: Draft for top-level p2p spec #9732
base: main
Are you sure you want to change the base?
Conversation
* add parameters to byzantine send action * make net not trusted it's not necessary since for proofs Ivy will assume that the environment does not break action preconditions * use require instead of assume it seems that assume is not checked when other isolates call! * add comment * add comment * run with random seed * make domain model extractable to C++ * substitute require for assume assumes in an action are not checked when the action is called! I.e. they place no requirement on the caller; we're just assuming that the caller is going to do the right thing. This wasn't very important here but it leade to a minor inconsistency slipping through. * make the net isolate not trusted there was no need for it * add tendermint_test.ivy contains a simple test scenario that show that the specification is no vacuuous * update comment * add comments * throw if trying to parse nset value in the repl * add comment * minor refactoring
Bumps [gaurav-nelson/github-action-markdown-link-check](https://github.com/gaurav-nelson/github-action-markdown-link-check) from 1.0.12 to 1.0.13. - [Release notes](https://github.com/gaurav-nelson/github-action-markdown-link-check/releases) - [Commits](gaurav-nelson/github-action-markdown-link-check@1.0.12...1.0.13) --- updated-dependencies: - dependency-name: gaurav-nelson/github-action-markdown-link-check dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Bumps [actions/stale](https://github.com/actions/stale) from 3.0.19 to 4. - [Release notes](https://github.com/actions/stale/releases) - [Changelog](https://github.com/actions/stale/blob/main/CHANGELOG.md) - [Commits](actions/stale@v3.0.19...v4) --- updated-dependencies: - dependency-name: actions/stale dependency-type: direct:production update-type: version-update:semver-major ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
* abci: clarify what abci stands for * link to abci type protos.
* abci: clarify connection use in-process * Update abci.md * Update spec/abci/abci.md Co-authored-by: M. J. Fromberger <fromberger@interchain.io> * Update spec/abci/abci.md Co-authored-by: M. J. Fromberger <fromberger@interchain.io> * invert abci explanations * lint++ * lint++ * lint++ * lint++ Co-authored-by: M. J. Fromberger <fromberger@interchain.io>
* abci: points of clarification ahead of v0.1.0 * lint++ * typo * lint++ * double word score * grammar * Update spec/abci/abci.md Co-authored-by: M. J. Fromberger <fromberger@interchain.io> * Update spec/abci/abci.md Co-authored-by: M. J. Fromberger <fromberger@interchain.io> * Update spec/abci/abci.md Co-authored-by: M. J. Fromberger <fromberger@interchain.io> * Update spec/abci/abci.md Co-authored-by: M. J. Fromberger <fromberger@interchain.io> * Update spec/abci/abci.md Co-authored-by: M. J. Fromberger <fromberger@interchain.io> * Update spec/abci/abci.md Co-authored-by: M. J. Fromberger <fromberger@interchain.io> * Update spec/abci/abci.md Co-authored-by: M. J. Fromberger <fromberger@interchain.io> * Update spec/abci/abci.md Co-authored-by: M. J. Fromberger <fromberger@interchain.io> * Update spec/abci/abci.md Co-authored-by: M. J. Fromberger <fromberger@interchain.io> * Update spec/abci/abci.md Co-authored-by: M. J. Fromberger <fromberger@interchain.io> * Update spec/abci/abci.md Co-authored-by: M. J. Fromberger <fromberger@interchain.io> * Update spec/abci/abci.md Co-authored-by: M. J. Fromberger <fromberger@interchain.io> * Update spec/abci/abci.md Co-authored-by: M. J. Fromberger <fromberger@interchain.io> * pr feedback * wip * update non-zero status code docs * fix event description * update CheckTx description Co-authored-by: M. J. Fromberger <fromberger@interchain.io>
* Update supervisor_001_draft.md If the only node in the *FullNodes* set is the primary, that was just deemed faulty, we can't find honest primary. * Update supervisor_001_draft.md
* wip * wip * wip * remove comments in favor of gh comments * wip * udpates to language, should must etc * Apply suggestions from code review Co-authored-by: M. J. Fromberger <fromberger@interchain.io> * remove tendermint cache description Co-authored-by: M. J. Fromberger <fromberger@interchain.io>
* add missed proto files * add abci changes * rename blockchain to blocksync * Update proto/tendermint/abci/types.proto Co-authored-by: Callum Waters <cmwaters19@gmail.com> Co-authored-by: Callum Waters <cmwaters19@gmail.com>
…ions Tendermint specification version 0.7.1
* [Rebased to v0.34.x] abci: PrepareProposal (#6544) * fixed cherry-pick * proto changes * make proto-gen * UT fixes * generate Client directive * mockery * App fixes * Disable 'modified tx' hack * mockery * Make format * Fix lint Co-authored-by: Marko <marbar3778@yahoo.com>
This test would fail if run with "go test -count=2" because it uses a fixed address and was not closing the server, so the subsequent run could not bind to the address. While closing the server is correct, it would probably be better if the API was able to report the bound address so that we could pass "localhost:0" for an anonymous port. But I am currently focusing on test cleanup, not ready to change any existing APIs.
* libs/pubsub/query: specify peg version in go.mod The code to generate the pubsub queries was dependent on an unspecified version of the peg tool. This brings peg into go.mod so it is on a fixed version. This should also enable dependabot to notify us of future updates to peg. The version of query.peg.go generated from the current version of peg correctly contains the special "Code generated by..." line to indicate to other tools that the file is automatically generated and should therefore be excluded from linters, etc. I removed the make target as there were no git grep results referencing "gen_query_parser"; directly running "go generate" is a reasonable expectation in Go projects. Now that "go run" is module aware, I would typically use "go run" inside the go:generate directive, but in this case we go build to a gitignore-d directory in order to work around the nondeterministic output detailed in pointlander/peg#129. * libs/pubsub/query: check error from (*QueryParser).Init() The newly generated peg code returns an error from Init(); the previous version was niladic. Co-authored-by: Sam Kleinman <garen@tychoish.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some comments on the introduction of the document. Possibly need of better definitions for used terms.
- (correct) nodes that follow the protocol described here | ||
- (potentially) adversarial nodes whose behavior deviates to harm the system | ||
- (correct) nodes that don't follow the protocol to shield themselves but behave in a "nice way" | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it be worth touching on "connectedness" in this section? Maybe some definitions that give an idea of how the number connections that a node keeps to peers, together with the "amount of bad peers" (or probability of a peer to be bad) has an impact on the global properties we want to achieve (e.g. a message reaches all full nodes eventually)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, you are right. I think there are basic implicit assumptions/requirements in the current solution, e.g., between any two correct validators there is a path consisting only of correct full nodes. (Most likely even more paths). This might be a requirement that the p2p needs to provide to the reactors, and it can only do so, if the underlying network has even stronger properties.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Moreover, I hardly can see this property being ensured, it is probabilistically achieved.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are right @sergio-mena, in that these properties cannot be guaranteed, so maybe we shouldn't try. I mean, the P2P should only guarantee a best effort guarantee of conectedness, and then we call out that if by some external forces other guarantees are made, such as the existence of a GST and a maximum number of byzantine nodes, then and stronger properties can be derived, like eventual delivery of messages from some correct processes to some correct processes. This way we modularize the guarantees.
TODO: | ||
- Reactor API | ||
- Network? How do we communicate with other nodes | ||
- Discuss that validators run special set-up, and manage their own neighborhood (hide behind sentry nodes). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a good point. Although I'd simplify to say "if you configure a node to prevent it to normally connect to peers " (validator<-->sentry) then only the normally connected node (the sentry) can expect the properties the overlay network provides
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I think we should use something like "normally" in the definitions of the specification. At this points I think it makes sense to talk about validators to motivate the definition.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like the idea of defining a public network, the overlay, fully using p2p and PEX, and a private network, where each operator manually configures nodes. In fact, validators also have their IDs set as "private", so that no one learns its actual address.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@sergio-mena says: let's drop the term "validators" from this spec.
Talk about the public network of full nodes. Explain the validators need to make sure to be connected to the public network
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@cason says:
- We assume validators are somehow connected to at least a correct public node
From this point we reason about correct public nodes
spec/p2p/p2p-top-level.md
Outdated
> Even if at every time *t* we satisfy Point 1, if the overlays at times *t* and *t+1* are totally different, it might be hard to implement decent communication on top of it. E.g., Consensus gossip requires a neighbor to know its neighbors *k* state so that it can send the message to *k* that help *k* to advance. If *k* is connected only one second per hour, this is not feasible. | ||
3. Openness. It is always the case that new nodes can be added to the system | ||
> Assuming 1. and 2. holds, this means, there must always be nodes that are willing to add connections to new peers. | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Something about self-healing? (not sure, just thinking out loud)
Something like, "the network overlay can recover from important crippling network events (e.g. temporary partitions)"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are right. I wrote the properties 1.-3. having somewhat "good periods" in mind. The question as I see it now when I look at your comment is whether there is a relation between self healing and openness. But perhaps the connection is rather in the solution side. We need to think more ;-)
|
||
> should be English and precise. will be accompanied with a TLA spec. | ||
|
||
TODO: This seems to be a research question. Perhaps we can find some simple properties by looking at the peer-to-peer systems academic literature from several years ago? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Totally agree
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
found the following quote while doing a quick research "I cannot define gossip, but I can recognize it when I see it”
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@josef-widder, thanks for putting together this draft top-level spec.
I added a few comments (mainly with some ideas).
I like the overall structure. IMO it's an excellent starting point: this document structure can be helpful on execution (i.e., who focuses on what and when). I think we could even structure our tracking issue (#9573) according to this document structure.
spec/p2p/p2p-top-level.md
Outdated
# Part I - A Tendermint node | ||
|
||
TODO: | ||
- perhaps we should survey here what we think are the expectations of the reactors from p2p? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should the reactors be exposed or have a say in the overlay? Could we have multiple ones in parallel? For example, could we use a chord like overlay for 1-to-1 communication but a fat tree rooted at the proposer for proposal broadcast? The p2p could ask the reactors for hints on how to propagate messages.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- no duplication
- no corruption
the following are "low level" and not required as long as messages are delivered.
- ability to prioritize messages.
- ability to make messages stale, for example, right after sending MSG1, still in the outbound buffers, send a MSG2, which makes MSG1 useless and dropped.
- at most once delivery
- channel isolation (or some best effort)
- to respect back pressure
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For now we will stay with "one overlay for all". I will make this explicit in the write-up.
p2p properties (single hop, by channel)
- no corruption
- prioritization
seems that currently message handling by p2p is best effort
- consensus might need for liveness that neighborhood is stable | ||
- mempool might just need that a transaction can reach each validator within a reasonable amount of time (this might be achievable if at no point in time the current graph is connected, but, e.g., something along the lines that the union of the graphs over some period in time is connected.) | ||
- What does p2p expect from the reactors? (don't falsely report bad nodes; this puts requirements on the reactors and perhaps/likely also on the application running on top of ABCI) | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know where exactly to put it, but it would be a good property not to leak information to other nodes, for example, the internal topology of a validator network.
|
||
> should be English and precise. will be accompanied with a TLA spec. | ||
|
||
TODO: This seems to be a research question. Perhaps we can find some simple properties by looking at the peer-to-peer systems academic literature from several years ago? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
found the following quote while doing a quick research "I cannot define gossip, but I can recognize it when I see it”
spec/p2p/p2p-top-level.md
Outdated
- I/O | ||
- dispatch messages incoming from the network to the reactors | ||
- send messages incoming from the reactors to the network (the peers the messages should go to) | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- do not forward messages already known to potential receivers
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought this is currently implemented at reactor level, no?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the reactors try to filter what to send, based on the deduced state of the other nodes, but not based one what messages they have received. The first is stronger, if correctly implemented, but the latter should be easier to implement.
assumptions make sense, e.g., it is in the interest of a full node to behave | ||
correctly | ||
|
||
> should have clear formalization in temporal logic. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
one interesting property is that eventually all messages sent by correct nodes to correct nodes are delivered or they become stale (e.g., a decision is reached and becomes known)
|
||
> safety specifications / invariants in English | ||
|
||
TODO: In a good period, *p* should stay connected with *q*. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is this really required? What if there are better connections?
I thought about "In a good period, connections eventually stabilize", but even in good periods there may be changes in the network that the p2p layer might want to take into account. And even if not, if we need to ensure openness and a bounded number of connections, good connections may need to be dropped.
Wondering if there is any way we could get anonymized aggregated data from the network? For example average number of connections, average memory usage, anything that we could use to study the network while not revealing too much to malicious entities. |
Co-authored-by: Lasaro Camargos <lasaro@informal.systems>
Co-authored-by: Lasaro Camargos <lasaro@informal.systems>
- proposers are not disconnected? | ||
- mempool might just need that a transaction can reach each validator within a reasonable amount of time (this might be achievable if at no point in time the current graph is connected, but, e.g., something along the lines that the union of the graphs over some period in time is connected.) | ||
> talk about the 1-to-1 and the 1-to-many/all delivery guarantees needed by the protocols (do this for each protocol. discuss how these requirements translate into what p2p should guarantee) | ||
- What does p2p expect from the reactors? (don't falsely report bad nodes; this puts requirements on the reactors and perhaps/likely also on the application running on top of ABCI) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- to provide reasonable parameters to drive the connections of peers.
These parameters can be low level (like what we have now, i.e., number of outgoing connections, number of incoming connections, permanent peers) or higher level (e.g., form a tree, for a ring with long n fingers...), which require a more elaborate PEX.
…int into josef/i9573-josef-p2p
This is the beginning of a high-level p2p spec addressing #9573 for discussions. I started top down trying to understand what p2p is supposed to do rather than how p2p should do things. This collects findings from several discussion.
PR checklist
CHANGELOG_PENDING.md
updated, or no changelog entry neededdocs/
) and code comments, or nodocumentation updates needed