WIP Spec: Draft for top-level p2p spec #9732

josef-widder · 2022-11-21T15:31:18Z

This is the beginning of a high-level p2p spec addressing #9573 for discussions. I started top down trying to understand what p2p is supposed to do rather than how p2p should do things. This collects findings from several discussion.

PR checklist

Tests written/updated, or no tests needed
CHANGELOG_PENDING.md updated, or no changelog entry needed
Updated relevant documentation (docs/) and code comments, or no
documentation updates needed

* add parameters to byzantine send action * make net not trusted it's not necessary since for proofs Ivy will assume that the environment does not break action preconditions * use require instead of assume it seems that assume is not checked when other isolates call! * add comment * add comment * run with random seed * make domain model extractable to C++ * substitute require for assume assumes in an action are not checked when the action is called! I.e. they place no requirement on the caller; we're just assuming that the caller is going to do the right thing. This wasn't very important here but it leade to a minor inconsistency slipping through. * make the net isolate not trusted there was no need for it * add tendermint_test.ivy contains a simple test scenario that show that the specification is no vacuuous * update comment * add comments * throw if trying to parse nset value in the repl * add comment * minor refactoring

Bumps [gaurav-nelson/github-action-markdown-link-check](https://github.com/gaurav-nelson/github-action-markdown-link-check) from 1.0.12 to 1.0.13. - [Release notes](https://github.com/gaurav-nelson/github-action-markdown-link-check/releases) - [Commits](gaurav-nelson/github-action-markdown-link-check@1.0.12...1.0.13) --- updated-dependencies: - dependency-name: gaurav-nelson/github-action-markdown-link-check dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

Bumps [actions/stale](https://github.com/actions/stale) from 3.0.19 to 4. - [Release notes](https://github.com/actions/stale/releases) - [Changelog](https://github.com/actions/stale/blob/main/CHANGELOG.md) - [Commits](actions/stale@v3.0.19...v4) --- updated-dependencies: - dependency-name: actions/stale dependency-type: direct:production update-type: version-update:semver-major ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* abci: clarify what abci stands for * link to abci type protos.

* abci: clarify connection use in-process * Update abci.md * Update spec/abci/abci.md Co-authored-by: M. J. Fromberger <fromberger@interchain.io> * Update spec/abci/abci.md Co-authored-by: M. J. Fromberger <fromberger@interchain.io> * invert abci explanations * lint++ * lint++ * lint++ * lint++ Co-authored-by: M. J. Fromberger <fromberger@interchain.io>

…package name (#344)

* abci: points of clarification ahead of v0.1.0 * lint++ * typo * lint++ * double word score * grammar * Update spec/abci/abci.md Co-authored-by: M. J. Fromberger <fromberger@interchain.io> * Update spec/abci/abci.md Co-authored-by: M. J. Fromberger <fromberger@interchain.io> * Update spec/abci/abci.md Co-authored-by: M. J. Fromberger <fromberger@interchain.io> * Update spec/abci/abci.md Co-authored-by: M. J. Fromberger <fromberger@interchain.io> * Update spec/abci/abci.md Co-authored-by: M. J. Fromberger <fromberger@interchain.io> * Update spec/abci/abci.md Co-authored-by: M. J. Fromberger <fromberger@interchain.io> * Update spec/abci/abci.md Co-authored-by: M. J. Fromberger <fromberger@interchain.io> * Update spec/abci/abci.md Co-authored-by: M. J. Fromberger <fromberger@interchain.io> * Update spec/abci/abci.md Co-authored-by: M. J. Fromberger <fromberger@interchain.io> * Update spec/abci/abci.md Co-authored-by: M. J. Fromberger <fromberger@interchain.io> * Update spec/abci/abci.md Co-authored-by: M. J. Fromberger <fromberger@interchain.io> * Update spec/abci/abci.md Co-authored-by: M. J. Fromberger <fromberger@interchain.io> * Update spec/abci/abci.md Co-authored-by: M. J. Fromberger <fromberger@interchain.io> * pr feedback * wip * update non-zero status code docs * fix event description * update CheckTx description Co-authored-by: M. J. Fromberger <fromberger@interchain.io>

* Update supervisor_001_draft.md If the only node in the *FullNodes* set is the primary, that was just deemed faulty, we can't find honest primary. * Update supervisor_001_draft.md

* wip * wip * wip * remove comments in favor of gh comments * wip * udpates to language, should must etc * Apply suggestions from code review Co-authored-by: M. J. Fromberger <fromberger@interchain.io> * remove tendermint cache description Co-authored-by: M. J. Fromberger <fromberger@interchain.io>

* add missed proto files * add abci changes * rename blockchain to blocksync * Update proto/tendermint/abci/types.proto Co-authored-by: Callum Waters <cmwaters19@gmail.com> Co-authored-by: Callum Waters <cmwaters19@gmail.com>

…ions Tendermint specification version 0.7.1

* [Rebased to v0.34.x] abci: PrepareProposal (#6544) * fixed cherry-pick * proto changes * make proto-gen * UT fixes * generate Client directive * mockery * App fixes * Disable 'modified tx' hack * mockery * Make format * Fix lint Co-authored-by: Marko <marbar3778@yahoo.com>

This test would fail if run with "go test -count=2" because it uses a fixed address and was not closing the server, so the subsequent run could not bind to the address. While closing the server is correct, it would probably be better if the API was able to report the bound address so that we could pass "localhost:0" for an anonymous port. But I am currently focusing on test cleanup, not ready to change any existing APIs.

* libs/pubsub/query: specify peg version in go.mod The code to generate the pubsub queries was dependent on an unspecified version of the peg tool. This brings peg into go.mod so it is on a fixed version. This should also enable dependabot to notify us of future updates to peg. The version of query.peg.go generated from the current version of peg correctly contains the special "Code generated by..." line to indicate to other tools that the file is automatically generated and should therefore be excluded from linters, etc. I removed the make target as there were no git grep results referencing "gen_query_parser"; directly running "go generate" is a reasonable expectation in Go projects. Now that "go run" is module aware, I would typically use "go run" inside the go:generate directive, but in this case we go build to a gitignore-d directory in order to work around the nondeterministic output detailed in pointlander/peg#129. * libs/pubsub/query: check error from (*QueryParser).Init() The newly generated peg code returns an error from Init(); the previous version was niladic. Co-authored-by: Sam Kleinman <garen@tychoish.com>

cason

Some comments on the introduction of the document. Possibly need of better definitions for used terms.

spec/p2p/p2p-top-level.md

sergio-mena · 2022-11-24T09:07:56Z

spec/p2p/p2p-top-level.md

+       - (correct) nodes that follow the protocol described here
+       - (potentially) adversarial nodes whose behavior deviates to harm the system
+       - (correct) nodes that don't follow the protocol to shield themselves but behave in a "nice way"
+


Would it be worth touching on "connectedness" in this section? Maybe some definitions that give an idea of how the number connections that a node keeps to peers, together with the "amount of bad peers" (or probability of a peer to be bad) has an impact on the global properties we want to achieve (e.g. a message reaches all full nodes eventually)

Yes, you are right. I think there are basic implicit assumptions/requirements in the current solution, e.g., between any two correct validators there is a path consisting only of correct full nodes. (Most likely even more paths). This might be a requirement that the p2p needs to provide to the reactors, and it can only do so, if the underlying network has even stronger properties.

Moreover, I hardly can see this property being ensured, it is probabilistically achieved.

You are right @sergio-mena, in that these properties cannot be guaranteed, so maybe we shouldn't try. I mean, the P2P should only guarantee a best effort guarantee of conectedness, and then we call out that if by some external forces other guarantees are made, such as the existence of a GST and a maximum number of byzantine nodes, then and stronger properties can be derived, like eventual delivery of messages from some correct processes to some correct processes. This way we modularize the guarantees.

sergio-mena · 2022-11-24T09:09:37Z

spec/p2p/p2p-top-level.md

+TODO:
+- Reactor API
+- Network? How do we communicate with other nodes
+- Discuss that validators run special set-up, and manage their own neighborhood (hide behind sentry nodes).


That's a good point. Although I'd simplify to say "if you configure a node to prevent it to normally connect to peers " (validator<-->sentry) then only the normally connected node (the sentry) can expect the properties the overlay network provides

Yes, I think we should use something like "normally" in the definitions of the specification. At this points I think it makes sense to talk about validators to motivate the definition.

I like the idea of defining a public network, the overlay, fully using p2p and PEX, and a private network, where each operator manually configures nodes. In fact, validators also have their IDs set as "private", so that no one learns its actual address.

@sergio-mena says: let's drop the term "validators" from this spec.

Talk about the public network of full nodes. Explain the validators need to make sure to be connected to the public network

@cason says:

We assume validators are somehow connected to at least a correct public node

From this point we reason about correct public nodes

sergio-mena · 2022-11-24T09:12:41Z

spec/p2p/p2p-top-level.md

+    > Even if at every time *t* we satisfy Point 1, if the overlays at times *t* and *t+1* are totally different, it might be hard to implement decent communication on top of it. E.g., Consensus gossip requires a neighbor to know its neighbors *k* state so that it can send the message to *k* that help *k* to advance. If *k* is connected only one second per hour, this is not feasible.
+3. Openness. It is always the case that new nodes can be added to the system
+    > Assuming 1. and 2. holds, this means, there must always be nodes that are willing to add connections to new peers.
+


Something about self-healing? (not sure, just thinking out loud)
Something like, "the network overlay can recover from important crippling network events (e.g. temporary partitions)"

You are right. I wrote the properties 1.-3. having somewhat "good periods" in mind. The question as I see it now when I look at your comment is whether there is a relation between self healing and openness. But perhaps the connection is rather in the solution side. We need to think more ;-)

sergio-mena · 2022-11-24T09:13:03Z

spec/p2p/p2p-top-level.md

+
+> should be English and precise. will be accompanied with a TLA spec.
+
+TODO: This seems to be a research question. Perhaps we can find some simple properties by looking at the peer-to-peer systems academic literature from several years ago?


Totally agree

found the following quote while doing a quick research "I cannot define gossip, but I can recognize it when I see it”

sergio-mena

@josef-widder, thanks for putting together this draft top-level spec.
I added a few comments (mainly with some ideas).
I like the overall structure. IMO it's an excellent starting point: this document structure can be helpful on execution (i.e., who focuses on what and when). I think we could even structure our tracking issue (#9573) according to this document structure.

spec/p2p/p2p-top-level.md

lasarojc · 2022-11-24T14:05:38Z

spec/p2p/p2p-top-level.md

+# Part I - A Tendermint node 
+
+TODO: 
+- perhaps we should survey here what we think are the expectations of the reactors from p2p? 


Should the reactors be exposed or have a say in the overlay? Could we have multiple ones in parallel? For example, could we use a chord like overlay for 1-to-1 communication but a fat tree rooted at the proposer for proposal broadcast? The p2p could ask the reactors for hints on how to propagate messages.

no duplication

no corruption

the following are "low level" and not required as long as messages are delivered.

ability to prioritize messages.

ability to make messages stale, for example, right after sending MSG1, still in the outbound buffers, send a MSG2, which makes MSG1 useless and dropped.

at most once delivery

channel isolation (or some best effort)

to respect back pressure

For now we will stay with "one overlay for all". I will make this explicit in the write-up.

p2p properties (single hop, by channel)

no corruption

prioritization

seems that currently message handling by p2p is best effort

spec/p2p/p2p-top-level.md

lasarojc · 2022-11-24T14:30:45Z

spec/p2p/p2p-top-level.md

+    - consensus might need for liveness that neighborhood is stable
+    - mempool might just need that a transaction can reach each validator within a reasonable amount of time (this might be achievable if at no point in time the current graph is connected, but, e.g., something along the lines that the union of the graphs over some period in time is connected.)
+- What does p2p expect from the reactors? (don't falsely report bad nodes; this puts requirements on the reactors and perhaps/likely also on the application running on top of ABCI)
+


I don't know where exactly to put it, but it would be a good property not to leak information to other nodes, for example, the internal topology of a validator network.

lasarojc · 2022-11-24T14:59:00Z

spec/p2p/p2p-top-level.md

+
+> should be English and precise. will be accompanied with a TLA spec.
+
+TODO: This seems to be a research question. Perhaps we can find some simple properties by looking at the peer-to-peer systems academic literature from several years ago?


found the following quote while doing a quick research "I cannot define gossip, but I can recognize it when I see it”

lasarojc · 2022-11-24T15:01:59Z

spec/p2p/p2p-top-level.md

+- I/O
+   - dispatch messages incoming from the network to the reactors
+   - send messages incoming from the reactors to the network (the peers the messages should go to) 
+


do not forward messages already known to potential receivers

I thought this is currently implemented at reactor level, no?

I think the reactors try to filter what to send, based on the deduced state of the other nodes, but not based one what messages they have received. The first is stronger, if correctly implemented, but the latter should be easier to implement.

spec/p2p/p2p-top-level.md

lasarojc · 2022-11-24T15:08:12Z

spec/p2p/p2p-top-level.md

+assumptions make sense, e.g., it is in the interest of a full node to behave
+correctly 
+
+> should have clear formalization in temporal logic.


one interesting property is that eventually all messages sent by correct nodes to correct nodes are delivered or they become stale (e.g., a decision is reached and becomes known)

lasarojc · 2022-11-24T15:25:29Z

spec/p2p/p2p-top-level.md

+
+> safety specifications / invariants in English 
+
+TODO: In a good period, *p* should stay connected with *q*.


is this really required? What if there are better connections?
I thought about "In a good period, connections eventually stabilize", but even in good periods there may be changes in the network that the p2p layer might want to take into account. And even if not, if we need to ensure openness and a bounded number of connections, good connections may need to be dropped.

lasarojc · 2022-11-24T18:16:33Z

Wondering if there is any way we could get anonymized aggregated data from the network? For example average number of connections, average memory usage, anything that we could use to study the network while not revealing too much to malicious entities.

Co-authored-by: Lasaro Camargos <lasaro@informal.systems>

lasarojc · 2022-11-25T17:58:50Z

spec/p2p/p2p-top-level.md

+        - proposers are not disconnected?
+    - mempool might just need that a transaction can reach each validator within a reasonable amount of time (this might be achievable if at no point in time the current graph is connected, but, e.g., something along the lines that the union of the graphs over some period in time is connected.)
+    >  talk about the 1-to-1 and the 1-to-many/all delivery guarantees needed by the protocols (do this for each protocol. discuss how these requirements translate into what p2p should guarantee)
+- What does p2p expect from the reactors? (don't falsely report bad nodes; this puts requirements on the reactors and perhaps/likely also on the application running on top of ABCI)


to provide reasonable parameters to drive the connections of peers.

These parameters can be low level (like what we have now, i.e., number of outgoing connections, number of incoming connections, permanent peers) or higher level (e.g., form a tree, for a ring with long n fingers...), which require a more elaborate PEX.

…int into josef/i9573-josef-p2p

crypto-facs and others added 30 commits June 24, 2021 15:01

rpc: Add totalGasUSed to block_results response (#308)

6bd5263

add new pex messages (#312)

6039594

update spec to reference currently used timestamp type (#317)

0ff67d6

address discrepancies between spec and implementation (#322)

24f22ee

update proto files for release (#318)

8f9cd23

stale bot: ignore issues (#325)

ed7fa80

evidence: add section explaining evidence (#324)

8029cf7

statesync: new messages for gossiping consensus params (#328)

72adbf9

rpc: update peer format in specification in NetInfo operation (#331)

1e985f6

Update supervisor_001_draft.md (#334)

4c3339a

core: text cleanup (#332)

e44ab95

abci: clarify what abci stands for (#336)

26ee62a

* abci: clarify what abci stands for * link to abci type protos.

proto: move proto files under the correct directory related to their …

41e6812

…package name (#344)

Update supervisor_001_draft.md (#333)

f793752

* Update supervisor_001_draft.md If the only node in the *FullNodes* set is the primary, that was just deemed faulty, we can't find honest primary. * Update supervisor_001_draft.md

light: update initialization description (#320)

e2a038e

proto: add tendermint go changes (#349)

fa2ccc8

* add missed proto files * add abci changes * rename blockchain to blocksync * Update proto/tendermint/abci/types.proto Co-authored-by: Callum Waters <cmwaters19@gmail.com> Co-authored-by: Callum Waters <cmwaters19@gmail.com>

Merge tag 'v0.7.1' of github.com:tendermint/spec into cal/spec-migrat…

1ad0683

…ions Tendermint specification version 0.7.1

fix mockery generation script (#9094)

a11b774

Merge branch 'main' into cal/spec-migrations

0323bd5

spec: migrate spec at v0.7.1. into main (#9098)

a7049ed

Fix unsafe-reset-all for working with default home (#9103)

18c7e3a

to be shared

8ec77d4

josef-widder self-assigned this Nov 21, 2022

josef-widder changed the title ~~Josef/i9573 josef p2p~~ Draft for top-level p2p spec Nov 21, 2022

josef-widder changed the title ~~Draft for top-level p2p spec~~ spec: Draft for top-level p2p spec Nov 21, 2022

josef-widder changed the title ~~spec: Draft for top-level p2p spec~~ WIP Spec: Draft for top-level p2p spec Nov 21, 2022

some requirements discussion

0461d43

cason reviewed Nov 23, 2022

View reviewed changes

spec/p2p/p2p-top-level.md Outdated Show resolved Hide resolved

spec/p2p/p2p-top-level.md Outdated Show resolved Hide resolved

spec/p2p/p2p-top-level.md Outdated Show resolved Hide resolved

sergio-mena reviewed Nov 23, 2022

View reviewed changes

spec/p2p/p2p-top-level.md Outdated Show resolved Hide resolved

sergio-mena reviewed Nov 24, 2022

View reviewed changes

lasarojc reviewed Nov 24, 2022

View reviewed changes

josef-widder added 2 commits November 24, 2022 16:56

things learned from the SRDS 2021 paper.

6c04c1d

minor changes

128b617

josef-widder and others added 3 commits November 25, 2022 14:18

Update spec/p2p/p2p-top-level.md

a0d0856

Co-authored-by: Lasaro Camargos <lasaro@informal.systems>

incorporated comments from the PR discussion

7ef04ee

Update spec/p2p/p2p-top-level.md

46c857c

Co-authored-by: Lasaro Camargos <lasaro@informal.systems>

lasarojc reviewed Nov 25, 2022

View reviewed changes

josef-widder added 2 commits November 29, 2022 17:17

added (unstructured) nodes after office hour with Bucky

84b521a

Merge branch 'josef/i9573-josef-p2p' of github.com:tendermint/tenderm…

0fbeaaf

…int into josef/i9573-josef-p2p

thanethomson added the S:wip Work in progress (prevents stalebot from automatically closing) label Nov 30, 2022

josef-widder added 3 commits December 5, 2022 14:03

consensus reactor

e3fd2cf

requirements

2d85387

req

3bc10d1

josef-widder added the C:spec Component: specification label Dec 22, 2022

jmalicevic mentioned this pull request Jan 5, 2023

WIP Spec: Draft for top-level p2p spec cometbft/cometbft#80

Closed

jaekwon force-pushed the main branch from a9feb1c to 64747b2 Compare January 5, 2023 21:18

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

WIP Spec: Draft for top-level p2p spec #9732

WIP Spec: Draft for top-level p2p spec #9732

josef-widder commented Nov 21, 2022

cason left a comment

sergio-mena Nov 24, 2022

josef-widder Nov 24, 2022

cason Nov 25, 2022 •

edited

lasarojc Nov 25, 2022

sergio-mena Nov 24, 2022

josef-widder Nov 24, 2022 •

edited

cason Nov 25, 2022

josef-widder Nov 25, 2022

josef-widder Nov 25, 2022

sergio-mena Nov 24, 2022

josef-widder Nov 24, 2022

sergio-mena Nov 24, 2022

lasarojc Nov 24, 2022

sergio-mena left a comment

lasarojc Nov 24, 2022

lasarojc Nov 24, 2022

josef-widder Nov 25, 2022

lasarojc Nov 24, 2022

lasarojc Nov 24, 2022

lasarojc Nov 24, 2022

josef-widder Nov 25, 2022

lasarojc Nov 25, 2022

lasarojc Nov 24, 2022

lasarojc Nov 24, 2022

lasarojc commented Nov 24, 2022

lasarojc Nov 25, 2022


		> should be English and precise. will be accompanied with a TLA spec.

		TODO: This seems to be a research question. Perhaps we can find some simple properties by looking at the peer-to-peer systems academic literature from several years ago?


		> safety specifications / invariants in English

		TODO: In a good period, p should stay connected with q.

WIP Spec: Draft for top-level p2p spec #9732

Are you sure you want to change the base?

WIP Spec: Draft for top-level p2p spec #9732

Conversation

josef-widder commented Nov 21, 2022

PR checklist

cason left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

cason Nov 25, 2022 • edited

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

josef-widder Nov 24, 2022 • edited

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

sergio-mena left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

lasarojc commented Nov 24, 2022

Choose a reason for hiding this comment

cason Nov 25, 2022 •

edited

josef-widder Nov 24, 2022 •

edited