This repository has been archived by the owner on Feb 18, 2021. It is now read-only.

research using telehash over the bittorrent DHT #106

Open
quartzjer opened this issue Nov 23, 2014 · 12 comments

@quartzjer
Member

Since v3 doesn't contain a DHT by default, show how to use it over bittorrent's. See this thread.

@RainaBatwing

My tweets were about using BitTorrent trackers, not their DHT system. I'm not sure what use bt dht would be over the dotPublic telehash dht, so I'll presume we're not really talking about that.

For smaller meshes, use the BitTorrent Tracker infrastructure (not DHT) to find peers.

A torrent tracker service is a centralised server you can use to ask for peers. Most public trackers now operate over a very simple udp protocol which works like this:

  1. send a single announce udp packet including an info_hash (group, like private mesh id), your desired port, and a unique identifier for this node, which could be derived from node's hashname. The unique ID is useful for deduplication when you move to different internet addresses and reannounce.
  2. tracker replies directly with a single udp packet, which includes up to about 50 random ip:port combinations that have been announced to that info_hash in the past hour or so.
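
The exchange above can be sketched with Python's `struct` module. This is a minimal sketch following the field layout of the UDP tracker protocol as specified in BEP 15, not code from this thread. Note that BEP 15 places a short connect request/response in front of the announce to obtain a `connection_id`, so in practice it is two round trips rather than one. The helper names, and the idea of taking SHA-1 of the hashname as the 20-byte unique ID, are assumptions for illustration.

```python
import hashlib
import socket
import struct

PROTOCOL_ID = 0x41727101980  # magic constant defined by BEP 15
ACTION_CONNECT = 0
ACTION_ANNOUNCE = 1


def build_connect_request(transaction_id: int) -> bytes:
    """16-byte connect request; the reply carries a connection_id."""
    return struct.pack(">QII", PROTOCOL_ID, ACTION_CONNECT, transaction_id)


def parse_connect_response(data: bytes, transaction_id: int) -> int:
    action, tid, connection_id = struct.unpack(">IIQ", data[:16])
    if action != ACTION_CONNECT or tid != transaction_id:
        raise ValueError("unexpected connect response")
    return connection_id


def peer_id_from_hashname(hashname: str) -> bytes:
    """Assumption: derive the 20-byte unique ID as SHA-1 of the hashname."""
    return hashlib.sha1(hashname.encode("utf-8")).digest()


def build_announce_request(connection_id: int, transaction_id: int,
                           info_hash: bytes, peer_id: bytes,
                           port: int, num_want: int = 50) -> bytes:
    """98-byte announce: info_hash names the group, port is where we listen."""
    if len(info_hash) != 20 or len(peer_id) != 20:
        raise ValueError("info_hash and peer_id must be 20 bytes each")
    return struct.pack(
        ">QII20s20sQQQIIIiH",
        connection_id, ACTION_ANNOUNCE, transaction_id,
        info_hash, peer_id,
        0, 0, 0,      # downloaded, left, uploaded: unused for a mesh
        0,            # event: 0 = none
        0,            # IP: 0 = use the packet's source address
        0,            # key: left at 0 in this sketch
        num_want, port,
    )


def parse_announce_response(data: bytes, transaction_id: int):
    """Return (reannounce interval in seconds, list of (ip, port) peers)."""
    if len(data) < 20:
        raise ValueError("announce response too short")
    action, tid, interval, _leechers, _seeders = struct.unpack(">IIIII", data[:20])
    if action != ACTION_ANNOUNCE or tid != transaction_id:
        raise ValueError("unexpected announce response")
    peers = []
    for offset in range(20, len(data) - 5, 6):  # 4-byte IPv4 + 2-byte port
        ip_bytes, port = struct.unpack(">4sH", data[offset:offset + 6])
        peers.append((socket.inet_ntoa(ip_bytes), port))
    return interval, peers
```

Sending these over a `socket.SOCK_DGRAM` socket with a timeout and retry is left out; the sketch only covers packet construction and parsing.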

Public tracker services have proven to endure quite well in the face of strong opposition and shoestring budgets. There are many public options available that continue to operate at incredible scale. Some commonly used ones are:

  • udp://tracker.openbittorrent.com:80
  • udp://tracker.publicbt.com:80
  • udp://tracker.ccc.de:80
  • udp://tracker.istole.it:80

There are a few properties of torrent trackers which are really cool:

  1. They're totally free to use, like DNS, and can definitely handle any scale of users you can throw at them
  2. You can form a private mesh as easily as changing the info_hash value you send in your announce packet
  3. They are very simple to implement, and should work well in embedded systems.
  4. Coders, and especially kids, aren't burdened by the need to operate a reliable seed computer to introduce new peers to the network, or the implementation of a complex invitation system where existing peers introduce new ones to a private mesh.

An example use case: A simple chat room system similar to Cryptocat could be established with an app which broadcasts everyone's chat messages in a Gossiping way. When a user opens the app, it asks for their name and a chatroom name. The chatroom name is hashed together with some constant salt like the app's name, and provided as an info_hash to tracker.ccc.de. tracker.ccc.de returns up to ~50 potential peers who have recently been in that chatroom. The user attempts connection to each of them until they're introduced to the private mesh, trying to maintain 10 or so connections. Whenever a new message is posted, each peer would rebroadcast to a few of the people they're connected to, honouring a relatively short TTL, similar to Gnutella 1's original simple broadcasting search system.
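
A rough sketch of the two pieces this use case relies on: turning a chatroom name into a 20-byte info_hash, and the rebroadcast rule with a short TTL. Everything here is illustrative; `APP_SALT`, `room_info_hash`, `GossipNode`, and the fanout of 3 are hypothetical names and values, and SHA-1 is chosen only because it yields exactly the 20 bytes a tracker expects.

```python
import hashlib
import random

APP_SALT = b"example-chat-app"  # hypothetical constant salt, e.g. the app's name


def room_info_hash(room_name: str, salt: bytes = APP_SALT) -> bytes:
    """Hash salt + room name down to the 20 bytes used as info_hash."""
    return hashlib.sha1(salt + b"\x00" + room_name.encode("utf-8")).digest()


class GossipNode:
    """Forwards each message once to a few neighbours until its TTL runs out."""

    def __init__(self, fanout: int = 3):
        self.fanout = fanout
        self.seen = set()  # message ids already handled, for deduplication

    def handle(self, msg_id: str, ttl: int, neighbours: list) -> list:
        """Return (neighbour, remaining_ttl) pairs to rebroadcast to."""
        if msg_id in self.seen:
            return []          # duplicate: already rebroadcast
        self.seen.add(msg_id)
        if ttl <= 0:
            return []          # TTL exhausted: deliver locally only
        targets = random.sample(neighbours, min(self.fanout, len(neighbours)))
        return [(peer, ttl - 1) for peer in targets]
```

A peer would pass `room_info_hash("my room")` as the info_hash in its announce, then call `handle` for every message it receives from a connected peer.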

Not burdening developers with the need to have a seed server for their private mesh has great potential for social change. When children learn to code they're often keen to create networked toy apps, but soon find out the difficulties of getting access to a server without paying for it, and even once they have access, free servers usually only run awful systems like PHP. Instead of refining their app skills, these coders are forced to learn more programming languages and the ways of Unix and terminals. Once they have endured this long enough, servers become easier, and in adulthood are easily paid for. By then it's too late - centralisation is the lazy option. It doesn't have to be this way.

If we have good p2p libraries that work effortlessly to create private meshes, p2p network graphs can be very simple and data transmission can be very unoptimised. Nodes never have to deal with other software getting in their way. Kids could import telehash into whatever they are learning to use and instantly have some simple reliable networking capabilities. Learning structures like gossip protocols would be fun and easily recognisable from real life. A new generation of coders could learn p2p from the very beginning and always put off learning servers as something complicated, annoying, expensive, prone to failure at scale, and ultimately unnecessary.

If we really want social change, I think this is the story we should focus on. Kids are the future. And they're already in a bad situation that can be relieved by p2p technologies.

@Kagami
Contributor

Kagami commented Nov 24, 2014

Not burdening developers with the need to have a seed server for their private mesh has great potential for social change.

There is no support for UDP sockets in browsers now (only in Chrome Apps/Firefox add-ons), so some sort of relay/proxy is still needed (e.g. one that rewraps WebSocket traffic into UDP).

@fd
Contributor

fd commented Nov 24, 2014

@Kagami @RainaBatwing Also note that dotPublic solves this problem by routing traffic through routers/bridges (which are auto-discovered by traversing the DHT). This also solves most (if not all) NAT traversal problems.

@quartzjer
Member Author

While it would be fun to re-use the tracker UDP infrastructure in a more central way, it poses the same problem as re-using public STUN/TURN servers, that the data and interactions are all un-encrypted and very easily monitored and recorded. A first-principle of telehash is to minimize any metadata leakage, and I don't think there's a use of trackers that wouldn't at a minimum reveal that two parties are related/connecting to each other.

That just means I don't want to integrate it as a default option; I'm still interested in exploring it as an alternative and for apps that are already bittorrent based. What I intended with this issue is: how can we demonstrate using a telehash mesh overlaid on the bittorrent stack to bring additional security and encryption options? Kind of like using OTR over IM :)

@RainaBatwing have you seen anyone use the udp tracker queries to exchange public keys as well as ip:port? Are there ways to extend it to bundle additional info but remain compatible with the existing protocol(s)?

@RainaBatwing

Looking at the tracker specs, UDP trackers, like most of the public trackers available today, do not share any information about your endpoint with other peers when they announce, except your IP and port. Old-style HTTP trackers do: peers can publish 20 octets of "peer_id" data, which is transmitted to other users along with IP and port when they announce. I think the peer_id can be any binary data, but 20 bytes may not be enough to be useful for straightforward crypto: even curve25519's relatively small public keys are 32 bytes long. Maybe the whole idea is bad. Oh well. I'll try not to bother you guys with ideas as impractical as this in the future. Maybe get some working prototypes going first!

@quartzjer I haven't found anyone "misusing" trackers for fun stuff yet.
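
To make the size constraint concrete: a 32-byte curve25519 public key does not fit in the 20-byte peer_id, but a 20-byte digest of it does. The sketch below is only an illustration of that arithmetic, not something proposed in this thread. It assumes the peer_id would act as a fingerprint that is checked once the full key arrives over the connection itself, and `peer_id_for_key` and `key_matches_peer_id` are hypothetical names.

```python
import hashlib
import hmac

PEER_ID_LEN = 20  # octets available in the tracker's peer_id field


def peer_id_for_key(public_key: bytes) -> bytes:
    """Truncate SHA-256 of the key to 20 bytes (a 160-bit fingerprint)."""
    return hashlib.sha256(public_key).digest()[:PEER_ID_LEN]


def key_matches_peer_id(public_key: bytes, peer_id: bytes) -> bool:
    """Check a key received later against the fingerprint seen at the tracker."""
    return hmac.compare_digest(peer_id_for_key(public_key), peer_id)
```

Since the tracker itself is unauthenticated, a fingerprint like this would only help a peer recognise a key it already expects; it does not remove the metadata leakage discussed above.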

@ariddell

@quartzjer Is there another issue discussing why DHT isn't on by default (rather than off by default?)

@quartzjer
Member Author

@ariddell the goals of v3 have evolved to just implement the minimum p2p crypto decoupled from any DHT, so there isn't even one bundled to enable. My intention has been to move all the DHT work I did into the https://github.com/quartzjer/dotPublic repo/project, but I've been pretty focused on getting v3 solid first, and then show how to easily couple it with that project and others (like bittorrent's DHT).

@quartzjer
Member Author

Adding a link to @RainaBatwing twitter thread about uDT usage here too: https://twitter.com/jeremie/status/565509635822473217

I'm going to look more into how to map that into a standard path/transport for any node to offer and upgrade to.

@RainaBatwing

uTP, not uDT. uTP is a network protocol similar to TCP, with long-lived connections and congestion control. uTP implements LEDBAT (IETF RFC 6817) instead of copying TCP Reno congestion control. In the worst case LEDBAT performs the same as TCP, so it will not outcompete it. In normal operation, LEDBAT measures outgoing delay (latency from this peer to a specific remote peer) and adjusts upstream speed to transmit as fast as possible without increasing outgoing delay by more than 100ms.
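
The core of that behaviour is the window update from RFC 6817, sketched below in simplified form. This is a sketch under assumptions, not uTP's actual code: the `MSS` value is assumed, and the base-delay history, delay filtering, and flight-size cap that the RFC also specifies are left out.

```python
TARGET = 0.100      # seconds of extra queuing delay LEDBAT aims not to exceed
GAIN = 1.0          # how quickly the window reacts (RFC 6817 requires <= 1)
MSS = 1452          # assumed sender maximum segment size, in bytes
MIN_CWND = 2 * MSS  # the window is never reduced below two segments


def ledbat_on_ack(cwnd: float, bytes_acked: int,
                  current_delay: float, base_delay: float) -> float:
    """Return the new congestion window (bytes) after one acknowledgement.

    base_delay is the lowest one-way delay seen so far (an empty queue);
    current_delay is the latest one-way delay measurement.
    """
    queuing_delay = current_delay - base_delay
    # +1 when the queue is empty, 0 at the target, negative above it.
    off_target = (TARGET - queuing_delay) / TARGET
    cwnd += GAIN * off_target * bytes_acked * MSS / cwnd
    return max(cwnd, MIN_CWND)
```

When the measured delay sits below the target the window grows, and once queuing delay passes 100ms the window shrinks in proportion to the overshoot, which is what keeps the modem's buffer from filling.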

Why is this super important? Telehash is aiming to run on home internet connections, like HFC and ADSL. These modems normally have many seconds' worth of outgoing buffer, and Reno will fill that buffer. Whenever a p2p app is transmitting at full speed, like sharing an image file, audio message, profile, or anything else large enough to take more than a second to send, any other network application trying to send a message like an HTTP GET will have to wait several seconds before its packets leave the house. When the modem's outgoing buffer becomes saturated, other people on the network, like family members or coworkers, will try to figure out who is to blame, and start demanding the user stop using P2P systems.

It is the responsibility of a good p2p network layer to ensure the default operation does not disrupt interactive traffic like web browsing! If telehash is disruptive, it not only does itself and its applications harm, but could perpetuate a negative feeling towards p2p applications in general, if it gets any widespread usage.

I don't think uTP is a good protocol to run telehash on, because it is connection-based and does not transparently move between network addresses when peers roam. Telehash should implement LEDBAT internally and not compromise on those features.

@quartzjer
Member Author

I definitely need to make an architecture diagram of how v3 interacts with transports to help explain things more simply; apologies for any confusion.

If you consider telehash to be like PGP/GPG, where it is primarily just providing encrypted blobs that something else has to send/receive, maybe it makes more sense how it just uses any transport(s) available? It can actually use uTP really well even though it is connection oriented and non-roaming: since the state of the telehash link/mesh is managed internally, separate from the transports, the transport layer can seamlessly re-establish and negotiate different (multiple) uTP (or other) connections.
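
A toy sketch of that separation, to show the shape of the idea rather than telehash's actual API: the link is keyed by the peer's hashname and outlives any individual transport path. The `Link` class, its method names, and the path labels are all hypothetical.

```python
from typing import Callable, Dict

SendFn = Callable[[bytes], bool]  # returns False when the path has failed


class Link:
    """Link state for one peer, independent of how packets are carried."""

    def __init__(self, hashname: str):
        self.hashname = hashname
        self.paths: Dict[str, SendFn] = {}  # e.g. "utp-1", "udp"
        self.sent = 0                       # link-level state survives path churn

    def add_path(self, name: str, send: SendFn) -> None:
        self.paths[name] = send

    def drop_path(self, name: str) -> None:
        self.paths.pop(name, None)

    def send(self, encrypted_packet: bytes) -> bool:
        """Try each known path in turn; dead paths are dropped, the link stays."""
        for name, send in list(self.paths.items()):
            if send(encrypted_packet):
                self.sent += 1
                return True
            self.drop_path(name)
        return False
```

If a uTP connection dies because the peer roamed, only that path is removed; a replacement connection can be registered with `add_path` later without the link being renegotiated.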

Once I get a chance to look into it more I'll find a usable JS and C implementation of uTP and show how this can work in practice :)

@RainaBatwing

Okay! Thank you Jeremie!!

@ariddell

ariddell commented Aug 5, 2015
