criteria for perfect p2p fileshare:
- encrypted communication between nodes
- communication over ‘approved’ channels, web/pop3/nntp/ssl
- decentralized
- not slow like gnutella
decentralized version of bittorrent (should satisfy most of the criteria):
network tracker info
gnutella style network of connected clients, each with the ability to serve as tracker (or supernode) for any file hashes it has knowledge of. generally a client will have decent knowledge of any hashes it is currently downloading/uploading. some sort of election is necessary to vote in additional supernodes when nodes log off or become overloaded. node rotation is also needed to help distribute the tracking load, and to distribute the bandwidth load.
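a rough sketch of what that election/rotation logic might look like. everything here is hypothetical (class name, load limits, the "too few or all overloaded" rule are my own stand-ins, not a worked-out protocol):

```python
# hypothetical sketch: a node volunteers as supernode for a hash when the
# current trackers for that hash look overloaded or have mostly dropped off
class Node:
    def __init__(self, max_load=50):
        self.max_load = max_load          # how many peers this node will track
        self.tracking = {}                # hash -> set of (ip, port) peers

    def load(self):
        return sum(len(peers) for peers in self.tracking.values())

    def should_volunteer(self, info_hash, known_trackers):
        # elect a new supernode if too few trackers remain for this hash,
        # or if every remaining one is at its load limit
        if len(known_trackers) < 2:
            return True
        return all(t.load() >= t.max_load for t in known_trackers)

    def rotate_out(self, info_hash, successor):
        # hand the peer list for one hash to a fresh node,
        # spreading the tracking load around
        successor.tracking[info_hash] = self.tracking.pop(info_hash)
```

the real election would need some agreement between nodes about who volunteers; this only shows the local decision and the handoff.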
search mechanism
no actual client to client searching. when a client joins the network it uses some sort of ‘discover’ method to find supernodes, and to find out limited details about every hash they have knowledge of: the filename, size, hash, number of seeds/downloads, and some sort of descriptive text. this information then builds a ‘page’ that looks very much like bytemonsoon or suprnova; a simple list of known hashes. as the client spends time on the network it will come into contact with more supernodes, and eventually learn everything they know as well. the user can then search their local list of hashes and organize them by type (movie, music, game, application, misc), by number of seeds/downloads, etc.
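the local list could be as simple as a list of records with a search/sort over it. a minimal sketch (the record fields and sample entries are made up for illustration):

```python
# hypothetical local index of hashes learned from supernodes
records = [
    {"name": "some.movie.avi", "size": 700_000_000, "hash": "ab12",
     "seeds": 14, "type": "movie", "desc": "example entry"},
    {"name": "an.album.zip", "size": 80_000_000, "hash": "cd34",
     "seeds": 3, "type": "music", "desc": "another example"},
]

def search(records, text=None, type_=None, sort_by="seeds"):
    # filter by substring and/or category, then sort (seeds first, like
    # the bytemonsoon/suprnova style listing pages)
    hits = [r for r in records
            if (text is None or text.lower() in r["name"].lower())
            and (type_ is None or r["type"] == type_)]
    return sorted(hits, key=lambda r: r[sort_by], reverse=True)
```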
network protocol
all communication between client and supernode is done via http requests, on a client-specified port. you could run your client on a port commonly used for other types of traffic to help disguise it. communication between clients is limited to asking for knowledge of supernodes, and actual data transfer. transfers and communication between client and supernode will be backwards compatible with existing bittorrent (to a certain extent).
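for the bittorrent-compatible part, the client-to-supernode request would look like a standard tracker announce: an http get with the info_hash, peer id, port, and transfer stats as query parameters. a sketch of building that url (the tracker address and peer id here are placeholders):

```python
from urllib.parse import urlencode, quote_from_bytes

def announce_url(tracker, info_hash, peer_id, port,
                 uploaded=0, downloaded=0, left=0):
    # standard bittorrent announce parameters; info_hash is the raw
    # 20-byte sha1 and must be percent-encoded byte by byte
    params = urlencode({
        "peer_id": peer_id, "port": port, "uploaded": uploaded,
        "downloaded": downloaded, "left": left,
    })
    return "%s?info_hash=%s&%s" % (tracker, quote_from_bytes(info_hash), params)
```

a supernode answering these requests the same way a regular tracker does is what makes the backwards compatibility possible.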
initial network discovery
this is something i’m still unsure how to implement. it needs to be decentralized, otherwise the whole system is useless. here are my ideas so far:
-
gnutella style ‘host catcher’. the client keeps track of every ip it sees on the network and then goes through the list until it finds an active client, which it then asks for all the supernodes it knows of. this is very messy, but very decentralized. the idea being that hopefully people will publish current lists of ips in various locations around the web so you can easily connect to the network for the first time.
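the host catcher loop itself is simple. a sketch, where `probe` stands in for actually connecting to an ip and speaking the discovery protocol (the function name and shape are mine, not part of any real client):

```python
def find_supernodes(host_cache, probe):
    # walk the cached ip list until one answers; probe(ip) returns that
    # node's supernode list on success, or raises OSError if it's dead
    for ip in host_cache:
        try:
            return ip, probe(ip)
        except OSError:
            continue   # stale entry, try the next one
    return None, []    # cache exhausted, need a fresh ip list from somewhere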
-
nifty usage of the dns system. a client would do an nslookup for a specified hostname, which would return a list of (fairly) current supernode IPs. the TTL for the domain name could be set reasonably low so as always to have fresh info. there would need to be some sort of mechanism for exporting a list of IPs to a zone file at a fairly regular interval, and people would have to be constantly reloading nameservers, which would be a pain in the butt. this still leaves a central location to be shut down, but with enough authoritative nameservers this shouldn’t be too much of a problem. alternately, clients could return ip addresses in a dns compatible way, allowing people who run their client all the time to volunteer as authoritative servers for the domain. i like this idea a lot. provided there are enough clients that stay connected fairly regularly (i’m not sure what the limit on authoritative dns servers is) providing a pretty large list of supernodes, it should be possible to reliably connect to the network at all times.
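the client side of the dns idea is just one lookup that collects every A record published for the name. a sketch (the hostname would be whatever domain the network agrees on; none exists here):

```python
import socket

def supernodes_from_dns(hostname):
    # gethostbyname_ex returns (canonical name, aliases, list of ips);
    # with a low TTL on the zone, these stay reasonably fresh
    try:
        _, _, ips = socket.gethostbyname_ex(hostname)
        return ips
    except socket.gaierror:
        return []   # name doesn't resolve; fall back to the host catcher
```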
supernodes
upon initial connection to the network a client will learn the knowledge of several supernodes. because a client needs a good amount of data before the user will be able to find things to download, it seems logical to let it serve as supernode for a while. i don’t know whether it’d be more efficient to give the new client a list of supernodes and have it ask each of them for all the info they can give, or to tell all the supernodes that the new client needs info, and have them give the info to the client at their leisure. the user needs to be able to specify the amount of bandwidth to devote to tracking, and perhaps the maximum number of hashes and/or ips it can keep track of. the default should be fairly conservative. when a supernode starts getting overloaded, it needs to be able to somehow cause the creation of a new supernode, and to pass off some of the hashes it is tracking to the new supernode. supernodes will also maintain a list of other known supernodes, whether they be nodes it passed info off to, or nodes that passed info off to it. when a user initiates a download, the client contacts the last known supernode that was tracking that particular file/hash. that supernode then either gives the client the ip/port info of people who have the file, or gives the address of another supernode to ask.
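the handoff-and-redirect behavior described above could be sketched like this. everything is hypothetical (the "move half the hashes" policy and the limits are placeholders for whatever tuning the real thing would need):

```python
class Supernode:
    def __init__(self, max_hashes=100):
        self.max_hashes = max_hashes
        self.peers = {}         # hash -> list of (ip, port) who have the file
        self.neighbours = {}    # hash -> supernode we handed that hash off to

    def overloaded(self):
        return len(self.peers) >= self.max_hashes

    def hand_off(self, other):
        # push half our tracked hashes to a new supernode, remembering
        # where each one went so we can redirect later requests
        for h in list(self.peers)[: len(self.peers) // 2]:
            other.peers[h] = self.peers.pop(h)
            self.neighbours[h] = other

    def lookup(self, info_hash):
        # either answer with peers, or point at the node we passed it to
        if info_hash in self.peers:
            return ("peers", self.peers[info_hash])
        if info_hash in self.neighbours:
            return ("redirect", self.neighbours[info_hash])
        return ("unknown", None)
```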
last modified: 2003-08-26 01:52:54 -0400