Gnutella2
The Gnutella2 peer-to-peer protocol is a reworking of the Gnutella protocol, written mainly by Michael Stokes. It drops all of the old Gnutella protocol except for the connection handshake and adopts an entirely new and intricate system.
History
In November 2002, Michael Stokes announced the Gnutella2 protocol to the Gnutella Developers Forum (GDF), immediately causing a schism among the developers. Some found the goals behind Gnutella2 impressive and desirable, chiefly making a clean break with the Gnutella 0.6 protocol and starting over so that what had been handled with kludges could be done elegantly. Other developers, primarily those of LimeWire and BearShare, considered it a "cheap publicity stunt" and dismissed any technical merits. Many of that group still refuse to refer to the network as "Gnutella2" and instead call it "Mike's Protocol".
Curiously, the Gnutella2 protocol still uses the old "GNUTELLA CONNECT/0.6" handshake string defined in the Gnutella 0.6 specifications for its connections. The GDF criticized this as an attempt to use the Gnutella network to bootstrap the new, unrelated network, while proponents of the network claimed that the intent was to remain backwards-compatible with Gnutella, so that existing Gnutella clients could add Gnutella2 support at their leisure.
With the developers entrenched in their positions, a flame war soon erupted, centering on lead BearShare developer Vincent Falco, which further cemented both sides' resolve.
The draft specifications were released on March 26, 2003, and more detailed specifications soon followed. Gnutella2 (G2) is not supported by many of the "old" Gnutella network clients; however, many Gnutella2 clients still also connect to Gnutella. Many Gnutella2 proponents claim that this is for political reasons, while Gnutella supporters cite technical reasons for avoiding the new protocol.
How it works
Gnutella2 divides nodes into two groups: leaves and hubs. Leaves maintain one or two connections to hubs, while hubs accept hundreds of leaves and maintain many connections to other hubs. When a search is initiated, the node obtains a list of hubs if needed and contacts the hubs in the list one by one, noting which have been searched, until the list is exhausted or a predefined search limit has been reached. This allows a user to find a popular file easily without placing a heavy load on the network, while theoretically maintaining the ability to find a single file located anywhere on the network.
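The search loop described above can be sketched in a few lines of Python. This is only an illustration, not the behaviour of any particular client: the function name walk_hubs, the callback query_hub, and the default limit of 50 hubs are hypothetical, and the idea that a queried hub also returns further hub addresses is an assumption made for the example.

```python
from collections import deque


def walk_hubs(start_hubs, query_hub, max_hubs=50):
    """Query hubs one at a time until the list runs out or the limit is hit.

    query_hub(hub) is a hypothetical callback that returns a pair
    (hits, more_hubs); more_hubs is an assumption of this sketch.
    """
    pending = deque(start_hubs)   # hubs still to be contacted
    searched = set()              # hubs already contacted
    results = []
    while pending and len(searched) < max_hubs:
        hub = pending.popleft()
        if hub in searched:
            continue              # never query the same hub twice
        searched.add(hub)
        hits, more_hubs = query_hub(hub)
        results.extend(hits)
        pending.extend(h for h in more_hubs if h not in searched)
    return results, searched
```

A popular file is found after only a few hubs have been contacted, at which point a client would normally stop; a rare file simply causes the loop to continue until the limit is reached.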
Hubs index the files a leaf shares by means of a Query Routing Table. The leaf fills the table with single-bit entries at positions given by hashes of its keywords and uploads it to the hub; the hub then combines the tables of its leaves into one version that it sends to its neighboring hubs. This lets hubs greatly reduce bandwidth by not forwarding a query to a leaf or neighboring hub whose routing table contains no entries matching the search.
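A minimal sketch of such a table follows. The table size, the function names, and the use of SHA-1 to pick a bit position are assumptions chosen for illustration; the real protocol defines its own hash function and table size.

```python
import hashlib

TABLE_BITS = 1 << 16  # illustrative size, not taken from the specification


def slot(keyword):
    """Map a keyword to a bit position (SHA-1 is a stand-in hash here)."""
    digest = hashlib.sha1(keyword.lower().encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % TABLE_BITS


def build_table(keywords):
    """Leaf side: set one bit per keyword."""
    table = bytearray(TABLE_BITS // 8)
    for kw in keywords:
        i = slot(kw)
        table[i // 8] |= 1 << (i % 8)
    return table


def merge_tables(tables):
    """Hub side: OR the tables together into a combined version."""
    merged = bytearray(TABLE_BITS // 8)
    for t in tables:
        for i, byte in enumerate(t):
            merged[i] |= byte
    return merged


def might_match(table, query_keywords):
    """Forward a query only if every keyword's bit is set."""
    return all(table[slot(kw) // 8] & (1 << (slot(kw) % 8))
               for kw in query_keywords)
```

Because different keywords can land on the same bit, a set bit only means that a match is possible; a cleared bit means the query can safely be dropped for that leaf or hub.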
Gnutella2 relies extensively on UDP, rather than TCP, for searches. The overhead TCP introduces would make a random walk search system unworkable, though UDP is not without its own drawbacks.
Protocol features
Gnutella2 has an extensible, binary, XML-like packet format, conceived as an answer to Gnutella's many kludges: future network improvements and individual vendor features can be added without fear of causing bugs in other clients on the network. While many developers who arrived after the flame war have said that this feature makes it much easier to write a client for Gnutella2 than for Gnutella, Gnutella developers maintain that the Generic Gnutella Extension Protocol (GGEP) allows for flexible additions to the Gnutella 0.6 protocol.
Gnutella2 employs SHA-1 hashes for file identification, which allows a single file to be reliably downloaded in parallel from multiple sources (swarming), as well as Tiger tree hashes, which allow parts of a file to be reliably uploaded while the file is still being downloaded.
To make searching more robust and complete, Gnutella2 also has a metadata system that lets search results carry richer labelling, rating, and quality information than file names alone could provide. Nodes can even share this information after they have deleted the file, allowing users to mark viruses and worms on the network without having to keep a copy.
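The following sketch shows why a tree of named, length-prefixed packets is easy to extend. It is a simplified stand-in and not the actual Gnutella2 wire format: the byte layout and the packet names QUERY, NAME, and XVENDOR are all hypothetical.

```python
import struct


def encode(name, payload=b"", children=()):
    """Layout (illustrative): name length, name, body length, body.

    The body holds a child count, the already-encoded children, and
    finally the packet's own payload.
    """
    body = bytes([len(children)]) + b"".join(children) + payload
    raw_name = name.encode("ascii")
    return (bytes([len(raw_name)]) + raw_name
            + struct.pack(">I", len(body)) + body)


def decode(data, offset=0):
    """Return ((name, children, payload), offset just past this packet)."""
    name_len = data[offset]
    offset += 1
    name = data[offset:offset + name_len].decode("ascii")
    offset += name_len
    (body_len,) = struct.unpack_from(">I", data, offset)
    offset += 4
    end = offset + body_len
    count = data[offset]
    offset += 1
    children = []
    for _ in range(count):
        child, offset = decode(data, offset)
        children.append(child)
    return (name, children, data[offset:end]), end


def known_children(packet, understood):
    """Keep the children this client understands and skip the rest."""
    _, children, _ = packet
    return {c[0]: c[2] for c in children if c[0] in understood}
```

Since every child carries its own name and length, a client that has never heard of XVENDOR can step over it and still read the NAME child, which is the property the paragraph above attributes to the format.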
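The two kinds of hash can be illustrated as follows. Python's standard library has no Tiger implementation, so this sketch uses SHA-1 as a stand-in for the tree as well; the block size, the leaf and node prefixes, and all function names are assumptions made for the example.

```python
import hashlib

BLOCK = 1024  # illustrative block size


def file_id(data):
    """Whole-file identifier: peers with the same SHA-1 hold the same file."""
    return hashlib.sha1(data).hexdigest()


def tree_levels(data, block=BLOCK):
    """Build a hash tree bottom-up (SHA-1 stands in for Tiger here)."""
    blocks = [data[i:i + block] for i in range(0, len(data), block)] or [b""]
    level = [hashlib.sha1(b"\x00" + b).digest() for b in blocks]
    levels = [level]
    while len(level) > 1:
        nxt = []
        for i in range(0, len(level), 2):
            if i + 1 < len(level):
                pair = level[i] + level[i + 1]
                nxt.append(hashlib.sha1(b"\x01" + pair).digest())
            else:
                nxt.append(level[i])  # odd node is promoted unchanged
        level = nxt
        levels.append(level)
    return levels


def block_is_valid(block_data, index, levels):
    """Check one received block against its leaf hash."""
    return hashlib.sha1(b"\x00" + block_data).digest() == levels[0][index]
```

The whole-file hash lets a downloader confirm that several sources offer the same file, while the per-block hashes let it verify each part on arrival and therefore pass that part on before the rest of the file is complete.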
Gnutella2 also utilizes compression in its network connections to reduce the bandwidth used by the network.
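As a rough illustration of per-connection compression, the sketch below keeps one compression stream open for the lifetime of a link. The use of zlib and the class name CompressedLink are assumptions for the example; the text above does not say which algorithm is used.

```python
import zlib


class CompressedLink:
    """One long-lived compression stream per direction of a connection."""

    def __init__(self):
        self._out = zlib.compressobj()
        self._in = zlib.decompressobj()

    def pack(self, message):
        # A sync flush pushes the whole message onto the wire at once
        # while keeping the stream's history for later messages.
        return self._out.compress(message) + self._out.flush(zlib.Z_SYNC_FLUSH)

    def unpack(self, wire_bytes):
        return self._in.decompress(wire_bytes)
```

Keeping the stream open means later messages can be compressed against earlier ones, which helps on links that carry many similar packets.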
Shareaza can additionally request previews of images and videos, though currently no other clients take advantage of this.
Clients
Some current Gnutella2 clients are:
- Shareaza (Windows), Open source C++ under the GPL.
- Morpheus (Windows), Closed source.
- Gnucleus (Windows), Open source core in C/C++ under the LGPL.
- Adagio (Cross Platform), Open source Ada under the GPL.
- FileScope (Cross Platform), Open source C# under the GPL.
- Caribou (Cross Platform), Open source C++ under the LGPL.
- MLDonkey (Cross Platform), Open source OCaml under the GPL.
- TrustyFiles (Windows), Closed source.
External links
- Gnutella2 Wiki (http://www.gnutella2.com/)
- Gnutella2 Mailing-List (http://www.gnutella2.com/mailman/listinfo/g2-dev_gnutella2.com)
- Gnutella2 Protocol Specification (http://www.gnutella2.com/index.php/Main_Page#The_Protocol)
- Gnutella2 crawler (http://crawler.instantnetworks.net/)