How BitTorrent works
BitTorrent greatly reduces the load on seeders, because clients generally download the file from each other. The coloured bars beneath all of the clients represent individual pieces of the file. After the initial pieces transfer from the seed, the pieces are individually transferred from client to client. This demonstrates how the original seeder only needs to send out one copy of the file for all the clients to receive a copy.
The BitTorrent protocol breaks the file(s) down into smaller fragments, typically a quarter of a megabyte (256 KB) in size. Peers download missing fragments from each other and upload those that they already have to peers that request them. The protocol is 'smart' enough to choose the peer with the best network connection for the fragments that it is requesting. To increase the overall efficiency of the swarm (the ad-hoc P2P network temporarily created to distribute a particular file), BitTorrent clients request from their peers the rarest fragments first, that is, the fragments available on the fewest peers. This keeps most fragments widely available across many machines and avoids bottlenecks. The fragments are not usually downloaded in sequential order and need to be reassembled by the receiving machine.
Clients start uploading fragments to their peers before the entire file is downloaded: each peer begins sharing as soon as it has its first complete fragment and another peer requests it. This scheme is particularly useful for trading large files such as videos and operating systems. In conventional file serving, by contrast, high demand can saturate the host's resources as the bandwidth consumed in transferring the file to many downloaders surges. With BitTorrent, high demand can actually increase throughput, as more bandwidth and additional "seeds" of the file become available to the group. Bram Cohen, the protocol's creator, claims that for very popular files, BitTorrent can support about a thousand times as many downloads as HTTP.
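The rarest-first selection described in this section could be sketched roughly as follows. This is an illustration only, not the code of any actual client: the function name `rarest_first` and the dictionary of peer piece sets are assumptions made for the example, and ties are broken by lowest index purely to keep the result deterministic.

```python
from collections import Counter

def rarest_first(have, peer_bitfields):
    """Return the index of the rarest piece we still need, or None.

    have           -- set of piece indices this client already holds
    peer_bitfields -- dict mapping a peer name to the set of piece
                      indices that peer advertises (hypothetical shape)
    """
    # Count how many peers hold each piece.
    counts = Counter()
    for pieces in peer_bitfields.values():
        counts.update(pieces)

    # Only pieces we lack and that at least one peer can supply.
    candidates = [p for p in counts if p not in have]
    if not candidates:
        return None

    # Fewest holders first; ties broken by lowest index (an assumption).
    return min(candidates, key=lambda p: (counts[p], p))


peers = {
    "peer_a": {0, 1, 2},
    "peer_b": {0, 1},
    "peer_c": {0, 3},
}
print(rarest_first(have={0}, peer_bitfields=peers))
```

In this example, pieces 2 and 3 are each held by a single peer, so one of them is requested before piece 1, which two peers hold.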
Sharing files
To share a file using BitTorrent, a user creates a .torrent file, a small "pointer" file that contains:
* the filename, size, and the checksum (hash) of each block in the file (which allows users to make sure they are downloading the real thing)
* the address of a "tracker" server (which is discussed below)
* and some other data (like client instructions).
The torrent file is then distributed to users, often via email or by placing it on a website. The BitTorrent client is started as a "seed node", allowing other users to connect and begin downloading. When other users finish downloading the entire file, they can optionally "reseed" it, becoming an additional source for the file. One consequence of this approach is that if all seeds are taken offline, the file may no longer be available for download, even to a client that has a copy of the torrent file. However, everyone can still eventually get the complete file, even with no seeds, as long as the remaining peers together hold at least one distributed copy of it.
Downloading with BitTorrent is straightforward. Each person who wants to download the file first downloads the torrent and opens it in the BitTorrent client software. The torrent file tells the client the address of the tracker, which, in turn, maintains a log of which users are downloading the file and where the file and its fragments reside. For each available source, the client considers which blocks of the file are available and then requests the rarest block it does not yet have. This makes it more likely that peers will have blocks to exchange. As soon as the client finishes receiving a block, it hashes it to make sure that the block matches what the torrent file said it should be. Then it begins looking for someone to upload the block to.
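The per-block hash check can be illustrated with a short sketch. It assumes SHA-1 digests (the hash used by the original BitTorrent protocol) and the 256 KB fragment size mentioned earlier; the helper names `piece_hashes` and `verify_piece` are invented for this example and do not come from any client.

```python
import hashlib

PIECE_SIZE = 256 * 1024  # 256 KB, the typical fragment size

def piece_hashes(data, piece_size=PIECE_SIZE):
    """Split data into fixed-size pieces and return one SHA-1 digest per piece."""
    return [
        hashlib.sha1(data[i:i + piece_size]).digest()
        for i in range(0, len(data), piece_size)
    ]

def verify_piece(index, piece, expected_hashes):
    """True if the downloaded piece matches the digest recorded for its index."""
    return hashlib.sha1(piece).digest() == expected_hashes[index]


content = b"x" * (PIECE_SIZE + 100)   # stand-in for a shared file
hashes = piece_hashes(content)        # what the publisher would record
good = content[PIECE_SIZE:]           # the second (short) piece
print(len(hashes), verify_piece(1, good, hashes), verify_piece(1, b"junk", hashes))
```

A piece that fails the check would simply be discarded and requested again, so a corrupted or forged block never becomes part of the reassembled file.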
BitTorrent gives the best download performance to the people who upload the most, a property known as "leech resistance", since it discourages "leechers" from trying to download the file without uploading it to anyone. Confusingly, when "leecher" is used in opposition to "seed" or "seeder", as in "S/L ratio" (meaning "seed/leech ratio"), it only means someone who has not yet downloaded the full file.
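One simple way to picture leech resistance is a client that reserves its upload slots for the peers that have sent it the most data. The sketch below is a deliberate simplification under that assumption; real clients use more elaborate rules, and the function name, the bookkeeping dictionary and the default of four slots are all hypothetical.

```python
def choose_unchoked(bytes_received, max_uploads=4):
    """Pick the peers to upload to: those that sent us the most data.

    bytes_received -- dict mapping peer name to bytes recently received
                      from that peer (hypothetical bookkeeping)
    max_uploads    -- number of upload slots; 4 is an arbitrary default
    """
    # Highest contribution first; names break ties so the result is stable.
    ranked = sorted(bytes_received, key=lambda p: (-bytes_received[p], p))
    return set(ranked[:max_uploads])


received = {"alice": 900, "bob": 0, "carol": 450, "dave": 700}
print(sorted(choose_unchoked(received, max_uploads=2)))
```

Under this rule a peer such as "bob", who uploads nothing, is the last to be served, which is the incentive the paragraph above describes.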
Though BitTorrent is a good protocol for broadband users, it is less effective over dial-up connections, where disconnections are common. On the other hand, many HTTP servers drop connections over several hours, while many torrents exist long enough to complete a multi-day download (often required for large files).
Terminology
availability
(also distributed copies) The number of full copies of the file available to the client. Each seed adds 1.0 to this number, as it has one complete copy of the file. A connected peer with a fraction of the file available adds that fraction to the availability (i.e. a peer with 65.3% of the file downloaded increases the availability by 0.653).
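Following the definition given here, availability is just the sum of the fractions held by the connected peers. A minimal sketch, with a hypothetical function name and made-up peer fractions:

```python
def availability(peer_fractions):
    """Sum the fraction of the file held by each connected peer.

    peer_fractions -- iterable of floats in [0, 1]; a seed counts as 1.0
    """
    return sum(peer_fractions)


# Two seeds plus peers holding 65.3% and 25% of the file.
print(round(availability([1.0, 1.0, 0.653, 0.25]), 3))
```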
choked
Describes a peer to whom the client does not wish to upload. An uploading client 'chokes' another client in several situations:
* The second client is a seed, in which case it does not want any pieces (i.e. it is completely uninterested)
* The uploading client is already uploading at its full capacity (i.e. the value for max_uploads has been reached)
interested
Describes a downloader who wishes to obtain pieces of a file the client has. For example, the uploading client would flag a downloading client as 'interested' if the downloading client lacked a piece that the uploading client had, and wished to obtain it.
leech
A leech is usually a peer who has a negative effect on the swarm by having a very poor share ratio - in other words, downloading much more than they upload. Most leeches are users on asymmetric internet connections who do not leave their BitTorrent client open to seed the file after their download has completed. However, some leeches intentionally hurt the swarm to avoid uploading by using modified clients or excessively limiting their upload speed.
The term leech is also incorrectly used to refer to what should properly be called a peer, a member of the swarm who has not yet downloaded the complete file.
peer
A peer is one instance of a BitTorrent client running on a computer on the Internet, to which other clients connect and transfer data. Usually a peer does not have the complete file, but only parts of it. However, 'peer' can also refer to any participant in the swarm (in this sense, also known as a 'client').
scrape
A scrape is a request that a client sends to the tracker for statistics about the torrent, such as who to share the file with and how well those other users are sharing.
seed
A seed is a peer that has a complete copy of the torrent and still offers it for upload. The more seeds there are, the better the chances are for completion of the file.
snubbed
An uploading client is flagged as snubbed if the downloading client has not received any data from it in over 60 seconds.
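The snubbed flag amounts to a timeout check, which could look roughly like this. The 60-second threshold is taken from the definition above; the function and parameter names are assumptions for illustration.

```python
import time

SNUB_TIMEOUT = 60.0  # seconds, the threshold given in the definition above

def is_snubbed(last_data_time, now=None):
    """True if no data has arrived from the peer for more than SNUB_TIMEOUT.

    last_data_time -- time.monotonic() value when data last arrived
    now            -- current time; defaults to time.monotonic()
    """
    if now is None:
        now = time.monotonic()
    return now - last_data_time > SNUB_TIMEOUT


print(is_snubbed(last_data_time=100.0, now=161.0))  # 61 s of silence
print(is_snubbed(last_data_time=100.0, now=130.0))  # 30 s of silence
```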
superseed
When a file is new, much time can be wasted because the seeding client might send the same file piece to many different peers, while other pieces have not yet been downloaded at all. Some clients, like ABC and BitTornado, have a "superseed" mode, where they try to only send out pieces which have never been sent out before, making the initial propagation of the file much faster. This is generally used only for a new torrent, or one which must be re-seeded because no other seeds are available.
swarm
Together, all users sharing a torrent are called a swarm. Six peers and two seeds make a swarm of eight.
torrent
A torrent can mean either a .torrent metadata file or all files described by it, depending on context. The torrent file contains metadata about all the files it makes downloadable, including their names and sizes and checksums of all pieces in the torrent. It also contains the address of a tracker that coordinates communication between the peers in the swarm.
tracker
A tracker is a server that keeps track of which seeds and peers are in the swarm. Clients report information to the tracker periodically and in exchange receive information about other clients that they can connect to. The tracker is not directly involved in the data transfer and does not have a copy of the file.
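The tracker's role can be illustrated with a toy in-memory version. This is a sketch under stated assumptions, not a description of any real tracker: the class, its `announce` and `scrape` methods, and the "host:port" strings are invented for the example, and no network code is involved. Note that the tracker stores only peer addresses, never file data.

```python
import random

class Tracker:
    """Toy in-memory tracker: remembers which peers are in each swarm."""

    def __init__(self):
        self.swarms = {}  # torrent id -> {peer address: has_complete_file}

    def announce(self, torrent_id, peer, is_seed=False, want=50):
        """Record the peer and return up to `want` other peers in the swarm."""
        swarm = self.swarms.setdefault(torrent_id, {})
        swarm[peer] = is_seed
        others = [p for p in swarm if p != peer]
        return random.sample(others, min(want, len(others)))

    def scrape(self, torrent_id):
        """Return (seed count, non-seed peer count) for a torrent."""
        swarm = self.swarms.get(torrent_id, {})
        seeds = sum(swarm.values())
        return seeds, len(swarm) - seeds


tracker = Tracker()
tracker.announce("demo", "10.0.0.1:6881", is_seed=True)
tracker.announce("demo", "10.0.0.2:6881")
print(tracker.announce("demo", "10.0.0.3:6881"))  # the two earlier peers, in random order
print(tracker.scrape("demo"))
```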
Comparison to other file sharing systems
Version 4.0.4 running in Windows XP
The method used by BitTorrent to distribute files parallels to a large extent the one used by the eDonkey2000 network, but nodes in eDonkey's file sharing network usually share and download a much larger number of files, making the bandwidth available to each transfer much smaller. BitTorrent transfers are typically very fast, because all nodes in a group concentrate on transferring a single file or collection of files. While the original eDonkey2000 client provided little "leech resistance", most new clients have some sort of system to encourage uploaders. eMule, for example, has a credits system whereby a client rewards other clients that upload to it by increasing their priority in its queue. However, the nature of the eDonkey2000 concept means download speeds tend to be much more variable, although the number of available files is far greater.
A method similar to BitTorrent's was the Participation Level introduced in KaZaA in 2002. A user's Participation Level increases when they upload and decreases when they download. When a user uploads a file, the requester with the highest Participation Level gets it first, then uploads it to the person with the next highest Participation Level, and so on. This can be visualised as a pyramid, with the people who have the most upload bandwidth available at the top and people with less bandwidth on progressively lower levels. This is the most efficient way to distribute a file to a large number of users: it is probable that even the people at the bottom of the pyramid will get the file faster than if the file were served by a non-P2P method. Unfortunately, the system adopted by KaZaA is considered by some to be flawed, as it relies on each client accurately reporting its own Participation Level, which makes it easy to cheat with the many "unofficial" clients.
Legal issues
BitTorrent, like any other file transfer protocol, can be used to distribute files without the permission of the copyright holder. BitTorrent has received bad press (mostly initiated by incensed Hollywood movie distributors) because it can be used to distribute copyrighted files illegally.
Legal uses for BitTorrent
BitTorrent can be used by software developers who want to ease the bandwidth strain on their servers. If a developer offers a large file for download, the bandwidth limit of their server may be exceeded if a large number of people download the file. By offering the file via BitTorrent, they transfer much of the bandwidth burden to downloaders of the file.
For example, the site http://www.gameupdates.org offers legal game files via BitTorrent; the demo of the flight sim X-Plane is offered via BitTorrent, as are the World of Warcraft in-game patches. Another example is PlaneShift, a free open-source MMORPG, which uses BitTorrent as its primary method of distribution. The fan film Star Wars: Revelations is distributing two DVD images as well as the film by itself via BitTorrent, and Star Wreck: In the Pirkinning, a feature-length film, was provided for download via the network in addition to a centralized server. In 2005, the rock group Harvey Danger began distributing their third full-length album, Little by Little..., using BitTorrent. Various operating systems have also used BitTorrent as an alternative way of distributing ISO images of their releases, including FreeBSD, NetBSD version 1.6.2 and later, and most major Linux distributions.
Peter Jackson's production diaries for King Kong have been posted for download using BitTorrent technology. Democracy Now, a progressive news organisation, now distributes its daily television and radio broadcast using BitTorrent technology and podcasting, in addition to its traditional cable and satellite distribution. Several anime companies have also used BitTorrent technology to release teaser episodes and trailers online for promotional purposes, as a sign of embracing technology that is often seen as a direct competitor. Furthermore, NASA recently included BitTorrent as a means to download some of its larger space image files.
BitTorrent is also used to distribute updates to the BitTorrent client itself.
Following the success of the BitTorrent protocol, Bram Cohen, its creator, was hired in 2004 by Valve Software to develop a means of distributing patches and other content for online video games, proving that there are some less controversial reasons for the development of this technology. While many legal files, including Linux distributions, are available on other networks such as eDonkey2000 and Gnutella, these are placed there by users and not generally part of the official distribution mechanism. BitTorrent is overwhelmingly the most popular P2P protocol adopted officially for legal uses, and many adopters report that only by using the BitTorrent technology, with its dramatically reduced demands on networking hardware and bandwidth, could they afford to distribute such large files. This has led to a rapid upswing in both the size and quality of many files distributed freely online, notably from amateur film or video producers.
Copyright enforcement
There have been many cases of BitTorrent sites that distributed illegal content being shut down. Some of these shutdowns are carried out by industry associations, such as the MPAA, and some by government organizations.
In December 2004, the Finnish police raided a major BitTorrent site, Finreactor [1][2]. The charges have been dropped.
Suprnova.org, one of the most popular early BitTorrent sites, closed in December 2004, supposedly due to the pressure felt by Sloncek, the founder and administrator of the site. In December 2005, Sloncek revealed that the Suprnova computer servers had in fact been confiscated by Slovenian authorities. LokiTorrent, arguably the biggest torrent source after the demise of Suprnova, closed down soon after Suprnova. Allegedly, after threats from the MPAA, Edward Webber (known as 'lowkee'), webmaster of the site, was ordered by the court to pay a fine and supply the MPAA with logs (the IP addresses of visitors). It is thought that one of the primary causes of this enforcement action was the website's early release of Grand Theft Auto: San Andreas and Halo 2, weeks before their commercial releases.
Other sites that offer files such as anime fansubs are careful to shut down torrents that have been licensed in the United States, and only provide files that cannot be bought in the U.S., presumably in order to stay on the good side of the "effect on the market" factor of the fair use definition.
In the weeks following his receipt of the subpoena, Webber had begun a fundraising campaign to pay legal fees in a battle against the MPAA. In news reports, Webber said he would stand up to protect the rights of file sharers, which he did not. Webber raised approximately US$45,000 through a PayPal-based donation system. It is unclear how much of that money went to the MPAA, but taking into account the amount of damages he most likely had to pay, probably much of it. Following the agreement, the MPAA changed the LokiTorrent website to display a message intended to intimidate filesharers. Webber did not comment on this change.
On May 25, 2005, the popular BitTorrent website elitetorrents.org was shut down by the United States Federal Bureau of Investigation and Immigration and Customs Enforcement. At first it was thought that a malicious hacker had gained control of the website, but it was soon discovered that the website had been taken over by the US government. Ten search warrants relating to members of the website were executed. It is thought that one of the primary causes of this enforcement action was the website's early release of Star Wars Episode III: Revenge of the Sith.
On October 24, 2005, a 38-year-old Hong Kong BitTorrent user, Chan Nai-Ming (陳乃明, who used the handle 古惑天皇, literally "The master of cunning", while the magistrate referred to him as "Big Crook"), was convicted of breaching the copyright ordinance, Chapter 528 of Hong Kong law (see HKSAR v Chan Nai Ming). He was found to have distributed the three movies Daredevil, Red Planet and Miss Congeniality, subsequently uploading the torrent file to a newsgroup. The magistrate remarked that Chan's act of distributing the seed caused significant damage to the interest of copyright holders, with up to thirty users downloading the torrent simultaneously.
He was released on bail for HK$5,000, awaiting a sentencing hearing, though the magistrate himself admitted the difficulty of determining how he should be sentenced due to the lack of precedent for such a case. On November 7, 2005, he was sentenced to jail for three months but was immediately granted bail pending appeal to the High Court.
On November 23, 2005, the movie industry and Bram Cohen, the creator of BitTorrent, signed a deal they hoped would reduce the number of pirated films shared on the downloading network. The deal covered films found via the bittorrent.com website run by BitTorrent, Inc. It meant bittorrent.com had to remove any links to pirated films made by seven Hollywood movie studios. As it only covered the bittorrent.com website, it is unclear what overall effect this has had on net piracy. (Source: BBC News)
Legal defenses
There are two major differences between BitTorrent and many other peer-to-peer file-trading systems, which advocates suggest make it less useful to copyright violators. First, BitTorrent does not offer a search facility to find files by name. A user must find the initial torrent file by other means, such as a Web search. Second, BitTorrent makes no attempt to conceal the host ultimately responsible for a given file's availability: a person who wishes to make a file available must run a tracker on a specific host or hosts and distribute the tracker address(es) in the .torrent file. While it is possible to simply operate a tracker on a server that is located where the copyright holder cannot take legal action, this feature of the protocol does imply some degree of accountability that other protocols lack. It is far easier to request that the server's internet service provider shut the site down than it is to find and identify every user sharing a file on a traditional peer-to-peer network.
Etiquette
Because BitTorrent relies on the upstream bandwidth of its users — and the more users, the more aggregate bandwidth is available for sharing the files — it is considered good etiquette to leave one's BitTorrent client open after downloading has completed so that others may continue to gain from the file that has been distributed.
It is not clear, however, how long one should leave their client open after downloading has finished. Many clients report the byte traffic upstream as well as down, so the user can see how much they have contributed back to the network. Some clients also report the "share ratio", a number relating the amount of data uploaded to the amount downloaded. It is generally considered good form to at least share back the equivalent amount of traffic as the original file size.
The requirement of a "1.00" share ratio (uploading as much data as one has downloaded) is hotly contested, because not every user can achieve it. On any given torrent, the best possible outcome is an original seeder with an infinite ratio (having uploaded data without downloading any), a number of peers with 1.00 ratios (having downloaded the file, uploaded just as much data, and then promptly logged off), and two users with a 0.50 ratio (the last two having each downloaded a separate half of the file and then shared their half with the other). Even this outcome is highly unlikely: the last two peers would have to download exactly complementary halves and finish just as the last seeder logged off, and in practice some users upload less than they download while others upload more. A perfect torrent would thus still leave two users with only a 0.50 ratio, so each of them would have to make up the shortfall elsewhere, for example by providing new content, to maintain an overall ratio of 1.00.
Because the net ratio of all users is 1.0, any user who uploads past 1.0 forces some other user to sustain a lower ratio, so it is highly unlikely that all users who download a given torrent will achieve a 1.0 ratio on it. The figure is better seen as a guideline that encourages a healthy average upstream on a given network. Some networks, for example, prevent access to new torrents for the first 24-48 hours that the torrent is active to people with overall ratios of less than 1.0 and a certain amount of data uploaded.
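The arithmetic behind this argument can be checked with a small sketch. The swarm below is hypothetical (a 100 MB torrent with one seeder and three peers), and `share_ratio` is an illustrative helper rather than any client's actual function.

```python
def share_ratio(uploaded, downloaded):
    """Bytes uploaded divided by bytes downloaded; infinite if nothing was downloaded."""
    if downloaded == 0:
        return float("inf")
    return uploaded / downloaded


# (uploaded, downloaded) in MB for a hypothetical 100 MB torrent
swarm = {"seeder": (100, 0), "p1": (100, 100), "p2": (50, 100), "p3": (50, 100)}

for name, (up, down) in swarm.items():
    print(name, share_ratio(up, down))

# Every byte downloaded was uploaded by someone, so the totals must match.
total_up = sum(up for up, _ in swarm.values())
total_down = sum(down for _, down in swarm.values())
print(total_up == total_down)
```

Since the seeder's 100 MB of upload has no matching download of its own, the downloading peers cannot all reach a ratio of 1.0 on this torrent.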
The amount of time the client is left open may be more important than the amount of traffic contributed, since new users attempting to download a file will first need to find peers hosting the file.
Many advanced trackers now track statistics such as how many seeders and downloaders were on a torrent at the time of a user's disconnect as many consider this information more important than just the user's ratio of downloaded/uploaded.
New developments
The BitTorrent protocol is still under development and therefore may still acquire new features and other enhancements such as improved efficiency.
In May 2005, Bram Cohen released a new beta version of BitTorrent that eliminated the need for websites to host the centralized servers known as "trackers". It is now possible to have a torrent up in minutes, with a file, a website, and no understanding of how it works. In addition, Cohen launched a new service on BitTorrent's website, which helps users find files, both legal and illegal.
Cohen explained that the tracker removal feature is part of his ongoing effort to make publishing files online "painless and disruptively cheap". The move is only one of several designed to remove BitTorrent's dependence on centralized trackers.
This change is said to cause some trouble in the legal efforts to shut down illegal file sharing. However, Tarun Sawney, BSA Asia antipiracy director, said BitTorrent files could still be identified, since with or without the tracker sites, someone still hosts the infringing files (see [3] [4]).
Alternative approaches
The BitTorrent protocol provides no way to index torrent files. As a result, a comparatively small number of websites have hosted the large majority of copyright-infringing torrents, rendering those sites especially vulnerable to lawsuits. In response, some developers have sought ways to make publishing of files more anonymous while still retaining BitTorrent's speed advantage. The Shareaza client, for example, provides three alternatives to BitTorrent: eDonkey2000, Gnutella, and Shareaza's native network, Gnutella2 (G2). If the tracker is down, it can finish the file over the other protocols, and/or find new (Shareaza) peers over G2.
BitTorrent also inspired eXeem, from the privately held company InmateMediaGroup. It is backed by Andrej Preston, the administrator of the now-defunct Suprnova BitTorrent website. eXeem is supposed to decentralize BitTorrent and eliminate the need for web-based trackers (easy targets for the RIAA or the MPAA). Unlike BitTorrent (the protocol and the official client), eXeem is closed-source and owned by a corporation.
Distributed trackers are also one of the goals for Azureus 2.3.0.2 and BitTorrent 4.1.2. Another idea that has surfaced recently in Azureus is the virtual torrent. This idea is based on the distributed tracker approach and is used to describe some web resource. At present it is used for instant messaging; it is implemented using a special messaging protocol and requires an appropriate plugin. Anatomic P2P is another approach, which uses a decentralised network of nodes that route traffic to dynamic trackers.
BitTorrent search / Trackerless
Recently, Bram Cohen released his own BitTorrent search engine [5], which searches popular BitTorrent trackers for torrents, although it neither hosts nor tracks torrents itself [6]. From software version 4.2.0, BitTorrent also supports "trackerless" torrents, featuring a DHT implementation that allows the client to download torrents that have been created without using a BitTorrent tracker.
Web seeding (unofficial feature)
One recently implemented feature of BitTorrent is web seeding. The advantage of this feature is that a site may distribute a torrent for a particular file or batch of files and make those files available for download from that same web server application; this can simplify seeding and load balancing greatly once support for this feature is implemented in the various BitTorrent clients. In theory, this would make using BitTorrent almost as easy for a web publisher as simply creating a direct download while allowing some of the upload bandwidth demands to be placed upon the downloaders (who normally use only a very small portion of their upload bandwidth capacity). This feature is an unofficial one, created by TheSHAD0W, who created BitTornado.
Link: Web-Based Seeding Specification
Broadcatching
Another proposed feature combines RSS and BitTorrent to create a content delivery system dubbed broadcatching. Since a Steve Gillmor column for Ziff-Davis in December 2003, the discussion has spread quickly among many bloggers (Techdirt, Ernest Miller, and former Tech TV host Chris Pirillo, for example). In an interview, Scott Raymond explained:
"I want RSS feeds of BitTorrent files. A script would periodically check the feed for new items, and use them to start the download. Then, I could find a trusted publisher of an Alias RSS feed, and 'subscribe' to all new episodes of the show, which would then start downloading automatically — like the 'season pass' feature of the TiVo."
While potential illegal uses abound, as is the case with any new distribution method, this idea could turn traditional distribution models on their heads, giving smaller operations a new opportunity for content distribution. The system leans on the cost-saving benefit of BitTorrent, where expenses are virtually non-existent because each downloader of a file takes on a portion of the distribution.
RSS feeds layered on top keep track of the content, and because BitTorrent does cryptographic hashing of all data, subscribers to the feed can be sure they're getting what they think they're getting, whether that winds up being the latest Sopranos episode, or the latest Sveasoft firmware upgrade. (Naturally, however, ensuring that the same data reaches all nodes neglects the possibility that the original, source file may be corrupted or incorrectly labeled.)
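The kind of script described in the interview might, in outline, look like the sketch below. It only extracts new enclosure URLs from an RSS 2.0 document using Python's standard XML parser; fetching the feed and handing the torrents to a client are left out. The feed contents, the example.com URLs and the function name are all made up for illustration.

```python
import xml.etree.ElementTree as ET

def new_torrent_urls(rss_text, seen):
    """Return enclosure URLs from an RSS 2.0 feed that are not yet in `seen`.

    `seen` is a set of already-handled URLs and is updated in place.
    """
    root = ET.fromstring(rss_text)
    urls = []
    for enclosure in root.iter("enclosure"):
        url = enclosure.get("url")
        if url and url not in seen:
            seen.add(url)
            urls.append(url)
    return urls


FEED = """<rss version="2.0"><channel>
  <title>Example show</title>
  <item><title>Episode 1</title>
    <enclosure url="http://example.com/ep1.torrent" type="application/x-bittorrent" length="0"/></item>
  <item><title>Episode 2</title>
    <enclosure url="http://example.com/ep2.torrent" type="application/x-bittorrent" length="0"/></item>
</channel></rss>"""

seen = {"http://example.com/ep1.torrent"}   # episode 1 was handled earlier
print(new_torrent_urls(FEED, seen))
```

Run periodically, such a check would report each new episode once, at which point a real script would start the corresponding download.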
API
The BitTorrent web service Prodigem has made its API available to any web application capable of parsing XML, through a standard Representational State Transfer (REST) based interface. Released alongside it is PEP, a first application built using the API, which, in 400 lines of PHP, parses any Really Simple Syndication (RSS 2.0) feed and automatically creates and seeds a torrent for each enclosure found in that feed.
Encryption
Main article: Protocol header encrypt
Protocol header encrypt (PHE), Message stream encryption (MSE), or Protocol encryption (PE) are features of some BitTorrent clients that attempt to make BitTorrent hard to throttle. MSE and PE are two names for the same protocol.
Some ISPs throttle BitTorrent traffic because it makes up a large proportion of total traffic and ISPs don't want to spend money purchasing extra capacity. Instead, ISPs spend money purchasing hardware designed to look for and throttle BitTorrent traffic. Encryption makes BitTorrent traffic harder to detect and therefore harder to throttle.
Peer exchange
Main article: Peer exchange
Peer exchange (PEX) is another method to gather peers for BitTorrent in addition to trackers and DHT. Peer exchange checks with other peers to see if they know of any other peers.
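Conceptually, peer exchange means merging the addresses another peer advertises into the client's own list. The sketch below shows only that bookkeeping step; it does not implement any real PEX wire format, and the function name and "host:port" address strings are assumptions for the example.

```python
def merge_pex(known_peers, advertised, own_address):
    """Add peers advertised by another peer; return only the new ones.

    known_peers -- set of "host:port" strings, updated in place
    advertised  -- iterable of "host:port" strings received from a peer
    own_address -- this client's own address, never added to the set
    """
    new = set(advertised) - known_peers - {own_address}
    known_peers |= new
    return new


known = {"10.0.0.2:6881"}  # e.g. learned from the tracker
found = merge_pex(known, ["10.0.0.2:6881", "10.0.0.3:6881", "10.0.0.1:6881"],
                  own_address="10.0.0.1:6881")
print(sorted(found), sorted(known))
```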
BitTorrent-related applications
Because of the open nature of the protocol, many clients have been developed; they support numerous platforms and are written in various programming languages.