In my previous post on the BitTorrent protocol, I took a look at the main parts that make up its infrastructure. For this post I am going to focus on how data is uploaded and downloaded.
The BitTorrent protocol is designed as a very efficient way of sharing data. You start by using your BitTorrent client to prepare the file that you want to share. A number of operations are performed at this point:
- The file is divided up into pieces. Every piece is the same size except the last one, which is usually smaller. Piece sizes are commonly 256 KB or larger.
- A cryptographic hash value (SHA-1) is calculated for each piece, which is used to detect any modifications to pieces when they are downloaded. A separate hash value, called the info_hash, is calculated over the torrent's metadata and identifies the torrent as a whole.
- The upload is registered with one or more trackers which then add the upload to their indexes.
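The first two steps above can be sketched in a few lines of Python. This is only an illustration and not how any particular client does it: the 256 KiB piece length, the function names and the 600 KiB of dummy data are all assumptions made for the example.

```python
import hashlib

PIECE_LENGTH = 256 * 1024  # assumed piece size for this sketch (256 KiB)


def split_into_pieces(data: bytes, piece_length: int = PIECE_LENGTH) -> list:
    """Cut the data into fixed-size pieces; only the last piece may be shorter."""
    return [data[i:i + piece_length] for i in range(0, len(data), piece_length)]


def piece_hashes(pieces: list) -> list:
    """Return the 20-byte SHA-1 digest of each piece."""
    return [hashlib.sha1(piece).digest() for piece in pieces]


# Dummy "file": 600 KiB of zero bytes (an assumption for the example).
data = bytes(600 * 1024)
pieces = split_into_pieces(data)
hashes = piece_hashes(pieces)

print(len(pieces))               # 3
print([len(p) for p in pieces])  # [262144, 262144, 90112]
```

A downloader that knows the list of piece hashes can check each piece on its own as soon as it arrives, without waiting for the rest of the file.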
The client which uploaded the torrent file becomes the one and only seed until someone else completes the download. A seed is any client which has all of the pieces and is sharing them. A leech or peer is any client which has downloaded one or more pieces but has yet to complete the full download. Seeds, peers and leeches can all share data, and this is what makes BitTorrent so efficient: anyone who downloads a piece can share it with those that do not have it. It also does not matter what order you download the pieces in, as your BitTorrent client will reassemble them in the correct order.
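To show why the download order does not matter, here is a minimal sketch of checking and reassembling pieces that arrived out of order. The `reassemble` function, the three tiny sample pieces and the arrival order are all made up for the example; a real client writes pieces to disk rather than holding them in memory.

```python
import hashlib


def reassemble(received: dict, expected_hashes: list) -> bytes:
    """Join pieces by index, rejecting any missing or corrupted piece."""
    ordered = []
    for index, expected in enumerate(expected_hashes):
        piece = received.get(index)
        if piece is None:
            raise ValueError(f"piece {index} is missing")
        if hashlib.sha1(piece).digest() != expected:
            raise ValueError(f"piece {index} failed its hash check")
        ordered.append(piece)
    return b"".join(ordered)


# Hypothetical three-piece file and the hashes published for it.
original = [b"alpha", b"bravo", b"charlie"]
expected_hashes = [hashlib.sha1(p).digest() for p in original]

# Pieces arrive in the order 2, 0, 1 and are stored by their index.
received = {index: original[index] for index in (2, 0, 1)}

result = reassemble(received, expected_hashes)
print(result)  # b'alphabravocharlie'
```

Because every piece carries its index and has its own hash, the client only needs to know where a piece belongs and whether it is intact, not when it arrived.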
A key role of the tracker is to keep track of which clients are taking part in a download and to put them in touch with each other. The tracker does not decide which pieces are transferred. That job belongs to the clients, which tell each other what pieces they hold and try to download the rarest pieces first. This is a very important concept as downloading via BitTorrent is like making a jigsaw: you put it together piece by piece, but if one is missing you will never complete it. Clients try to avoid this by working out which piece is held by the fewest peers and requesting that one next.
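The rarest-first idea can be sketched as follows. In real clients this choice is made from the piece lists that peers advertise to each other. The function name, the peer names and the piece sets below are assumptions for the example, and ties are broken by lowest index only to keep the result predictable, whereas real clients usually pick at random among equally rare pieces.

```python
from collections import Counter
from typing import Optional


def pick_rarest_piece(have: set, peer_pieces: dict) -> Optional[int]:
    """Return the index of the missing piece held by the fewest peers."""
    counts = Counter()
    for pieces in peer_pieces.values():
        # Only count pieces we still need.
        counts.update(pieces - have)
    if not counts:
        return None  # nothing we need is on offer
    # Fewest holders first; lowest index breaks ties (a simplification).
    return min(counts, key=lambda index: (counts[index], index))


# Hypothetical swarm: we hold piece 0, three peers advertise their pieces.
have = {0}
peer_pieces = {
    "peer_a": {0, 1, 2},
    "peer_b": {0, 1},
    "peer_c": {1, 3},
}

next_piece = pick_rarest_piece(have, peer_pieces)
print(next_piece)  # 2
```

Piece 1 is held by all three peers, so it can wait. Pieces 2 and 3 each have a single holder, so one of them is requested first while it is still available.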
From an OSI model point of view, BitTorrent itself operates at layer 7, the application layer, and relies on the protocols beneath it to move data around. This can be a bit confusing, as it is a protocol within a protocol. At the transport layer, BitTorrent uses both TCP and UDP. Data transfers between clients traditionally use TCP, although many modern clients can also use a UDP-based transport called uTP. Connections with trackers use either HTTP, which runs over TCP, or a lightweight UDP-based tracker protocol. I will take a look at this in more detail in my next blog post which looks at detecting BitTorrent activity on your network.
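To make the "protocol within a protocol" idea more concrete, the sketch below builds two application-layer messages as raw bytes: the handshake that clients exchange over a TCP connection, and the connect request sent to a UDP tracker. Nothing is sent on the network here. The info_hash, peer ID and transaction ID are placeholder values made up for the example.

```python
import hashlib
import struct

PROTOCOL_NAME = b"BitTorrent protocol"
UDP_TRACKER_MAGIC = 0x41727101980  # fixed protocol ID for UDP tracker connects
ACTION_CONNECT = 0


def build_handshake(info_hash: bytes, peer_id: bytes) -> bytes:
    """Peer handshake: name length, name, 8 reserved bytes, info_hash, peer ID."""
    if len(info_hash) != 20 or len(peer_id) != 20:
        raise ValueError("info_hash and peer_id must both be 20 bytes")
    return struct.pack("!B19s8s20s20s", len(PROTOCOL_NAME), PROTOCOL_NAME,
                       bytes(8), info_hash, peer_id)


def build_udp_connect_request(transaction_id: int) -> bytes:
    """UDP tracker connect request: protocol ID, action, transaction ID."""
    return struct.pack("!QII", UDP_TRACKER_MAGIC, ACTION_CONNECT, transaction_id)


# Placeholder values, not taken from any real torrent or client.
info_hash = hashlib.sha1(b"example torrent metadata").digest()
peer_id = b"-XX0001-000000000000"

handshake = build_handshake(info_hash, peer_id)
connect_request = build_udp_connect_request(0x12345678)

print(len(handshake))        # 68
print(len(connect_request))  # 16
```

Both messages have fixed sizes and start with recognisable byte patterns, whichever transport carries them.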
At any one time your BitTorrent client can be downloading many pieces simultaneously. This approach makes full use of the available bandwidth so that you can download faster. However, this can also cause major problems on networks for two reasons. Firstly, the bandwidth consumption can be high, which can slow down access to other Internet-based services. Secondly, the high number of connections can overload firewalls and NAT devices as they try to maintain a list of who is connecting to what.
When you are downloading and sharing data using BitTorrent you can easily find out what connections your system is making. A peer list will show what connections are established and the transfer speed to each. This can be a problem for some people as it can be an easy way to find out who is sharing copyrighted material. To counteract this, a number of applications have been developed which block connections from companies associated with copyright infringement investigations. The applications operate like firewalls, blocking any connections to or from addresses which appear on their lists. Without successfully establishing a connection it can be hard for a third party to determine if you are sharing data or not.
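The core check in a blocking application of this kind can be sketched with Python's standard `ipaddress` module. This is an assumption about the general approach and not a description of any particular product. The two address ranges are reserved documentation ranges used purely as placeholders, not real blocklist entries.

```python
import ipaddress

# Placeholder blocklist using reserved documentation ranges.
BLOCKLIST = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.0/24"),
]


def is_blocked(address: str) -> bool:
    """Return True if the address falls inside any blocklisted range."""
    ip = ipaddress.ip_address(address)
    return any(ip in network for network in BLOCKLIST)


for candidate in ("192.0.2.77", "203.0.113.5"):
    verdict = "blocked" if is_blocked(candidate) else "allowed"
    print(candidate, verdict)
```

A list like this is only as good as its entries, so a connection from an address that is not on the list will still be allowed through.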
In my next post on this topic I am going to look at how you can detect BitTorrent on your network, from monitoring the downloading of torrent files to watching for unusual traffic on high TCP port numbers. If there is anything specific you want me to cover, please add a comment and I will try to include it in my next post.
Darragh
Darragh Delaney is head of technical services at NetFort. As Director of Technical Services and Customer Support, he interacts on a daily basis with NetFort customers and is responsible for the delivery of a high quality technical and customer support service. Follow Darragh on Twitter @darraghdelaney