What has Rusty to do with Brand New? If you’re a geek, chances are you’ve heard of torrents. If you haven’t, torrents are a file sharing technology. Torrents allow users to share large files quickly and easily in a decentralized manner.
In this post I will explain more about the technology and also talk about a new open source project that we have been collaborating on since June this year. Keep reading, because it is not like sharing your files via OneDrive, Google Drive, etc., and maybe you too can benefit from it.
You will learn how the technology works without much technical fuss, why you can benefit from it, and how it differs from standard file sharing solutions. After that, I will talk about Torrust, the new open source project we have started working on.
About Torrust you will learn four things:
- How it is different from other solutions
- The improvements that we are working on
- The team behind it
- Potential use case in the field of data science
How does torrent technology work?
Torrents are a decentralized approach to sharing files that is often described as anonymous. That anonymity is only relative, though, because torrent trackers do not hide your IP address. If you really want to share anonymously, you can use a VPN to hide your real IP.
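To make the point about visibility concrete, here is a minimal sketch of the kind of announce request a client sends to an HTTP tracker. The parameter names follow the public BitTorrent specification (BEP 3); the tracker URL, hash, peer id and port are made-up values for illustration only. The request carries no field that hides the sender: the tracker normally reads your address straight from the network connection.

```python
from urllib.parse import urlencode


def build_announce_url(tracker_url, info_hash, peer_id, port, left):
    """Build an HTTP tracker announce URL (parameter names as in BEP 3)."""
    params = {
        "info_hash": info_hash,  # 20-byte SHA-1 of the torrent's info dictionary
        "peer_id": peer_id,      # 20-byte identifier chosen by the client
        "port": port,            # port the client listens on for other peers
        "uploaded": 0,           # bytes uploaded so far
        "downloaded": 0,         # bytes downloaded so far
        "left": left,            # bytes still missing
        "compact": 1,            # ask for the compact peer list format
    }
    # urlencode percent-encodes the raw bytes values for us.
    return f"{tracker_url}?{urlencode(params)}"


# Hypothetical values, not taken from any real torrent or tracker.
url = build_announce_url(
    "http://tracker.example.org/announce",
    info_hash=bytes(range(20)),
    peer_id=b"-XX0001-123456789012",
    port=6881,
    left=104_857_600,
)
print(url)
```

The tracker answers with a list of other peers' addresses, which is how peers find each other and also why your address is visible to them unless you use something like a VPN.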
Nevertheless, unlike traditional file sharing methods, torrents allow users to connect with each other directly without the need for a centralized server.
This has several advantages:
- It’s much more difficult for governments or ISPs to block torrent traffic since there’s no central point that can be shut down.
- Torrents are also very efficient in terms of bandwidth usage. Since files are downloaded in small pieces from multiple sources, the load on any one server is greatly reduced.
- The more people share the same file, the faster the download will be, since there are more sources to download from.
However, torrents also have some disadvantages:
- The main disadvantage of torrents is that they can be used for illegal purposes. Because torrents are decentralized and relatively anonymous, they’re often used to share pirated movies, music, and software.
- Another disadvantage of torrents is that downloads can be slow if there are not many sources providing the same torrent file.
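The idea of downloading a file in small pieces from multiple sources can be sketched in a few lines. BitTorrent (version 1) identifies each piece by its SHA-1 hash, so a client can check every piece no matter which peer it came from. The piece size and the sample data below are assumptions chosen for illustration, and the "peers" are simulated by shuffling the arrival order.

```python
import hashlib
import random

PIECE_LENGTH = 64 * 1024  # hypothetical piece size; real torrents choose their own


def split_into_pieces(data, piece_length=PIECE_LENGTH):
    """Cut the data into fixed-size pieces (the last one may be shorter)."""
    return [data[i:i + piece_length] for i in range(0, len(data), piece_length)]


def piece_hashes(pieces):
    """SHA-1 digest of every piece, in order."""
    return [hashlib.sha1(piece).digest() for piece in pieces]


def reassemble(received, expected_hashes):
    """Rebuild the file from a dict of index -> piece, verifying each piece."""
    parts = []
    for index, expected in enumerate(expected_hashes):
        piece = received.get(index)
        if piece is None or hashlib.sha1(piece).digest() != expected:
            raise ValueError(f"piece {index} is missing or corrupt")
        parts.append(piece)
    return b"".join(parts)


original = bytes(range(256)) * 1000  # 256,000 bytes of sample data
pieces = split_into_pieces(original)
hashes = piece_hashes(pieces)

# Simulate pieces arriving in arbitrary order from different peers.
arrival_order = list(range(len(pieces)))
random.shuffle(arrival_order)
received = {index: pieces[index] for index in arrival_order}

print(len(pieces), "pieces, file intact:", reassemble(received, hashes) == original)
```

Because every piece is verified on its own, a bad piece from one source only means re-requesting that one piece, not the whole file.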
How standard file sharing differs from Torrents
Compared with standard file sharing, for example with OneDrive, torrents differ in several ways:
- First, standard file sharing solutions are centralized, which means that there is a single point of control. If that point of control is shut down, the entire system goes down with it. Torrents are decentralized, which means that there is no single point of control. Even if one source is shut down, the torrent can still be downloaded from other sources.
- Second, standard file sharing solutions allow you to share files with only a limited number of people. For example, you might be able to share a file with up to 100 people using OneDrive. With torrents, there is no limit to the number of people who can download the same torrent.
- Third, standard file sharing solutions are usually very inefficient in terms of bandwidth usage. For example, if you’re sharing a 100 MB file with 100 people, that’s 10 GB of data that the one server has to upload. Torrents split the file into small pieces and every downloader also passes pieces on to others, so the load is spread across the peers instead of falling on a single server. Only the pieces you still need are downloaded, so you avoid downloading the same piece twice.
- Fourth, the more people share the file, the faster you download it.
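The bandwidth example above (a 100 MB file shared with 100 people) can be written out as a small calculation. The swarm figure is an idealised best case that I am assuming for illustration: the original seeder uploads the file once and the downloaders exchange everything else among themselves. In practice the seeder will usually upload more than that.

```python
def central_upload_mb(file_mb, downloaders):
    """Data a single central server uploads when it serves everyone itself."""
    return file_mb * downloaders


def swarm_seed_upload_mb(file_mb, downloaders, seed_copies=1):
    """Idealised swarm: the seeder uploads `seed_copies` full copies at most,
    and the peers pass the remaining data on to each other."""
    return file_mb * min(seed_copies, downloaders)


file_mb, downloaders = 100, 100

total_mb = central_upload_mb(file_mb, downloaders)
seed_mb = swarm_seed_upload_mb(file_mb, downloaders)
from_peers_mb = total_mb - seed_mb

print(f"central server uploads: {total_mb} MB ({total_mb / 1000:g} GB)")
print(f"idealised seeder uploads: {seed_mb} MB")
print(f"exchanged between peers: {from_peers_mb} MB")
```

The total amount of data that reaches the downloaders is the same in both cases. What changes is who has to provide it.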
“An image is worth a thousand words”, as the saying goes, so here is a little graph to show you how your download time can go down. I took some average world statistics for upload and download speeds, assumed that only a tenth of each seeder’s available upload bandwidth is used, and calculated the time savings for a [...] Is it not sweet?
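One simple way such an estimate could be computed is sketched below. It is a rough model under stated assumptions, not the exact calculation behind the graph: every seeder contributes a fixed share of its upload bandwidth (a tenth, as above), and your own download speed caps the total. The file size and the speeds are hypothetical numbers picked only to make the example run.

```python
def download_time_seconds(file_mb, seeders, down_mbit, up_mbit, upload_share=0.1):
    """Estimate download time for a file of `file_mb` megabytes.

    down_mbit:    downloader's own download speed in Mbit/s
    up_mbit:      each seeder's upload speed in Mbit/s
    upload_share: fraction of a seeder's upload bandwidth given to this download
    """
    if seeders < 1:
        raise ValueError("at least one seeder is needed")
    swarm_rate = seeders * up_mbit * upload_share  # what the seeders can send
    rate = min(down_mbit, swarm_rate)              # capped by our own connection
    return file_mb * 8 / rate                      # megabytes -> megabits


# Hypothetical inputs: 1000 MB file, 100 Mbit/s down, 40 Mbit/s up per seeder.
for seeders in (1, 10, 25, 50):
    seconds = download_time_seconds(1000, seeders, down_mbit=100, up_mbit=40)
    print(f"{seeders:>3} seeders: about {seconds:.0f} s")
```

In this model the time drops as seeders join until your own download speed becomes the limit, after which extra seeders no longer help.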
How can you benefit?
I think from the above you can already see the benefits of this decentralized technology.
- You do not depend on a central server
- Speed
- Unlimited sharing
- Bandwidth optimization
- Anonymity
Torrents used by Institutions and Businesses
As you can imagine, these advantages are exploited for legal as well as illegal purposes, just as with any technology, which is why torrents are too quickly associated with piracy. Still, let me give you some names of companies and projects that actively use them for legal purposes, such as being more efficient and providing better services: Amazon, Facebook (Meta), Linux, some governments, etc. They are used for software updates, backups, etc.
If you want to read a little bit more about these uses:
- https://spectrum.ieee.org/how-torrents-can-benefit-businesses-and-users
- https://www.makeuseof.com/tag/8-legal-uses-for-bittorrent-youd-be-surprised/
- https://docs.aws.amazon.com/AmazonS3/latest/API/API_GetObjectTorrent.html
Torrents used privately
For personal use with friends, family, clubs, etc., you can also benefit by sharing images and other files privately without having to go through another service.
One benefit of Torrents I have not previously spoken about
The torrent protocol is open. This means that anyone can develop software that uses it, and there are no licensing fees or restrictions on who can use it. This makes torrents a great choice for open source projects that need to distribute large files, which brings us to Torrust.
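As a small illustration of how approachable the protocol is, here is a minimal sketch of an encoder for "bencode", the simple format that .torrent files and tracker responses use. The encoding rules come from the public BitTorrent specification; the example metainfo values (tracker URL, file name, sizes, placeholder hashes) are invented for illustration.

```python
def bencode(value):
    """Encode ints, strings/bytes, lists and dicts in bencode format."""
    if isinstance(value, bool):
        raise TypeError("bencode has no boolean type")
    if isinstance(value, int):
        return b"i" + str(value).encode("ascii") + b"e"
    if isinstance(value, str):
        value = value.encode("utf-8")
    if isinstance(value, bytes):
        return str(len(value)).encode("ascii") + b":" + value
    if isinstance(value, list):
        return b"l" + b"".join(bencode(item) for item in value) + b"e"
    if isinstance(value, dict):
        # Keys are byte strings and must appear in sorted order.
        items = [(k.encode("utf-8") if isinstance(k, str) else k, v)
                 for k, v in value.items()]
        items.sort(key=lambda pair: pair[0])
        return b"d" + b"".join(bencode(k) + bencode(v) for k, v in items) + b"e"
    raise TypeError(f"cannot bencode {type(value).__name__}")


# Hypothetical metainfo for a single 1024-byte file split into two pieces.
metainfo = {
    "announce": "http://tracker.example.org/announce",
    "info": {
        "name": "dataset.bin",
        "length": 1024,
        "piece length": 512,
        "pieces": bytes(40),  # placeholder for two 20-byte SHA-1 hashes
    },
}
print(bencode(metainfo))
```

A real client also needs a decoder and the networking around it, but the data format itself is no more complicated than this.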
Torrust – New Open Source Torrent Solution written in Rust
Torrust is a Rust implementation of the torrent protocol. It has the different elements usually found in torrent software solutions.
List of current features
- Multiple UDP server and HTTP(S) server blocks for socket binding possible
- Full IPv4 and IPv6 support for both UDP and HTTP(S)
- Private & Whitelisted mode
- Built-in API
- Torrent whitelisting
- Peer authentication using time-bound keys
- newTrackon check supported for both HTTP and UDP, with IPv4 and IPv6 properly handled
- SQLite3 Persistent loading and saving of the torrent hashes and completed count
- MySQL support added as engine option
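To give a feel for what "peer authentication using time-bound keys" means, here is a minimal sketch of the general idea. It is not Torrust's actual implementation (Torrust is written in Rust and its design may differ): the class name, methods and in-memory storage are all hypothetical. A key is issued with an expiry time, and a peer presenting it is only accepted while the key is still valid.

```python
import secrets
import time


class KeyStore:
    """Hypothetical in-memory store of authentication keys with expiry times."""

    def __init__(self):
        self._expiry = {}  # key -> Unix timestamp at which it stops being valid

    def issue(self, valid_for_seconds, now=None):
        """Create a random key that is valid for the given number of seconds."""
        now = time.time() if now is None else now
        key = secrets.token_urlsafe(24)
        self._expiry[key] = now + valid_for_seconds
        return key

    def is_valid(self, key, now=None):
        """A key is accepted only if it is known and has not yet expired."""
        now = time.time() if now is None else now
        expires_at = self._expiry.get(key)
        return expires_at is not None and now < expires_at


store = KeyStore()
key = store.issue(valid_for_seconds=3600)
print("fresh key accepted:", store.is_valid(key))
print("unknown key accepted:", store.is_valid("not-a-real-key"))
```

The optional `now` argument is only there so the expiry logic can be checked without waiting for real time to pass.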
Also if you are curious about Rust here are some articles on it:
- How Rust Compares to Other Programming Languages
- Rust vs. Python: Why Rust is gaining in popularity
- Rust by the Numbers: The Rust Programming Language in 2021
How it is currently different from other Torrent solutions
To date, the focus of this torrent implementation has been improved speed, scalability and resource management.
Speed, scalability and resource management
In terms of resource management and speed we already have proof that it does what it is meant to do. It is already being used in the backend of some hosting sites with millions of connections.
The improvements that we are working on
With the addition of our team, we are now working on improving security, reliability, code quality, user friendliness and repository automation.
User friendliness or UX and UI
Most of the torrent clients and sites can only be described as ugly and awkward to use. We are currently working on creating a minimalist interface and an underlying use case logic that makes sense and is scalable.
Code quality & security & reliability
Currently the whole code base is being reworked and all the elements of the current software architecture are being analyzed for security holes. We are implementing testing throughout the whole project’s code base.
For scalability and security, we are defining an approach to the user and permission structure that avoids over-engineering it.
A current goal is to apply Test Driven Development (TDD) and automation to all of the project’s repositories and, through this, improve the overall test coverage, reliability and security of the software.
The team behind it
The team behind Torrust at the moment consists of Mick van Dijke, several volunteer contributors and the Nautilus Cyberneering team, which includes software security architects, senior developers and other freelance collaborators such as UX / UI designers.
Over the past year, the Nautilus Cyberneering team has been building up the skills needed to automate git code repositories and to implement versioning of data sets used in machine learning and data science. More about the Nautilus Cyberneering team here.
Potential use case for torrents in the field of data science
Data scientists often need to download large data sets for their work. These data sets can be difficult to find, and they’re often very large, which can make downloading them slow and painful.
Because these downloads are slow and painful, and scientists sometimes need several attempts to get the file they want, they have needed something called “chunking”. As it happens, chunking already exists in torrents.
Therefore the challenge of size and versioning for any community using large data sets can be optimized even further by using torrents.
For large files, chunking is already implemented in torrents, and for data sets with thousands of small files torrents can also be a nice approach, with a different way of handling versions. Often, one version of a data set is made up to a high percentage of the same files as another version, which you and other scientists may already have downloaded. Imagine how much download time and bandwidth you could save.
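The versioning idea can be sketched as a comparison of two file lists. This is only an illustration of the principle, not how any particular torrent client works: I am assuming each data set version is described by a manifest that maps file names to content hashes, and the file names and contents are made up. Anything whose content you already hold from an earlier version does not need to be downloaded again.

```python
import hashlib


def manifest(files):
    """Map each file name to the SHA-256 hash of its content."""
    return {name: hashlib.sha256(content).hexdigest()
            for name, content in files.items()}


def files_to_download(new_manifest, local_manifest):
    """Names in the new version whose content is not already held locally."""
    already_have = set(local_manifest.values())
    return sorted(name for name, digest in new_manifest.items()
                  if digest not in already_have)


# Hypothetical data set: version 2 changes one file and adds another.
version_1 = {"a.csv": b"1,2,3\n", "b.csv": b"4,5,6\n", "c.csv": b"7,8,9\n"}
version_2 = {"a.csv": b"1,2,3\n", "b.csv": b"4,5,60\n",
             "c.csv": b"7,8,9\n", "d.csv": b"10,11,12\n"}

needed = files_to_download(manifest(version_2), manifest(version_1))
print("files to fetch for version 2:", needed)
```

In this toy example only two of the four files have to be fetched, and with real data sets of thousands of mostly unchanged files the saving grows accordingly.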
Conclusion
We think that torrents are a great way to share data, and we’re excited to see what we will build with our community for everyday torrent use but also for their application in machine learning and data science.
Also if you are curious about Torrust or what else we have been up to:
Stay tuned for more updates. Thanks for reading!