Tilburg University: What Can Torrent Metadata Tell Cybersecurity Specialists?

Exploring Torrent Metadata as an Open Source Intelligence Resource

Cybersecurity specialists frequently encounter torrent traffic when investigating corporate policy violations, insider risk, and criminal activity. A recent study proposes a different way to look at this traffic: as a potential source of open-source intelligence (OSINT). Its authors pose a practical question: how much useful information can be extracted from publicly available torrent metadata alone, without accessing the actual content?

The study highlights that torrent files carry a substantial amount of descriptive information: file names, tracker addresses, cryptographic hashes of the content, and other metadata fields.
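To make this concrete, here is a minimal sketch of reading those fields from a .torrent file, which is a bencoded dictionary. It is not the study's tooling. The function names `bdecode` and `summarize_torrent` and the sample bytes are illustrative assumptions, and only the BitTorrent v1 layout is handled.

```python
import hashlib


def bdecode(data: bytes, i: int = 0):
    """Decode one bencoded value at offset i; return (value, next_offset)."""
    c = data[i:i + 1]
    if c == b"i":                      # integer: i<digits>e
        end = data.index(b"e", i)
        return int(data[i + 1:end]), end + 1
    if c == b"l":                      # list: l<items>e
        i += 1
        items = []
        while data[i:i + 1] != b"e":
            item, i = bdecode(data, i)
            items.append(item)
        return items, i + 1
    if c == b"d":                      # dictionary: d<key><value>...e
        i += 1
        result = {}
        while data[i:i + 1] != b"e":
            key, i = bdecode(data, i)
            result[key], i = bdecode(data, i)
        return result, i + 1
    if c.isdigit():                    # byte string: <length>:<bytes>
        colon = data.index(b":", i)
        start = colon + 1
        end = start + int(data[i:colon])
        return data[start:end], end
    raise ValueError(f"invalid bencode at offset {i}")


def summarize_torrent(raw: bytes) -> dict:
    """Extract name, size, trackers and the v1 info-hash from raw .torrent bytes."""
    if raw[:1] != b"d":
        raise ValueError("a .torrent file must be a bencoded dictionary")
    i, meta, info_span = 1, {}, None
    # Walk the top-level dictionary by hand so the raw bytes of "info" are kept:
    # the v1 info-hash is the SHA-1 of exactly those bytes.
    while raw[i:i + 1] != b"e":
        key, i = bdecode(raw, i)
        start = i
        value, i = bdecode(raw, i)
        meta[key] = value
        if key == b"info":
            info_span = (start, i)
    if info_span is None:
        raise ValueError("missing 'info' dictionary")

    info = meta[b"info"]
    trackers = []
    if b"announce" in meta:
        trackers.append(meta[b"announce"].decode("utf-8", "replace"))
    for tier in meta.get(b"announce-list", []):
        for url in tier:
            text = url.decode("utf-8", "replace")
            if text not in trackers:
                trackers.append(text)

    return {
        "name": info[b"name"].decode("utf-8", "replace"),
        "length": info.get(b"length"),  # None for multi-file torrents
        "trackers": trackers,
        "info_hash": hashlib.sha1(raw[info_span[0]:info_span[1]]).hexdigest(),
    }


# Hand-built sample for illustration only (tracker host is a placeholder).
SAMPLE_INFO = (b"d6:lengthi1024e4:name8:demo.iso12:piece lengthi16384e"
               b"6:pieces20:" + b"\x00" * 20 + b"e")
SAMPLE = b"d8:announce30:udp://tracker.example.org:1337" + b"4:info" + SAMPLE_INFO + b"e"

if __name__ == "__main__":
    print(summarize_torrent(SAMPLE))
```

The info-hash is the identifier a tracker expects when asked for peers, which is why the sketch keeps the raw byte span rather than re-encoding the decoded dictionary.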

Trackers, in particular, return lists of the peers connected to a given torrent, revealing their IP addresses and the ports in use. This data is intended for coordinating downloads, but it can also be used for security analysis.
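The following sketch shows roughly what such a peer list looks like on the wire, assuming the UDP tracker protocol described in BEP 15 with IPv4 peers. The helper names are hypothetical, the addresses come from documentation ranges, and no network traffic is sent. It only packs a connect request and decodes an announce response that has already been received.

```python
import socket
import struct

PROTOCOL_ID = 0x41727101980  # magic constant defined by BEP 15
ACTION_CONNECT = 0
ACTION_ANNOUNCE = 1


def build_connect_request(transaction_id: int) -> bytes:
    """Pack the 16-byte connect request: protocol id, action, transaction id."""
    return struct.pack("!QII", PROTOCOL_ID, ACTION_CONNECT, transaction_id)


def parse_announce_response(payload: bytes, transaction_id: int) -> dict:
    """Decode an IPv4 announce response into counters and (ip, port) pairs."""
    if len(payload) < 20:
        raise ValueError("announce response is shorter than its 20-byte header")
    action, tid, interval, leechers, seeders = struct.unpack_from("!IIIII", payload, 0)
    if action != ACTION_ANNOUNCE or tid != transaction_id:
        raise ValueError("unexpected action or transaction id")

    peers = []
    # After the header, each peer takes 6 bytes: 4 for the IPv4 address, 2 for the port.
    for offset in range(20, len(payload) - 5, 6):
        ip = socket.inet_ntoa(payload[offset:offset + 4])
        (port,) = struct.unpack_from("!H", payload, offset + 4)
        peers.append((ip, port))
    return {"interval": interval, "leechers": leechers,
            "seeders": seeders, "peers": peers}


def example_payload() -> bytes:
    """Build a synthetic response so the parser can be tried offline."""
    header = struct.pack("!IIIII", ACTION_ANNOUNCE, 7, 1800, 3, 5)
    peer_a = socket.inet_aton("192.0.2.10") + struct.pack("!H", 6881)
    peer_b = socket.inet_aton("198.51.100.7") + struct.pack("!H", 51413)
    return header + peer_a + peer_b


if __name__ == "__main__":
    print(parse_announce_response(example_payload(), transaction_id=7))
```

Every address and port in the response is visible to any client that announces for the same info-hash, which is what makes this metadata observable without downloading any content.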

For the research, metadata was collected from The Pirate Bay and public UDP trackers across 206 popular torrent resources. The result was a dataset comprising over 60,000 unique IP addresses. Each address was further enriched with information from open services that provide data on geolocation, internet service providers, autonomous systems, and indicators of VPN or hosting infrastructure usage.
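A rough sketch of the deduplication and enrichment step might look like the code below. The article does not name the enrichment services, so `lookup` is a hypothetical callable standing in for whichever geolocation or ASN source is used. The record fields and sample values are likewise assumptions for illustration.

```python
import ipaddress
from dataclasses import dataclass, field
from typing import Callable, Dict, Iterable, Tuple


@dataclass
class PeerRecord:
    ip: str
    torrents: set = field(default_factory=set)      # info-hashes this IP was seen on
    enrichment: dict = field(default_factory=dict)  # e.g. country, ASN, VPN flag


def build_dataset(
    observations: Iterable[Tuple[str, str]],
    lookup: Callable[[str], dict],
) -> Dict[str, PeerRecord]:
    """Collapse (info_hash, ip) sightings into one enriched record per unique IP."""
    records: Dict[str, PeerRecord] = {}
    for info_hash, raw_ip in observations:
        ip = str(ipaddress.ip_address(raw_ip))  # validates and normalizes the address
        record = records.setdefault(ip, PeerRecord(ip))
        record.torrents.add(info_hash)
    for record in records.values():
        record.enrichment = lookup(record.ip)   # hypothetical enrichment source
    return records


# Stand-in for an external service: a local table with made-up values.
FAKE_ENRICHMENT = {
    "192.0.2.10": {"country": "NL", "asn": 64500, "vpn": False},
    "198.51.100.7": {"country": "DE", "asn": 64501, "vpn": True},
}


def fake_lookup(ip: str) -> dict:
    return FAKE_ENRICHMENT.get(ip, {})


if __name__ == "__main__":
    sightings = [
        ("hash-a", "192.0.2.10"),
        ("hash-b", "192.0.2.10"),
        ("hash-a", "198.51.100.7"),
    ]
    for rec in build_dataset(sightings, fake_lookup).values():
        print(rec.ip, sorted(rec.torrents), rec.enrichment)
```

Keeping the set of torrents per address makes it possible to ask later which addresses appear across many swarms, while the enrichment dictionary stays independent of any particular provider.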

A separate phase of the analysis labeled IP addresses previously associated with the distribution of child sexual exploitation material. This was done using an external, publicly available monitoring database. Importantly, the researchers deliberately avoided any direct interaction with illegal content, relying instead on cross-referencing existing labels.
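In code, cross-referencing of this kind reduces to a membership test against an existing list of addresses, with no content retrieved at any point. The sketch below assumes the external database can be exported as a plain list of IP strings. That format, and the function name `label_addresses`, are assumptions rather than details from the study.

```python
import ipaddress
from typing import Dict, Iterable


def label_addresses(observed: Iterable[str], flagged: Iterable[str]) -> Dict[str, bool]:
    """Mark each observed IP as flagged if it appears in an existing external list."""
    # Normalize both sides so formatting differences do not hide a match.
    flagged_set = {str(ipaddress.ip_address(ip)) for ip in flagged}
    labels = {}
    for ip in observed:
        normalized = str(ipaddress.ip_address(ip))
        labels[normalized] = normalized in flagged_set
    return labels


if __name__ == "__main__":
    observed = ["192.0.2.10", "198.51.100.7", "203.0.113.5"]
    flagged = ["198.51.100.7"]  # placeholder entry from documentation address space
    print(label_addresses(observed, flagged))
```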

Co-author Giuseppe Cascavilla, an associate professor at Tilburg University, explained in a comment to Help Net Security that the choice of UDP trackers was a deliberate design decision. It allowed the team to test the analysis as a proof of concept, although it limited the completeness of the observations.

According to Cascavilla, extending the methodology with large-scale data collection from DHT networks could improve coverage by identifying users who avoid centralized trackers. He noted that integrating DHT data would likely strengthen the observed correlation between anonymization and riskier behavior, and could reveal denser network structures with more connecting nodes. He emphasized that the current findings should be read as a conservative snapshot of activity already observable through open tracking mechanisms.

TrendTechie