Traces

From this location you can download several traces, including anonymized packet headers (tcpdump/libpcap), NetFlow version 5 data, a labeled dataset for intrusion detection, and Dropbox traffic traces. More information on the data collection and on the anonymization procedures can be found below. When using these traces, please refer to the Acceptable Use policy.

Booters - An analysis of DDoS-as-a-Service Attacks

Below you can find the data sets presented in:

Information about how we performed our measurements and about the characteristics of our network infrastructure can be found in Section II (Methodology) of the paper cited above, specifically Sections II-B (Measurements) and II-C (Compensating DDoS attack traffic).


Datasets for Booter attacks

Description | Filename | File size | Attack type | Attack average | Attack sources
Booter 1 | anon-Booter1.pcap.gz | 1.6G | DNS-based | 700Mbps | 4486
Booter 2 | anon-Booter2.pcap.gz | 818M | DNS-based | 250Mbps | 78
Booter 3 | anon-Booter3.pcap.gz | 1.1G | DNS-based | 330Mbps | 54
Booter 4 | anon-Booter4.pcap.gz | 5.5G | DNS-based | 1.19Gbps | 2970
Booter 5 | anon-Booter5.pcap.gz | 60M | DNS-based | 6Mbps | 8281
Booter 6 | anon-Booter6.pcap.gz | 1.4M | DNS-based | 150Mbps | 7379
Booter 7 | anon-Booter7.pcap.gz | 2.4M | DNS-based | 320Mbps | 6075
Booter 8 | anon-Booter8.pcap.gz | 197M | CharGen-based | 990Mbps | 281
Booter 9 | anon-Booter9.pcap.gz | 465M | CharGen-based | 5.48Gbps | 3779
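
As a rough way to relate a downloaded trace to the "Attack average" column, the sketch below computes an average bit rate from a classic pcap file using only the Python standard library. It assumes that the files are gzip-compressed classic pcap (not pcapng) and that the original on-wire packet length is preserved in each record header; neither is confirmed on this page. The function names are hypothetical, and the result may differ from the published figures, since the paper describes its own compensation of attack traffic (Section II-C).

```python
import gzip
import struct

PCAP_MAGIC_USEC = 0xA1B2C3D4  # classic pcap, microsecond timestamps
PCAP_MAGIC_NSEC = 0xA1B23C4D  # classic pcap, nanosecond timestamps


def average_rate_bps(stream):
    """Return (packet_count, average_bits_per_second) for a classic pcap stream."""
    header = stream.read(24)  # the pcap global header is 24 bytes long
    if len(header) < 24:
        raise ValueError("truncated pcap global header")

    # The magic number tells us the byte order and the timestamp resolution.
    for endian in ("<", ">"):
        magic = struct.unpack(endian + "I", header[:4])[0]
        if magic in (PCAP_MAGIC_USEC, PCAP_MAGIC_NSEC):
            break
    else:
        raise ValueError("not a classic pcap file")
    frac_scale = 1e-6 if magic == PCAP_MAGIC_USEC else 1e-9

    packets = 0
    total_bytes = 0
    first_ts = None
    last_ts = None
    while True:
        record = stream.read(16)  # per-packet record header
        if len(record) < 16:
            break
        ts_sec, ts_frac, incl_len, orig_len = struct.unpack(endian + "IIII", record)
        stream.read(incl_len)  # skip the captured bytes themselves
        ts = ts_sec + ts_frac * frac_scale
        if first_ts is None or ts < first_ts:
            first_ts = ts
        if last_ts is None or ts > last_ts:
            last_ts = ts
        packets += 1
        total_bytes += orig_len  # original on-wire length, not the captured length

    duration = (last_ts - first_ts) if packets else 0.0
    rate = total_bytes * 8 / duration if duration > 0 else 0.0
    return packets, rate


def average_rate_of_gz(path):
    """Convenience wrapper for a gzip-compressed pcap file (hypothetical helper)."""
    with gzip.open(path, "rb") as fh:
        return average_rate_bps(fh)
```

For example, `average_rate_of_gz("anon-Booter5.pcap.gz")` would return the packet count and the average rate in bits per second for that trace, assuming the file has been downloaded to the working directory.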

Acceptable use

Use of the datasets above for research or other purposes is subject to the "Creative Commons 4.0 Attribution-Sharealike license". Please make sure to cite our paper:

@inproceedings{santannajjIM2015,
author={Santanna, J.J. and van Rijswijk-Deij, R. and Hofstede, R. and Sperotto, A. and Wierbosch, M. and Zambenedetti Granville, L. and Pras, A.},
booktitle={IFIP/IEEE International Symposium on Integrated Network Management (IM)},
title={Booters - An analysis of DDoS-as-a-service attacks},
year={2015},
month={May},
pages={243-251},
doi={10.1109/INM.2015.7140298}
}

DNSSEC and its Potential for DDoS Attacks

Introduction

Below you can find the data sets presented in:

  • "DNSSEC and its Potential for DDoS Attacks" by Roland van Rijswijk-Deij, Anna Sperotto and Aiko Pras. In Proceedings of the 14th ACM Internet Measurement Conference (IMC 2014), November 5-7, 2014, Vancouver, BC, Canada.

A technical report describing the data sets and outlining the acceptable use of the data sets can be downloaded here:

Datasets for DNSSEC-signed domains

Top-level domain | Filename | File size | SHA256 hash
.com | com.dnssec.db.gz | 841M | c69d8bd680825bac272e97b0575a07e90ccbe4ffb2492e13edca4781fb574b7d
.net | net.dnssec.db.gz | 174M | e6c2b600c895a30b90fb1dc126a0ae55b28d0d6e0378164f4bca12b0b259bfa9
.org | org.dnssec.db.gz | 113M | aff93cb57405d9131567e9d4b687dbd86067c8c162ab28528be3b09ca3f11a08
.nl | nl.dnssec.db.gz | 3.8G | 740ea3b30c23992ee00a9d595982f0635eb67b12c13c4338b4e39afd72ecfeaa
.se | se.dnssec.db.gz | 601M | af2d4a8b1503d021f9151136f39eb843809684de26d64ab3e4a47594953fd4df
.uk | uk.dnssec.db.gz | 25M | dda8f4320d7dd8ce44d52bf4d59c30b2dc043fca01cefec06a8922494acb1e93

Datasets for regular domains

Top-level domain | Filename | File size | SHA256 hash
.com | com.non-dnssec.db.gz | 1023M | ce3b98e524f4f3589ed4e9a746bf88314a5c3e0815193e21ef98f286d5f787fb
.net | net.non-dnssec.db.gz | 209M | 47679adee3eb0b8228e4fae98596db64b605aef5ee35324c228e8531a86d2f45
.org | org.non-dnssec.db.gz | 209M | 0d57968b4734501fba52bca310bbf2e76684a82fe86b393e1dca1a97f4d55758
.nl | nl.non-dnssec.db.gz | 1.9G | b07548f8d7ea2d3baf28971d2ec89f5fcf5235c3cee5b3543a159079e2e208a6
.se | se.non-dnssec.db.gz | 605M | e2194c2ecedbd962cc5a51411edee9569061228693afc86881f57cda29d300cd
.uk | uk.non-dnssec.db.gz | 39M | 8809d590c26fb25c903c3ce037c5cc9ec1d28e7a5e6ef90d13b464cd9f6040f4
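
The SHA256 hashes listed in the tables above can be used to check that a download completed without corruption. A minimal sketch using Python's standard hashlib module follows; the function names are illustrative and not part of any tooling distributed with the datasets.

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of the file at `path`, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # Chunked reads keep memory use flat even for multi-gigabyte files.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected_hex):
    """Compare the computed digest with a published one, ignoring case and whitespace."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

For example, `matches_published_hash("uk.dnssec.db.gz", "dda8f4320d7dd8ce44d52bf4d59c30b2dc043fca01cefec06a8922494acb1e93")` should return True for an intact copy of the .uk DNSSEC dataset, assuming the file was saved under its listed name in the working directory.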

Acceptable use

Use of the datasets above for research or other purposes is subject to the "Creative Commons 4.0 Attribution-Sharealike license".

Please make sure to cite either our IMC 2014 paper:

@inproceedings{IMC2014,
address = {Vancouver, BC, Canada},
author = {van Rijswijk-Deij, Roland and Sperotto, Anna and Pras, Aiko},
booktitle = {Proceedings of the Internet Measurement Conference 2014},
doi = {10.1145/2663716.2663731},
publisher = {ACM Press},
title = { {DNSSEC and its potential for DDoS attacks - a comprehensive measurement study} },
year = {2014}
}

Or cite the technical report that describes the datasets and accompanies the IMC 2014 paper:

@techreport{dnssec-techrep-2014,
address = {Enschede, The Netherlands},
author = {van Rijswijk-Deij, Roland and Sperotto, Anna and Pras, Aiko},
institution = {University of Twente},
title = { {Large-scale DNS and DNSSEC data sets for network security research} },
url = {http://www.simpleweb.org/w/images/0/04/Techreport.pdf},
year = {2014}
}

Cloud Storage

Benchmarks

The software and data presented in the following paper can be downloaded from this link:

  • "Benchmarking Personal Cloud Storage" by Idilio Drago, Enrico Bocchi, Marco Mellia, Herman Slatman and Aiko Pras. In Proceedings of the 13th ACM Internet Measurement Conference (IMC 2013).

Dropbox User Files

In this experiment, we collected basic statistics about the files stored in Dropbox folders.

Download our datasets:

Name | File Size | Volunteers
Crawler Dataset | 219M | 333

Some results derived from these data can be found here.

Dropbox Traffic Traces

The flow data used in the following paper can be downloaded from this page:

Check here for more details. Several scripts used to process the data are also available here.

First Data Capture

These datasets were captured from March 24, 2012 to May 5, 2012.

Name | File Size | Flows | Devices
Campus 1 | 21M | 167,189 | 283
Campus 2 | 262M | 1,902,824 | 6,609
Home 1 | 181M | 1,438,369 | 3,350
Home 2 | 82M | 693,086 | 1,313

Second Data Capture

This dataset was captured from June 1, 2012 to July 31, 2012.

Name | File Size | Flows | Devices
Campus 1 | 32M | 264,131 | 270

Intrusion Detection

SSH datasets

The SSH datasets feature a unique combination of flow data (exported using NetFlow) and authentication log files, allowing for validation of any flow-based intrusion detection system. More information on the datasets can be found here. Citing the paper accompanying the SSH datasets is required when using the datasets:

SSH Compromise Detection using NetFlow/IPFIX
Rick Hofstede, Luuk Hendriks, Anna Sperotto, Aiko Pras. In: ACM SIGCOMM Computer Communication Review, Vol. 44, No. 5, 2014, ISSN 0146-4833, pp. 20-26.

Labeled Dataset for Intrusion Detection

In this scenario, a honeypot (running in a virtual machine) ran for 6 days, from Tuesday 23 September 2008 12:40:00 GMT to Monday 29 September 2008 22:40:00 GMT. The honeypot was hosted in the University of Twente network and directly connected to the Internet. The monitoring window covers both working days and weekend days. The data collection resulted in a 24 GB dump file containing 155.2 million packets. Processing the dumped data and logs resulted in 14.2M flows and 7.6M alerts. More information on the labeling procedure can be found here.
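
To put the figures above in perspective, the following sketch derives a few averages from them. It assumes that "24 GB" means 24 × 10^9 bytes and that the dump size includes per-packet capture overhead, so the bytes-per-packet value is only an upper-bound estimate. The function name is illustrative.

```python
from datetime import datetime


def honeypot_summary():
    """Derive rough averages from the figures quoted for the labeled dataset."""
    start = datetime(2008, 9, 23, 12, 40, 0)
    end = datetime(2008, 9, 29, 22, 40, 0)
    duration_s = (end - start).total_seconds()  # 6 days and 10 hours

    packets = 155.2e6
    flows = 14.2e6
    dump_bytes = 24e9  # assumption: decimal gigabytes

    return {
        "duration_seconds": duration_s,
        "packets_per_second": packets / duration_s,
        "packets_per_flow": packets / flows,
        "dump_bytes_per_packet": dump_bytes / packets,
    }


if __name__ == "__main__":
    for name, value in honeypot_summary().items():
        print(f"{name}: {value:.1f}")
```

Under these assumptions, the capture averages roughly 280 packets per second and about 11 packets per flow.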

Pcap Traces

These datasets are a collection of anonymized packet headers (tcpdump/libpcap) and NetFlow data collected from various locations in the Netherlands. More information on the data collection and anonymization procedures can be found here. Below you can find a short description of the scenarios in which the datasets were collected.

Trace 1 - Packet Headers

In scenario 1, we measured the 300 Mbit/s Ethernet link (a trunk of 3 x 100 Mbit/s) that connects the residential network of a university to the core network of that university. About 2000 students are connected to the residential network, each having a 100 Mbit/s Ethernet access link. The residential network itself consists of 100 and 300 Mbit/s links to the various switches, depending on the aggregation level. The measured link has an average load of about 60%. The measurements took place in July 2002.

Trace 2 - Packet Headers

In the second scenario, we measured the 1 Gbit/s Ethernet link connecting a research institute to the Dutch academic and research network. About 200 researchers and support staff work at this institute. They all have a 100 Mbit/s access link, and the core network of the institute consists of 1 Gbit/s links. The measured link is only mildly loaded, usually around 1%. The measurements are from May to August 2003.

Trace 3 - Packet Headers

This dataset was collected at a large college. Its 1 Gbit/s link to the Dutch academic and research network (i.e., the link that was measured) carries traffic for over 1000 students and staff concurrently during busy hours. The access link speed on this network is, in general, 100 Mbit/s. The average load on the 1 Gbit/s link is usually around 10-15%. These measurements were taken from September to December 2003.

Trace 4 - Packet Headers

In scenario 4, we monitored the 1 Gbit/s aggregated uplink of an ADSL access network. A couple of hundred ADSL customers, mostly student dorms, are connected to this access network. Access link speeds vary from 256 kbit/s (down and up) to 8 Mbit/s (down) and 1 Mbit/s (up). The average load on the aggregated uplink is around 150 Mbit/s. These measurements are from February to July 2004.

Trace 5 - Packet Headers

The dataset Packet Headers 5 was collected at a hosting provider, i.e., a commercial party that offers floor and rack space to clients who want to connect, for example, their web servers to the Internet. At this hosting provider, these servers are connected (in most cases) at 100 Mbit/s to the core network of the provider. The bandwidth capacity of this hosting provider's uplink (the link that we measured) is around 50 Mbit/s. These measurements are from December 2003 to February 2004.

Trace 6 - Packet Headers

In scenario 6, we measured a 100 Mbit/s Ethernet link connecting an educational organization to the Internet. This is a relatively small organization, with around 35 employees and a little over 100 students working and studying at this site (the headquarters of the organization). All workstations at this location (100 in total) have a 100 Mbit/s LAN connection. The core network consists of a 1 Gbit/s connection. The recordings took place between the external optical fiber modem and the first firewall. The measured link was only mildly loaded during this period. These measurements are from May to June 2007.

NetFlow Traces

Trace 7 - NetFlow Data

The NetFlow version 5 data was recorded at the access router connecting a university to its ISP. It contains flow information about most of the university's incoming and outgoing traffic and some internal traffic as well. The traces cover a period of two working days, namely from Wednesday, August 1st 2007, 00:00 to Thursday, August 2nd 2007, 23:59. The university has a /16 network providing connectivity to employees and students in its buildings and on the campus. The university is connected to its ISP through a 10 Gbps optical link with an average load of 650 Mbps and peaks up to 1.0 Gbps.

Please note that this trace consists of NetFlow datagrams, captured between the flow exporter and the flow collector. As such, to obtain the raw flow data, the trace should be imported into or replayed to a flow collector, such as nfcapd. For more information on how this works, we refer to the following tutorial on flow monitoring.
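
To illustrate what such datagrams contain, the sketch below decodes a single NetFlow version 5 export datagram (the UDP payload) with Python's struct module, following the commonly documented v5 layout of a 24-byte header followed by 48-byte flow records. It is not a replacement for a flow collector such as nfcapd. Extracting the UDP payloads from the trace is not shown, and the function name and the selection of returned fields are illustrative.

```python
import socket
import struct

# NetFlow v5 header: version, count, sys_uptime, unix_secs, unix_nsecs,
# flow_sequence, engine_type, engine_id, sampling_interval (24 bytes).
HEADER = struct.Struct("!HHIIIIBBH")

# NetFlow v5 record: srcaddr, dstaddr, nexthop, input, output, dPkts, dOctets,
# first, last, srcport, dstport, pad1, tcp_flags, prot, tos, src_as, dst_as,
# src_mask, dst_mask, pad2 (48 bytes).
RECORD = struct.Struct("!4s4s4sHHIIIIHHBBBBHHBBH")


def parse_netflow_v5(payload):
    """Decode one NetFlow v5 datagram and return a list of flow dictionaries."""
    if len(payload) < HEADER.size:
        raise ValueError("payload shorter than a NetFlow v5 header")
    version, count = HEADER.unpack_from(payload, 0)[:2]
    if version != 5:
        raise ValueError(f"unexpected NetFlow version {version}")

    flows = []
    for i in range(count):
        offset = HEADER.size + i * RECORD.size
        if offset + RECORD.size > len(payload):
            break  # datagram is truncated; keep what was decoded so far
        fields = RECORD.unpack_from(payload, offset)
        flows.append({
            "src": socket.inet_ntoa(fields[0]),
            "dst": socket.inet_ntoa(fields[1]),
            "packets": fields[5],
            "octets": fields[6],
            "src_port": fields[9],
            "dst_port": fields[10],
            "protocol": fields[13],
        })
    return flows
```

Because the addresses in this trace were anonymized, the decoded source and destination addresses will not correspond to the original hosts.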

Software

Some analysis software is described in this PDF document and can be downloaded from here.

The source code of the application based on the AnonTool API, which was used to anonymize the NetFlow data, can be found here.

Other Trace Sources