The Virtual Center for Network and Security Data

Sponsored by the Department of Homeland Security (DHS) Science & Technology (S&T) Directorate

Project Summary

Today's networked systems are being attacked with increasing frequency and intensity. To fully understand the scope and impact of these attacks, and to develop defensive mechanisms, researchers require Internet-wide datasets. These datasets go beyond simple single-point packet traces and provide a broad view of events with rich, correlated data.

To meet the demand for a repository of such datasets, we have: (1) engaged potential data providers and secured their commitment to participate; (2) coordinated the creation of meta-data for the repository; (3) created a query interface for searching for data sources based on the meta-data; (4) provided access to the data made available by member institutions; and (5) provided centralized aggregation and storage of specific data sets.

Virtual Repository Datasets:

To jump-start Internet-scale research, we have developed a virtual data repository of rich, correlated datasets representing Internet-scale behaviors. Data available from this virtual repository includes both infrastructure-level data and data from distributed forensics tools. A few examples:

  • Blackhole data, collected by monitoring dark (unused) address space. Because the address space is unused, any traffic destined to it can be considered malicious. The provided sensor data covers over 17 million unused addresses.
  • Attack and intrusion activity logs from NIDS and firewalls, collected at over 1600 networks worldwide and supplied by WAIL and by dshield.org.
  • Exterior routing protocol (i.e., BGP) and NetFlow data from the routers that make up the peering edge of the MichNet network. MichNet has a robust measurement infrastructure that monitors flow and routing information on 18 peering and backbone routers and over 500 interfaces.
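
As a rough illustration of the blackhole idea described in the first example, the sketch below flags any packet whose destination falls inside monitored unused address space. It is not the project's sensor code: the prefixes (taken from reserved documentation ranges), the function names, and the (source, destination) record format are all assumptions made for the example.

```python
import ipaddress

# Hypothetical dark (unused) prefixes. Documentation ranges stand in
# for the real monitored address space, which is not given here.
DARK_PREFIXES = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.0/24"),
]


def is_dark(dst):
    """Return True if the destination address lies in a monitored unused prefix."""
    addr = ipaddress.ip_address(dst)
    return any(addr in net for net in DARK_PREFIXES)


def flag_blackhole_traffic(packets):
    """Keep the (src, dst) records whose destination is in unused space.

    Nothing legitimate should be sent to unused addresses, so every
    record returned here is treated as suspect.
    """
    return [(src, dst) for src, dst in packets if is_dark(dst)]


# Assumed record format: (source address, destination address).
packets = [
    ("203.0.113.5", "192.0.2.10"),
    ("203.0.113.5", "10.0.0.1"),
    ("203.0.113.9", "198.51.100.77"),
]
flagged = flag_blackhole_traffic(packets)
```

Here the first and third records are flagged because their destinations fall inside the assumed dark prefixes; the second is ignored.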

Current and Past Virtual Repository Participants:

As part of this work, we have brought together a diverse set of consortium partners representing tier-1 ISPs, national research networks, and existing global data collection infrastructure. The goal is to provide a wide range of relevant data sources that together offer a global perspective on Internet behavior. Some current and past participants include:

DHS Project Portal

Researchers can request access to datasets through the DHS PREDICT portal: https://predict.org