Encrypted Web Traffic Dataset: Event Logs and Packet Traces

Authors

ŠPAČEK Stanislav, VELAN Petr, ČELEDA Pavel, TOVARŇÁK Daniel

Year of publication 2022
Type Article in Periodical
Magazine / Source Data in Brief
MU Faculty or unit

Institute of Computer Science

Citation
Web https://doi.org/10.1016/j.dib.2022.108188
Doi http://dx.doi.org/10.1016/j.dib.2022.108188
Keywords HTTPS dataset; TLS 1.2 encryption; Host-based data collection; Network data collection; Encrypted traffic analysis; Event-flow correlation
Description We present a dataset that captures seven days of monitoring data from eight servers hosting more than 800 sites across a large campus network. The dataset combines network monitoring data with host-based monitoring data. The first part consists of packet traces collected by a probe situated on the network link in front of the web servers. The traces contain encrypted HTTP over TLS 1.2 communication between clients and web servers. The second part is an event log captured directly on the web servers. The events are generated by Internet Information Services (IIS) logging and include both the default IIS features and custom features, such as client port and transferred data volume. All features in the dataset were carefully anonymized to prevent leakage of private information while preserving the dataset's information value. The dataset is suitable mainly for training machine learning models for anomaly detection and for identifying relationships between network traffic and events on web servers. We also provide tools, settings, and a guide for converting the packet traces to IP flows, which are often preferred for network traffic analysis.
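The event-flow correlation named in the keywords can be illustrated with a minimal Python sketch. It assumes that flows exported from the packet traces and events from the IIS log can be joined on client address and client port within a time window, and that anonymization maps addresses consistently in both sources. The record types, field names, and the `tolerance` parameter are hypothetical and do not reflect the dataset's actual schema.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class Flow:
    # Hypothetical flow record; field names are illustrative only.
    client_ip: str
    client_port: int
    start: float  # flow start, seconds since epoch
    end: float    # flow end, seconds since epoch


@dataclass
class Event:
    # Hypothetical IIS event record; field names are illustrative only.
    client_ip: str
    client_port: int
    timestamp: float  # seconds since epoch
    bytes_sent: int


def match_events_to_flows(
    events: List[Event], flows: List[Flow], tolerance: float = 1.0
) -> List[Tuple[Event, Optional[Flow]]]:
    """Pair each event with the first flow that shares its client IP and
    port and whose time span, widened by `tolerance` seconds on both
    sides, contains the event timestamp. Unmatched events get None."""
    # Index flows by client endpoint so each event lookup is cheap.
    by_endpoint: Dict[Tuple[str, int], List[Flow]] = {}
    for flow in flows:
        by_endpoint.setdefault((flow.client_ip, flow.client_port), []).append(flow)

    pairs: List[Tuple[Event, Optional[Flow]]] = []
    for event in events:
        candidates = by_endpoint.get((event.client_ip, event.client_port), [])
        match = next(
            (
                f
                for f in candidates
                if f.start - tolerance <= event.timestamp <= f.end + tolerance
            ),
            None,
        )
        pairs.append((event, match))
    return pairs
```

The tolerance window is there because clocks on the probe and on the web servers may not be perfectly synchronized; a suitable value would have to be determined from the data.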