diff --git a/README.md b/README.md
index 67e6e3a..2e0cc9b 100644
--- a/README.md
+++ b/README.md
@@ -1,6 +1,29 @@
 # Malware Classifier From Network Capture
 
-*Malware Classifier* is a simple free software project done during an [university workshop of 4 hours](http://www.foo.be/cours/dess-20142015/Redis-Introduction.pdf). The objective of the 4 hours workshop was to introduce network forensic and simple techniques to classify malware network capture (from their execution in a virtual machine). So the software was kept very simple while using and learning existing tools (networkx, redis and Gephi).
+*Malware Classifier* is a simple free software project done during an [university workshop of 4 hours](http://www.foo.be/cours/dess-20142015/Redis-Introduction.pdf). The objective of the 4 hours workshop was to introduce network forensic and simple techniques to classify malware network capture (from their execution in a virtual machine). So the software was kept very simple while using and learning existing tools ([networkx](https://networkx.github.io/), [redis](http://www.redis.io/) and [Gephi](http://gephi.github.io/)).
+
+# How to use the Malware Classifier
+
+You'll need of a set of network packet captures. In the workshop, we used a dataset with more than 5000 pcap files generated from the execution of malware in virtual machines.
+
+```
+...
+0580c82f6f90b75fcf81fd3ac779ae84.pcap
+05a0f4f7a72f04bda62e3a6c92970f6e.pcap
+05b4a945e5f1f7675c19b74748fd30d1.pcap
+05b57374486ce8a5ce33d3b7d6c9ba48.pcap
+05bbddc8edac3615754f93139cf11674.pcap
+...
+```
+
+The filename includes the MD5 malware executed in the virtual machine.
+
+If you want to classify malware communications based on the Server HTTP headers of the (potential) C&C communication.
+
+```shell
+cd capture
+ls -1 . | parallel --gnu "cat {1} | tshark -E header=yes -E separator=, -Tfields -e http.server -r {1} | python ./bin/import.py -f {1} "
+```
 
 ## Notes for the student