Sometimes the discussion drifts into an argument about vulnerability assessment vs. vulnerability management. I have not seen many opinions on this debate on the internet, so I prefer to talk about it rather than about the difference between pentesting and vulnerability management. People may swap the labels for vulnerability management and vulnerability assessment, or give them different names, but the difference - and there is one, a very big one - remains the same.
Vulnerability assessment vs. vulnerability management
To introduce you to the concept of big data in vulnerability management, let me clarify some terms first. I'd like to talk for a minute about the difference between vulnerability assessment and vulnerability management. Careful, these are not academic definitions! Most people talk about either vulnerability management or vulnerability assessment, and the concept they usually mean is what I call vulnerability assessment. It can be part of a penetration test or a larger solution of its own. Quite a few people know vulnerability assessments from scanning machines for vulnerabilities, either for exploitation (pentesters) or for patching (patch management, vulnerability lifecycle, asset management). This does not, of course, imply that pentesting is only about exploiting client machines and then leaving the client alone with the remediation. Vulnerability assessment can be seen as one piece of a puzzle, a snapshot of a single system or a few systems. What I mean by vulnerability assessment may become a little clearer when I contrast it with its bigger brother, vulnerability management.
Vulnerability management operates on a different scale than vulnerability assessment. In fact, it is all about scaling up and coping with the problems that scale brings. We are talking about numbers here, in a large-scale enterprise context. We discuss vulnerability management once the focus shifts from "Hey, look at this rare vulnerability, let's sit down and look for the best remediation!" to "Ok, now we've got the scanner reports for these 8000 machines, let's sit down and find the best way to deal with the most critical vulnerabilities." Vulnerability management - in contrast to vulnerability assessment - is much more about prioritization and dealing with limited resources like time and budget. A lot of hackers may give these management aspects an eyeroll and rank them only one level above writing documentation when it comes to fun. I agree in many ways, but the fact is that this is a real issue for big enterprises. And this is also where a lot of the money sits. The peculiarities of vulnerability management have also been worked out in more detail before.
Big data comes into play
What I get from talking to experts who deal with vulnerability management issues is that we quickly drift into the area of big data when trying to work with such huge amounts of data. How will you start to remediate your vulnerabilities when you can't even handle the amount of output generated by your scanners? I've seen some custom scripts that process scanner reports to make them more structured or easier to handle. This is really the core of the problem: unstructured data and our inability to deal with it efficiently. Although all scanner solutions offer nice exports like CSV, I see a huge gap when it comes to analyzing or post-processing the masses of data. The common vulnerability scanner solutions lack maturity when it comes to handling large amounts of scanner output and generating meaningful reports from these mountains of data. I've not seen a single product that offers integrated data mining techniques like clustering, classification or pattern recognition.
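To give an idea of what such a custom post-processing script might look like, here is a minimal Python sketch. It is not taken from any real scanner: the column names (host, vuln_id, severity), the sample rows and the summarize function are all assumptions made up for illustration, and a real export will need its own column mapping. It collapses one row per finding into one row per vulnerability, counts the distinct affected hosts, and sorts by severity first and host count second.

```python
import csv
import io
from collections import defaultdict

# Hypothetical scanner export. Real products use different column names
# and many more fields; this layout is an assumption for illustration.
SAMPLE_EXPORT = """host,vuln_id,severity
10.0.0.1,VULN-A,9.8
10.0.0.2,VULN-A,9.8
10.0.0.3,VULN-B,5.0
10.0.0.1,VULN-B,5.0
10.0.0.4,VULN-C,7.5
10.0.0.2,VULN-A,9.8
"""


def summarize(csv_text):
    """Return (vuln_id, severity, affected_host_count) tuples, worst first."""
    hosts_by_vuln = defaultdict(set)
    severity = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        # A set removes duplicate findings for the same host.
        hosts_by_vuln[row["vuln_id"]].add(row["host"])
        severity[row["vuln_id"]] = float(row["severity"])

    summary = [
        (vuln, severity[vuln], len(hosts))
        for vuln, hosts in hosts_by_vuln.items()
    ]
    # Highest severity first; ties are broken by the number of affected hosts.
    summary.sort(key=lambda item: (item[1], item[2]), reverse=True)
    return summary


if __name__ == "__main__":
    for vuln, sev, host_count in summarize(SAMPLE_EXPORT):
        print(f"{vuln}: severity {sev}, {host_count} host(s)")
```

Sorting by severity and then by host count is only one possible ranking. With 8000 machines you would probably also want to weigh in how critical each asset is, which this sketch ignores.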
Security is a myth
At this point the focus starts to shift from being absolutely secure to being as secure as possible. Are we still talking about security at the dimensions of vulnerability management? Can anyone seriously assume they are secure once they reach a size of multiple millions of IPs? I'm not good at imagining such huge numbers, so let's try it with one million IPs. Still quite a lot, and difficult for a human mind to handle, isn't it? Ok, so let's try 100,000 IPs. This is fairly realistic for big medium-sized or small big-sized companies. Do you think a dedicated individual, let alone a dedicated group with sufficient resources ("Oh, hi there, APT1!"), won't be able to find a hole somewhere within those 100,000 boxes? Even if not all 100,000 computers are internet-facing, playing hide'n seek in there with an intruder can be a real pain. Think, furthermore, about all the administration that comes with so many computers as a side effect. Even supposedly basic tasks like asset management or patch management become a nightmare at scales like this.
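The intuition about 100,000 boxes can be made a bit more tangible with a back-of-the-envelope calculation. The sketch below rests on a strong simplifying assumption that is mine and not a measured fact: every box independently has the same small chance of containing an exploitable hole. The per-box probability of 1 in 10,000 and the function name are made up purely for illustration.

```python
def chance_of_at_least_one_hole(boxes, p_per_box):
    """Probability that at least one of `boxes` machines has a hole.

    Assumes each machine is independently vulnerable with probability
    `p_per_box`, which is a simplification for illustration only.
    """
    # P(no hole anywhere) = (1 - p)^n, so P(at least one) = 1 - (1 - p)^n.
    return 1.0 - (1.0 - p_per_box) ** boxes


if __name__ == "__main__":
    # Assumed per-box probability; not based on real data.
    p = 0.0001
    for boxes in (100, 8_000, 100_000, 1_000_000):
        chance = chance_of_at_least_one_hole(boxes, p)
        print(f"{boxes:>9} boxes: {chance:.4%}")
```

Under these assumptions, even a 1 in 10,000 chance per box leaves better than 99.99% odds that at least one hole exists somewhere among 100,000 boxes. Real networks are messier than this model, but the direction of the effect is the point.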
TL;DR: There are fine boundaries between vulnerability management, vulnerability assessment and penetration testing. Vulnerability management becomes a big data issue for large-scale operations.