Vulnerability Scanning in the Cloud – Part 1

This is the first in what may become a series of posts about vulnerability scanning in the cloud, the related challenges, and some helpful tips.

I started looking around for any good sources or posts on the topic of vulnerability scanning in the cloud, in this case an infrastructure as a service (IaaS) scenario for private or public cloud. I didn’t find anything.

How it starts..

When you get the email or call that goes something like, “Hey, what are we scanning in our <vendor name here> cloud?”… In a perfect world you just say, “When we worked with everybody to set up and plan our cloud usage, we had vulnerability scanning designs built in from the beginning. We are good.”

Or, you start sweating and realize nobody ever brought you into any planning or discussions, and there are already large cloud deployments that you aren’t scanning.

Or maybe you are a consultant or service provider going into an environment and setting up vulnerability scanning in a customer cloud. These posts should be helpful for people who are in the planning stages or trying to “catch up” with their cloud groups or customers.

Dynamic nature of the cloud

Most clouds give you the ability to dynamically provision systems and services, and as a result, you dynamically provision IP addresses. Sometimes these IP addresses are from a certain range, and often, especially for Internet facing systems, these IP addresses are from a large pool of addresses shared with other customers.

In these large dynamic ranges, it is common for the IP address you used today to be used by another customer tomorrow.

This dynamic nature is great for operations, but it can cause some challenges in tracking assets.

Asset management is different

Traditional vulnerability management has been very tied to IP addresses and/or DNS names. In cloud scenarios, assets are often temporary, or may not have DNS names at all. Sometimes your DNS names for PaaS-type services are provisioned by the cloud provider, with little or no control from your IT group.

Most cloud providers have their own type of unique identifiers for assets. These unique identifiers are what should be used for asset tracking; IP addresses, and sometimes DNS names, are just point-in-time metadata for your asset.

Also, cloud has different types of “objects” that can be given IP addresses beyond traditional compute system interfaces. Certain services can be provisioned in cloud from a PaaS solution that are dedicated to your tenancy/account, and they get their own IP address. Are these your assets? Many times you may have some control over the content and data on these services even though you don’t manage most of the underlying solution.

In general, the whole approach for asset management in the cloud is that your assets are tracked by the cloud provider, and you use their APIs to query and gather information on your assets.

Your vulnerability analysis and asset analysis need to become dynamic and based on the data returned from your asset queries. This is definitely not a bad thing. Most big companies struggle with solid asset management because there are always ways to circumvent traditional asset management. (This is why network-traffic-based asset management is becoming so popular.)

Now, with cloud, as long as you are using the API and know what tenancies you have, you can get a good list of assets. However, this list is short-lived, so you need to consistently query the APIs to keep it current. Some cloud providers can provide a “push” notification or “diffs” of what has come online or gone away in X amount of time. I think that is the future best practice of cloud asset management: real-time visibility into what is coming and going.
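To make that concrete, here is a minimal sketch of API-driven asset inventory, using AWS and the boto3 SDK purely as an example (other providers have equivalent list/describe APIs). The instance ID is the durable identifier; the IP addresses are just the metadata attached to it at that moment. Diffing two snapshots of this output gives you a rough "what came and went" view until your provider offers push notifications.

```python
# A minimal sketch of API-driven asset inventory, assuming an AWS account
# and the boto3 SDK; other providers have equivalent list/describe APIs.
import boto3

def list_ec2_assets(region="us-east-1"):
    """Return a snapshot of instances keyed by their cloud-native ID."""
    ec2 = boto3.client("ec2", region_name=region)
    assets = {}
    paginator = ec2.get_paginator("describe_instances")
    for page in paginator.paginate():
        for reservation in page["Reservations"]:
            for inst in reservation["Instances"]:
                # The instance ID is the durable identifier; the IPs are
                # just point-in-time metadata that may be reused later.
                assets[inst["InstanceId"]] = {
                    "state": inst["State"]["Name"],
                    "private_ip": inst.get("PrivateIpAddress"),
                    "public_ip": inst.get("PublicIpAddress"),
                }
    return assets

if __name__ == "__main__":
    snapshot = list_ec2_assets()
    for instance_id, meta in snapshot.items():
        print(instance_id, meta)
```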

 

Capacity is costly..

One major concept and value of cloud is only using and paying for capacity you need.

When it comes to information technology, this “costly capacity” in IaaS essentially comes down to:

  1. Network usage (sending data over the network)
  2. Storage usage (disk space, object space, etc.)
  3. Compute usage (CPU)

Classic vulnerability scanning can typically be performed in two different ways:

  1. Either scanning over the network from a scanning system, or
  2. By installing a local agent/daemon/service on the host that reports up the vulnerability data.

Both of these approaches use all three types of capacity mentioned above in your cloud, but mostly network and CPU.

Scanning over the network — Network Usage

Your cloud vendor’s software-defined networking can have huge capacity, or it could remind you of early-’90s home networking.

One of the major considerations for network based scanning is determining where your bottlenecks are going to be.

  • Do you have virtual gateways or bandwidth caps?
  • Do you have packet rate caps?
  • Are you trying to scan across regions or networks that may be geographically dispersed, with high latency and/or low bandwidth?

Cloud networking doesn’t just “work”; in many cases it is far more sensitive than physical networks. You need to carefully look at the network topology for your cloud implementations and base scanner placement on your topology and bottleneck locations. Depending on your network security stack, you may even need or want to avoid scanning across those stacks.

Agents

Agent-based scanning is starting to be one of the preferred options in some cloud IaaS implementations, because every host can simply report up its vulnerability data when it comes online. This is a nice approach if you have good cooperation from your infrastructure groups to allow your agent to be deployed to all systems.

However, agents likely will not be able to go on every type of resource or service with an IP, such as third-party virtual appliances. You will still need network scanning to be able to inspect some virtual systems or resource types, such as PaaS-deployed services.

Most agents typically lack the ability to see services from the perspective of the “network”, which is often where the most risk resides. For example, they can’t talk to all the services and see the ciphers or configurations being exposed to network clients.
 
So, regardless of what you may have been told, there is no cloud or vendor provided vulnerability scan agent that will give you full visibility into your cloud resources. You still need network scans.
 

Even though agents won’t solve all your problems,  you probably won’t be hitting packet rate caps or throughput issues, since they mostly just push up their data in one stream on a regular schedule. So agents can allow you to avoid some of the network issues you might hit otherwise.

 
Here are some questions you need to consider for vulnerability scanning in the cloud…
 
  • How much CPU impact will there be from network scanning or agent scanning? The act of scanning will use some capacity.
 
  • Should you size your cloud capacity to allow for vulnerability management? (yes)
 
In summary, vulnerability management in the cloud is different.
 
Why?
 
  • Dynamic assets
  • API-driven asset management
  • Cloud has more “things” as a service than any one solution can handle:
    • Container services
    • PaaS
    • Functions/serverless
    • SaaS/services
     

How to handle vulnerability management in the cloud?

  • Take a look at all the services your cloud provider offers that you are planning to use.
  • Create an approach for each type of scenario and thing that will be used.
  • Some cloud providers are starting to build in some amount of vulnerability management natively into their platforms. Leverage these native integrations as much as possible.

Scanning Large Container Registries

As container technology adoption grows, the need to provide governance and inspection of these containers and platforms also grows.

One of the nice things about container images is that they are easier to analyze than a traditional application (which may be spread across many directories and files) since everything you need to analyze exists in that container image somewhere.

Container vulnerabilities bring a converged vulnerability footprint of both application and operating system package vulnerabilities. This means your container needs to be treated like an application in some respects, but you also need to analyze the dependencies that sit alongside the application inside the container, which are often Linux packages in the case of Linux-based containers.

Most of the container scanning solutions out there are fairly immature in that they still mostly treat containers like a virtual machine. They ask the container to dump out its package list (dependencies) and create a finding if they are not at the latest version. Unfortunately, this approach completely ignores the application and/or application runtime itself in many cases. As container scanning solutions mature, they are going to need to differentiate themselves by how well they can analyze the application and application runtimes that exist in containers.

One good way to work around this lack of toolset convergence is to:

  • Scan & analyze application artifacts before they are allowed to be layered onto a container, then
  • Scan the container itself after it is built.

This way you are covering both the application and its dependencies.
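As a rough sketch of that two-step gate, assuming a scanner CLI such as Trivy is installed (`trivy fs` for the build context, `trivy image` for the built image); substitute whichever artifact and image scanners your organization actually uses:

```python
# A rough sketch of the two-step gate described above, assuming a scanner
# CLI such as Trivy is installed; substitute your own scanners as needed.
import subprocess
import sys

def run(cmd):
    """Run a command and return its exit code."""
    print("+", " ".join(cmd))
    return subprocess.run(cmd).returncode

def gate(build_context_dir, image_ref):
    # Step 1: scan the application artifacts/dependencies before they are
    # layered into the image. --exit-code 1 makes the scanner fail the step
    # when it finds vulnerabilities.
    if run(["trivy", "fs", "--exit-code", "1", build_context_dir]) != 0:
        sys.exit("application artifact scan failed; not building the image")

    # Build the image (stand-in for your normal build step).
    if run(["docker", "build", "-t", image_ref, build_context_dir]) != 0:
        sys.exit("image build failed")

    # Step 2: scan the built container image (OS packages plus whatever else
    # ended up in the layers).
    if run(["trivy", "image", "--exit-code", "1", image_ref]) != 0:
        sys.exit("container image scan failed")

if __name__ == "__main__":
    # Hypothetical build context and image name.
    gate(".", "registry.example.com/myapp:candidate")
```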

Some challenges with scanning container repositories and registries:

  • Huge registries and/or repositories of container images.

Some large registries may have hundreds or thousands of different repositories. Each repository could have hundreds of container images. This can easily lead to registries that have tens or hundreds of thousands of container images. I imagine we will soon see registries with millions of container images if they don’t already exist.

Most container scanners know not to rescan things they have already seen, but the first scan on large registries can take a very long time in many cases.

This huge volume of containers can cause a few challenges, and here are some ideas on how to overcome those challenges.

  • Your repo/registry scanner must be designed to scale out or up to handle tens of thousands of containers. This usually means:
  • The container scanner backend must track the container layer hashes and container hashes to know what it has not already scanned. It obviously shouldn’t scan layers or images it has already scanned.
  • The container scanner backend must be able to handle multiple concurrent scans against multiple images or repositories. It should be able to scale up if needed. This means your scanner backend design has to be able to handle multiple concurrent scanners and be able to distribute work between them properly.
  • The container scanner should implement shortcuts to know if it has already scanned images from a registry without necessarily checking every layer and image hash. If you pull down a registry manifest with 10,000 images, the next time you pull the manifest, you should try to diff the manifests to determine which images are “new” and scan those first (see the sketch after this list).
  • A good approach is for container scanner companies to “pre-load” containers and container layers from public registries. This way you may be able to avoid even having to scan many of the layers of the containers.
  • Container scanners should natively support the main container registries in cloud providers like Azure, Google, etc., by knowing how to use their APIs enough to access the container registries and repositories they provide.
  • A container scanner should usually try to scan in a LIFO fashion, scanning newer images first. This can be difficult because container tags and version tags are not very structured. You can try to scan all “latest” tags first. One field I think would be a valuable addition to the Docker registry manifest is the timestamp of the image. Since tags are not structured enough to be reliable, you could use the timestamp or epoch to at least know when the container was last modified or placed in a repo.
  • You want to use the LIFO approach because newer containers are the ones most likely to be used, and the ones that need to be analyzed as part of CI/CD integrations.
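Here is a sketch of the “diff the registry, scan the new stuff first” idea using the Docker Registry HTTP API v2. It assumes anonymous access to the registry (real registries will want auth tokens), the registry URL is a placeholder, and a local JSON file stands in for the scanner backend’s record of already-seen digests.

```python
# A sketch of "diff the registry, scan the new stuff first", using the
# Docker Registry HTTP API v2. Assumes anonymous access to the registry;
# real registries need auth tokens, and the seen-digest store here is
# just a local JSON file for illustration.
import json
import pathlib
import requests

SEEN_FILE = pathlib.Path("seen_digests.json")
MANIFEST_MEDIA_TYPE = "application/vnd.docker.distribution.manifest.v2+json"

def list_repositories(registry):
    return requests.get(f"{registry}/v2/_catalog").json().get("repositories", [])

def list_tags(registry, repo):
    return requests.get(f"{registry}/v2/{repo}/tags/list").json().get("tags") or []

def manifest_digest(registry, repo, tag):
    # A HEAD request returns the content-addressable digest for the tag.
    resp = requests.head(
        f"{registry}/v2/{repo}/manifests/{tag}",
        headers={"Accept": MANIFEST_MEDIA_TYPE},
    )
    return resp.headers.get("Docker-Content-Digest")

def new_images(registry):
    seen = set(json.loads(SEEN_FILE.read_text())) if SEEN_FILE.exists() else set()
    queue = []
    for repo in list_repositories(registry):
        for tag in list_tags(registry, repo):
            digest = manifest_digest(registry, repo, tag)
            if digest and digest not in seen:
                # "latest" tags jump the queue as a rough newest-first heuristic.
                queue.append((0 if tag == "latest" else 1, repo, tag, digest))
                seen.add(digest)
    SEEN_FILE.write_text(json.dumps(sorted(seen)))
    return [item[1:] for item in sorted(queue)]

if __name__ == "__main__":
    for repo, tag, digest in new_images("https://registry.example.com"):
        print(f"scan candidate: {repo}:{tag} ({digest})")
```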

Those are my thoughts on scanning large container registries and repositories. Do you have any thoughts on optimizing container scanning for large registries? I imagine similar work has been done on different types of file or artifact scanning in the past. It seems like we always try to “reinvent the wheel” in security products for some reason.

3 Types of Vulnerability Scanning – Pros and Cons

The 3 Main Types of Vulnerability Scanning Approaches

 

There are three major types of vulnerability scanning you can use on your networks. Most large organizations will have to use all three (or at least a couple) of these methods.

  • Unauthenticated Network Based Scanning

  • Authenticated Network Based Scanning

  • Agent Based Scanning

This post will go over the differences between these methods and explain why a combination of methods is typically needed. (This is standard network and host scanning. Containers will be covered in a different post.) Yes, passive network scanning exists too; I don’t feel knowledgeable enough on that yet to speak to it.

Back in 2011 I posted a quick explanation of some of the differences between authenticated and unauthenticated scans. Not much (if anything) has changed since then with regard to the differences between those two types of scans. However, I will add some more details on the differences in this post.

Unauthenticated Network Based Scanning

These are scans that you run from a system with “scan engine” software or an appliance-type system. These scans run across a network, targeted at other systems, without knowing anything about the targeted systems other than the IP address or DNS name.

No credentials are provided in these types of scans.

The unauthenticated scan has to mostly guess at everything it tells you about the target system, because all it can do is probe the ports and services you have open on your system and try to get it to give up information (a bare-bones illustration of this follows the pros/cons list below).

  • Cons – More false positives. (It is guessing.)
  • Cons – Less detailed information. (It is still guessing.)
  • Cons – May require more network connections than authenticated scans.
  • Cons – You are more likely to impact legacy services or applications that do not have authentication or input sanitation.
  • Cons – You have to maintain access to your targets through firewalls, IDS, IPS, etc.
  • Cons – You have to manage a scanner system (or systems).
  • Pros – Only shows the highest-risk issues.
  • Pros – Gives you a good view of the least capability an attacker on your network will have. Any script kiddie will be able to see anything an unauthenticated scan shows you.
  • Pros – Is often faster than an authenticated scan.
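For illustration, here is roughly what an unauthenticated scanner is reduced to doing, boiled down to a few lines: connect to a port, read whatever the service volunteers, and infer from that. The target hostname below is a placeholder; only scan systems you are authorized to scan.

```python
# A minimal illustration of what an unauthenticated scan has to work with:
# connect to a port, read whatever banner the service volunteers, and guess.
import socket

def grab_banner(host, port, timeout=3.0):
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            sock.settimeout(timeout)
            try:
                return sock.recv(1024).decode(errors="replace").strip()
            except socket.timeout:
                return "(open, but no banner volunteered)"
    except OSError:
        return None  # closed, filtered, or unreachable

if __name__ == "__main__":
    # Hypothetical target host.
    for port in (21, 22, 25, 80, 443):
        banner = grab_banner("scanme.example.com", port)
        if banner is not None:
            print(f"{port}/tcp open: {banner}")
```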

Authenticated Network Based Scanning

These are scans that you run from a system with “scan engine” software or an appliance-type system. These scans run across a network, targeted at other systems, but provide login credentials for the targeted system that allow the network scanner to get a command shell (or similar access) so it can simply run commands and check settings on the targeted system. This allows for much more accurate and detailed information to be returned.
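As a bare-bones sketch of what that looks like under the hood (assuming a Linux target reachable over SSH and the paramiko library; the host, account, and key path are placeholders), the “scan” is really just logging in and running read-only commands:

```python
# A bare-bones sketch of what an authenticated scan does under the hood:
# log in, run read-only commands, and parse the output. Assumes a Linux
# target reachable over SSH and the paramiko library.
import paramiko

READ_ONLY_CHECKS = {
    "os_release": "cat /etc/os-release",
    "kernel": "uname -r",
    # Debian/Ubuntu example; an RPM-based host would use "rpm -qa" instead.
    "packages": "dpkg -l 2>/dev/null | head -25",
}

def authenticated_checks(host, username, key_path):
    client = paramiko.SSHClient()
    client.set_missing_host_key_policy(paramiko.AutoAddPolicy())
    client.connect(host, username=username, key_filename=key_path)
    results = {}
    try:
        for name, cmd in READ_ONLY_CHECKS.items():
            _stdin, stdout, _stderr = client.exec_command(cmd)
            results[name] = stdout.read().decode(errors="replace").strip()
    finally:
        client.close()
    return results

if __name__ == "__main__":
    # Placeholder host, service account, and key path.
    for check, output in authenticated_checks(
        "10.0.0.5", "scan-svc", "/etc/scanner/scan_key"
    ).items():
        print(f"== {check} ==\n{output}\n")
```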

You will never get 100% authenticated scanning success on large networks because of the variety of system types and authentication methods required. You will probably not be able to get into every appliance, printer, IoT device, etc., so 100% is not typically a realistic goal for diverse environments.

  • Pros – Fewer false positives. (Much less guessing.)
  • Pros – More detailed information. (Again, it doesn’t have to guess anymore.)
    • You can now see things like missing patches, specific OS versions, and locally installed third-party client software versions.
  • Pros – May require fewer network connections than unauthenticated scans.
  • Pros – You are less likely to impact third-party legacy services or applications that do not have authentication or input sanitation, because the scanner doesn’t have to guess about the service.
  • Pros – You can now gather configuration information off the system to help feed a CMDB or perform configuration baseline checks. You are now a configuration checking tool and not just a vulnerability checking tool.
  • Cons – Still has most of the same impact on custom-written socket servers/services.
  • Cons – You are now awash in a sea of data (vulnerability data) about that system.
  • Cons – Risk assessment requires more analysis because instead of a handful of findings from an unauthenticated vulnerability scan, you may now have 30-40 findings.
  • Cons – Is often slower than an unauthenticated scan, because it is running specific commands from a shell on the system and waiting for the results. This is not always the case, and in some cases authentication may actually speed up scans.
  • Cons – You have to maintain access to your targets through firewalls, IDS, IPS, etc.
  • Cons – You have to manage a scanner system (or systems).

Agent Based Scanning

Agent-based scanning requires the installation of a daemon/agent on Linux and Unix systems, or a “Service” on Windows systems. I will refer to this as an “agent” from now on.

The agent is installed locally on the targeted systems, runs on a schedule, and reports the data up to a centralized system or SaaS service. Vulnerability scan agents are usually fairly lightweight, but the different variations and vendors all have their own quirks. I highly recommend you perform testing on a variety of systems, and talk to existing, similar clients using the vendor’s agents, before going with this approach.

One of the big pitfalls with an agent is that it cannot fully talk to the target system’s network stack like a network-based scanner can. So if you have an nginx service that is misconfigured, the agent likely won’t report that as an issue, while a network-based vulnerability scan would.

This lack of capability to simulate a network client is the big gap in agent functionality. As a result, you cannot truly get a “full” vulnerability picture without running at least an additional network based scan. In some cases, the agent data may be good enough, but that is a decision up to each organization.

Agents are good solutions for systems like mobile laptops that may rarely be on the corporate network, or for some public cloud scenarios where you can’t maintain full network scanner access across a network to the target host.
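A toy sketch of the agent model is below: collect some local facts and push them out over a single outbound HTTPS connection on a schedule. The backend URL and token are hypothetical placeholders, and a real agent does far more (and far more carefully), but it shows why the network footprint is so small.

```python
# A toy sketch of the agent model: collect local facts and push them out
# over one outbound HTTPS connection. The backend URL and token are
# hypothetical placeholders; real agents do far more, far more carefully.
import platform
import socket
import subprocess
import requests

BACKEND_URL = "https://vuln-backend.example.com/api/v1/checkin"  # placeholder

def collect_facts():
    try:
        packages = subprocess.run(
            ["dpkg-query", "-W", "-f", "${Package} ${Version}\n"],
            capture_output=True, text=True,
        ).stdout.splitlines()
    except FileNotFoundError:
        packages = []  # non-Debian host; a real agent handles rpm, etc.
    return {
        "hostname": socket.gethostname(),
        "os": platform.platform(),
        "kernel": platform.release(),
        "packages": packages,
    }

def check_in():
    # One outbound push on a schedule; no inbound ports, no network scanning.
    resp = requests.post(
        BACKEND_URL,
        headers={"Authorization": "Bearer PLACEHOLDER-TOKEN"},
        json=collect_facts(),
        timeout=30,
    )
    resp.raise_for_status()

if __name__ == "__main__":
    check_in()
```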

  • Pros – Fewer false positives. (Much less guessing. The agent is installed on the system and just asks for the information.)
  • Pros – More detailed information. (Again, it doesn’t have to guess anymore.)
    • You can now see things like missing patches, specific OS versions, and locally installed third-party client software versions.
  • Pros – Requires far fewer network connections. Usually just an outbound push of data.
  • Pros – The system with the agent can report up its data from anywhere to your SaaS backend, or potentially into an Internet-connected backend if that is your design scenario. So the scanner just resides with each host.
  • Pros – You are less likely to impact third-party legacy services or applications that do not have authentication or input sanitation, because the agent doesn’t talk to the network stack and services like a network client.
  • Pros – You can now gather configuration information off the system to help feed a CMDB or perform configuration baseline checks. You are now a configuration checking tool and not just a vulnerability checking tool.
  • Cons – You are now awash in a sea of data (vulnerability data) about that system.
  • Cons – Risk assessment requires more analysis because instead of a handful of findings from an unauthenticated vulnerability scan, you may now have 30-40 findings.
  • Cons – You now have an agent and piece of software on every target system that you (or some team) has to own and somewhat manage. Since every company has slightly different ways this is done, it adds a layer of complexity and overhead compared to running a scan across the network.
  • Pros – You have to maintain far less network access (usually just an outbound connection); IDS, IPS, WAFs, etc. don’t matter anymore.
  • Cons – You now have to manage an agent, and are now a customer and user of every target system.
  • Cons – Your agent may (will) get blamed, and sometimes rightly so, for impacting performance on a system.

So what is the best solution?

Like almost everything in IT and IT Security, the best solution depends on your requirements. Most larger organizations want the verbose data that an authenticated scan or agents provide.

With most people using laptops these days, classic network-based vulnerability scanning is going to miss a lot of assets that an agent will be able to cover.

Datacenter implementations may be covered fine with authenticated scanning, and in that scenario, not having to manage an agent or get pulled into every performance issue (because you have something running on the system) may reduce headaches.

Public IaaS hosts may require unauthenticated scanning from an Internet-based scanner, plus an agent on the host, to get the full picture of data.

Ultimately, the right approach is the one that meets your requirements and fits within your funding and capabilities.

Payment Card Security In The News

On Feb 4th, 2014, I gave a high level presentation to our Northwest Arkansas ISSA chapter regarding Payment Card Security. Unfortunately, the roads were icy that day, so there were only a few of us in attendance.

I felt like this was a presentation that both technical and non-technical attendees would find interesting due to all of the credit card security topics that had been in the news over the holidays.

Below is a LibreOffice Impress document with the contents of the presentation.

Payment_Card_Security_Feb_2014

When Is the Best Time To Run Vulnerability Scans?

It Depends…

There are several factors to consider when determining the times to run vulnerability scans.

Is this the first time you have run this scan?

Is the scan going to run against an ecommerce site?

Do you have standing approval from your operational areas to run a scan?

Do you have security monitoring and logging systems that will alert on the scanning?

Contact the administrators of your websites to determine the best times to run a vulnerability scan.

Most site admins will know their peak periods of website activity; it is best to avoid those periods for routine scanning simply because the scans increase the load on the site.

Scans can often cause increased error logging and alerting. So you need to be extra diligent and careful the first time you run scans. Assume that you may break things the first time.

  • Talk to the stakeholders for the systems you are scanning to determine the best time to scan.
  • Notify the stakeholders and any support areas that may be involved if there are issues or alerts generated by the scan.
  • Follow your normal change control management procedures and treat initial scans like a system change.

One piece of information that your stakeholders will need to know is the source where your scans will originate. They may want to whitelist or ignore those IP addresses in their monitoring.

If you are able to perform vulnerability scanning on your network and e-commerce sites without anybody noticing, then you likely have a gap in your ability to detect malicious scanning also. 🙂

 

How To Understand a Vulnerability Scan Report – Part 2 – The Network Port


Part 2 of a multi-part series explaining vulnerability scan data and nuances of how that data should be used. Part 1 was about IP addresses.

 

  • Network Port
    • This is the network port number (1 through 65535) where the vulnerability scanner found a vulnerability.
    • The port number is not always provided in some vulnerability scan reports, although it is a critical piece of information, as will be discussed below.
    • The teams that own the systems or applications with vulnerabilities will often be unfamiliar with network ports until they do some further research on their application or system.
    • In part 1 of this series it was discussed that a system can have more than one IP address. The level of complexity increases with ports, because each IP address can have up to 65,535 TCP and/or UDP ports.
    • It is very unusual for most IP addresses to have more than 100 or so ports open, so many vulnerability scanners will consider a system with many ports open to be a firewall configured to respond on all ports as a defensive measure.

     

    What does a port number tell me?

  • A listening port tells you that some piece of software is accepting network socket connections on a specific network port. Your vulnerability is in the software that is using that port.
  • The port number should be your starting point to determine which service or application is listening for incoming socket connections. The service or application listening on the port listed in your vulnerability scan is typically what has the vulnerability. (If you have access to the host, a quick port-to-process lookup like the sketch after this list can identify it.)
  • There are many common ports used that are easy to identify.
  • Once you know what the program or service is, your next step is often to contact the person or team responsible for managing that service or application.
  • One nice thing that most vulnerability scanners will do is give you the text response that they got from the port when they initially fingerprinted it.
    • This text is valuable because it will often give you the header/banner from the service, and it often has enough information for you to understand what the service is, even if you had no previous information about that port.
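If you do have access to the host, a quick local lookup can shortcut the guessing. This sketch assumes the psutil library and enough privileges to see other users’ sockets (typically root):

```python
# A quick local port-to-process lookup, assuming the psutil library and
# enough privileges to see other users' sockets (typically root). This is
# the host-side answer to "what is actually listening on that port?"
import psutil

def listening_ports():
    """Map each listening TCP port to the process that owns it."""
    owners = {}
    for conn in psutil.net_connections(kind="tcp"):
        if conn.status == psutil.CONN_LISTEN and conn.pid:
            try:
                proc = psutil.Process(conn.pid)
                owners[conn.laddr.port] = f"{proc.name()} (pid {conn.pid})"
            except psutil.NoSuchProcess:
                continue
    return owners

if __name__ == "__main__":
    for port, owner in sorted(listening_ports().items()):
        print(f"{port}/tcp -> {owner}")
```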

     

    Okay, that’s nice, but if I see a webserver vulnerability, I already know to call the webserver folks.

  • It’s not quite that easy. Run a port scan (all ports) on a heavily used server and you might be surprised how many HTTP/HTTPS servers are running.
    • Even dedicated webservers will often have many different instances of a webserver running, each one on a different port. Being able to tell the owning team the specific port that had the vulnerability finding is critical to being able to determine the source of the problem.
    • If the vulnerability is application related, knowing the port is likely how you will determine the application team that needs to remediate the vulnerability finding. The team that manages the webserver may know which application instance is running on which port, and can direct you to the proper application team.

    Load Balancing can throw you off.

  • Network Load Balancers can take traffic directed at one port on an IP address, and redirect that traffic to different ports on different IP addresses.
  • This can obviously cause some issues for you since you will see the port on the Virtual IP address on the load balancer as having the vulnerability.
  • This is a more common scenario you will face when scanning servers from outside a DMZ, from the Internet, or in a hosting or cloud environment.
  • It is critical for you to have the network load balancer configuration and be able to trace which IP addresses and ports are actually responding with vulnerabilities. Without this information you are stuck at the Virtual IP address without being able to go any further to find the true IP and port that has the vulnerability.