Monday, March 19, 2007

Investigating False Positives and Other Security Anomalies Part 2

In Part 1 of this series, I talked about investigating vulnerability scan results where the scanner alerted on something and further investigation revealed that the vulnerability was a leftover file from an upgrade. For example, the computer was upgraded from Microsoft Office XP to Office 2003. As far as Windows/Microsoft Updates and the enterprise patch management system are concerned, the computer is running Office 2003, fully patched for the installed software. An in-depth investigation was performed that involved going into the scanner session logs and finding out which file caused the scanner to alert on the vulnerability. Indeed, it turned out to be a leftover file from Office XP that Office 2003 doesn’t even use. Renaming or removing the file fixes the vulnerability, and Office continues to work normally, so all fixed, right? After all, it was a pretty straightforward fix: we knew that a Microsoft product was upgraded, the new Microsoft product didn’t clean up after the old version, and a vulnerability was left on the box. The solution of renaming an old Office file seemed logical, and the cause and the fix were clearly related.
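To illustrate the kind of cleanup described above, here is a minimal sketch of renaming a suspect leftover file while keeping a backup copy. This is not the exact procedure from the investigation; the file path shown is purely hypothetical, and in practice it would come straight out of your own scanner session logs.

```python
import os
import shutil
import sys

# Hypothetical leftover file flagged in the scanner's session log;
# the real path and name come from your own scan results.
SUSPECT_FILE = r"C:\Program Files\Common Files\Microsoft Shared\LegacyComponent.dll"

def quarantine_leftover(path):
    """Rename a suspected leftover file so the scanner no longer finds it,
    keeping the original bytes around in case an application needs them back."""
    if not os.path.isfile(path):
        print(f"Nothing to do: {path} not found")
        return
    backup = path + ".preupgrade.bak"
    if os.path.exists(backup):
        print(f"Backup already exists, leaving things alone: {backup}")
        return
    shutil.move(path, backup)
    print(f"Renamed {path} -> {backup}")

if __name__ == "__main__":
    quarantine_leftover(sys.argv[1] if len(sys.argv) > 1 else SUSPECT_FILE)
```

Keeping the renamed copy around is the point here: if an application later misses the file, you can put it back while you work out a proper fix.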


Not so fast! Let’s move on to the next type of scenario in the investigative process, one that is a little more difficult to troubleshoot. The vulnerability scanner alerts on something that experience showed was easily remediated by renaming or removing a file. The vulnerability was related to a leftover file, and getting rid of it resolved the vulnerability, at least for the time being. Later on, the computer is scanned again and the same vulnerability has returned. Nothing had changed. Nothing new was installed, and the same versions of the Office software are still on the machine. So let’s take a more in-depth look at this type of scenario and see what happened.


Scenario 3: A scan is run, and the now much discussed vulnerability related to MS Office products has appeared on several computers. The previously developed fix of renaming or removing the vulnerable leftover file proves successful. Later, these same computers are scanned again. Many of them show that the vulnerability has been successfully remediated, but on a few of them the vulnerability has reappeared. Investigation into the scan session logs shows that the previously renamed vulnerable file is again the culprit causing this vulnerability to appear. Physical inspection of the file system on the target computers verifies that the renamed file is still in its renamed form, but now another copy of the original vulnerable file is on the box. One interesting thing is noted about these computers: they all have something in common, namely a piece of third-party software (not Microsoft software) installed. The software title and vendor are not important here, and I don’t want to be accused (or worse) of name calling on the Internet, so I just won’t get into a name-calling session here.


Further in-depth troubleshooting reveals that renaming the vulnerable file again and performing an immediate scan shows the vulnerability remediated. Now for the next step: verifying that all of the software works. MS Office works fine, the corporate email client works fine, as do the web browser and other normally used applications. The computer is scanned, and the machine is still clean of the vulnerability. Since all of the computers with this problem had another piece of software in common, this particular application is tested last. The application in question is started up and produces an error. The error states that a DLL file is corrupt or missing and prompts the user to insert the original software CD for this application. This is done, and the software repairs itself. The application now runs normally. Another scan reveals that the vulnerability is present again. Looking at the folder on the computer where the vulnerable file resides, we see that, sure enough, the renamed file is still there, but the original vulnerable file has returned.
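One way to pin down which application is putting the file back is to snapshot the folder’s state before and after each application test. The sketch below is a minimal example of that idea; the file paths are hypothetical stand-ins for whatever file your scan logs identify.

```python
import hashlib
import os
from datetime import datetime

# Hypothetical paths; in practice they come from the scanner session logs.
ORIGINAL_FILE = r"C:\Program Files\Common Files\Microsoft Shared\LegacyComponent.dll"
RENAMED_FILE = ORIGINAL_FILE + ".preupgrade.bak"

def sha1_of(path):
    """Return the SHA-1 of a file, or None if it does not exist."""
    if not os.path.isfile(path):
        return None
    h = hashlib.sha1()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()

def snapshot():
    """Log whether the original vulnerable file has come back and what its hash is,
    so the folder state can be compared before and after each application is run."""
    stamp = datetime.now().isoformat(timespec="seconds")
    print(f"[{stamp}] renamed copy present: {os.path.isfile(RENAMED_FILE)}")
    original_hash = sha1_of(ORIGINAL_FILE)
    if original_hash is None:
        print(f"[{stamp}] original file absent: {ORIGINAL_FILE}")
    else:
        print(f"[{stamp}] original file PRESENT, sha1={original_hash}")

if __name__ == "__main__":
    snapshot()
```

Run it once after the rename, once after each application you test, and the moment the original file reappears you know exactly which program put it there.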

In this case, it is clear that another piece of software (not from Microsoft) is related to, and interacting with, the native Microsoft files of an MS Office installation. Unsure what to make of this, we call the vendor’s tech support, which reveals that the suspect Microsoft DLL may be used by their software, but they are not sure; this will have to be investigated further with the software developers. There are some known versions of the DLL file that are not vulnerable, so the hypothesis was that replacing the offending DLL with a non-vulnerable version would fix the problem. Replacing it with a non-vulnerable version allows the software to operate normally and error free. A re-scan of the computer now shows that it is vulnerability-free as well.
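Since the fix hinges on which version of the DLL is actually on disk, a quick version check helps confirm that the replacement took. The sketch below assumes the pywin32 package is available for reading the file’s version resource; the DLL path and the minimum fixed version are hypothetical placeholders for whatever the relevant security bulletin specifies.

```python
import win32api  # from the pywin32 package; assumed to be installed

# Hypothetical values: the real path and minimum fixed version come from the
# security bulletin covering the DLL in question.
DLL_PATH = r"C:\Program Files\Common Files\Microsoft Shared\LegacyComponent.dll"
MIN_FIXED_VERSION = (11, 0, 8000, 0)

def file_version(path):
    """Read the four-part file version from a PE file's version resource."""
    info = win32api.GetFileVersionInfo(path, "\\")
    ms, ls = info["FileVersionMS"], info["FileVersionLS"]
    return (ms >> 16, ms & 0xFFFF, ls >> 16, ls & 0xFFFF)

if __name__ == "__main__":
    version = file_version(DLL_PATH)
    verdict = "not vulnerable" if version >= MIN_FIXED_VERSION else "KNOWN VULNERABLE"
    print(f"{DLL_PATH}: version {'.'.join(map(str, version))} -> {verdict}")
```

A check like this can be run on every machine in the affected group before and after the DLL swap, so you are not relying on the scanner alone to tell you whether the replacement stuck.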

Note: As of this writing, the software company in question has no intention of fixing this vulnerability in their software. I was in communication with them today, and the tech support person I spoke with stated that the company will not be releasing a patch for this product; it is Microsoft's problem, evidently. This brings up the issue of a piece of third-party software latching onto a known application (Microsoft Office) for its functionality while the vendor is not keeping up with the security ramifications of their software installing known-vulnerable files onto a computer.


Investigations Start with the Patch and Scan Testing Process:

It is quite clear from the events discussed in the two parts of this article that a proactive strategy for patching and scanning is in order. Such a strategy will ensure that vulnerability scanning is built into the patch testing process so that 1) patches are verified as being applied and as having no adverse effects on the system, and 2) the vulnerabilities that each patch is meant to target are actually being remediated. Testing the patches as they are received will ensure that they apply properly and do not break applications. A follow-up deployment of the patches to a pilot group then gives them more rigorous testing in a real environment and allows IT staff to clear up any problems quickly before deploying to the full production environment. Once this is done, a follow-up scan on those same pilot computers will verify whether the applied patch mitigates the vulnerability. If it does, the desired goal was achieved. If it does not, it is time for an investigative process to find out whether 1) the patch is not doing its job, or 2) the scanner is alerting on a false positive condition. This process allows scanner alert anomalies to be discovered as soon as possible, and a fix to be developed, before the scan of the full production environment.


It is important to note that testing patches and developing vulnerability remediations can be tricky, in that hidden causes will sometimes not be found right away. This was evident when scenario 3, as described above, brought to light newly discovered problems in a situation that was thought to have been previously resolved. For this reason, it is important to carefully choose the users who will be in the pilot group for the second phase of patch testing. They should be fairly computer-savvy users who know how to properly respond to error messages and who also know how to carefully document any problems they run into. This is the group of people who will know that these errors may occur and won’t fly off the handle when they do. They will know to calmly notify their IT support staff, and won’t panic and click through all the error messages before the IT staff has had a chance to see them and work the issues. So, having said all that, let’s take a look at the chronological steps that would take place in this whole testing and investigative process.


The Steps (in chronological order):

  1. The new patches are released from the vendor and the new cycle of patch and scan testing begins.
  2. Non-production machines in a lab and/or virtualized environment are scanned and verified clean of all vulnerabilities before patch testing begins.
  3. All discovered vulnerabilities are remediated on the designated test machines before patch testing begins. Those that cannot be remediated are documented with the reason why they cannot be resolved (i.e., false positive, etc.).
  4. The new patches are first tested on the non-production machines in the lab or virtualized environment.
  5. All applications on the lab machines are tested to verify that they operate properly and that no errors are experienced on the machines.
  6. The scanner profile is verified to have the proper checks for the latest patches and other newly discovered conditions.
    • Note: This often happens after the new patches are released, and it can sometimes take a few days for the new scanning profiles to be configured on the scanner. However, steps 1 – 5 can be performed prior to the new scanner profiles being configured. Steps 7 and beyond, however, are dependent on the scanner being configured to look for the new patches that are being tested in this phase.
  7. A test scan is performed on the lab machines to verify that they are free of vulnerabilities. Any vulnerabilities found are investigated and resolved.
  8. Patches are deployed to the designated pilot group of production users.
  9. The designated pilot users are to use their computers for a pre-determined testing period. Three days to one week is recommended for this testing period.
  10. A sample of this pilot group is selected for another verification scan, and the scanner is run against these machines to verify that they are clean of the vulnerabilities that the new patches were meant to mitigate (a sketch of this comparison appears after this list).
    • Note: This step can be done concurrently with the operational testing period described in step 9.
    • Any vulnerability conditions related to the new patches that show up in this scan are investigated and documented, and solutions are determined.
  11. The new patches are deployed to the remainder of the production machines.
  12. A full scan of the production environment is run.
    • Note: The full scan of the production environment to look for the new patches should take place only after allowing sufficient deployment time. This will vary depending on the size and geographical diversity of the organization.
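For the verification scans in steps 7, 10, and 12, here is a rough sketch of comparing two scanner exports to see, per host, which targeted vulnerabilities persist after patching. The CSV layout ("host" and "vuln_id" columns) and the vulnerability IDs are assumptions made for illustration; a real scanner's export format will differ, so adjust accordingly.

```python
import csv
from collections import defaultdict

# Placeholder IDs for the vulnerabilities this patch cycle is supposed to fix.
TARGETED = {"VULN-101", "VULN-102"}

def load_findings(csv_path):
    """Read a scanner export (assumed columns: host, vuln_id) into {host: set_of_ids}."""
    findings = defaultdict(set)
    with open(csv_path, newline="") as f:
        for row in csv.DictReader(f):
            findings[row["host"]].add(row["vuln_id"])
    return findings

def report(baseline_csv, post_patch_csv):
    """For each host, show which targeted vulnerabilities were remediated
    and which still appear after the patches were deployed."""
    before = load_findings(baseline_csv)
    after = load_findings(post_patch_csv)
    for host in sorted(set(before) | set(after)):
        persisting = after.get(host, set()) & TARGETED
        fixed = (before.get(host, set()) & TARGETED) - persisting
        print(f"{host}: fixed={sorted(fixed)} still_present={sorted(persisting)}")

if __name__ == "__main__":
    report("baseline_scan.csv", "post_patch_scan.csv")
```

Anything that shows up in the still_present column is exactly the kind of finding that should go into the investigative process described earlier, rather than being written off as a false positive.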


Wrapping It All Up:

Having a standardized, methodical approach to patching and scanning will help give more structure to the whole process. Using a checklist, like the one above or a locally developed one, will help ensure that testing is performed properly. It is easy to overlook things, and very easy to be led down an incorrect path when investigating the types of situations mentioned in this series. It is important to use several different tools and to analyze the similarities and differences in the information that each of the tools provides.


So the lesson learned in this whole exercise is that IT staffs should be less prone to jumping on the “False Positive” bandwagon and more inclined to use research and investigative techniques to find out what is really happening. Don’t rely on just one analysis tool or set of data to draw a conclusion. Security is hard work, and often involves many steps to get it right. Dismissing even a single vulnerability by claiming that it is a false positive gets it off your to-do list, but it doesn’t actually clear anything up; your machines are still vulnerable. If the bits are on the box, you MUST remediate. Calling something a false positive when it is not does not constitute a valid remediation strategy.

Use some industry-respected assessment tools, come up with a good (consistent) methodology, search for clues, and above all else, do some research and investigation! As a line from the movie Apollo 13 goes: “Work the problem! Don’t make it worse by guessing!” Guessing that something is a false positive is a dangerous habit to get into.
