Tag Archives: VMware

Rootkit in my lab? (Part III)

First, thanks for all the comments in the previous articles (Part I and Part II).

I decided to analyze one the crash I experienced during registry analysis.
I could reproduce all the time a BSOD with Regshot. I thought it would be nice to see what I could get with WinDBG.

I had my environment set up with the suspicious VM configured to debug activated on the serial port, which is a simple pipe on Mac OS X.
Another VM is configured with a serial port as the other end of this pipe, and WinDBG attached to it.
Another method would be to just configure Windows to create a crashdump file with kernel symbols, that you can later load into WinDBG. Of course, the first method offers more opportunities to check and play with the live system.

Then, I just boot the target and trigger the crash, simply by starting a scan with Regshot:

Windows then crashes, WinDBG catches the exception and stops.

So what do we have ?

First, the error type, PAGE_FAULT_IN_NONPAGED_AREA (50), means that an instruction pointed to an invalid memory address. Let’s check this.

With !analyse -v, you get the full error dump.

Crashing Instruction

It shows the function (nt!CmpGetValueKeyFromCache, offset 0x89) and the memory address where the crash was triggered.

The instruction at this address is:

80637807 f3a5 rep movs dword ptr es:[edi],dword ptr [esi]

This instruction is trying to copy 8 bytes at the address pointed by EDI.
EDI has the value of 0xe1285050 at execution time.

And what do we have at this memory location ?

EDI pointing to invalid memory section

Nothing indeed. Note that this corruption persists at every boot.

So what can we conclude?
We can certainly exclude hardware failure, because it is a virtual machine and because the corruption always occur at the same memory region, even after a reboot.
At least, I can now be sure that something in the kernel is definitely corrupted.

Could it be a rootkit trick? Still the question remains, but to me it now looks very, very suspicious. Some rootkit code, poorly written, could have sat in this non-paged memory area and been paged out, causing the BSOD. I have not much knowledge about it at this time but I am going to search on this. At least, I now have good starting point to look at.

That’s all for today, folks. I wrote it while I am still working on it, so sorry if it looks rough and incomplete. It is sort of live, thoughts are still in process.

Again, I am looking forward to reading your comments and suggestions. (Hopefully) there will be a part IV!

Rootkit in my lab? (part II)

I finished checking the RAM with Volatility and… I found nothing. Nada.

It’s a lot of fustration. There must be something just there, but my findings are certainly limited by my skills.

I attach here some of the main outputs of Volatility. As far as I can tell:

  • no evidence of injection or kernel hooking
  • no suspicious process
  • no suspicious driver
  • no suspicious registry entry
  • etc.

Based on my observations, I first tried to narrow my investigations (drivers and hooks) but as I could not find anything, I ended dumping most of Volatility outputs in hope to see something unusual. I also compared them with a fresh Windows XP SP3 install. I extracted keyboard related drivers (keyboard.sys, kbdclass.sys, i8042prt.sys), hashed them, scanned them: there were native. I am less sure on how to deal with the software certificate system, but I did checked all Microsoft and root certificates in the bank along with their signature with a clean system: nothing wrong.

Dear reader, any help or tip is welcomed! Am I missing something obvious? Could it be possibly not a rootkit but some kind of corruption? If so, how to detect it?

Just drop me an e-mail if you want to have a look on the dump itself.

Volatility outputs:

Rootkit in my lab?

Context

For now, I can’t tell much about the context, mainly because it may – or may not – involve other people. The only thing I am interested in is to spot the issue and understand precisely what is going on.

What makes the case really interesting though, is that it occurred on a fresh install of a Windows XP virtual machine. I aimed it to be a clean malware reversing snapshot. I noticed the weired behavior minutes after finishing the system install and setting up a bunch of reversing and live analysis tools.

So I bet that if I got some malware, it probably comes from one of those. At this time, unfortunately, there are too many and I could not spot the exact time, so I can not start the analysis from this angle.

This article is almost written in live, so pardon my mistakes. I will update it as soon as I find something new. Of course, I am really expecting your feedback, suggestions and corrections. I see it as a great opportunity to learn, even though this one may not be the easiest…

Symptoms

Two things alerted me quickly.

The first one was, at a point, the permanent failure of going through the full windows update process. Believe me, I have tried all ways.

The second one was the weird dialog when trying to access to the keyboard layout settings. It says “Incompatible driver detected“. To me, this looks like there is a keylogger somewhere…

Suspicious activities: the keyboard driver and windows update seem to be messed

Then, as I started to check around, more odd stuff came out.

I fired up Process Explorer, and soon realize that it was “unable to verify” the signatures of all the running Windows processes. I could not find anything else suspicious, though (no odd process, memory content looks normal, etc.).

On the left, Process Explorer fails to validate any Windows process.
On the right, expected behavior on a clean system.

Ok, while I am with the Sysinternal suite, why not scanning with Rootkit Revealer:

Rootkit Revealer cannot access to the SYSTEM hive of the registry

Interesting… and what about GMER:

GMER crashes when accessing the registry…

Oops! Now it crashes when it is accessing the registry…

For the fun, let’s see what happens if we try to set up an antivirus (Security Essentials):

Windows certificate warning when installing… Microsoft Security Essentials!!!

Nice one! Very suspicious! Note that after a full scan, Security Essentials reports me that the system is clean and everything is fine. I am so relieved. :)

Curious to see how my certificates are, I run certmgr.msc. I compared all Microsoft root certificates with a clean machine and could not see anything different. But again something happened:

certmgr.msc crashes

Oh, just one of my last attempts to do live analysis (this the WinPcap setup included with Wireshark):

WinPCAP installation also fails

Ok, so enough played. The thing seems to be nicely done, and live analysis is going to be way too hard and unreliable.

Memory Analysis

This is where I am now. I reverted to a snapshot prior to my live analysis attemps, confirmed the strange behaviors are still observable, and suspended the VM to get the vmem file.

So I have spent the last hours scanning the memory with, of course, Volatility.

So far, I have to confess that I found NOTHING. But analyzing the memory can be a harsh process when it comes to sophisticated threats, and I may have reached the limits of my skills.

But, anyway, I could not dream of a greater and more exciting opportunity to learn!

My discoveries, if there are, will be published in another article.

UPDATE: I forgot to tell that it is a Windows XP SP3 machine, but not fully updated due to the issues.

Network virtualization and the DMZ paradigm

The virtualization buzz

I have recently worked on network virtualization. Many people, especially the network guys, have been recently excited with the VMware Vswitch or Cisco Nexus stuff.  It is something that I understand because virtualization is cool. It brings many convenient features that truly make the life easier.

But what about the security? Convenience and security rarely come together, right? Oh, wait… we are in 2011, so lessons must have been learned. After all, Mr Salesman swear that it is more secure than ever. Convenience and security packed together, he says… it sounds promising. Let’s dig a little to find out what they won’t tell you…

I will focus on what really changes with virtualization : the architecture. One of the main goals of the technology is to reduce the number of physical devices to cut the costs, save space and energy. Of course, it goes with a simplification of the physical architecture. Therefore, some features previously handled by dedicated physical devices are now handled logically by a unique piece of hardware.

This obviously goes against the security best practices about designing network architectures with various degrees of exposure. But has the technology evolved so much that we should reconsider these recommendations?

VMware Vswitches or Nexus 1000V

These technologies are similar in the sense that they are designed to work directly inside the VMware platform. Vswitches are integrated with the solution of VMware, while Nexus benefits from the experience of Cisco and bring more layer 2 control (more settings, more protocols).

As well on the architecture documents of VMware as within the administration interface of Vcenter, it appears so easy to create segregated switches and build this way in a few clicks a DMZ architecture:

But it is slightly different in reality, as Brad Hedlund from Cisco shows in an interesting article: the vswitch illusion and DMZ virtualization. In short, whether you are using VMware Vswitches or Nexus 1000V, a single threaded program runs all the configured virtual switches. In clear, all the virtual switches share the same memory space. So, any vulnerability in the code would compromise all the switches, in other words: the entire network. And, not a surprise here, there have been many vulnerabilities. Just browse a CVE database if you want to check.

So you don’t want to rely on such a design for your datacenter, right?

Nexus 7000

In the case of the Nexus 7000, it is a little bit different because most of the switching work is handled by specific hardware, which have a much smaller attack surface than the vswitches stuff. But is it really safe?

The Nexus family is quite new and from what I could witness, they are quite pushy selling that. Because it is new, there is still neither much info surrounding the technologies used, nor user feedback, nor security research. Anyway, below is a quick sum-up of what I could find.

A few words about the architecture

In a layer 3 Nexus architecture, Nexus 2000, 5000 and 7000 are designed to work together. Nexus 2000 are basically top-of-the-rack port panels, with no intelligence. Nexus 5000 takes care of most of the layer 2 switching, while Nexus 7000 adds layer 2 functionalities and layer 3 support. Nexus 2000 and 5000 can work without the 7000, but in that case there is not so much difference with a classic layer 2 switch in terms of security (but it has the advantage to be more flexible to integrate in a datacenter). This and this may help you to visualize the differences.

So we will focus on the Nexus 7000 architecture, which bring VDC as a way to handle DMZ architectures. VDC are somehow similar to VLANs. But whereas VLANs virtualized LANs on a switch, VDC virtualize switches. So, on the same Nexus 5000 device, VDC will add the capacity to have multiple virtual switches which are in theory properly isolated.

This is a very basic sum-up for what we are interested in, but if you want to learn more, I encourage you to read the Cisco whitepaper about VDCs.

The flaws

Now that the presentations are made, the downside…

George Hedfors is the only researcher that worked notably on this platform, as far as I am aware. He made some really great findings, that you can discover within his slides.
At the time of his work – 2010, it appeared that the NX-OS consisted of a Linux Kernel 2.6.10 (released in 2004!). We can imagine that the OS has been signifiantly customized and hardened by Cisco. They may have include NX-bit support  (included since 2.6.8 and later improved). However, there is probably no ALSR support (2.6.12), no MAC system (SELinux or Tomoyo). Of course, I may be wrong but I haven’t found any documentation about that and my Cisco contact did not provide me with any consistent detail.

Anyway, he found a bunch of design flaws:

  • Poor CLI design: there are 686 hidden commands (system, debugging) that can be launched as root (sudo without password). One of these command is gdb, which can start a network daemon as root. The attacker can then connect to the socket to attach to any process on the system to elevate his privileges. Of course, it requires some shell access, so the exposure is limited. However, it is very instructive of how the system was designed!
  • Insecure daemon configuration: Daemon are not chrooted and run with the root user.
  • Embarassing CDP vulnerability : a vulnerability from 2001 was reintroduced in the code handling CDP. So it is possible to crash a daemon running as root. What if another vulnerability on a layer 2 daemon (vtp, hsrp, stp…) was discovered and allowed to rewrite the stack? Game over, the attacker is root.
  • Strange hidden account : there is a ftpuser hidden account with a dumb password (nbv123). Secret backdoor? I don’t know, but anyway it is not serious at all and should have been revealed by any consistent audit.
  • Shell design flaw: the VSH shell accepts a parameter (-a) that allow to spawn any command over the security roles normaly in place.
  • You can also get a root shell by simply spawning ssh `/bin/bash` from the CLI.

To any serious security guy or unix administrator, these should look like amateurism. And what’s the hell are all the security audits for?

So concerning the Nexus 7000, it is obvious that at best it is not specifically designed to be secure, at worst it was simply as poorly designed (or released too quickly) as most stuff.

Conclusion

In conclusion, one thing we can tell for sure is that none of the virtualized networking solutions are designed to be secure. Of course, all these flaws are hopefully already or will be soon fixed. But, despite what Cisco may claim, the facts are here: there is no VDC miracle. The Nexus platform is certainly great, but not more bug-free, flaw-free than any other piece of code.
No virtualized architecture can give the same degree of protection than physical segregation.

In the case of Vswitches or Nexus 1000, the attack surface is just too high to use it for DMZ segregation if you are serious about security. The vulnerabilities are already here and it will be feasible for a skillful and motivated attacker to own your datacenter.

Concerning the Nexus 7000 and its VDC, the attack surface is considerably reduced because there is less code and fewer protocols at layer 2. However, it is undoubtly less secure than physical segregation. Any zero-day vulnerability would potentially expose the datacenter (and we all know that some zero-day sometimes take years before coming to the public, which is a lot of time for the criminals or the government agencies to exploit it). You can’t take it lightly when it comes to the whole datacenter integrity and it doesn’t make sense if you have expensive (in cash or in labor hours) security at upper layers.

But, of course, it may depend on what you have to protect. If your datacenter hosts sensitive data for your company’s buisiness, then you should think twice on how you deploy virtualization or use the cloud.

Don’t get me wrong. These technologies are great and very useful. In many areas, there are an improvement. Simply, they must be used with as much care as always. Concerning the DMZ topic, as far as I am concerned, I will not rely on virtualization and keep physical segregation between zones, supported by different devices from different makers.

One thing I keep an eye on, though, is the development of virtualized firewalls, IPS, etc. In a few years, if these technologies should became really mature (enforcing segregation on all OSI layers) and the hosting OS security should really improved, most of the concerns here would be addressed.

Corrupted virtual disk with VMware

Wow, this article and especially one of its comments saved my day.

My computer crashed and one of the VMware machine hosted on it could not start anymore :

“Cannot open the disk ‘path of vmdk’ or one of the snapshot disks it depends on.
Reason: the specific virtual disk needs repair.

Checking on the VMware forums, I quickly found the command that was supposed to help :

$ vmware-vdiskmanager -R /path/to/disk.vmdk
The virtual disk, '/path/to/disk.vmdk', is corrupted but the repair process has failed.

Damned ! I almost resigned restoring the last backup and loosing a week of work when, by chance, I found the article mentioned above.

As recommended, I downloaded the Virtual Disk Development Kit 1.2 from VMware, untared it and still doubtfully launched :

$ ./bin64/vmware-vdiskmanager -R /path/to/disk.vmdk
The virtual disk, '/path/to/disk.vmdk', was corrupted and has been  successfully repaired.

Saved! Thanks so much to the guys. I would have never thought about trying it, I wonder how they could find it.

But how is it possible that the utility coming with vmware workstation 7.1 suck so much and is not on par with other versions ?

VMWare Workstation 6.5

I have just upgraded WMWare from version 6.04 to 6.5, and I have to say that it has very nice new features.

The first surprising thing was the file I downloaded. It is now not anymore a tar.gz archive but a .bundle file.

After downloading, as root, just make it executable or start it with sh :

% sh VMware-Workstation-6.5.0-118166.x86_64.bundle

It now starts a graphic installer, that takes care of everything. All the compilation process is now hidden to the user.

I was expecting the compilation to fail and that I would have to look for a patch to run on my edge Linux kernel. Indeed, I just compiled 2.6.26 kernel (64 bits) a few days ago.

But nothing like that. the process went smoothly.

However, I was still prudent. Even after a compiling, previous versions almost always required some patch to get full networking to work.

So I gave a try and launch one of my virtual machines. Surprise : all worked out of the box !

For the first time, I even did not need any vmware-any-any patch or any network patched vmmon and vmnet modules to get wifi networking operational.

I also quickly noticed some very nice and fancy features :

  • 3D graphics support
  • more devices supported : fingerprint reader device, audio driver for Vista, …
  • a graphical virtual network settings editor : this utility had been for ages on the Windows version and finally will make your easier on Linux

At last, but not least, the Unity display mode.

Though I am not a Mac user, I believe this can be compared to VMWare Fusion. Anyway, it allows you to display the virtual machines programs within your X session.

Look at this screenshot :

VMWare Workstation 6.5 and Unity

The result is quite spectacular. On my Gnome desktop, I am now able to display some windows from Windows XP and Windows Vista.

Well, this is not yet perfectly smooth or artifact free, but this is already really usable and responsive enough to be used intensively.

Another limit is the operating system support. So far, among my virtual machines, I was able to do it with Windows systems but not Open Solaris for instance.

There must have been more improvements, more or less visible, that I am not aware of. I won’t go for a full review.

I just wanted to insist that if you are a VMWare user,  you really should consider to upgrade for the complete support of the latest kernel and the Unity feature.

It seems that VMWare has listened to the Linux users, or at least is taking it more seriously. Not that they are nice, but the competitors are close (Virtual box, KVM, Xen…) !