The "CPU Steal Time" Metric in Unix/Linux Virtual Machines and a Windows Counterpart

I haven't posted in a while; I've been busy both studying for Windows Server 2012 stuff and preparing for a possible slight career shift. But I do want to put this up here, because it's one of my answers to a Serverfault question that I'm a little proud of. Nevertheless, it's a deep enough topic that I expect someone who knows more about it than I do to come along and correct me. Which I welcome. That's how science works. I'm not learning if I'm not wrong.

Here was the question:

In order to assess performance monitoring accuracy on virtualization platforms, the CPU steal time has become an increasingly relevant metric - see EC2 monitoring: the case of stolen CPU for an instructive summary in the context of Amazon EC2 and IBM's paper on CPU time accounting for a more in-depth technical explanation (including illustrations) of the concept:

Steal time is the percentage of time a virtual CPU waits for a real CPU while the hypervisor is servicing another virtual processor.

Accordingly, it is exposed in most related Unix/Linux monitoring tools nowadays - see e.g. columns %steal or st in sar or top:

st -- Steal Time
The amount of CPU 'stolen' from this virtual machine by the hypervisor for other tasks (such as running another virtual machine).

I've been unable to figure out how to capture the same metric on Windows though, is this possible already? (Ideally for the Windows 2008 Server R2 AMIs on EC2 and via a respective Windows Performance Counters of course.)

And here was my answer:

Let me preface by saying that I am coming from the point of view of Hyper-V as a virtualization platform because that is where I have the most experience. Even though there may be certain tenets of virtualization, as we know it, that cannot be deviated from, Microsoft and VMware and Xen all have different strategies for how they design their hypervisors.

That's the first thing that makes your question challenging. You pose your question as if it were hypervisor-agnostic, when in truth it is not. Amazon EC2, for example, uses the Xen hypervisor, and the "CPU Steal Time" metric that you see in the output of a top command issued from within a Linux VM running on that hypervisor is a result of the integration services installed on that guest OS (or virtualization-aware tools on the guest) in conjunction with data provided by that specific hypervisor.
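For context, here is a minimal sketch of where that Linux figure comes from. On a Linux guest, the kernel publishes cumulative tick counters on the aggregate cpu line of /proc/stat, and steal is the eighth value on that line. The function names below are my own invention, and comparing two samples is a simplification of what top and sar actually do.

```python
def parse_cpu_line(line):
    """Return the first eight counters from the aggregate 'cpu' line of /proc/stat.

    Field order: user, nice, system, idle, iowait, irq, softirq, steal.
    The trailing guest fields are skipped because the kernel already
    counts them inside user and nice.
    """
    fields = line.split()
    if fields[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    return [int(value) for value in fields[1:9]]


def steal_percent(before, after):
    """Percentage of elapsed CPU ticks reported as steal between two samples."""
    old, new = parse_cpu_line(before), parse_cpu_line(after)
    deltas = [n - o for n, o in zip(new, old)]
    total = sum(deltas)
    # Very old kernels do not report a steal field at all.
    if total == 0 or len(deltas) < 8:
        return 0.0
    return 100.0 * deltas[7] / total


def read_cpu_line(path="/proc/stat"):
    """Read the aggregate cpu line (Linux only)."""
    with open(path) as stat:
        return stat.readline()
```

To try it on a Linux guest, call read_cpu_line() twice a second or so apart and pass both lines to steal_percent(). The point is that the number only exists because the kernel and the hypervisor cooperate to fill in that field.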

First off let me just answer your question straight up: There is no way to see from inside a virtual machine running Windows how much time the processors belonging to the physical machine on which the hypervisor runs spend doing other things, unless the particular virtual tools/services or virtualization-aware tools for your particular hypervisor are installed in the guest VM and the particular hypervisor on which the guest is running exposes that data. Even a Windows guest running on a Hyper-V hypervisor will not have immediate access to information about the time that the physical processors on the host spent doing other things. (To quote voretaq7, that would be something that "breaks the fourth wall.") Even though Windows client and server operating systems running as virtualized guests in Hyper-V with the correct integration services/tools installed make use of "enlightenments" (which are literally kernel code alterations made especially for VMs) that significantly increase their performance in using the resources of a physical host, the bottom line is that the hypervisor does not have to give any more information to the guest OS than it wants to. That means the hypervisor does not have to tell a guest VM what else it is doing besides servicing that VM... unless it wants to. And that information about what else the physical processors are doing is necessary for deriving a metric from the perspective of the VM such as "CPU Steal Time: the percentage of time the vCPU waits for a physical CPU."

How could the guest OS know that, if it didn't even realize that it was actually virtualized? It's like The Truman Show... for computers.

In other words, without the right integration tools installed on the guest, the guest OS won't even know that its CPU is actually a *v*CPU. It won't even know that there is another force outside of itself "stealing" CPU cycles from it, therefore that metric will not exist on the guest VM.

That's why I don't even like the phrase "CPU Steal Time." The word steal just puts everybody in the wrong frame of mind from the get-go.

A hypervisor such as Hyper-V does not give guests direct access to physical resources such as physical processors or processor cores. Instead the hypervisor gives them vDevs - virtual devices - such as vCPUs.

A prime example of why: Say a virtual machine guest OS makes the call to flush the TLB (translation look-aside buffer) which is a physical component of a physical CPU. If the guest OS was allowed to clear the entire TLB on a physical processor, that would have negative performance effects for all the other VMs that were also sharing that same physical TLB. In the case of Windows, that call in the guest OS is translated into a "hypercall" or "enlightened" call which is interpreted by the hypervisor so that only the section of the TLB that is relevant to that virtual machine is flushed.

(Interestingly, that hints to me that guest VMs that do not have the proper integration tools and/or services could have the ability to impact the performance of all the other VMs on the same host, but that is completely outside the scope of this topic.)

All that to say that you can still detect in a Hyper-V host the time that a virtual processor spent waiting for a real processor to become available so that it could be scheduled to run. But you can only see that data on a Windows Hyper-V hypervisor. If it is possible to see this in other hypervisors, I urge others to tell us how to see this in that hypervisor and also if it is exposed to the guests. And that is before we even get to whether that data is exposed to the guest OS or not.

My test machine was Hyper-V Server 2012, which is the free edition of Server 2012 that only runs Core and the Hyper-V role. It's effectively the same as any Windows Server 2012 running Hyper-V.

Fire up Perfmon on your parent partition, aka physical host. Load this counter:

Hyper-V Hypervisor Virtual Processor\CPU Wait Time Per Dispatch\* 

You will notice that there will be an instance of that counter for each virtual machine on that hypervisor, as well as _Total. The Microsoft definition of that Perfmon counter is:

The average time (in nanoseconds) spent waiting for a virtual processor to be dispatched onto a logical processor.

Obviously, you want that number to be as low as possible. For computers, waiting is almost never a good thing.
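If you would rather collect that counter from a script than from the Perfmon GUI, something like the following sketch might work. It assumes the typeperf command-line tool that ships with Windows, and it assumes the counter path is written in the usual \Object(instance)\Counter form; I have not verified the exact path on every Hyper-V version, so treat COUNTER as a guess to adjust. The function names are hypothetical.

```python
import csv
import io
import subprocess

# Assumed counter path; adjust it to match what Perfmon shows on your host.
COUNTER = r"\Hyper-V Hypervisor Virtual Processor(*)\CPU Wait Time Per Dispatch"


def average_wait_ns(typeperf_csv):
    """Average each counter column of typeperf's CSV output.

    The first non-empty row is the header (timestamp column, then one
    column per counter instance); later rows are samples. Cells that are
    not numbers, such as status messages, are ignored.
    """
    rows = [row for row in csv.reader(io.StringIO(typeperf_csv)) if row]
    header = rows[0]
    samples = {name: [] for name in header[1:]}
    for row in rows[1:]:
        if len(row) != len(header):
            continue
        for name, cell in zip(header[1:], row[1:]):
            try:
                samples[name].append(float(cell))
            except ValueError:
                pass
    return {name: sum(vals) / len(vals) for name, vals in samples.items() if vals}


def sample_host(sample_count=5):
    """Run typeperf on the parent partition and average the collected samples."""
    result = subprocess.run(
        ["typeperf", COUNTER, "-sc", str(sample_count)],
        capture_output=True, text=True, check=True,
    )
    return average_wait_ns(result.stdout)
```

Calling sample_host() on the parent partition should return one average per virtual processor instance, in nanoseconds. Keeping the parsing separate from the typeperf call means the parser can be exercised with saved output on any machine.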

Other performance counters on the hypervisor that you will want to investigate are Hyper-V Hypervisor Root Virtual Processor\% Guest Run Time, % Hypervisor Run Time, and % Total Run Time. These counters provide you with the percentages that could be used to determine facts such as how much time the "real" processors spend doing things other than servicing a VM or all VMs.

So in conclusion, the metric that you are looking for in a guest virtual machine depends on the hypervisor that it is running on, whether that hypervisor chooses to provide the data about how it spends its time other than servicing that VM, and if the guest OS has the right virtualization integration tools/services/drivers to be aware enough to realize that the hypervisor is making that data available.

I know of no way on a Windows guest, integration tools installed or not, to see how much time, in terms of seconds or percentage, that VM's host has spent servicing it or not servicing it relative to the total physical processor time.

Are Windows Administrators Less Likely to Script/Automate?

I wrote this bit as a comment on another blog this morning, and then after typing a thousand words I realized that it would make good fodder as a post on my own blog. The article that I was commenting on was on the topic that Windows administrators are less likely to script and/or automate because Windows uses a GUI and Linux is more CLI-centric. And because Linux is more CLI-focused, it is more natural for a Linux user to get into scripting than it is a Windows user. Without further ado, here is my comment on that article:

This article and these comments suffer from a lack of the presence of a good Microsoft engineer and/or administrator. As is common, this discussion so far has been a bunch of Linux admins complaining that Windows isn't more like Linux, but not offering much substance to the discussion from a pro-Microsoft perspective.

That said, I may be a Microsoft zealot, but I do understand and appreciate Linux. I think it’s a great, fast, modular, infinitely customizable and programmable operating system. So please don’t read this in anger, you Linux admins.

First I want to stay on track and pay respect to the original article, about scripting on Windows. Scripting has been an integral part of enterprise-grade Windows administration ever since Windows entered the enterprise ecosystem. It has evolved and gotten a lot better, especially since Powershell came on the scene, and it will continue to evolve and get better, but we've already been scripting and automating Windows in the enterprise space since the '90s. (Though maybe not as well as we would have liked back then.)

But I will make a concession. There are Windows admins out there that don't script. A lot of them. And I do view that as a problem. Here's the thing: I believe that what made Windows so popular in the first place - its accessibility and ease of use because of its GUI - is also what leads to Windows being frequently misused and misconfigured by unskilled Windows administrators. And that, in turn, leads people to blame Windows itself for problems that could have been avoided had you hired a better engineer or admin to do the job in the first place.

GUIs and wizards are nice and have always been a mainstay of Windows, but I won’t hire an engineer or administrator without good scripting abilities. Even on Windows you should not only know how to script, but you should want to script and have the innate desire to automate. If you find yourself sitting there clicking the same button over and over again, it should be natural for you to want to find a way to make that happen automatically so you can go do other more interesting things. Now that I think about it, that’s a positive character trait that applies universally to any occupation in the world.

And yeah, it was true for a long time that there were certain things in Windows that could only be accomplished via the GUI. But that’s changing – and quickly. For instance, Exchange Server is already fully converted to the point where every action you take in the GUI management console is just executing Powershell commands in the background. There’s even a little button in the management console that will give you a preview of the Powershell commands that will be executed if you hit the ‘OK’ button to commit your changes. SQL Server 2012 will be the first version of SQL that’ll go onto Server Core. (About time.) The list goes on, but the point is that Microsoft is definitely moving in the right direction by realizing that the command line is (and always has been) the way to go for creating an automatable server OS. Microsoft is continuing to put tons of effort into that as we speak.

However, just because scripting on Windows is getting better now doesn’t mean we haven’t already been writing batch files and VB scripts for a long time that do pretty impressive things, like migrate 10,000 employee profiles for an AD domain migration.

I really love Server Core, but it's just a GUI-less configuration of the same Windows we've been using all along. Any decent Windows admin has no trouble using Core, because the command line isn't scary or foreign to them. For instance, one of the comments on this article reads:

"The root of the problem seems to be that Linux started with the command line and added GUIs later, whereas Windows did it the other way around."

I think that's false. Windows started as a shell on top of DOS – a command line-only operating system. DOS was still the underpinning of Windows for a long time and even after Windows was re-architected and separated from DOS, the Command Prompt and command-line tools were and still are indispensable. Now I will grant you that Linux had way better shells and shell scripting capabilities than Windows did for a long time, and Microsoft did have to play catch-up in that area. Powershell and Server Core came along later and augmented the capabilities of and possibilities for Windows – but the fact remains we’ve been scripting and automating things using batch files and VB Script for a long time now.

There was also this comment: “Another cause for slow uptake is that Windows skills don't persist.”

Again I would say false. I can run a script I wrote in 1996 on Server 2012 just fine, with no modification. Have certain tools and functions evolved while others have been deprecated? Of course. Maybe a new version of Exchange came out with new buttons to click? Of course – that’s the evolution of technology. But your core skillset isn’t rendered irrelevant every time a new version of the software comes out. Not unless your skillset is very small and narrow.

There was also this comment:

“I also complain that PowerShell is not a "shell" in a traditional sense. It is not a means of fully interacting with the OS. There is no ecosystem of text editors, mail clients, and other tools that are needed in the daily operation and administration of servers and even clients.”

As I mentioned earlier, there are fewer and fewer things every day that cannot be done directly from Powershell or even regular command-line executables. And to the second sentence - I’m not sure if there will ever be a desire to go back to an MS-DOS Edit.exe style text editor or email clients… but I could probably write you a Powershell-based email client in an hour if you really wanted to read your emails with no formatted styles or fonts. :)

So in the end, I think the original article had a good point - there probably are, or were, more Linux admins out there with scripting abilities than Windows admins. But I also think that's in flux and Windows Server is poised in a better position than ever for the server market.

DNS 101: Round Robin (Or Back When I was Young And Foolish Part II)

I learned something today. It's something that made me feel stupid for not knowing. Something that seemed elemental and trivial - yet, I did not know it. So please, allow me to relay my somewhat embarrassing learning experience in the hopes that it will save someone else from the same embarrassment.

I did know what DNS round robin was. Or at least, I would have said that I did.

Imagine you configure DNS1, as a DNS server, to use round robin. Then, you create 3 host (A or AAAA) records for the same host name, using different IPs. Let's say we create the following A records on DNS1:

server01 - A
server01 - A
server01 - A

Then on a workstation which is configured to use DNS1 as a DNS server, you ping server01. You receive as a reply. You ping server01 again. With no hesitation, you get a reply from again. We assume that your local workstation has cached locally and will reuse that IP for server01 until the entry either expires, or we flush the DNS cache on the workstation with a command like ipconfig/flushdns.

I run ipconfig/flushdns. Then I ping server01 again.

This time I receive a response from Now I assume DNS round robin is working perfectly. I go home for the day feeling like I know everything there is to know about DNS.

But was it that the DNS server is responding to DNS queries with the single next A/AAAA record that it has on file, in a round-robin type sequential fashion to every DNS query that it receives? That is what I assumed.

But the fact of the matter is that DNS servers, when queried for a host name, actually return a list of all A/AAAA records associated with that host name, every time that host name is queried for. (To a point - the list must fit within a UDP packet, and some firewalls/filters don't let UDP packets longer than 512 bytes through. That's changing though. Our idea of how big data is and should be allowed to be is always growing.)
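If you don't feel like firing up a packet capture, a quick way to see the whole list is to ask the resolver directly. Here is a minimal sketch in Python using the standard library's socket.getaddrinfo(); the function name resolve_all is my own. Keep in mind that the operating system's resolver may re-sort the addresses, so the order you see is not necessarily the order the DNS server sent.

```python
import socket


def resolve_all(hostname, port=80):
    """Return every distinct address the resolver hands back for hostname, in order."""
    infos = socket.getaddrinfo(hostname, port, type=socket.SOCK_STREAM)
    addresses = []
    for _family, _socktype, _proto, _canonname, sockaddr in infos:
        # sockaddr[0] is the IP address for both IPv4 and IPv6 results.
        if sockaddr[0] not in addresses:
            addresses.append(sockaddr[0])
    return addresses
```

Calling resolve_all() on a host name with several A/AAAA records should print more than one address from a single lookup, which is exactly what ping hides from you.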

I assume that, being one of the busiest websites in the world, has not only some global load balancing and other advanced load balancing techniques employed, but probably also has more than one host record associated with it. To test my theory, I fire up Wireshark and start a packet capture. I then flush my local DNS cache with ipconfig/flushdns and then ping

Notice how I pinged it, got one IP address in response (.148), then flushed my DNS cache, pinged it again and got another different IP address (.144)? But despite what it may look like, that name server is not returning just one A/AAAA record each time I query it:

My workstation is ::9. My workstation's DNS server is ::1. The DNS server is configured to forward DNS requests for zones for which it is not authoritative on to yet another DNS server. So I ask for, my DNS server doesn't know, so it forwards the request. The forwardee finally finds out and reports back to my DNS server, which in turn relays back to me a list of all the A records for I get a long list containing not only a mess of A records, but a CNAME thrown in there too, all from a single DNS query! (We're not worried about the subsequent query made for an AAAA record right now. Another post perhaps.)

I was able to replicate this same behavior in a sanitary lab environment running a Windows DNS server and confirmed the same behavior. (Using the server01 example I mentioned earlier.)

Where round robin comes in is that it rotates the order of the list given to each subsequent client who requests it. Keep in mind that while round robin-ing the A records in your DNS replies does supply a primitive form of load distribution, it's a pretty poor substitute for real load balancing, since if one of the nodes in the list goes down, the DNS server will be none the wiser and will continue handing out the list with the downed node's IP address on it.
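To make the rotation concrete, here is a toy model of it in Python. This is only an illustration of the idea, not how any particular DNS server implements it, and the addresses are made up from the 192.0.2.0/24 documentation range. Every reply contains all the records; only the starting point moves.

```python
from collections import deque


class RoundRobinRecordSet:
    """Toy model of a DNS server rotating the A records for one host name."""

    def __init__(self, addresses):
        self._records = deque(addresses)

    def answer(self):
        """Return the full record list, then rotate it for the next query."""
        reply = list(self._records)
        self._records.rotate(-1)  # the first record moves to the end
        return reply


# Hypothetical records for server01.
server01 = RoundRobinRecordSet(["192.0.2.1", "192.0.2.2", "192.0.2.3"])
first_reply = server01.answer()
second_reply = server01.answer()
```

A client that only ever uses the first entry (like ping) would see a different IP on each uncached lookup, which is what fooled me into thinking the server was handing out one record at a time.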

Lastly, since we know that our client is receiving an entire list of A records for host names which have many IP addresses, what does it actually do with the list? Well, the ping utility doesn't do much. If the first IP address on the list is down, you get a destination unreachable message and that's it. (Leading to a lot of people not realizing they have a whole list of IPs they could try.) Web browsers, however, have a nifty feature known as "browser retry" or "client retry," where they will continue trying the other IPs in the list until they find a working one. Then they will cache the working IP address so that the user does not continue to experience the same delay in web page loading as they did the first time. Yes, there are exploits concerning this feature, and yes it's probably a bad idea to rely on this since browser retry is implemented differently across every different browser and operating system. It's a relatively new mechanism actually, and people may not believe you if you tell them. To prove it to them, find (or create) a host name which has several bad IPs and one or two good ones. Now telnet to that hostname. Even telnet (a modern version from a modern operating system) will use getaddrinfo() instead of gethostbyname() and if it fails to connect to the first IP, you can watch it continue trying the next IPs in the list.

More info here, here and here. That last link is an MSDN doc on getaddrinfo(). Notice that it does talk about different implementations on different operating systems, and that ppResult is "a pointer to a linked list of one or more addrinfo structures that contains response information about the host."
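The retry behavior described above can be sketched in a few lines of Python. This is my own illustration of the general pattern (walk the getaddrinfo() list until a connection succeeds), not the code that any browser or telnet client actually runs, and connect_any is a hypothetical name.

```python
import socket


def connect_any(hostname, port, timeout=3.0):
    """Try each address returned for hostname until one accepts a TCP connection."""
    last_error = None
    for family, socktype, proto, _canonname, sockaddr in socket.getaddrinfo(
        hostname, port, type=socket.SOCK_STREAM
    ):
        sock = socket.socket(family, socktype, proto)
        sock.settimeout(timeout)
        try:
            sock.connect(sockaddr)
            return sock  # first working address wins
        except OSError as error:
            last_error = error
            sock.close()
    raise last_error or OSError("no addresses returned for " + hostname)
```

Python's own socket.create_connection() does essentially the same loop for you; it is written out here only to show the mechanism. Note the cost of a dead address at the top of the list: you wait out the timeout before the next one is tried.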

The Linux Kerberos Project

I am absolutely a Windows engineer and an extremely avid advocate of most everything Microsoft, but more importantly I'm an enthusiast of all forms of technology that help to achieve business goals. Whatever it takes to further the state of the art. That means I occasionally enjoy dabbling in Linux too. Whatever gets me closer to the bleeding edge of technology. Not to mention that the vast majority of enterprises have some sort of mixture of both operating systems.

But it's rare to see a deployment in which the Unix/Linux servers participate in Active Directory. Yes, Active Directory is a Microsoft technology and *nix isn't just ready to jump into domain membership right out of the box, but I strongly believe that AD is the mortar that glues any corporate IT environment together. Let us not think Linux vs. Windows... but Linux and Windows!

So what are the ways *nix could benefit from Active Directory?

  • Secure, central management:
    No more maintaining a separate list of local user accounts and passwords on each and every machine. Why not keep just one database of users and machines in your Active Directory that is guaranteed to stay consistent and secure among every single member server forever?
  • Authentication:
    The main mode of authentication in an Active Directory domain is Kerberos. It was invented by some nerds at MIT. Kerberos is Greek for the three-headed hound that guards the gates of hell. (Cerberus in Latin.) This name is apt, because Kerberos is an authentication system that requires three parties. This authentication system involving a "trusted third party" has proven to be secure and trustworthy in any enterprise environment. And the best part? Kerberos is an open protocol that Microsoft and *nix can both enjoy.
  • As if that wasn't enough:
    Authenticate from machine to machine to machine, without having to re-type your password; without any user intervention at all even! Use one account to run a service on every machine. Active Directory-integrated machines can securely and dynamically update their own DNS records. Log on to a freshly-built machine with domain credentials, without ever needing to manage the local accounts on each and every box. The list goes on and on...

As any IT company grows, it becomes increasingly important that they maintain a cohesive, easily manageable structure that includes all of their devices. So, to that end, I took the time to replicate in my personal lab the steps necessary to join a Linux machine to my existing Windows Active Directory domain. And I've documented the journey. So without further ado: 

As you can see, I've created a virtual machine and installed Linux on it. My domain is at the 2008 R2 forest and domain functional levels. It's pretty much the best domain ever. I'd put my AD architecting artistry up against anyone's.

Here I am on said virtual machine, downloading the Likewise (free edition) client. I was planning on doing it all the long, complex, hard way. This software saved me a lot of time.

I created a basic user, and delegated domain-joining permissions to him, but nothing else. I'm going to use this service account for the sole purpose of joining *nix machines to my domain.

Here's where the hair on the back of the neck of any real nerd would start standing up. See what I did there? I just joined my Linux machine to my Active Directory domain, using my specified service account. "SUCCESS" it says. I shall stand for nothing less.

Now we rush off to look at the security log on our domain controller. And what else do we see there but zero audit failures, and a handful of beautiful Kerberos ticket requests and grants. The machine account even popped up in my AD Users & Computers!

And finally - the one screenshot to rule them all - here I am SSH'ing into my Linux box for the first time using domain creds! Kerberos wins the day.

So, that's all I've got for now. I haven't really done any more in-depth research into this than what you've just seen. You're probably already wondering if I can make it do smartcards next, aren't you?