Powershell Dynamic Arrays Will Murder You, Also... A Pretty Picture! (Part 1 of X)

I was browsing the web a couple days ago and I saw this post, which I thought was a fun idea.  I didn't want to look at his code though, because, kind of like reading movie spoilers, I wanted to see if I could do something similar myself first before spoiling the fun of figuring it out on my own. The idea is that you create an image that contains every RGB color, using each color only once.

Plus I decided that I wanted to do it in Powershell, because I'm a masochist.

There are 256 x 256 x 256 RGB colors (in a 24-bit spectrum.) That equals 16777216 colors. Since each color will be used only once, if I only use 1 pixel per color, a 4096 x 4096 image would be capable of containing exactly 16777216 different colors.
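A quick sanity check of that arithmetic, as a tiny sketch (the variable names here are mine, purely for illustration):

```powershell
# Number of distinct 24-bit RGB colors: 256 choices for each of the three channels.
$ColorCount = 256 * 256 * 256

# Number of pixels in a 4096 x 4096 image.
$PixelCount = 4096 * 4096

# Both are 2^24, so one pixel per color fills the image exactly.
$ColorCount -eq $PixelCount
```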

First I thought to just generate a random color, draw the pixel in that color, then add it to a list of "already used" colors, and then on the next iteration, just keep generating random colors in a loop until I happen upon one that wasn't in my list of already used colors. But I quickly realized this would be horribly inefficient and slow.  Imagine: to generate that last pixel, I'd be running through the loop all day hoping for the 1 in ~16.7 million chance that I got the last remaining color that hadn't already been used. Awful idea.

So instead let's just generate a big fat non-random array of all 16777216 RGB colors:

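# System.Drawing is not loaded by default in the Powershell console
Add-Type -AssemblyName System.Drawing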
$AllRGBCombinations = @()
For ([Int]$Red = 0; $Red -LT 256; $Red++)
{
    For ([Int]$Green = 0; $Green -LT 256; $Green++)
    {
        For ([Int]$Blue = 0; $Blue -LT 256; $Blue++)
        {
            $AllRGBCombinations += [System.Drawing.Color]::FromArgb($Red, $Green, $Blue)
        }
    }
}

That does generate an array of 16777216 differently-colored and neatly-ordered pixel objects... but guess how long it takes?

*... the following day...*

Well, I wish I could tell you, but I killed the script after it ran for about 20 hours. I put a progress bar in just to check that it wasn't hung or in an endless loop, and it wasn't... the code is just really that slow.  It starts out at a decent pace and then gradually slows to a crawl.

Ugh, those dynamic arrays and the += operator strike again. Powershell arrays are fixed-size, so += allocates a brand-new array one element larger and copies everything over on every single iteration... like good ole' ReDim Preserve back in the VBScript days.  It may be handy for small bits of data, but if you're dealing with large amounts of data that you want processed this decade, you'd better strongly type your stuff and use lists.  Let's try the above code another way:

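# Load System.Drawing (harmless if it is already loaded) and start the progress counter at zero
Add-Type -AssemblyName System.Drawing
$PixelsGenerated = 0
$AllRGBCombinations = New-Object 'System.Collections.ObjectModel.Collection[System.Drawing.Color]'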
For ([Int]$Red = 0; $Red -LT 256; $Red++)
{
    For ([Int]$Green = 0; $Green -LT 256; $Green++)
    {        
        For ([Int]$Blue = 0; $Blue -LT 256; $Blue++)
        {
            $AllRGBCombinations.Add([System.Drawing.Color]::FromArgb($Red, $Green, $Blue))
        }
    }
    $PixelsGenerated += 65536
    Write-Progress -Activity "Generating Pixels..." -Status "$PixelsGenerated / 16777216" -PercentComplete (($PixelsGenerated / 16777216) * 100)
}

Only 5.2 seconds in comparison, including the added overhead of writing the progress bar. Notice how I only update the progress bar once every 256 * 256 pixels, because it will slow you down a lot if you try to update the progress bar after every single pixel is created.
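To put a rough number on why the += version crawls, here is a back-of-the-envelope sketch. It assumes that every += copies all of the existing elements into a new array, and it ignores allocation and garbage collection costs entirely, so treat it as a lower bound rather than a measurement:

```powershell
$ElementCount = 16777216

# Adding element k means copying the k-1 elements that are already there,
# so building the whole array costs 0 + 1 + ... + (n-1) = n(n-1)/2 copies.
$TotalCopies = [long]$ElementCount * ($ElementCount - 1) / 2

"{0:N0} element copies to build a {1:N0}-element array with +=" -f $TotalCopies, $ElementCount
```

That works out to roughly 1.4 x 10^14 copies, versus one Add() per element for the collection.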

Now I can go ahead and generate an image containing exactly one of every color that looks like this:

Yep, there really are 16.7 million different colors in there, which is why even a shrunken PNG of it is 384KB.  Hard to compress an image when there are NO identical colors! The original 4096x4096 bitmap is ~36MB.  And I ended up loading a shrunken and compressed JPG for this blog post, because I didn't want every page hit consuming a meg of bandwidth.

It kinda' makes you think about how limited and narrow a human's vision is, doesn't it?  24-bit color seems to be fine for us when watching movies or playing video games, but that image doesn't seem to capture how impressive our breadth of vision should be.

Next time, we'll try to randomize our set of pixels a little, and try to make a prettier, less computerized-looking picture... but still using each color only once.  See you soon.

Website Upgrade, Coding, and Dealing with NTFS ACLs on Server Core

I apologize in advance - this blog post is going to be all over the place.  I haven't posted in a while, mainly because I've been engrossed in a personal programming project, part of which is a multithreaded web server I wrote over the weekend that I'm kind of proud of. My ShareDiscreetlyWebServer is single-threaded, because when I wrote it, I had not yet grasped the awesome power of the async and await keywords in C#.  They're très sexy. Right now the only verb the new server supports is GET (because it's all I need for now,) but it's about as fast as you could hope a server written in managed code could be.

Secondly, I just upgraded this site to Blogengine.NET v2.9.  The motivation behind it was that today, I got this email from a visitor to the site:

Hi,
I tried to leave you a comment but it didnt work.
Can you please go through steps you took to migrate your blog to Azure as I am interested in doing the same thing.
Did you set it up as a Azure Web Site or use an Azure VM and deployed it that way?
Are you using BlogEngine or some other blog publishing tool.

Wait, my comments weren't working? Damnit. I tried to post a comment myself and sure enough, commenting on this blog was busted. It was working fine but it just stopped working some time in the last few weeks. And so, I figured that if I was going to go to the trouble of debugging it, I'd just go ahead and upgrade Blogengine.NET while I was at it.

But first, to answer the guy's question above, my blog migration was simple. I used to host this blog out of my house on a Windows Server running in my home office. I signed up for a Server 2012 Azure virtual machine, RDP'ed to it, installed the IIS role, robocopy'd my entire C:\inetpub directory to the new VM, and that was that.

So version 2.9 is a little lackluster so far.  They updated most of the UI to the simplistic, sleek "modern" look that's all the rage these days, especially on tablets and phones.  But in the process it appears they've regressed to the point where the editor is no longer compatible with Internet Explorer 11, 10, or 9. (Not that it worked perfectly before either.)  It's annoying as hell. I'm writing this post right now in IE with compatibility mode turned on, and still half of the buttons don't work.  It's crippled compared to the version 2.8 that I was on this morning.

It's ironic that the developers who wrote a CMS entirely in .NET, in Visual Studio, couldn't be bothered to test it on any version of IE.  Guess I'll wait patiently for version 3.0.  Or maybe write my own CMS after I get finished writing the web server to run it on.

But even the upgrade, and fixing all the little miscellaneous bugs that the upgrade introduced, didn't fix my busted comment system. So I had to dig deeper. I logged on to the server and fired up Process Monitor while I attempted to post a comment:

w3wp.exe gets an Access Denied error right there, clear as day.  (Thanks again, ProcMon.)

If you open the properties of w3wp.exe, you'll notice that it runs in the security context of an application pool, e.g. "IIS APPPOOL\Default Web Site". So just give that security principal access to that App_Data directory.  Only one problem...

Server Core.

No right-clicking our way out of this one.  Of course we could have done this with cacls.exe or something, but you know I'm all about the Powershell.  So let's do it in PS.

$Acl = Get-Acl C:\inetpub\wwwroot\App_Data
$Ace = New-Object System.Security.AccessControl.FileSystemAccessRule("IIS APPPOOL\Default Web Site", "FullControl", "ContainerInherit, ObjectInherit", "None", "Allow")
$Acl.AddAccessRule($Ace)
Set-Acl C:\inetpub\wwwroot\App_Data $Acl

Permissions, and commenting, are back to normal.
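If you'd rather confirm that the new rule actually landed instead of just trusting it, something like the following sketch should do. Get-IdentityAllowRule is a helper name I made up for illustration, not a built-in Cmdlet, and it only reads the ACL back with Get-Acl:

```powershell
function Get-IdentityAllowRule {
    param(
        [Parameter(Mandatory)] [string]$Path,
        [Parameter(Mandatory)] [string]$Identity
    )
    # Return every Allow entry on the path's ACL that names the given identity.
    (Get-Acl -Path $Path).Access | Where-Object {
        $_.AccessControlType -eq 'Allow' -and $_.IdentityReference.Value -eq $Identity
    }
}

# Example call, using the path and application pool identity from above:
# Get-IdentityAllowRule -Path C:\inetpub\wwwroot\App_Data -Identity 'IIS APPPOOL\Default Web Site'
```

No output means no matching Allow rule was found for that identity.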

FIPS 140

FIPS 140-2 Logo

Oh yeah, I have a blog! I almost forgot.  I've been busy working.  Let's talk about an extraordinarily fascinating topic: Federal compliance!

FIPS (Federal Information Processing Standards) is a family of many different standards, and it holds sway mainly in the U.S. and Canada.  Within each standard, there are multiple revisions and multiple levels of classification.  FIPS 140 is about encryption and hashing algorithms, or more precisely, about accrediting cryptographic modules.  Here’s an example of a certificate.  The FIPS 140-2 revision is the current standard, and FIPS 140-3 is under development with no announced release date yet.  It does not matter if your homebrew cryptography is technically “better” than anything else ever.  You have to submit your source code/device/module to the government in order to gain FIPS approval, and until your cryptographic module has gone through that submission and certification process, it is not FIPS-approved or compliant.  In fact, the government is free to certify weaker algorithms instead of stronger ones, just because the weaker algorithms have undergone the certification process when the stronger ones have not, and it has historically done so.  (Triple-DES being the prime example.)

There is even a welcome kit, with stickers.  You need to put these tamper-evident stickers on your stuff for certain levels of FIPS compliance.

So if you are ever writing any software of your own, please do not try to roll your own cryptography. Use the approved libraries that have already gone through certification. Your custom crypto has about a 100% chance of being worse than AES/SHA (NSA backdoors notwithstanding,) and it will never be certifiable for use in a secure Federal environment anyway.  Also avoid things like re-hashing your hash with another hashing algorithm in an attempt to be ‘clever’; doing so can ironically make your hash weaker.

And the Feds are picky.  For instance, if programming for Windows in .NET, the use of the System.Security.Cryptography.SHA1CryptoServiceProvider class may be acceptable while the use of the System.Security.Cryptography.SHA1Managed class is not.  It doesn’t mean the SHA1Managed implementation is any worse, it simply means Microsoft has not submitted it for approval.

Many major vendors such as Microsoft and Cisco go through this process for every new version of a product that they release.  It costs money and time to get your product FIPS-certified.  Maybe it’s a Cisco ASA appliance, or maybe it’s a simple Windows DLL.

The most recent publication of FIPS 140-2 Annex A lists approved security functions (algorithms.)  It lists AES and SHA-1 as acceptable, among others. So if your application uses only approved implementations of the AES and SHA-1 algorithms, then that application should be acceptable according to FIPS 140-2.  If your application uses the MD5 hashing algorithm during communication, that product is NOT acceptable for use in an environment where FIPS compliance must be maintained.

However, there is also this contradictory quote from NIST:

“The U.S. National Institute of Standards and Technology says, "Federal agencies should stop using SHA-1 for...applications that require collision resistance as soon as practical, and must use the SHA-2 family of hash functions for these applications after 2010"”

So it seems to me that there are contradictory government statements regarding the usage of security functions.  The most recent draft of FIPS 140-2 Annex A clearly lists SHA-1 as an acceptable hashing algorithm, yet, the quote from NIST says that government agencies must use only SHA-2 after 2010.  Not sure what the answer is to that. 

These algorithms can be broken up into two categories: encryption algorithms and hashing algorithms.  An example of a FIPS encryption algorithm is AES, which was adopted in 2001 and consists of three members of the Rijndael family of ciphers (Rijndael being the much cooler name.)  Encryption is reversible: anyone holding the right key can decrypt the encrypted data, that is, convert it back into the original form it had before it was encrypted.

Hashing algorithms on the other hand, are also known as one-way functions.  They are mathematically one-way and cannot be reversed.  Once you hash something, you cannot “un-hash” it, no matter how much computing power you have.  Hashing algorithms take any amount of data, of an arbitrary size, and mathematically map it to a “hash” of fixed length.  For instance, the SHA-256 algorithm will map any chunk of data, whether it be 10 bytes or 2 gigabytes, into a 256 bit hash.  Always 256 bit output, no matter the size of the input.
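Here is a small sketch of that fixed-length property using the SHA256 class that ships with .NET. The input sizes are arbitrary picks of mine (10 bytes and 1,000,000 bytes of zeroes), chosen only to keep the example quick:

```powershell
$Sha256 = [System.Security.Cryptography.SHA256]::Create()

# Two inputs of very different sizes.
$SmallInput = New-Object byte[] 10
$LargeInput = New-Object byte[] 1000000

$SmallHash = $Sha256.ComputeHash($SmallInput)
$LargeHash = $Sha256.ComputeHash($LargeInput)

# Each digest is a byte array; multiply its length by 8 to get the size in bits.
"{0} bytes in -> {1} bits out" -f $SmallInput.Length, ($SmallHash.Length * 8)
"{0} bytes in -> {1} bits out" -f $LargeInput.Length, ($LargeHash.Length * 8)

$Sha256.Dispose()
```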

This is why the hash of a password is generally considered decently secure, because there is NO way to reverse the hash, so you can pass that hash to someone else via insecure means (e.g. over a network connection,) and if the other person knows what your password should be, then they can know that the hash you gave them proves that you know the actual password.  That's a bit of a simplification, but it gets the point across.

If you were trying to attack a hash, all you can do, if you know what hash algorithm was used, is to keep feeding that same hash algorithm new inputs, maybe millions or billions of new inputs a second, and hope that maybe you can reproduce the same hash.  If you can reproduce the same hash, then you know your input was the same as the original ‘plaintext’ that you were trying to figure out.  Maybe it was somebody’s password.  This is the essence of a ‘brute-force’ attack against a password hash.
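A toy version of that brute-force loop might look like the sketch below. Everything specific in it is an assumption for illustration: the "password" is a made-up 4-digit PIN, Get-Sha256Hex is a helper I defined for the example, and a real attack would use far larger search spaces and far faster tooling.

```powershell
function Get-Sha256Hex {
    param([string]$Text)
    $Sha256 = [System.Security.Cryptography.SHA256]::Create()
    try {
        $Bytes = [System.Text.Encoding]::UTF8.GetBytes($Text)
        # Render the digest as a hex string so hashes can be compared as text.
        [System.BitConverter]::ToString($Sha256.ComputeHash($Bytes)) -replace '-', ''
    }
    finally {
        $Sha256.Dispose()
    }
}

# All the attacker has is this hash; the PIN behind it is hypothetical.
$TargetHash = Get-Sha256Hex '4271'

# Feed every possible 4-digit PIN through the same algorithm until a hash matches.
$Recovered = $null
foreach ($Guess in 0..9999) {
    $Candidate = $Guess.ToString('D4')
    if ((Get-Sha256Hex $Candidate) -eq $TargetHash) {
        $Recovered = $Candidate
        break
    }
}
$Recovered
```

The hash was never "reversed"; the loop just guessed until it reproduced it, which is why short or predictable inputs are the weak point.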

Logically, if all inputs regardless of size, are mapped to a fixed size, then it stands to reason that there must be multiple sets of data that, when hashed, result in the same hash.  These are known as hash collisions.  They are very rare, but they are very bad, and collisions are the reason we needed to migrate away from the MD5 hashing algorithm, and we will eventually need to migrate away from the SHA-1 hashing algorithm.  (No collisions have been found in SHA-1 yet that I know of.)  Imagine if I could create a fake SSL certificate that, when I creatively flipped a few bits here and there, resulted in the same hash as a popular globally trusted certificate!  That would be very bad.
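Collisions are much easier to see if you shrink the hash. The sketch below is a deliberately silly setup of my own: it keeps only the first byte of a SHA-256 digest, giving an 8-bit "hash" with just 256 possible values, so among 257 different inputs at least two must collide. The input strings are arbitrary.

```powershell
$Sha256 = [System.Security.Cryptography.SHA256]::Create()
$Seen = @{}          # tiny hash value -> first input that produced it
$Collision = $null

foreach ($i in 0..256) {
    $Text = "input-$i"
    $Bytes = [System.Text.Encoding]::UTF8.GetBytes($Text)

    # Keep only the first byte of the digest: a deliberately tiny 8-bit "hash".
    $Tiny = $Sha256.ComputeHash($Bytes)[0]

    if ($Seen.ContainsKey($Tiny)) {
        # Two different inputs, same tiny hash: a collision.
        $Collision = [pscustomobject]@{ First = $Seen[$Tiny]; Second = $Text; TinyHash = $Tiny }
        break
    }
    $Seen[$Tiny] = $Text
}

$Sha256.Dispose()
$Collision
```

With a full 256-bit output the same argument still holds, there are just astronomically more values to get through before a collision is forced.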

Also worth noting is that SHA-2 is an umbrella term, that includes SHA256, SHA384, SHA512, etc.

FIPS 140 is only concerned with algorithms used for external communication, meaning any communication outside of the application or module, whether that be network communication, or communication to another application on the same system, etc.  FIPS 140 is not concerned with algorithms used to handle data within the application itself, within its own private memory, that never leaves the application and cannot be accessed by unauthorized users.  Here is an excerpt from the 140-2 standard to back up my claim:

“Cryptographic keys stored within a cryptographic module shall be stored either in plaintext form or encrypted form. Plaintext secret and private keys shall not be accessible from outside the cryptographic module to unauthorized operators…”

Let’s use Active Directory as an example.  When someone gets concerned about what algorithms AD uses internally, you should refer them to the above excerpt and tell them not to worry about it.  Even if it were plaintext (it’s not, but even if hypothetically it were,) it isn’t in scope for FIPS because it is internal only to the application.  When Active Directory and its domain members are operated in FIPS mode, connections made via Schannel.dll, Remote Desktop, etc., will only use FIPS compliant algorithms. If you had applications before that make calls to non-FIPS crypto libraries, those applications will now crash.

Another loophole that has appeared to satisfy FIPS requirements in the past is wrapping a weaker algorithm inside of a stronger one.  For instance, a classic implementation of the RADIUS protocol utilizes the MD5 hashing algorithm during network communications.  MD5 is a big no-no.  However, see this excerpt from Cisco:

“RADIUS keywrap support is an extension of the RADIUS protocol. It provides a FIPS-certifiable means for the Cisco Access Control Server (ACS) to authenticate RADIUS messages and distribute session keys.”

So by simply wrapping weaker RADIUS keys inside of AES, it becomes FIPS-certifiable once again.  It would seem to follow that this logic also applies when using TLS and IPsec, as they are able to use very strong algorithms (such as SHA-2) that most applications do not natively support.

So with all that said, if you need the highest levels of network security, you need 802.1X and IPsec to protect all those applications that can't protect themselves.

Mind Your Powershell Efficiency Optimizations

A lazy Sunday morning post!

As most would agree, Powershell is the most powerful Windows administration tool ever seen. In my opinion, you cannot continue to be a Windows admin without learning it. However, Powershell is not breaking any speed records. In fact it can be downright slow. (After all, it's called Power-shell, not Speed-shell.)

So, as developers or sysadmins or devopsapotami or anyone else who writes Powershell, I implore you to take the time to benchmark and optimize your script/code, so that you don't further sully Powershell's reputation for being slow.

Let's look at an example.

$Numbers = @()
Measure-Command { (0 .. 9999) | ForEach-Object { $Numbers += Get-Random } }

I'm simply creating an array (of indeterminate size) and proceeding to fill it with 10,000 random numbers.  Notice the use of Measure-Command { }, which is what you want to use for seeing exactly how long things take to execute in Powershell.  The above procedure took 21.3 seconds.

So let's swap in a strongly-typed array and do the exact same thing:

[Int[]]$Numbers = New-Object Int[] 10000
Measure-Command { (0 .. 9999) | ForEach-Object { $Numbers[$_] = Get-Random } }

We can produce the exact same result, that is, a 10,000-element array full of random integers, in 0.47 seconds.

That's approximately a 45x speed improvement.

We called the Get-Random Cmdlet 10,000 times in both examples, so that is probably not our bottleneck. Using [Int[]]$Numbers = @() doesn't help either, so I don't think it's the boxing and unboxing overhead that you'd see with an ArrayList. Instead, it seems most likely that the dramatic performance difference was in using an array of fixed size, which eliminates the need to resize the array 10,000 times.
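If you don't know the final size up front, a generic list is a reasonable middle ground. This is a sketch of my own rather than something I benchmarked for this post, so I'm not quoting a timing; run it with Measure-Command on your own machine and compare:

```powershell
# A generic list grows its backing storage in chunks, so Add() does not
# copy the whole collection on every call the way += on an array does.
$Numbers = New-Object 'System.Collections.Generic.List[int]'

$Elapsed = Measure-Command {
    (0 .. 9999) | ForEach-Object { $Numbers.Add((Get-Random)) }
}

"{0} items added in {1:N2} seconds" -f $Numbers.Count, $Elapsed.TotalSeconds
```

If you need an actual array afterwards, $Numbers.ToArray() hands you an Int[] in one copy.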

Once you've got your script working, then you should think about optimizing it. Use Measure-Command to see how long specific pieces of your script take. Powershell, and all of .NET to a larger extent, gives you a ton of flexibility in how you write your code. There is almost never just one way to accomplish something. However, with that flexibility, comes the responsibility of finding the best possible way.

Taking a Peek Inside Powershell Cmdlets

Have you ever wondered how a particular Powershell Cmdlet works under the hood?  Maybe you're trying to mimic a certain behavior of a Cmdlet, and you'd like to see how Microsoft did it.

Turns out, it's surprisingly easy. The first thing you need is a .NET decompiler. There are many to choose from, but I like DotPeek.

Next, pick a Cmdlet, such as  Get-ADUser . To find the DLL that the Cmdlet comes from, do this:

Cmdlet DLL
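In case the screenshot above doesn't load, here is one way to get that information (a sketch, and not necessarily the exact command from the screenshot). I'm using Get-ChildItem as a stand-in so the example runs on a machine without the Active Directory module; swap in Get-ADUser if you have it installed:

```powershell
# Get-Command returns a CmdletInfo object for compiled Cmdlets,
# and its DLL property holds the path of the assembly that contains the Cmdlet.
$Command = Get-Command Get-ChildItem
$Command.DLL
```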

If you add a  | clip  on the end there, the output will go straight to your clipboard.

(Did you know the hexadecimal color code for the Powershell background color is 012456?)

Anyhow, now that we know in what DLL the Cmdlet resides, we need to find out which class within that DLL actually implements the Cmdlet.  We can do that with  Trace-Command :

Trace-Command

There's a little more output after that, but this last line here is what we want. Microsoft.ActiveDirectory.Management.Commands.GetADUser.
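If wading through trace output isn't your thing, there is a shortcut that arrives at the same kind of answer. To be clear, this is not the Trace-Command call from the screenshot, just an alternative sketch, and again Get-ChildItem is standing in for Get-ADUser so it runs anywhere:

```powershell
# ImplementingType is the .NET class that contains the Cmdlet's code.
$Command = Get-Command Get-ChildItem
$TypeName = $Command.ImplementingType.FullName
$TypeName
```

That full type name is what you go looking for once the DLL is open in the decompiler.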

Now we know the actual .NET class that implements the Cmdlet, and which DLL it's in. All that's left to do is fire up your .NET decompiler and disassemble!

Get-ADUser DotPeek