Back When I Was Young and Foolish (Part I of Oh-So-Many)

I'd like to do a little reminiscing now, of a time when I was young and foolish. I anticipate many such posts, which I guess means I spend a lot of time being young and foolish.

It was my first real IT job, and it'd be fair to call me a sysadmin. I was basically just an assistant to the lead architect. I was doing things like setting up backup schedules in DPM and creating user accounts in AD. Just the usual stuff you'd expect a guy first getting started in IT would be doing.

One day, my boss and I were discussing one of the many issues caused by being in the middle of a domain migration, and that was that we currently had employees working in two different Windows domains simultaneously. For various reasons, there would not and could not be a trust relationship established between the two. To further complicate matters, we now needed two separate ISA (now known as TMG) servers at the office, one for each domain, so as to keep the employees off of Facebook and ESPN.com. One might suggest that it was a social problem that should be dealt with by the employees' managers, but the managers were among the worst offenders. But I digress.

Now since both of these domains are on the same subnet, we can really only have one DHCP server. So if we only have one DHCP server, how are we going to push out two different sets of options to the clients, such as which proxy server to use, to members of either domain?

We settled on using two different DHCP classes.

So while we were fiddling around with the DHCP server, for no good reason we decided that right now was a great time to screw around with the basic settings of a DHCP server that had been serving us faithfully with no problems for years.

What is a good DHCP lease duration, really? Is it the default of 8 days? Well I suppose that depends on a lot of factors, such as how mobile your DHCP clients are and how often you expect them to be connecting to and disconnecting from the network, how many DHCP addresses you have to distribute, etc..  But I've already put more thought into it just now than we did on that day, for we almost completely arbitrarily set the DHCP lease duration to 30 minutes.

Fast forward a few months.

Everything had been working great and the office DHCP server, as DHCP servers should be, was all but forgotten about. I had just been offered a much higher-paying job with a much larger IT company, so I was now a short-timer at this job. Then, out of the blue, our primary domain controller (the one hosting all the FSMOs) at our remote datacenter goes dark.  I can still ILO into it, but both of its network adapters are just gone. No error messages in the logs, no sign that Windows had ever known that it did once have network connections, nothing.  It just looked like it had never had network adapters in the first place.  (This still perturbs me to this day. Why wasn't there an event in either the Windows logs or in the IML that said "Uhh dude where did your network adapters go!?")

After seizing the FSMO roles on the backup DC, we took a trip out to the datacenter to have a look. Sure enough, the link lights on the physical hardware were out, and upon rebooting, the BIOS of the machine didn't even recognize that it had ever had network adapters in it. So I had HP come out and replace the motherboard, which fixed the issue. I don't know what the point of having redundant NICs is if they're wired to both blow at the same time, but I digress again...

So now I have essentially a brand new server out of what used to be our primary domain controller. The question, my boss wanted to know, was how do we restore a former DC after its FSMO roles have been seized? My answer: You don't. You never bring it back. Ever. If I don't wipe the hard drive right now, it thinks it's still the owner of those FSMO roles, and the only way for it to learn that it no longer owns those FSMO roles is to connect it back to the domain network and let the KCC do its magic.  What's going to happen in that span of time that we now have two DCs in the domain that both simultaneously think that they own all the FSMOs? I didn't want to find out. So I argued that we just rebuild it and I eventually got my way.

Now since I had to rebuild the machine, I figured now was as good a time as any to upgrade it from Windows 2008 to Windows 2008 R2. Having spent the last several months studying my ass off for the MCITP tests and passing them, I was feeling pretty self-confident. I took the freshly rebuilt server back out to the datacenter. I re-cabled it and re-racked it, and powered it on. I grabbed the 2008 R2 version of adprep.exe and on the other domain controller, used it to run adprep /forestprep followed by adprep /domainprep. The domain was now ready to accept a 2008 R2 domain controller. I dcpromo'ed the new machine, and it went without a hitch. I then gracefully transferred the FSMOs back from the other DC. (I didn't alter the actual functional levels of the forest or domain.)

Now I had one new, fresh shiny 2008 R2 domain controller in the datacenter, with 5 other DCs in 2 other sites that were still on Windows 2008. Still high off my earlier success, I felt like this was a great time to do an in-place upgrade on the rest of those DCs.  One by one, I upgraded them from 2008 to 2008 R2, and you know what? I was actually pleasantly surprised at how smooth and painless it was. My hat goes off to Microsoft for making an in-place upgrade of a production domain controller so seamless.  I just remoted into the server, mounted the 2008 R2 media and hit "Upgrade."  That's all there was to it. The installation took about 30 minutes, and when the machine came back up, it was a pristine 2008 R2 box.

Everything was going great, until I got around to upgrading the last of the 6 DCs. That sixth DC was the DHCP server at the office. I followed the same procedure that had gone off without a hitch 5 times already. Confident that all was well, I set off to browse reddit while the DC upgrade finished. I was just thinking about how the upgrade was going kind of slow, when suddenly my Internet connection dropped. At the exact same instant, the phone rang.

Me: "Hello?"
Boss: "What did you do?"
Me: "Wha, I don't, I ..."
Boss: "RUN to the server room and set up a new DHCP scope. Now."

The entire office had just lost their Internet connections. In the intervening months, I had completely forgotten that we had set the DHCP lease duration on the domain controller to... oh, just a hair shorter than the amount of time it takes to complete an in-place upgrade on that domain controller. Had the lease time been just 5 minutes longer, the DC/DHCP server would have finished the upgrade and been back up serving renewals as if nothing had ever happened. By the time I had gotten the replacement scope set up on the other DC and done the dance of de-authorizing the first DC as as a DHCP server in Active Directory, and authorizing the new one, the original DC was back up and ready.

After about 30 minutes of repairing the damage, I sheepishly emerged from the server room, to the sound of ironic applause coming from all the employees in their cubicles.

Comments (1) -

Mats Jakobsson 1/14/2014 3:53:23 PM

What a story
BIG LOL
thanks for sharing

greetings from sweden

Comments are closed