Monday, July 9, 2012

Wow, time flies! I wrote my first blog post almost two weeks ago and it doesn't even feel like it has been that long.

Anyways, this post will finally be about some BSOD discussion, but it won't get into any specific cases just yet. This post is more about how I got into BSOD kernel dump analysis in the first place. After I write it, I'll have a look around, compile some of the successful BSOD analyses I have done, and post them here with in-depth explanations. So, without further ado, let's begin.

I built my current rig about... I'd say almost two years ago? It worked well for a long time, and I absolutely loved it. Well, one day, I shut it down real quick to see if an adapter I bought fit one of my GPUs, and when I went to power it back on, my Corsair 750W PSU shorted out and killed every single one of my components EXCEPT my two ATI 5850 video cards.

However, I did not know this until MUCH later. At first, I figured my motherboard and HDD were the only ones lost in combat. When I powered my rig back on, it wouldn't POST. I knew right away that my SATA controller was probably dead, as the HDD wasn't being detected in my system or in another one. So I called up Asus and Samsung, got RMAs set up with both, and had my system up and running again within about two weeks.

Flash forward two weeks to when I got everything up and running, and it was actually all downhill from there. I would BSOD VERY randomly after a few days of system uptime. It never happened soon after a cold boot; it always took a few days.

I had always been somewhat knowledgeable about computers in various ways, so I took it upon myself to see if anybody else was having my issues, what they were doing about them, etc. At the time, I only knew that the crashes were being caused by whatever driver was named as the culprit on the blue screen itself. The two most common were dxgmms1.sys and, of course... atikmpag.sys. So, with those two drivers in mind, I set off on a journey to Google. I came across various forums (Seven Forums, Tech Support Forum, etc.), all discussing this. While I was reading up on it, I decided to go ahead and learn how to analyze dump files.

I got the WinDbg client installed and the symbol path set up, and then set off. At first, I didn't know what the hell I was looking at. It was like looking at another language; hieroglyphics would be the closest thing I can think of. I just closed the client and gave up. I decided to RMA my RAM and my PSU to Corsair, and both were replaced. After about 3 weeks, I got my system up and running again on a fresh Windows 7 install. After a few days... boom, BSOD.

I sat there almost in tears. I couldn't believe it. I couldn't believe I had pretty much replaced my entire computer and it was still crashing. I couldn't believe I had spent over $2000 on a computer that was completely unstable, while a backup rig that I have in a plastic drawer with the mobo screwed to a piece of wood NEVER crashes. It just hurt... it really did, and it REALLY bothered me.

I said enough was enough and decided to learn how to debug and analyze dump files. So I once again set off on a journey, but this time I was going to learn for sure. I remember reading posts on various forums from jcgriff2, JMH3143, zigzag3143, satrow, writhziden, Vir Gnarus, etc. Without people like that, I wouldn't be where I am today with this hobby, and my knowledge would be absolute EONS behind.

After lots of reading, I had enough knowledge to understand that my issue was related to either DirectX... or my AMD drivers. I tackled DirectX first just in case, and it wasn't that... so I moved on to the AMD drivers. Before uninstalling and reinstalling the drivers, I thought about when these BSODs started happening. I remembered that before I RMA'd all of my parts, my system worked completely fine. I thought to myself, unless I am incredibly unlucky and all of this brand new replacement hardware somehow arrived DOA, the hardware should be fine and something ELSE is the culprit. At that moment I remembered that on my old Windows install, prior to my PSU failing, I was on CCC 12.1 and not CCC 12.3 (which is what I was on after I got the replacement hardware).

To be sure everything went smoothly, I went ahead and did a clean Windows 7 install. I might as well have, anyways. I had been BSOD'ing left and right on the existing install and had to make a lot of crappy adjustments and optimizations just to be stable for more than 15 minutes... so my OS and registry in general were probably in shambles.

After I got a clean install of Windows 7 going, satrow helped me get everything up and running again by running me through a checklist and such. After installing CCC 12.1 rather than CCC 12.3, the 0x116 TDR nightmares were... GONE!

At that moment, I took it upon myself to spend all of my free time helping others with their BSOD related issues. BSODs are not fun, let's get that out of the way. They cause real life stress and annoyance, waste time when work needs to be done, etc. They are a plague. Imagine having issues as persistent as mine were and not knowing how to solve them. I don't want people to have to go through what I did.

I made it my goal that when people get a BSOD, they don't have to go to the nearest Geek Squad and pay $100 to be told "Oh, we need to reinstall Windows" when it's just a simple driver culprit, etc. I want to help others with their issues, for free. There aren't many of us in the BSOD analysis community if you think about it. Compared to the number of computers in the world and the people who use them, we're almost non-existent.

So now, with my free time, which at this point in my life is ALL of my time until I start IT school, I read and read about BSODs courtesy of our BSOD communities. I also spend tons of time in these various communities solving BSOD related issues. The best part is, I have been noticed for this, and even awarded.

So here I am almost a year later, with knowledge that increases every single day thanks to my peers. I have made it my long-term goal to one day achieve the Microsoft MVP award. I know for a fact that my hard work, determination, and personality will get me there. It's just a matter of time.

:)
