Windows Blue Screen of Death: an Account of My PC's Memory Failure


This page documents an episode of personal computer failure.

I started to get the Blue Screen of Death in the past week. It got worse, and started to happen every hour or so. I spent the whole past 2 days diagnosing the problem. It's extremely painful. It seems to be a faulty memory module.

It's gonna take me perhaps 10 hours to write my story coherently, starting with my system spec, expected problems, symptoms of the past year, recent new problems, with all the tech detail and specs, and also to document my experience, my actions, and all sorts of software incompetence. Instead of doing that, here i'll haphazardly comment on some random points that come to my mind.

Note: after i wrote this article, i noticed that i'm still getting blue screen of death. The problem may not be my memory after all. So, don't pay much attention to places where i said how i fixed my memory or PC Doctor. In particular, PC Doctor's report of faulty memory might be due to malfunction in OS software somewhere.

What's Prefetcher, SuperFetch, ReadyBoost?

I learned about: Windows: What's Prefetcher, SuperFetch, ReadyBoost?.

Microsoft's Memory Diagnostics Tool Problems

In Windows, there's a “Memory Diagnostics Tool” under 〔Control Panel\Administrative Tools〕. The file path is %SystemRoot%\system32\MdSched.exe. When you launch it, it asks you to restart, basically setting up a scheduled task. When you reboot, it checks your memory. There are 2 problems i found with this tool:

(actually, as of now, i found the memory diagnostics result in Event Viewer in 〔Control Panel\Administrative Tools〕 (full path at %SystemRoot%\system32\eventvwr.msc /s)). Once you are in Event Viewer, it's under the tree 〔Microsoft, Windows, MemoryDiagnostics-Results, Debug〕.
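The same log can probably be read from a command line as well. Here is a minimal Python sketch that shells out to Windows' wevtutil tool. The channel name is an assumption pieced together from the Event Viewer tree path above and is not verified; the function names are made up for illustration.

```python
import subprocess
import sys

# ASSUMPTION: channel name derived from the Event Viewer tree
# <Microsoft, Windows, MemoryDiagnostics-Results, Debug>. Not verified.
CHANNEL = "Microsoft-Windows-MemoryDiagnostics-Results/Debug"


def build_query(channel=CHANNEL, count=5):
    """Build a wevtutil command that prints up to `count` events as text."""
    # "qe" = query events, /c = maximum number of events, /f = output format.
    return ["wevtutil", "qe", channel, f"/c:{count}", "/f:text"]


def read_memory_diagnostics(count=5):
    """Run the query and return whatever wevtutil printed (Windows only)."""
    if sys.platform != "win32":
        raise RuntimeError("wevtutil is only available on Windows")
    result = subprocess.run(
        build_query(count=count), capture_output=True, text=True, check=False
    )
    # If the channel does not exist, the error text ends up in stderr.
    return result.stdout or result.stderr
```

On a Windows machine, print(read_memory_diagnostics()) should show the oldest few entries of that log, or an error message if the assumed channel name is wrong.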

Note: when the tool is running, any error it finds is supposed to be displayed right away. But i only saw a message like “so far no error found” for the whole time it was running. I also tried the extended test. Same result.

Microsoft has a web page about the “Windows Memory Diagnostic” at http://oca.microsoft.com/en/windiag.asp. It's not clear to me whether this software is exactly the same as the one bundled with Windows Vista, but i think it's the same or a variant. The page has full documentation of the software and info about memory, quite well written. For the record, here's a text version capture: Windows_Memory_Diagnostic.txt.

PC-Doctor

What actually helped me find the problem was the tool PC-Doctor. On HP machines, it's in the menu 〖Start ▸ All Programs ▸ PC Help & Tools ▸ Hardware Diagnostic Tools〗. The full path is C:\Program Files\PC-Doctor for Windows. I didn't trust any such program that was bundled with HP shit. I never launched it since i got the machine 2 years ago. Only after a day of frustration did i run it. What a miracle, this tool actually told me i have a memory problem.

“PC Doctor start menu location screenshot”
Screenshot of PC Doctor's memory test page.

PC-Doctor has several memory tests. The one that failed is the first one: “Advanced Pattern Test”. Here's its online help description of this test:

Verifies memory cell corruption does not occur from read/write activity on adjacent cells.

This test checks for memory cell corruption from read/write activity on adjacent cells (cells are individual bits). It is run from memory address 0 through each memory cell sequentially to the top of extended memory, then from the top down to memory address 0.

Note: This test will only run on Extended Memory. Each version of the Windows operating system has an absolute minimum amount of physical memory that must be available to it at all times. The Advanced Pattern Test can cause a lot of paging, which can lead to a test time in excess of several hours.

Warning!: This test will stop and record a result of Cannot Run in the test log if the minimum available memory falls below 20 MB. This is only a factor when running the System Stress Test, which includes the Advanced Pattern Test.
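To make the description above concrete, here is a minimal Python sketch of a march-style pattern test: one pass from address 0 upward, then one pass from the top back down, checking each cell before overwriting it. This is not PC-Doctor's actual algorithm, and it runs on a simulated memory object, not on physical RAM. The names adjacent_pattern_test and CoupledMemory are hypothetical, made up for illustration.

```python
def adjacent_pattern_test(mem):
    """Return sorted addresses whose content changed unexpectedly.

    `mem` is any object supporting len() and integer index get/set.
    """
    faults = set()
    size = len(mem)

    # Fill every cell with the background pattern.
    for addr in range(size):
        mem[addr] = 0x00

    # Ascending pass: verify background, then write the inverse pattern.
    for addr in range(size):
        if mem[addr] != 0x00:
            faults.add(addr)
        mem[addr] = 0xFF

    # Descending pass: verify inverse, then restore the background.
    for addr in reversed(range(size)):
        if mem[addr] != 0xFF:
            faults.add(addr)
        mem[addr] = 0x00

    # Final check that the background pattern survived.
    for addr in range(size):
        if mem[addr] != 0x00:
            faults.add(addr)
    return sorted(faults)


class CoupledMemory:
    """Simulated RAM where writing 0xFF to one cell corrupts its neighbor."""

    def __init__(self, size, aggressor):
        self.cells = bytearray(size)
        self.aggressor = aggressor

    def __len__(self):
        return len(self.cells)

    def __getitem__(self, addr):
        return self.cells[addr]

    def __setitem__(self, addr, value):
        self.cells[addr] = value
        # Simulated defect: the write leaks into the next higher address.
        if addr == self.aggressor and value == 0xFF and addr + 1 < len(self.cells):
            self.cells[addr + 1] = 0xFF


print(adjacent_pattern_test(bytearray(64)))        # healthy simulated memory
print(adjacent_pattern_test(CoupledMemory(16, 5)))  # simulated defect at cell 5
```

Walking in both directions matters because a defect where one cell disturbs its higher-address neighbor is only visible if the neighbor is checked after the disturbing write, which depends on the direction of the pass.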

Screenshot of PC-Doctor's “Advanced Pattern Test” help screen.

Note: apparently, you cannot copy text from the PC-Doctor help screen, either due to sloppiness in the software or intentionally. What a pain in the ass with these mass-market software. However, the help file is at c:/Program Files/PC-Doctor for Windows/pcdrmemory.p5p. Here's a text version of the help file for the record: pcdrmemory.p5p.txt

My Memory Spec

There are 4 slots (called “banks”), each holding a module “2048 MB DDR2-SDRAM (PC2-6400 / 800 MHz)”. They are all made by Samsung, manufactured in 2009 Jan. (detail here: memory_spec.txt.) (See: Wikipedia DDR2 SDRAM)
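The numbers in that module label are related by simple arithmetic. Here is a small Python sketch, assuming the standard 64-bit data bus of a non-ECC DIMM (the bus width is not stated in my spec file); the function name is made up for illustration.

```python
def peak_bandwidth_mb_s(mega_transfers_per_sec, bus_width_bits=64):
    """Theoretical peak transfer rate of one module, in MB/s."""
    # Each transfer moves bus_width_bits / 8 bytes.
    return mega_transfers_per_sec * bus_width_bits // 8


modules = 4
module_size_mb = 2048
total_mb = modules * module_size_mb

# "800 MHz" on the label is the effective rate: 800 million transfers/sec.
per_module = peak_bandwidth_mb_s(800)

print(total_mb)    # total installed memory in MB
print(per_module)  # the "6400" in the name PC2-6400
```

So 4 modules of 2048 MB give 8192 MB in total, and 800 million transfers per second times 8 bytes per transfer gives the 6400 MB/s that the name PC2-6400 refers to.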

The 4 banks (slots) are divided into 2 channels, A and B. (See: Dual-channel architecture) Memory modules must be removed or replaced in pairs (i.e. 2 slots at a time; you cannot just remove a module from 1 slot). PC-Doctor didn't report which memory module is faulty, but the problem seems to be in the first 2 slots, because when i removed the modules in the 3rd and 4th slots and tried to reboot, the machine wouldn't boot.

Symptoms Over the Year

I've always had a problem with this machine since i got it in 2009-05. (My PC spec and story can be seen at: Why I'm Switching from Mac/Unix To PC/Windows.)

The problem is that whenever the PC has just woken up or just started, and i run Second Life, it often locks up. (no blue screen, just that the screen freezes, and mouse and keys are not operative. At that point, i have to hold the power button for a few secs to force the machine to shut down.) This has happened ever since i got the machine in 2009-05. I run Second Life about every day, and this happens about every other day.

I've run Microsoft's memory test and it didn't report any problem, and i always thought the problem was related to my graphics card, because the crash almost always (≈98% of the time) happens when i run Second Life.

On 2010-08-12, i posted my question to an online help forum at superuser.com, and got a helpful answer, realizing that the problem may be due to an under-powered power supply. My power supply is rated at 300W, but my graphics card spec demands ≈400W. Here's my post:

i have this random freezing problem for a year now. I'm pretty sure it's related to graphics card. Though, hard to pinpoint a description or search for solution.

about ≈2 times a week, my Windows Vista would froze. Meaning, the screen freeze and mouse and keyboard have no effect. The only way to get out i know of is holding down the Power key on the PC to force a shutdown. (Ctrl+Alt+Delete does not help)

When this happens, there's a random goggling sound for about 1 second.

The freezing happens usually when i'm in Second Life (which is a 3D game), but not always.

There's a very high chance of freezing when Windows just woke up from sleep, or the PC just powered on from power off. (so, every time i restart my computer due to freezing, there's a high chance that it'll freeze again immediately, sometimes before the Windows login screen shows up. More rarely (perhaps every few months), the LCD won't even show when i reboot (as if not getting any signal; stays in sleep state). When this happens, i force power-off, unplug the power and wait for 30 seconds, then power on the pc again.)

This freezing happens ever since i bought my PC last year.

I'm pretty sure it's not a gpu temperature problem, because i've install nvidia's system monitor. Also, i don't think it is a cpu ram problem, because i used Windows Memory Diagnostic to check and it didn't report problem. I'm a unix sys admin and web app programer. I don't know much about hardware.

my graphics card is: “BFG Tech NVIDIA GeForce 9800 GT 512 MB”

my graphics card driver has always up-to-date from nvidia. Right now it is version 257.21 (8.17.12.5721).

Microsoft Windows Vista Home Premium Edition, 64-bit Service Pack 2, Build 6002

DirectX 10.0 (6.0.6000.16386)

AMD Phenom(tm) 9650 Quad-Core Processor

my PC is: HP_Pavilion_A6750F_spec.txt

I don't have money to get a new power supply (i've lived on $3 per day for food for the past 3+ years, and in fact this blue-screen incident triggered me to contemplate suicide for real.) So, in the past year, a habit i developed is to never put my machine to sleep. With that approach, the crashes were reduced to perhaps once a week or every 2 weeks.

About 2 or 3 months ago, something else went wrong. That is, in Second Life, sometimes the screen would freeze or black out for like 10 seconds, and the app would crash often. (since then i've read that this is a symptom of a video driver crash. The video driver crashes, and the OS restarts it and re-connects to it. According to some gaming forum online, this can happen if you overclock your graphics card.)

And in the past 3 days, blue-screen started to happen about every hour, and arbitrary software would crash randomly.

In summary, i think there are 2 problems with my machine:

Addendum: I still don't know what's going on. Right now, my comp is running on 2 memory modules (took out the other 2; one of them might be defective.)

I think one cause of the frequent blue screen of death is faulty software. It'll take a lot of writing to explain but here's a quick try. First, a quick chronology:

One part i haven't mentioned in this story is installing a driver for the onboard video chip, the “ATI Radeon™ HD 3200”. That's a whole ordeal. Here's that story. So, on 2011-05-13, due to the BSoD, i thought perhaps i'll just use the onboard graphics chip. If my problem is caused by the graphics card with an under-powered power supply, then switching to the onboard gpu would solve the problem. Though, i've almost never used the onboard gpu except the 1st day i got the machine. So, the driver must be very outdated.

At first, i just unplugged the NVIDIA graphics card, re-plugged the DVI cable connected to my display, started the machine, pressed F10 to get into the BIOS, and switched to onboard graphics (instead of PCIe). I don't remember exactly what happened except that there were problems. Then i recalled that the driver must be very outdated, so i tried to update it. Now, that was a pain in the ass. It took me some 4 hours, amidst many soft reboots or forced reboots and waiting for the computer, much of the time in “Safe Mode with Networking”. The gist is that when i went to the site 〔http://support.amd.com/〕 to get the driver, it took many tries to find it, because often it's not listed. Then, when i was finally able to download after the nth time of entering all the hardware model numbers and OS etc, the site said something like permission denied because i didn't arrive at the page naturally, or some such. (yes i have full cookies and js on.) Extremely incompetent. Eventually, i found another page on the domain that let me download the driver. I think it's this one: 〔http://support.amd.com/us/gpudownload/Pages/index.aspx〕.

After i downloaded it (it's called Catalyst Software Suite), it crashed Windows or just wasn't successful in some way. (it's funny that the installation software has the luxury to display game ads while it is running. F*** you.) After several tries (translation: several hours), i learned that this software cannot be installed while in Safe Mode. So, eventually, i got it installed while in normal Windows mode. (which is a miracle, because: your machine crashes while in normal mode because you don't have the right driver, but the driver can only be installed in normal mode. LOL.)

In the end, after a day of several soft/forced reboots and crashes and screen freezes and BSoD or screen scrambling, i was able to run Windows with the latest driver software for my onboard “ATI Radeon™ HD 3200” gpu. Guess what, it doesn't work! In fact, in retrospect, the BSoD became more frequent, and another new symptom started to show: the system would freeze with the screen scrambled.

Now, sometime the next day or so, between many switches among the onboard gpu chip or the nvidia gpu, i realized one thing. That is, i think the Catalyst driver software installed an outdated Windows component that has caused me more crashes. I realized this because i happened to run Windows update manually on a whim (how thoughtful of me!), and it installed “MS11-025: Description of the security update for Visual C++ 2010 Redistributable Package: April 12, 2011”. So i think Catalyst installed an older version of that package while a newer version already existed on my machine.

… the above account is all slightly muddied up and written almost as fast as i can type. Suffice to say it's 3 days of frustration, reboot, wait for 10 to 30 min, crash, reboot … check event viewer, errors this or that, check task scheduler, try PC Doctor, try memory diagnostic, chkdsk, try search web randomly, think, try, reboot, suddenly Windows says “The task image is corrupt or has been tampered with.User_Feed_Synchronization-{42646F39-1993-43C3-9079-3A0DF3E11259}” … whatnot shit. Among the actions i've taken is to think about suicide and sleep and go back to sleep again. Y'know? When i wake up, everything will be better.

Back to the present. I think after the Windows update, things became more normal. I don't get BSoD for 3 or 4 hours if i just don't run any 3D app.

Right now, i'm using the NVIDIA graphics card and 2 memory modules. On the whole, i still don't know what exactly is wrong. Running the NVIDIA graphics card seems much more stable than using the onboard gpu. I think something must be faulty in the graphics department, because often the crash is related to running 3D apps, but i think there are also other things that went wrong in the past week. I still am not sure if my other 2 memory modules are defective. Certainly some software on my machine has problems. I don't know why Windows reports that my task scheduler image is corrupt. I suppose forced reboots or memory errors could corrupt OS files, but i think that's rare.

Note: in the whole experience, i actually learned quite a lot about PC and Windows admin tools. Event Viewer, BIOS, etc shit. Also, when you search for Windows/PC problems on the web, most of it is f***ing garbage. Ignorant teen gamers' chats, money-making sites with tons of ads but garbage info (f*ck Google), partial and circumstantial info (if you can discern it at all). Though, almost always you'll find your question out there, but no answer.
