Windows Blue Screen of Death: An Account of My PC's Memory Failure

By Xah Lee.

This page documents an episode of personal computer failure.

I started to get Blue Screen of Death in the past week. It got worse, and started to happen every hour or so. I spent the whole past 2 days diagnosing the problem. It's extremely painful. It seems to be a faulty memory module.

It's gonna take me perhaps 10 hours to write my story coherently, starting with my system spec, expected problems, symptoms of the past year, recent new problems, with all the tech detail and specs, and also to document my experience, my actions, and all sorts of software incompetence. Instead of doing that, here i'll haphazardly comment on some random points that come to my mind.

Note: after i wrote this article, i noticed that i'm still getting blue screen of death. The problem may not be my memory after all. So, don't pay much attention to places where i said how i fixed my memory or PC Doctor. In particular, PC Doctor's report of faulty memory might be due to malfunction in OS software somewhere.

What is Prefetcher, SuperFetch, ReadyBoost?

I learned about these here: Windows: What's Prefetcher, SuperFetch, ReadyBoost?

Microsoft's Memory Diagnostics Tool Problems

In Windows, there's a “Memory Diagnostics Tool” under [Control Panel\Administrative Tools]. The file path is %SystemRoot%\system32\MdSched.exe. When you launch it, it asks you to restart, basically setting up a scheduled task. When you reboot, it checks your memory. There are 2 problems i found with this tool:

(actually, as of now, i found the memory diagnostics result in Event Viewer in [Control Panel\Administrative Tools] (full path at %SystemRoot%\system32\eventvwr.msc /s)). Once you are in Event Viewer, it's under the tree [Microsoft, Windows, MemoryDiagnostics-Results, Debug].
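The same results can probably be read from a command prompt instead of clicking through the Event Viewer tree. Here's a minimal sketch, assuming the tree path above corresponds to a channel named "Microsoft-Windows-MemoryDiagnostics-Results/Debug" (that name is my guess from the tree, not something i verified), and using the stock wevtutil tool that ships with Windows. The function names are made up for illustration. Debug logs are often disabled by default, so the output may be empty.

```python
import subprocess
import sys

# Assumed channel name, derived from the Event Viewer tree path
# (Microsoft > Windows > MemoryDiagnostics-Results > Debug). Not verified.
CHANNEL = "Microsoft-Windows-MemoryDiagnostics-Results/Debug"


def build_query(channel=CHANNEL, count=5):
    """Return the wevtutil argument list that prints the newest `count` events as text."""
    return ["wevtutil", "qe", channel, f"/c:{count}", "/rd:true", "/f:text"]


def read_results(channel=CHANNEL, count=5):
    """Run the query. Only meaningful on Windows; may need an elevated prompt."""
    done = subprocess.run(build_query(channel, count), capture_output=True, text=True)
    return done.stdout if done.returncode == 0 else done.stderr


if __name__ == "__main__":
    if sys.platform == "win32":
        print(read_results())
    else:
        # Elsewhere, just show the command that would be run.
        print(" ".join(build_query()))
```

Here "qe" queries events, "/c:" limits the count, "/rd:true" reads newest first, and "/f:text" prints plain text.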

Note: when the tool is running, any error it finds is displayed right away. But i only saw a message like “so far no error found” for the full length of the time it was running. I also tried the extended test. Same result.

Microsoft has a web page about the “Windows Memory Diagnostic” at [ http://oca.microsoft.com/en/windiag.asp ]. It's not clear to me whether this software is exactly the same as the one bundled with Windows Vista, but i think it's the same or a variant. The page has full documentation of the software and info about memory, quite well written. For the record, here's a text version capture: Windows_Memory_Diagnostic.txt

PC-Doctor

What actually helped me find the problem was the tool PC-Doctor. On HP machines, it's in the menu [Start ▸ All Programs ▸ PC Help and Tools ▸ Hardware Diagnostic Tools]. The full path is C:\Program Files\PC-Doctor for Windows. I didn't trust any such program that was bundled with HP shit. Never launched it since i got the machine 2 years ago. Only after a day of frustration did i run it. What a miracle, this tool actually told me i have a memory problem.

“PC Doctor start menu location screenshot”
Screenshot of PC Doctor's memory test page.

PC-Doctor has several memory tests. The one that failed is the first one: “Advanced Pattern Test”. Here's its online description of this test:

Verifies memory cell corruption does not occur from read/write activity on adjacent cells.

This test checks for memory cell corruption from read/write activity on adjacent cells (cells are individual bits). It is run from memory address 0 through each memory cell sequentially to the top of extended memory, then from the top down to memory address 0.

Note: This test will only run on Extended Memory. Each version of the Windows operating system has an absolute minimum amount of physical memory that must be available to it at all times. The Advanced Pattern Test can cause a lot of paging, which can lead to a test time in excess of several hours.

Warning!: This test will stop and record a result of Cannot Run in the test log if the minimum available memory falls below 20 MB. This is only a factor when running the System Stress Test, which includes the Advanced Pattern Test.

Screenshot of PC-Doctor's “Advanced Pattern Test” help screen.
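To make the description more concrete, here's a toy sketch of that kind of test: walk up through memory, then back down, checking that writing one cell didn't disturb another. This is not PC-Doctor's actual algorithm (i don't know what patterns it uses). It's a simplified march-style test run against a simulated memory, with one injected fault where writing cell 5 flips cell 6. The class and function names are made up for illustration.

```python
class SimulatedMemory:
    """A toy bit-addressable memory; `coupled` maps an aggressor cell to a victim cell."""

    def __init__(self, size, coupled=None):
        self.cells = [0] * size
        self.coupled = coupled or {}

    def read(self, addr):
        return self.cells[addr]

    def write(self, addr, bit):
        changed = self.cells[addr] != bit
        self.cells[addr] = bit
        # Injected fault: a value change in the aggressor flips the victim.
        if changed and addr in self.coupled:
            self.cells[self.coupled[addr]] ^= 1


def march_test(mem, size):
    """Return the sorted addresses whose read-back value was wrong."""
    bad = set()
    for addr in range(size):            # initialize everything to 0
        mem.write(addr, 0)
    for addr in range(size):            # upward pass: expect 0, then write 1
        if mem.read(addr) != 0:
            bad.add(addr)
        mem.write(addr, 1)
    for addr in reversed(range(size)):  # downward pass: expect 1, then write 0
        if mem.read(addr) != 1:
            bad.add(addr)
        mem.write(addr, 0)
    for addr in range(size):            # final check: expect 0 everywhere
        if mem.read(addr) != 0:
            bad.add(addr)
    return sorted(bad)


print(march_test(SimulatedMemory(16), 16))          # healthy memory
print(march_test(SimulatedMemory(16, {5: 6}), 16))  # cell 5 disturbs cell 6
```

The healthy memory reports no bad addresses, and the faulty one reports address 6. A sketch this short does not catch every kind of coupling fault; real tests use more passes and more data patterns.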

Note: apparently, you cannot copy text from the PC-Doctor help screen, either from sloppiness in the software or by intent. What a pain in the ass with this mass-market software. However, the help file is at c:/Program Files/PC-Doctor for Windows/pcdrmemory.p5p. Here's a text version of the help file for the record: [pcdrmemory.p5p.txt ]

My Memory Spec

There are 4 slots (called “banks”), each slot is a module “2048 MB DDR2-SDRAM (PC2-6400 / 800 MHz)”. They are all made by Samsung, manufactured in 2009 Jan. (detail here: memory_spec.txt.) (See: Wikipedia [ https://en.wikipedia.org/wiki/DDR2_SDRAM ])

The 4 banks (slots) are divided into 2 channels, A and B. (See: Dual-channel architecture) Memory module replacement must be done in pairs (i.e. 2 slots at a time. For example, you cannot just remove a module from 1 slot.) PC-Doctor didn't report which memory module is faulty, but the problem seems to be in the first 2 slots, because when i first removed the memory modules in the 3rd and 4th slots and tried to reboot, the machine wouldn't boot.

Symptoms Over the Year

I've always had a problem with this machine since i got it in 2009-05. (My PC spec and story can be seen at: Why I'm Switching from Mac/Unix To PC/Windows.)

The problem is that whenever the PC has just woken up or just started and i run Second Life, it would often lock up. (no blue screen; the screen just froze, and mouse and keys were not operative. At that point, i had to hold the power button for a few secs to force the machine to shut down.) This has happened ever since i got the machine in 2009-05. I run Second Life about every day, and this happens about every other day.

I've run Microsoft's memory test and it didn't report any problem, and i always thought the problem was related to my graphics card, because the crash almost always (~98% of the time) happens when i run Second Life.

On 2010-08-12, i posted my question to an online help forum at [ http://superuser.com/questions/175079/pc-randomly-freeze-due-to-graphics-card/175120#175120 ], and got a helpful answer, realizing that the problem may be due to an under-powered power supply. My power supply is rated at 300W, but my graphics card spec demands ~400W. Here's my post:

i have this random freezing problem for a year now. I'm pretty sure it's related to graphics card. Though, hard to pinpoint a description or search for solution.

about ~2 times a week, my Windows Vista would froze. Meaning, the screen freeze and mouse and keyboard have no effect. The only way to get out i know of is holding down the Power key on the PC to force a shutdown. (Ctrl+Alt+Delete does not help)

When this happens, there's a random goggling sound for about 1 second.

The freezing happens usually when i'm in Second Life (which is a 3D game), but not always.

There's a very high chance of freezing when Windows just woke up from sleep, or the PC just powered on from power off. (so, every time i restart my computer due to freezing, there's a high chance that it'll freeze again immediately, sometimes before the Windows login screen shows up. More rarely (perhaps every few months), the LCD won't even show when i reboot (as if not getting any signal; stays in sleep state). When this happens, i force power-off, unplug the power and wait for 30 seconds, then power on the pc again.)

This freezing happens ever since i bought my PC last year.

I'm pretty sure it's not a gpu temperature problem, because i've install nvidia's system monitor. Also, i don't think it is a cpu ram problem, because i used Windows Memory Diagnostic to check and it didn't report problem. I'm a unix sys admin and web app programer. I don't know much about hardware.

my graphics card is: “BFG Tech NVIDIA GeForce 9800 GT 512 MB”

my graphics card driver has always up-to-date from nvidia. Right now it is version 257.21 (8.17.12.5721).

Microsoft Windows Vista Home Premium Edition, 64-bit Service Pack 2, Build 6002
DirectX 10.0 (6.0.6000.16386)
AMD Phenom(tm) 9650 Quad-Core Processor

my PC is: HP_Pavilion_A6750F_spec.txt

I don't have money to get a new power supply (i've lived on $3 per day for food for the past 3+ years, and in fact this blue-screen incident triggered me to contemplate suicide for real.) So, in the past year, a habit i developed is to never set my machine to sleep. With that approach, the crashes were reduced to perhaps once a week or every 2 weeks.

About 2 or 3 months ago, something else went wrong. That is, in Second Life, sometimes the screen will freeze or black out for like 10 seconds, and the app will crash often. (since then i've read that this is a symptom of a video driver crash. The video driver crashes, and the OS restarts it and re-connects to it. According to some gaming forum online, this can happen if you overclock your graphics card.)

And in the past 3 days, blue-screen started to happen about every hour, and arbitrary software would crash randomly.

Overall, in summary, i think there are 2 problems with my machine:

Addendum: I still don't know what's going on. Right now, my comp is running on 2 memory modules (took out the other 2; one of them might be defective.)

I think one cause of the frequent blue screens of death is faulty software. It'll take a lot of writing to explain, but here's a quick try. First, a quick chronology:

One part i haven't mentioned in this story is installing a driver for the onboard video chip, the “ATI Radeon™ HD 3200”. That's a whole ordeal. Here's that story. On 2011-05-13, due to the BSoD, i thought perhaps i'll just use the onboard graphics chip. If my problem is caused by the graphics card with an under-powered power supply, then switching to the onboard gpu would solve the problem. Though, i've almost never used the onboard gpu except the 1st day i got the machine. So, the driver must be very outdated.

At first, i just unplugged the NVIDIA graphics card, re-plugged the DVI cable connected to my display, started the machine, pressed F10 to get into the BIOS, and switched to onboard graphics (instead of PCIe). I don't remember exactly what happened except that there were problems. Then i recalled that the driver must be very outdated, so i tried to update it. Now, that was a pain in the ass. It took me some 4 hours, amid many soft reboots or forced reboots and waiting for the computer, many times in “Safe Mode with Networking”. The gist is that when i went to the site [http://support.amd.com/] to get the driver, it took many tries to find it, because often it's not listed. Then, when i was finally able to download it, after the nth time of entering all the hardware model numbers, OS, etc, the site said something like permission denied because i didn't arrive at the page naturally, or some such. (yes, i have full cookies and js on.) Extremely incompetent. Eventually, i found another page on the domain that let me download the driver. I think it's this one: [http://support.amd.com/us/gpudownload/Pages/index.aspx].

After i downloaded it (it's called Catalyst Software Suite), it crashed Windows or just wasn't successful in some way. (it's funny that the installation software has the luxury to display game ads while it is running. FAAK you.) After several tries (translation: several hours), i learned that this software cannot be installed while in Safe Mode. So, eventually, i got it installed while in normal Windows mode. (which is a miracle, because: your machine crashes while in normal mode because you don't have the right driver, but the driver can only be installed in normal mode. LOL.)

In the end, after a day of several soft/forced reboots and crashes and screen freezes and BSoD or screen scrambling, i was able to run Windows with the latest driver software for my onboard “ATI Radeon™ HD 3200” gpu. Guess what, it doesn't work! In fact, in retrospect, the BSoD was more frequent, and another new symptom started to show: the system would freeze with the screen scrambled.

Now, sometime in the next day or so, between many switches between the onboard gpu chip and the nvidia gpu, i realized one thing: i think the Catalyst driver software installed an outdated Windows component that has caused me more crashes. I realized this because i happened to run Windows Update manually on a whim (how thoughtful of me!), and it installed “MS11-025: Description of the security update for Visual C++ 2010 Redistributable Package: April 12, 2011”. So i think Catalyst installed an older version of that package while a newer version already existed on my machine.

… the above account is all slightly muddied up and written almost as fast as i can type. Suffice to say it's 3 days of frustration, reboot, wait for 10 to 30 min, crash, reboot … check event viewer, errors this or that, check task scheduler, try PC Doctor, try memory diagnostic, chkdsk, try searching the web randomly, think, try, reboot, suddenly Windows says “The task image is corrupt or has been tampered with.User_Feed_Synchronization-{42646F39-1993-43C3-9079-3A0DF3E11259}” … whatnot shit. Among the actions i've taken is to think about suicide and sleep and go back to sleep again. Y'know? When i wake up, everything will be better.

Back to the present. I think after the Windows update, things became more normal. I don't get BSoD for 3 or 4 hours if i just don't run any 3D app.

Right now, i'm using the NVIDIA graphics card and 2 memory modules. On the whole, i still don't know what exactly is wrong. Running the NVIDIA graphics card seems much more stable than using the onboard gpu. I think something must be faulty in the graphics department, because often the crash is related to running 3D apps, but i think there are also other things that went wrong in the past week. I'm still not sure if my other 2 memory modules are defective. Certainly some software on my machine got problems. I don't know why Windows reports that my task scheduler image is corrupt. I suppose a forced reboot or a memory error could corrupt OS files, but i think that's rare.

Note: in the whole experience, i actually learned quite a lot about PC and Windows admin tools. Event Viewer, BIOS, etc shit. Also, when you search for Windows/PC problems on the web, most of it is faaking garbage. Ignorant teen gamers' chats, money-making sites with tons of ads but garbage info (faak Google), partial and circumstantial info (if you can discern it at all). Almost always you'll find your question out there, but no answer.

In the past 2 days, i spent like 20 hours rebooting my PC, trying to fix it. (See: Windows Blue Screen of Death: An Account of My PC's Memory Failure) In the process, i learned quite a lot of things about computer memory, PC components, and Windows technology. Here's something i learned about memory.

memtest86+ screenshot (from Wikipedia)

** memory problem detected by Memtest86+ v4.10.

failing address • error bit
0012290d7d8 • 00004000
0012290d838 • 00004000
0012290d7b8 • 00004000
0012290d818 • 00004000
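The two columns above are easier to read once decoded. Here's a small sketch that takes the four rows exactly as reported and works out which bit is flipped and roughly where in the address space the failures sit. It assumes both columns are hexadecimal (the usual Memtest86+ display) and that the error-bit column is a mask of the bits that differed. The function names are made up for illustration. It does not try to map an address to a physical module, since that depends on the motherboard.

```python
# The four rows reported by Memtest86+: (failing address, error-bit mask), both hex.
ROWS = [
    ("0012290d7d8", "00004000"),
    ("0012290d838", "00004000"),
    ("0012290d7b8", "00004000"),
    ("0012290d818", "00004000"),
]


def flipped_bits(mask_hex):
    """Return the positions (0 = least significant) of the bits set in the mask."""
    mask = int(mask_hex, 16)
    return [pos for pos in range(mask.bit_length()) if (mask >> pos) & 1]


def megabyte_of(address_hex):
    """Return which MiB of the physical address space the address falls in."""
    return int(address_hex, 16) >> 20


def summarize(rows):
    """Collect the distinct flipped bits, the MiB offsets, and the address spread."""
    addresses = [int(addr, 16) for addr, _ in rows]
    return {
        "bits": sorted({b for _, mask in rows for b in flipped_bits(mask)}),
        "megabytes": sorted({megabyte_of(addr) for addr, _ in rows}),
        "span_bytes": max(addresses) - min(addresses),
    }


print(summarize(ROWS))
# → {'bits': [14], 'megabytes': [4649], 'span_bytes': 128}
```

So all four errors are the same single bit (bit 14), at addresses within 128 bytes of each other, about 4649 MiB into the address space. Errors that cluster like this look more like one bad spot on a chip than random software corruption, though that is only my reading of it.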

[Memtest][ https://en.wikipedia.org/wiki/Memtest ]

** ddr2 sdram
http://en.wikipedia.org/wiki/DDR2_SDRAM
# Optionally implement ECC, which is an extra data byte lane used for correcting minor errors and detecting major errors for better reliability. Modules with ECC are identified by an additional ECC in their designation. PC2-4200 ECC is a PC2-4200 module with ECC.

** ecc
 http://en.wikipedia.org/wiki/ECC_RAM#Errors_and_error_correction

Electrical or magnetic interference inside a computer system can cause a single bit of DRAM to spontaneously flip to the opposite state. It was initially thought that this was mainly due to alpha particles emitted by contaminants in chip packaging material, but research[6] has shown that the majority of one-off ("soft") errors in DRAM chips occur as a result of background radiation, chiefly neutrons from cosmic ray secondaries, which may change the contents of one or more memory cells or interfere with the circuitry used to read/write them. There was some concern that as DRAM density increases further, and thus the components on DRAM chips get smaller, while at the same time operating voltages continue to fall, DRAM chips will be affected by such radiation more frequently—since lower energy particles will be able to change a memory cell's state. On the other hand, smaller cells make smaller targets, and moves to technologies such as SOI may make individual cells less susceptible and so counteract, or even reverse this trend. Recent studies[7] show that single event upsets due to cosmic radiation have been dropping dramatically with process geometry and previous concerns over increasing bit cell error rates are unfounded.

This problem can be mitigated by using DRAM modules that include extra memory bits and memory controllers that exploit these bits. These extra bits are used to record parity or to use an error-correcting code (ECC). Parity allows the detection of all single-bit errors (actually, any odd number of wrong bits). The most common error correcting code, a SECDED Hamming code, allows a single-bit error to be corrected and (in the usual configuration, with an extra parity bit) double-bit errors to be detected.

Seymour Cray famously said "parity is for farmers" when asked why he left this out of the CDC 6600.[8] He included parity in the CDC 7600. The original IBM PC and all PCs until the early 1990s used parity checking.[9] Later ones mostly did not. Wider memory buses make parity and especially ECC more affordable. Many current microprocessor memory controllers, including almost all AMD 64-bit offerings, support ECC, but many motherboards and in particular those using low-end chipsets do not.

An ECC-capable memory controller as used in many modern PCs can typically detect and correct errors of a single bit per 64-bit "word" (the unit of bus transfer), and detect (but not correct) errors of two bits per 64-bit word. Some systems also 'scrub' the errors, by writing the corrected version back to memory. The BIOS in some computers, and operating systems such as Linux, allow counting of detected and corrected memory errors, in part to help identify failing memory modules before the problem becomes catastrophic.
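To see how “correct one bit, detect two” works, here's a toy sketch of a SECDED code: a Hamming(7,4) code plus one overall parity bit, protecting 4 data bits with 4 check bits. Real ECC memory applies the same idea to a 64-bit word with 8 check bits. The 4-bit size and the function names here are only for illustration, not how any particular memory controller does it.

```python
def encode(nibble):
    """Encode 4 data bits into an 8-bit extended Hamming codeword (list of bits)."""
    d = [(nibble >> i) & 1 for i in range(4)]
    word = [0] * 8  # index 0 holds the overall parity bit
    # Data bits sit at positions 3, 5, 6, 7; check bits at positions 1, 2, 4.
    word[3], word[5], word[6], word[7] = d
    word[1] = word[3] ^ word[5] ^ word[7]
    word[2] = word[3] ^ word[6] ^ word[7]
    word[4] = word[5] ^ word[6] ^ word[7]
    word[0] = sum(word[1:]) % 2  # makes the total number of 1 bits even
    return word


def decode(word):
    """Return (status, nibble); status is 'ok', 'corrected', or 'double-error'."""
    word = list(word)
    syndrome = 0
    for pos in range(1, 8):
        if word[pos]:
            syndrome ^= pos  # XOR of set positions is 0 for a valid codeword
    overall = sum(word) % 2  # 0 when total parity is still even
    if syndrome == 0 and overall == 0:
        status = "ok"
    elif overall == 1:
        # Odd number of flips: assume exactly one, located at `syndrome`
        # (syndrome 0 here means the overall parity bit itself flipped).
        word[syndrome] ^= 1
        status = "corrected"
    else:
        # Even parity but nonzero syndrome: two bits flipped, cannot repair.
        return "double-error", None
    nibble = word[3] | word[5] << 1 | word[6] << 2 | word[7] << 3
    return status, nibble


good = encode(0b1011)
one_flip = list(good)
one_flip[5] ^= 1
two_flips = list(good)
two_flips[2] ^= 1
two_flips[6] ^= 1
print(decode(good), decode(one_flip), decode(two_flips))
```

The overall parity bit alone is the plain “parity” scheme from the text: it notices any odd number of flips but cannot say where. The three Hamming check bits are what locate a single flipped bit so it can be repaired.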

Memory Testing, PC Details, Hello Ubuntu Linux

About a week has passed. I finally found out what my PC's problem is. At least, one of the problems, for sure.

Last time, remember that i took out 2 memory modules and the blue-screen-of-death (BSoD) stopped? But after 2 days, i thought, why not put them back, since, after all, Microsoft's Memory Diagnostics Tool didn't find any problem with them? (and i have run that memory diagnostic a thousand times.) Perhaps my problem is elsewhere.

The first time i put back the 2 modules and rebooted, the machine started to beep. Tried it again, beep again. That usually means a hardware problem, but i was too lazy and tired to search for the motherboard manual to see what the beeps signify. But the next day, i tried to put the memory back in again. This time, no beep. Then i ran the memory diagnostic many times afterwards, and it didn't report any problem. But the next day, BSoD came back. As before, it was frequent, like a few times an hour. This got me completely discouraged. Today and yesterday, i spent about 20 hours dealing with my PC. Finally, the last resort was either to do a restore to factory default, or to install Linux. Either solution was going to take me a day, with no guarantee that it would work. I wasn't looking forward to it, but had to.

So, i took a few hours to make sure all my personal files are backed up and up-to-date. (am pretty sure that if i go the Linux route, i'll lose lots of file details, such as a year's worth of email in Microsoft Mail format. Anytime you switch OS or change email client, you are bound to lose lots of format, attachments, etc, of your emails. See: History of Email, 2009Unix and the mbox Email Format.)

The first thing i tried was to look at Linux, Ubuntu in particular. I wasn't hopeful this would work. Long story short, after 5 hours of trying to install it, i also found the answer to my PC problem, namely, a faulty memory module. And the Ubuntu experience was actually quite spectacular. But first, let's discuss memory.

[ http://www.pendrivelinux.com/universal-usb-installer-easy-as-1-2-3/ ]

long story

Windows Update error 84C40007

Windows Update error 84C40007 http://windows.microsoft.com/en-GB/windows-vista/Windows-Update-error-84C40007

windows update error .net framework 4 will not update - error code 66A

[ http://answers.microsoft.com/en-us/windows/forum/windows_7-windows_update/ms-sql-server-2008-service-pack-2-kb2285068-error/554de6fd-c1ba-4974-b669-af517b4bdfea ]