I had an incredible number of problems with a new PC that I sold to a customer.
The PC used an Intel DG41RQ motherboard and 4 GB of RAM, and I installed Windows 7 64-bit.
Windows 7 installed without any problems, but while testing the PC I would occasionally get an application error (e.g. “Internet Explorer has stopped working”)… It was very infrequent, though, and after the error the application would just keep going as if nothing had happened.
So I quickly forgot about it. But once the customer got the PC, she would often get errors with Thunderbird email (stopped working)… but worse than that, sometimes the PC would spontaneously restart, sometimes it wouldn’t shut down, and sometimes it would go into the “repair windows 7” screen… All very worrying for someone who has purchased a brand new PC and doesn’t know much about computers.
This would also not be good for my reputation unless I could fix the problem quickly.
I go back and run memtest86+ (just until it completes the block copy test), and it finds no problems.
Next, I update the BIOS. In the past, BIOS bugs have often caused problems with new PCs.
After that I leave, asking the customer to keep an eye on things, and to let me know if the problem is still there.
A few days later, I’m told there are more problems.
I go out again, and I take a look at the Windows event log.
I see it has a lot of weird errors. Some of the most severe are:
- MSE oobe stopped: 0xc000000d
- driver detected a controller error on \Device\Harddisk2\DR2
- Circular Kernel Context Logger failed to start with the following error: 0xC0000035
- BAD_POOL_CALLER (this was a blue screen of death!)
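For anyone wondering what the two hex codes in that list mean: they appear to be NTSTATUS values, which pack a severity, a facility and a code number into 32 bits. Here is a rough Python sketch that splits them up. The helper name decode_ntstatus is made up for illustration, and the two names in the lookup table are the ones Microsoft's ntstatus.h gives for these values.

```python
# Names for the two codes seen in the event log (from ntstatus.h).
KNOWN_STATUS = {
    0xC000000D: "STATUS_INVALID_PARAMETER",
    0xC0000035: "STATUS_OBJECT_NAME_COLLISION",
}

# The top two bits of an NTSTATUS value select one of these severities.
SEVERITY = ["success", "informational", "warning", "error"]


def decode_ntstatus(code):
    """Split a 32-bit NTSTATUS value into its fields (hypothetical helper)."""
    return {
        "severity": SEVERITY[(code >> 30) & 0x3],  # bits 31-30
        "customer": bool((code >> 29) & 0x1),      # bit 29: set for non-Microsoft codes
        "facility": (code >> 16) & 0xFFF,          # bits 27-16
        "code": code & 0xFFFF,                     # bits 15-0
        "name": KNOWN_STATUS.get(code, "unknown"),
    }


if __name__ == "__main__":
    for value in (0xC000000D, 0xC0000035):
        print(hex(value), decode_ntstatus(value))
```

Both decode as ordinary error-severity codes with generic names, so on their own they say little about which piece of hardware (if any) is at fault.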
During all this I found that a program called BlueScreenView was great at analysing what past BSODs meant.
I downloaded, installed, and ran SeaTools, as I had read some reports that the Seagate ST31500341AS could sometimes be problematic.
I managed to run the short generic test, the short DST test, the long DST test, but the long generic test locked up about half way through…
At this stage, I took the PC back to the office for some longer-term tests.
I ran memtest86+ overnight (i.e. several passes), but no problems were reported.
I then tried the SeaTools long generic test, and I had to try twice before it completed successfully.
And now I’m stumped.
What could be causing all these Windows 7 errors?
I also noticed some SQM client errors in the event log, which led me to disable the “customer experience improvement program”… but that didn’t help either.
Other errors led me to disable .NET 3.5 (it can’t be removed).
I then decided to try the Windows repair facility (you get to it by pressing F8 while the PC is starting, then selecting “repair your computer”).
I then noticed an option to perform a “windows memory diagnostic”.
I thought: at this stage, it can’t hurt.
The diagnostic forces the PC to restart into the stand-alone diagnostic tool.
Once the tool was running, I found it didn’t detect any memory errors (after a few passes).
However, I did notice I could press F1 to configure the memory diagnostic tool.
Within the configuration area, I selected the “extended test”, set the pass count to 0, then pressed F10 to restart the testing.
After many hours of testing, I found the tool displayed a status of:
hardware problems were detected
At last! Now I’m sure there is a hardware problem.
I send the PC back to the supplier (I didn’t want to waste any more time trying to pinpoint a possible motherboard or CPU fault).
After a few days (and some pointers from me), the supplier eventually found a RAM fault. They replaced the RAM, and the problem went away.
Once I got the system back, I ran my own test to confirm that everything was, indeed, OK… then I returned the PC to a very relieved customer.
In the end, I was quite surprised that the (extended) memory test built into Windows 7 turned out, in this case at least, to be better than memtest86+ at finding the fault.