s_e_a_n wrote: So what's happening Desmond? Are there some particular processes triggering this?
It's basically hardware failure.
Back story: I have a 2010 MBP, a model which is notorious for having cooling problems. I run my fans higher than default anyway, as I like machines to run cool. I hate overheating computers (but I also hate fans...)
Anyway, the GPU eventually died, as they all do in this generation of laptops, and Apple "fixed it" under a special repair program for failed GPUs. At the time I asked the Geniuses whether they replace the GPU with chips that no longer have the design flaw (Yes, I was told, that's correct). But no, what I didn't know is that what they *actually* do is swap out the motherboard for a different motherboard where the GPU simply hasn't failed yet and is still working.
So those too will fail. As mine did (I think maybe a year later).
So, as this is common, and as I wasn't in a position to replace the machine, I did what a lot of people did: I used one of the methods that people smarter than me figured out for keeping these machines working, which is to bypass the GPU and force the machine to use only the integrated GPU. This lets them boot and continue to work, albeit without the ability to use external displays and other GPU-related functions.
So this is why in the above screenshot the GPU VCore is at 0.00 volts - the GPU is bypassed and so there is no voltage passed.
Anyway, a couple of months ago I started noticing some new overheating behaviour. Normally I use this as an opportunity to clean out the internal fans, which do get gunked up over time, but this didn't help. On keeping an eye on temps, I noticed that the GPU Die - Analog sensor was getting to very high - somewhat concerningly high - temps. I stuck that temperature gauge on my main menu bar, as I do the main CPU temperature, so I could keep an eye on it.
What would basically happen is that when the CPU started working (i.e. doing anything other than being idle), the GPU Die temps would start to climb. If I backed off CPU work, the temps would relax a bit, and cranking the fans helped, but it was a problem and a real inhibitor for doing anything processor intensive. What I figured was that perhaps the heatsink had recently detached a bit from the GPU and was thus not cooling it properly anymore.
However, for the last few days or so it's just been maxed at 128 degrees permanently, and even cranking the internal fans *and* an external desktop fan doesn't bring it down. It's basically maxing out the sensor the whole time the machine is on - likely the GPU sensor has failed/burnt out now, rather than the chip actually being at that temperature. I thought it might have done that before, as it maxed out at 128 for a while, but then it started reading temps properly again.
So yes, just bad stuff happening internally all around. You can see why the M1, and its power efficiency (no heat! hardly any fans!), is so attractive to me right now (and why I'd hoped that the MBPs were not going to be late 2021 products, as I'm not sure my MBP will make it, and I do want to keep it running for legacy/compatibility reasons).
Fun times. Honestly, the day I can do some proper work without heat and fans I'm going to be ecstatic...
So that's my tale of woe...