Relax... everything is fine.


Post by muzines »

Image

It's ok, it's probably not at 128 degrees.

It's probably much higher, it's just the sensor maxes out at 128...

Hurry up and release some new M1 MBPs, Apple, before my machine melts through the desk... :shocked::?
muzines
Jedi Poster
Posts: 12332 Joined: Tue Jan 10, 2006 12:00 am
..............................mu:zines | music magazine archive | difficultAudio  | Legacy Logic Project Conversion

Re: Relax... everything is fine.

Post by ManFromGlass »

You could warm up a meal with that while you work.
Would that be true multitasking?
:smirk:
ManFromGlass
Longtime Poster
Posts: 7657 Joined: Sun Jul 24, 2011 12:00 am Location: O Canada

Re: Relax... everything is fine.

Post by s_e_a_n »

So what's happening Desmond? Are there some particular processes triggering this?
s_e_a_n
Poster
Posts: 71 Joined: Thu Apr 03, 2003 12:00 am Location: Kerry
Programmer, sound engineer, artist, musician

Re: Relax... everything is fine.

Post by muzines »

s_e_a_n wrote:So what's happening Desmond? Are there some particular processes triggering this?

It's basically hardware failure.

Back story: I have a 2010 MBP, a model which is notorious for having cooling problems. I run my fans higher than default anyway as I like machines to run cool, I hate overheating computers (but I also hate fans too...)

Anyway, the GPU eventually died, as they all do in this generation of laptops, and Apple "fixed it" under a special repair program for failed GPUs. At the time I asked the Geniuses whether they replace the GPU with chips that no longer have the design flaw, and was told yes, that's correct. But no: what they *actually* do is swap out the motherboard for a different one where the GPU hasn't failed yet and is still working.

So those too will fail. As mine did (I think maybe a year later).

So, as this is common, and not being in a position to replace it, I did what a lot of people did and used one of the workarounds that people smarter than me figured out to keep these machines working: bypassing the GPU and forcing the machine to use only the integrated GPU. This lets them boot and continue to work, albeit without the ability to use external displays and other GPU-related functions.

So this is why in the above screenshot the GPU VCore is at 0.00 volts - the GPU is bypassed and so there is no voltage passed.

Anyway, a couple of months ago I started noticing some new overheating behaviour. Normally I use this as an opportunity to clean out the internal fans, which do get gunked up over time, but this didn't help. Keeping an eye on temps, I noticed that the GPU Die - Analog sensor was getting to very high - somewhat concerningly high - temps. I stuck that temperature gauge on my main menu bar, as I do the main CPU temperature, so I could keep an eye on it.

What would basically happen is that when the CPU starts working (ie doing anything other than being idle), the GPU Die temps start to climb. If I back off CPU work, the temps would relax a bit, and cranking the fans helps, but it was a problem and a real inhibitor for doing anything processor intensive. What I figured was that perhaps the heatsink had recently detached a bit from the GPU and was thus not cooling it properly anymore.

However, for the last few days it's just been maxed at 128 degrees permanently; even cranking the internal fans *and* an external desktop fan doesn't bring it down. It's basically maxing out the sensor all the time the machine is on - likely the GPU sensor has failed/burnt out now, rather than the chip actually being at that temperature. I thought it might have done that before, as it maxed out at 128 for a while, but then it started reading temps properly again.

So yes, just bad stuff happening internally all around. You can see why the M1, and its power efficiency (no heat! hardly any fans!), is so attractive to me right now (and why I'd hoped that the MBPs were not going to be late 2021 products, as I'm not sure my MBP will make it, and I do want to keep it running for legacy/compatibility reasons).

Fun times. Honestly, the day I can do some proper work without heat and fans I'm going to be ecstatic...

So that's my tale of woe...
Last edited by muzines on Sun Apr 11, 2021 9:07 pm, edited 4 times in total.

Re: Relax... everything is fine.

Post by muzines »

Well, the temperature sensor does still "work". Woke the machine up from sleep, the GPU temp was 92 straight away and then over ten minutes slowly climbed with a more or less idle computer up to 128 degrees and then stayed there...

Re: Relax... everything is fine.

Post by Drew Stephenson »

That's actually more concerning. Best keep the fans and innards clear otherwise you could be looking at a fire hazard. :(
Drew Stephenson
Apprentice Guru
Posts: 28771 Joined: Sun Jul 05, 2015 12:00 am Location: York
(The forumuser formerly known as Blinddrew)
Ignore the post count, I have no idea what I'm doing...
https://drewstephenson.bandcamp.com/

Re: Relax... everything is fine.

Post by muzines »

blinddrew wrote:That's actually more concerning. Best keep the fans and innards clear otherwise you could be looking at a fire hazard. :(

It's currently idling at 106 degrees, so not as slammed as it has been over the past few days. The thing is, I don't really know what it means (other than making me nervous!).

Even when the thing is sitting at 128, it's not like the laptop feels hot, like it does when the CPU is working hard, and/or you connect thunderbolt stuff (where the cable also heats up the laptop). And all the rest of the temps are fine. So it's just in that one sensor spot.

Anyway, it's on the discrete GPU, which I *know* has failed and is also bypassed, so I can't really trust what the sensor is saying. It might just be that the GPU is in a worse state than it was a year ago, but that it doesn't really matter anyway.

Fun and games... :shifty:
Last edited by muzines on Mon Apr 12, 2021 12:46 pm, edited 2 times in total.

Re: Relax... everything is fine.

Post by ManFromGlass »

It’s amazing you’ve kept it going for so long. My towers are 2010 and cranky. Every day I get out of them is a bonus. I think they are waiting until I am mid-project, with the deadline looming, before something happens.
Is it possible your temp sensor is flakey and the temp is fine?

Re: Relax... everything is fine.

Post by muzines »

ManFromGlass wrote:It’s amazing you’ve kept it going for so long.

It has been a struggle at times, but I've been holding on as long as I can...

ManFromGlass wrote:Is it possible your temp sensor is flakey and the temp is fine?

Yes, it's very possible.

It does vary with load (when it's not continually maxed out I mean), but there are hardware problems with the GPU, and it could be that the sensor is either on, or close to the GPU and is also failing, or at least misreading.

One of the reasons I've been keeping an eye on it is to try to deduce where the problem might be, but I don't have anything conclusive from the observed behaviour. I've sort of been trying to train myself to not worry about it so much, even to the point of removing the temp sensor reading from the menu bar as it was unbearable watching it hit 110+, making me want to reach out to crank the fans etc.

But there are long-running processing jobs that need significant CPU load over time which I've had to put on pause, as I'm not sure the machine can handle the continual stress. (The same applies to me, for that matter..! :lol: )
Last edited by muzines on Mon Apr 12, 2021 1:47 pm, edited 1 time in total.

Re: Relax... everything is fine.

Post by muzines »

I think it's just trying to mess with me...

Image

Image

Currently coasting along at a positively chilly 75 degrees... What does it mean? Who the h*ll knows, not me for sure!

Re: Relax... everything is fine.

Post by Humble Bee »

Looks like you slept way past 10 o’clock! :)

“Late shall the sinner awaken...”
Humble Bee
Regular
Posts: 395 Joined: Mon Jan 08, 2007 12:00 am Location: Cloughton Newlands

Re: Relax... everything is fine.

Post by muzines »

Humble Bee wrote:Looks like you slept way past 10 o’clock! :)

“Late shall the sinner awaken...”

Despite certain AI-related claims here to the contrary, I am not my computer... ;)

Morning is iPad time, anyway...
Last edited by muzines on Mon Apr 12, 2021 7:11 pm, edited 1 time in total.

Re: Relax... everything is fine.

Post by s_e_a_n »

I'd keep a close eye on it. Don't want it burning the house down or anything :wtf:

Re: Relax... everything is fine.

Post by merlyn »

I would believe the moderately inaccurate but reliable finger as temperature sensor. If it was really 128 degrees you would feel it, surely?

Also heat spreads and you'd get high readings on all the sensors, or at least see them all steadily rise while the computer is on.

From what you've posted this seems like the sensor is broken rather than an impending volcanic event.
merlyn
Frequent Poster
Posts: 1636 Joined: Thu Nov 07, 2019 2:15 am
It ain't what you don't know. It's what you know that ain't so.

Re: Relax... everything is fine.

Post by muzines »

merlyn wrote:I would believe the moderately inaccurate but reliable finger as temperature sensor. If it was really 128 degrees you would feel it, surely?

You'd think so, but if it's just in one highly localised position, it might be a small spot that doesn't spread much heat outwards.

merlyn wrote:Also heat spreads and you'd get high readings on all the sensors, or at least see them all steadily rise while the computer is on.

I do, but nothing that's particularly abnormal that I've noticed. But when the CPU is working, this sensor does also climb too.

merlyn wrote:From what you've posted this seems like the sensor is broken rather than an impending volcanic event.

Yes, but cranking the internal fans on *does* bring the sensor down (unless it's maxed out), so it's kind of working.

My best guess at this point is that it's working, but intermittently - ie it goes a bit crazy and climbs until it maxes out at 128, and no fan work changes it - I think this is the phase where it's behaving unreliably, and it stays broken until it starts working again.

That's my best guess on the evidence I've seen over the past weeks, anyway. So like I say, I'm trying to tell myself not to worry too much about it...
Last edited by muzines on Mon Apr 12, 2021 9:59 pm, edited 2 times in total.

Re: Relax... everything is fine.

Post by merlyn »

No, don't worry.

GPU Die -- Analog and GPU Vcore can't both be right. One of them is spurious. If the voltage is 0 the GPU isn't doing anything.

The happier place to be is to believe GPU Vcore and ignore the temperature.

Re: Relax... everything is fine.

Post by muzines »

merlyn wrote:GPU Die -- Analog and GPU Vcore can't both be right. One of them is spurious. If the voltage is 0 the GPU isn't doing anything.

Yep, the GPU is intentionally disabled, hence the 0V.

merlyn wrote:The happier place to be is to believe GPU Vcore and ignore the temperature.

True, but the sensor *is* reading. For instance, this morning the temp was running around 80 degrees on casual use, and I ran Logic (at the beginning of the graph below) to test some plugins. As the CPU worked it generated heat (going from about 55 to about 85 degrees, which is pretty normal), and the GPU temp sensor again climbed and flatlined at 128 until I cranked up the fans and finished what I was doing, after which the temps lowered again. So it's not like it's not reading temperature changes as such.

Image

Without knowing how these sensors work and where they are placed - presumably they can't be on the GPU die because without power they wouldn't work at all - it's difficult to know what's going on and how to read whether it's a real problem or not, which is a bit frustrating...

Re: Relax... everything is fine.

Post by Folderol »

They could indeed be on the die, and very likely are. The simplest, and fairly linear, type is a forward-biased diode, typically -2mV/degC (depending on current).
On the die, this can be placed right at the hottest part of the chip.
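
To put rough numbers on the diode idea, here is a minimal sketch of how a reading could be turned into a temperature, assuming a perfectly linear -2 mV/degC slope. The 0.65 V drop at 25 degrees and the function name are made-up illustration values, not anything taken from Apple's or the GPU vendor's hardware.

```python
# Hypothetical calibration values, for illustration only.
V_AT_REF = 0.65          # assumed diode voltage drop at the reference temperature (volts)
REF_TEMP_C = 25.0        # assumed reference temperature (degC)
SLOPE_V_PER_C = -0.002   # roughly -2 mV/degC, as quoted above


def diode_temp_c(v_diode: float) -> float:
    """Convert a measured diode voltage to a temperature, assuming a linear response."""
    return REF_TEMP_C + (v_diode - V_AT_REF) / SLOPE_V_PER_C


if __name__ == "__main__":
    # A drop of 100 mV from the reference voltage corresponds to a rise of about 50 degC.
    print(round(diode_temp_c(0.55), 1))
```

With a slope this shallow, an error of only a few tens of millivolts in the measuring circuit moves the reported temperature by tens of degrees.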
Last edited by Folderol on Tue Apr 13, 2021 11:51 am, edited 1 time in total.
Folderol
Forum Aficionado
Posts: 20278 Joined: Sat Nov 15, 2008 12:00 am Location: The Mudway Towns, UK
Seemingly no longer an 'elderly'.
Now a 'Senior'. Is that promotion?

Re: Relax... everything is fine.

Post by Wonks »

They will undoubtedly be NTC thermistors, which are a very cheap solid-state temp sensor. Their resistance goes down as the temperature increases, so a higher indicated temperature means either a lower measured resistance (maybe a partial short circuit) or that the reference voltage put across them is a bit out of whack. E.g. if the circuit was supposed to have 3.3 V across it but instead has 3.1 V, you can get significantly different temperatures reported by the software.
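
As a rough illustration of that reference-voltage effect, here is a minimal sketch. Everything specific in it is an assumption: a simple divider with the thermistor on the low side, a 10k fixed resistor, a 10k/B=3950 thermistor modelled with the Beta equation, and software that measures the absolute node voltage while assuming the supply is exactly 3.3 V. The real circuit in the MacBook Pro may look nothing like this, so it only shows the direction of the error, not its size.

```python
import math

# Hypothetical component values, for illustration only.
R_FIXED = 10_000.0   # fixed divider resistor (ohms)
R0 = 10_000.0        # thermistor resistance at T0 (ohms)
T0_K = 298.15        # 25 degC in kelvin
BETA = 3950.0        # thermistor Beta constant
V_ASSUMED = 3.3      # supply voltage the software believes it has


def ntc_resistance(temp_c: float) -> float:
    """Thermistor resistance at a given temperature (Beta equation)."""
    t_k = temp_c + 273.15
    return R0 * math.exp(BETA * (1.0 / t_k - 1.0 / T0_K))


def node_voltage(temp_c: float, v_supply: float) -> float:
    """Voltage across the thermistor in a divider fed from v_supply."""
    r = ntc_resistance(temp_c)
    return v_supply * r / (R_FIXED + r)


def reported_temp_c(v_node: float, v_assumed: float = V_ASSUMED) -> float:
    """Temperature the software would report for a measured node voltage."""
    r = R_FIXED * v_node / (v_assumed - v_node)
    inv_t = 1.0 / T0_K + math.log(r / R0) / BETA
    return 1.0 / inv_t - 273.15


def reading(true_temp_c: float, v_actual: float) -> float:
    """Reported temperature when the real supply is v_actual, not V_ASSUMED."""
    return reported_temp_c(node_voltage(true_temp_c, v_actual))


if __name__ == "__main__":
    print(reading(80.0, 3.3))  # supply as assumed: reads back the true temperature
    print(reading(80.0, 3.1))  # sagging supply: reads higher than the true temperature
```

A partial short across the thermistor acts the same way in this model: a lower node voltage looks like a lower resistance, which the software reports as a higher temperature.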

I found this Intel paper on on-chip temperature monitoring, which may provide some more background information for you.

https://www.intel.com/content/dam/www/p ... -paper.pdf
Wonks
Jedi Poster
Posts: 18646 Joined: Thu May 29, 2003 12:00 am Location: Reading, UK
Reliably fallible.

Re: Relax... everything is fine.

Post by muzines »

Useful, thanks! :thumbup:

Re: Relax... everything is fine.

Post by Folderol »

Interesting. That means Intel are way behind the curve. I'm pretty certain the ARM chips use diodes - don't know about AMD.

Re: Relax... everything is fine.

Post by muzines »

I tried a different temperature reporting tool.

This one reports "N/A" once the temperature reaches 110, but otherwise agrees with the temperature readings from iStatPro.

Image

I would have thought the temp sensor can't be on die if the GPU is powered down, but the sensor is still working?

I guess the better explanation is likely to be as Wonks suggests - some recent-ish changes in resistance/etc. around the faulty GPU area that are throwing the temperature reading off, even though the sensor is responding to temperature changes...
Last edited by muzines on Tue Apr 13, 2021 12:13 pm, edited 4 times in total.

Re: Relax... everything is fine.

Post by muzines »

Folderol wrote:They could indeed be on the die, and very likely are.

Actually, looking at how the two GPU sensors are labelled, they are called:

GPU Die - Analog (or "GPU Diode" - this is the problem sensor)
GPU Die - Digital (this is non-functional and gives no reading, presumably because the GPU is disabled)

So maybe the "digital" sensor is the on-die sensor, and the "analog" problem one is a separate thermistor associated with the GPU, and misbehaving as per Wonks' suggestion.

Re: Relax...

Post by tea for two »

desmond wrote:Image

It's ok, it's probably not at 128 degrees.

It's probably much higher, it's just the sensor maxes out at 128...

Hurry up and release some new M1 MBPs, Apple, before my machine melts through the desk... :shocked::?

Clearly this gpu got all hot n flustered watching Frankie Goes to Hollywood.
Last edited by tea for two on Thu Apr 22, 2021 6:34 am, edited 1 time in total.
tea for two
Frequent Poster
Posts: 4009 Joined: Sun Mar 24, 2002 12:00 am