@@BattleOverride856 would pretty much be a little... well... dumb? - cause there are PC's that dont even turn on, how are you going to record that? "Oh, im pressing this button on top and nothing happens." - Video over. not the best thing to add to a video, would it? xD
@@haxkztasy your right maybe not all videos but like this one where he games and shuts off while playing and the graphic card overheats. I meant those kind of videos.
@@ItsRebel98 one could do that, but sometimes you may not wanna risk anything, especially when you personally dont know the problem and dont know much about PC's in the first place, but i do know what you mean. you could ask for such, but it can also always get flimsy cause you never know the exact problem and when you turn on your PC while you know something isnt going as it should, you might break something somewhere cause you turned it on again. had it happen to me, where i knew the problem, turned on my PC and suddenly i had to buy nearly a complete new PC cause my Power Supply popped and if i say: Popped, then i mean that it throwed sparkles and my Mainboard was done aswell cause of it, even tho i already knew that my Power Supply was the problem. (tho i wanted to make sure... which was dumb.)
Clearly, it was an Overheating issue. As the Card turned off and the fans ramped up, it means a failsafe-mechanism has been triggered. In another words, the GPU tried to safe itself from permanent damage. At 13:38 you can literally see the cause. A Poor thermal paste application results in poor die coverage, thus creating a high enough Hot-Spot temperature (105C°) somewhere in the die that triggered it. EDIT: It's definitely the manufactures fault.
exactly... Looks like basically a quarter of the die had no paste. Just because the temp read out didn't spike, doesn't mean there weren't hot spots causing the failsafe crash.
first crash it was pretty clear it was on the gpu. first thought was a bad overclock or bad thermal interface. probably could be temporarily mended with undervolting but that should not happen.
That is the risk when you build. It could be anything when the issue isn't obvious, such as a faulty power supply. Especially when it is an instability issue like this, could be just about anything... from drivers, to OS, to faulty hardware, to a short, to hardware incompatibility. Very difficult to diagnose. The worst is when you build and the machine doesn't post. The worst feeling ever... all of that excitement and time taken to build, just to see it not post. Then have to figure out what is wrong. Has happened to me a couple of times over the years.
And a lot of the builders skip over the important things that need to be done . Like Bios settings and the more important technical details - configurations ...
Not joking sometimes reseating and pasting everything absolutely does work! I think you did fix it and even the fans are much more silent now which is a good sign lol.
The bottom section of the GPU die didn’t have any thermal paste on it. The card went into thermal shutdown after running a game or benchmark which would explain the symptoms. When pasting a bare die, it’s important that the entire die is covered otherwise you get situations like this. I’m glad you were able to fix it and save the customer downtime on his pc!
Yup. On 13:21 you can even briefly see the heatsink and how it also didnt have any paste on that part. Sometimes all the paste gets stuck on either the die or heatsink, but that wasnt the case here; there was no paste covering that part of the die whatsoever, and it was like 1/4 of the die
@@DavorBa Yeah, that could've happened, by either the manufacturer, or by refurbish, cause I saw that too. Sometimes, people do get overworked, in those settings, and things go missed, like that. Luckily, this card was saved, by the looks of it.
I can Agree on that, I didn't put enough paste on my rx580 (or bad spread) and it would crash after heaven in the same way (the gpu fan maxes out for safety). For me the symptoms was reduced clocks and temps relative to normal operation. After the repaste, everything is fine since. Although, I could have snagged a 5700xt just before the market increase if I didn't troubleshoot it...
its possible that part of the chip gets too hot. these chips have multiple sensors. The single gpu thermal sensor is the average of all the sensors and really doest tell much. Some drivers expose GPU hotspot thermal sensor to get the correct information. If your gpu hotspot thermal sensor is much hotter than the average gpu thermal sensor than you likely have a bad paste.
I had this problem with my 2080ti, tried everything possible until I found a forum thread that told me about thermal paste in the GPU. I was hesitant to take it apart myself but I was out of options so I just had to give it a go. Once I had the card apart it was clear as day that there was next to no thermal paste left, so I went out and bought paste, reapplied and the GPU was as good as new and still going strong.
This! My former gpu 2080 ti gaming x trio also had this too, factory thermal paste just "evaporated" to oblivion, oddly the gpu temp wasn't so bad, it was around 74-75 degrees, but games crashed a lot even after downclocking. After repasting, the problem went away.
Man, that really sucks having an essentially defective gpu. Good thing it's as simple as putting more thermal paste instead of something like a blown capacitor
@@tictechto apparently it was a common occurrence with the 2080 ti cards from what I have read online, some believe they used generic paste maybe so the cards don't live as long lives so you have to upgrade more frequently but who knows. Most likely a manufacturing fault.
@@bigbubba0439 yeah very luck that it was a simple fix, at the time I before i knew about the paste I was already looking to buy another high end card but luckily spent like $8 on Arctic MX-4 and this was like 6 months ago and havn't had any issues since reapplying.
13:21 looks like maybe 20% of the die didnt get good thermal coverage. the paste on the heatsink is also not flattened out. so definately was causing local overheating.
@@MrPruske I was wondering the same, because the GPU really was suffering. I agree with your observation. Uneven pressures also can cause similar issues.
Yeah and if the probes aren’t on that area or certain memory modules that are overheating, their temps won’t show up on the software. GDDR6X memory runs very hot so new thermal pads and a paste job can make a big difference
As soon as you took off that gpu cooler, I knew it had to be bad heat dissipation. That thermal paste looked a little dry, and being a newer card that would lead me to believe that factory uses cheap paste out the door. Looked like a bad paste application as well. Awesome work!
The paste job from factory looked really iffy. Probably caused a localized 'hotspot' that was far enough away from the normal sensors to not get noticed by monitoring software, that was tripping a temp protection fault on one of the components somewhere on the card. Repaste and reseat solves the hotspot, causes error to disappear.
Man I had this exact same issue for almost a year. I finally decided to take apart my GPU and re-applied thermal paste, and I haven't had an issue since. It was a little scary since it was my first time doing it, but I'm very glad I did.
13:37 (what a time) shows the die isnt even covered for 20ish % of its surface. Manufacturer screwed up. You cleaned it and pasted it properly + reassembly... and voila, works. Love to see it.
I can attest to just opening up a gpu.. repasting, cleaning and PROPERLY screwing it back together causing the gpu to function flawlessly again. In face, it very well could have been a bad cooler mount or overtightened from the factory.
Almost certain it's a software issue going on with Apex Legends. Respawn has yet to address or solve the issue after 6+ months. Countless redditors have experienced the same situation with the game but there's an in-game fix for it which involves disabling some graphical settings.
@@TheDeluche hm? He was on Heaven and it still didn’t work. Based off of the thermal paste application it was 100% that because about a quarter of the dye was uncovered so it was causing a hotspot
Honestly Greg is becoming the most wholesome tech UA-camr at this point. Also for all the UK bros this card is on big discount on ebuyer to the point where it’s around 250£ cheaper than any other model. 750£ for a 3080ti
I think that GPU hotspot temperature was bad enough for it to shut off because the bottom of the GPU die was clean of any paste. That lack of paste is shameful for a manufacturer. Edit: next time you have a GPU issue on your hands, use GPU-Z to check hotspot temps. Sometimes, it's more important than your core temp.
I had a very similar issue with my 2070, worked great until I started gaming. Had to change all settings to low in game, would work for a while, but eventually would crash the whole PC. I actually downgraded and put in a GT1030 and have had not had any issues since. My son took the 2070 and put it in his rig and confirmed it was definitely a problem with that card.
Many years ago I built a pc for a friend and this exact thing kept happening. Changing the power supply to a higher wattage one fixed the issue that time.
if that 2070 has a backplate check it for tightness....had one do this exact same thing, turned out that manf (actually PNY lol) had over tightened the back plate on the card and it was flexing a memory chip. just so turned out when I re pasted it with some noctua, I torqued it right also lol
Did your 2070 like make a really loud feedback noise through the speakers or headphones and then cut off moments later? Mine does that currently and I have no clue why it doesn’t do it often but I thought maybe it could be from it letting the pc warm up and just going to temps around 65 too fast. But that isn’t the issue, also was your 2070 overclocked
@@Everdreamjustplays hmmmm this is very interesting bc I keep getting a memory error too, my pc will crash and it will give an error code such as “Stop Code: Memory management” or something like that.
In my opinion the absolute best diagnostic and repair video series on UA-cam. Anyone that wants to learn about repairing computers should start with this series for starters and watch Greg's step by step progressive diagnostic methods.
Mr. Salazar for the rescue big heart and knowledge if it were me i would have sent it for an RMA and called it a day. appreciate all the thinking out of the box and loved the series. cheers good sir.
The same issue I had last week; after a quick inspection, I discovered that the extension cable is slightly looser than the pcie cable. It only happens if the computer shakes.
I had a similar issue with my Gigabyte RTX2070 Super Gaming OC after around 2 1/2 years of gaming use. Mine didn't go black screen, but the fans did go max speed, and temp went upto 105c, the gpu usually runs around the 70c mark while gaming. Returned under warranty, and when I got it back, the report was that they "Replaced the thermal paste and pads, fans, and other bits and pieces". The exact words on the report. Since then, the card has been faultless. Now has a mind oc, and temps top out at around 75c. The issue was happening even at factory settings btw. Another great video Greg. Keep them coming.
When you took the back plate off, I noticed that capacitors (iirc those are caps) on the back of the GPU chip itself are 5 bigger black squares and one group of smaller ones. Back when 30xx series launched, that was actually the issue, card's power delivery where those black caps couldn't filter out input voltage well enough and would cause similar / same problems as in this video. Basically PNY cheaped out on caps where they are needed for power delivery. Edit: fix was software related by dropping clocks or RMA for a better, newer card. This one may be from an old batch of cards.
i think jaytwocentz has some videos about that, i remember that game from amazon bricking gpus and ultimately the source of the problem where those caps, as soon as i saw those big caps i knew it was those
Well done Greg, you definitely fixed it. I, like many other commenters, believe it had to do with the dried out, nearly cracking thermal paste on the die. The fans ramping up symptom certainly fits.
Just to throw it out there, I had the same Symptoms with a AX1200i and a r9 290 Crossfire. Spiking loads on the GPUs managed to throw the OCP, which led to black monitors and the GPU fans going full tilt. Repasting and underclocking has reduced the Powerdraw, so the Problem is temporarily fixed in my Opinion.
So you're saying these cards experience power draw problems? and you're mentioning a 1200Watt PSU and 2 identical AMD cards in crossfire.. the crossfire setup alone draws 500Watt on load.. it doesn't exceed this number by much... so how is it a power draw issue if you have 1200Watts of raw power xD Don't tell me your other pc components draw the other 700+ Watts.. that's impossible. i've had the same problem however with an AMD RX590 from the Sapphire vendor.. i also did a card clean and repaste + 100mhz underclock on the boost state clock + slight undervolt to match it with the underclock and the problem has stayed away ever since. But just like you i also have a beefy PSU, that is capable of powering more then 4 RX590s.. so how is it a power draw issue? we both have very different parts yet the problem is the same.
@@VintageCR Rather late for the reply, but let me tell you how I managed that: 2x R9 290 @ unlocked Bios, 800W E5-1650v2@4.6 GHz 250W 8 Dimms ddr3 1866 , EATX Board X79, HDDs and Fans did the rest. Each Card worked flawlessly alone or with dismounted Crossfire, switched the Crossfire on and saw at least every hour a blckscreencrash with fans to the Max.
this series got me hook!!.... whats not to love about greg .. the dude is generous, the dude is smart, the dude is kind, the dude is always lending a hand, the dude is always sharing his knowledge and skills, id say let greg run for the presidency,.. i for one will definitely be voting for GREG!!
If he is planning to keep the card and not RMA it. I would at least put the clocks back to the factory settings. If the card crashes at factory settings in means it has some bad components and needs to be RMA'd. But like the others said it could have just been overheating because of poor thermal paste application. But either way, I would not keep a GPU that crashes at factory clocked settings.
The weird fixes that resolve problems successfully that cannot be explained happens all the time. This is a good series as while also solving problems for people it also shows them a first hand experience of a problem and how to approach fixing it.
I've never re watched a video for information on the sponsor. I'm constantly going back because I can never remember lolol Thanks for having a sponsor that is useful and reasonably priced. :)
I had something similar like this as well when it came to the problem on a RTX 2080. My fans would ramp up to 100% at 60c being unable to control the fans at all but the difference between my issue and this guys issue was mine was still operational but with the obnoxious fans ramping up to 100% at 60c and normalising when below 60c. I ended up opening my card (warranty was already out) and found out that a huge chunk of the thermal paste had completely melted to the point that some of the dye itself was visible. So what i am assuming is the temps being shown normal but the cards auto fan curve kicked in due to part of the exposed dye overheating. I repasted it and it solved my issue. Took me a while to figure out the issue with my card since i wanted to avoid opening it up, so went through so much software, update and bios crap for it to turn out to be such a simple fix yet there was hardly any info about the issue online so hoping this helps someone else out there who has the same issue cause it was driving me crazy.
I just posted a similar comment, I had many online chats with customer support people telling me to try so many things that didn't work and then by chance found a random thread in a forum that was talking about thermal paste problems in gpu's, so I took mine apart and sure enough paste was near non-existent, reapplied and worked good as new. Crazy how there isn't more attention about this but hopefully this video and comment section helps.
0:33 my pc did the exact same thing it random times of usage (more often than not, while searching file explorer or on google chrome.. haha) and a -heavily- out of date BIOS seemed to be the problem, updated that and never had another crash!
I wouldn't consider it fully tested until it has the back plate on it, because that's how it shipped from the manufacturer. If it can't pass go with the plate on, then the card is no good. It's all about eliminating any variables
he also should have OCed the Card a bit, to confirm (or disprove) his GPU clock theorie .... but again I think we are on the wrong channel to expect such things ...
I have learned so much watching your videos. You're doing a service to teach how to trouble shoot and fix our own systems without paying someone hundreds of dollars in just labor. Keep them coming.
i think i see what happened poor die contact, at 13:32 you can see that the lower portion of the gpu die had almost no thermal paste, which means that at some point it overheated enough to call it quits and do a complete gpu shutdown, so repasting and re assembling it made it go back and sit just right, also at 13:21 i can se a little bit of the cooler just right when you open the card, and you can see that the cooler itself wasnt covered completely with thermal paste in that portion of the die where it also didn't had much or at all so maybe the gpu detected that the area got way too hot and shut down itself
I had a similar hunch. GDDR6X runs super toasty, including versus the GPU. Add to that, I've seen plenty of enthusiasts push 1000+ memory OCs. This is also one of the reasons why buying a used graphics card is risky -- something I wondered if it was the possibility in this case.
This is exactly what my rig was doing. I have 3 monitors on my rig, 2 21" Asus and 1 43" 4K. I have a RTX 2070 in this rig. In my old rig I had a GTX 1060. This had a "feature" that would power down the GPU fans (3) under light or no load. You could go into software and disable the "Fan Stop" feature so the fans would run all of the time. Haven't seen that feature on my newer RTX 2070. Thought this might be worth mentioning. God bless and stay well.
This was a great one. I had no idea what it was. Usually I can figure it out during the video but thanks for picking different stuff as much as you can!!!
I have personally seen this similar issue with my own PC. Common fixes that I have found and learned from the distributer that i got the PC from are: Checking card seating, RAM seating, ANY form of damage in the PCIE lane (mine personally wasn't even visible), driver update, or even a bad windows install (which helped fix my most current PC with this fix). So remember that if you are getting Black Screen and full fan RPM there are A LOT of checks to do.
Keep up the excellent service. I'm sure you are much-appreciated by people in your area. As a former computer repairer, I was conscious of the gap in the home user service market, and tried to fill it. So, I greatly appreciate the thought that has gone into your business model. I also appreciate the fact that you recognise that you cannot help everyone, and so are careful to choose which challenges to accept.
I have this same case (LanCool II Mesh Performance) with an RTX 3080, he can shave a few degrees off his GPU temp by mounting a 120mm Fan under the card blowing up into the GPU. The Mesh power supply cover allows air flow through it, so the case can support basement fans for the GPU. If he decides to just roll with the re-paste, he might be able to aid the card further by just slapping a fan in there.. he might have to swap the GPU Brace as his doesn't seem to allow enough room. I use a lower fan on my card and it shaved on average of 6-8c off the temps which could help him. Some people have had issues where this setup raised CPU temps (pushing GPU heat up to the CPU) but mine has not behaved that way. A good re-paste and a GPU fan could be all he needs. This was an interesting video, never seen this exact issue before. Very informative.. thanks!
Ive experienced this issue with my PC personally before, often it the PCIe cable from the GPU to the power supply that is lose or constantly touching something and moving slightly i.e. the case door. Once the GPU detects an error with the PCIe cable it ramps up the external fans as it thinks the GPU has started the fry itself and cools ASAP!! Great video though!
I had the EXACT same issue with a MicroCenter prebuilt I purchased in mid-2020. It had a Ryzen 2700X and RX 5700XT. I narrowed it down to the graphics card, replaced it with a current-gen upgrade, and never had a problem since.
As soon as you took the cooler off I saw the issue, and it's because I went thru the same issue. The thermal paste didn't get full coverage of the die. It doesn't show up on most temp monitors because it causes *hotspot* temps to rise, but not necessarily total core temperature. I had the same symptom happen when I installed my waterblock slightly misaligned, and repasting it and installing it correctly immediately fixed the issue.
At 13:40 if you look at the top right corner of the GPU chip & the lower right corner it looks like it's missing some compound coverage. The issue might have been a bad seat with the cooler. That could've been a zero point overheat in those corners causing the card to freak out. I do think that once you found it "stable", after you reapplied the thermal compound, you should've increased the load to see if it was fixed or not, then turn it down & tell the guy to RMA it...food for thought for next time?
I Absolutely love the Fix Or Flop Playlist. I plan to build a PC this summer instead of using a gaming laptop. I've learned a lot about PC building from mistakes others have made or isolating the issues with the PC. Much more detailed than the traditional PC building videos on YT. Thank you!! 👍
I see some comment mention that "some" part of the gpu looks like some part of it had little or to no thermal paste, starting at around 13:22 into the video one can see. Dont know if that could be the problem, since you re-pasted it. Hmm tricky one.. What you think @Greg Salazar ?
13:31 you can see bottom of the die has no thermal paste. gpu die only have 1 temp read and its on the center of the die. It can show normal tempurature but it was overheating from bottom side of die. simply repasting it fixed it.
Seeing this problem with a PNY product and my own experiences with PNY branded products has cemented my resolve to steer clear of them in the future. I bought a PNY brand 512GB SATA SSD for my OS drive and it too lasted about 6 months before kicking the bucket. I wasn't overclocking my system one bit and my system temps never got above 70C. Anyway, I got a Kioxia M.2 SSD and I've been happy with that so far.
Troubleshooting a PC = Process of elimination+ deductive reasoning+ some experience and sometimes not being afraid to break something. Carry lots of extra "bench" parts like my trusty PCIE 8400GS fanless GPU and some basic ram to switch out if you have to. Great channel by the way, it's what this new generation of builders needs.
Love the devil voice asking "How do you get out of Heaven?" LOL 17:30 did you reapply thermal paste when you reassembled it? Makes sense to me that you may have resolved the issue then.
Greg, I learn something every episode that I use in my business. This episode was no exception. The education you provide is priceless. Thank you again!
I had this exact issue with my gaming pc too. After upgrading the thermal pads on the gpu and reapplying thermal paste on both the cpu and gpu, it delayed the crash but did not resolve it. My issue ended up being a faulty power supply. I purchased a power supply tester on amazon to do the test. After replacing my power supply for a new one, with a higher wattage, I have had no more issues. This video helped a lot with my own troubleshooting. Thank you.
Right video at right time. I am having the same problem with my sons PC. Its not happing everyday, while playing games, his GPU (1660 super) fans suddenly start running at full speed and screen becomes blank. I was using extension cables, 24 pin and 8 pin GPU. Today i removed both and running Heaven test rigjht now from last 15 minutes. Its working absolutely fine.
Oh Wow! That's crazy because I have a similar issue (kind of) with my PC. My PC hard crashes when it comes underload (gaming while watching UA-cam with other apps in the background). The only way I circumvent this is by selecting the extreme profile in the icue software of my corsair AIO (I have the same AIO as the viewer BTW). I then suspected it was probably the CPU overheating (Ryzen 3700x) but I was not able to prove it by replicating the issue with Cinebench. I plan to reapply the thermal paste in the future to see if that could fix the issue. For now, I only have to deal with my bedroom getting hotter when I game and the AIO fans noise but I always game with a headset on so… Thanks Greg for these series, you have no idea the impact you they have on the community. Keep up with the good work 👍
Greg Salazar, as someone who had this happen to me before with my RTX 3080 10GB card, and being frustrated since it had no consistency of time it would happen. It could be 60C on load and crash and 80C and crash, every time it crashed it would also delete its own Nvidia driver so had to install it back again. Had alot of blue screens back in the day. Now fiddling with fan curves and fan curve on my case it was fine after a few weeks, no problems since. Unrelated, I have had some blue screens since I tried PBOing my CPU through BIOS instead of Ryzen master since I was having problems with my CPU locked at 4.2 GHz. Found out it was ASUS optimal settings that locked it to 4.2 GHz, turned off the settings and since Ive had blue screens a fair few times, without consistency. Can crash when I launch Teamspeak after a few hours watching netflix, or a few hours gaming. Think its got to be the CPU but who knows.
The more I watch this series, the more anxious I get that my perfectly fine PC is just gonna throw the towel in for no reason. And I'm in Australia, Greg! We don't have a Greg, Greg!
👍 I don't have a modern card when the VRAM gets that toasty so it could have been an over heating issue. Before video cards started to have good coolers from the third party card makers, I used to tap and screw intel CPU coolers to my GPUs. Sometimes my measurements were off and the CPU heatsink I screwed on barely touched one of the components around the GPU causing similar issues but much faster. It could have been the backplate warped just enough due to possible uneven tightening to cause it to touch some of the solders on the back of the board. Either way, very good video. This channel is real troubleshooting without editing out errors, incorrect theories, or any other human qualities. This is such a great channel to watch.
The only thing I did not see you do is remove the customer graphics cards and then reinstall the card it with out making any changes. I have solved so weird issues by removing a board, cable and then resinstalling. This was a head scratcher for sure. LOVE you channel!!!! Keep up the great content.
Funny, I had a similar problem recently. I have a friend that had a faulty Sapphire R9 390 that would constantly have driver timeouts and get a black screen like this too after 5 minutes of gaming. I was over at his house several times to try and fix it. We first thought his FX 8350 was overheating when the bios already reported 60+°C after starting it for the first time that day. Repasted that but it still happened. The GPU was staying at around 70°C. Since I thought it was a driver issue, I downloaded and installed the NimeZ drivers which actually fixed the timeouts. When doing benchmarks or gaming, it still crashed. After a while I could actually rule everything out and just thought it was the VRM overheating (Thermal pads disintegrated when touched). I contacted Sapphire for the sizes of the pads but never got them. By the time I waited for an answer, my friends dad had bought him an R9 290 and sold the 390. Thats one solution I guess. It has been running good since then and he wants to upgrade the platform since the 8350 is becoming slower and slower.
I only subbed to you as part of a giveaway... saw your videos popping up on my youtube and now I'm hooked! This is the only time a channel promoted on a giveaway has completely captured my attention with quality videos! Keep it up man :)
(Before watching) a friend of mine ran into the same issue. He had a 1070 for years now and the problem was the PCIe power cable from the PSU. As soon as he used the PSU's other PCIe cable, the issue stopped. He ended up changing the whole PSU altogether just in case.
Thanks for the videos man, i just built my pc and you are one of the main channels i watch. you go above and beyond to help viewers and that is awesome.
Didn't yet watched beyond 1:15 but I had crashing problem where after gaming 6-9h. it just booted. Reason was overheating CPU VRM. Tested with prime95 and it booted under 3 minutes. Added fan to exhaust air near VRM and prime95 run 25 minutes without problem. Also no problems with games after that. I think I troubleshooted this many weeks. Got better PSU and UPS etc. ok UPS saved few time that I didn't lost my work. Note: When watching video I noticed that part of die didn't have thermal paste so maybe overheat shutdown or overheating caused timing violations in chip and crashed it. With bare die it's important to have whole die to have good cooling.
If the card has had a jolt in transit or been mishandled it can cause a break between the thermal paste and chip/cooler so it is possible that a repaste could resolve the issue. It's also possible that either a misfitted or slightly out back plate could be causing the PCB to flex as pressure is applied so you need to check the card with it attached.
I love these videos! I will saw my first thought at the symptoms is a bad overclock, but i do suppose poor contact at a thermal point could be the culprit.
Does Greg not know that modern GPUs have another temp sensor you can read out for hottest die part? Which is normally around 10-20C above the die temp? GPUz can read it out. Hot Spot @18:27 With my 3080 showing similar issues: My regular gpu temperature under load was 83C However my hottest part temp was reaching 110-115C and it would do exactly as you see in the video of the owner. It ended up being a bad thermal application missing part of the die completely, exactly like the stock thermalpaste was in the video.
Sometimes, the lack of thermal paste is right on the spot were there is no temperature sensor. Just enough margin to let the computer think there is no spike in temps, while actually there is one. Works good enough for light use, but goes in overheating protection when gaming or using intensive tasks.
psu fan is upside.... don't want to be picky but at 6:20 you can see it with the panel down intaking air trough the psu shroud while the gpu intakes air at the same time. Also the card could just have been seated just wrong enough or even the pci-e power cables to trigger the black screen . i seen similar on a system with an just barely inapropriate wattage psu for sustained loads and it is what it felt like to me and that goes for badly seated cards / pci-e connectors and sometimes all of those at once. also the thermal paste seemed not applied correctly all over the die and a sudden heat increase might just trigger the failsafe to save itself unlike the 3090's that died because of some badly coded mmo but they were ramping fans all the same even dead . At least the gpu don't seem to have suffered long term use so the silicon is probably ok and was not badly damaged . also another issue is that antisag bracket without the pci-e cover brackets will make this system very dirty very quickly and not help the airflow to not intake hot air from the back between the psu and gpu . the case is full mesh yes but i tought i'd point out these issues since hot air would likely go places it shouldn't ,with the panel on the temps will likely not be at all the same ones you saw.
I had a similar problem with a new build pc {nowhere near the spec of the machine in the video} running Windows 11 where the pc would randomly reboot and sometimes do this 6+ times in a row like powercycling. Strangely it worked fine in the workshop where it was built and would only reboot in my house. It now seems to have sorted itself out. I watched this video to get an insight into what causes things like this.
PC issues … are sometimes an enigma and not to be understood or solved 🤓🫤. Personal experience over 24yrs of pc building … I’ve had a few that couldn’t be pinpoint accurately located …. But like this episode….. a device anomaly could be identified. Good work Greg 👍😇 love these series 🥰🥰
fixes the card shutting down by replacing thermal paste, ntl proceeds by underclocking for good measure XD love this series, so much that hobby builders can learn here, thx Greg!
I've read about mounting pressure from the factory, but also about 3080+ having power spikes. For people who don't have you around and are out of warranty, always best to clean it up and give it new pads/paste.
Ok, I paused the video at 2:22 because I wanted to take a guess at this issue. I had a VERY similar issue that this viewer had, and in my case, I had 2 defective memory DIMMs that were having an issue. I ruled it out by doing a 2 to 3 day test of all the sticks by myself and found the defective DIMMs and had Corsair replace them. That's my guess. I'll comment again after the video is done to see if I was right.
It may not have been paste. Was it hard and dried out? you said temps were ok before taking it apart. Bad paste would have shown signs of higher temps. That leads me to think it is a cracked solder joint that opened when it got hot. When you remounted the backplane with new grease, there was less stress so it didn't open...yet. Could still be a failure waiting to open.
I had this same issue with my pc! I did every test possible and turned out to be my power source. Replaced it and have been going strong for 8 months now!
i had the same issue on an RX5700XT, MSI Mech OC. couldn't figure out anything to get it to stop crashing. I was even using water cooling and it wasn't working. I took it apart, cleaned it, and refilled the loop. works great now. I think sometimes, the dirty card will just crap out for almost no reason.
I had tons of issues with Corsair iCue previously with all of these same issues, random black screens, PC needs reboot, Fans on 100%. I tried to RMA my Commander Pro thinking that was the problem but eventually could not reach an agreement with the Customer Support team at Corsair with regards to shipping and my PC potentially being down for about a week so I did a fresh windows install and haven't had any issues since. I had put up with these symptoms for almost 2 years
Hey! I finished watching the video. I wouldn't have thought it was the GPU. Maybe just you taking the GPU out and putting back in reseated the GPU because perhaps it wasn't making good contact with the PCIe connector. Just a thought, not sure if that was it. Good steps of troubleshooting too. I like these videos are on the internet that in future if any of us IT people need to know something, this channel/series servces that purpose! Please keep this series going because it's a good learning and educational experience for all of us!
Haven't seen the past the start yet but my EVGA GTX 970 used to do this, card seemed to overheat for some reason once it passed 69-73C. Custom fan curve in MSI Afterburner helped mitigate it.
i had this problem a long time ago, just 2 days before a vacation for a month, this problem happened, i did not have time to fix it, so i left the pc like that, i come back next month to see test it out then the issue was miraculously fixed XD
The start of the video where the viewer records the problem in real time. That is a cool idea. Maybe you could add that in future videos?
Yes. I think all future fix or flop submissions should record the issue
@@BattleOverride856 would pretty much be a little... well... dumb? - cause there are PC's that dont even turn on, how are you going to record that? "Oh, im pressing this button on top and nothing happens." - Video over.
not the best thing to add to a video, would it? xD
Like he says it usually ads nothing. I watched the video 1:15
@@haxkztasy your right maybe not all videos but like this one where he games and shuts off while playing and the graphic card overheats. I meant those kind of videos.
@@ItsRebel98 one could do that, but sometimes you may not wanna risk anything, especially when you personally dont know the problem and dont know much about PC's in the first place, but i do know what you mean.
you could ask for such, but it can also always get flimsy cause you never know the exact problem and when you turn on your PC while you know something isnt going as it should, you might break something somewhere cause you turned it on again.
had it happen to me, where i knew the problem, turned on my PC and suddenly i had to buy nearly a complete new PC cause my Power Supply popped and if i say: Popped, then i mean that it throwed sparkles and my Mainboard was done aswell cause of it, even tho i already knew that my Power Supply was the problem. (tho i wanted to make sure... which was dumb.)
Clearly, it was an Overheating issue. As the Card turned off and the fans ramped up, it means a failsafe-mechanism has been triggered. In another words, the GPU tried to safe itself from permanent damage.
At 13:38 you can literally see the cause.
A Poor thermal paste application results in poor die coverage, thus creating a high enough Hot-Spot temperature (105C°) somewhere in the die that triggered it.
EDIT: It's definitely the manufactures fault.
Good call. Very interesting that something like this passed QC.
exactly... Looks like basically a quarter of the die had no paste. Just because the temp read out didn't spike, doesn't mean there weren't hot spots causing the failsafe crash.
Yep! Saw this right away when he removed the cooler...... 🤦There it is, that's the problem.
I thought that thermal paste was kinda light.......glad im not the only one
first crash it was pretty clear it was on the gpu. first thought was a bad overclock or bad thermal interface. probably could be temporarily mended with undervolting but that should not happen.
I love watching this series! Many people focus on the building part but few show what it's like to have things go wrong
Thanks for watching!
Yeah it's very valuable content
Same this is the best series on UA-cam!
That is the risk when you build. It could be anything when the issue isn't obvious, such as a faulty power supply. Especially when it is an instability issue like this, could be just about anything... from drivers, to OS, to faulty hardware, to a short, to hardware incompatibility. Very difficult to diagnose.
The worst is when you build and the machine doesn't post. The worst feeling ever... all of that excitement and time taken to build, just to see it not post. Then have to figure out what is wrong. Has happened to me a couple of times over the years.
And a lot of the builders skip over the important things that need to be done . Like Bios settings and the more important technical details - configurations ...
Not joking sometimes reseating and pasting everything absolutely does work! I think you did fix it and even the fans are much more silent now which is a good sign lol.
The bottom section of the GPU die didn’t have any thermal paste on it. The card went into thermal shutdown after running a game or benchmark which would explain the symptoms. When pasting a bare die, it’s important that the entire die is covered otherwise you get situations like this. I’m glad you were able to fix it and save the customer downtime on his pc!
Yup. On 13:21 you can even briefly see the heatsink and how it also didnt have any paste on that part. Sometimes all the paste gets stuck on either the die or heatsink, but that wasnt the case here; there was no paste covering that part of the die whatsoever, and it was like 1/4 of the die
I thought that also
@@DavorBa Yeah, that could've happened, by either the manufacturer, or by refurbish, cause I saw that too. Sometimes, people do get overworked, in those settings, and things go missed, like that. Luckily, this card was saved, by the looks of it.
I can Agree on that, I didn't put enough paste on my rx580 (or bad spread) and it would crash after heaven in the same way (the gpu fan maxes out for safety).
For me the symptoms was reduced clocks and temps relative to normal operation. After the repaste, everything is fine since.
Although, I could have snagged a 5700xt just before the market increase if I didn't troubleshoot it...
its possible that part of the chip gets too hot. these chips have multiple sensors. The single gpu thermal sensor is the average of all the sensors and really doest tell much. Some drivers expose GPU hotspot thermal sensor to get the correct information. If your gpu hotspot thermal sensor is much hotter than the average gpu thermal sensor than you likely have a bad paste.
I had this problem with my 2080ti, tried everything possible until I found a forum thread that told me about thermal paste in the GPU. I was hesitant to take it apart myself but I was out of options so I just had to give it a go. Once I had the card apart it was clear as day that there was next to no thermal paste left, so I went out and bought paste, reapplied and the GPU was as good as new and still going strong.
This! My former gpu 2080 ti gaming x trio also had this too, factory thermal paste just "evaporated" to oblivion, oddly the gpu temp wasn't so bad, it was around 74-75 degrees, but games crashed a lot even after downclocking. After repasting, the problem went away.
Man, that really sucks having an essentially defective gpu. Good thing it's as simple as putting more thermal paste instead of something like a blown capacitor
@@tictechto apparently it was a common occurrence with the 2080 ti cards from what I have read online, some believe they used generic paste maybe so the cards don't live as long lives so you have to upgrade more frequently but who knows. Most likely a manufacturing fault.
@@bigbubba0439 yeah very luck that it was a simple fix, at the time I before i knew about the paste I was already looking to buy another high end card but luckily spent like $8 on Arctic MX-4 and this was like 6 months ago and havn't had any issues since reapplying.
I suspect the same issue with my 5700 xt. It's been real fucky lately but I'm still trying to get myself to bite the bullet and open it up
i'm already a pc expert after watching this channel
If you wish hard enough, anything can happen.
I legit lol'd.
@@x8jason8x Same XD
13:21
looks like maybe 20% of the die didnt get good thermal coverage. the paste on the heatsink is also not flattened out.
so definately was causing local overheating.
looks like that fixed it!
good work Greg!
@@MrPruske I was wondering the same, because the GPU really was suffering. I agree with your observation. Uneven pressures also can cause similar issues.
Yeah and if the probes aren’t on that area or certain memory modules that are overheating, their temps won’t show up on the software. GDDR6X memory runs very hot so new thermal pads and a paste job can make a big difference
Could very well have been the problem area.
At least if the re-paste didnt fix it, the minor down clock does. its only 100mhz.. which is nothing.
Funny timing. My new RTX 3080 was showing the same symptoms. In my case repasting the GPU fixed it. Part of the die had no thermal paste on it.
As soon as you took off that gpu cooler, I knew it had to be bad heat dissipation. That thermal paste looked a little dry, and being a newer card that would lead me to believe that factory uses cheap paste out the door. Looked like a bad paste application as well.
Awesome work!
The paste job from factory looked really iffy. Probably caused a localized 'hotspot' that was far enough away from the normal sensors to not get noticed by monitoring software, that was tripping a temp protection fault on one of the components somewhere on the card. Repaste and reseat solves the hotspot, causes error to disappear.
Man I had this exact same issue for almost a year. I finally decided to take apart my GPU and re-applied thermal paste, and I haven't had an issue since. It was a little scary since it was my first time doing it, but I'm very glad I did.
13:37 (what a time) shows the die isnt even covered for 20ish % of its surface. Manufacturer screwed up. You cleaned it and pasted it properly + reassembly... and voila, works. Love to see it.
Hey Greg, don't worry about these videos occasionally being a bit chaotic. That's what makes them great actually!
I can attest to just opening up a gpu.. repasting, cleaning and PROPERLY screwing it back together causing the gpu to function flawlessly again. In face, it very well could have been a bad cooler mount or overtightened from the factory.
Almost certain it's a software issue going on with Apex Legends. Respawn has yet to address or solve the issue after 6+ months. Countless redditors have experienced the same situation with the game but there's an in-game fix for it which involves disabling some graphical settings.
@@TheDeluche hm? He was on Heaven and it still didn’t work. Based off of the thermal paste application it was 100% that because about a quarter of the dye was uncovered so it was causing a hotspot
my fav series on youtube man. I've learned so much here. keep up the great content.
Honestly Greg is becoming the most wholesome tech UA-camr at this point. Also for all the UK bros this card is on big discount on ebuyer to the point where it’s around 250£ cheaper than any other model. 750£ for a 3080ti
I think that GPU hotspot temperature was bad enough for it to shut off because the bottom of the GPU die was clean of any paste. That lack of paste is shameful for a manufacturer.
Edit: next time you have a GPU issue on your hands, use GPU-Z to check hotspot temps. Sometimes, it's more important than your core temp.
Love that you share all the bizarre issues we discover when working on tech.. Practice always.. beats theory when it comes to troubleshooting...
I had a very similar issue with my 2070, worked great until I started gaming. Had to change all settings to low in game, would work for a while, but eventually would crash the whole PC. I actually downgraded and put in a GT1030 and have had not had any issues since. My son took the 2070 and put it in his rig and confirmed it was definitely a problem with that card.
Many years ago I built a pc for a friend and this exact thing kept happening. Changing the power supply to a higher wattage one fixed the issue that time.
if that 2070 has a backplate check it for tightness....had one do this exact same thing, turned out that manf (actually PNY lol) had over tightened the back plate on the card and it was flexing a memory chip. just so turned out when I re pasted it with some noctua, I torqued it right also lol
Did your 2070 like make a really loud feedback noise through the speakers or headphones and then cut off moments later? Mine does that currently and I have no clue why it doesn’t do it often but I thought maybe it could be from it letting the pc warm up and just going to temps around 65 too fast. But that isn’t the issue, also was your 2070 overclocked
Same my rtx 2070 does the same problm... I just finished putting some thermal paste on it but i'm still suffering from the same problm
@@Everdreamjustplays hmmmm this is very interesting bc I keep getting a memory error too, my pc will crash and it will give an error code such as “Stop Code: Memory management” or something like that.
I get a burst of happiness when you're able to fix viewers PCs im sure they appreciate your work aswell!!
Great job viewer from Japan
awesome series im addicted!
In my opinion the absolute best diagnostic and repair video series on UA-cam. Anyone that wants to learn about repairing computers should start with this series for starters and watch Greg's step by step progressive diagnostic methods.
Love this series so much. another banger greg !
Mr. Salazar for the rescue big heart and knowledge if it were me i would have sent it for an RMA and called it a day. appreciate all the thinking out of the box and loved the series. cheers good sir.
I really love these videos, just keep em coming Greg
I found the opening very likable. When getting to see the PC crash, it helps at least for me, to be more immersed in the solution part of the video.
The same issue I had last week; after a quick inspection, I discovered that the extension cable is slightly looser than the pcie cable. It only happens if the computer shakes.
I had a similar problem with the black screens while gaming. Was the sound from the game still going when it happened?
@@DeathSeed32 yes, only the screen turns to black, i still manage talk to my fren in discord
I had a similar issue with my Gigabyte RTX2070 Super Gaming OC after around 2 1/2 years of gaming use. Mine didn't go black screen, but the fans did go max speed, and temp went upto 105c, the gpu usually runs around the 70c mark while gaming.
Returned under warranty, and when I got it back, the report was that they "Replaced the thermal paste and pads, fans, and other bits and pieces". The exact words on the report. Since then, the card has been faultless. Now has a mind oc, and temps top out at around 75c. The issue was happening even at factory settings btw.
Another great video Greg. Keep them coming.
When you took the back plate off, I noticed that capacitors (iirc those are caps) on the back of the GPU chip itself are 5 bigger black squares and one group of smaller ones. Back when 30xx series launched, that was actually the issue, card's power delivery where those black caps couldn't filter out input voltage well enough and would cause similar / same problems as in this video.
Basically PNY cheaped out on caps where they are needed for power delivery.
Edit: fix was software related by dropping clocks or RMA for a better, newer card. This one may be from an old batch of cards.
i think jaytwocentz has some videos about that, i remember that game from amazon bricking gpus and ultimately the source of the problem where those caps, as soon as i saw those big caps i knew it was those
you're talking about poscaps. it was a faulty driver at the end of the day.
Well done Greg, you definitely fixed it.
I, like many other commenters, believe it had to do with the dried out, nearly cracking thermal paste on the die. The fans ramping up symptom certainly fits.
Just to throw it out there, I had the same Symptoms with a AX1200i and a r9 290 Crossfire. Spiking loads on the GPUs managed to throw the OCP, which led to black monitors and the GPU fans going full tilt. Repasting and underclocking has reduced the Powerdraw, so the Problem is temporarily fixed in my Opinion.
So you're saying these cards experience power draw problems? and you're mentioning a 1200Watt PSU and 2 identical AMD cards in crossfire..
the crossfire setup alone draws 500Watt on load.. it doesn't exceed this number by much... so how is it a power draw issue if you have 1200Watts of raw power xD
Don't tell me your other pc components draw the other 700+ Watts.. that's impossible.
i've had the same problem however with an AMD RX590 from the Sapphire vendor.. i also did a card clean and repaste + 100mhz underclock on the boost state clock + slight undervolt to match it with the underclock and the problem has stayed away ever since.
But just like you i also have a beefy PSU, that is capable of powering more then 4 RX590s..
so how is it a power draw issue? we both have very different parts yet the problem is the same.
@@VintageCR Rather late for the reply, but let me tell you how I managed that:
2x R9 290 @ unlocked Bios, 800W
E5-1650v2@4.6 GHz 250W
8 Dimms ddr3 1866 , EATX Board X79, HDDs and Fans did the rest.
Each Card worked flawlessly alone or with dismounted Crossfire,
switched the Crossfire on and saw at least every hour a blckscreencrash with fans to the Max.
this series got me hook!!.... whats not to love about greg .. the dude is generous, the dude is smart, the dude is kind, the dude is always lending a hand, the dude is always sharing his knowledge and skills, id say let greg run for the presidency,.. i for one will definitely be voting for GREG!!
If he is planning to keep the card and not RMA it. I would at least put the clocks back to the factory settings. If the card crashes at factory settings in means it has some bad components and needs to be RMA'd. But like the others said it could have just been overheating because of poor thermal paste application. But either way, I would not keep a GPU that crashes at factory clocked settings.
The weird fixes that resolve problems successfully that cannot be explained happens all the time. This is a good series as while also solving problems for people it also shows them a first hand experience of a problem and how to approach fixing it.
AWESOME! This is a real time event that many of us have been through. Shock, AWE, and WTF moments!! Great video Thank you!
I've never re watched a video for information on the sponsor. I'm constantly going back because I can never remember lolol Thanks for having a sponsor that is useful and reasonably priced. :)
I had something similar like this as well when it came to the problem on a RTX 2080. My fans would ramp up to 100% at 60c being unable to control the fans at all but the difference between my issue and this guys issue was mine was still operational but with the obnoxious fans ramping up to 100% at 60c and normalising when below 60c. I ended up opening my card (warranty was already out) and found out that a huge chunk of the thermal paste had completely melted to the point that some of the dye itself was visible. So what i am assuming is the temps being shown normal but the cards auto fan curve kicked in due to part of the exposed dye overheating. I repasted it and it solved my issue. Took me a while to figure out the issue with my card since i wanted to avoid opening it up, so went through so much software, update and bios crap for it to turn out to be such a simple fix yet there was hardly any info about the issue online so hoping this helps someone else out there who has the same issue cause it was driving me crazy.
I just posted a similar comment, I had many online chats with customer support people telling me to try so many things that didn't work and then by chance found a random thread in a forum that was talking about thermal paste problems in gpu's, so I took mine apart and sure enough paste was near non-existent, reapplied and worked good as new. Crazy how there isn't more attention about this but hopefully this video and comment section helps.
0:33 my pc did the exact same thing it random times of usage (more often than not, while searching file explorer or on google chrome.. haha) and a -heavily- out of date BIOS seemed to be the problem, updated that and never had another crash!
I wouldn't consider it fully tested until it has the back plate on it, because that's how it shipped from the manufacturer. If it can't pass go with the plate on, then the card is no good. It's all about eliminating any variables
he also should have OCed the Card a bit, to confirm (or disprove) his GPU clock theorie .... but again I think we are on the wrong channel to expect such things ...
I have learned so much watching your videos. You're doing a service to teach how to trouble shoot and fix our own systems without paying someone hundreds of dollars in just labor. Keep them coming.
Initial prediction: the intakes pulled in some of that gfuel & the system ramps up to the max & crashes from the boost of pure power!😉
I play this game too :D watch the intro and try to guess the issue before continuing
i think i see what happened
poor die contact, at 13:32 you can see that the lower portion of the gpu die had almost no thermal paste, which means that at some point it overheated enough to call it quits and do a complete gpu shutdown, so repasting and re assembling it made it go back and sit just right, also at 13:21 i can se a little bit of the cooler just right when you open the card, and you can see that the cooler itself wasnt covered completely with thermal paste in that portion of the die where it also didn't had much or at all
so maybe the gpu detected that the area got way too hot and shut down itself
As someone who deals with a lot of graphics cards, I think the GPU memory is most likely causing the issue.
the 3080's and 3090's had horrid OEM pads.
Completely agree. I think one of the modules was getting hot really fast and his re-paste job made it to stay within tolerance.
I had a similar hunch. GDDR6X runs super toasty, including versus the GPU. Add to that, I've seen plenty of enthusiasts push 1000+ memory OCs. This is also one of the reasons why buying a used graphics card is risky -- something I wondered if it was the possibility in this case.
This is exactly what my rig was doing. I have 3 monitors on my rig, 2 21" Asus and 1 43" 4K. I have a RTX 2070 in this rig. In my old rig I had a GTX 1060. This had a "feature" that would power down the GPU fans (3) under light or no load. You could go into software and disable the "Fan Stop" feature so the fans would run all of the time. Haven't seen that feature on my newer RTX 2070. Thought this might be worth mentioning. God bless and stay well.
I’ve had the same issue with my pc, my culprit was my power supply extensions! Just in case anybody else had this issue :)
i had same issue but i solved it by changing my extensions as well , u r right !
This was a great one. I had no idea what it was. Usually I can figure it out during the video but thanks for picking different stuff as much as you can!!!
I have personally seen this similar issue with my own PC. Common fixes that I have found and learned from the distributer that i got the PC from are: Checking card seating, RAM seating, ANY form of damage in the PCIE lane (mine personally wasn't even visible), driver update, or even a bad windows install (which helped fix my most current PC with this fix). So remember that if you are getting Black Screen and full fan RPM there are A LOT of checks to do.
Keep up the excellent service. I'm sure you are much-appreciated by people in your area. As a former computer repairer, I was conscious of the gap in the home user service market, and tried to fill it. So, I greatly appreciate the thought that has gone into your business model. I also appreciate the fact that you recognise that you cannot help everyone, and so are careful to choose which challenges to accept.
I appreciate that you’re cutting down a little bit on your into
I have this same case (LanCool II Mesh Performance) with an RTX 3080, he can shave a few degrees off his GPU temp by mounting a 120mm Fan under the card blowing up into the GPU. The Mesh power supply cover allows air flow through it, so the case can support basement fans for the GPU. If he decides to just roll with the re-paste, he might be able to aid the card further by just slapping a fan in there.. he might have to swap the GPU Brace as his doesn't seem to allow enough room. I use a lower fan on my card and it shaved on average of 6-8c off the temps which could help him. Some people have had issues where this setup raised CPU temps (pushing GPU heat up to the CPU) but mine has not behaved that way. A good re-paste and a GPU fan could be all he needs. This was an interesting video, never seen this exact issue before. Very informative.. thanks!
12:49 "how do you get out of Heaven?"
that's a hell of a humor right there
mans origin
Ive experienced this issue with my PC personally before, often it the PCIe cable from the GPU to the power supply that is lose or constantly touching something and moving slightly i.e. the case door. Once the GPU detects an error with the PCIe cable it ramps up the external fans as it thinks the GPU has started the fry itself and cools ASAP!! Great video though!
I had the EXACT same issue with a MicroCenter prebuilt I purchased in mid-2020. It had a Ryzen 2700X and RX 5700XT. I narrowed it down to the graphics card, replaced it with a current-gen upgrade, and never had a problem since.
As soon as you took the cooler off I saw the issue, and it's because I went thru the same issue. The thermal paste didn't get full coverage of the die. It doesn't show up on most temp monitors because it causes *hotspot* temps to rise, but not necessarily total core temperature. I had the same symptom happen when I installed my waterblock slightly misaligned, and repasting it and installing it correctly immediately fixed the issue.
At 13:40 if you look at the top right corner of the GPU chip & the lower right corner it looks like it's missing some compound coverage. The issue might have been a bad seat with the cooler. That could've been a zero point overheat in those corners causing the card to freak out. I do think that once you found it "stable", after you reapplied the thermal compound, you should've increased the load to see if it was fixed or not, then turn it down & tell the guy to RMA it...food for thought for next time?
I Absolutely love the Fix Or Flop Playlist. I plan to build a PC this summer instead of using a gaming laptop. I've learned a lot about PC building from mistakes others have made or isolating the issues with the PC. Much more detailed than the traditional PC building videos on YT. Thank you!! 👍
I see some comment mention that "some" part of the gpu looks like some part of it had little or to no thermal paste, starting at around 13:22 into the video one can see. Dont know if that could be the problem, since you re-pasted it. Hmm tricky one.. What you think @Greg Salazar ?
Thank you for the opportunity to learn with you.
13:31 you can see bottom of the die has no thermal paste. gpu die only have 1 temp read and its on the center of the die. It can show normal tempurature but it was overheating from bottom side of die. simply repasting it fixed it.
Seeing this problem with a PNY product and my own experiences with PNY branded products has cemented my resolve to steer clear of them in the future. I bought a PNY brand 512GB SATA SSD for my OS drive and it too lasted about 6 months before kicking the bucket. I wasn't overclocking my system one bit and my system temps never got above 70C. Anyway, I got a Kioxia M.2 SSD and I've been happy with that so far.
Troubleshooting a PC = Process of elimination+ deductive reasoning+ some experience and sometimes not being afraid to break something. Carry lots of extra "bench" parts like my trusty PCIE 8400GS fanless GPU and some basic ram to switch out if you have to. Great channel by the way, it's what this new generation of builders needs.
Love the devil voice asking "How do you get out of Heaven?" LOL
17:30 did you reapply thermal paste when you reassembled it? Makes sense to me that you may have resolved the issue then.
Greg, I learn something every episode that I use in my business. This episode was no exception. The education you provide is priceless. Thank you again!
I had this exact issue with my gaming pc too. After upgrading the thermal pads on the gpu and reapplying thermal paste on both the cpu and gpu, it delayed the crash but did not resolve it. My issue ended up being a faulty power supply. I purchased a power supply tester on amazon to do the test. After replacing my power supply for a new one, with a higher wattage, I have had no more issues. This video helped a lot with my own troubleshooting. Thank you.
Right video at right time. I am having the same problem with my sons PC. Its not happing everyday, while playing games, his GPU (1660 super) fans suddenly start running at full speed and screen becomes blank. I was using extension cables, 24 pin and 8 pin GPU. Today i removed both and running Heaven test rigjht now from last 15 minutes. Its working absolutely fine.
Oh Wow! That's crazy because I have a similar issue (kind of) with my PC. My PC hard crashes when it comes underload (gaming while watching UA-cam with other apps in the background). The only way I circumvent this is by selecting the extreme profile in the icue software of my corsair AIO (I have the same AIO as the viewer BTW). I then suspected it was probably the CPU overheating (Ryzen 3700x) but I was not able to prove it by replicating the issue with Cinebench. I plan to reapply the thermal paste in the future to see if that could fix the issue. For now, I only have to deal with my bedroom getting hotter when I game and the AIO fans noise but I always game with a headset on so… Thanks Greg for these series, you have no idea the impact you they have on the community. Keep up with the good work 👍
Greg Salazar, as someone who had this happen to me before with my RTX 3080 10GB card, and being frustrated since it had no consistency of time it would happen. It could be 60C on load and crash and 80C and crash, every time it crashed it would also delete its own Nvidia driver so had to install it back again. Had alot of blue screens back in the day. Now fiddling with fan curves and fan curve on my case it was fine after a few weeks, no problems since. Unrelated, I have had some blue screens since I tried PBOing my CPU through BIOS instead of Ryzen master since I was having problems with my CPU locked at 4.2 GHz. Found out it was ASUS optimal settings that locked it to 4.2 GHz, turned off the settings and since Ive had blue screens a fair few times, without consistency. Can crash when I launch Teamspeak after a few hours watching netflix, or a few hours gaming. Think its got to be the CPU but who knows.
Could be cpu, could be mobo, could even be the windows install.
@@louiesatterwhite3885 Indeed, think I should do some more testing with temps and such to find out.
The more I watch this series, the more anxious I get that my perfectly fine PC is just gonna throw the towel in for no reason.
And I'm in Australia, Greg! We don't have a Greg, Greg!
Same issue with my 3080 FE. Repasting with Kryonaut and new thermal pads fixed the issue
👍
I don't have a modern card when the VRAM gets that toasty so it could have been an over heating issue. Before video cards started to have good coolers from the third party card makers, I used to tap and screw intel CPU coolers to my GPUs. Sometimes my measurements were off and the CPU heatsink I screwed on barely touched one of the components around the GPU causing similar issues but much faster. It could have been the backplate warped just enough due to possible uneven tightening to cause it to touch some of the solders on the back of the board. Either way, very good video. This channel is real troubleshooting without editing out errors, incorrect theories, or any other human qualities. This is such a great channel to watch.
Been watching your videos for over a year now I always learn something and enjoy the journey, keep up the good work boss.
The only thing I did not see you do is remove the customer graphics cards and then reinstall the card it with out making any changes. I have solved so weird issues by removing a board, cable and then resinstalling. This was a head scratcher for sure. LOVE you channel!!!! Keep up the great content.
Funny, I had a similar problem recently. I have a friend that had a faulty Sapphire R9 390 that would constantly have driver timeouts and get a black screen like this too after 5 minutes of gaming. I was over at his house several times to try and fix it. We first thought his FX 8350 was overheating when the bios already reported 60+°C after starting it for the first time that day. Repasted that but it still happened. The GPU was staying at around 70°C.
Since I thought it was a driver issue, I downloaded and installed the NimeZ drivers which actually fixed the timeouts. When doing benchmarks or gaming, it still crashed. After a while I could actually rule everything out and just thought it was the VRM overheating (Thermal pads disintegrated when touched). I contacted Sapphire for the sizes of the pads but never got them. By the time I waited for an answer, my friends dad had bought him an R9 290 and sold the 390. Thats one solution I guess. It has been running good since then and he wants to upgrade the platform since the 8350 is becoming slower and slower.
I only subbed to you as part of a giveaway... saw your videos popping up on my youtube and now I'm hooked! This is the only time a channel promoted on a giveaway has completely captured my attention with quality videos! Keep it up man :)
So the fix was your magic touch of applying new paste awesome video!
(Before watching) a friend of mine ran into the same issue. He had a 1070 for years now and the problem was the PCIe power cable from the PSU. As soon as he used the PSU's other PCIe cable, the issue stopped. He ended up changing the whole PSU altogether just in case.
Thanks for the videos man, i just built my pc and you are one of the main channels i watch. you go above and beyond to help viewers and that is awesome.
Didn't yet watched beyond 1:15 but I had crashing problem where after gaming 6-9h. it just booted. Reason was overheating CPU VRM. Tested with prime95 and it booted under 3 minutes. Added fan to exhaust air near VRM and prime95 run 25 minutes without problem. Also no problems with games after that. I think I troubleshooted this many weeks. Got better PSU and UPS etc. ok UPS saved few time that I didn't lost my work.
Note: When watching video I noticed that part of die didn't have thermal paste so maybe overheat shutdown or overheating caused timing violations in chip and crashed it. With bare die it's important to have whole die to have good cooling.
If the card has had a jolt in transit or been mishandled it can cause a break between the thermal paste and chip/cooler so it is possible that a repaste could resolve the issue. It's also possible that either a misfitted or slightly out back plate could be causing the PCB to flex as pressure is applied so you need to check the card with it attached.
I love these videos! I will saw my first thought at the symptoms is a bad overclock, but i do suppose poor contact at a thermal point could be the culprit.
Does Greg not know that modern GPUs have another temp sensor you can read out for hottest die part?
Which is normally around 10-20C above the die temp?
GPUz can read it out. Hot Spot @18:27
With my 3080 showing similar issues:
My regular gpu temperature under load was 83C
However my hottest part temp was reaching 110-115C and it would do exactly as you see in the video of the owner.
It ended up being a bad thermal application missing part of the die completely, exactly like the stock thermalpaste was in the video.
Sometimes, the lack of thermal paste is right on the spot were there is no temperature sensor. Just enough margin to let the computer think there is no spike in temps, while actually there is one. Works good enough for light use, but goes in overheating protection when gaming or using intensive tasks.
psu fan is upside.... don't want to be picky but at 6:20 you can see it with the panel down intaking air trough the psu shroud while the gpu intakes air at the same time. Also the card could just have been seated just wrong enough or even the pci-e power cables to trigger the black screen . i seen similar on a system with an just barely inapropriate wattage psu for sustained loads and it is what it felt like to me and that goes for badly seated cards / pci-e connectors and sometimes all of those at once. also the thermal paste seemed not applied correctly all over the die and a sudden heat increase might just trigger the failsafe to save itself unlike the 3090's that died because of some badly coded mmo but they were ramping fans all the same even dead . At least the gpu don't seem to have suffered long term use so the silicon is probably ok and was not badly damaged .
also another issue is that antisag bracket without the pci-e cover brackets will make this system very dirty very quickly and not help the airflow to not intake hot air from the back between the psu and gpu . the case is full mesh yes but i tought i'd point out these issues since hot air would likely go places it shouldn't ,with the panel on the temps will likely not be at all the same ones you saw.
I love this whole series
PCDC and Fix or Flop are the best part of my weekends
I had a similar problem with a new build pc {nowhere near the spec of the machine in the video} running Windows 11 where the pc would randomly reboot and sometimes do this 6+ times in a row like powercycling.
Strangely it worked fine in the workshop where it was built and would only reboot in my house.
It now seems to have sorted itself out.
I watched this video to get an insight into what causes things like this.
PC issues … are sometimes an enigma and not to be understood or solved 🤓🫤. Personal experience over 24yrs of pc building … I’ve had a few that couldn’t be pinpoint accurately located …. But like this episode….. a device anomaly could be identified. Good work Greg 👍😇 love these series 🥰🥰
12:49 "How do you get out of heaven?" LMAO! That made my day!
fixes the card shutting down by replacing thermal paste, ntl proceeds by underclocking for good measure XD love this series, so much that hobby builders can learn here, thx Greg!
I've read about mounting pressure from the factory, but also about 3080+ having power spikes. For people who don't have you around and are out of warranty, always best to clean it up and give it new pads/paste.
Ok, I paused the video at 2:22 because I wanted to take a guess at this issue. I had a VERY similar issue that this viewer had, and in my case, I had 2 defective memory DIMMs that were having an issue. I ruled it out by doing a 2 to 3 day test of all the sticks by myself and found the defective DIMMs and had Corsair replace them. That's my guess. I'll comment again after the video is done to see if I was right.
It's always the simplest fix that works. Great job Greg!
It may not have been paste. Was it hard and dried out? you said temps were ok before taking it apart. Bad paste would have shown signs of higher temps. That leads me to think it is a cracked solder joint that opened when it got hot. When you remounted the backplane with new grease, there was less stress so it didn't open...yet. Could still be a failure waiting to open.
I had this same issue with my pc! I did every test possible and turned out to be my power source. Replaced it and have been going strong for 8 months now!
i had the same issue on an RX5700XT, MSI Mech OC. couldn't figure out anything to get it to stop crashing. I was even using water cooling and it wasn't working. I took it apart, cleaned it, and refilled the loop. works great now. I think sometimes, the dirty card will just crap out for almost no reason.
I really enjoy this series hope it keeps going. Just love seeing computer repairs. And your very detailed and educational with your videos as well
I had tons of issues with Corsair iCue previously with all of these same issues, random black screens, PC needs reboot, Fans on 100%. I tried to RMA my Commander Pro thinking that was the problem but eventually could not reach an agreement with the Customer Support team at Corsair with regards to shipping and my PC potentially being down for about a week so I did a fresh windows install and haven't had any issues since. I had put up with these symptoms for almost 2 years
Hey! I finished watching the video. I wouldn't have thought it was the GPU. Maybe just you taking the GPU out and putting back in reseated the GPU because perhaps it wasn't making good contact with the PCIe connector. Just a thought, not sure if that was it. Good steps of troubleshooting too. I like these videos are on the internet that in future if any of us IT people need to know something, this channel/series servces that purpose! Please keep this series going because it's a good learning and educational experience for all of us!
Haven't seen the past the start yet but my EVGA GTX 970 used to do this, card seemed to overheat for some reason once it passed 69-73C. Custom fan curve in MSI Afterburner helped mitigate it.
i had this problem a long time ago, just 2 days before a vacation for a month, this problem happened, i did not have time to fix it, so i left the pc like that, i come back next month to see test it out then the issue was miraculously fixed XD