Pretty wacky week last week: it snowed a ton here, and we may or may not have bedbugs. If we do, they seem contained for now(?), so we're keeping an eye on that.

  • Doombot1@lemmy.one
    3 months ago

    Off to a fairly rough start, unfortunately :/

    Spent seven hours today trying and failing to get Docker to work with our Jenkins deployment at work, and on top of that, my brand new GPU keeps “falling off the bus” (Ubuntu, 4070 Ti Super; the screen randomly freezes and needs a reboot to fix, but the PC keeps running, so I can SSH in and check dmesg and whatnot). Sometimes it’s every 12 hours or so, or even longer, but sometimes (today, for instance) it feels like it’s every ten minutes. Which … sucks.

    Side note… if anybody knows how the heck to fix a GPU falling off the bus… please let me know, lol. It only happens when I’m using the PC (as in, if it’s on but the mouse ain’t moving, it doesn’t seem to happen), and I’m running the latest & greatest NVIDIA 550 drivers on Ubuntu 22.04. I’ve reseated the GPU, and I’m running a 1000W EVGA PSU; the Kill-a-watt attached to it never goes above 450 W or so. And the crash never seems to happen when it’s under a huge amount of load, like doing AI stuff… it only ever seems to happen when I’m browsing files and such. Anyone ever run into this before?? All of the Google answers seem to say it’s a bad PSU or similar, but the PSU has been working just fine & dandy in other PCs, and this system wasn’t doing this at all with my old NVIDIA GPU (swapped last week)…

    • PenguinCoder@beehaw.org
      3 months ago

      Typically, in my experience, what you describe is a power/wattage issue. It could be a power-down or sleep issue, or either the GPU isn’t getting the power it needs when it needs it, or the PSU is just overtaxed. I’d really want to see dmesg logs and more hardware info (do you have crash dumps?). Try disabling any power-down or low-power idle states for the GPU, so it can’t go to sleep.
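
For anyone wanting to pull the relevant lines out of dmesg, here is a rough Python sketch (not something anyone in this thread posted). It assumes the NVIDIA kernel module tags its messages with `NVRM` and reports errors as `NVRM: Xid (...): <code>`; Xid 79 is the code NVIDIA documents for "GPU has fallen off the bus". The exact line layout varies by driver version, and the helper names `find_gpu_errors` and `read_kernel_log` are made up for illustration.

```python
import re
import subprocess

# Loose match for lines like "NVRM: Xid (PCI:0000:01:00): 79, ..."
# (assumed layout; real lines differ slightly between driver versions).
XID_RE = re.compile(r"NVRM: Xid \(([^)]*)\): (\d+)")


def find_gpu_errors(log_text):
    """Return (line, xid) pairs for every NVRM line; xid is None if no code is found."""
    hits = []
    for line in log_text.splitlines():
        if "NVRM" not in line:
            continue
        match = XID_RE.search(line)
        hits.append((line.strip(), int(match.group(2)) if match else None))
    return hits


def read_kernel_log():
    """Run dmesg and return its output (may need root if dmesg is restricted)."""
    result = subprocess.run(["dmesg"], capture_output=True, text=True, check=False)
    return result.stdout


if __name__ == "__main__":
    for line, xid in find_gpu_errors(read_kernel_log()):
        flag = "  <-- fell off the bus" if xid == 79 else ""
        print(f"{line}{flag}")
```

Running it over SSH right after a freeze (before rebooting) should show whether the lockup lines up with an Xid 79 or with some other error code.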

      • Doombot1@lemmy.one
        3 months ago

        Appreciate the response! After many, many hours of research, I came to the same conclusion. I tried a whole multitude of solutions that worked for others and none of them seemed to work, except for a weird hacky “solution”: permanently setting the power state of the GPU to max. Unfortunately, that means it consumes ~50 watts at idle instead of the 5-10 it managed beforehand… but the fact that it fixed the system lockups made it worth it. I think the issue had something to do with the GPU not properly waking up from lower power modes, so I super appreciate the advice :)
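
To check that a workaround like this is actually holding, one option is to poll the performance state and power draw through `nvidia-smi`. The sketch below is only an illustration under some assumptions: it uses the `pstate` and `power.draw` query fields with CSV output, treats P0 as the maximum-performance state, and the function name `parse_gpu_status` is hypothetical. It does not show how the power state was pinned, since the post doesn't say.

```python
import subprocess

# Read-only query: one CSV line per GPU, e.g. "P0, 48.12"
QUERY = [
    "nvidia-smi",
    "--query-gpu=pstate,power.draw",
    "--format=csv,noheader,nounits",
]


def parse_gpu_status(output):
    """Parse nvidia-smi CSV output into (pstate, watts) tuples, one per GPU.

    watts is None when the driver reports a non-numeric value.
    """
    rows = []
    for line in output.strip().splitlines():
        pstate, watts = (field.strip() for field in line.split(","))
        try:
            watts_value = float(watts)
        except ValueError:
            watts_value = None
        rows.append((pstate, watts_value))
    return rows


if __name__ == "__main__":
    result = subprocess.run(QUERY, capture_output=True, text=True, check=True)
    for index, (pstate, watts) in enumerate(parse_gpu_status(result.stdout)):
        print(f"GPU {index}: state={pstate}, draw={watts} W")
```

With the GPU pinned, an idle desktop should keep reporting P0 at roughly the ~50 W mentioned above; if it drifts back to a higher-numbered (lower-power) state, the setting probably didn't survive a reboot.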