GPU shutting down unless client minimized

Discussion in 'General' started by 0112358, Mar 26, 2021.

  1. 0112358

    0112358 Orc Centurion

    Messages:
    70
    Recently got an HP Z840 workstation:
    Dual Xeon E5-2690 V3 (24 real cores @ 2.6 GHz, 48 including HT)
    128GB RAM
    512GB SSD OS drive (Windows 10 Pro)
    1125W PSU

    I put an EVGA GTX 960 SSC (4GB) in it for now. Will eventually upgrade graphics card, but hoping this can serve me until then.

    The good:
    I can load up 54 clients reasonably quickly and with no issues, as long as they stay minimized. Resources with 54 in game (but minimized) are around 10-12% CPU, 28% RAM, 30-40% GPU (3D meter).

    The problem:
    As soon as I maximize my main client (or any client that isn't set to viewport 1 1 1 1) the GPU 3D meter immediately spikes from ~35% to 100%, and that starts a timer of 3-5 seconds until a complete machine freeze occurs. During the complete freeze, the GPU drops from 100% to 0%.

    Quickly issuing a Win+D keypress will usually, after a variable period of time, bring control back when the GPU cycles back up and the client is minimized again. If too slow on issuing the minimize command, the freeze becomes unrecoverable and a hard machine reboot is required. If only one client is in game there is no freezing, but the issue appears with even a small number of clients.

    (Screenshots attached of what the performance meters in task manager show during these activities)


    Other notes:

    The 1125W PSU that came with the workstation does not have an 8-pin PCIe power cable, only 6-pins. Currently using a Startech single 6-pin to single 8-pin adapter. Supposedly these commercial PSUs carry much more power through their 6-pin cables than consumer PSUs do. I know another PEQ'er that uses this same video card with this adapter on a lower spec'd workstation (and he can run 48 no problem).

    Currently using the DirectX 12 version that is distributed with our RoF2 client.

    Updated drivers for graphics card, Intel chipset, flashed updated BIOS, etc. Issue persists. Have tried changing a great many of the graphics settings, and then changed back to default when they were unsuccessful.

    Also, I have several clients at various states of memory optimization per the usual guides available here and at EZ server. No level of tweaking (including greatly reducing figures in memory.ini) has fixed the issue (except having viewport set to 1 1 1 1, those clients only cause very minor movement to the GPU 3D pane).

    Anyone have ideas on other things I might try? Still hoping it's just some conflicting setting that is causing all of this frustration.

    Thanks,
    Ulmo


    -------------------------------

    More video card info:
    Driver version: 461.92 (also displayed as 27.21.14.6192)
    Driver type: DCH
    Direct3D feature level: 12_1
    CUDA Cores: 1024
    Graphics clock: 1278 MHz
    Memory data rate: 7.01 Gbps
    Memory interface: 128-bit
    Memory bandwidth: 112.16 GB/s
    Total available graphics memory: 69585 MB
    Dedicated video memory: 4096 MB GDDR5
    System video memory: 0 MB
    Shared system memory: 65489 MB
    Video BIOS version: 84.06.26.00.61
    IRQ: not used
    Bus: PCI-E x16 Gen3
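
    As a sanity check on the listing above, the reported memory bandwidth and total graphics memory can be reproduced from the other figures. This is a minimal sketch that assumes the usual relation (bandwidth = per-pin data rate x bus width / 8) and that "total available" is simply dedicated plus shared memory; the variable and function names are made up for illustration.

```python
# Figures copied from the driver info listing above.
DATA_RATE_GBPS = 7.01    # memory data rate per pin, in Gbit/s
BUS_WIDTH_BITS = 128     # memory interface width
DEDICATED_MB = 4096      # dedicated video memory
SHARED_MB = 65489        # shared system memory


def bandwidth_gb_per_s(rate_gbps, width_bits):
    """Peak memory bandwidth in GB/s: data rate times bus width, bits to bytes."""
    return rate_gbps * width_bits / 8


bandwidth = bandwidth_gb_per_s(DATA_RATE_GBPS, BUS_WIDTH_BITS)
total_mb = DEDICATED_MB + SHARED_MB

print(f"Memory bandwidth: {bandwidth:.2f} GB/s")
print(f"Total available graphics memory: {total_mb} MB")
```

    Both values match the listing, which suggests the large "total available" number is mostly shared system RAM; only the 4096 MB of dedicated memory sits on the card itself.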


    System information:
    system_specs.jpg


    Idle (all clients minimized):
    idle.jpg


    Quick maximize and minimize of client (1-3 seconds open):
    quick open.jpg


    Short duration freeze (quick Win+D keypress at the first hint of video chop, 5-6 seconds after client maximized; 30+ seconds to come back from freeze):
    short freeze.jpg


    Medium duration freeze (Win+D keypress at the first moment the video is completely frozen, 7-8 seconds after client maximized; ~2 minutes to come back from freeze):
    s-med freeze.jpg
     
    Last edited: Mar 27, 2021
  2. 0112358

    Update:

    Ordered a Quadro K6000 (12GB) for my workstation. It's older technology, and the 3D benchmark is only about 33% better than the GTX 960. Hoping it performs better with the extra memory and higher wattage (2x6-pin, no adapters needed). Perhaps being designed for a higher workload and multitasking counts for something.

    DirectX 12 isn't supported by this card, only up to DirectX 11.1. This shouldn't be an issue for EQ, correct?

    I would still appreciate any tips toward figuring out the issue in the original post of this thread though, as it may help with performance of the new card. Again: 54 in game minimized = no problems, but if maximize one = GPU freeze/crash. Can run a small number of toons, <20, and not have the freezing issue when maximizing. Maybe the background/minimized clients are calling for GPU resources simultaneously when any single client is maximized?

    Thanks again for looking.
    - Ulmo
     
  3. mackal

    mackal Pyrilen Fireblade

    Messages:
    2,267
    Looks like you're running out of VRAM, going by the graphs (although they only report memory allocated, not memory used). So I guess the Quadro should help, but you can try minimizing video memory usage. EQ is DirectX 9, so whether the card supports version 11 or 12 doesn't matter at all.
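
    One way to check the VRAM theory with numbers instead of the Task Manager graphs is to poll `nvidia-smi`, which ships with the NVIDIA driver. The sketch below is only an illustration: the helper names (`parse_line`, `sample`, `per_client_budget_mb`) are hypothetical, it assumes `nvidia-smi` is on the PATH, and the figure it reads is memory allocated on the card, which is not necessarily what the clients actively touch.

```python
import subprocess

# nvidia-smi query for used/total memory (MiB) and GPU utilization (%).
QUERY = [
    "nvidia-smi",
    "--query-gpu=memory.used,memory.total,utilization.gpu",
    "--format=csv,noheader,nounits",
]


def parse_line(line):
    """Parse one CSV line such as '3900, 4096, 100' into a dict."""
    used, total, util = (float(field) for field in line.split(","))
    return {
        "used_mb": used,
        "total_mb": total,
        "util_pct": util,
        "vram_pct": 100.0 * used / total,
    }


def sample():
    """Run nvidia-smi once and return one dict per installed GPU."""
    out = subprocess.run(QUERY, capture_output=True, text=True, check=True).stdout
    return [parse_line(line) for line in out.strip().splitlines()]


def per_client_budget_mb(total_mb, clients):
    """Even split of dedicated VRAM across clients (a rough upper bound)."""
    return total_mb / clients


# Example: call sample() before and after maximizing a client and compare.
# With 4096 MB and 54 clients, an even split is about 76 MB per client.
```

    If `vram_pct` stays well below 100 while the freeze still happens, VRAM exhaustion is probably not the whole story.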
     
  4. 0112358

    quadro k6000.jpg

    Quadro arrived. At a very small portion of V-RAM used, but the same issue persists. I had worried this might be the case. Had done some testing running 30 on the workstation and 24 on an old tower, which left just over 1/2 of the V-RAM used on the GTX960, but this continued to happen with that arrangement also.

    The specs on the Quadro are much better in almost every aspect than the 960, and to have the exact same issues at the exact same client numbers seems strangely coincidental.

    I can run 30 no problem on the old tower with a very early generation i7 and 2GB Radeon R9 270X (could maybe run more, haven't tried). Settings for clients are nearly the same between the two machines (copied over to new workstation, aside from a small handful of "tanks" set at a higher resolution).

    Could it be a setting (or combination of settings) that is causing a feedback loop or something that might cause it to use exceptionally high GPU resources?

    If someone else is running 30-54 on a machine with < or = specs, maybe I could try with a copy of your eqclient.ini's (and any other relevant config files) that might enable further troubleshooting. I appreciate any ideas.

    - Ulmo
     
  5. Rasek

    Rasek Orc Centurion

    Messages:
    50
    Sent you this a while ago on Discord:
    Saw your post that you updated drivers.
    Switching to fullscreen mode crashes my clients, so I use a maximized window. I know you said maximized, just wanted to mention it just in case.
    Are you using the mq2cpuload plugin? If so, I would unload that.

    Going to full fullscreen mode always crashed my bots on both machines. I've heard someone in ooc say they didn't have that problem, but I never got rid of it. Might have something to do with certain chipsets; I've always kept them in windowed mode. Again, I know you said maximized, not fullscreen, so sorry if this is no help.

    Edit:
    I did carry the same GPU over to the new desktop and have always run 2 monitors.
     
  6. 0112358

    MQ2CpuLoad plugin wasn't running, so that wasn't it. Had worried maybe it was conflicting FPS settings, so double checked that all my clients are set to unlimited while MQ2FPS controls that. No change there.

    With no small amount of troubleshooting this, it is feeling likely to me that the probable cause is the change to 4k resolution with the new curved display I bought. This may be a "duh, you should have told us that", but I was hoping to get by with the Quadro.

    I was stubborn in the principle of not spending hundreds above the new retail price for a used card that is years old. I could at least sell myself on the $550 Quadro K6000 because of how very expensive it was at launch. However, now I'm stuck enough to have gone ahead and spent $900 for a used GTX 1080 Ti.

    Will report back after that arrives...
     
  7. Rasek

    Kind of hope they outlaw bitcoin for that reason. That, and I had 150k doge coins at one point and sold them for less than $1k when they were arguing over stimulus. Now they would be worth around $15k lol
     
  8. 0112358

    1080ti_01.jpg

    The 1080 Ti partially solved the problem. Still can't play with all 54 in game. The GPU produces heavy chop that I would call short duration game freezes (1-2 seconds) continuously while the 54 toons are in motion and in the field of view. The card never flat out quits and causes a full machine freeze as it did before, though.

    Can now play smoothly by spreading it out: 30 clients on workstation + 24 toons on old tower
    Screenshot here shows generally what that usage looks like with a client maximized in many zones. Seems to be near max capability, but still managing to keep up. Some zones the graph stays much flatter right at 99-100%, while other zones it seems like it is able to stay closer to 75-85%.

    I'd guess the 4k display was the primary problem? The basic overhead of the clients uses about 30-40% of the GPU (more with 54), even when minimized. Then a 3k-4k client opening up to draw onscreen was pushing it so far over 100% that it would freeze and reboot itself?

    Someday, when I can find one under $1200, I'll upgrade again to an RTX 3080 or 3080 Ti (it's starting to look like it might be released soon, though the near impossibility of getting one is assured). Maybe that will allow the full 54 to play smoothly.

    In the meantime, certainly open to any ideas for reducing the base overhead of just having clients running but minimized.

    It does seem strange that the GTX 960 used less base overhead than the superior Quadro and 1080 Ti. Settings otherwise being more or less the same between card swap outs.
     
    Last edited: Apr 21, 2021
  9. Rasek

    Oh yeah, I missed that you're running at 4k. That is probably it lol. My bot windows are at 640x480 and I just maximize them if I need them, but I'm running at 1080 on 2 monitors. But I'm also only at 36 toons. I noticed that the size of the bots' chat windows and their filters seems to affect their background lag.
     
  10. 0112358

    Yeah, chat windows fairly small sized and well filtered on all except for main. Main is at 3840x1600, a few tanks at 3440x1440 (I notice no performance difference between these two options). The rest of the bots are between 800x600 and 1024x768, so no extravagance on them.
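
    To put the resolutions mentioned in this thread side by side, here is a rough pixel-count comparison. It is only a sketch: it assumes rendering cost scales loosely with the number of pixels drawn per frame, which ignores everything else the client does, and the labels in the table are made up for illustration.

```python
# Window sizes mentioned in the thread, as (width, height).
RESOLUTIONS = {
    "main (3840x1600)": (3840, 1600),
    "tank (3440x1440)": (3440, 1440),
    "bot high (1024x768)": (1024, 768),
    "bot low (800x600)": (800, 600),
    "Rasek's bots (640x480)": (640, 480),
}


def pixels(width, height):
    """Number of pixels drawn per frame at this window size."""
    return width * height


baseline = pixels(640, 480)
for label, (w, h) in RESOLUTIONS.items():
    count = pixels(w, h)
    print(f"{label:24s} {count:>9,d} px  {count / baseline:5.1f}x a 640x480 window")
```

    By this measure the 3840x1600 main window draws 20 times as many pixels per frame as a 640x480 bot window, and the 3440x1440 tanks about 16 times as many, so a handful of large windows can outweigh dozens of small ones.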