Talk:Systems: Difference between revisions
From Pumping Station One
Carl had originally put this in the content page, I'm moving it to the discussion page --[[User:Hef|Hef]] ([[User talk:Hef|talk]]) 09:31, 1 May 2015 (CDT)
== members site incidents ==
=== Gateway timeout 2015/10/25 ===
here is a discussion from the systems list and irc back when this incident happened. [[User:Skm|Skm]] ([[User talk:Skm|talk]]) 15:28, 2 February 2016 (CST)
'''initial email'''
''the members site is timing out (502) and I notice the backup status on proxmox is still going. It started at midnight.''
''What's the SOP for this? kill it with fire?''
'''later email (after I played around with the proxmox ui)'''
''I don't know how to kill it with fire, and I couldn't reboot it since the UI blocks it while a bootup is happening.''
'''later email, after some discussion on irc'''
''On bob, Hef did:''
`qm unlock 123; qm stop 123; qm start 123`
''The nfs server woke up after doing that.''
''I didn't realize bob was doing the http stuff for members; I thought nginx was on the members box. For later reference, the 502 is coming from bob, not the ps1auth vm.''
''If no one gets around to it before I do, add a SOP section to one of the wiki pages (members? proxmox?).''
'''later clarification from hef'''
''small clarification, the nfs server woke up after the 05 crash, not after running the qm commands''
''123 is the vmid of ps1auth''
''bob is the http gateway for all http/https services in the space. ps1auth is also running an nginx server for dealing with static files.''
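'''possible starting point for the SOP section'''
The three qm commands quoted above could be wrapped in a small shell function as a starting point for an SOP. This is only a sketch, assuming it is run on bob with permission to use `qm` and that 123 is still the vmid of ps1auth. The function name `restart_stuck_vm` and the `QM` override variable are made up for illustration and are not part of Proxmox.

```shell
#!/bin/sh
# Sketch only: unlock, stop, and start a VM, mirroring the one-liner
# "qm unlock 123; qm stop 123; qm start 123" from the incident.
# QM is a hypothetical override so the steps can be dry-run (QM=echo).
restart_stuck_vm() {
    vmid="$1"
    qm_cmd="${QM:-qm}"

    if [ -z "$vmid" ]; then
        echo "usage: restart_stuck_vm <vmid>" >&2
        return 1
    fi

    # Like the original one-liner, each step runs even if the previous one failed.
    "$qm_cmd" unlock "$vmid"   # clear the lock on the VM
    "$qm_cmd" stop "$vmid"     # stop the VM
    "$qm_cmd" start "$vmid"    # start it again; this is the function's exit status
}
```

For the ps1auth case this would be called as `restart_stuck_vm 123`. Running `QM=echo restart_stuck_vm 123` only prints the three subcommands and does not touch any VM, which may be useful for checking the steps first.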