Diffstat (limited to 'vimwiki/Nodes.md')
-rw-r--r-- vimwiki/Nodes.md 24
1 files changed, 24 insertions, 0 deletions
diff --git a/vimwiki/Nodes.md b/vimwiki/Nodes.md
new file mode 100644
index 0000000..4555781
--- /dev/null
+++ b/vimwiki/Nodes.md
@@ -0,0 +1,24 @@
+__Ganglia__ (https://uhhpc.herts.ac.uk/ganglia/) is useful for checking the state of nodes.
+
+If a node goes down while a user’s job is running on it, the job will not terminate properly
+and may flood the user’s inbox with notifications. If Ganglia or `showstate` reports that a
+node is down, consider rebooting it with
+
+`sudo rebootnode.pl nodexxx`
+
+This will prompt you for the IDRAC password, which is `rianhs4b`. Once a node has been rebooted,
+wait a few minutes, then check that you can ssh into it as a normal user and view your home
+directory and `/beegfs`. If so, bring it back online with
+
+`sudo pbsnodes -c nodexxx`
+
+If a node is misbehaving and you do not want to reboot it, or cannot, you can temporarily remove it
+from the pool used by the job control system with
+
+`pbsnodes -o nodexxx`
+
+As after a reboot, this is reversed with
+
+`pbsnodes -c nodexxx`
+
+
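
The post-reboot check described in the added page (ssh into the node as a normal user, then confirm that the home directory and `/beegfs` are visible) could be scripted roughly as below. This is only a sketch and not part of the commit: the function name `check_node` and the `RUN_REMOTE` override are hypothetical, and it assumes key-based ssh login so that `BatchMode` never stops at a password prompt.

```shell
#!/bin/sh
# check_node NODE  (hypothetical helper, not part of the wiki page)
# Returns 0 only if NODE accepts an ssh login and both the user's home
# directory and /beegfs can be listed there.
# RUN_REMOTE may be set to another command for a dry run; by default it is
# ssh with a short connect timeout and no interactive prompts.
check_node() {
    node="$1"
    runner="${RUN_REMOTE:-ssh -o BatchMode=yes -o ConnectTimeout=10}"
    # $runner is deliberately unquoted so that its options split into words.
    # The remote command is single-quoted so $HOME expands on the node.
    # Note: an empty, unmounted /beegfs directory would still pass this check.
    $runner "$node" 'ls "$HOME" /beegfs' >/dev/null 2>&1
}
```

One possible use is `check_node nodexxx && sudo pbsnodes -c nodexxx`, so that the node is only returned to the pool once the check succeeds.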