The world should have blinkenlights on its computer systems. That’s a given.
I wrote a couple of things. One was a Python program that pinged the four machines forming the cluster and displayed a red or green light for each on a Unicorn HAT HD to show its status. It worked very nicely. Then I wrote Python code to include in any program run in parallel on the cluster, which sent a signal saying whether each core was busy or idle. That worked nicely too, and I now had a row of 16 LEDs, in red or green, so I could see what was going on. It was very pretty.
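For anyone wanting to try the same thing, here is a minimal sketch of the ping-and-colour idea. It is not the original program: the host names are made up, the `ping` flags assume Linux, and the function names `is_up` and `colours_for` are invented for illustration.

```python
import subprocess

GREEN = (0, 255, 0)
RED = (255, 0, 0)

# Hypothetical host names; the real ones are not given in this post.
HOSTS = ["node1", "node2", "node3", "node4"]


def is_up(host, timeout_s=1):
    """Return True if a single ping to `host` gets a reply (Linux ping flags)."""
    try:
        result = subprocess.run(
            ["ping", "-c", "1", "-W", str(timeout_s), host],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    except FileNotFoundError:
        return False  # no ping binary on this machine
    return result.returncode == 0


def colours_for(statuses):
    """Map a list of up/down booleans to one RGB tuple per machine."""
    return [GREEN if up else RED for up in statuses]
```

Calling `colours_for([is_up(h) for h in HOSTS])` gives one colour per machine, which can then be handed to whatever pixel-setting call the LED library provides.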
Unfortunately, as it worked by sending a file by FTP every time a processor core changed between running and idle, it created a very effective Denial of Service attack on our network. Oops.
Now that I have thought about it more carefully, I shall be constructing a much better monitoring system, which will be based on sockets. I’ve been avoiding learning how to use them for far too long, anyway…
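As a rough idea of how much lighter a socket message can be than an FTP upload, here is a sketch that sends one short text line per state change over TCP. The message layout (`host core busy`) and the function names are assumptions made up for this example, not a description of the finished system.

```python
import socket


def encode_status(host, core, busy):
    """Pack one core's state into a short text line, e.g. b'node1 3 1\\n'."""
    return f"{host} {core} {int(busy)}\n".encode("ascii")


def decode_status(line):
    """Inverse of encode_status: returns (host, core, busy)."""
    host, core, busy = line.decode("ascii").split()
    return host, int(core), busy == "1"


def send_status(server, port, host, core, busy, timeout_s=1.0):
    """Open a short TCP connection and send a single status line."""
    with socket.create_connection((server, port), timeout=timeout_s) as conn:
        conn.sendall(encode_status(host, core, busy))
```

Each update here is about a dozen bytes, with no login and no file transfer, which is the main attraction over the FTP approach.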
Later:
I tried umpteen example programs using sockets, but the connections were all rejected, and I couldn’t work out how to fix that. Suggestions, anyone?
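One common cause of refused connections, though not necessarily the one at work here, is that nothing is listening on the address and port the client is trying, for instance because the server was bound to `127.0.0.1` and so only accepts connections from its own machine. The sketch below is a small way to check this; `can_connect` and `listen_on` are names invented for the example.

```python
import socket


def can_connect(host, port, timeout_s=1.0):
    """Return True if a TCP connection to (host, port) is accepted."""
    try:
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:  # covers ConnectionRefusedError and timeouts
        return False


def listen_on(address, port=0):
    """Start listening on `address`.

    "0.0.0.0" accepts connections from other machines, whereas
    "127.0.0.1" only accepts them from the same machine.
    Port 0 asks the operating system to pick any free port.
    """
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    srv.bind((address, port))
    srv.listen()
    return srv
```

If `can_connect` succeeds on the server machine itself but fails from another machine, the bind address or a firewall is the likely suspect rather than the client code.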
A Python program that queried the cluster computers took nearly six seconds to look at all 16 cores: hardly blinkenlights… A quick hack of a bash script, astonishingly, took almost as long. Back to trying to get sockets to work, then…
Working sockets tutorial!
At last, I found a socket programming example that worked, here!
I wanted to give Zan a tiny donation, but sadly his GoFundMe page seems defunct, and possibly the message I tried to send him also failed…
Sadly, I was then unable to work out how to accept multiple connections from the cluster computers.
Threading sockets programs!
There’s another set of client-server demos on GitHub, here, that I tested with Marvin and two of the Oysters, to confirm that it can do what I want. I can hoik code from those while retaining the program logic, and maybe get all four Oysters to send their status to Marvin, for him to display. I am not at all bothered that I am writing control system code for the cluster, instead of getting round to some fun applications of parallelism.
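To show the general shape of a server that accepts several clients at once, here is a sketch using Python's standard `socketserver` module, which starts one thread per connection. It is not the code from the GitHub demos. The message layout (`host core busy`, one line per status change), the default port 5000, and the names `StatusHandler`, `StatusServer` and `start_server` are all assumptions for illustration.

```python
import socketserver
import threading

# Latest known state per (host, core); shared by all handler threads.
status = {}
status_lock = threading.Lock()


class StatusHandler(socketserver.StreamRequestHandler):
    """Reads lines of the form 'host core busy' from one client."""

    def handle(self):
        # Iterating over rfile yields one line at a time until the client disconnects.
        for raw in self.rfile:
            parts = raw.decode("ascii", errors="replace").split()
            if len(parts) != 3 or not parts[1].isdigit():
                continue  # ignore malformed lines
            host, core, busy = parts
            with status_lock:
                status[(host, int(core))] = busy == "1"


class StatusServer(socketserver.ThreadingTCPServer):
    allow_reuse_address = True  # allow quick restarts on the same port
    daemon_threads = True       # handler threads exit with the main program


def start_server(address="0.0.0.0", port=5000):
    """Start the server in a background thread and return it."""
    server = StatusServer((address, port), StatusHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

The display loop on the receiving machine can then read `status` (under `status_lock`) as often as it likes, and each cluster machine keeps its own connection open, sending a line only when a core changes state.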