Re: Node/Head boot order?
On Wed, Mar 17, 2004 at 09:36:52AM -0600, Bogen, Patrick wrote:
I need to dig through my email backlog more often :)
> We have a cluster here at my work, and whenever the power goes out it turns
> itself on, but then it has to be rebooted manually, because the computation
> nodes came up before (or at the same time as) the head node.
Same issue here.
> Is there some
> software mechanism that can be used to ensure this doesn't happen?
Microway's clusters ship a "slave" boot script that the compute nodes run
at bootup. On start, it waits until it can rsh to the master. (If real rsh
isn't installed, the rsh command is typically just ssh under another name,
so it's all good :) You do need passwordless rsh/ssh for root.
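If the nodes end up using ssh for that check, a probe along these lines might be a reasonable starting point. It is only a sketch: probe_master and SSH are names made up here, and it assumes an OpenSSH client, whose BatchMode and ConnectTimeout options make the check fail quickly instead of sitting at a password prompt during boot.

```shell
#!/bin/sh
# Sketch only: probe_master and SSH are illustrative names, not part of
# Microway's script. Assumes an OpenSSH client.
SSH=${SSH:-ssh}

# Succeeds only if the host in $1 accepts a non-interactive login
# and runs /bin/true.
probe_master() {
    # BatchMode=yes makes ssh fail instead of prompting for a password;
    # ConnectTimeout bounds how long an unreachable master is waited on.
    $SSH -o BatchMode=yes -o ConnectTimeout=5 "$1" /bin/true >/dev/null 2>&1
}
```

Something like "probe_master foo && echo up" then tells you whether passwordless login to the (placeholder) host foo works at all, before you rely on it at boot.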
This is Microway's shell function, which looks twice as large as it needs to
be... I hate it when people write shell scripts that run way more commands
than they need.
    echo -n "Waiting for $MASTER..."
    declare -i count=10
    found="no"
    while [ $count -ge 1 -a $found != "yes" ]
    do
        if [ $count -le 9 ] ; then
            echo
            echo -n "Still not found.. retrying after 15 seconds..."
            sleep 15s
        fi
        rsh $MASTER /bin/true >&/dev/null
        if [ $? -eq 0 ] ; then found="yes" ; fi
        count=$count-1
    done
    if [ $found == "yes" ] ; then
        success ; RETVAL=0
    else
        failure ; RETVAL=1
    fi
    echo
    return $RETVAL
I'd write it as:
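    #!/bin/bash
    # bash, not plain sh: ">&" and "echo -ne" don't work in every /bin/sh
    MASTER=foo
    if [ "$1" != start ]; then
        exit
    fi
    echo -n "Waiting for $MASTER..."
    for i in $(seq 10); do
        # rsh succeeds as soon as the master accepts logins
        if rsh "$MASTER" /bin/true >&/dev/null; then
            echo
            exit
        fi
        echo -ne "\nStill not found.. retrying after 15 seconds..."
        sleep 15s
    done
    echo
    # one last try; its exit status becomes the script's exit status
    rsh "$MASTER" /bin/true >&/dev/null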
See, much simpler. (An exit $? at the end would be clearer, but redundant.)
Actually, I might preserve the usual case statement structure of init
scripts, but you get the point.
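For what it's worth, a case-statement version might look roughly like the sketch below. The names wait_for_master, boot_order_init, TRIES, DELAY and CHECK are made up for illustration, foo is still a placeholder hostname, and it sticks to plain sh constructs (printf, >/dev/null 2>&1) so that the #!/bin/sh line is honest. It has not been tried on a real cluster.

```shell
#!/bin/sh
# Sketch only: every name below is illustrative, not taken from Microway's script.
MASTER=${MASTER:-foo}   # head node hostname (placeholder)
TRIES=${TRIES:-10}      # number of attempts
DELAY=${DELAY:-15}      # seconds between attempts
CHECK=${CHECK:-rsh}     # remote-shell command used to probe the master

# Return 0 as soon as the master answers, 1 if it never does.
wait_for_master() {
    printf 'Waiting for %s...' "$MASTER"
    i=1
    while [ "$i" -le "$TRIES" ]; do
        if $CHECK "$MASTER" /bin/true >/dev/null 2>&1; then
            echo
            return 0
        fi
        # Only sleep if another attempt is coming.
        if [ "$i" -lt "$TRIES" ]; then
            printf '\nStill not found.. retrying after %s seconds...' "$DELAY"
            sleep "$DELAY"
        fi
        i=$((i + 1))
    done
    echo
    return 1
}

# The usual init-script dispatch on the first argument.
boot_order_init() {
    case "$1" in
        start|restart)
            wait_for_master
            ;;
        stop)
            : # nothing to undo for a one-shot boot check
            ;;
        *)
            echo "Usage: boot_order_init {start|stop|restart}" >&2
            return 1
            ;;
    esac
}
```

Saved as an init script, the file would end with a call such as: boot_order_init "$1"; exit $?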
--
#define X(x,y) x##y
Peter Cordes ; e-mail: X(peter@cor , des.ca)
"The gods confound the man who first found out how to distinguish the hours!
Confound him, too, who in this place set up a sundial, to cut and hack
my day so wretchedly into small pieces!" -- Plautus, 200 BC