
Re: Node/Head boot order?



On Wed, Mar 17, 2004 at 09:36:52AM -0600, Bogen, Patrick wrote:

 I need to dig through my email backlog more often :)

> We have a cluster here at my work, and whenever the power goes out it turns
> itself on, but then it has to be rebooted manually, because the computation
> nodes came up before (or at the same time as) the head node.

 same issue here.

> Is there some
> software mechanism that can be used to ensure this doesn't happen?

 Microway's clusters ship a "slave" boot script that the compute nodes run
at bootup.  On start, it waits until it can rsh to the master.  (If you
don't have rsh installed, rsh is often just a symlink to ssh, so it's all
good :)  Either way, root needs passwordless rsh/ssh from each node to the
master.
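
 A quick way to check the passwordless part by hand, as a rough sketch: this
assumes the OpenSSH client (the -o options below are OpenSSH's and won't work
with a real rsh), and MASTER, REMOTE_SHELL and can_reach_master are
placeholder names of mine, not anything Microway ships.

```shell
#!/bin/sh
# Sketch: check that root can reach the master without a password prompt.
# MASTER and REMOTE_SHELL are placeholder names used for illustration only.
MASTER=${MASTER:-foo}
REMOTE_SHELL=${REMOTE_SHELL:-ssh}

can_reach_master() {
    # BatchMode=yes makes OpenSSH fail instead of prompting for a password;
    # ConnectTimeout keeps a dead master from hanging the check.
    "$REMOTE_SHELL" -o BatchMode=yes -o ConnectTimeout=5 \
        "$MASTER" /bin/true >/dev/null 2>&1
}
```

 Run as root on a compute node, "can_reach_master && echo ok" should print ok
without ever asking for a password; if it doesn't, the boot script will just
sit there retrying.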

 This is Microway's shell function, which looks twice as large as it needs to
be... I hate it when people write shell scripts that run way more commands
than they need:
    echo -n "Waiting for $MASTER..."
    declare -i count=10
    found="no"
    while [ $count -ge 1 -a $found != "yes" ]
      do
        if [ $count -le 9 ] ; then
          echo
          echo -n "Still not found.. retrying after 15 seconds..."
          sleep 15s
        fi
        rsh $MASTER /bin/true >&/dev/null
        if [ $? -eq 0 ] ; then found="yes" ; fi
        count=$count-1
      done
    if [ $found == "yes" ] ; then 
        success ; RETVAL=0
    else 
        failure ; RETVAL=1
    fi
    echo
    return $RETVAL


I'd write it as:
#!/bin/sh

MASTER=foo

if [ "$1" != start ]; then
    exit
fi
printf 'Waiting for %s...' "$MASTER"
for i in $(seq 10); do
    if rsh "$MASTER" /bin/true >/dev/null 2>&1; then
        echo
        exit
    fi
    printf '\nStill not found.. retrying after 15 seconds...'
    sleep 15
done
echo
# one last try; its exit status becomes the script's exit status
rsh "$MASTER" /bin/true >/dev/null 2>&1


 See, much simpler.  (An exit $? at the end would be clearer, but redundant.)

 Actually, I might preserve the usual case statement structure of init
scripts, but you get the point.
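
 For what it's worth, the case-statement version might look roughly like
this.  It's only a sketch: PROBE, TRIES, DELAY and wait_for_master are names I
made up for illustration (they are not in Microway's script), MASTER=foo is
still a placeholder, and unlike the loop above it only sleeps between
attempts instead of doing an extra try at the end.

```shell
#!/bin/sh
# Sketch: wait for the head node, wrapped in an init-script style case.
# MASTER, PROBE, TRIES and DELAY are illustrative names, not Microway's.
MASTER=${MASTER:-foo}     # head node hostname (placeholder)
PROBE=${PROBE:-rsh}       # remote shell used to probe the master
TRIES=${TRIES:-10}        # number of attempts before giving up
DELAY=${DELAY:-15}        # seconds to sleep between attempts

wait_for_master() {
    printf 'Waiting for %s...' "$MASTER"
    n=1
    while [ "$n" -le "$TRIES" ]; do
        if "$PROBE" "$MASTER" /bin/true >/dev/null 2>&1; then
            echo
            return 0
        fi
        # Sleep only if another attempt is still coming.
        if [ "$n" -lt "$TRIES" ]; then
            printf '\nStill not found.. retrying after %s seconds...' "$DELAY"
            sleep "$DELAY"
        fi
        n=$((n + 1))
    done
    echo
    return 1
}

case "${1:-}" in
    start)
        wait_for_master
        ;;
    *)
        # Like the script above, silently ignore anything but "start".
        # A full init script would normally print a usage message here.
        :
        ;;
esac
```

 Since the case is the last command, "start" exits with wait_for_master's
status: 0 if the master answered, 1 if it never did.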

-- 
#define X(x,y) x##y
Peter Cordes ;  e-mail: X(peter@cor , des.ca)

"The gods confound the man who first found out how to distinguish the hours!
 Confound him, too, who in this place set up a sundial, to cut and hack
 my day so wretchedly into small pieces!" -- Plautus, 200 BC


