
RE: Problem with Multinet cluster service names.

Subject: RE: Problem with Multinet cluster service names.
From: "Jackson, Craig (Gale)" <Craig.Jackson@xxxxxxxxxxx>
Date: Sat, 05 Jul 2008 09:28:10 -0400
Newsgroups: vmsnet.networks.tcp-ip.multinet

On a possibly related note, we've seen a number of occurrences where the whole
cluster service name system gets wedged under VMS 7.3-2 and MN 5.1.  This has
happened more frequently as we have added members to the cluster.

We have 7 cluster service names, and we have 10 cluster nodes participating.

What typically happens is that one node crashes, and this causes the exchange
of locks which implements the cluster service name mechanism to get hung up.
We typically find that the named process on one node is holding one of the
locks for a long time.  When we restart the named, it clears up.

I'm not sure if we've opened a case with Process, but I don't think so.

We're cutting down that pile of cluster service names anyway.  Some haven't
been used in years, and the function of others is being moved onto a hardware
load-balancing appliance (F5).

Craig Jackson

-----Original Message-----
From: Geoff Bryant [mailto:bryant@xxxxxxxxxxx]
Sent: Saturday, July 05, 2008 7:26 AM
To: info-multinet@xxxxxxxxxxx
Cc: bryant@xxxxxxxxxxx
Subject: Re: Problem with Multinet cluster service names.

I see that DE 10509 is listed as an open issue in the call tracking database.

I am not familiar with the code involved, but I did find an internal discussion
of it that mentions NAMED-030_A052 as introducing the problem and
NAMED-040_A052 as resolving it.  Since NAMED-040_A052 did not resolve it for
you, I would suggest falling back to the original NAMED images from MN 5.2 that
were replaced by ECOs 030 and 040.

Unfortunately, that is all I can do for now, but I will send this along for
folks to look at on Monday.

info-multinet@xxxxxxxxxxx wrote:
>
>Remember this one?
>
>On Apr 10, 11:59 pm, Malcolm Dunnett <noth...@xxxxxxxxxxxxxxxxx>
>wrote:
>
>> I can't get multinet cluster service names to work reliably. It worked
>> just fine for years in older versions of multinet, but with multinet 5.2
>> the domain nameserver doesn't seem to be able to reliably see it. It was
>> working for a while today but then it quit, eg:
>
>> MALVM9> mu nslook vmscluster.mala.bc.ca localhost
>> Server:  LOCALHOST
>> Address:  127.0.0.1
>
>
>> *** LOCALHOST can't find VMSCLUSTER.MALA.BC.CA: Non-existent host/domain
>> MALVM9> mu netcon domain show
>> Connected to NETCONTROL server on "LOCALHOST"
>> < malvm9.mala.bc.ca Network Control V5.2(10) at Thu 10-Apr-2008 8:49PM-PDT
>> < Service VMSCLUSTER.MALA.BC.CA:
>> <   Nodename      Address      Rating
>> <   --------  ---------------  ------
>> <   MALVM3    142.25.103.73       169
>> <   MALVM9    142.25.103.71       169
>> < End of line
>
>> The nameserver does not show the translation of vmscluster even though
>> the mu netcon domain show displays that the two hosts are both
>> contributing entries.
>
>> The servers are Alphaservers running VMS 8.3
>
>> Anyone else seen this problem? Anyone successfully using Multinet
>> cluster service names with Multinet 5.2?
>
>And
>
>> Process Software has confirmed this is a bug. They've opened a defect
>> report (DE 10509).
>
>This continues to be a problem on IA64 with NAMED-040_A052 and
>UCXDRIVER-050_A052 installed - any suggestions? Should this still be
>failing?
>
