Hi James,
James Carlson wrote:
> Ellard Roush writes:
>> James Carlson wrote:
>>> That point in time is as soon as your application can start. It need
>>> not have any dependencies at all.
>>>
>> Here is the other point that needs to be clarified.
>> This is not an application.
>> Applications do not start until much later.
>> We have to get the cluster formed and cluster services established
>> before applications run.
>
> We probably have different definitions of that term. For networking,
> an "application" is something that uses the services provided by a
> transport or (for raw sockets) network layer protocol.
>
> I'm not talking about user applications; just things that use
> networking services in some way.
>
OK. Now I understand what you mean.
> Your program (whatever it is) should not need dependencies on
> networking in order to be successful. As I suggested before, it's
> sometimes helpful to listen to routing sockets (you can get hints
> there about when it might be a good time to shorten a retry timer, and
> thus make your program respond more quickly), but it's not really a
> dependency issue.
>
>> The internal interfaces that we had to use are not well documented.
>> Your explanation helps understand what is probably going on.
>
> It's hinted at in the documentation, but not as well-documented as it
> should be. man -s 3socket connect says:
>
>     underlying transport provider. Generally, stream sockets can
>     successfully connect() only once. Datagram sockets can use
>     [..]
>     ECONNREFUSED    The attempt to connect was forcefully
>                     rejected. The calling program should
>                     close(2) the socket descriptor, and issue
>                     another socket(3SOCKET) call to obtain a
>                     new descriptor before attempting another
>                     connect() call.
>
> That "generally" is also true for most unsuccessful connect() calls
> and the advice under ECONNREFUSED is actually true for pretty much all
> failures. The exceptions are the non-failure "failures" -- EALREADY,
> EINPROGRESS, and EWOULDBLOCK. I think that issue is what the text is
> trying to dance around.
>
> You're partly connected (at least bound) after the real failures, and
> getting back to a clean state is easiest just by close() and trying
> again.
>
> The usual references (Stevens and others) have more detailed
> discussions. The underlying problem is that for much of the BSD
> world, the code *is* the documentation, so whatever sockets did, well,
> that's what they do.
>
> (For what it's worth, this isn't even one of the darker corners. Raw
> socket behavior, for example, varies in mysterious ways across OS
> platforms and even across releases of a given OS.)
>
Thanks for the explanation. Our Quorum Server uses the
approach that you suggested. We discovered it the hard way.
We are now attempting to use iSCSI devices as quorum devices.
I will share your insight with the iSCSI people.
Regards,
Ellard
_______________________________________________
zones-discuss mailing list
zones-discuss@xxxxxxxxxxxxxxx