Re: [DRAFT][MERGE] Cleanup comm outgoing connections in trunk

From: Alex Rousskov <rousskov_at_measurement-factory.com>
Date: Thu, 24 Jun 2010 14:09:25 -0600

On 05/24/2010 03:46 AM, Amos Jeffries wrote:
> Alex Rousskov wrote:
>> * Comm::Connection copy constructor does not copy peer correctly OR
>> Comm::Connection destructor does not clean peer correctly. It may be a
>> good idea to hide peer_ from public use to ensure it is always
>> locked/unlocked properly.
>
> Okay. Fixed the constructors and hid the _peer variable behind
> accessors.
> One thing, though: the set accessor can do the cbdata ops fine, but is
> it okay for the retrieve accessor to just return the ptr, leaving
> cbdataReference*( x.peer() ) to the calling code?

Yes, it is OK. The object storing a pointer should lock/unlock it as
needed. Accessors should not do that if they just return what was
already stored.
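
For example, the pattern could look roughly like this (a sketch only; the
member and method names are illustrative, not necessarily what your branch
uses):

    // setter: the storing object takes (and later drops) its own lock
    void
    Comm::Connection::setPeer(peer *p)
    {
        if (_peer)
            cbdataReferenceDone(_peer); // drop our previous lock, if any
        if (p)
            _peer = cbdataReference(p); // take our own lock on the new peer
    }

    // getter: just return what is stored; a caller that wants to keep the
    // pointer does cbdataReference(x.getPeer()) itself
    peer *
    Comm::Connection::getPeer() const { return _peer; }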

> Which is preferred in this case where the paths are only valid as
> long as FwdStateData exists? RefCounted or cbdataReference?

I prefer RefCounted if you do not need to separate "valid" from
"allocated". Looks like that is what your second revision implements.

> dialer makes sure the object being dialed still exists, or drops the
> call. Am I correct?

You are correct. However, when the Call object stores something, we need
to be careful that this "something" is still valid when the Call object
accesses it. For example, a Call or Dialer object should not print a
call argument for debugging if that argument may be invalid. This is a
general answer to a general question; not specific to your changes.
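
To make that concrete (this is the shape of the check, not the exact
AsyncCall/dialer API):

    // Before a queued call is fired, its dialer verifies that the destination
    // object still exists. Anything that touches the stored call arguments
    // (including a debugging printout) must happen after that check.
    if (dialer.canDial(call))
        dialer.dial(call);  // safe: the job and its arguments were just validated
    // else: drop the call; do not even print its arguments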

> I may have overlooked somewhere where FwdStateData could be deleted by
> operations parallel to FwdStateData::start(). Though I believe if that
> is possible we have a terribly huge race bug somewhere.

AsyncJobs should be started using AsyncStart(), which never fails. Any
failures inside AsyncJob::start() should be no different from failures
inside any other part of the job that is executed asynchronously, because
the initiator code does not really know when start() will be executed.
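
A minimal sketch of what that means on the initiator side (AsyncStart() as
named above; the job type and constructor argument are made up for the
example):

    FwdStateData *fwd = new FwdStateData(request); // hypothetical job
    AsyncStart(fwd); // schedules fwd->start() through the async-call queue;
                     // from the initiator's point of view this cannot fail

    // Whatever goes wrong inside start() is handled by the job itself,
    // exactly like a failure in any later asynchronous step of the same job.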

IIRC, we violate the AsyncStart rule in client_side*.cc (at least).

Again, general answer here, not specific to your code.

>> * Please remove ConnectStateData solo connections as
>> exceptional/different code path. It has already bitten you (see below).
>> One alternative is to create paths with a single connection instead
>> (inside the solo constructor which will be the only solo-specific code
>> then). Another alternative is to add path/solo access methods that hide
>> the difference from the rest of the ConnectStateData code without the
>> Paths creation overhead.
>
> This was the first model I started with. It runs into severe problems
> with connections such as Ident, slow ACL, and DNS where there is no
> state object to hold the authoritative lifetime of the paths vector.

Neither of the two proposed solutions assumes the presence of such an
object (other than the ConnectStateData object itself).

> It's also terribly inefficient: allocating a vector, adding the single
> Comm::Connection object and pushing it, then having comm navigate
> through extra API layers to validate its presence and access it.

The second solution does not allocate a vector.
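
To illustrate the second alternative (method and member names here are
hypothetical): the solo/paths difference is hidden behind a small accessor,
and no vector is ever allocated for the solo case:

    // inside ConnectStateData
    Comm::Connection *
    ConnectStateData::currentCandidate() const
    {
        if (solo)                       // single-destination mode
            return solo;
        if (paths && !paths->empty())   // multi-destination mode
            return paths->front();
        return NULL;                    // nothing left to try
    }

The rest of ConnectStateData (and comm) would call currentCandidate() and
never care which mode it is in.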

> You will have to straighten me out on this but I think it may be needed
> by ICAP Xaction when the ICAP is configured with a service name (DNS
> lookup plus vector of destinations assembled) instead of IP address
> (singleton).

Not sure. Does "paths" stand for multiple IP addresses behind a single DNS
name? AFAICT, that is not documented or, to be more precise, the paths
members are documented as being ... paths.

>> * Whenever possible, please declare Pointer parameters and Pointer
>> return types as const references to avoid unnecessary locking/unlocking.
>>
>
> Unfortunately it's not easy to set const on the volatile Comm::Connection
> objects in outbound connection handling. They get created as
> might-be-used objects and released on failures. Activating one will
> shortly result in Comm updating its properties (transforming wildcard
> IPs into specific on-link IPs, etc.).

I was not talking about that. I will try to be more specific in the next
review email.
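
(So the general point is not lost until then: it is about how Pointer
values are passed, e.g.

    // hypothetical signatures; only the parameter style matters here
    void noteConnection(Comm::ConnectionPointer conn);         // copies the Pointer:
                                                                // lock on entry, unlock on exit
    void noteConnection(const Comm::ConnectionPointer &conn);  // no extra lock/unlock

and not about making the Connection objects themselves const.)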

>> * We now have ConnectStateData and ConnStateData public classes.
>> StateData suffix is already wonderfully descriptive, but this brings
>> confusion to the next level.
>
> Aye. I'm slowly working through comm.cc figuring out which bits are used
> by which side. ConnStateData will hopefully get a better, more
> descriptive name once its scope is clearly known.

Let's solve this by renaming ConnectStateData instead. I believe you
already defined ConnStateData scope. With some polishing touches I would
restate it as:

/// Async-opener of a Comm connection.
/// Can find the first "working" among multiple destinations.

If you agree with the above scope, let's s/ConnectStateData/ConnOpener/g
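
With that substitution applied, the class head would read something like
(just a sketch):

    /// Async-opener of a Comm connection.
    /// Can find the first "working" among multiple destinations.
    class ConnOpener
    {
        // current ConnectStateData members, unchanged
    };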

>> I think the primary reason for most of these problems is that the new
>> Connection class has a dual personality. It both holds the information
>> necessary to establish a connection _and_ the established connection
>> information. I suggest splitting Connection into two classes:
>>
>> 1) Create a PeerPort or PeerAddress class that will hold the information
>> necessary to connect to a peer. This can be a cache_peer, an origin
>> server, an ICAP server, and [in SMP branch] another Squid process.
>> FwdState and others that need to cycle through peers will create and
>> pass PeerPorts or PeerAddresses arrays to Comm::Connector class (see
>> below).
>>
>> 2) Connection object will have a peer member of type PeerPort or
>> PeerAddress. It will also have fd and other fields necessary for an
>> _established_ connection.
>>
>> 3) Rename ConnectStateData to Connector and document it as "creates a
>> Connection to [one of] the supplied PeerPorts". This class will iterate
>> peer details, try opening connections, and will eventually succeed or
>> fail. When it succeeds, it will create the corresponding Connection
>> object and return it to the caller.
>>
>
> This is back to the model we are trying to get away from.

I disagree that what I am proposing is what we should get away from
(naturally).

The value of your changes is in moving the handling of connection
retries and multiple destinations from high-level code to a dedicated
ConnOpener class. I agree that it is the right thing to do. No argument
there.

I just think it should be done differently when it comes to handling
possible Connection destinations. I think it is wrong to store a vector
of unconnected Connections and then use the first element when we can
store a vector of Destinations and use one resulting Connection instead.
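
Roughly like this (the class and member names are the placeholders from
this email, not existing code):

    // What the callers queue up: cheap "where to connect" records.
    class PeerAddress
    {
    public:
        Ip::Address remote; // destination IP and port
        peer *cachePeer;    // cache_peer details, if any (properly locked)
    };
    typedef std::vector<PeerAddress> Destinations;

    // What ConnOpener hands back on success: one real, established Connection
    // (fd, final local/remote addresses, etc.) built from the winning
    // PeerAddress.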

> Your PeerPort class == FwdServer (generated by peer select and handled
> by FwdStateData when calling ConnectStateData)

They are probably similar, but the current code does not use FwdServer
for all connection establishment calls. I think it is good to have a
single Connection Destination class (or small set of classes) that all
code uses in a uniform fashion. Your code does that, but with the wrong
class (a full-blown Connection instead of a small, specific Destination).

>> * Will Comm::Connection be used for incoming connections as well? If
>> not, the name is probably too generic. Should be something like
>> PeerConnection. Or perhaps we do not need this class at all?! If it is
>> only used as an information holder during callback (after the above
>> PeerPort changes) then the callback data should be modified to carry
>> that information instead of adding a new class.
>>
> see comm flow description above.

I still think the above concern is valid. I hope that the problems with
the current implementation (a collection of unconnected connections) are
clear enough. I tried to clarify above that I am not proposing to go back
to the "handle everything in high-level code" model. I only want to
change the connection destination type (solo and paths) from
"unconnected Connection" to "Connection Destination".

Review of your latest patch will follow in a few hours.

Thank you,

Alex.