Re: Squid-smp : Please discuss

From: Henrik Nordstrom <henrik_at_henriknordstrom.net>
Date: Mon, 14 Sep 2009 23:15:22 +0200

Mon 2009-09-14 at 22:43 +1200, Amos Jeffries wrote:

> >>> on epoll ( select ) implementations in squid. It is found that epoll
> >>> is polling all the descriptors & processing them one by one. There is
> >>> an important FD used by http port which is always busy, but has to
> >>> wait for other descriptors in queue to be processed.

The http_port is far from that busy. Unless you are under heavy overload,
the http_port is not ready in most poll loops. The only traffic seen on
that file descriptor is the accepting of new connections. Additionally,
given the nature of HTTP data flow, the timing between connection
establishment and processing of the new connection is not very critical,
as long as it gets done within reasonable time. There is a large margin
thanks to the time it takes for the client to send the first request.

> >>> Then, I also found that it is possible to separate working of all fd
> >>> handlers, e.g. fd used by http port. (tried)
> >>> This can be done by making some changes in codes.................

The issue preventing this is how to deal with access to all the shared
data. Quite a few locks will need to be introduced if we start
introducing threads.
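
As a rough illustration of the locking concern, here is a minimal sketch in
Python (used only for brevity; Squid itself is C++). The ClientTable class
and its methods are hypothetical stand-ins for a shared structure, not
anything taken from Squid. Every read-modify-write on the shared dictionary
goes through one lock, which is the kind of change that would have to be
repeated for each piece of shared state.

```python
import threading


class ClientTable:
    """Hypothetical shared structure; not a Squid type."""

    def __init__(self):
        self._lock = threading.Lock()
        self._counts = {}

    def note_request(self, client):
        # The read-modify-write on shared data is done under the lock.
        with self._lock:
            self._counts[client] = self._counts.get(client, 0) + 1

    def count(self, client):
        with self._lock:
            return self._counts.get(client, 0)


def demo(n_threads=4, per_thread=1000):
    """Several threads update the same entry; returns the final count."""
    table = ClientTable()

    def worker():
        for _ in range(per_thread):
            table.note_request("10.0.0.1")

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return table.count("10.0.0.1")


if __name__ == "__main__":
    print(demo())
```

One lock per structure is the simplest arrangement; it says nothing about
how fine-grained the locking in a threaded Squid would need to be.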

> >> Special pseudo-thread handling is already hacked up in a pseudo-thread
> >> poller for DNS replies. Which is complicating the FD handling there.

Hmm.. is there? Where?

Checking the state of epoll in Squid-3, I notice it's in the same shape
as it was in 2.5, missing the cleanups and generalizations done in
Squid-2. In Squid-2 we do have a set of "incoming file descriptors" which
are polled more frequently than the others if there is too much work in
the select loop. But users of high-traffic servers have found that for
epoll this makes accepting new connections too aggressive, and Squid runs
better (more smoothly) with this disabled.
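
The "incoming file descriptors" idea can be sketched roughly as follows.
This is Python for brevity and is not the Squid-2 code: the names
run_one_pass, poll_incoming and INCOMING_INTERVAL are made up for the
illustration, and the interval of 2 is an arbitrary assumption. While
working through the ready list of ordinary descriptors, the loop does an
extra zero-timeout check of the listening sockets every few events.

```python
import selectors
import socket

INCOMING_INTERVAL = 2  # hypothetical tuning value, not a Squid setting


def poll_incoming(incoming_sel, log):
    """Zero-timeout check of the listening sockets; accept what is pending."""
    for key, _events in incoming_sel.select(timeout=0):
        conn, _addr = key.fileobj.accept()
        conn.close()  # a real proxy would hand the connection on
        log.append("accept")


def run_one_pass(regular_sel, incoming_sel, log):
    """Handle ready regular fds, re-checking the incoming set every few events."""
    handled = 0
    for key, _events in regular_sel.select(timeout=0):
        key.fileobj.recv(4096)
        log.append("regular")
        handled += 1
        if handled % INCOMING_INTERVAL == 0:
            poll_incoming(incoming_sel, log)
    poll_incoming(incoming_sel, log)


def demo():
    """Four busy regular fds plus one pending connection; returns the event log."""
    regular_sel = selectors.DefaultSelector()
    incoming_sel = selectors.DefaultSelector()
    listener = socket.socket()
    listener.bind(("127.0.0.1", 0))
    listener.listen()
    incoming_sel.register(listener, selectors.EVENT_READ)

    pairs = [socket.socketpair() for _ in range(4)]
    for writer, reader in pairs:
        writer.sendall(b"x")
        regular_sel.register(reader, selectors.EVENT_READ)

    client = socket.create_connection(listener.getsockname())
    # Wait until the connection is pending; level-triggered, so not consumed.
    incoming_sel.select(timeout=1.0)

    log = []
    run_one_pass(regular_sel, incoming_sel, log)

    regular_sel.close()
    incoming_sel.close()
    for sock in [listener, client] + [s for pair in pairs for s in pair]:
        sock.close()
    return log


if __name__ == "__main__":
    print(demo())
```

With a small interval the listening sockets are favoured strongly, which is
the behaviour reported above as too aggressive under epoll on busy servers.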

> >> That allows making the whole select loop(s) happen in parallel to the rest
> >> of Squid. Simply accepts and spawns AsyncJob/AsyncCall entries into the
> >> main squid processing queue.
> >>
> >> Workable?

Yes, but I kind of doubt it will give any benefit at all.

When Squid is running under normal load, the epoll queue length is in the
range of 10-40, not more, and only occasionally are the http_port fds
marked ready (far from always). The epoll call in itself is not a very
heavy operation; the heavy part is reading and processing the
request/response data.

But the upside is that this is an easy spot to attack when trying to add
and experiment with threading. The accept() is very isolated in the
amount of data it needs access to.
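
A minimal sketch of that experiment, again in Python rather than Squid's
C++ and with hypothetical names throughout (acceptor, handoff, demo): one
thread does nothing but accept() and put the new connection on a queue,
and the main thread remains the only place where the connection is
processed. The queue stands in for the AsyncCall queue mentioned in the
quoted proposal; it is an assumption of the sketch, not Squid's API.

```python
import queue
import socket
import threading


def acceptor(listener, handoff, stop):
    """Accept connections and hand them to the main thread via a queue."""
    listener.settimeout(0.1)  # wake up regularly to notice the stop flag
    while not stop.is_set():
        try:
            conn, _addr = listener.accept()
        except socket.timeout:
            continue
        except OSError:
            break
        handoff.put(conn)  # the only data shared with the main thread


def demo(n_clients=3):
    """Connect a few clients; the main thread reads their first bytes."""
    listener = socket.socket()
    listener.bind(("127.0.0.1", 0))
    listener.listen()
    handoff = queue.Queue()
    stop = threading.Event()
    thread = threading.Thread(target=acceptor, args=(listener, handoff, stop))
    thread.start()

    clients = [socket.create_connection(listener.getsockname())
               for _ in range(n_clients)]
    for i, client in enumerate(clients):
        client.sendall(b"req%d" % i)

    received = []
    for _ in range(n_clients):
        conn = handoff.get(timeout=2)  # main loop picks up the new connection
        conn.settimeout(2)
        received.append(conn.recv(16))
        conn.close()

    stop.set()
    thread.join()
    listener.close()
    for client in clients:
        client.close()
    return sorted(received)


if __name__ == "__main__":
    print(demo())
```

Since the acceptor touches only the listening socket and the queue, no
other locking is needed here, which is what makes this spot easy to
experiment with.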

Regards
Henrik
Received on Mon Sep 14 2009 - 21:15:27 MDT
