Re: [RFC] cache architecture

From: Pieter De Wit <pieter_at_insync.za.net>
Date: Wed, 25 Jan 2012 06:03:10 +1300

> Hard to implement given the current "leg work" is already done ? How
> well does the current version of squid handle multicores and can this
> take advantage of cores ?
>
> Should be easy. We have not exactly checked and documented the DiskIO
> library API. But the current AIO handles SMP exactly as well as the
> system AIO library can, same for the pthreads library behind DiskThreads.

Taken from iscsitarget: they have a "wthreads x" config option that
spawns x threads for writes only, I believe - not sure about reads. You
can't control this in the AIO library (I think ?), but perhaps something
like this could be useful for the pthreads/DiskThreads side.
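
Something like this is what I have in mind for the pthreads side - just
a rough sketch with made-up names (this is not Squid's actual
DiskThreads code), showing a pool where the read and write thread counts
are set separately, like iscsitarget's "wthreads":

// Sketch only: a disk I/O worker pool with separately configurable read
// and write thread counts.  Not Squid's DiskThreads code; names invented.
#include <condition_variable>
#include <functional>
#include <iostream>
#include <mutex>
#include <queue>
#include <thread>
#include <vector>

class DiskIoPool {
public:
    DiskIoPool(int readThreads, int writeThreads) {
        for (int i = 0; i < readThreads; ++i)
            workers.emplace_back([this] { drain(readQ); });
        for (int i = 0; i < writeThreads; ++i)
            workers.emplace_back([this] { drain(writeQ); });
    }
    ~DiskIoPool() {
        { std::lock_guard<std::mutex> g(mx); stopping = true; }
        cv.notify_all();
        for (auto &t : workers)
            t.join();
    }
    void queueRead(std::function<void()> job)  { enqueue(readQ, std::move(job)); }
    void queueWrite(std::function<void()> job) { enqueue(writeQ, std::move(job)); }

private:
    using Queue = std::queue<std::function<void()>>;

    void enqueue(Queue &q, std::function<void()> job) {
        { std::lock_guard<std::mutex> g(mx); q.push(std::move(job)); }
        cv.notify_all();
    }

    // Each worker only ever drains the queue it was bound to, so read and
    // write concurrency are controlled independently.
    void drain(Queue &q) {
        for (;;) {
            std::function<void()> job;
            {
                std::unique_lock<std::mutex> g(mx);
                cv.wait(g, [&] { return stopping || !q.empty(); });
                if (q.empty())
                    return;                 // stopping and nothing left to do
                job = std::move(q.front());
                q.pop();
            }
            job();                          // the real pread()/pwrite() would happen here
        }
    }

    std::mutex mx;
    std::condition_variable cv;
    bool stopping = false;
    Queue readQ, writeQ;
    std::vector<std::thread> workers;
};

int main() {
    DiskIoPool pool(4, 2);   // e.g. a hypothetical "rthreads 4" plus iscsitarget-style "wthreads 2"
    pool.queueWrite([] { std::cout << "swapout (write)\n"; });
    pool.queueRead([]  { std::cout << "swapin (read)\n"; });
}                            // destructor drains the queues and joins the workers

The point is only that reads and writes get their own worker counts;
where the numbers come from (squid.conf, compile time, whatever) is a
separate discussion.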

<snip>

> The cache_dir can report this up to the top layer via their loading
> factor when they are not servicing requests. I was considering it to
> prioritise CLEAN builds before DIRTY ones or cache_dir by the speed of
> its storage type and loading factor.

It seems we are heading towards naming cache_[dir|mem] entries,
otherwise the options might become confusing ? Much the same as
"cache_peer name=bla". (While writing the example below I came up with
another idea, storage weights, which might solve "my issue" with the
double object store.)

cache_dir name=sata /var/dir1 128G 128 128
cache_io_lib sata AIO
cache_rebuild_weight sata 1
cache_request_weight sata 100
cache_state sata readonly
cache_weight sata 100

cache_dir name=ssd /var/dir2 32G 128 128
cache_io_lib ssd pthreads
cache_rebuild_weight ssd 100
cache_request_weight ssd 1
cache_state ssd all
cache_weight ssd 10

cache_mem name=mem1 1G
cache_state mem1 all
cache_weight mem1 1

(I feel the memory one is "out of place", but perhaps someone else has
another idea/thought process - why would you need two cache_mems ?)

What I wanted to show above was the use of "name=" in cache_dir, which
led to another idea, "cache_weight". So we are happy that the options
are now settable per cache_dir :)

cache_weight would allow an admin to specify the "transit cost" of
objects in a cache. Squid starts up and wants to serve objects as
quickly as it can. Memory can be used for caching right away without
issues. Now we start to initialise the disk caches. In my example above,
the "ssd" cache should init before "sata", giving us some storage space
early on. If, during the init of the "sata" cache, the memory allocation
has already filled up, squid starts expiring objects to the next cost
tier, so an object would travel from memory to "ssd" (much the same as
it does now ?)
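
To make the "next cost" idea concrete, here is a rough sketch (made-up
names, not existing Squid code) of how the target store could be picked:
the lowest-weight writable cache above the current one that has finished
its rebuild.

// Rough sketch of the "transit cost" idea - not existing Squid code,
// all names are made up.  An object evicted from one cache travels to
// the lowest-weight writable store above it that has finished rebuilding.
#include <iostream>
#include <string>
#include <vector>

struct Store {
    std::string name;
    int weight;      // the proposed cache_weight: lower = cheaper to serve from
    bool rebuilt;    // has this cache_dir finished its CLEAN/DIRTY rebuild?
    bool readOnly;   // the proposed cache_state readonly
};

// Where should an object expired from 'from' travel next?
const Store *nextTier(const Store &from, const std::vector<Store> &stores) {
    const Store *best = nullptr;
    for (const auto &s : stores) {
        if (!s.rebuilt || s.readOnly || s.weight <= from.weight)
            continue;
        if (!best || s.weight < best->weight)
            best = &s;
    }
    return best;     // nullptr means no tier left: the object is simply released
}

int main() {
    std::vector<Store> stores = {
        {"mem1",   1, true,  false},
        {"ssd",   10, true,  false},
        {"sata", 100, false, false},   // still rebuilding; readonly ignored here, as in the text
    };
    if (const Store *next = nextTier(stores[0], stores))
        std::cout << "expire mem1 objects into " << next->name << "\n";   // prints "ssd"
}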

Now, "sata" is still busy with its init, but "ssd" has also filled up,
so we are forced to retire an object in "ssd", much like we do now. Once
"sata" is done, it will join the chain, so objects will expire like:

mem1->ssd->sata (ignoring the fact that it's set to read-only for now)

If we have an object that is already in the "sata" cache and is also
new in "ssd", we would expire the duplicate as soon as the "sata" cache
is done setting up. We do however now have the overhead of reading an
object, writing it somewhere else (please please please admins - make it
other spindles !!! :) ), freeing the original space, then writing the
new object.

Another example:

"sata" init'ed before "ssd":

Before init: mem1->sata
After init: mem1->ssd->sata

Now we could have the problem of "sata" and "ssd" having the same
object. We would expire the higher-cost one (the one in "sata") since
the object is apparently required more than we "thought" ? This is the
*only* way an object can travel "up" the disk cost chain, otherwise we
could be throwing objects between caches all day long.
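
As a sketch of that duplicate rule (again purely illustrative, not
existing code): keep the copy in the cheapest store and release the
rest.

// Sketch of the duplicate rule - illustrative only, not existing Squid code.
// If the same object ends up in more than one store once a late cache_dir
// finishes rebuilding, keep the copy in the cheapest store and drop the rest.
#include <algorithm>
#include <iostream>
#include <string>
#include <vector>

struct Copy {
    std::string store;   // e.g. "ssd" or "sata"
    int weight;          // that store's proposed cache_weight
};

// Returns the copies that should be released (everything but the cheapest).
std::vector<Copy> resolveDuplicates(const std::vector<Copy> &copies) {
    auto cheapest = std::min_element(copies.begin(), copies.end(),
        [](const Copy &a, const Copy &b) { return a.weight < b.weight; });
    std::vector<Copy> toRelease;
    for (auto it = copies.begin(); it != copies.end(); ++it)
        if (it != cheapest)
            toRelease.push_back(*it);
    return toRelease;
}

int main() {
    // The object was written to "ssd" while "sata" was still rebuilding,
    // then the finished "sata" rebuild turns out to hold an older copy too.
    std::vector<Copy> copies = { {"ssd", 10}, {"sata", 100} };
    for (const auto &c : resolveDuplicates(copies))
        std::cout << "release the copy in " << c.store << "\n";   // prints "sata"
}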

Let's stop there while I am ahead :)

>
>>>
>>> For every x requests, action an "admin/clean up" request, unless
>>> "Queue 1" is empty, then drain "Queue 2"
>>>
>>> I am also thinking of a "third" queue, something like:
>>>
>>> Queue 1 - Write requests (depends on cache state, but has the most
>>> impact - writes are slow)
>>> Queue 2 - Read requests (as above, but less of an impact)
>>> Queue 3 - Admin/Clean up
>>>
>>> The only problem I have so far is Queue 1 is above Queue 2.....they
>>> might be swapped since you are reading more than writing ? Perhaps
>>> another config option.....
>>>
>>> cache_dir /var/dir1 128G 128 128 Q1=read Q2=write (cache_dir syntax
>>> wrong....)
>>> cache_dir /var/dir2 32G 128 128 Q1=write Q2=read (as above, but this
>>> might be on ssd)
>>>
>>> I think this might be going too far ?
>>>
>>> Cheers,
>>>
>>> Pieter
>>>

No comments on the three queues per cache "space" ?
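
To make it easier to comment on, here is roughly how I picture the drain
order per cache_dir - a sketch only, the names and the "every x" default
are made up:

// Sketch of the three-queue idea quoted above: per cache_dir, writes,
// reads and admin/clean-up each get their own queue, the primary queue
// is drained first, and every Nth operation one admin job is slipped in.
// Which queue is primary could be the proposed Q1=/Q2= cache_dir options.
#include <deque>
#include <iostream>
#include <string>

struct DirQueues {
    std::deque<std::string> writes, reads, admin;
    int adminEvery = 8;       // "every x requests"; made-up default
    int sinceAdmin = 0;
    bool writesFirst = true;  // Q1=write Q2=read, or swapped for an ssd dir

    // Decide what this cache_dir should do next.
    bool pickNext(std::string &op) {
        // Slip in one admin/clean-up job every adminEvery requests.
        if (!admin.empty() && sinceAdmin >= adminEvery) {
            op = admin.front(); admin.pop_front();
            sinceAdmin = 0;
            return true;
        }
        auto &q1 = writesFirst ? writes : reads;
        auto &q2 = writesFirst ? reads : writes;
        auto &src = !q1.empty() ? q1 : q2;   // drain Q2 only when Q1 is empty
        if (!src.empty()) {
            op = src.front(); src.pop_front();
            ++sinceAdmin;
            return true;
        }
        // Nothing urgent: use the idle time for admin work instead.
        if (!admin.empty()) {
            op = admin.front(); admin.pop_front();
            sinceAdmin = 0;
            return true;
        }
        return false;
    }
};

int main() {
    DirQueues d;
    d.adminEvery = 2;
    d.writes = {"w1", "w2", "w3"};
    d.reads  = {"r1"};
    d.admin  = {"cleanup"};
    std::string op;
    while (d.pickNext(op))
        std::cout << op << "\n";   // w1 w2 cleanup w3 r1
}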