RE: [squid-users] Squid Hardware requirements.

From: Stephan Viljoen <steph_at_gabswave.net>
Date: Sat, 15 Jun 2013 00:29:52 +0200

I was thinking of buying a Supermicro server with 8 to 16 drive bays and
filling it with 15K RPM SAS disks, as I need more disk I/O. I guess a pure
memory system will still be the fastest option, but I'm looking for something
in between: speeding up browsing and saving as much bandwidth as possible
without sacrificing too much speed.

-----Original Message-----
From: Marcus Kool [mailto:Marcus.Kool_at_urlfilterdb.com]
Sent: Friday, June 14, 2013 5:35 PM
To: csn233
Cc: Stephan Viljoen; squid-users_at_squid-cache.org; support and sales desk
URLfilterDB
Subject: Re: [squid-users] Squid Hardware requirements.

On Fri, Jun 14, 2013 at 09:53:20PM +0800, csn233 wrote:
> With YMMV in mind, I get different mileage:
>
> On Fri, Jun 14, 2013 at 7:41 PM, Marcus Kool
> <marcus.kool_at_urlfilterdb.com> wrote:
> > and if your network pipe has sufficient capacity, also fetching an
> > object again from the internet can be faster than fetching from disk.
>
> Your network may be fast, but it doesn't imply a fast path between you
> and the origin server. In other words, it depends on other factors
> than just your own network pipe.

Yes, mileage may vary; it depends on many factors.
Overall, Squid servers without a disk cache can be faster than those with a
disk cache, so it is worth looking at.

> > - more expensive (disks + battery-backed I/O controller)
>
> Expensive disks/battery-backed are over-kill. More/adequate spindles
> should do the job just as well. Why do you need a battery-backed
> controller? Squid is not a transaction-based system - if you lose the
> cache, tough, do "squid -z" and start again.

Fast disks are good. Multiple controllers and multiple buses are good.
An EMC disk array is the most expensive and best option since Squid demands
a huge number of IOPS.
Battery-backed disk controllers are a good tradeoff: they are not very
expensive and give a reasonable performance boost.

> > - Squid uses more memory to index the disk cache (14 MB memory per
> > GB disk
> > cache)
>
> My memory allocation is only about 20-30% of that (formula), and
> paging/swapping metrics don't indicate there is a problem. General
> formulas may not always apply.

The 14 MB per GB is documented in the Squid wiki and is based on the
observation that the average object size is 13 KB.
If you only see 20-30% of the formula's figure, you may have a larger average
object size or only use 20-30% of the configured disk cache.
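
As a rough sketch of that rule of thumb: the 14 MB per GB and the 13 KB
average object size are the figures quoted above, while the function name
index_memory_mb, the fill_ratio parameter, and the assumption that index
memory scales with the number of cached objects are only illustrative and
not taken from Squid itself.

```python
# Figures quoted in this thread (Squid wiki rule of thumb).
RULE_MB_PER_GB = 14.0        # index memory per GB of disk cache
RULE_AVG_OBJECT_KB = 13.0    # average object size the rule is based on


def index_memory_mb(cache_gb, avg_object_kb=RULE_AVG_OBJECT_KB, fill_ratio=1.0):
    """Estimate index memory in MB for a disk cache of cache_gb gigabytes.

    Assumption (not from Squid): memory grows with the number of objects,
    so larger objects or a partly filled cache reduce it proportionally.
    """
    if cache_gb < 0 or avg_object_kb <= 0 or not 0.0 <= fill_ratio <= 1.0:
        raise ValueError("invalid cache size, object size, or fill ratio")
    return RULE_MB_PER_GB * cache_gb * fill_ratio * (RULE_AVG_OBJECT_KB / avg_object_kb)


if __name__ == "__main__":
    for gb in (100, 500, 1000):
        print(f"{gb} GB cache: about {index_memory_mb(gb):.0f} MB of index memory")
    # A cache that is only a quarter full, or holds objects four times
    # larger than 13 KB, would need roughly a quarter of the memory.
    print(f"{index_memory_mb(100, fill_ratio=0.25):.0f} MB")
    print(f"{index_memory_mb(100, avg_object_kb=52.0):.0f} MB")
```

Under these assumptions, seeing only 20-30% of the formula is consistent
with either explanation given above.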

> > unless a redundant hot-swap RAID array is used, less downtime.
>
> Older versions have a problem if a cache_dir fails, I think. Has this
> changed with later versions, or in the pipeline to change, anyone?

The thread started with a web proxy for an ISP.
ISPs generally do not want to restart the proxy and/or rebuild the index.
It takes too long.

> > One can also redistribute budget:
> > - use the budget of the disk system to max out memory.
>
> The benefits of memory will plateau pretty quickly. Unless one
> regularly has a whole bunch of users wanting to access the same pages
> within a relatively short time, the benefit from more memory has its
> limits. Max-out could easily become wastage.

No, memory is by far the fastest cache medium. Since memory is relatively
cheap, it is the best option.

> > - put as much memory as possible.
>
> Disagree - see above. It depends.

OK, I stated it a bit too aggressively. It should read "Buy as much memory as
your budget allows".

> > - carefully size the disk cache; not too large since Squid keeps the
> > index
>
> Agree. If your hit-ratios don't increase, there's not much point in
> having larger cache_dir's. But I wouldn't go as far as "carefully".
> You just need enough or more, just not too much more.

That is your point of view. I prefer to be careful not to use more than
enough since it wastes memory.
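
To illustrate the sizing the other way round, here is a small sketch that
turns a memory budget for the index into an upper bound for the disk cache.
It uses the 14 MB per GB figure from this thread; the function name
max_cache_gb and the 4096 MB budget are only hypothetical examples.

```python
# Rule of thumb quoted in this thread: 14 MB of index memory per GB of disk cache.
RULE_MB_PER_GB = 14.0


def max_cache_gb(index_budget_mb, mb_per_gb=RULE_MB_PER_GB):
    """Largest disk cache (in GB) whose index fits in index_budget_mb."""
    if index_budget_mb < 0 or mb_per_gb <= 0:
        raise ValueError("budget must be >= 0 and mb_per_gb must be > 0")
    return index_budget_mb / mb_per_gb


if __name__ == "__main__":
    budget_mb = 4096  # hypothetical amount of RAM set aside for the index
    print(f"{budget_mb} MB of index memory covers about "
          f"{max_cache_gb(budget_mb):.0f} GB of disk cache")
```

Anything beyond that bound takes memory away from the in-memory cache.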

> > - if using a disk cache, use fast disks and a very good caching I/O
> > controller to get maximum disk performance
>
> Up to a point only, as mentioned above. Local disk I/O may be fast,
> but it doesn't mean your internet access will be as well. Which means
> you end up spending money on hardware that does not deliver actual
> results.

Squid is hungry for a large number of IOPS, so get the best that your budget
can buy.
For low budgets this is a relatively cheap caching disk controller; for high
budgets it varies between low-end and high-end disk arrays (the ones that
have between 32 and 1000+ spindles).

> As Amos said, get the fastest per-core GHz you can find, number of
> cores not important. And have enough disk spindles.
Received on Fri Jun 14 2013 - 22:29:58 MDT