« MySQL query cache
» Keep the visitors on your site and reduce the bounce rate

Caching, MySQL

How to improve MySQL query cache hit ratio

2007-12-13 | Comment?

MySQL logoI have already mentioned one of the downside of MySQL query cache, query invalidation. Each time when you insert/update/delete records from the table, cached queries from that table will be invalidated even if the edited records are not in the cached subset.

The simple way to improve MySQL query cache hit ratio is not to update tables frequently :) That is of course impossible in most of occasions, but perhaps you can make that effect by using one table for read-only purposes.

I own news aggregator similar to Google News and Topix, but localized for my country. Database is updated constantly, crawler fetch news feeds and insert them into the database etc. Although there is a huge searchable archive of news article links, ~70% of traffic is targeted on homepage and news categories which contains daily news. Improving performance of daily news presentation would boost server performance, since that would be, as I said, ~70% of total traffic.

I have realized that for my site it is not a big problem if the new article doesn’t came up on the home page on the same minute when it was inserted in the database, so I have chosen performance over data consistency.

Queries that I usually use have couple of joins. To keep this post readable, I will present you simplified queries:

SELECT a.article_id, a.title, a.date, c.name, s.name
FROM articles a
INNER JOIN categories c ON (a.category_id=c.category_id)
INNER JOIN sources s ON (a.source_id=s.source_id)
WHERE a.date=CURDATE();

So, what I have done is to „cache“ daily news in one separated table, including all joined data. Ids and other staff are indexed in the daily_cache table.

INSERT INTO daily_cache (a.article_id, a.title, a.date, a.category, a.source) VALUES
SELECT a.article_id, a.title, a.date, c.name, s.name
FROM articles a
INNER JOIN categories c ON (a.category_id=c.category_id)
INNER JOIN sources s ON (a.source_id=s.source_id)
WHERE a.date=CURDATE();

After that, all queries that fetch daily news are directed to daily_cache table:

SELECT a.article_id, a.title, a.date, c.name, s.name FROM daily_cache;
or
SELECT a.article_id, a.title, a.date, c.name, s.name FROM daily_cache WHERE category_id=1;
etc.

What have I achieved with this:

  • MySQL and OS server would probably keep daily_cache table in working memory since the table is not so big and it is frequently used. I’m using regular MyISAM table type, but you can also consider MEMORY table type.
  • We have a flat records in daily_cache table with all necessary informations for displaying, so the performance will not be decreased because of joins.
  • daily_cache is updated on every 30 minutes (complete truncate and insert with rotation, so we never display data from an empty table) which means that MySQL query cache will have an easy job of keeping frequently used queries in cache for 30 minutes.
  • As I said, daily news pages get around 70% of total traffic, so this method significantly reduced server’s CPU usage and improved performances.

Conclusion

This is not the universal way of improving performance of your database server, but it might be useful in some occasions. Your data might be to critical not to be displayed on the homepage for 30 minutes. Or you have lots of data so the inserting records in cache table and index updating would take ages, etc.

I do believe that there is no universal solutions for every problem. For example, MySQL query cache tries to provide an universal solution, it will cache all your queries etc. and we have seen that it might not be the good idea and in many cases it won’t improve performances significantly.

It is up to you to make custom caching solution that will suits your needs. But, by reading other peoples case studies you can just get the better idea how can you improve your web site performance.

have your say

Add your comment below, or trackback from your own site. Subscribe to these comments.

Be nice. Keep it clean. Stay on topic. No spam.

You can use these tags:
<a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <strike> <strong>

:

:


« MySQL query cache
» Keep the visitors on your site and reduce the bounce rate