Blazing-Fast Dynamic Caching for WordPress: Interview with SiteGround’s Hristo Pandjarov

SiteGround CachePress
This week, we interview Hristo Pandjarov from SiteGround. Hristo is helping lead the development of SiteGround’s CachePress caching plugin, part of an innovative dynamic caching solution that uses Nginx as a reverse proxy to serve full pages from the server memory.

About SiteGround: it’s WPShout’s exclusive host, as well as the host for our agency site and our other web projects. All this is served—super-performant and 100% reliable—through one SiteGround GoGeek account.


Full Interview

Note: the video contains a screenshare-based walkthrough of the plugin’s code that we’ve omitted from the transcript. Enjoy!

Transcript

Fred: We’re joined by Hristo, a technical lead at SiteGround, to discuss the CachePress caching plugin.

Any WordPress install hosted on SiteGround gets SiteGround’s caching solution, SuperCacher. CachePress is a way to manage SuperCacher’s caching in the WordPress admin, without having to go into your hosting interface directly.

Hristo, could you tell us a bit about CachePress, the history there, and how long you’ve been working on it?

History of CachePress

Hristo: We’ve been doing a lot of stuff to improve the overall performance of WordPress, not just caching. It’s a very big effort to provide the fastest possible hosting environment for WordPress: everything from timeouts and configurations to having the latest PHP version and Memcached. It’s all one big thing, with everything pretty much connected to everything else.

The caching is a big part of what we do to give our customers great loading speeds, especially for WordPress. We’re using Nginx as a reverse proxy, and we developed the plugin because we wanted to seamlessly integrate the service and the WordPress website. It’s really not that hard to completely cache a website, but it’s difficult to clear that cache whenever you have to. And it’s not that easy to make it work seamlessly without you having to purge the cache manually. So that’s where we started doing the plugin.

Nginx Reverse Proxy

Nginx as a reverse proxy is way faster than any caching plugin alone, because it saves cached content within the server’s memory.

David: Before we jump in, I have a naïve question: how is Nginx reverse proxying different or better than WP Super Cache or something like that?

Hristo: Nginx as a reverse proxy is way faster than any caching plugin alone that you can think of, simply because Nginx is a service and it runs on the entire server, and then that cached content is saved within the server’s memory. When your visitors hit the cached content, when they want to load a cached page, they get the page the fastest possible way: they get their record from the server’s RAM, even before they hit the webserver.

So we have two benefits: first, you get your content from the fastest possible place. There’s nothing faster than server memory. Second, you significantly lower the load that those visitors put on the webserver: the request never even reaches the webserver, so no PHP processes are hit.

Regular caching plugins like W3 Total Cache do something pretty similar, but they save static files on the server’s hard drive. And loading something from the server’s memory is much faster than loading it from a hard disk.

And again, when you’re using a plugin, your visitors hit the webserver, and then the webserver loads the file, and so you’re creating hits again. It’s still a static HTML file, but again it’s being processed by—in our case, by Apache, which stands behind Nginx.

David: So you have a new layer in front of Apache, whereas with other plugins you hit Apache and it has to do some processing and give back a file’s contents.

Hristo: Yes. And again, you’re loading that content from a file on the hard drive, whereas with our reverse proxy you’re loading it from the server’s memory.
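To make the layering concrete, here is a minimal sketch of the idea, written in Python rather than as real Nginx configuration. It is not SiteGround's implementation: `MemoryPageCache` and `render_page` are hypothetical names, and `render_page` simply stands in for Apache and PHP generating a page.

```python
from typing import Callable, Dict


class MemoryPageCache:
    """Toy reverse proxy: serves full pages from RAM, falls back to a backend."""

    def __init__(self, backend: Callable[[str], str]) -> None:
        self._backend = backend            # stands in for Apache + PHP
        self._pages: Dict[str, str] = {}   # URL -> complete HTML output
        self.backend_hits = 0

    def get(self, url: str) -> str:
        # Cache hit: answered from memory, the backend is never touched.
        if url in self._pages:
            return self._pages[url]
        # Cache miss: the request goes through to the web server once.
        self.backend_hits += 1
        html = self._backend(url)
        self._pages[url] = html
        return html

    def purge_all(self) -> None:
        # Drop every cached page, like a purge for the whole domain.
        self._pages.clear()


def render_page(url: str) -> str:
    # Hypothetical stand-in for WordPress generating a page.
    return f"<html><body>Content for {url}</body></html>"


proxy = MemoryPageCache(render_page)
first = proxy.get("/about/")
second = proxy.get("/about/")
```

After the two requests above, `proxy.backend_hits` is 1: the second visitor is served from memory and the backend does no work.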

Static, Dynamic, and Object Caching

Static caching caches CSS, JavaScript, and other static content; the Memcached object cache caches database queries; and the Nginx dynamic cache is full-page caching.

Fred: There’s three layers to this caching, right?

Hristo: Yes.

Fred: It would be great to know what each of them does.

Hristo: Static cache is enabled by default on all accounts. It’s pretty straightforward: it does static caching. It caches CSS, JavaScript, Flash, it searches for static content. Whether or not you use the SiteGround plugin, it will just work across the entire domain. You don’t even have to be using WordPress. It just caches static content, that’s it.

The dynamic cache is the interesting part; it’s the reverse proxy part I’ve been talking about so far.

The idea behind Memcached is to speed up your database connection. If you have certain queries that are often repeated to your database, it can help and speed up that process. The last tab is HHVM, but HHVM is only available on our cloud hosting accounts, so you won’t get it on the GoGeek.

So when you get content from Nginx, it’s full page caching: the complete HTML output. We also cache static content, like images, JavaScript, CSS files. With the HTML output, though, the mechanism is the same: it’s full page caching, not object caching.

David: I know that for the GoGeek shared plan we’re on, we do have object caching available, and it’s presented through the plugin.

Hristo: Yes, the plugin gives you the ability to enable Memcached, too.

David: And that caches your transients in memory, right?

Hristo: Among other things, yes.
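As a rough illustration of what an object cache buys you for repeated queries, here is a small Python sketch. It only imitates the idea behind Memcached with an in-process dictionary; `ObjectCache`, `get_or_compute`, and `run_expensive_query` are hypothetical names, not part of Memcached or WordPress.

```python
import time
from typing import Any, Callable, Dict, Tuple


class ObjectCache:
    """Toy in-memory object cache with per-key expiry."""

    def __init__(self) -> None:
        # key -> (expiry timestamp, cached value)
        self._store: Dict[str, Tuple[float, Any]] = {}

    def get_or_compute(self, key: str, ttl: float, compute: Callable[[], Any]) -> Any:
        now = time.monotonic()
        entry = self._store.get(key)
        # Fresh entry: return it without touching the database.
        if entry is not None and entry[0] > now:
            return entry[1]
        # Missing or expired: run the query once and remember the result.
        value = compute()
        self._store[key] = (now + ttl, value)
        return value


calls = {"queries": 0}


def run_expensive_query() -> list:
    # Hypothetical stand-in for a repeated database query.
    calls["queries"] += 1
    return ["post-1", "post-2"]


cache = ObjectCache()
a = cache.get_or_compute("recent_posts", 60.0, run_expensive_query)
b = cache.get_or_compute("recent_posts", 60.0, run_expensive_query)
```

Both calls return the same list, but the query itself runs only once within the 60-second lifetime.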

SiteGround CachePress

cPanel and User Interface

Make sure you’ve enabled the SuperCacher in cPanel. The plugin acts as a connector between WordPress and the service: if the dynamic cache is not enabled in cPanel, even if your plugin is activated, even if it provides the correct headers, the caching will not work. Part of the plugin’s newest functionality is to test whether you’re cached or not, because it’s important to actually make sure that the service is running.

We don’t serve cached content to logged-in users, the idea being that if you’re logged into your website, you need to see one-hundred percent dynamic content. This is one of those things that we’ve done especially for shopping carts and BuddyPress and people who want to have dynamic social networks or whatever. So if you’re logged in you’re going to see dynamic content every time.
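One common way to make that decision is to look at the request's cookies. The Python sketch below assumes the cache checks for WordPress's login cookie, whose name starts with `wordpress_logged_in_`; whether SuperCacher uses exactly this test is an assumption, and `should_bypass_cache` is a hypothetical helper.

```python
from typing import Mapping

# WordPress names its login cookie "wordpress_logged_in_" followed by a site hash.
LOGGED_IN_PREFIX = "wordpress_logged_in_"


def should_bypass_cache(cookies: Mapping[str, str]) -> bool:
    """Return True when the request carries a WordPress login cookie."""
    return any(name.startswith(LOGGED_IN_PREFIX) for name in cookies)
```

An anonymous visitor with no such cookie would get the cached page, while a logged-in shopper or BuddyPress member would always go through to PHP.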

Fred: Thinking from the standpoint of a developer who’s not all that savvy about caching, the resources I usually notice being cached are image files and CSS files; those are the things I have trouble purging from the cache. It sounds like it’s really the static layer that is caching those resources.

Hristo: Yeah, the static one for sure.

Fred: So when you purge it, does it purge all three of those layers?

Hristo: Yes, it purges everything for that domain name. If you see your text updated but still see your old image, that means it’s the static cache. If the entire page is old, that means the dynamic cache is in play too.

Auto-Purging and WordPress Hooks

One of the main things that the plugin does is the auto-purge or auto-flush functionality. We monitor for different events on your WordPress website. Then we purge the cache—unfortunately, so far, for the entire domain name—whenever we detect that you have updated a post or page, installed a plugin, or whatever. The plugin needs to know when the cache has to be cleared, so we’ve created a list of hooks:

public function assign_hooks_for_autoflush() {
	add_action( 'save_post', array( $this,'hook_add_post' ) );
	add_action( 'edit_post', array( $this,'hook_add_post' ) );
	add_action( 'publish_phone', array( $this,'hook_add_post' ) );
	add_action( 'publish_future_post', array( $this,'hook_add_post' ) );
	add_action( 'xmlrpc_publish_post', array( $this,'hook_add_post' ) );
	add_action( 'before_delete_post', array( $this,'hook_delete_post' ) );
	add_action( 'trash_post', array( $this,'hook_delete_post' ) );
	add_action( 'add_category', array( $this,'hook_add_category' ) );
	add_action( 'edit_category', array( $this,'hook_edit_category' ) );
	add_action( 'delete_category', array( $this,'hook_delete_category' ) );
	add_action( 'add_link', array( $this,'hook_add_link' ) );
	add_action( 'edit_link', array( $this,'hook_edit_link' ) );
	add_action( 'delete_link', array( $this,'hook_delete_link' ) );
	add_action( 'comment_post', array( $this,'hook_add_comment' ),10,2 );
	add_action( 'comment_unapproved_to_approved', array( $this,'hook_approve_unapprove_comment' ) );
	add_action( 'comment_approved_to_unapproved', array( $this,'hook_approve_unapprove_comment' ) );
	add_action( 'delete_comment', array( $this,'hook_delete_comment' ) );
	add_action( 'trash_comment', array( $this,'hook_delete_comment' ) );
	add_action( 'switch_theme', array( $this,'hook_switch_theme' ) );
	add_action( 'customize_save', array( $this,'hook_switch_theme' ) );
	add_action( 'automatic_updates_complete', array( $this,'hook_atomatic_update' ) );
	add_action( 'future_to_publish', array( $this,'scheduled_goes_live' ) );
	add_action( '_core_updated_successfully', array( $this,'core_update_hook' ) );
}

Fred: So this is a list of the things that will cause the cache to purge?

Hristo: Yeah, those are pretty much the hooks that we monitor.
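The pattern behind the hook list can be sketched outside WordPress as well. The Python below is a simplified model, not the plugin's code: `HookRegistry` imitates WordPress's `add_action` and `do_action`, and `AutoFlusher` is a hypothetical class that purges on a handful of the hooks named above.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List


class HookRegistry:
    """Tiny stand-in for WordPress's add_action / do_action pair."""

    def __init__(self) -> None:
        self._actions: DefaultDict[str, List[Callable[..., None]]] = defaultdict(list)

    def add_action(self, hook: str, callback: Callable[..., None]) -> None:
        self._actions[hook].append(callback)

    def do_action(self, hook: str, *args: object) -> None:
        # Run every callback registered for this hook, in registration order.
        for callback in self._actions[hook]:
            callback(*args)


class AutoFlusher:
    def __init__(self, hooks: HookRegistry) -> None:
        self.purge_count = 0
        # Only a few of the hooks from the list; the plugin registers many more.
        for hook in ("save_post", "trash_post", "comment_post", "switch_theme"):
            hooks.add_action(hook, self.purge_domain_cache)

    def purge_domain_cache(self, *args: object) -> None:
        # Whatever the event was, the whole domain's cache is dropped.
        self.purge_count += 1


hooks = HookRegistry()
flusher = AutoFlusher(hooks)
hooks.do_action("save_post", 42)                 # monitored: triggers a purge
hooks.do_action("comment_post", 7, "approved")   # monitored: triggers a purge
hooks.do_action("wp_head")                       # not monitored: nothing happens
```

Here `flusher.purge_count` ends at 2, because only the two monitored events reach the purge callback.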

Caching and Cron Jobs

WordPress’s internal cron mechanisms are a fun thing to work with.

There’s a funny story here: WordPress’s internal cron mechanisms are a fun thing to work with. We had issues when people would schedule a post. We fixed it a while back, but a year ago what used to happen is this: you write a new post, you complete it, and you schedule it to be published tomorrow. Then you log off your site and go on vacation. The internal WordPress cron only works when there’s a hit on any page: the wp-cron.php file checks whether WordPress should execute any scheduled tasks. They’ve done it that way because they can’t really rely on every hosting environment having cron jobs available.

So they rely on the fact that somebody will hit the page, and that the PHP file will be included and executed, and then WordPress will check whether your posts should go live or not. The thing is that when we cache your entire website, and when your visitors start loading those pages, they get the cached content of those pages. Their request never reaches the web server, and never reaches the PHP service. So the wp-cron.php file is never executed.

Fred: So people were scheduling posts that just never published.

Hristo: Yeah. Unless somebody posts a comment, which triggers the purge of the cache.

What we did is that we completely excluded the wp-cron.php file from the cache so it hits every time; and we added this hook which tells us when a post has reached its publish date. So those two things fixed this issue.
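The exclusion half of that fix can be modeled in a few lines of Python. This sketch shows only the "never cache wp-cron.php" rule, not the publish-date hook, and it is not SiteGround's configuration: `CachingProxy`, `UNCACHED_PATHS`, and `wordpress` are hypothetical names.

```python
from typing import Callable, Dict, List, Tuple

# Paths that must always reach PHP; wp-cron.php is the one named in the text.
UNCACHED_PATHS: Tuple[str, ...] = ("/wp-cron.php",)


class CachingProxy:
    def __init__(self, backend: Callable[[str], str]) -> None:
        self._backend = backend
        self._pages: Dict[str, str] = {}

    def get(self, path: str) -> str:
        # Excluded paths skip the cache entirely, so they execute every time.
        if path in UNCACHED_PATHS:
            return self._backend(path)
        # Everything else is fetched once and then served from memory.
        if path not in self._pages:
            self._pages[path] = self._backend(path)
        return self._pages[path]


backend_log: List[str] = []


def wordpress(path: str) -> str:
    # Hypothetical stand-in for PHP handling a request.
    backend_log.append(path)
    return f"response for {path}"


proxy = CachingProxy(wordpress)
for _ in range(3):
    proxy.get("/")
    proxy.get("/wp-cron.php")
```

After three rounds, the home page has reached the backend once while wp-cron.php has reached it all three times, which is what lets scheduled tasks keep running on a fully cached site.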

That’s one of those things we couldn’t have thought of initially when we started this. That’s why all the feedback we get is really important for us. We always monitor it. For example, the “Purge Cache” button in the logged-in bar was requested by one of our customers about two weeks before we published it, and we were like, “Yeah, that makes a lot of sense. Let me fit it somewhere in the schedule,” and the next week it was in the new version.


Sending Purge Commands to Nginx

Fred: How do you actually tell the cache to flush itself?

Hristo: You send a purge header for that domain name to the cache, and Nginx gets that and clears it.

David: So, at the IP for the server, which it’s finding from a file path...

Hristo: We rely on that to get the IP address of the server running the reverse proxy, which was Varnish and is now Nginx. We recently switched from Varnish to Nginx. We were using Varnish as a reverse proxy for the dynamic caching, but we went with Nginx for a few reasons, the biggest being that Varnish doesn’t cache HTTPS requests. It doesn’t cache pages served over SSL, or at least the free version of Varnish doesn’t. Nginx handles that pretty well. That’s probably one of the main reasons.

Another is that eventually we’re going to switch from Apache to Nginx. I’m not sure when that’ll happen, but it’s something we’re moving forward with.

Fred: Can you tell us about these fopen()-type functions?

Hristo: That’s how you talk to any server. You just open a socket connection and say, “Hey, bang this domain and purge that cache.”

Fred: What’s a socket?

Hristo: You should have invited someone who prefers to talk more about PHP—some important guy from Zend. [laughter]

It opens a socket connection and sends the command to the Nginx service. Everything works based on headers.
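For readers wondering what "open a socket and send a header" looks like, here is a hedged Python sketch. The exact request the plugin sends is not shown in this transcript, so the `PURGE` method, the header set, and the function names `build_purge_request` and `send_purge` are all assumptions made for illustration.

```python
import socket


def build_purge_request(host: str, path: str = "/") -> bytes:
    """Build a raw HTTP request asking the cache to purge a domain.

    Assumption: the cache accepts a PURGE method and identifies the
    domain through the Host header.
    """
    lines = [
        f"PURGE {path} HTTP/1.1",
        f"Host: {host}",
        "Connection: close",
        "",   # blank line that ends the header section
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def send_purge(server_ip: str, host: str, port: int = 80, timeout: float = 5.0) -> bytes:
    """Open a TCP socket to the cache server, send the request, return the reply."""
    with socket.create_connection((server_ip, port), timeout=timeout) as conn:
        conn.sendall(build_purge_request(host))
        chunks = []
        while True:
            data = conn.recv(4096)
            if not data:   # server closed the connection
                break
            chunks.append(data)
    return b"".join(chunks)


request = build_purge_request("example.com")
```

The request is plain text: a method line, a `Host` header naming the domain, and a blank line. A call such as `send_purge("203.0.113.10", "example.com")` would deliver it, where the IP address is a placeholder for the cache server.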

Building a Caching Plugin for WordPress’s Many Use Cases

I would love to be able to purge page-by-page, but there are too many edge cases in WordPress.

Fred: Why not purge one page at a time?

Hristo: There are many edge cases; people use WordPress in so many ways that you can’t cover everything. I would love to be able to purge the cache for just a single page, because if you update only one page and I clear the cache for that page, all other pages will still be served from cache and it will significantly reduce the load on the server. Your visitors will be happier, too, because they’ll load the content much faster.

However, there are millions of things that you have to take into consideration. For example, if you write a new post, I’m going to update your index page—that’s a no-brainer. I’m going to update your posts page—again, a no-brainer—but what if you have a “latest posts” widget? How am I going to detect on which URL you have that piece of content to change?
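The difficulty is easier to see with a toy example. In the Python sketch below, `urls_to_purge` is a hypothetical helper, and treating `/blog/` as the posts page is an assumption; the point is that selective purging only works if every page showing the new content is known in advance.

```python
from typing import Dict, Set


def urls_to_purge(post_url: str, widget_pages: Set[str]) -> Set[str]:
    """URLs that show a newly published post, given known widget locations."""
    affected = {post_url, "/", "/blog/"}   # the post, the index, the posts page
    affected |= widget_pages               # pages with a "latest posts" widget
    return affected


cached: Dict[str, str] = {
    "/": "home",
    "/blog/": "posts",
    "/about/": "about",
    "/contact/": "contact",
}

# The hard part: this set has to be known up front. Here the widget on
# /about/ is tracked, but nothing tells us about the one on /contact/.
known_widget_pages = {"/about/"}

for url in urls_to_purge("/blog/new-post/", known_widget_pages):
    cached.pop(url, None)

stale_pages = set(cached)   # pages still served from the old cache
```

Only `/contact/` survives the purge, and if it also carries a "latest posts" widget, visitors keep seeing the old list. Purging the whole domain avoids that class of bug at the cost of more cache misses.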

We’re getting into some more difficult stuff, like the ESI (Edge Side Includes) model, where you can exclude different pieces of code from the cache using HTML comments. Again, that will require a lot of manual work; we can’t just enable it by default for everybody. But that’s something on our list that we’d love to be able to do in the future, to allow you to use a simple HTML comment in your index or whatever code, to say “exclude from cache.” So you’d have a cached page, then a completely dynamic widget, for example.
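The comment-based exclusion described above, an approach usually known as Edge Side Includes (ESI), can be sketched as follows in Python. The `<!-- nocache:name -->` marker is invented for this example (real ESI uses tags such as `<esi:include>`), and `serve` and `latest_posts` are hypothetical names.

```python
import re
from typing import Callable, Dict

# Hypothetical marker syntax for this sketch only.
MARKER = re.compile(r"<!--\s*nocache:(\w+)\s*-->")


def serve(cached_html: str, fragments: Dict[str, Callable[[], str]]) -> str:
    """Fill each marker in a cached page with freshly rendered output."""

    def render(match: "re.Match[str]") -> str:
        renderer = fragments.get(match.group(1))
        # Unknown markers are left untouched rather than dropped.
        return renderer() if renderer else match.group(0)

    return MARKER.sub(render, cached_html)


cached_page = "<h1>Home</h1><!-- nocache:latest_posts --><footer>Static</footer>"
renders = {"count": 0}


def latest_posts() -> str:
    # Stand-in for a widget that must be rendered on every request.
    renders["count"] += 1
    return f"<ul><li>Render #{renders['count']}</li></ul>"


first = serve(cached_page, {"latest_posts": latest_posts})
second = serve(cached_page, {"latest_posts": latest_posts})
```

The heading and footer come from the cached copy on both requests, while the widget is rendered fresh each time, so `first` contains "Render #1" and `second` contains "Render #2".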

Development Team

Fred: Tell us a bit about the team that’s working on this.

Hristo: The service is handled by our operations team, which is the guys that code for server stuff: they make kernel patches, they code our own services to monitor backups, that kind of stuff. Those guys are really smart, and they’re really good developers. There’s seven or eight of them, and they’re not just working on the caching service—they work on everything on the server level. If there’s a new version of PHP, for example, it’s their job to make it available to you.

We recently changed the design of the plugin and made it responsive, and one of our designers helped with that. Then we have about ten developers who work only on our front-end functionality, internal systems like billing and so on. Anything more complicated, I can rely on them. Then it’s me.

So it’s a combined effort, it’s not just me. It’s many people contributing to the plugin.

Setting Smart Defaults

“Your plugin is as good as your defaults are.”

Fred: CachePress has transitioned to being one of the many things that I actually love about being hosted on SiteGround. I just think there’s been a huge arc of improvement in the UI for that over the past year.

Hristo: Thank you. We’ve put a lot of effort into that. Last year at WordCamp Europe, there was this really nice talk by Yoast. Generally he was saying that “Your plugin is as good as your defaults are.” When you work at a hosting company, you have this great view of so many issues that people hit every day. A lot of people have problems configuring plugins with a lot of buttons, settings, check boxes, drop downs. If you provide a solution to a broad audience and you want them to make all sorts of configurations, you’re dramatically increasing the chance that they break something.

This is why, for example, when we use our one-click WordPress installer, the plugin is activated. We don’t add any other plugins, just the SuperCacher; it’s there, activated, the auto-flush is enabled by default, and the dynamic cache is enabled by default. One of our main goals is to make it as simple as possible while you still get the performance boost.

Conclusion

David: This extra layer of Nginx in between adds some complexity to the purging logic. But once you guys have mastered that interface to it, it’s simple, and, as Fred was saying, there’s a nuke button. It’s a great experience, and it’s the fastest caching you could imagine for the price.

Hristo: Yeah, right now I can’t think of anything faster than loading a piece of static HTML from the server’s memory. The moment they come up with something faster than RAM, we’ll consider caching content there. The rest of the speed equation is the sheer volume of your content, the number of requests, and the physical network connection between you and your server.

Fred: Hristo, thank you so much for walking us through the plugin and thank you for walking us through how you look at caching in general. It’s really fun to talk to you.

Hristo: Thank you, it’s been a pleasure being here with you guys.

Big thanks to SiteGround!


Comments

  • Craiger says:

    CachePress sounds very cool! You mentioned that wpshout is hosted on Siteground. Are you using CachePress at this time? I was curious, so I ran a site speed test on your home page. It looks a little slow and reports your content isn’t cached, and time to first byte is also slow. Here are the test results. http://www.webpagetest.org/result/150818_GF_1AWT/

    • fredclaymeyer says:

      Thanks! Yes, site is cached, static and dynamic. Thanks a lot for test results – not sure the story with some of them, but did find and fix an overlarge PNG on the homepage based on the test.
