Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add default caching strategies for navigations and page assets #338

Merged

Conversation

westonruter
Copy link
Collaborator

@westonruter westonruter commented Oct 10, 2020

This merges the Basic Site Caching extension plugin into PWA, as well as further iterating on it.

Fixes #265.
Fixes #176.
Fixes #264.

Navigation Caching

Previously navigation requests used NetworkOnly strategy by default, which meant that pages accessed while navigating would not be cached at all, so they would not be available for offline browsing. Instead, the offline page would have been served instead, which is not super helpful. This PR changes the default to the NetworkFirst caching strategy for navigation requests, which will allow for previously-visited pages to continue to be accessed while offline, which is they key benefit of PWA.

The strategy is also now configured by default as follows:

  • cache_name: 'navigations'. This is the cache used for storing navigated pages. Note that the PWA plugin will automatically prefix the cache name to be specific to the WP site, especially for the sake of subdirectory installs. For example: wp-/-navigations.
  • network_timeout_seconds: 2 seconds. When accessing a page, it will first attempt to fetch the most recent version from origin. If that takes longer than 2 seconds, then the previously-cached page will be served This hopes that already-cached page assets will result in a page taking <500ms to render, thus leading to a good LCP measurement.
  • expiration.max_entries: 10. The 10 most recently-visited pages are cached.

Previously, when wanting to configure the navigation caching strategy, you'd need to supply two filters:

// 👎 Obsolete example.
add_filter(
	'wp_service_worker_navigation_caching_strategy',
	function () {
		return \WP_Service_Worker_Caching_Routes::STRATEGY_NETWORK_FIRST;
	}
);

// 👎 Obsolete example.
add_filter(
	'wp_service_worker_navigation_caching_strategy_args',
	function ( $args ) {
		$args['cacheName'] = 'navigations';
		$args['networkTimeoutSeconds'] = 2;
		$args['plugins']['expiration']['maxEntries'] = 20;
		return $args;
	}
);

To avoid needing to use two separate filters, this PR introduces a new wp_service_worker_navigation_caching filter which is passed an array that can be modified to specify both the strategy and its args. With that, the wp_service_worker_navigation_caching_strategy and wp_service_worker_navigation_caching_strategy_args filters will be deprecated. You can disable the caching of navigation responses by filtering to return an empty array like so: add_filter( 'wp_service_worker_navigation_caching', '__return_empty_array' );.

Note also how in the above (obsolete) examples the strategy args are in camelCase instead of snake_case, which isn't the PHP-WordPressy way manipulating arrays, even though these are what Workbox.js itself is expecting. So you now provide the keys as snake_case instead of camelCase, though the old camelCase names will still work as well (but you should start to only use snake_case going forward). The keys are converted to camelCase before passing into Workbox.js.

There is also this plugins array key which is essentially how Workbox.js is passed the configuration information to these Workbox plugins, but in the WordPress context this is confusing and unnecessary. So the array is flattened so that the plugin names can be supplied as top-level keys, so $args['expiration'] instead of $args['plugins']['expiration'].

To summarize these changes:

  1. The wp_service_worker_navigation_caching filter is used to make any changes to the strategy and its configuration.
  2. The filters wp_service_worker_navigation_caching_strategy and wp_service_worker_navigation_caching_strategy_args should no longer be used. Eventually usage will cause deprecation notices.
  3. The snake_case convention is used for all array keys in the configuration.
  4. The configurations for Workbox plugin are provided as top-level array keys rather than being nested under plugins. (Similar to RuntimeCachingEntry.)

Example

You can now configure the navigation caching strategy with much less code. For example, here's how to configure the service worker to:

  1. Use a StaleWhileRevalidate strategy for serving pages.
  2. Only hold onto those pages for 1 minute.
  3. Only cache responses which are OK and for non-authenticated users.
  4. Broadcast updates when pages are updated (so you can show a notification).

You can do so with the following plugin code:

// First, send a stale-while-revalidate Cache-Control header with pages for users that are not authenticated.
add_filter(
	'wp_headers',
	function ( $headers ) {
		if ( ! is_user_logged_in() && ! isset( $headers['Cache-Control'] ) ) {
			$headers['Cache-Control'] = 'max-age=1, stale-while-revalidate=59';
		}
		return $headers;
	}
);

// Now, configure the service worker to cache navigations to these pages.
add_filter(
	'wp_service_worker_navigation_caching',
	function ( $config ) {
		// Use the stale-while-revalidate strategy in the service worker.
		// Formerly specified via the wp_service_worker_navigation_caching_strategy filter.
		$config['strategy'] = WP_Service_Worker_Caching_Routes::STRATEGY_STALE_WHILE_REVALIDATE;

		// Formerly needing to have a parent 'plugins' key.
		$config['expiration'] = [
			// Only hold on to cached pages for 1 minute.
			// Formerly needing to be maxAgeSeconds.
			'max_age_seconds' => 60,
		];

		$config['cacheable_response'] = [
			// Only cache responses that are 200 OK.
			'statuses' => [ 200 ],
			// Only cache responses for unauthenticated users per wp_headers filter above.
			'headers'  => [
				'Cache-Control' => 'max-age=1, stale-while-revalidate=59', // See <https://web.dev/stale-while-revalidate/>.
			],
		];

		// Notify the page via postMessage when a stale page has been updated in the background.
		$config['broadcast_update'] = [];

		return $config;
	}
);

Site Asset Caching

In addition to enabling the NetworkFirst caching strategy by default for navigation requests, this PR also enables the NetworkFirst caching strategy for theme, plugin, and core (wp-includes) assets. This strategy is used as opposed to StaleWhileRevaliate for the sake of development and since serving stale assets can leave the page in bad state. In any case, assets should already have far-future expiration in browser cache anyway.

The expiration max_entries for each are informed by the 75th percentile on HTTP Archive, see #265 (comment):

  • For themes, maximum asset entries are capped at 34. This includes ~622KB.
  • For plugins, maximum asset entries are capped at 44. This includes ~449KB.
  • For core (wp-includes), maximum asset entries capped at 14. This includes ~178KB.

Three new filters are introduced for customizing the caching strategy configuration for these three categories:

  • wp_service_worker_theme_asset_caching
  • wp_service_worker_plugin_asset_caching
  • wp_service_worker_core_asset_caching

These filters are passed a configuration array that looks the same as the array passed to wp_service_worker_navigation_caching for navigation requests, with one addition: there is a route array key which includes a regular expression pattern to match the asset URL routes to be included in caching.

Uploaded Image Caching

Lastly, this PR adds a StaleWhileRevalidate caching strategy for uploaded images. This includes the files located in the site's uploads directory which have the file extensions returned by wp_get_ext_types()['image']. Cache expiration is currently set to 1 month, with limiting the number of cached images to 100 entries.

As with configuring caching strategies for navigations, core assets, theme assets, and plugin assets, the uploaded images can be configured with the wp_service_worker_uploaded_image_caching filter.

Todo

  • Add testing.
  • Make sure existing camel case filtering is not overridden with defaults doe navigation requests.
  • Switch to the stale-while-revalidate strategy for uploaded files.
  • Add max_entries to uploaded files as a safeguard.
@westonruter westonruter added this to the 0.6 milestone Oct 10, 2020
@westonruter westonruter force-pushed the add/default-network-first-caching-strategy-and-args branch from 24fe7c5 to f07b2f0 Compare October 10, 2020 04:54
Comment on lines 62 to 64
'theme_asset_caching' => new WP_Service_Worker_Theme_Asset_Caching_Component(),
'plugin_asset_caching' => new WP_Service_Worker_Plugin_Asset_Caching_Component(),
'uploaded_images__caching' => new WP_Service_Worker_Uploaded_Image_Caching_Component(),
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure these should be “components”. They could rather be just global functions that are added to wp_front_service_worker in default-filters.php. This will allow them to easily be removed or replaced.

*
* @type string $cacheName Cache name to store navigation responses.
* @type int $networkTimeoutSeconds Network timeout seconds when NetworkFirst strategy is used.
* @type array $plugins Configuration for plugins, in particular expiration.
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While Workbox has this plugins level of nesting, I think it would make sense to promote the children up a level as it would be less confusing when extending in a WordPress context. So instead of there being plugins.expiration there could be just expiration. Naturally we'd need to move into a plugins array when passing over to Workbox, but we already have a set of such known plugins when we prepare_strategy_args_for_js_export:

$recognized_plugins = array(
'backgroundSync',
'broadcastUpdate',
'cacheableResponse',
'expiration',
'rangeRequests',
);

So that would be easy to do.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in efb519d.

Comment on lines 100 to 137
$caching_strategy = apply_filters( 'wp_service_worker_navigation_caching_strategy', WP_Service_Worker_Caching_Routes::STRATEGY_NETWORK_ONLY );
$caching_strategy = apply_filters( 'wp_service_worker_navigation_caching_strategy', WP_Service_Worker_Caching_Routes::STRATEGY_NETWORK_FIRST );

$caching_strategy_args = array(
'cache_name' => WP_Service_Worker_Caching_Routes::NAVIGATIONS_CACHE_NAME,
);
if ( WP_Service_Worker_Caching_Routes::STRATEGY_NETWORK_FIRST === $caching_strategy ) {
/*
* The value of 2 seconds is informed by the Largest Contentful Paint (LCP) metric, of which Time to
* First Byte (TTFB) is a major component. As long as all assets on a page are cached, then this allows
* for the service worker to serve a previously-cached page and then for LCP to occur before 2.5s and
* so remain within the good threshold.
*/
$caching_strategy_args['network_timeout_seconds'] = 2;
}

/*
* By default cache only the last 10 pages visited. This may end up being too high as it seems likely that
* most site visitors will view one page and then maybe a couple others.
*/
$caching_strategy_args['plugins']['expiration']['max_entries'] = 10;

/**
* Filters the caching strategy args used for frontend navigation requests.
*
* @since 0.2
* @since 0.6 Added $caching_strategy param and initial array has default values provided.
* @see WP_Service_Worker_Caching_Routes::register()
*
* @param array $caching_strategy_args Caching strategy args.
* @param array $caching_strategy_args {
* Caching strategy args.
*
* @type string $cache_name Cache name to store navigation responses.
* @type int $network_timeout_seconds Network timeout seconds when NetworkFirst strategy is used.
* @type array $plugins Configuration for plugins, in particular expiration.
* }
* @param string $caching_strategy Caching strategy being used.
*/
$caching_strategy_args = apply_filters( 'wp_service_worker_navigation_caching_strategy_args', $caching_strategy_args, $caching_strategy );
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The wp_service_worker_navigation_caching_strategy and wp_service_worker_navigation_caching_strategy_args filters can be deprecated in favor of the one wp_service_worker_navigation_caching filter.

Comment on lines 41 to 49
/*
* Note that the path alone is used because CDN plugins may load from another domain. For example, given an
* uploaded image located at:
* https://example.com/wp-content/uploads/2020/04/foo.png
* Jetpack can change rewrite the URL to be:
* https://i2.wp.com/example.com/wp-content/uploads/2020/04/foo.png?fit=900%2C832&ssl=1
* Therefore, the following will include any URL ending in an image file extension which also is also
* preceded by '/wp-content/uploads/'.
*/
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not going to work. Specifically, when a CacheFirst strategy is used, if the third-party domain is not responding with CORS headers then Workbox will refuse to cache the opaque responses by default: https://developers.google.com/web/tools/workbox/guides/handle-third-party-requests

We should probably just assume that an image CDN will already have the assets cached with a far-future expires header anyway and not worry about explicitly caching the responses by the service worker beyond the browser cache.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Additionally, it would be good to determine if WP sites generally have correct Cache-Control response headers being served for uploaded files. If so, and browsers are caching them for a long time, then it would not be necessary to add caching for uploaded images

@westonruter westonruter force-pushed the add/default-network-first-caching-strategy-and-args branch from 37f3243 to f6f2ef4 Compare October 18, 2020 01:52
@swissspidy swissspidy self-requested a review October 29, 2020 15:49
Copy link
Collaborator

@felixarntz felixarntz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall looks solid to me, that's gonna be a massive enhancement! Just some feedback and suggestions.

* @type array|null $plugins Deprecated. Array of plugins with configuration. The key of each plugin in the array must match the plugin's name.
* This is deprecated in favor of defining the plugins in the top-level.
* See <https://developers.google.com/web/tools/workbox/guides/using-plugins#workbox_plugins>.
* @return bool Whether the registration was successful.
* }
*/
public function register( $route, $args = array() ) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the changes made here indicate that it might be more intuitive to make route part of $args here too and have that as the only parameter. The strategy argument is required anyway, so I think this array overall should be required.

We could change the method signature to something like register( $args, $deprecated = array() ), and then if $args is a string, assume that it is actually $route and then $deprecated would be $args.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I actually went down that “route” in the course of this PR, but then I decided against it. The reason I decided to keep two arguments is that it keeps the method signature consistent with WP_Service_Worker_Precaching_Routes::register(). Both then take a URL/route and then args. In both cases, the args can be a string. For precaching, the $args can be a string as a shortcut to be the revision. For WP_Service_Worker_Caching_Routes::register(), the $args as a string is a shortcut to provide the strategy.

It's kinda handy to be able to do:

$scripts->caching_routes()->register( '/foo', WP_Service_Worker_Caching_Routes::STRATEGY_NETWORK_FIRST ); 

Rather than:

$scripts->caching_routes()->register( '/foo', array(
    'strategy' => WP_Service_Worker_Caching_Routes::STRATEGY_NETWORK_FIRST,
) ); 

So that's the intention here.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm that's a fair point. However, then I'd suggest to give both of these methods three parameters: The URL/route and the strategy/revision are both required, so I think it's counterintuitive to have these in $args. IMO $args should always be an optional thing for more "advanced" customizations.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So you're saying for caching routes it should be:

$scripts->caching_routes()->register( 
    '/foo', 
    WP_Service_Worker_Caching_Routes::STRATEGY_NETWORK_FIRST,
    array( /* Additional optional args. */ )
); 

And for precaching routes (which is already supported):

$scripts->precaching_routes()->register( 
    '/foo', 
    $revision
); 

Oh, but actually revision is not required in the precache list. But it's the only argument that precaching currently supports. There are no other arguments. So this is fine too:

$scripts->precaching_routes()->register( '/foo' ); 

This isn't the case for caching routes, where the strategy must be provided. So in that case, register for caching routes would have 3 arguments but register for precaching routes would have only 2?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, the signatures you put as examples look good to me. Regarding $revision not being required, I think that's okay, in this case we can make it optional. Now, overall I think optional arguments make sense to go into an $args array, but if $revision is the only one, I don't think that would be needed here - so I suggest to make it an optional string parameter and not have $args for precaching routes at all.

If at some point we need additional optional arguments for precaching routes, we can transform the second parameter to an optional $args and keep backward-compatibility by still supporting a string (and then assuming that is the revision).

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, problem: a method signature params of ( $route, $strategy, $args = array() ) will violate the implementation of WP_Service_Worker_Registry::register interface, which has ( $handle, $args = array() ). Ditch the interface?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@westonruter Yeah, I think that makes sense. We should then remove WP_Service_Worker_Registry_Aware as well.

wp-includes/service-workers.php Show resolved Hide resolved
pwa.php Show resolved Hide resolved
Copy link
Collaborator

@felixarntz felixarntz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@westonruter Just one follow-up comment about IMO more intuitive register method signatures in #338 (comment).

* Eliminate WP_Service_Worker_Registry and WP_Service_Worker_Registry_Aware interfaces.
* Let second argument to WP_Service_Worker_Caching_Routes::register() be strategy name and third argument be strategy args.
@westonruter
Copy link
Collaborator Author

I've implemented that change in 5f1c00a but pulling that thread caused several other necessary changes. IMO the classes can further be consolidated and made more wordpressy now. It doesn't seem like the classes are helping so much as much as getting in the way at the moment. For example, the various WP_Service_Worker_Component classes could just as well be registered as functions on the wp_front_service_worker action. There doesn't seem to be a need for the component abstraction, especially for WP_Service_Worker_Caching_Routes and WP_Service_Worker_Precaching_Routes.

@felixarntz
Copy link
Collaborator

@westonruter The changes you made look good to me. IMO keeping the component classes still makes sense over functions at least because they allow splitting up into multiple private methods rather than giant functions or "private" functions which aren't possible.

…/default-network-first-caching-strategy-and-args

* 'develop' of github.com:GoogleChromeLabs/pwa-wp: (28 commits)
  Revert "Bump workbox-cli from 5.1.4 to 6.0.2"
  Bump eslint from 7.14.0 to 7.15.0
  Bump workbox-cli from 5.1.4 to 6.0.2
  Bump requires at least to 5.5 and tested up to 5.6
  Replace Travis badge with GHA badge in readme.md
  Run composer update again
  Add PHP 8 to require, fix WP_TESTS_DIR
  Run composer update
  Update PHP 8 composer.json patch
  Add pre-commit script to composer.json
  Add config to composer.json
  Update test matrix
  Fix linting errors for package.json
  Fix test_register_manifest_rest_route after #349
  Account for s/new_whitelist_options/new_allowed_options/
  Add name for default testsuite
  Run composer update
  Install phpunit as dependency
  Incorporate GitHub actions from AMP plugin
  Remove .travis.yml
  ...
@codecov-io
Copy link

Codecov Report

Merging #338 (5e69c71) into develop (f7f4cfb) will increase coverage by 5.91%.
The diff coverage is 66.52%.

Impacted file tree graph

@@              Coverage Diff              @@
##             develop     #338      +/-   ##
=============================================
+ Coverage      19.62%   25.54%   +5.91%     
- Complexity       314      362      +48     
=============================================
  Files             50       54       +4     
  Lines           1620     1785     +165     
=============================================
+ Hits             318      456     +138     
- Misses          1302     1329      +27     
Flag Coverage Δ Complexity Δ
php 25.54% <66.52%> (+5.91%) 0.00 <60.00> (ø)

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ Complexity Δ
pwa.php 0.00% <0.00%> (ø) 0.00 <0.00> (ø)
wp-includes/class-wp.php 14.28% <ø> (ø) 0.00 <0.00> (ø)
...ass-wp-service-worker-caching-routes-component.php 0.00% <0.00%> (ø) 5.00 <1.00> (-1.00)
...wp-service-worker-navigation-routing-component.php 1.50% <0.00%> (+1.50%) 33.00 <0.00> (+7.00)
...-wp-service-worker-precaching-routes-component.php 0.00% <0.00%> (ø) 5.00 <1.00> (-1.00)
...ents/class-wp-service-worker-precaching-routes.php 50.00% <ø> (ø) 10.00 <0.00> (ø)
wp-includes/service-workers.php 20.75% <ø> (ø) 0.00 <0.00> (ø)
wp-includes/class-wp-service-worker-scripts.php 29.62% <50.00%> (-0.86%) 28.00 <3.00> (-2.00)
...ponents/class-wp-service-worker-caching-routes.php 72.26% <70.09%> (+29.41%) 39.00 <32.00> (+22.00)
...p-service-worker-theme-asset-caching-component.php 88.88% <88.88%> (ø) 6.00 <6.00> (?)
... and 10 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update f7f4cfb...5e69c71. Read the comment docs.

Copy link
Collaborator

@felixarntz felixarntz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome!

@NicoHood
Copy link

NicoHood commented Mar 7, 2022

I just came here from:
https://github.com/GoogleChromeLabs/pwa-wp/wiki/Service-Worker#page-navigation-caching

My issue is, that when I load the app (after a few days), the front page is still loading the cached version and not the most up to date information. I'd like to solve this.

I read that I can:
a) Disable caching
b) Set caching timeout to 5 seconds
c) Use StaleWhileRevalidate Strategy

But then I also see that time has passed and this PR was merged and I am curious if the wiki is still up to date. What would you recommend me to solve the issue?

@westonruter
Copy link
Collaborator Author

@NicoHood Sorry for the delay. So the issue is that the NetworkFirst strategy is timing out at 2 seconds, and so then it serves the cached version?

First of all, you should really try to improve your server response times so that the responses are served in 600ms or less. Usually a good page caching plugin will get you what you need.

If this is not possible, increasing the TTL from the default of 2 seconds to 5 seconds should do the trick:

add_filter( 'wp_service_worker_navigation_caching', static function ( $config ) {
	$config['network_timeout_seconds'] = 5;
	return $config;
} );

It is possible to use a StaleWhileRevalidate strategy as well with this code:

add_filter( 'wp_service_worker_navigation_caching', static function ( $config ) {
	$config['strategy'] = 'StaleWhileRevalidate';
	return $config;
} );

However, this will guarantee that stale responses will be served, which I don't think you want.

@NicoHood
Copy link

Thanks!

I moved to another server which I hope will respond faster. Google Page Speed at least tells me that its better.

But what about mobile networks? Is 2 seconds the time the server will a) answer or b) transmit the whole webpage? Maybe 2 seconds is still a bit short for mobile networks, isn't it? Who set 2 seconds as a default? Is it just a best guess, or a specification default, etc?

@westonruter
Copy link
Collaborator Author

The value of 2 seconds is informed by Core Web Vitals. To achieve a good LCP metric, the largest contentful paint needs to happen within 2.5 seconds of the page loading. So by having a 2-second TTL for loading a page from the network, the browser then has 500ms to render the page if it was retrieved from cache.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment