Website Performance Boot Camp

Jon Jensen
End Point Corporation
endpoint.com

Utah Open Source Conference, May 3–5, 2012

Why care?

Less-visible reasons

Architecture impact

Some optimization techniques
lead to a cleaner architecture.

Others make things more complicated,
so they may be worth delaying.

General principles

What to watch

An example

Using webpagetest.org:

479popcorn.com before tuning

An example

479popcorn.com after tuning

Web software stack layers

Browser layer

Ideal goal

If nothing is faster than nothing,
ideally we’d like to serve nothing at all.

How?

Full-page caching

Cache the entire page and every asset on it
entirely in the browser
for as long as the user is using the site.

If we succeed

The browser won’t make
any more requests to our server,
only to third parties, e.g. for Google Analytics tracking code.
That’s fast, and it’s not our scalability problem.

Dynamic content

Probably your site has some dynamic components.

Use JavaScript

Put into cookies or HTML5 LocalStorage:

Display with JavaScript.

If so, the page is still fully cacheable.

Cache until login

It’s still a win to cache the entire page
for unauthenticated users only.
That may cover a lot of traffic.

Cache high in the stack

It’s worth the work to cache the whole page. Why?

Caching high > caching low

If you can cache the whole page,
everything lower in the stack
is included — job done.

Ajax?

If only a small part of the page must be dynamic,
maybe fetch only that via Ajax so
everything else is statically cached.

Leverage existing caches

Get JavaScript libraries & web fonts
from Google’s CDN or others.

Embedded assets

Many HTTP requests for embedded assets are slow
due to TCP round trips,
even with HTTP keepalive.
Worse on mobile networks.
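
A rough back-of-the-envelope model can make the cost of many requests concrete. This is only a sketch under simplifying assumptions: every request costs one round trip, the browser uses a fixed number of parallel connections (6 here, an assumed value), and bandwidth, server time, and connection setup are ignored. The function name and the round-trip times are illustrative, not measurements.

```python
import math

def estimated_latency_ms(num_requests, rtt_ms, parallel_connections=6):
    """Rough lower bound: requests go out in waves of
    `parallel_connections`, and each wave costs one round trip."""
    waves = math.ceil(num_requests / parallel_connections)
    return waves * rtt_ms

# 60 embedded assets over an assumed 50 ms wired link and 300 ms mobile link
wired = estimated_latency_ms(60, 50)
mobile = estimated_latency_ms(60, 300)

# The same page after combining assets down to 6 requests, on mobile
combined = estimated_latency_ms(6, 300)

print(wired, mobile, combined)
```

Under these assumptions the round-trip cost scales with the number of waves, which is why cutting the request count helps most on high-latency links.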

Combining assets

Optimize assets

“Edge” web server layer

Commercial CDN

Subdomain advantages

e.g. c.yourdomain.com:

Homegrown CDN

Why not Apache httpd?

“Origin” web server

Even if you’re only using Apache,
some simple tuning can help a lot.

Enable HTTP keepalive

KeepAlive On
KeepAliveTimeout 2

Compress with gzip or deflate

Enable mod_deflate and set:

AddOutputFilterByType DEFLATE \
    text/html text/plain text/xml text/css application/json application/x-javascript

Where to compress?

If using a caching reverse proxy, you may be better off
letting the cache store a single uncompressed copy
and gzip it for each client (or not) as needed.

On modern CPUs gzip is really fast.
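
One way to check the speed claim on your own hardware is to time gzip on a representative payload. The sketch below uses Python's standard gzip module. The sample HTML is made up and very repetitive, so its compression ratio will be better than real pages achieve, and the timing depends entirely on the machine.

```python
import gzip
import time

def gzip_stats(payload: bytes, level: int = 6):
    """Compress payload once; return (original size, compressed size, elapsed ms)."""
    start = time.perf_counter()
    compressed = gzip.compress(payload, compresslevel=level)
    elapsed_ms = (time.perf_counter() - start) * 1000
    return len(payload), len(compressed), elapsed_ms

# Hypothetical sample: a repetitive chunk of HTML, roughly 100 KB
html = b"<li class='item'><a href='/product'>Popcorn</a></li>\n" * 2000

original, compressed, ms = gzip_stats(html)
print(f"{original} bytes -> {compressed} bytes in {ms:.2f} ms")
```

Running this against a few of your real pages gives a better sense of whether compressing per request at the proxy is affordable.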

HTTP Cache-Control & Expires

Mark static assets as cacheable by both
the reverse proxy and the user’s browser.

2-hour example

Cache-Control: max-age=7200
Expires: Wed, 02 May 2012 06:06:18 GMT

For 1-day reverse proxy cache

Cache-Control: max-age=7200,s-maxage=86400
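
If the application sets these headers itself rather than leaving it to the web server, the values can be computed as in the sketch below. The helper name `cache_headers` and its parameters are hypothetical; only the header names and directives come from the slides above.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime

def cache_headers(max_age, s_maxage=None, now=None):
    """Build Cache-Control and Expires headers.

    max_age  -- seconds browsers may cache the response
    s_maxage -- optional seconds for shared caches (reverse proxy, CDN)
    now      -- aware UTC datetime; defaults to the current time
    """
    now = now or datetime.now(timezone.utc)
    directives = [f"max-age={max_age}"]
    if s_maxage is not None:
        directives.append(f"s-maxage={s_maxage}")
    expires = now + timedelta(seconds=max_age)
    return {
        "Cache-Control": ",".join(directives),
        # HTTP dates are always expressed in GMT
        "Expires": format_datetime(expires, usegmt=True),
    }

# 2 hours in the browser, 1 day in the reverse proxy
print(cache_headers(7200, s_maxage=86400))
```

Caches that understand Cache-Control prefer max-age over Expires, so the Expires value mainly serves older HTTP/1.0 clients.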

Set cache headers in Apache

ExpiresActive On
ExpiresDefault "access plus 2 hours"

Why 2 hours?

Contrary to the common advice of a month or even a year…

URL changes, mistakes, tradeoffs, new/old app.

Compromise of 1–8 hours seems good.

ETags?

Apache’s default ETags can include the file’s inode,
so the same file gets a different ETag on each server.

That’s ok if you have only one server. :)

Worth keeping if done right.
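
One possible way to do ETags right across several servers is to derive them from the content alone, so every server produces the same value for the same bytes. The sketch below is an assumption about what that could look like in application code, not the method from the talk. The function names are hypothetical, and the If-None-Match handling is simplified: it ignores weak validators and comma-separated lists.

```python
import hashlib

def content_etag(body: bytes) -> str:
    """Strong ETag derived only from the response body, so it is
    identical on every server that serves the same bytes."""
    digest = hashlib.sha1(body).hexdigest()
    return f'"{digest}"'

def is_not_modified(if_none_match, body: bytes) -> bool:
    """True if the client's cached copy matches, i.e. a 304 could be sent.
    Simplified: compares a single ETag value exactly."""
    return if_none_match is not None and if_none_match == content_etag(body)

css = b"body { margin: 0 }"
tag = content_etag(css)
print(tag, is_not_modified(tag, css))
```

For static files served by Apache itself, setting `FileETag MTime Size` removes the inode component, provided files are deployed with identical modification times on each server.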

App server caching

Caching in memcached or files (local or NFS):

Disable development helps

Statelessness

Sessions

Work with your framework

Most web frameworks have conventions for these things.

Optimize for what matters

Which pages does everyone go through?

Which functions are busiest?

Database tools

Slow queries

But also tons of fast queries
that collectively bog down the system.

Use query log analysis tools.
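
The idea behind such tools can be sketched in a few lines: normalize each query so that literal values do not matter, then rank query shapes by total time rather than by per-query time. This is a simplified illustration with a hypothetical input format of (duration in ms, SQL text) pairs and made-up sample data. Real log analyzers handle far more SQL syntax.

```python
import re
from collections import defaultdict

def normalize(sql: str) -> str:
    """Replace literals with '?' so queries differing only in values group together."""
    sql = re.sub(r"'[^']*'", "?", sql)   # quoted string literals
    sql = re.sub(r"\b\d+\b", "?", sql)   # standalone numeric literals
    return re.sub(r"\s+", " ", sql).strip().lower()

def total_time_by_query(entries):
    """entries: iterable of (duration_ms, sql).
    Returns [(normalized_sql, count, total_ms)] sorted by total_ms, highest first."""
    totals = defaultdict(lambda: [0, 0.0])
    for duration_ms, sql in entries:
        key = normalize(sql)
        totals[key][0] += 1
        totals[key][1] += duration_ms
    rows = [(k, count, total) for k, (count, total) in totals.items()]
    return sorted(rows, key=lambda r: r[2], reverse=True)

# Made-up log: one slow query and 5000 fast ones
log = [(900.0, "SELECT * FROM orders WHERE total > 100")]
log += [(0.5, f"SELECT name FROM products WHERE id = {i}") for i in range(5000)]

for sql, count, total_ms in total_time_by_query(log):
    print(f"{total_ms:8.1f} ms  {count:5d}x  {sql}")
```

In this made-up log the 0.5 ms query ranks first because its 5000 executions add up to more time than the single slow query.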

Sneaky bottlenecks

But wouldn’t it be nice

… for the database speed not to matter much?

Cache at the highest layer of the stack possible,
and work down as needed.

It’s a win

if you never have to tune your database
because it is so rarely hit.

It’s a win

if you rarely use server-side sessions.

It’s a win

if your app server is bored.

It’s a win

if your main web server
is bored because a CDN
handled most traffic.

It’s a win

if the CDN is bored once visitors
have warmed their caches.

It’s a win

if their cache comes pre-warmed
with the same jQuery from Google that
other sites use.

Perfection isn’t possible

Things will always be changing.

Measure and do what you can.

Small steps

executed over time add up.

Big plans never executed
are a total waste.

Final esoterica

Firewall state tables

Firewalls can run out of space in their state tables!

DNS problems

Test forward & reverse DNS resolution on each host.

Logging shouldn’t use DNS, but maybe it does!
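
A forward and reverse check can be scripted with the standard socket module, as in the sketch below. The function name `check_dns` and the hostname are hypothetical. The resolver functions are passed in as parameters only so the logic can be exercised without a network, and the example uses stand-in resolvers with made-up answers.

```python
import socket

def check_dns(hostname, forward=socket.gethostbyname, reverse=socket.gethostbyaddr):
    """Resolve hostname to an IP, resolve that IP back to a name, and
    report whether the two agree. Returns (ip, reverse_name, consistent)."""
    ip = forward(hostname)
    reverse_name, _aliases, _addresses = reverse(ip)
    consistent = reverse_name.rstrip(".").lower() == hostname.rstrip(".").lower()
    return ip, reverse_name, consistent

# Stand-in resolvers with made-up answers, so this runs offline
fake_forward = lambda name: "192.0.2.10"
fake_reverse = lambda ip: ("web1.example.com", [], [ip])

print(check_dns("web1.example.com", fake_forward, fake_reverse))
```

On a real host, call `check_dns("your-hostname")` with the default resolvers. A lookup that fails raises an exception from the socket module, and one that hangs or is slow is itself the symptom to look for.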

Investigate

Be curious.

Investigate your hunches.

The “impossible” often isn’t,
both for good and bad.

Front-side tools

Tools suggested by attendees

Questions?

Twitter: @jonjensen0

jon@endpoint.com

Check out our Liquid Galaxy demo
here at the conference!