Apache 101: 0-WordPress in 15 minutes



Apache gets an undeserved bad rep from outdated guides—learn to set it up right.

Hellfire missiles not included.

 

Recently, we took a look at the Caddy Web server. Today, we're going to back things up a little bit and look at the A from the classic LAMP stack: the Apache Web server.

 

Apache has a bad reputation for being old, crusty, and low-performance—but this idea mostly stems from the persistence of ancient guides that still show users how to set it up in extremely antiquated ways. In this guide, we're going to set up an Ubuntu 20.04 droplet at Digital Ocean with an Apache Web server set up properly and capable of handling serious levels of traffic.

Installation

After spinning up a new $5/mo VM (Digital Ocean calls them "droplets"), the first thing we do is what anyone should do with any brand-new Linux server: check for and install upgrades, and—since one of them included a new Linux kernel version—reboot the server.

root@apache:~# apt update
root@apache:~# apt dist-upgrade
root@apache:~# shutdown -r now

With that bit of minor housekeeping out of the way, it's time to install Apache itself and the PHP language that most Web applications require.

root@apache:~# apt install apache2 php-fpm

Friends don’t let friends use mod_php inappropriately

I want to make incredibly clear what we have not installed—we did not, and will not, install the mod_php Apache module.

root@apache:~# apt policy libapache2-mod-php
libapache2-mod-php:
  Installed: (none)
  Candidate: 2:7.4+75
  Version table:
     2:7.4+75 500
        500 http://mirrors.digitalocean.com/ubuntu focal/main amd64 Packages

The mod_php module was, once upon a time, the favored way to integrate PHP support into your Web server. It, by and large, replaced the older CGI (Common Gateway Interface) method, which passed the files with specified extensions off to a different application to process—the most common in those days being Perl.

 

Mod_php does things differently—instead of a separate PHP executable handling PHP code, the PHP language is embedded directly into the Apache process itself. This is an extremely efficient way to process PHP code—but it absolutely sucks for a server expected to handle non-PHP content, because every single Apache process must bring an entire PHP execution environment with it, sharply limiting the number of total Apache processes available due to memory bloat.

 

Installing mod_php also means requiring Apache to run with the elderly prefork MPM (Multi Process Module), which doesn't scale to as many available worker processes as the modern default MPM, event. The reason mod_php—and prefork—are still around at all is that they are very good for a pure application service, 100-percent PHP workload, with all CSS, static HTML, images, and so forth offloaded to a different server or servers.
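
If you want to double-check which MPM your own Apache is actually running before going any further, either of these quick queries will tell you; on a stock Ubuntu 20.04 install with no mod_php, both should come back with "event":

root@apache:~# apache2ctl -V | grep -i mpm
root@apache:~# a2query -M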

Php-fpm is the right choice for a multi-purpose Web server

Instead, we installed php-fpm, the PHP FastCGI Process Manager. In this model, Apache doesn't bring PHP handling capabilities into the Apache processes themselves—instead, Apache hands its code execution needs off to a pool of dedicated PHP workers, which in turn pass results back to Apache.

 

Offloading PHP execution duties to a set of dedicated PHP worker threads enables Apache to use its more modern and better scaling event MPM handler. It also means each individual Apache thread can be spun up without the bulk of a PHP execution environment, drastically reducing the necessary amount of RAM for each thread.
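
A quick sanity check that the worker pool is actually up and listening never hurts; the socket path below is the one the stock Ubuntu php7.4-fpm package uses:

root@apache:~# systemctl status php7.4-fpm
root@apache:~# ls -l /run/php/php7.4-fpm.sock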

GTMetrix reports are an invaluable tool for Web admins looking to optimize delivery of their sites.

The front page of my personal blog entails 31 separate HTTPS requests. Ten of those are to other domains—fonts.googleapis.com, fonts.gstatic.com, and my own Matomo instance. Index.php itself is another, and the remaining twenty are static files delivered from the same server.

RAM is easily the most precious resource on that VM—and since we now know that I'm serving static files at about a 20:1 ratio to dynamic pages, I obviously shouldn't be wasting RAM on a full PHP environment for every Apache worker process!

Enabling php-fpm and installing remaining support packages

Most real-world Web applications will want a bunch of additional PHP modules installed. If we wanted to install WordPress, for instance, we'd want the following laundry list of PHP extensions:

root@apache:~# apt install php-fpm php-common php-mbstring php-xmlrpc php-soap php-gd php-mysql \
                           php-xml php-intl php-cli php-ldap php-zip php-curl

Whew. If you're not familiar with the use of the backslash there, it's a way of forcing a line break in the terminal without affecting code execution: so that's really all just one big line, installing all of the additional PHP extensions WordPress will want.

With those installed, we need to enable php-fpm itself, with the following commands:

root@apache:~# a2enmod proxy_fcgi
root@apache:~# a2enconf php7.4-fpm
root@apache:~# systemctl restart apache2

That's it—we've now created our full, WordPress-ready Web server environment. The next step is creating a MySQL database for WordPress, which looks like this:

root@apache:~# mysql -u debian-sys-maint -p
mysql> create database wordpress;
mysql> create user 'wordpress'@'localhost' identified by 'supersecretpassword';
mysql> grant all on wordpress.* to 'wordpress'@'localhost';
mysql> quit;
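
Note that this assumes a MySQL (or MariaDB) server is already installed; if it isn't, apt install mysql-server takes care of that first. Either way, it's worth a quick check that the new credentials actually work before handing them to WordPress:

root@apache:~# mysql -u wordpress -p wordpress
mysql> show grants;
mysql> quit;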

Now we're ready to create a new vhost—virtual host—to contain the new WordPress site. We could just use the default vhost config, but we're not going to—we're going to do this like professionals and be ready to manage a multi-site environment.

Apache site, module, and configuration configuration (that’s not a typo!)

The thing I enjoy the most about using Apache rather than competing Web servers is the highly segmented approach it uses for configuration management. In the olden days—which I remember none too fondly—a server would have a single monolithic httpd.conf file that could easily be thousands of lines long and contain global configs for the server as well as all individual configs for every site on the server. Yuck!

 

Happily, Apache eventually introduced the Include directive, which allowed the main Apache config file to link in other config files—and, best of all, directories that could be expected to be full of config files. This allowed site admins to create an individual short config file for each site and—just by dumping it into the appropriate directory—have that site's configurations automatically added to the existing server config after a systemctl reload apache2 (or, on non-systemd machines, apache2ctl reload).

 

The fine folks at Debian, in turn, took that concept and ran with it. When you install Apache on a modern Debian-derived system such as Ubuntu, you get the following directories automatically created:

/etc/apache2
/etc/apache2/sites-available
/etc/apache2/sites-enabled
/etc/apache2/mods-available
/etc/apache2/mods-enabled
/etc/apache2/conf-available
/etc/apache2/conf-enabled
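
The main apache2.conf ties these directories together with just a handful of Include lines near the bottom of the file; on a stock Ubuntu install, they look something like this:

root@apache:~# grep '^Include' /etc/apache2/apache2.conf
IncludeOptional mods-enabled/*.load
IncludeOptional mods-enabled/*.conf
Include ports.conf
IncludeOptional conf-enabled/*.conf
IncludeOptional sites-enabled/*.conf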

So let's say you want to add a module to Apache—like the proxy_fcgi module that php-fpm relies on. You don't need to monkey around with the global config file in /etc/apache2/apache2.conf, because the module's configuration and load files already sit in /etc/apache2/mods-available. They haven't actually taken effect yet because they're only in mods-available, not mods-enabled—but remember when we ran the command a2enmod proxy_fcgi in the last section?

root@apache:~# a2enmod proxy_fcgi
Considering dependency proxy for proxy_fcgi:
Enabling module proxy.
Enabling module proxy_fcgi.
To activate the new configuration, you need to run:
  systemctl restart apache2

What that command actually did was symlink the config file /etc/apache2/mods-available/proxy_fcgi.load to /etc/apache2/mods-enabled/proxy_fcgi.load. And when we next restart Apache as it's asking us to, Apache will Include all the files in mods-enabled—including our new friend, proxy_fcgi.load—and we'll therefore have the FastCGI proxy available.
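
You can see the result for yourself; the file in mods-enabled is just a symlink pointing back into mods-available:

root@apache:~# ls -l /etc/apache2/mods-enabled/proxy_fcgi.load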

 

If you remember, we did another command immediately after that one:

root@apache:~# a2enconf php7.4-fpm
Enabling conf php7.4-fpm.
To activate the new configuration, you need to run:
  systemctl reload apache2

That command symlinked /etc/apache2/conf-available/php7.4-fpm.conf to /etc/apache2/conf-enabled/php7.4-fpm.conf, and similarly, Apache will Include everything it finds in conf-enabled at each startup, so we've now got the following necessary configuration directives enabled:

root@apache:/etc/apache2/conf-available# cat php7.4-fpm.conf
# Redirect to local php-fpm if mod_php is not available
<IfModule !mod_php7.c>
   <IfModule proxy_fcgi_module>
      # Enable http authorization headers
      <IfModule setenvif_module>
         SetEnvIfNoCase ^Authorization$ "(.+)" HTTP_AUTHORIZATION=$1
      </IfModule>

      <FilesMatch ".+\.ph(ar|p|tml)$">
         SetHandler "proxy:unix:/run/php/php7.4-fpm.sock|fcgi://localhost"
      </FilesMatch>
      <FilesMatch ".+\.phps$">
         # Deny access to raw php sources by default
         # To re-enable it's recommended to enable access to the files
         # only in specific virtual host or directory
         Require all denied
      </FilesMatch>
      # Deny access to files without filename (e.g. '.php')
      <FilesMatch "^\.ph(ar|p|ps|tml)$">
         Require all denied
      </FilesMatch>
   </IfModule>
</IfModule>

Now, if you don't see the beauty in that... I don't know what to tell you. If you're confused about exactly how Apache is handling PHP files when it encounters them, you have a single file where you can look to see those config stanzas, and only those config stanzas. You can view it without being confused and annoyed by hundreds or thousands of other lines of configs, and you can edit it if necessary without fearing accidentally screwing up those other hundreds or thousands of lines of configuration that you aren't touching, since you're only working in this single self-contained file.

 

This isn't only for system-provided configuration stanzas, either—nothing's stopping you from writing your own config stanzas for a particular purpose, dropping them in /etc/apache2/conf-available, and a2enconfing them as desired. Want to know all the modules that are enabled? ls /etc/apache2/mods-enabled. Want to see if more are available? mods-available. The same thing goes for configs in conf-enabled and conf-available, and site (vhost) configurations in sites-enabled and sites-available.
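
As a quick, hypothetical example of a homegrown snippet (the filename and contents here are mine, not anything the distribution ships), you might keep a rule that blocks access to hidden dotfiles in its own little file:

root@apache:~# cat /etc/apache2/conf-available/block-dotfiles.conf
# Deny access to hidden files (.git, .env, and friends)
<FilesMatch "^\.">
   Require all denied
</FilesMatch>

root@apache:~# a2enconf block-dotfiles
root@apache:~# systemctl reload apache2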

 

That makes my sysadmin heart sing, it really does.

 

Baby’s first vhost

Now that we understand the basic configuration layout Apache uses, we'll create a simple site configuration file and place it in /etc/apache2/sites-available/apachetest.tehinterweb.net.conf.

<VirtualHost *:80>
   ServerAdmin [email protected]

   ServerName apachetest.tehinterweb.net
   ServerAlias www.apachetest.tehinterweb.net

   DocumentRoot /var/www/apachetest.tehinterweb.net/www/public_html

   <Directory /var/www/apachetest.tehinterweb.net/www/public_html>
      Options ExecCGI FollowSymLinks
      AllowOverride AuthConfig Limit FileInfo Options Indexes
      Order allow,deny
      allow from all
   </Directory>
</VirtualHost>

We've set an email address for the site owner, along with both a name and alias for the site itself—so either apachetest.tehinterweb.net or www.apachetest.tehinterweb.net will be served from the same vhost config here.

 

We also set a DocumentRoot beneath /var/www that's specific to the site itself—and we moved it a couple of directories further down from there, too. Putting the DocumentRoot in www/public_html beneath the main site folder gives us room to set up a logs directory for site-specific logs if we'd like to later, and maybe even a www/cgi-bin folder if we want to use some old-school Perl (or other) CGI executables that should be beneath the webroot.
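
If and when you do want those site-specific logs, it's only a couple of extra lines inside the <VirtualHost> block; the paths here assume a logs directory created alongside www, as described above:

   ErrorLog /var/www/apachetest.tehinterweb.net/logs/error.log
   CustomLog /var/www/apachetest.tehinterweb.net/logs/access.log combined

Create that logs directory before reloading Apache, and each site gets its own access and error logs instead of sharing the global ones under /var/log/apache2.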

 

Old-school CGI has fallen woefully out of fashion, but it's definitely nice to have a little bit of extra breathing room to put stuff obviously in the site's folder but beneath the webroot anyway. Aside from site logs, you might want to occasionally put database backups in /var/www/apachetest.tehinterweb.net/backups or whatever.

 

Anyway, now that we've got our site vhost configuration file, we need to enable it:

root@apache:~# a2ensite apachetest.tehinterweb.net
Enabling site apachetest.tehinterweb.net.
To activate the new configuration, you need to run:
systemctl reload apache2

root@apache:~# mkdir -p /var/www/apachetest.tehinterweb.net/www
root@apache:~# cd /var/www/apachetest.tehinterweb.net/www
root@apache:/var/www/apachetest.tehinterweb.net/www# wget https://wordpress.org/latest.tar.gz
root@apache:/var/www/apachetest.tehinterweb.net/www# tar zxf latest.tar.gz
root@apache:/var/www/apachetest.tehinterweb.net/www# cp -a wordpress public_html
root@apache:/var/www/apachetest.tehinterweb.net/www# chown -R www-data public_html

Boom, there's our site. Next step is installing Certbot and grabbing a LetsEncrypt TLS certificate:

root@apache:~# apt install python3-certbot-apache
root@apache:~# certbot

After Certbot asks us a couple of questions, it will automagically convert our vhost to use the new TLS certificate and enable HTTPS browsing, and optionally, forcibly redirect insecure HTTP browsing to HTTPS URIs. We're now ready to browse to https://apachetest.tehinterweb.net/ and start installing WordPress!
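
If you'd rather not answer prompts at all, Certbot can also be driven entirely from the command line; something along these lines (substitute a real email address for the placeholder) does the whole job in one shot:

root@apache:~# certbot --apache --redirect -n --agree-tos --no-eff-email -m you@example.com \
               -d apachetest.tehinterweb.net -d www.apachetest.tehinterweb.net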

 

If we're going to do this kind of thing a lot, it pays to create a template file that you can keep in /etc/apache2/sites-available, along with a script in /usr/local/bin that grabs a copy of the template, changes a few lines to match the new site name, drops the copy into sites-available and a2ensites it for you, and builds a new directory structure under /var/www with some equally templated skeleton files. At that point, adding an entirely new site to your Apache server is as simple as addsite new.site.com, and you're ready to go.
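
Here's a rough sketch of what such a helper might look like; the TEMPLATE.conf path, the SITENAME placeholder, and the directory skeleton are all assumptions you'd adapt to your own layout:

root@apache:~# cat /usr/local/bin/addsite
#!/bin/bash
# Hypothetical helper: stamp out and enable a new Apache vhost from a template.
# Assumes /etc/apache2/sites-available/TEMPLATE.conf exists and uses the
# placeholder SITENAME wherever the real site name belongs.
set -e
site="$1"
[ -n "$site" ] || { echo "usage: addsite new.site.com" >&2 ; exit 1 ; }

# Fill in the template and drop it into sites-available
sed "s/SITENAME/$site/g" /etc/apache2/sites-available/TEMPLATE.conf \
    > /etc/apache2/sites-available/"$site".conf

# Create the matching directory skeleton and hand it to the Web server user
mkdir -p /var/www/"$site"/www/public_html /var/www/"$site"/logs
chown -R www-data:www-data /var/www/"$site"/www

a2ensite "$site"
systemctl reload apache2

From there, adding a site really is just addsite new.site.com, followed by a quick certbot run for the TLS certificate.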

A little WordPress tuning before we get started

Before we can really get started tuning the new server, we need to take a couple of steps to keep it from bottlenecking unnecessarily on MySQL when doing dynamic page loads. In WordPress' case, one way of doing this is to install the W3-Total-Cache plugin, along with the memcached package and the php-memcache extension for php-fpm to use when accessing our new memcached.
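
On the server side, that amounts to a couple more packages and a restart of the PHP worker pool; W3-Total-Cache itself gets installed from the WordPress plugin admin, not from apt:

root@apache:~# apt install memcached php-memcache
root@apache:~# systemctl restart php7.4-fpm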

Once that's done, we can go into W3-Total-Cache's settings, enable all the caches, and point them to memcached running on localhost. (Sorry for not giving you more detail and screenshots, but this is at least supposed to be an Apache guide, not a WordPress guide!)

Tuning Apache and php-fpm themselves

We're going to use a nifty combination of commands to get a bird's-eye view of what's going on with our little server:

root@apache:~# watch -n 1 'ps wwaux | head -n 1 ; ps wwaux | grep apache | grep -v grep \
               ; echo ; ps wwaux | grep php | grep -v grep ; echo ; free -m'

In order, we're:

  • Using watch -n 1 to refresh the rest of the command line every second
  • Doing a process list with all the trimmings but stripping it to the first line (which contains column headings)
  • Doing another process list with all the trimmings, limiting that to lines containing "apache" and not containing "grep"
  • Doing another process list with all the trimmings, limiting that to lines containing "php" and not containing "grep"
  • Checking the RAM statistics on the server

Now, from another server—to avoid contaminating our observations with the load caused by ab itself—we'll use Apachebench (gotten with apt install apache2-utils) to put load on our Web server.

 

We need to do that two ways: with a purely static workload and again with a purely dynamic workload. Observing the impact on our server in each case will give us some insight into how much RAM we need in order to fulfill each PHP page execution and each static file delivery. When we combine that with a waterfall chart to see about how many static files we'll need to deliver for each page load, we'll have a good idea of how to tune the server.

 

To test static page load, we'll use the command ab -c100 -t30 https://apachetest.tehinterweb.net/license.txt. This hits our server with 100 concurrent requests for the WordPress license, which is a reasonably small text file and needs no PHP processing.

 

To test dynamic page load, we'll use the command ab -c30 -t30 'https://apachetest.tehinterweb.net/?cat=1'. This request is for all blog posts in the WordPress category with id=1, which happens to fit the single "Hello world" post that WordPress came with out of the box—but asking for the whole category defeats a little of W3-Total-Cache's caching, so we still need to do significant PHP processing.

Kicking the tires with default configuration

Our new server has about 300MiB of its 1GiB total RAM sitting relatively idle, after we've messed around a little with manual browsing but otherwise left it alone.

 

First, let's see what happens when we force the server to do a lot of extra work delivering static files. We'll use the apachebench tool for this—and, importantly, we'll do that from a different system so as not to contaminate our analysis with memory or CPU usage of the ab process itself.

 

Not a whole lot seems to change whether the server is under load or not—we go from 294MiB free to 281MiB free under an all-static workload, and from 294MiB free to 277MiB free under an all-dynamic workload. Our Apachebench results tell us that we're serving 37.36 requests per second, with a median time of 739ms and a 99-percent time of 2886ms. And finally, our transfer rate is 1000.18 KiB/sec—so it doesn't look like we're bottlenecking on network throughput!

 

Now that we've got a baseline idea of how the system performs, it's time to look at how Apache and php-fpm are currently configured to behave.

Examining the default configuration

So, let's take a look at our current configurations. In Apache, we'll look at /etc/apache2/mods-available/mpm_event.conf:

root@apache:/etc/apache2/mods-available# cat mpm_event.conf
# event MPM
# StartServers: initial number of server processes to start
# MinSpareThreads: minimum number of worker threads which are kept spare
# MaxSpareThreads: maximum number of worker threads which are kept spare
# ThreadsPerChild: constant number of worker threads in each server process
# MaxRequestWorkers: maximum number of worker threads
# MaxConnectionsPerChild: maximum number of requests a server process serves

<IfModule mpm_event_module>
   StartServers            2
   MinSpareThreads        25
   MaxSpareThreads        75
   ThreadLimit            64
   ThreadsPerChild        25
   MaxRequestWorkers     150
   MaxConnectionsPerChild  0
</IfModule>

The big thing we're looking at here is MaxRequestWorkers, which is set to 150. So no matter what else happens, our Apache isn't currently going to be willing to handle more than 150 concurrent requests. The rest of the directives here govern the nitty-gritty details of how we get to that many workers—how many Apache child processes, and how many threads each is allowed to spawn.

 

Now, let's consider php-fpm. Earlier, we guesstimated 8MiB or so—let's round that up to 10MiB—per typical php-fpm thread. How many total php-fpm children are allowed right now? To find the answer, we look in /etc/php/7.4/fpm/pool.d/www.conf. This file is absolutely littered with self-documenting comments that make it incredibly difficult to read, so we're going to use a trick with sed here to only see non-comment lines:

root@apache:~# cat /etc/php/7.4/fpm/pool.d/www.conf | sed '/^;.*$/d' | sed '/^$/d'

[www]
user = www-data
group = www-data
listen = /run/php/php7.4-fpm.sock
listen.owner = www-data
listen.group = www-data
pm = dynamic
pm.max_children = 5
pm.start_servers = 2
pm.min_spare_servers = 1
pm.max_spare_servers = 3

The crucial directive here is pm.max_children = 5. So Apache will be willing to spawn 150 workers at once—but only 5 of them can connect to php-fpm at once.

 

Drawing tentative conclusions

Right now, Apache is willing to open 150 connections, and php-fpm is only willing to serve five of them. That's a ratio of 30:1, and we're looking for more like 20:1. So we could use more workers in php-fpm to reduce queue times for our Apache threads.

 

It doesn't appear that Apache is really consuming any significant extra RAM when we ask it to spawn more threads—the individual threads are very lightweight since they don't have the PHP environment in them. So we can almost certainly afford to bump that up whenever we're ready. We're going to put that on hold for the time being, though, since we're primarily worried about our PHP execution time.
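
If we do decide to raise it later, it's the same style of one-liner we'll use for php-fpm below; the 300 here is purely an illustrative figure, not a recommendation:

root@apache:~# sed -i 's/MaxRequestWorkers[[:space:]]*150/MaxRequestWorkers     300/' /etc/apache2/mods-available/mpm_event.conf
root@apache:~# systemctl restart apache2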

 

Php-fpm starts two servers and escalates to a maximum of five when under load. When we put the server under a heavy dynamic page load, it went from 294MiB free to 277MiB free. So we can guess that each PHP worker is consuming about (294MiB - 277MiB) / 3 extra PHP workers ≈ 6MiB of RAM, if we round it up a bit. Let's actually round it up a bit more for safety's sake—a good bit more, since PHP workers' RAM usage can climb significantly if they're doing different work.

 

Rounding our RAM budget per PHP worker up to 10MiB, we adjust our presumed free RAM with five workers down from 277MiB to 264MiB. We don't want to cut things too fine with our tiny little server—but we can probably get away with budgeting ourselves to only have 200MiB free instead of 264MiB free. That means an extra ten PHP workers above and beyond what we've got right now.

root@apache:~# sed -i 's/pm.max_children = 5/pm.max_children = 15/' /etc/php/7.4/fpm/pool.d/www.conf
root@apache:~# systemctl restart php7.4-fpm

Whew.

Testing our initial change

Now, we'll run our ab -c30 -t30 'https://apachetest.tehinterweb.net/?cat=1' again, while continuing to observe our live-refreshing CPU and RAM stats. We're also going to add another command to our little monitoring one-liner, though, to keep an eye on total CPU usage—because we're likely to be getting close to bottlenecking on CPU. So our new live-monitoring one-liner is:

root@apache:~# watch -n 1 'ps wwaux | head -n 1 ; ps wwaux | grep apache | grep -v grep ; echo ; \
               ps wwaux | grep php | grep -v grep ; echo ; free -m ; echo ; top -b -n 1 | head -n 3'

The good news is that we managed to slightly improve our ab scores: we went from 37.36 requests per second to 43.41, and better yet, we improved our median service time from 739ms to 462ms. We're definitely delivering pages faster and more consistently under heavy load! The bad news is that we're bottlenecking on CPU. Our baby $5/mo droplet only has a single core, and it's running at 90-percent utilization most of the time under our 30-concurrency tests.

 

So to do a better job, we need to shell out for a better VM, with another CPU core assigned to it. That'll bump our cost from $5/mo to $15/mo—but we'll get double the RAM out of the deal and three times as much network allotment per month, too. Back to the droplet manager we go for an upgrade.

 

A couple minutes later, we're up to two vCPUs and 2GiB of RAM, and we run the same test again: this time, we're not bottlenecking on CPU, and we've tremendously improved our ab results, with 54.3 requests per second, at a median service time of only 266ms. But look at all that free RAM! Our available RAM doubled along with our CPU—so it's time to grow into it.

root@apache:~# sed -i 's/pm.max_children = 15/pm.max_children = 25/' /etc/php/7.4/fpm/pool.d/www.conf
root@apache:~# sed -i 's/pm.start_servers = 2/pm.start_servers = 10/' /etc/php/7.4/fpm/pool.d/www.conf
root@apache:~# sed -i 's/pm.min_spare_servers = 1/pm.min_spare_servers = 10/' /etc/php/7.4/fpm/pool.d/www.conf
root@apache:~# sed -i 's/pm.max_spare_servers = 3/pm.max_spare_servers = 15/' /etc/php/7.4/fpm/pool.d/www.conf
root@apache:~# systemctl restart php7.4-fpm

Now, we've not only allowed ourselves up to 25 total PHP workers, we've minimized system thrash by requiring more of them to be running all the time, even if the server seems idle. When we run our test again, we see all the new workers—unfortunately, what we don't see is the actual throughput going up.

 

By installing iperf3 on our droplet and testing raw network throughput, we can see we're only getting around 10Mbps or so—which we were already hitting after upgrading our droplet to two vCPUs and 2GiB of RAM. We get similar results when testing from a home office with 200Mbps Spectrum cable Internet, or from a Linode datacenter a few hundred miles away—so, we're stuck. No amount of tweaking CPU, RAM, or settings will overcome this network bottleneck.
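
For reference, that raw throughput test is simple to reproduce: run iperf3 in server mode on the droplet, then point a client at it from another machine (the hostname below is a placeholder, and port 5201 needs to be reachable):

root@apache:~# apt install iperf3
root@apache:~# iperf3 -s

me@elsewhere:~$ iperf3 -c droplet.example.com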

Conclusions

This isn't exactly the end we'd hoped for with this story—but it is constructive, and it demonstrates some real-world problems with cheap hosting. It also illustrates the reality that you always need to target your real bottleneck—which started out being our CPU and then shifted to our VM's Internet connection.

 

If we were doing this for real, the next step would be opening up a ticket with Digital Ocean and complaining about the poor throughput. The company doesn't outright promise any particular throughput anywhere, but it's not difficult to find support threads implying real-world speeds of 300+Mbps.

 

If that process failed, we'd be forced to start shopping around for cheap hosting elsewhere, hoping for a better result—which can be an extremely frustrating experience. Competitors in the space include but are not limited to Linode, Hetzner, OVH, Vultr, and Hostwinds.

 

If you're looking for a better guarantee on available bandwidth, there's always Amazon Web Services—but amateurs and those with shallow pockets beware. AWS bills per GiB of bandwidth used, and a Slashdotting can get very expensive very quickly.

 

 

Source: Apache 101: 0-WordPress in 15 minutes (Ars Technica)  

 

(To view the article's image galleries, please visit the above link)
