
Web Server Overloads

In December 2023, after TWiki was upgraded to 6.0.1 and migrated from justice to petra, we started seeing kernel OOM (out-of-memory) warnings, followed by the kernel killing processes.

Causes

Crawlers and robots

  • just blocking everything also blocks Google, etc., and we want to be found
  • I put the recommended /robots.txt in place to block edit/etc attempts
  • the RobotsAreWelcome configuration setting is true, but TWiki still puts nofollow tags on things even so
  • both the old and new sites appear to be putting nofollow tags in
    • so apparently the problem is rude robots that ignore such hints; asking nicely won't help
  • I have added the worst offenders to the Apache blacklist and that seems to be helping

Piggish TWiki pages

Some TWiki pages simply use tons of resources to render, typically because they have lots of content, or pull in lots of content from other pages. More complicated markup or plugins also use more resources to render.

  • TWikiHistory
  • TWikiDocumentation

Error messages (spread across multiple logs) looked like these:

Out of memory!
[cgid:error] End of script output before headers: view, referer: https://wiki.gnhlug.org/TWiki/WebTopicList
| 0023-12-26 - 12:00:38 | guest | view | TWiki.TWikiDocumentation | Mozilla | x.y.z.w |
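
To see which topics (or which clients) account for most of the views, rows like the last one above can be tallied. This is only a minimal Python sketch, assuming every row has the six pipe-delimited fields shown in the sample; the function names and the field labels (timestamp, user, action, topic, agent, address) are invented here for illustration and are not TWiki's own names.

```python
from collections import Counter

# Hypothetical labels for the six columns seen in the sample row above.
FIELDS = ("timestamp", "user", "action", "topic", "agent", "address")

def parse_log_line(line):
    """Return a dict for one '| a | b | ... |' row, or None for anything else."""
    line = line.strip()
    if len(line) < 2 or not (line.startswith("|") and line.endswith("|")):
        return None
    # Drop the outer pipes, then split the remainder into columns.
    columns = [col.strip() for col in line[1:-1].split("|")]
    if len(columns) != len(FIELDS):
        return None
    return dict(zip(FIELDS, columns))

def count_views(lines, key="topic"):
    """Count 'view' rows grouped by the given field (topic, agent, address...)."""
    counts = Counter()
    for line in lines:
        row = parse_log_line(line)
        if row is not None and row["action"] == "view":
            counts[row[key]] += 1
    return counts
```

Calling count_views(open(path)) on a log file and then .most_common(10) on the result would list the ten most-viewed topics; passing key="agent" or key="address" groups by client instead, which may help when picking candidates for the blacklist.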

Reduction

Looking into TWiki:Plugins.CacheAddOn might be a good idea (FIXME). It should help performance generally.

Prevention

Ideally, we'd like to allow TWiki to use what's available, but prevent it from using too much for too long.

Resource limits above Apache

These were set in the apache2.service systemd unit file:

MemoryHigh=500M
MemoryMax=700M

These limits apply to Apache itself and to all of its child processes, even children that switch to a different user. They are cumulative: the total across all of those processes counts against the limit. This at least keeps Apache/TWiki from consuming so many resources that it kills the entire host. Unfortunately, it is still possible for TWiki to use up so much memory that the rest of Apache is affected.
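
One way to check whether these limits are actually being hit is to read the counters the kernel keeps for the unit's cgroup. This is a rough Python sketch, not something taken from this server: it assumes cgroup v2 is mounted at /sys/fs/cgroup and that the unit lives under system.slice, and the helper names are made up for illustration.

```python
from pathlib import Path

# Assumed path: cgroup v2 unified hierarchy, unit placed in system.slice.
CGROUP_DIR = Path("/sys/fs/cgroup/system.slice/apache2.service")

def parse_memory_events(text):
    """Parse the 'name count' lines of a cgroup v2 memory.events file."""
    events = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[1].isdigit():
            events[parts[0]] = int(parts[1])
    return events

def read_memory_status(cgroup_dir=CGROUP_DIR):
    """Return (current usage in bytes, event counters) for the cgroup."""
    current = int((cgroup_dir / "memory.current").read_text())
    events = parse_memory_events((cgroup_dir / "memory.events").read_text())
    return current, events

def format_report(current, events):
    """Summarize usage and how often each limit has come into play."""
    return (
        f"current usage: {current / 2**20:.0f} MiB\n"
        f"throttled at MemoryHigh: {events.get('high', 0)} times\n"
        f"MemoryMax reached: {events.get('max', 0)} times\n"
        f"OOM kills inside the unit: {events.get('oom_kill', 0)}"
    )
```

Running print(format_report(*read_memory_status())) on the server should show how often the high and max boundaries were reached, which could help judge whether 500M/700M are set sensibly.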

Resource limits within Apache

  • setting RLimitNPROC to 50 results in many "can't fork" errors (100 works)
  • setting RLimitMEM to 50 MiB breaks TWikiHistory (75 MiB works)
  • 100 processes * 50 MiB each is about 5 GiB (and 100 * 75 MiB is over 7 GiB); this server only has 1 GiB RAM
  • what we really need is an aggregate limit (e.g., cgroup) for all of TWiki, but not other CGI
  • failing that, limiting all of Apache would at least limit the damage to web
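
The arithmetic behind the per-process limits can be spelled out. This is just an illustrative Python sketch; the function names are invented, and it treats the limits as a simple worst case in which every allowed process uses its full allowance at once.

```python
MIB = 2**20
GIB = 2**30

def worst_case_bytes(max_processes, per_process_mib):
    """Worst case if every allowed process uses its full memory limit."""
    return max_processes * per_process_mib * MIB

def processes_that_fit(budget_mib, per_process_mib):
    """How many processes at the per-process limit fit in a total budget."""
    return budget_mib // per_process_mib

def summarize(max_processes, per_process_mib, ram_gib):
    """Describe the worst case relative to the RAM actually installed."""
    total_gib = worst_case_bytes(max_processes, per_process_mib) / GIB
    return (
        f"{max_processes} processes x {per_process_mib} MiB = "
        f"{total_gib:.1f} GiB worst case, {total_gib / ram_gib:.1f}x "
        f"the {ram_gib} GiB of RAM"
    )
```

With the values that actually work (100 processes, 75 MiB each), summarize(100, 75, 1) comes to about 7.3 GiB. Going the other way, processes_that_fit(700, 75) gives 9, so only a handful of maxed-out processes fit inside a 700 MiB aggregate cap. That gap is why per-process limits alone cannot protect a 1 GiB host.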

Worker limits within Apache

FIXME

Resource limits below Apache

Can we do anything within Perl and/or TWiki to limit the resources they use? Just throwing ulimit(1) calls in a wrapper script is unlikely to help, but perhaps there is something more suitable?

Even better would be segregating CGI out of the Apache process tree entirely; see below.

Segregating CGI

Ideally, we run TWiki as a separate user, in a separate kernel cgroup, with limits specific to those things. But how?

  • running multiple httpd's would do it, but we'd need a reverse proxy and rewrites, yuck
  • suexec is not suitable
    • our TWiki is carefully locked down with file ownership and permissions
    • suexec insists that CGI scripts and directories be owned by the running user, which increases exposure
  • ditto for CGIWrap
  • mod_perl might save some resources
    • but I suspect no amount of added resources is sufficient
    • rude robots will keep using resources until we run out, no matter what
  • ditto for SpeedyCGI / PersistentPerl, plus it requires TWiki code changes
  • FastCGI, see below

FastCGI

  • FastCGI is a protocol for web servers to call out to separately-running external programs
  • fcgiwrap can run an arbitrary CGI script for FastCGI
  • Apache has two options, mod_proxy_fcgi and mod_fcgid
  • mod_proxy_fcgi
    • Already in-use on the same server for a PHP app
    • Has a known bug that breaks some scripts, but TWiki might be OK
    • Had lots of trouble getting it hooked into the Apache request chain
    • Started to see lots of Apache thread/process hangs
    • Gave up and removed it all
    • Still saw a few thread/process hangs, so maybe that was unrelated or a bad diagnosis?
  • mod_fcgid
    • This is an evolution of mod_fastcgi, the original
    • Older, deprecated, not as well documented
    • Looked into it but could not see a way to hook it into fcgiwrap
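
For a feel of what "calling out" over FastCGI involves, here is a small Python sketch of how the web server side frames a request, following the record layout in the FastCGI specification (8-byte header, then content, then padding). It only builds bytes and does not talk to fcgiwrap or Apache; the function names are invented for illustration.

```python
import struct

# Constants from the FastCGI specification.
FCGI_VERSION_1 = 1
FCGI_BEGIN_REQUEST = 1
FCGI_PARAMS = 4
FCGI_RESPONDER = 1  # role: behave like an ordinary CGI program

def record(rec_type, request_id, content=b""):
    """Frame content as one record: 8-byte header, content, padding."""
    padding = -len(content) % 8  # pad content to a multiple of 8 bytes
    header = struct.pack(
        "!BBHHBx", FCGI_VERSION_1, rec_type, request_id, len(content), padding
    )
    return header + content + b"\x00" * padding

def begin_request(request_id, role=FCGI_RESPONDER, keep_conn=False):
    """First record of a request: role, flags, five reserved bytes."""
    body = struct.pack("!HB5x", role, 1 if keep_conn else 0)
    return record(FCGI_BEGIN_REQUEST, request_id, body)

def name_value(name, value):
    """Encode one CGI variable; lengths of 128 or more use 4 bytes."""
    def encode_length(n):
        if n < 128:
            return struct.pack("!B", n)
        return struct.pack("!I", n | 0x80000000)
    return encode_length(len(name)) + encode_length(len(value)) + name + value
```

A request would then be begin_request(1) followed by record(FCGI_PARAMS, 1, ...) records carrying the usual CGI variables such as SCRIPT_NAME, which is roughly the environment a wrapper like fcgiwrap would hand to the CGI script it runs.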
Topic revision: r1 - 2023-12-29 - BenScott
 

All content is Copyright © 1999-2025 by, and the property of, the contributing authors.