performance when using a .htaccess
keith smith
klsmith2020 at yahoo.com
Mon Oct 25 15:55:51 MST 2010
Thank you for all this info!
------------------------
Keith Smith
--- On Fri, 10/22/10, Lisa Kachold <lisakachold at obnosis.com> wrote:
From: Lisa Kachold <lisakachold at obnosis.com>
Subject: Re: performance when using a .htaccess
To: "Main PLUG discussion list" <plug-discuss at lists.plug.phoenix.az.us>
Date: Friday, October 22, 2010, 8:18 PM
On Fri, Oct 22, 2010 at 4:00 PM, Lisa Kachold <lisakachold at obnosis.com> wrote:
On Fri, Oct 22, 2010 at 2:18 PM, keith smith <klsmith2020 at yahoo.com> wrote:
Hi,
I have a question about performance when using a .htaccess file. I have read that having multiple .htaccess files, meaning one in each directory, can slow Apache down.
We have moved a ton of content, upwards of 900 pages. About 600 of those were moved from our blog, which was located in the directory /blog. It was suggested that we break the .htaccess into files that reflect the content moved. For example, put a .htaccess file in the /blog directory that covers all the content moved from the blog, instead of one big .htaccess file in the doc root directory that would contain 900 redirects.
Well, that's better than FollowSymlinks?
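The reason that multiple .htaccess files can be slow and difficult to manage is that, on every request, Apache2 looks for a .htaccess file in each directory along the path to the requested file, and directives set in parent directories are inherited by the directories below them.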
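Where the redirects can live in the main server configuration instead, those per-request lookups can be switched off for the whole tree. A minimal sketch, assuming a document root of /var/www/html (a placeholder path, not taken from this thread):

```apache
# Assumed document root; adjust to match your VirtualHost.
<Directory "/var/www/html">
    # With AllowOverride None, Apache does not look for .htaccess
    # files in this directory or in anything below it.
    AllowOverride None
</Directory>
```

With overrides disabled, any rules that used to live in .htaccess files have to move into the server or VirtualHost configuration, and they take effect only after a reload.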
A rewrite might actually be able to do exactly what you need. Have you considered that? Rewrite overhead is not huge, especially if you are caching for this /blog URL.
You simply enable mod_rewrite in Apache2 (procedure varies depending on your distro/version).
A mod_rewrite solution can be a single rule in the configuration file for that VirtualHost, for instance:
1) Here's a simple rewrite (provided your directory BLOG containing all of the 600 files can be trivially redirected to something like "newblog"). Note that RewriteBase is only valid in per-directory context, so it is not used here:
RewriteEngine on
RewriteRule ^/blog/(.*)$ /newblog/$1 [R=301,L]
This sends every URL under /blog/ to the same path under /newblog/ with a permanent (R=301) redirect.
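For a plain prefix-to-prefix move like this one, mod_alias may also do the job without mod_rewrite. The line below is a substitute technique rather than the rewrite rule above, and it assumes the new location really is /newblog/:

```apache
# mod_alias alternative: permanently redirect /blog/<anything>
# to /newblog/<anything>, keeping the rest of the path.
RedirectMatch permanent ^/blog/(.*)$ /newblog/$1
```

This only helps when every old URL maps to its new one by the same pattern. Pages that moved to unrelated locations still need their own entries.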
2) Use a RewriteMap which is loaded ONCE by Apache:
http://httpd.apache.org/docs/2.0/mod/mod_rewrite.html#rewritemap
The RewriteMap directive defines a Rewriting Map which can be used inside rule substitution strings by the mapping-functions to insert/substitute fields through a key lookup. The source of this lookup can be of various types.
The MapName is the name of the map and will be used to specify a mapping-function for the substitution strings of a rewriting rule via one of the following constructs:
${ MapName : LookupKey }
${ MapName : LookupKey | DefaultValue }
When such a construct occurs, the map MapName is consulted and the key LookupKey is looked up. If the key is found, the map-function construct is substituted by SubstValue. If the key is not found then it is substituted by DefaultValue or by the empty string if no DefaultValue was specified.
For example, you might define a RewriteMap as:
RewriteMap examplemap txt:/path/to/file/map.txt
You would then be able to use this map in a RewriteRule as follows:
RewriteRule ^/ex/(.*) ${examplemap:$1}
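Applied to the blog move, a map could hold one old-to-new pair per line, so the 600 redirects become data rather than 600 rules. This is a rough sketch: the map name blogmap, the file path, and the sample slugs are all made up for illustration. RewriteMap has to be declared in the server or VirtualHost configuration; it is not allowed in .htaccess.

```apache
# Hypothetical map file /etc/apache2/maps/blog-moved.txt, one pair per line:
#   old-post-one    /articles/old-post-one
#   old-post-two    /news/2010/old-post-two

RewriteEngine on
RewriteMap blogmap txt:/etc/apache2/maps/blog-moved.txt

# Only redirect when the slug is present in the map: the condition
# performs the lookup and fails if the result is the empty string.
RewriteCond ${blogmap:$1} !^$
RewriteRule ^/blog/(.+)$ ${blogmap:$1} [R=301,L]
```

Requests for slugs that are not in the map fall through untouched, so pages that were never moved keep working.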
3) Advanced Rewrites: Filesystem Reorganization
Description:
This really is a hardcore example: a killer application which heavily uses per-directory RewriteRules to get a smooth look and feel on the Web while its data structure is never touched or adjusted. The net.sw directory structure looks like this:
drwxrwxr-x 2 netsw users 512 Aug 3 18:39 Audio/
drwxrwxr-x 2 netsw users 512 Jul 9 14:37 Benchmark/
drwxrwxr-x 12 netsw users 512 Jul 9 00:34 Crypto/
drwxrwxr-x 5 netsw users 512 Jul 9 00:41 Database/
drwxrwxr-x 4 netsw users 512 Jul 30 19:25 Dicts/
drwxrwxr-x 10 netsw users 512 Jul 9 01:54 Graphic/
drwxrwxr-x 5 netsw users 512 Jul 9 01:58 Hackers/
drwxrwxr-x 8 netsw users 512 Jul 9 03:19 InfoSys/
drwxrwxr-x 3 netsw users 512 Jul 9 03:21 Math/
drwxrwxr-x 3 netsw users 512 Jul 9 03:24 Misc/
drwxrwxr-x 9 netsw users 512 Aug 1 16:33 Network/
drwxrwxr-x 2 netsw users 512 Jul 9 05:53 Office/
drwxrwxr-x 7 netsw users 512 Jul 9 09:24 SoftEng/
drwxrwxr-x 7 netsw users 512 Jul 9 12:17 System/
drwxrwxr-x 12 netsw users 512 Aug 3 20:15 Typesetting/
drwxrwxr-x 10 netsw users 512 Jul 9 14:08 X11/
Solution:
The solution has two parts: The first is a set of CGI scripts which create all the pages at all directory levels on-the-fly. I put them under /e/netsw/.www/ as follows:
-rw-r--r-- 1 netsw users 1318 Aug 1 18:10 .wwwacl
drwxr-xr-x 18 netsw users 512 Aug 5 15:51 DATA/
-rw-rw-rw- 1 netsw users 372982 Aug 5 16:35 LOGFILE
-rw-r--r-- 1 netsw users 659 Aug 4 09:27 TODO
-rw-r--r-- 1 netsw users 5697 Aug 1 18:01 netsw-about.html
-rwxr-xr-x 1 netsw users 579 Aug 2 10:33 netsw-access.pl
-rwxr-xr-x 1 netsw users 1532 Aug 1 17:35 netsw-changes.cgi
-rwxr-xr-x 1 netsw users 2866 Aug 5 14:49 netsw-home.cgi
drwxr-xr-x 2 netsw users 512 Jul 8 23:47 netsw-img/
-rwxr-xr-x 1 netsw users 24050 Aug 5 15:49 netsw-lsdir.cgi
-rwxr-xr-x 1 netsw users 1589 Aug 3 18:43 netsw-search.cgi
-rwxr-xr-x 1 netsw users 1885 Aug 1 17:41 netsw-tree.cgi
-rw-r--r-- 1 netsw users 234 Jul 30 16:35 netsw-unlimit.lst
The DATA/ subdirectory holds the above directory structure, i.e. the real net.sw stuff, and gets automatically updated via rdist from time to time. The second part of the problem remains: how to link these two structures together into one smooth-looking URL tree? We want to hide the DATA/ directory from the user while running the appropriate CGI scripts for the various URLs. Here is the solution: first I put the following into the per-directory configuration file in the DocumentRoot of the server to rewrite the announced URL /net.sw/ to the internal path /e/netsw:
RewriteRule ^net.sw$ net.sw/ [R]
RewriteRule ^net.sw/(.*)$ e/netsw/$1
The first rule is for requests which miss the trailing slash! The second rule does the real thing. And then comes the killer configuration which stays in the per-directory config file /e/netsw/.www/.wwwacl:
Options ExecCGI FollowSymLinks Includes MultiViews
RewriteEngine on
# we are reached via /net.sw/ prefix
RewriteBase /net.sw/
# first we rewrite the root dir to
# the handling cgi script
RewriteRule ^$ netsw-home.cgi [L]
RewriteRule ^index\.html$ netsw-home.cgi [L]
# strip out the subdirs when
# the browser requests us from perdir pages
RewriteRule ^.+/(netsw-[^/]+/.+)$ $1 [L]
# and now break the rewriting for local files
RewriteRule ^netsw-home\.cgi.* - [L]
RewriteRule ^netsw-changes\.cgi.* - [L]
RewriteRule ^netsw-search\.cgi.* - [L]
RewriteRule ^netsw-tree\.cgi$ - [L]
RewriteRule ^netsw-about\.html$ - [L]
RewriteRule ^netsw-img/.*$ - [L]
# anything else is a subdir which gets handled
# by another cgi script
RewriteRule !^netsw-lsdir\.cgi.* - [C]
RewriteRule (.*) netsw-lsdir.cgi/$1
Some hints for interpretation:
Notice the L (last) flag and no substitution field ('-') in the fourth part.
Notice the ! (not) character and the C (chain) flag at the first rule in the last part.
Notice the catch-all pattern in the last rule.
Reference: http://httpd.apache.org/docs/2.0/misc/rewriteguide.html (SEE also the excellent sections on blocking annoying robots, and other tricks).
4) I would consider reorganizing your blog files into some structured layout, say a new alphabetical file structure, where wildcard rewrites will reduce your total number of rewrites.
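As an illustration of how such a layout cuts the rule count, one wildcard rule could cover every post if each one were filed under a directory named for its first letter. The /articles/ prefix and the letter-bucket scheme are assumptions made up for this sketch, written for VirtualHost context:

```apache
RewriteEngine on

# /blog/apache-tips  ->  /articles/a/apache-tips
# $1 is the first letter of the slug, $2 is the rest of it.
RewriteRule ^/blog/([a-z])([^/]*)$ /articles/$1/$1$2 [R=301,L]
```

This only matches slugs that start with a lowercase letter and contain no further slashes. Anything else would need its own rule or a map entry.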
With a large number of rewrites, especially where a permanent (R=301) redirect is used, I would ALWAYS put them in real configuration files under /etc/apache2, pulled in with an Include statement. They are easier to back up, manage, and grep through, and problems are easier to evaluate after a graceful restart to load new changes.
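A minimal sketch of that arrangement follows. The directory and file name are hypothetical, and the rule shown in the comment simply repeats the /blog to /newblog example from earlier in this message:

```apache
# Inside the site's <VirtualHost> block. The path below is made up;
# any readable file under /etc/apache2 would work the same way.
Include /etc/apache2/redirects/blog-redirects.conf

# blog-redirects.conf would then hold the rules themselves, e.g.:
#   RewriteEngine on
#   RewriteRule ^/blog/(.*)$ /newblog/$1 [R=301,L]
```

Running apachectl configtest before apachectl graceful (apache2ctl on Debian-style systems) should catch syntax errors in the included file before the new rules go live.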
Thank you for your feedback.
------------------------
Keith Smith
---------------------------------------------------
PLUG-discuss mailing list - PLUG-discuss at lists.plug.phoenix.az.us
To subscribe, unsubscribe, or to change your mail settings:
http://lists.PLUG.phoenix.az.us/mailman/listinfo/plug-discuss
--
Skype: 6022393392
ATT: 5037544452
GV: 6923073392
Phoenix Linux Security Team PLUG.PHOENIX.AZ.US
http://www.it-clowns.com
"Great things are not done by impulse but a series of small things brought together." -Van Gogh