On Fri, Oct 22, 2010 at 4:00 PM, Lisa Kachold wrote:

> On Fri, Oct 22, 2010 at 2:18 PM, keith smith wrote:
>
>> Hi,
>>
>> I have a question about performance when using a .htaccess file. I have
>> read that having multiple .htaccess files can slow Apache, meaning a
>> .htaccess file in each directory.
>>
>> We have moved a ton of content, upwards of 900 pages. About 600 of those
>> have been moved from our blog, which was located in the directory /blog. It
>> was suggested to break the .htaccess into files that reflect the content
>> moved. For example, put a .htaccess file in the /blog directory that
>> reflects all the content from the blog instead of one big .htaccess file in
>> the doc root directory that would contain 900 redirects.
>
> Well, that's better than FollowSymlinks?
>
> The reason that managing multiple .htaccess files can be slow and
> difficult is that Apache2 looks for a .htaccess file in every directory
> along the requested path, and .htaccess settings are inherited from the
> parent directories.
>
> A rewrite might actually be able to do exactly what you need. Have you
> considered that? Rewrite overhead is not huge, especially if you are
> caching for this /blog URL.

You simply enable mod_rewrite in Apache2 (the procedure varies depending on
your distro/version). A mod_rewrite solution can be a ONE-line rule in your
configuration file for that VirtualHost. For instance:

1) Here's a simple rewrite (provided your directory BLOG containing all of
the 600 files can be trivially redirected to something like "newblog"):

    RewriteEngine on
    RewriteRule ^/blog/(.*)$ /newblog/$1 [R=301,L]

This sends every URL under /blog to the same path under /newblog/ with an
R (permanent) redirect.

2) Use a RewriteMap, whose lookups Apache caches in memory until the map
file changes or the server restarts:

http://httpd.apache.org/docs/2.0/mod/mod_rewrite.html#rewritemap

The RewriteMap directive defines a *Rewriting Map* which can be used inside
rule substitution strings by the mapping-functions to insert/substitute
fields through a key lookup.
The source of this lookup can be of various types.

The *MapName* is the name of the map and will be used to specify a
mapping-function for the substitution strings of a rewriting rule via one of
the following constructs:

    ${ MapName : LookupKey }
    ${ MapName : LookupKey | DefaultValue }

When such a construct occurs, the map *MapName* is consulted and the key
*LookupKey* is looked up. If the key is found, the map-function construct is
substituted by *SubstValue*. If the key is not found then it is substituted
by *DefaultValue* or by the empty string if no *DefaultValue* was specified.

For example, you might define a RewriteMap as:

    RewriteMap examplemap txt:/path/to/file/map.txt

You would then be able to use this map in a RewriteRule as follows:

    RewriteRule ^/ex/(.*) ${examplemap:$1}

3) Advanced Rewrites

Filesystem Reorganization

Description: This really is a hardcore example: a killer application which
heavily uses per-directory RewriteRules to get a smooth look and feel on the
Web while its data structure is never touched or adjusted.

    drwxrwxr-x  2 netsw users 512 Aug  3 18:39 Audio/
    drwxrwxr-x  2 netsw users 512 Jul  9 14:37 Benchmark/
    drwxrwxr-x 12 netsw users 512 Jul  9 00:34 Crypto/
    drwxrwxr-x  5 netsw users 512 Jul  9 00:41 Database/
    drwxrwxr-x  4 netsw users 512 Jul 30 19:25 Dicts/
    drwxrwxr-x 10 netsw users 512 Jul  9 01:54 Graphic/
    drwxrwxr-x  5 netsw users 512 Jul  9 01:58 Hackers/
    drwxrwxr-x  8 netsw users 512 Jul  9 03:19 InfoSys/
    drwxrwxr-x  3 netsw users 512 Jul  9 03:21 Math/
    drwxrwxr-x  3 netsw users 512 Jul  9 03:24 Misc/
    drwxrwxr-x  9 netsw users 512 Aug  1 16:33 Network/
    drwxrwxr-x  2 netsw users 512 Jul  9 05:53 Office/
    drwxrwxr-x  7 netsw users 512 Jul  9 09:24 SoftEng/
    drwxrwxr-x  7 netsw users 512 Jul  9 12:17 System/
    drwxrwxr-x 12 netsw users 512 Aug  3 20:15 Typesetting/
    drwxrwxr-x 10 netsw users 512 Jul  9 14:08 X11/

Solution: The solution has two parts: The first is a set of CGI scripts which
create all the pages at all directory levels on-the-fly.
I put them under /e/netsw/.www/ as follows:

    -rw-r--r--  1 netsw users   1318 Aug  1 18:10 .wwwacl
    drwxr-xr-x 18 netsw users    512 Aug  5 15:51 DATA/
    -rw-rw-rw-  1 netsw users 372982 Aug  5 16:35 LOGFILE
    -rw-r--r--  1 netsw users    659 Aug  4 09:27 TODO
    -rw-r--r--  1 netsw users   5697 Aug  1 18:01 netsw-about.html
    -rwxr-xr-x  1 netsw users    579 Aug  2 10:33 netsw-access.pl
    -rwxr-xr-x  1 netsw users   1532 Aug  1 17:35 netsw-changes.cgi
    -rwxr-xr-x  1 netsw users   2866 Aug  5 14:49 netsw-home.cgi
    drwxr-xr-x  2 netsw users    512 Jul  8 23:47 netsw-img/
    -rwxr-xr-x  1 netsw users  24050 Aug  5 15:49 netsw-lsdir.cgi
    -rwxr-xr-x  1 netsw users   1589 Aug  3 18:43 netsw-search.cgi
    -rwxr-xr-x  1 netsw users   1885 Aug  1 17:41 netsw-tree.cgi
    -rw-r--r--  1 netsw users    234 Jul 30 16:35 netsw-unlimit.lst

The DATA/ subdirectory holds the above directory structure, i.e. the real
*net.sw* stuff, and gets automatically updated via rdist from time to time.
The second part of the problem remains: how to link these two structures
together into one smooth-looking URL tree? We want to hide the DATA/
directory from the user while running the appropriate CGI scripts for the
various URLs. Here is the solution: first I put the following into the
per-directory configuration file in the DocumentRoot of the server to rewrite
the announced URL /net.sw/ to the internal path /e/netsw:

    RewriteRule ^net.sw$ net.sw/ [R]
    RewriteRule ^net.sw/(.*)$ e/netsw/$1

The first rule is for requests which miss the trailing slash! The second rule
does the real thing.
And then comes the killer configuration which stays in the per-directory
config file /e/netsw/.www/.wwwacl:

    Options ExecCGI FollowSymLinks Includes MultiViews

    RewriteEngine on

    # we are reached via /net.sw/ prefix
    RewriteBase /net.sw/

    # first we rewrite the root dir to
    # the handling cgi script
    RewriteRule ^$ netsw-home.cgi [L]
    RewriteRule ^index\.html$ netsw-home.cgi [L]

    # strip out the subdirs when
    # the browser requests us from perdir pages
    RewriteRule ^.+/(netsw-[^/]+/.+)$ $1 [L]

    # and now break the rewriting for local files
    RewriteRule ^netsw-home\.cgi.* - [L]
    RewriteRule ^netsw-changes\.cgi.* - [L]
    RewriteRule ^netsw-search\.cgi.* - [L]
    RewriteRule ^netsw-tree\.cgi$ - [L]
    RewriteRule ^netsw-about\.html$ - [L]
    RewriteRule ^netsw-img/.*$ - [L]

    # anything else is a subdir which gets handled
    # by another cgi script
    RewriteRule !^netsw-lsdir\.cgi.* - [C]
    RewriteRule (.*) netsw-lsdir.cgi/$1

Some hints for interpretation:

1. Notice the L (last) flag and no substitution field ('-') in the fourth
   part
2. Notice the ! (not) character and the C (chain) flag at the first rule in
   the last part
3. Notice the catch-all pattern in the last rule

Reference: http://httpd.apache.org/docs/2.0/misc/rewriteguide.html
(SEE also the excellent sections on blocking annoying robots, and other
tricks.)

4) I would consider reorganizing your blog files into some consistent
structure, say an alphabetical layout, where wildcard rewrites will reduce
your total number of rewrites.

With a large number of rewrites, especially where a permanent (R=301)
redirect is used, I would ALWAYS USE HARD /etc/apache2 configuration files
pulled in with an Include statement. They are easier to back up, manage, and
grep through, and problems are easier to evaluate after a graceful restart
picks up new changes.

>> Thank you for your feedback.
>>
>> ------------------------
>> Keith Smith
>>
>> ---------------------------------------------------
>> PLUG-discuss mailing list - PLUG-discuss@lists.plug.phoenix.az.us
>> To subscribe, unsubscribe, or to change your mail settings:
>> http://lists.PLUG.phoenix.az.us/mailman/listinfo/plug-discuss

--
Skype: 6022393392
ATT: 5037544452
GV: 6923073392
Phoenix Linux Security Team
PLUG.PHOENIX.AZ.US
http://www.it-clowns.com
*"Great things are not done by impulse but a series of small things
brought together." -Van Gogh*
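
P.S. To make options 2 and 4 above a little more concrete, here is a minimal sketch of how the moved blog pages might be handled from one included configuration file instead of per-directory .htaccess files. Treat it as an illustration under assumptions, not a tested setup: the file names, the map name "blogmoves", and the sample map entry are all hypothetical, and it assumes each old page was reachable as /blog/<name>.

```apache
# Hypothetical file: /etc/apache2/conf.d/blog-redirects.conf
# Assumed to be pulled into the VirtualHost with:
#   Include /etc/apache2/conf.d/blog-redirects.conf

RewriteEngine on

# Plain-text map with one "key value" pair per line, for example:
#   my-first-post   /articles/my-first-post.html
# The map name, file path, and entry are made up for illustration.
RewriteMap blogmoves txt:/etc/apache2/maps/blog-moves.txt

# Look up the part after /blog/ in the map. The condition only matches
# (and captures the new location as %1) when the map has an entry.
RewriteCond ${blogmoves:$1} ^(.+)$
RewriteRule ^/blog/(.+)$ %1 [R=301,L]
```

Requests for blog paths that have no map entry fall through untouched, so other rules can still handle them. RewriteMap is only allowed in the server or VirtualHost configuration, not in .htaccess, which fits the advice to keep large redirect sets in the main config files.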