performance when using a .htaccess

keith smith klsmith2020 at yahoo.com
Mon Oct 25 15:55:51 MST 2010


Thank you for all this info!

------------------------

Keith Smith

--- On Fri, 10/22/10, Lisa Kachold <lisakachold at obnosis.com> wrote:

From: Lisa Kachold <lisakachold at obnosis.com>
Subject: Re: performance when using a .htaccess
To: "Main PLUG discussion list" <plug-discuss at lists.plug.phoenix.az.us>
Date: Friday, October 22, 2010, 8:18 PM



On Fri, Oct 22, 2010 at 4:00 PM, Lisa Kachold <lisakachold at obnosis.com> wrote:



On Fri, Oct 22, 2010 at 2:18 PM, keith smith <klsmith2020 at yahoo.com> wrote:




Hi,

I have a question about performance when using a .htaccess file.  I have read that having multiple .htaccess files, meaning a .htaccess file in each directory, can slow Apache down.



We have moved a ton of content, upwards of 900 pages.  About 600 of those have been moved from our blog, which was located in the directory /blog.  It was suggested that we break the .htaccess into files that reflect the content moved.  For example, put a .htaccess file in the /blog directory that covers all the content moved from the blog, instead of one big .htaccess file in the doc root directory that would contain 900 redirects.



Well, that's better than FollowSymlinks?

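The reason that managing multiple .htaccess files can be slow and difficult is that, wherever overrides are allowed, Apache2 looks for a .htaccess file in every directory along the path to the requested file on each request, and the directives in those files are inherited by the subdirectories below them.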
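
As a rough sketch of what that lookup means in practice (the document root /var/www/html is a placeholder, not a path from this thread): with AllowOverride None set in the main configuration, Apache does not look for .htaccess files under that directory at all, so any redirects then have to live in the server or VirtualHost configuration.

```apache
# Assumed document root; adjust to your own layout.
<Directory "/var/www/html">
    # With overrides disabled, Apache skips the per-request
    # search for .htaccess files in this tree.
    AllowOverride None
</Directory>
```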



A rewrite might actually be able to do exactly what you need.  Have you considered that?  Rewrite overhead is not huge, especially if you are caching for this /blog URL.
 
You simply enable mod_rewrite in Apache2 (procedure varies depending on your distro/version).

A mod_rewrite solution is ONE line entry in your configuration file for that VirtualHost (for instance):

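1) Here's a simple rewrite (provided your directory BLOG containing all of the 600 files can be trivially redirected to something like "newblog").


RewriteEngine  on
RewriteRule    ^/blog/(.*)$  /newblog/$1  [R=301,L]

This rewrites every URL under /blog/ to the same path under /newblog/ with a permanent (R=301) redirect.  RewriteBase is left out because it is only valid in per-directory (.htaccess or <Directory>) context, not in the VirtualHost itself.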
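
For comparison, here is a sketch of the same /blog to /newblog redirect written for a per-directory file instead of the VirtualHost. The file location /blog/.htaccess is an assumption for illustration. The main difference is that in per-directory context the directory prefix is stripped before the pattern is matched.

```apache
# Hypothetical /blog/.htaccess
# Needs AllowOverride FileInfo (or All) and Options FollowSymLinks
# to be in effect for this directory.
RewriteEngine on

# The leading "/blog/" has already been stripped here,
# so the pattern matches only the remainder of the path.
RewriteRule   ^(.*)$  /newblog/$1  [R=301,L]
```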

2) Use a RewriteMap which is loaded ONCE by Apache:

http://httpd.apache.org/docs/2.0/mod/mod_rewrite.html#rewritemap


The RewriteMap directive defines a Rewriting Map which can be used inside rule substitution strings by the mapping-functions to insert/substitute fields through a key lookup. The source of this lookup can be of various types.

The MapName is the name of the map and will be used to specify a mapping-function for the substitution strings of a rewriting rule via one of the following constructs:

    ${ MapName : LookupKey }
    ${ MapName : LookupKey | DefaultValue }

When such a construct occurs, the map MapName is consulted and the key LookupKey is looked-up. If the key is found, the map-function construct is substituted by SubstValue. If the key is not found then it is substituted by DefaultValue or by the empty string if no DefaultValue was specified.

For example, you might define a RewriteMap as:

    RewriteMap examplemap txt:/path/to/file/map.txt

You would then be able to use this map in a RewriteRule as follows:

    RewriteRule ^/ex/(.*) ${examplemap:$1}
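
Applied to the moved blog pages, a map-based setup might look roughly like the sketch below. The map name blogmoves, the map file path, and the sample entries are all made up for illustration. The RewriteCond keeps pages without a map entry from being redirected to an empty target.

```apache
# Sketch for the VirtualHost config (RewriteMap cannot be used in .htaccess).
# Hypothetical map file /etc/apache2/maps/blog-moves.txt,
# one "key value" pair per line:
#   first-post.html    /articles/2009/first-post.html
#   about-us.html      /company/about.html

RewriteEngine on
RewriteMap    blogmoves  txt:/etc/apache2/maps/blog-moves.txt

# Only redirect when the requested page has an entry in the map
RewriteCond   ${blogmoves:$1}  !^$
RewriteRule   ^/blog/(.+)$     ${blogmoves:$1}  [R=301,L]
```

With this layout, adding or changing a redirect means editing one line in the map file rather than adding another rule.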
      
3) Advanced rewrites: filesystem reorganization

Description:
This really is a hardcore example: a killer application which heavily uses per-directory RewriteRules to get a smooth look and feel on the Web while its data structure is never touched or adjusted.  The archive's data is kept in a directory structure like this:

drwxrwxr-x   2 netsw  users    512 Aug  3 18:39 Audio/
drwxrwxr-x   2 netsw  users    512 Jul  9 14:37 Benchmark/
drwxrwxr-x  12 netsw  users    512 Jul  9 00:34 Crypto/
drwxrwxr-x   5 netsw  users    512 Jul  9 00:41 Database/
drwxrwxr-x   4 netsw  users    512 Jul 30 19:25 Dicts/
drwxrwxr-x  10 netsw  users    512 Jul  9 01:54 Graphic/
drwxrwxr-x   5 netsw  users    512 Jul  9 01:58 Hackers/
drwxrwxr-x   8 netsw  users    512 Jul  9 03:19 InfoSys/
drwxrwxr-x   3 netsw  users    512 Jul  9 03:21 Math/
drwxrwxr-x   3 netsw  users    512 Jul  9 03:24 Misc/
drwxrwxr-x   9 netsw  users    512 Aug  1 16:33 Network/
drwxrwxr-x   2 netsw  users    512 Jul  9 05:53 Office/
drwxrwxr-x   7 netsw  users    512 Jul  9 09:24 SoftEng/
drwxrwxr-x   7 netsw  users    512 Jul  9 12:17 System/
drwxrwxr-x  12 netsw  users    512 Aug  3 20:15 Typesetting/
drwxrwxr-x  10 netsw  users    512 Jul  9 14:08 X11/


          
Solution:
The solution has two parts: The first is a set of CGI scripts which create all the pages at all directory levels on-the-fly. I put them under /e/netsw/.www/ as follows:

-rw-r--r--   1 netsw  users    1318 Aug  1 18:10 .wwwacl
drwxr-xr-x  18 netsw  users     512 Aug  5 15:51 DATA/
-rw-rw-rw-   1 netsw  users  372982 Aug  5 16:35 LOGFILE
-rw-r--r--   1 netsw  users     659 Aug  4 09:27 TODO
-rw-r--r--   1 netsw  users    5697 Aug  1 18:01 netsw-about.html
-rwxr-xr-x   1 netsw  users     579 Aug  2 10:33 netsw-access.pl
-rwxr-xr-x   1 netsw  users    1532 Aug  1 17:35 netsw-changes.cgi
-rwxr-xr-x   1 netsw  users    2866 Aug  5 14:49 netsw-home.cgi
drwxr-xr-x   2 netsw  users     512 Jul  8 23:47 netsw-img/
-rwxr-xr-x   1 netsw  users   24050 Aug  5 15:49 netsw-lsdir.cgi
-rwxr-xr-x   1 netsw  users    1589 Aug  3 18:43 netsw-search.cgi
-rwxr-xr-x   1 netsw  users    1885 Aug  1 17:41 netsw-tree.cgi
-rw-r--r--   1 netsw  users     234 Jul 30 16:35 netsw-unlimit.lst


The DATA/ subdirectory holds the above directory structure, i.e. the real net.sw stuff and gets automatically updated via rdist from time to time. The second part of the problem remains: how to link these two structures together into one smooth-looking URL tree? We want to hide the DATA/ directory from the user while running the appropriate CGI scripts for the various URLs. Here is the solution: first I put the following into the per-directory configuration file in the DocumentRoot of the server to rewrite the announced URL /net.sw/ to the internal path /e/netsw:

RewriteRule  ^net.sw$       net.sw/        [R]
RewriteRule  ^net.sw/(.*)$  e/netsw/$1


The first rule is for requests which miss the trailing slash! The second rule does the real thing. And then comes the killer configuration which stays in the per-directory config file /e/netsw/.www/.wwwacl:

Options       ExecCGI FollowSymLinks Includes MultiViews

RewriteEngine on

#  we are reached via /net.sw/ prefix
RewriteBase   /net.sw/

#  first we rewrite the root dir to
#  the handling cgi script
RewriteRule   ^$                       netsw-home.cgi     [L]
RewriteRule   ^index\.html$            netsw-home.cgi     [L]

#  strip out the subdirs when
#  the browser requests us from perdir pages
RewriteRule   ^.+/(netsw-[^/]+/.+)$    $1                 [L]

#  and now break the rewriting for local files
RewriteRule   ^netsw-home\.cgi.*       -                  [L]
RewriteRule   ^netsw-changes\.cgi.*    -                  [L]
RewriteRule   ^netsw-search\.cgi.*     -                  [L]
RewriteRule   ^netsw-tree\.cgi$        -                  [L]
RewriteRule   ^netsw-about\.html$      -                  [L]
RewriteRule   ^netsw-img/.*$           -                  [L]

#  anything else is a subdir which gets handled
#  by another cgi script
RewriteRule   !^netsw-lsdir\.cgi.*     -                  [C]
RewriteRule   (.*)                     netsw-lsdir.cgi/$1


Some hints for interpretation:

- Notice the L (last) flag and no substitution field ('-') in the fourth part.
- Notice the ! (not) character and the C (chain) flag at the first rule in the last part.
- Notice the catch-all pattern in the last rule.

Reference:  http://httpd.apache.org/docs/2.0/misc/rewriteguide.html  (See also the excellent sections on blocking annoying robots, and other tricks.)


4) I would consider reorganizing your blog files into some kind of structure, say a new alphabetical file layout, where wildcard rewrites will reduce your total number of rewrites.
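
To illustrate the wildcard idea, here is a sketch that assumes a made-up layout in which each old page /blog/<name> now lives under /articles/<first letter>/<name>. Neither path comes from this thread. One pattern then covers every page that follows the convention.

```apache
# VirtualHost context; the /articles/<letter>/ layout is hypothetical.
RewriteEngine on

# $1 = first character of the file name, $2 = the rest of it.
# /blog/apache-tips.html  ->  /articles/a/apache-tips.html
RewriteRule   ^/blog/([a-z0-9])([^/]*)$  /articles/$1/$1$2  [R=301,L]
```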

With a large number of rewrites, especially where a permanent (R=301) redirect is used, I would ALWAYS USE HARD /etc/apache2 configuration files pulled in with an Include statement.  They are easier to back up, manage, and grep through, and it is easier to evaluate problems after a graceful restart to pick up new changes.
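
A minimal sketch of that arrangement follows. The host name, document root, file name blog-redirects.conf, and the sample redirect are placeholders, and the control script is apache2ctl on Debian-style systems but apachectl on others.

```apache
<VirtualHost *:80>
    ServerName   www.example.com
    DocumentRoot /var/www/html

    # Keep the bulk redirects in their own file (hypothetical path)
    Include /etc/apache2/redirects/blog-redirects.conf
</VirtualHost>

# Contents of /etc/apache2/redirects/blog-redirects.conf, one line per moved page:
#   Redirect permanent /blog/first-post.html http://www.example.com/articles/first-post.html
#
# After editing, check the syntax and reload without dropping connections:
#   apache2ctl configtest && apache2ctl graceful
```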


 



Thank you for your feedback. 

------------------------

Keith Smith


      
---------------------------------------------------

PLUG-discuss mailing list - PLUG-discuss at lists.plug.phoenix.az.us

To subscribe, unsubscribe, or to change your mail settings:

http://lists.PLUG.phoenix.az.us/mailman/listinfo/plug-discuss




-- 
Skype: 6022393392

ATT:     5037544452
GV:      6923073392
Phoenix Linux Security Team   PLUG.PHOENIX.AZ.US
http://www.it-clowns.com


"Great things are not done by impulse but a series of small things brought together." -Van Gogh



















