Hi Jim:

On 7/13/09, Jim March <1.jim.march@gmail.com> wrote:
> Folks,
>
> I have a friend who runs a website. Every night he looks at the logs
> and checks to see where people are linking in from - usually
> discussion forums.
>
> He's got a regular trickle of incoming from a website that doesn't
> seem to exist:
>
> http://www.alchemistsrroom.us
>
> Drop one "r" from "rroom" and you do get a valid site, but it involves
> aromatherapy. His site relates to a high-end handgun sight...that
> would be an odd linkage.
>
> Something else: I didn't know this, but people who mess around with
> homebrew explosives call themselves "alchemists", so there's obviously
> more of a cross-linked interest THERE than with aromatherapy.
>
> I've run "whois" searches on "alchemistsrroom.us" plus tried to go to
> the .com, .net, .org, .edu and even .gov versions of the same thing.

Not all DNS queries will properly find the us. sites:

The .US sites can be registered for free:

An alternative is to register in one of the other domains - for U.S.
residents the us top-level domain. Each country has its own top-level
domain and may or may not charge fees. Non-US residents will have to
research the details for their own country of residence.

The US Domain is an official top-level domain in the Domain Name Service
(DNS) of the Internet community. It is administered by the US Domain
Registry at the Information Sciences Institute of the University of
Southern California (ISI), under the Internet Assigned Numbers Authority
(IANA).

The levels under the us domain are geopolitically based. The hosts in
California are in the second-level domain ca.us, the hosts in Minnesota
are in mn.us, and so on. The geopolitical third-level domains are usually
city names. For example, San Jose, CA translates to san-jose.ca.us.
Abbreviations were allowed at one time and are supported for existing
registered hosts.
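
To make the naming scheme above concrete, here is a minimal Python sketch
that builds a locality name from a city and a state code. The function name
locality_domain is my own invention for illustration, and it only covers the
plain city.state.us case from the San Jose example, not the abbreviations or
the special domains.

```python
def locality_domain(city: str, state_code: str) -> str:
    """Build a locality name of the form <city>.<state>.us.

    Hypothetical helper: it only lower-cases the input and joins the
    words of the city name with hyphens, as in the San Jose example.
    """
    city_label = "-".join(city.lower().split())
    return f"{city_label}.{state_code.lower()}.us"


print(locality_domain("San Jose", "CA"))  # san-jose.ca.us
```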

There are other special third-level and fourth-level domains for schools
(k12), city governments (ci), county governments (co), community colleges
(cc), etc., but these usually don't apply to most personal Linux hosts.

They are looked up differently:

$ nslookup -query=ns us

RFC 1480 explains US domains more completely:

http://ftp.isi.edu/in-notes/rfc1480.txt

So the us domain itself exists. The host name, however, does not resolve:

; <<>> DiG 9.4.3-P1 <<>> www.alchemistsrroom.us
;; global options: printcmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NXDOMAIN, id: 5651
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 1, ADDITIONAL: 0

;; QUESTION SECTION:
;www.alchemistsrroom.us. IN A

;; AUTHORITY SECTION:
us. 900 IN SOA a.gtld.biz. hostmaster.neustar.biz. 2003659686 900 900 604800 86400

;; Query time: 147 msec
;; SERVER: 204.13.248.75#53(204.13.248.75)
;; WHEN: Mon Jul 13 18:48:38 2009
;; MSG SIZE rcvd: 105

I do get an SOA record in the authority section, for neustar.biz. Neustar
runs the us registry, so that only tells us who is authoritative for the
zone, not where this traffic is originating. The status is NXDOMAIN: there
is no A record for the name, probably because it was never entered in the
domain.

An HTTP spider or site-scraper script can announce itself as coming from a
bogus site. Nikto/Wikto and Metasploit do this well. What do your Apache
logs report for it?

http://www.webmasterworld.com/search_engine_spiders/3777664.htm

If you are logging by IP address and resolving to DNS names afterwards on
that server, you should be able to run nslookup on the IP to match your
Apache log entries and find the originating IP address. You might have to
turn up the log level, however.

There's a good probability that you are only seeing that hostname because
the script crafted it into the header, but I don't have complete
information to be certain. Adding a good Apache log analysis tool like
awstats is a good start.
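
If you'd rather script the check than run dig by hand for every odd
referrer, something along these lines might do. It is only a sketch: the
function names referrer_host and resolves are made up for this example, and
the resolver argument exists purely so the lookup can be swapped out. The
real work is done by the standard library's socket.getaddrinfo, which raises
socket.gaierror when a name does not resolve.

```python
import socket
from urllib.parse import urlsplit


def referrer_host(referrer: str):
    """Return the host name part of a referrer URL, or None if there is none."""
    return urlsplit(referrer).hostname


def resolves(hostname: str, resolver=socket.getaddrinfo) -> bool:
    """Return True if the resolver finds at least one address for hostname.

    The resolver parameter is an assumption of this sketch; by default it
    is socket.getaddrinfo, which uses the system's DNS configuration.
    """
    try:
        return bool(resolver(hostname, None))
    except socket.gaierror:
        # Name lookup failed (for example NXDOMAIN).
        return False
```

Calling resolves(referrer_host("http://www.alchemistsrroom.us")) would then
report whether the name in the log resolves from the machine you run it on.
A False result only says the name is not in DNS right now; it does not tell
you who sent the request.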

The HTTP Referer header may be legitimately missing for the following
reasons:

1) The visitor typed in the address
2) The link was loaded by JavaScript on the referring page
3) The user is behind a corporate or ISP caching proxy (and has no choice
   about it)
4) The user is running "internet security" software which blocks referrers
   (often by default)
5) The user is behind a firewall (home or corporate) which blocks the
   Referer header
6) The client is a search engine robot
7) The user kicked off via a bookmark from the browser

(There are probably several other common cases.)

HTTP/1.1 specifically prohibits sending a Referer header unless a single
originating HTTP-accessible URL can be provided. So the lack of a referrer
is to be expected in all but cases 4 and 5 above. And in those cases, the
referrer is being suppressed out of a concern (right or wrong) for
security. Although we might not like that, it is best to remember why we
call the requester the client, and call the machine that runs our Web site
the server: on the Web, the client is in charge. They are the customers,
and we are the waiters, and the trick is to give the bad customers the
boot without offending --or even disturbing-- the good customers.

Clearly, blocking by blank referrer is not a good idea. However, if both
the referrer and the user-agent strings are blank, it's a good bet there's
a problem (there are a few exceptions, but not many, and most involve
favicon.ico requests).

If either the referrer or the user-agent is a literal hyphen ("-"), then
you can be sure the request is unfriendly. Some baddies send hyphens in
these headers specifically to bypass the "blank referrer and UA" tests
while still appearing as otherwise-normal accesses in the server logs.
This works because standard server logs show a quoted hyphen in the log
entry when the HTTP Referer or User-Agent header is blank. So this is a
cute trick, but one that's easy to detect.
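
The two tests above can be sketched in a few lines of Python. This is only
an illustration, not a drop-in filter: the function name classify_request
and the three labels it returns are invented for the example. Note that it
has to look at the header values as the server received them (None or empty
when the header was not sent), not at the access log, because in the log a
blank header and a literal hyphen both show up as "-".

```python
from typing import Optional


def classify_request(referer: Optional[str], user_agent: Optional[str]) -> str:
    """Classify a request by its Referer and User-Agent header values.

    Pass None (or an empty string) for a header the client did not send.
    The labels returned are hypothetical names for this sketch.
    """
    ref = (referer or "").strip()
    ua = (user_agent or "").strip()

    # A literal hyphen is never sent by a normal browser.
    if ref == "-" or ua == "-":
        return "literal-hyphen"

    # A blank referrer alone is fine; both blank is suspicious.
    if not ref and not ua:
        return "both-blank"

    return "ok"


print(classify_request(None, "Mozilla/5.0"))  # ok
print(classify_request("", ""))               # both-blank
print(classify_request("-", "Mozilla/5.0"))   # literal-hyphen
```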

Jim, you might want to take a look at some of the other attributes of
these requests. Look at all of the HTTP "Accept" headers and make sure
they match the expected values for each claimed user-agent. Examining the
"X-Forwarded-For" and "Via" proxy headers may be worthwhile as well. This
can be done by adding a bit of PHP or SSI code to the pages hit by these
user-agents, since these headers are not normally logged in standard
server access logs.

And could you post the actual log, please?

> So...first question is, why is this guy's server logs telling him
> links are coming in from a non-existent address?

The HTTP referrer can be proxied, spoofed, spidered, pharmed, or phished.
Check the firewall logs for the actual originating IP address at the time
of the access.

> Possibly related question: is there a way to mask alchemistsrroom.us
> somehow, possibly by running a non-standard port
> (http://alchemistsrroom.us:8081 or something?) If so, can we find it,
> and possibly locate an underground bomb-maker's forum or something?
>
> :)
>
> Jim
> ---------------------------------------------------
> PLUG-discuss mailing list - PLUG-discuss@lists.plug.phoenix.az.us
> To subscribe, unsubscribe, or to change your mail settings:
> http://lists.PLUG.phoenix.az.us/mailman/listinfo/plug-discuss
>

--
http://linuxgazette.net/164/kachold.html
(623)239-3392
Skype: obn0sis
(503)754-4452
www.obnosis.com