Rain, Rain, More Rain…
This has been continually the case since about Saturday afternoon…very humid…and very rainy. Several (somewhat severe) storms have passed through the region since the 21st…and it doesn’t look to be stopping anytime soon (at least through the end of the week). This has made for a somewhat dull outlook on activities; however, a little rain must fall anyway.
Hopefully this will eventually subside for some time and come again when it’s necessary.
BAD Robots! BAD! BAD
After taking another peek at my usage statistics for this domain, I’m again somewhat upset at the TurnitinBot for its non-stop plagiarism quest on my website here. This bot, together with its several brothers and sisters (of the same type, but from different origins at turnitin.com), manages to rack up several thousand hits each month on random things…none of which are necessary, IMHO.
So, I took a closer look at the robots I dislike visiting me. I’ve had a robots.txt file set up for a long time, and it has been quite successful in keeping crawlers away from certain things which really don’t need to be indexed (such as the differences feature of TWiki, which is a really processor-intensive action). After that, I decided to take some suggestions on mod_rewrite and add it to my list of ‘features’ when it comes to preventing weird things.
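To make the robots.txt part concrete, here is a minimal sketch of how a well-behaved crawler would read such a file, using Python’s standard urllib.robotparser. The rules and paths below (including the /twiki/bin/rdiff/ location for the differences feature) are assumptions for illustration, not a copy of the real file on this site, and is_allowed is a made-up helper name.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents; the paths are assumptions for illustration.
ROBOTS_TXT = """\
User-agent: *
Disallow: /twiki/bin/rdiff/
Disallow: /twiki/bin/search/
"""


def is_allowed(user_agent: str, url_path: str) -> bool:
    """Return True if the rules above let this user agent fetch the path."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network access is needed here.
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url_path)


if __name__ == "__main__":
    for path in ("/twiki/bin/view/Main/WebHome", "/twiki/bin/rdiff/Main/WebHome"):
        print(path, is_allowed("SomeCrawler", path))
```

Of course, this only helps with crawlers that bother to check robots.txt in the first place; the ones that ignore it need a firmer hand.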
I’ve only permanently banned four bots (who shall remain nameless) which consistently nag me with very strange requests (some of which are for things listed as ‘don’t look at’ in robots.txt)…most of which are TWiki related. Hopefully this works to save some bandwidth…and also works to make my life a little easier.
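The idea behind the ban boils down to matching the User-Agent header and refusing the request. In mod_rewrite this is usually done with a RewriteCond on %{HTTP_USER_AGENT} plus a RewriteRule with the F (forbidden) flag; the Python below is only a rough sketch of the same matching logic, not the actual rules used here. The bot names ExampleBadBot and AnotherBadBot and the function status_for are hypothetical.

```python
import re

# Hypothetical bot names; the real banned bots are deliberately not listed.
BANNED_AGENTS = re.compile(r"ExampleBadBot|AnotherBadBot", re.IGNORECASE)


def status_for(user_agent: str) -> int:
    """Return 403 for a banned User-Agent string, 200 for everything else."""
    if BANNED_AGENTS.search(user_agent or ""):
        return 403  # Forbidden: the request is refused before any page is built
    return 200


if __name__ == "__main__":
    print(status_for("Mozilla/5.0 (compatible; ExampleBadBot/1.0)"))
    print(status_for("Mozilla/5.0 (X11; Linux) Firefox"))
```

Matching on a substring rather than the whole header is deliberate in this sketch, since bots tend to tack version numbers and URLs onto their names.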
While I was doing this research, though, I also noticed and took action on another potentially Bad Thing. The stock TWiki Templates for viewing have the META tag of ‘robots’ equal to ‘noindex’. This is behavior I really do want in some areas, but not in others. For instance, now that I’ve moved all sorts of things over to the TWiki side (for my personal site and soon to be with the Python site), it would be a Bad Thing for the [good] crawlers to not index this new location and material, as it’d been indexed previously. Granted, my intention (and action) with the move is to provide a redirect from the original URL directly to the TWiki WebHome of the new location (which will at least point the old links to the home of the new location), but these old references will eventually die, and at that time (theoretically) so will my crawler indexes.
So, to prevent this: both this site and my Python site (once unveiled to the public) should really be ‘crawlable’, and each of these has its own template, unrelated to the stock view template, so I’ve removed this noindex reference from them for good measure. It’s still there for the other templates (even for the other customized ones like search, which I don’t want indexed) and for the stock templates, which is good because I really don’t need the TWiki web and Knowledge Base web indexed (they’re the same as every other TWiki installation). But it should allow the good crawlers to do their thing.
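A quick way to double-check which pages still carry the tag is to look for the robots META element in the rendered HTML. The following is a small sketch using Python’s standard html.parser; the class RobotsMetaFinder, the function is_indexable and the sample markup are all made up for illustration and are not taken from the TWiki templates themselves.

```python
from html.parser import HTMLParser


class RobotsMetaFinder(HTMLParser):
    """Collects the directives found in <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() == "robots":
            content = attr.get("content") or ""
            # A content value may hold several comma-separated directives.
            for part in content.split(","):
                if part.strip():
                    self.directives.add(part.strip().lower())


def is_indexable(html: str) -> bool:
    """Return False if the page asks crawlers not to index it."""
    finder = RobotsMetaFinder()
    finder.feed(html)
    return "noindex" not in finder.directives


if __name__ == "__main__":
    stock = '<html><head><meta name="robots" content="noindex"></head></html>'
    custom = "<html><head><title>WebHome</title></head></html>"
    print(is_indexable(stock), is_indexable(custom))
```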
The Move to TWiki
Some more advances have been made with regard to the move to TWiki for the Monty Python site (now called Matt’s Monty Python Repository). I’ve quickly come up with a new logo for the site (the old one was just funny-looking text and was really out of date)…which isn’t the greatest, but will do for now. I’ve also set up the TWiki templates for the site and done the basic layout customization.
What this ultimately means is that I now just have to move the content over from the existing pages (no small task in itself) and convert the formatting to something TWiki-friendly.
But, progress is being made…and that’s a Good Thing.
This post was upgraded to the MZ Online Blog on 8/20/07