
Copyright © 2003-2007 Alternate Worlds Publishing, Boston MA USA


powered by blosxom -- www.blosxom.com
An important lesson of the Cultural Revolution is that the cult of personality must be opposed.
If I have been able to see further, it is because I am surrounded by midgets.
Never ascribe to stupidity that which can adequately be explained by malice.
"Your argument's repugnant and intriguing." "That's kinda my thing."

Danny's Weblog

Introduction

This weblog exists for the following reasons:

  • Vanity
  • Experimenting with website design features such as optimizing Google searches
  • It provides an annotated system of favorites links that I can access from anywhere

I hope you find it useful. You should be aware of the following hints on navigation:

  • Postings are presented in REVERSE chronological order
  • For a clickable list of topics, refer to the "Site map" in the left-hand navigation pane
  • If you click an upper-level topic, the system will *also* display lower-level topics, up to some limit
  • Changes in my site setup are listed under the "Chrome" topic.

I don't want people to download my entire site or hit the site frequently for any other reason. Do not set a newsreader, or any other robot, to hit the site more than once per day. Please consider the date-range links and subject links in the sidebar: you can probably get all you want in a single hit. If you are blocked you will probably not be re-enabled.


2007 Jan 10 [ Wed ]

Fixed resizing and print issues with MS Internet Explorer

I have now put in various kludges so that the screen display and printed output are reasonably clean in both Firefox and IE. I decided to make the banner image centered as that continues to look OK when the body text is resized, but I could not find a way to do that in .css and had to put in an HTML align=center; oh well.

Now that the print output via css is so good I will remove the link to create a printed version of each page, although I will probably leave the code undisturbed.

I am also making good progress on the download limit feature.

2007 Jan 09 [ Tue ]

Some display and print improvements

I had never been quite satisfied with some aspects of the layout so over the last couple of days I have dinked with it quite a lot using css. Most elements are now controlled only by css, although I have not tried to replace the basic table structure.

Although it is still not perfect, the pages now seem to respond much better to resizing (except that for some reason under IE6 changing the text size seems to have no effect – wtf?), given that I do not like text to display with too many characters per line. For that reason, if you try to increase page width, you just get more space on the right margin.

Additionally, I have set up a separate print css which makes a big improvement in the print output. As well as getting rid of the sidebar, it again responds better to resizing, at least under Firefox. Also, it finally implements a feature I have always wanted: while the screen display shows a nice short version of a link (so that the text display is not messed up), the printed version prints the full url (because it does no good to hover the mouse over a printed page). At least, it prints the full url up to the page width.

– Oops – it doesn't work in IE; the URL is completely missing. Oh well; I can't address that tonight.

Several features are probably not implemented very cleanly yet. I really need to re-read the Blosxom docs on how to implement plugins, as I have scattered some of the code needed for the recent modifications all over the place (which makes it hard to install updated plugins or indeed Blosxom itself).

2007 Jan 07 [ Sun ]

Added meaningful titles to my pages

I heard recently that Google thinks that all pages with the same title must be the same page. Perhaps this explains why Google only shows two hits for my site. So I resolved to make the titles vary.

Now, if you view a single posting, the title is the path to the posting, plus the title of that posting.

If you view a folder, the title is just the path. It's pointless or misleading to title the whole string of postings by the title of a single one. Anyway, I tell Google not to log those pages.

If you view a date range, the title is blank – oh well.
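The title rules above can be sketched in a few lines. This is only an illustration in Python, not the Perl that Blosxom actually runs; the function name page_title, its parameters, and the ": " separator are all made up for the example.

```python
def page_title(path, post_title=None, is_date_range=False):
    """Build a page title following the scheme described above."""
    if is_date_range:
        return ""                        # date-range views get a blank title
    if post_title:
        return f"{path}: {post_title}"   # single posting: path plus its title
    return path                          # folder view: just the path


print(page_title("Chrome", "Added meaningful titles to my pages"))
print(page_title("Asia/Cambodia"))
```

The point of the scheme is simply that no two kinds of page end up with an identical title string, apart from the date-range case that is still blank.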

Also, I have made considerable progress towards a much less lame download limit scheme: ie, I can at least track downloads on the fly, and will put in the limit when I can figure out something that makes sense.

2007 Jan 06 [ Sat ]

A few more fixes to my site's HTML

I made some more improvements. Last I checked, my main page shows 0 errors and my page for the whole of 2006 shows 4 errors. I probably won't do any more on this, unless I miraculously figure out a clean fix for my quote-handling problems. (If you see any pages with obvious quote problems please let me know.)

I fixed a "div inside p" problem with horizontal rules, and also cleaned up some display formatting which didn't actually work by converting to css.

Additionally, I checked pretty much the whole site for mistakes in quote handling, and fixed the original files (backdating them with touch -t so that they continue to show up on the same date) so that they don't trigger errors in my lame quote handling code.
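For anyone curious how the backdating works: Blosxom dates a posting by its file's modification time, so an edit has to be followed by putting the old timestamp back. The sketch below does the same job as touch -t, but in Python rather than the shell; the helper name edit_keeping_mtime is invented for the example.

```python
import os


def edit_keeping_mtime(path, new_text):
    """Rewrite a file, then restore its original timestamps."""
    before = os.stat(path)               # remember the times before editing
    with open(path, "w", encoding="utf-8") as f:
        f.write(new_text)
    # Put the old access and modification times back, so the posting
    # keeps showing up under its original date.
    os.utime(path, (before.st_atime, before.st_mtime))
```

Calling edit_keeping_mtime("story.txt", corrected_text) on a posting would fix its contents without moving it to today's date.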

Additionally, I had made a mistake yesterday in formatting my story headings just using styles. Doing so means that the pages could not be parsed for semantic content (if anyone cares) so I went back to using an h3 tag.

...and I just fixed the color of links, which I noticed had gone to black. I wonder when that started? I put it back to blue in the .css.

2007 Jan 05 [ Fri ]

Numerous minor fixes today

I've been intending to fix some of the bugs shown by the W3C validator: validator.w3.org [http://validator.w3.org/] for *years*, and today I finally had a hack attack.

The number of errors detected has fallen from over a hundred on some pages down to a handful. The following lists what I figured out so far:

1. Most of my effort was on handling the quotes problem. For a long time my quotes were not properly nested either inside or around paragraph tags. I've fixed most appearances of this bug but the fix is very lame and several conditions still cause it.

2. Additionally, I was just not aware that some tags are not allowed to be inside other tags. For instance, div inside p. The start of the div block causes an implicit close of the p block and then the dangling close-p causes an error. Maybe this is why a lot of people never close their p blocks.

Incidentally, this bug also afflicts code which I did not write. For instance, the plugin which changes a bunch of hyphens to a horizontal rule does so by inserting a div block, but that block gets wrapped in paragraph tags like everything else.

3. One of my Blosxom plugins (categories) was buggy and was inserting unnecessary ul tags. I installed a new version.

4. While initially setting up the templates I had thrown in a lot of tag parameters which W3C, not to mention the browser, does not like.

5. Also, the story template had numerous div/p problems.
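The "div inside p" problem from item 2 can be shown with a small checker. This is a rough Python sketch using the standard html.parser module, not the W3C validator's logic; the class name, the check function, and the short list of block tags are assumptions made for the illustration.

```python
from html.parser import HTMLParser

# A few block-level tags that implicitly close an open <p> (not a full list).
BLOCK_TAGS = {"div", "ul", "ol", "table", "hr", "h1", "h2", "h3", "p"}


class PNestingChecker(HTMLParser):
    def __init__(self):
        super().__init__()
        self.p_open = False
        self.closed_by = None
        self.problems = []

    def handle_starttag(self, tag, attrs):
        if self.p_open and tag in BLOCK_TAGS:
            self.p_open = False          # the block tag silently ends the <p>
            self.closed_by = tag
        if tag == "p":
            self.p_open = True
            self.closed_by = None

    def handle_endtag(self, tag):
        if tag != "p":
            return
        if not self.p_open:
            why = f" (the <p> was implicitly closed by <{self.closed_by}>)" if self.closed_by else ""
            self.problems.append("dangling </p>" + why)
        self.p_open = False


def check(html):
    checker = PNestingChecker()
    checker.feed(html)
    checker.close()
    return checker.problems


print(check("<p>some text<div>rule</div></p>"))
```

The div itself is not reported; the error only appears at the close-p that no longer has anything to close, which matches what the validator complains about.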

The appearance of some pages has changed slightly, especially around quoted text.

In addition to the above, I have changed the title tag so that it varies from page to page. This apparently helps Google to realize that your website has more than one page...

I have also changed my shell (setting "ignoreeof") so that I can use ^D (end of file) to set the end of expected parameters to a CGI program. (That took a surprising length of time to figure out; the tcsh does not seem to respond to ^D according to spec but actually swallows it unless you are at an empty prompt. The workaround if you don't want to set ignoreeof is to put in an empty string on the command line.)

2006 Nov 17 [ Fri ]

Disabled trackbacks again

Trackbacks are a feature of most blogging software; when someone puts a link to your site on his site, he can click a trackback link that you provide on that page which somewhat automagically informs your software.

Unfortunately spammers like to use trackbacks as a way to make you host a link to their site, thus increasing their Google PageRank.

Although I had never implemented the feature of automatically adding trackbacks to my blog pages, I have been noticing a huge number of hits to my trackback pages (huge relative at least to my pitiful number of real hits). I assumed it would die down when spammers realized that my software does not publish trackbacks and that my site has zero PageRank anyway, and is therefore useless to them. Then I noticed that the attempted comment spam to my writeback pages was random text. In other words, for whatever reason (and the intelligence agencies have a good motive), someone is just attempting to destroy the trackback feature.

Since as far as I can tell I have never received a *single* valid trackback, I have lazily decided to just disable them and I see no reason to ever re-enable them.

Wikipedia article on trackbacks: en.wikipedia.org [http://en.wikipedia.org/wiki/Trackback]

Discussion of trackback spam. Some of the posters make the point that the spammer may insert random text just to check whether the site is running a moderation filter, but I like the paranoid conspiracy idea much better: photomatt.net [http://photomatt.net/2005/01/05/trackback-spam/]

2006 Mar 03 [ Fri ]

Fixed width setting

A long time ago I was experimenting to try and fix long lines by setting the size of table elements. (It turns out browsers don't do what I wanted).

Anyway, I absent-mindedly left a "width" spec in that sometimes caused problems. Fixed – I hope that doesn't break anything.

Incidentally, I also changed the top banner as according to my logs people were clicking on it too much. Now it doesn't look so clickable.

2006 Feb 08 [ Wed ]

Fixed layout problem in the Thai-language folder

I'm happy to be getting a bunch of hits to my Thai-language folder, but when I checked what people were seeing I realized there was a layout problem: a couple of the files had wide lines, and the browser dutifully forces a wide screen display for the entire page.

I fixed the guilty pages and I hope the people who saw the wide version weren't too put off.

2006 Feb 07 [ Tue ]

Fixed bug with my link to articles for current month

For some time I've had a link that says it produces all the articles for the current month.

A couple of days ago I noticed it was actually producing the 50 most recent articles. I've fixed that now (probably).

Note that "current month" means "dated this calendar month" not "dated over the last 30 days".
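The difference between the two readings of "current month" is easy to show. This is a small Python sketch rather than the site's own code, and both function names are invented for the example.

```python
from datetime import date, timedelta


def in_current_month(article_date, today):
    """True if the article is dated in the same calendar month as today."""
    return (article_date.year, article_date.month) == (today.year, today.month)


def in_last_30_days(article_date, today):
    """The other reading: dated within the 30 days up to and including today."""
    return timedelta(0) <= today - article_date < timedelta(days=30)


today = date(2006, 2, 7)
article = date(2006, 1, 25)
print(in_current_month(article, today))  # False
print(in_last_30_days(article, today))   # True
```

An article from late January is recent, but it is not in February, so the "current month" link leaves it out.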

2005 Oct 13 [ Thu ]

Change in link display: now only sitename is shown

Up till now I have displayed the full URL for every link. This tends to mess up the screen display when a link is very long: because it has no whitespace, the browser refuses to wrap it and forces a very wide text column. I displayed the full URL for three reasons:

1. It allows you to read the link directly off a printed page

2. I was not sure how to implement sensing whether the page was being displayed in print mode

3. Even when you are viewing the page onscreen, it can be nice to see all the links without having to mouse over them (eg Lynx).

I finally decided to change the display because the excessively wide text column was really bugging me and probably very few people ever print pages out. I would still like to implement a feature where it would sense print mode, but actually I'm not sure what to do then even if I can sense it. Really I would like to implement print mode in css anyway.
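Shortening a link to its sitename amounts to pulling the host out of the URL. Here is a minimal Python sketch of that idea, assuming the label is just the host part; the function name link_label and the for_print switch are hypothetical, and the real change lives in the Blosxom setup, not in Python.

```python
from urllib.parse import urlparse


def link_label(url, for_print=False):
    """Return the short on-screen label, or host plus full URL for print."""
    host = urlparse(url).netloc
    return f"{host} [{url}]" if for_print else host


url = "http://www.panix.com/~dannyw/weblog/nolist/links01.html"
print(link_label(url))
print(link_label(url, for_print=True))
```

The short label wraps like any other word, so one long link can no longer force the whole text column wide.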

2005 Oct 08 [ Sat ]

Another attempt to put spam on my site

Someone who has evidently not checked my Google PageRank decided to post some spam. Little did he know I would immediately detect it.

He hit the following writeback pages, with the comments below.

Oh well. I suppose I'd better disable writebacks for a while.

Notes:

1. Clearly his comments are completely generic. They are also partly nonsensical and ungrammatical.

2. They include various links – the whole point of the spam. My system presents links from writebacks as plain text anyway, so having these nonfunctional links on my pages wouldn't help him even if my PageRank were ten times better than it is.

3. The really interesting thing, as you can see, is that he links to *multiple websites* and posts from *multiple IPs*. At a guess, he is a script kiddie who has used a vulnerability to install his software on those websites, and has used a vulnerability to take over multiple user machines. However he is actually clueless about the internet, as his attempt to use my absolutely negligible PageRank clearly shows!

4. His pattern of posting is a little strange. I don't understand why he hits some pages over and over again. I also don't understand why the user agent is different each time. Maybe his software automatically flips it for each new posting, in order to make the pattern harder to see.

5. Indeed, on a more heavily trafficked site (a site with *any* traffic) his attempted defacement would have been hard to spot.

6. *Do not follow* any of his links. Although this type of spam is usually used to increase the PageRank of the linked sites, he could well have installed exploits on those pages which will turn your browser into his bitch.

Asia/Cambodia/Miscellaneous/visitor01.wb
Asia/Cambodia/Miscellaneous/wetbathrooms01.wb
Asia/Cambodia/Miscellaneous/ppareas01.wb
Asia/Cambodia/Miscellaneous/oldvc01.wb
Asia/Cambodia/Miscellaneous/capital02.wb

name: Jordan Chapman skys.jp/blog/archives/200504/06-1228.php title: Jordan Chapman comment: I really liked your comments here. I hope you're going to update your site soon. bring heavy cream just to a boil: www.snowhill.org/weblog/Jason/000940.html , I finished the 6th ball excerpt: blog_name:

name: Christian Jones www.cosmicbuddha.com/blog/archives/001169.html title: Christian Jones comment: Excellent! I enjoyed reading your material. hours drive from where: www.hookt-up.com/wordpress/?p=567 , Small brain blog excerpt: blog_name:

name: austin adams www.wnyprogressreport.wnymedia.net/?p=2 title: benjamin armstrong comment: very interesting! i liked it! amazing 3d effect: skys.jp/blog/archives/200504/06-1228.php , port abuayar excerpt: blog_name:

name: Adam Baumann www.allucher.com/sato_blog/archives/2005/04/post_110.html title: Sean Cole comment: It's been a long time since I so enjoyed reading posts in the net. Two thumbs up! So without further delays: www.allucher.com/sato_blog/archives/2005/04/post_110.html , Small brain blog excerpt: blog_name:

name: Zachary Jones www.hookt-up.com/wordpress/?p=567 title: Christian Adams comment: Just letting you know - your site is fantastic! bring heavy cream just to a boil: mooshoopork.net/pork/index.php?p=154 , So without further delays excerpt: blog_name:

2005-10-08| 02:04:00| 222.107.19.143| Mozilla/4.0 (compatible; MSIE 5.01; Windows NT 5.0)| /~dannyw/weblog/Asia/Cambodia/Miscellaneous/visitor01.writeback| www.panix.com/~dannyw/weblog/Asia/Cambodia/Miscellaneous/
2005-10-08| 02:04:13| 222.107.19.143| Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; Hotbar 3.0)| /~dannyw/weblog/Asia/Cambodia/Miscellaneous/visitor01.writeback|
2005-10-08| 02:06:59| 219.250.217.228| Mozilla/4.0 (compatible; MSIE 5.5; Windows 98)| /~dannyw/weblog/Asia/Cambodia/Miscellaneous/wetbathrooms01.writeback| www.panix.com/~dannyw/weblog/Asia/Cambodia/Miscellaneous/
2005-10-08| 02:07:03| 219.250.217.228| Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)| /~dannyw/weblog/Asia/Cambodia/Miscellaneous/wetbathrooms01.writeback|
2005-10-08| 02:17:29| 219.93.174.106| Mozilla/4.0 (compatible; MSIE 6.0; Windows 98)| /~dannyw/weblog/Asia/Cambodia/Miscellaneous/ppareas01.writeback|
2005-10-08| 02:17:41| 198.20.55.71| Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; .NET CLR 1)| /%7Edannyw/weblog/Asia/Cambodia/Miscellaneous/ppareas01.writeback| www.panix.com/~dannyw/weblog/Asia/Cambodia/Miscellaneous/
2005-10-08| 02:17:47| 198.20.55.71| Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0)| /%7Edannyw/weblog/Asia/Cambodia/Miscellaneous/ppareas01.writeback|
2005-10-08| 02:35:03| 216.75.82.242| Mozilla/4.0 (compatible; MSIE 6.0; Windows 98)| /%7Edannyw/weblog/Asia/Cambodia/Miscellaneous/oldvc01.writeback|
2005-10-08| 02:35:06| 210.0.200.2| Mozilla/4.0 (compatible; MSIE 5.5; Windows NT 5.0)| /~dannyw/weblog/Asia/Cambodia/Miscellaneous/oldvc01.writeback| www.panix.com/~dannyw/weblog/Asia/Cambodia/Miscellaneous/
2005-10-08| 02:35:08| 210.0.200.2| Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0)| /~dannyw/weblog/Asia/Cambodia/Miscellaneous/oldvc01.writeback|
2005-10-08| 02:42:46| 80.58.4.107| Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0)| /~dannyw/weblog/Asia/Cambodia/Miscellaneous/capital02.writeback|
2005-10-08| 02:43:14| 204.196.142.41| Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0)| /~dannyw/weblog/Asia/Cambodia/Miscellaneous/capital02.writeback|
2005-10-08| 02:43:24| 80.58.11.42| Mozilla/4.0 (compatible; MSIE 5.01; Windows NT 5.0)| /~dannyw/weblog/Asia/Cambodia/Miscellaneous/capital02.writeback| www.panix.com/~dannyw/weblog/Asia/Cambodia/Miscellaneous/
2005-10-08| 02:43:57| 211.116.211.86| Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0)| /~dannyw/weblog/Asia/Cambodia/Miscellaneous/capital02.writeback|
2005-10-08| 02:44:01| 203.83.75.26| Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0)| /%7Edannyw/weblog/Asia/Cambodia/Miscellaneous/capital02.writeback|
2005-10-08| 03:04:33| 222.119.57.208| Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0)| /~dannyw/weblog/Asia/Cambodia/Miscellaneous/barsigns02.writeback| www.panix.com/~dannyw/weblog/Asia/Cambodia/Miscellaneous/
2005-10-08| 03:04:36| 222.119.57.208| Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0; .NET CLR 1)| /~dannyw/weblog/Asia/Cambodia/Miscellaneous/barsigns02.writeback|
2005-10-08| 03:12:12| 162.40.91.34| Mozilla/4.0 (compatible; MSIE 6.0; Windows 98)| /~dannyw/weblog/Asia/Cambodia/Miscellaneous/bags01.writeback|
2005-10-08| 03:12:19| 220.70.4.93| Mozilla/4.0 (compatible; MSIE 6.0; Windows 98)| /~dannyw/weblog/Asia/Cambodia/Miscellaneous/bags01.writeback|
2005-10-08| 03:12:33| 216.75.82.242| Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0; .NET CLR 1)| /%7Edannyw/weblog/Asia/Cambodia/Miscellaneous/bags01.writeback| www.panix.com/~dannyw/weblog/Asia/Cambodia/Miscellaneous/
2005-10-08| 03:13:21| 216.75.82.242| Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)| /%7Edannyw/weblog/Asia/Cambodia/Miscellaneous/bags01.writeback|
2005-10-08| 03:14:26| 220.73.107.241| Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)| /~dannyw/weblog/Asia/Cambodia/Miscellaneous/bags01.writeback|
2005-10-08| 03:15:19| 207.248.240.118| Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)| /~dannyw/weblog/Asia/Cambodia/Miscellaneous/bags01.writeback|
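A pattern like this is easier to see once the log is grouped by page. The following is a rough Python sketch, assuming the pipe-separated layout seen in the log (date, time, IP, user agent, path, optional referer); the function name summarize is made up, and the sample holds only four of the records.

```python
from collections import defaultdict
from urllib.parse import unquote

SAMPLE = """\
2005-10-08| 02:04:00| 222.107.19.143| Mozilla/4.0 (compatible; MSIE 5.01; Windows NT 5.0)| /~dannyw/weblog/Asia/Cambodia/Miscellaneous/visitor01.writeback| www.panix.com/~dannyw/weblog/Asia/Cambodia/Miscellaneous/
2005-10-08| 02:04:13| 222.107.19.143| Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; Hotbar 3.0)| /~dannyw/weblog/Asia/Cambodia/Miscellaneous/visitor01.writeback|
2005-10-08| 02:17:29| 219.93.174.106| Mozilla/4.0 (compatible; MSIE 6.0; Windows 98)| /~dannyw/weblog/Asia/Cambodia/Miscellaneous/ppareas01.writeback|
2005-10-08| 02:17:41| 198.20.55.71| Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; .NET CLR 1)| /%7Edannyw/weblog/Asia/Cambodia/Miscellaneous/ppareas01.writeback| www.panix.com/~dannyw/weblog/Asia/Cambodia/Miscellaneous/
"""


def summarize(log_text):
    """Map each page to the sets of IPs and user agents that hit it."""
    hits = defaultdict(lambda: {"ips": set(), "agents": set()})
    for line in log_text.splitlines():
        fields = [field.strip() for field in line.split("|")]
        if len(fields) < 5:
            continue                                  # skip malformed lines
        _date, _time, ip, agent, path = fields[:5]
        page = unquote(path).rsplit("/", 1)[-1]       # %7E and ~ count as the same
        hits[page]["ips"].add(ip)
        hits[page]["agents"].add(agent)
    return hits


for page, seen in summarize(SAMPLE).items():
    print(page, len(seen["ips"]), "IPs,", len(seen["agents"]), "user agents")
```

Several IPs hitting one writeback page within seconds, each time with a fresh user agent, is exactly the pattern described in notes 3 and 4.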

2005 Sep 24 [ Sat ]

Oops! Had a problem with my .../index.rss output -- fixed

A few weeks ago I hacked my main blosxom.cgi file so that the rss story template code included a call to the "foreshortened" plugin, instead of the main $body variable – so that the .rss "description" field contained just the start of the story, not the entire body.

Today I used wget from the command line, and the full body was still there!

Musing about time travel and aliens, I then realized that if you get to the .../index.rss page from the main .html page, the .html is actually a link to "...//index.rss" (note the two forward slashes). Normally my .htaccess file intercepts a call to the main page and transfers it to a special "short" version (for the latest 15 stories, instead of 50), but the regex does *not* match *two* slashes. So when I tested .../index.rss from the main page, it always went to the ordinary blosxom.cgi, which was fixed, not the "short" version, which was unchanged...
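The two-slash trap is easy to reproduce with any regex engine. The patterns below are hypothetical stand-ins written in Python; the real rule in the .htaccess file is not shown in this posting, so treat the exact paths as assumptions.

```python
import re

# Hypothetical patterns: one expects exactly one slash, one allows a run of them.
ONE_SLASH = re.compile(r"^/~dannyw/weblog/index\.rss$")
ANY_SLASHES = re.compile(r"^/~dannyw/weblog/+index\.rss$")

for path in ("/~dannyw/weblog/index.rss", "/~dannyw/weblog//index.rss"):
    print(path, bool(ONE_SLASH.match(path)), bool(ANY_SLASHES.match(path)))
```

A rule written like the first pattern silently lets the doubled-slash request fall through to the next handler, which is why the two test routes ended up running different scripts.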

Anyhow I imagine most people will have been receiving the long (full-text) version of my .rss up till now – but now they will get the nice short version (just 5k).

2005 Sep 18 [ Sun ]

I deleted my links page -- oops -- got it back

A couple of days ago I absent-mindedly deleted my links page: www.panix.com [http://www.panix.com/~dannyw/weblog/nolist/links01.html]

In my filesystem it's actually called links01.txt, and I had been editing a *separate* file *also* called links01.txt. Bad Danny.

Panix has a file server with the "snapshot" feature, so when I realized the next day what had happened I thought I could find it easily – but I couldn't find it!

When I begged Panix for help, they promptly explained that the snapshot system stores the symbolic link to public_html as *a link to the current version of public_html*, not as a link to the snapshot of public_html! They were able to cd to the right directory (somehow... as I think about it I'm less and less sure how that actually works...) and get me back the file.

It reminds me of my old web host where if you were in public_html and went up a directory, you landed in the users directory of the webserver, not your own home directory. Confusion ensued.

2005 Aug 16 [ Tue ]

Changed format of my rss feed

Since I set up this site, I have left the format of my rss feed as the default provided by Blosxom: ie the contents of the "description" field were *the entire contents of the posting*. To me, this never seemed like the way rss is really supposed to work, but I let it go (I wasn't confident that I knew what people *wanted* as a feed, as for instance I have actually never set up an rss *client*, partly because I think polling is so stupid).

When Beth complained about getting error messages from my RSS a couple of days ago, I decided to clean up the RSS output. I installed the "foreshortened" plugin, which creates a variable containing the first sentence of the story, and put in a call to that plugin in blosxom.cgi (not blosxom-short.cgi, which handles requests for the blog homepage only, because it does not handle *.rss requests).

It now seems to be working – although I did have to munge the foreshortened plugin slightly, because I use a plugin which outputs some HTML character entities in the stories, so I just added a regex which deleted them.
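In outline, the trimming works something like the sketch below. This is Python written for illustration, not the plugin's Perl; the function name foreshorten, the entity regex, and the rule for where a sentence ends are all assumptions.

```python
import re


def foreshorten(body):
    """Return roughly the first sentence of a story, with HTML entities removed."""
    text = re.sub(r"&#?\w+;", "", body)            # delete entities like &mdash; or &#8211;
    text = re.sub(r"\s+", " ", text).strip()       # tidy the whitespace left behind
    match = re.search(r"^.*?[.!?](?=\s|$)", text)  # up to the first sentence end
    return match.group(0) if match else text


print(foreshorten("Fixed the feed &mdash; finally. More text follows here."))
```

Deleting the entities outright is crude (an entity standing between two words would glue them together), but it keeps stray markup out of the feed's description field.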

I am hoping the cleaned-up XML will fix Beth's problem. I am also hoping it will save bandwidth both for me and for my users (both of you).

Incidentally, I'm guessing that most people who grabbed my RSS feed had *no idea* they actually *already* held the full text of my articles. Their clients probably trim the text in each description field *anyway*. They may have noticed that the feed could be as much as 100 kB, and just figured "oh well, XML is a bloated format" – that's what I always say.

2005 Aug 08 [ Mon ]

Added link for privacy policy to main pages

I have now put in the html for the link to my privacy policy, so that it should show up on all my .html pages. I still haven't checked it under IE, but we'll see...

2005 Aug 07 [ Sun ]

Added a Privacy Policy to the site for IE users

I still do not really understand how a "privacy policy" is supposed to work, except that it is intended to make it harder for the average user to implement effective security. Still, I went over to http://www.p3pwiz.com

Now I have installed the files they generated at (hopefully) the right places:

Human readable policy: www.panix.com [http://www.panix.com/~dannyw/privacy.html]

Machine-readable policy: www.panix.com [http://www.panix.com/~dannyw/weblog/w3c/p3p.xml]

I think I probably have to add a header to each page, along these lines, but have not done so yet:

 <link rel="P3Pv1" href="http://www.panix.com/~dannyw/weblog//w3c/p3p.xml"> 

The validator said there were no format errors: www.w3.org [http://www.w3.org/P3P/validator.html]

I haven't actually tried it with IE yet. I have a nasty feeling that this privacy policy stuff (like robots.txt) was intended to cover entire domains, not picayune user webpages like mine... but we'll see. Hopefully that link in the header will get people where they need to go.

Since I am not in fact a huge e-commerce site which sells all my users' info to the highest bidder, I don't think you have to worry too much. On the other hand, in order to cover my very minor efforts to play with cookies and track users I had to assent to privacy clauses which basically covered me for selling everything up to the spleen of your unborn child. It confirms my impression that the privacy policy scheme was intended to confuse and stupefy regular users – it certainly did me.

Incidentally, when I just rechecked the cookies stored in my browser, I see that "p3pprivacy.com", which sent me to p3pwiz, actually set a bunch of cookies with *no hostname set*. I wonder if they cover that in *their* privacy policy? Bwahaha.

Peculiar Blosxom cookie bug

I had left the Blosxom cookie module enabled in the plugins directory although it did not seem to be doing anything.

Today I realized that if I navigated around the website something was causing the browser to store cookies with the name of the current directory and a strange hex hash as data. After suspecting my own cookie generating code for a long time, it occurred to me to disable the cookies module, and the strange excess cookies stopped.

I might have left it in, except that those strange cookies would *never go away*. A user who extensively browsed the site might well exceed the maximum number of cookies.

Additionally – not that this was very important – the "path" was set to "/", meaning any *other* Panix user website could access my cookies off the user's browser – suboptimal.
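The effect of the "path" attribute can be seen by building the header both ways. This is a Python sketch using the standard http.cookies module, not the Blosxom plugin; the cookie name "visits" and the helper make_cookie are invented for the example.

```python
from http.cookies import SimpleCookie


def make_cookie(name, value, path):
    """Return a Set-Cookie header line restricted to the given path."""
    cookie = SimpleCookie()
    cookie[name] = value
    cookie[name]["path"] = path
    return cookie.output()


# Path "/" is sent back with every request to the host, whoever owns the page.
print(make_cookie("visits", "3", "/"))
# A narrower path keeps the cookie to pages under one user's directory.
print(make_cookie("visits", "3", "/~dannyw/"))
```

On a shared host, where every user's site lives under the same hostname, the narrower path is the only thing separating one user's cookies from another user's scripts.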

I have added cookies to this site

In an attempt to figure out how cookies work I have just gone ahead and added them to this site, *without* figuring out IE's "privacy policy" thing, which delayed me for a long time: www.panix.com [http://www.panix.com/~dannyw/weblog/Chrome/cookies01.html]

I have not figured out how to use Blosxom's cookies plugin. It looks right now as if panix's Apache setup does not allow you to rewrite the header – which contains the cookies – without naming the script nph something. At any rate, any attempt to actually call the cookies plugin gets me a 500 error. So instead, I am using a completely separate Perl script that just gets called from any .html page.

I don't actually *use* them for anything much, yet. You can certainly still browse the site with cookies turned off – but your IE may complain about the absence of a privacy policy.

2005 Aug 02 [ Tue ]

Some minor changes to site

As well as fixing yet another .htaccess bug (you can't have multiple RewriteRules depending from a single RewriteCond!!) I tried to clean up the main template page to use css for font sizes and general colors, instead of inline definitions.

The pages now look slightly different but it is not significant.

I have not checked the pages in IE yet so they may look a little odd...

2005 Jul 29 [ Fri ]

Still trying to restrict robots

My previous attempts to restrict robots using .htaccess did not work: I guess I had gotten the syntax wrong so everything I tried made the server give a 500 error. I kept thinking I could fix it but after a while it was just too embarrassing, so I never admitted it before.

I recently found a much more helpful page than anything I had seen before: httpd.apache.org [http://httpd.apache.org/docs/1.3/mod/mod_rewrite.html]

I had previously been searching for terms like "htaccess" because it seemed to me I was looking for how to use variables inside htaccess. I guess that such things are actually *only* used inside htaccess blocks relating to mod_rewrite. I may have glanced at the page before but there seems to be a lot of forbidding stuff before you get to the meat of the syntax, so I may have skipped right by it!

Anyhow, there are a lot of essential details, eg the difference between %1 and $1 variables, and how to do ORs and ANDs. At present my changes to .htaccess seem to be working OK (although there are a few more unwanted access modes that I still haven't nailed down). It does seem to be disallowing things called ".*bot.*" from viewing .*\.prn pages, for instance.
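The test that the rule performs can be written out as two regex matches joined by an AND. This is a Python sketch of the logic only, not mod_rewrite syntax; the function name is_forbidden is made up, and whether the real rule ignores case is an assumption.

```python
import re

BOT_AGENT = re.compile(r".*bot.*", re.IGNORECASE)  # user agents that look like robots
PRINT_PAGE = re.compile(r".*\.prn$")               # the printable versions of pages


def is_forbidden(user_agent, path):
    """True when a robot-like user agent asks for a .prn page."""
    return bool(BOT_AGENT.match(user_agent)) and bool(PRINT_PAGE.match(path))


print(is_forbidden("Googlebot/2.1", "/~dannyw/weblog/Chrome/limit12.prn"))
print(is_forbidden("Googlebot/2.1", "/~dannyw/weblog/Chrome/limit12.html"))
```

In mod_rewrite terms the user-agent test would be the condition and the path test would be the rule's own pattern, which is where the AND comes from.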

2005 Mar 25 [ Fri ]

I removed the trackbacks feature

Since I never understood it, and as far as I could tell nobody ever used it, I have disabled the trackbacks feature, which hopefully will cut down on Google hits by about 25%.

2005 Mar 18 [ Fri ]

The reason why I'm getting so many hits to pppics02

My original article on this topic: www.panix.com [http://www.panix.com/~dannyw/weblog/Chrome/limit12.html]

I checked out the referer on one of the hits: icq.hot-news.org/fromwmv.html

This brought up a page where the text of my article had been scrambled together with many random words, apparently so it would be impossible for me to find the text by a Google search. It did include a link to my original page at this site. Then that page closed itself.

It took me to a page which displayed in a window with *no controls*. I couldn't resize it, or get to any other IE features. To give you some idea, the title of the window was: "Original amateur and swinger sex photos and videos".

When I hover over the tab in the taskbar, IE reveals: connect2cash.biz/new2/hta.php?account=adv367 (I intend this link to be non-clickable)

The page says eg "Les membres de Adult Friendfinder proche de Phom Penh" and "Find a real sex partner in Phnom Penh now!" There are many pictures that show a *lot* of skintone... none of it, as far as I can tell, Asian. I'm assuming that whoever did this probably added my text to many, many similar websites.

This is the sort of thing I feared. Some twerp uses my text to entice the hapless to his stupid sex website. In one way, it's cool that his site detects that this workstation is in Phnom Penh. On the other hand.. what kind of lamer lives in Phnom Penh and has to look for *internet* porn?? Also, he is very *unlikely* to be interested in my site when he haplessly arrives at it (even more than most people).

Incidentally, I wish Google would clamp down on this sort of thing. I suppose it tries: by checking the sample text surrounding the search term, you can usually be warned by text like:

gangbangers ferrari hot action two-on-one Windows XP SP2 problems freesex babes

Still, many search terms these days produce *dozens* of hits like that before you get to anything useful.

Later: when I closed the window with no controls (other than a close button) it brought me back to the original. For your delight, here is the text which somehow enticed hundreds of hits to my site:

My laptop how to remove drm from wmv seems how to remove drm from wmv underpowered and glitchy for video. I had a green day - boulevard of broken dreams lyrics badexperience with how to remove drm from wmv the Quicktime player which came how to remove drm from wmv with my Minoltadigital how to remove drm from wmv camera: it how to remove drm from wmv was amazingly intrusive, inappropriate lyrics search and when I triedto remove how to remove drm from wmv it, it kept reinstalling how to remove drm from wmv itself how to remove drm from wmv like how to remove drm from wmv malware.When how to remove drm from wmv I finally got how to remove drm from wmv rid of how to remove drm from wmv it, it had how to remove drm from wmv left so ask jeeves google agreement much poop in theregistry that I was toxic instrumental britney spears unable to play my camera's how to remove drm from wmv Quicktime how to remove drm from wmv recordingsfor how to remove drm from wmv several months. (I'm not even sure how to remove drm from wmv what how to remove drm from wmv fixed that problem how to remove drm from wmv . nursing jokes I how to remove drm from wmv thinkafter I installed a ebay auction builder service pack, something mapquest new zealand which had not fixedthe problem when I tried bittorrent sites it before finally worked.)I *still* don't how to remove drm from wmv have how to remove drm from wmv a way how to remove drm from wmv of converting Quicktime format how to remove drm from wmv toanything that my music lyrics editing how to remove drm from wmv

I guess a *lot* of people want to "remove drm from wmv".

2005 Mar 17 [ Thu ]

I'm rearranging the look of the site

As I promised, I am (lackadaisically) getting around to splitting out a lot of the features from the normal page format, such as the list of links which used to appear at the left of every page: www.panix.com [http://www.panix.com/~dannyw/weblog/nolist/links01.html]

My original intention in providing all this stuff on every page was to make it evident, for someone who reaches one of my pages via a search, that there is a lot more stuff on the site. As far as I can tell, however, these links are not used often enough to be worth the extra download time on every single page.

Eventually I intend to get rid of the left column entirely and just have a single clickable image for site navigation. However, I want to retain the feature that the site can be navigated in a text browser like lynx, so it may not be as clean as that.

Recent overloading at this site

Over the last few days the overload limit has triggered two or three times. Unusually, this seems to have been caused not by the usual Googlebot scanning but by actual people interested in pppics02 for some reason. I haven't figured out why, because there seem to be many different referers. Also, I can't see why there would be a lot of interest in it. All I can think is that people are in general more interested in images than text – especially my opinions – and maybe I absent-mindedly named an image "cuteyoungcambodianboys.jpg" or something. Judge for yourself: www.panix.com [http://www.panix.com/~dannyw/weblog/Asia/Cambodia/Miscellaneous/pppics02.html]

I just got an email from panix saying they've increased the free download limit, so I'm going to go in and enable it.

I've noticed that Google does not seem to have a record of all my pages, presumably because it always gets cut off before it spiders my entire tree. Maybe the new limit will help.

2005 Mar 02 [ Wed ]

I've added the "hide" plugin for Blosxom

This plugin makes it easy to mark certain files and directories so that they are not displayed in the normal chronological order, although they are accessible directly. They also show up in the tree view of my site.

I want to use this feature for files which I need to display in Blosxom for the sake of consistency, but where their *date* may be misleading: eg files which I continuously *update* rather than filing a single time under a certain date. I want to move a lot of the sidebars which I currently display as part of *every* page onto separate pages, to save download time; in particular I want to move the external links section to a separate page so that I can arrange and describe them better.

2005 Feb 26 [ Sat ]

More Google overload gives me idea to restrict bot access

Over the last few days the Googlebot has downloaded enough pages to exceed my server's capacity limit twice. This is irritating enough, but as it happens most of the downloads were not to real information pages (well, *I* like to think of them as real information) but to ancillary stuff like the .trackback and .writeback features. Although these are marked "do not index", by the time Google sees that it's already done the download.

While fuming about that (and the way that Google, like the phone company, never responds to complaints), it occurred to me that there is a partial fix for that using .htaccess.

The .htaccess file is checked by Apache every time it gets a file request, and controls the response in many interesting ways. (I have referred to this before.)

I have now (attempted to) set up my .htaccess file so that whenever a request arrives [for a page which I do not want Google to index, such as the writeback and trackback features – 2005-02-27] and the "user-agent" is set to "*bot*", Apache just sends back a *very short* page saying that this page is not for bots. This should cut down on the bandwidth load considerably (not to mention the CPU time; although I am not charged for this, I do actually feel guilty about it, because if this server were all mine I would put a lot more effort into minimizing CPU load).
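The decision that rule makes can be sketched in Python. This is an illustration only, not the actual .htaccess setup: the two patterns, the wording of the short page and the function name respond are all assumptions.

```python
import re

# Assumed patterns: which paths are "ancillary" and which agents look like bots.
ANCILLARY = re.compile(r"\.(writeback|trackback)$")
BOT_AGENT = re.compile(r"bot", re.IGNORECASE)

# Assumed wording for the very short reply.
SHORT_PAGE = "This page is not for bots.\n"

def respond(path, user_agent):
    """Return the short page when a bot asks for an ancillary page.

    Returns None when the request should be served normally.
    """
    if ANCILLARY.search(path) and BOT_AGENT.search(user_agent or ""):
        return SHORT_PAGE
    return None
```

A bot asking for an ordinary .html page still gets the full page, so indexing of real articles is unaffected.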

Good overview of .htaccess: apache-server.com [http://apache-server.com/tutorials/ATusing-htaccess.html]

Some hints on handy uses: www.edevcafe.com [http://www.edevcafe.com/viewdoc.php?eid=92]

I may well need to tinker with the .htaccess setup a bit so I apologize if anyone tries to use a feature which I have accidentally disabled.

A link to me exists which is not shown by Google

When I checked my (very short) server logfile today I noticed that someone had come in to this page at my site: www.panix.com [http://www.panix.com/~dannyw/weblog/2004/08/07#wxpkey01] from the following page: www.ccdigest.com [http://www.ccdigest.com/news/53930.html]

I took a look at the page. The page doesn't actually *show* that the material is ripped from my site; they just include a link to it. The page has a link to a Google AD url: so they want people to find "their" page with *my* info and get Google ad money for it.

I suppose I would be more agitated about this if most of *my* page didn't consist of something which *I* had ripped out of Slashdot.

It's quite interesting that Google does *not* reveal the existence of this link:

Your search - link:www.panix.com [http://www.panix.com/~dannyw/weblog/2004/08/07#wxpkey01] - did not match any documents.

(I also tried the search without the "#wxpkey01" – same answer.) I don't see why Google *wouldn't* return this page: after all the ccdigest.com site makes its money from people who come in via Google, so surely they would make the page searchable. Hmmm.

2005 Jan 19 [ Wed ]

The panix.com domain was recently hijacked

According to my logs I would probably have lost only 7.12 hits during the period, but for about a day the entire panix.com domain was hijacked by some sort of bad guy.

Link to Slashdot discussion: it.slashdot.org [http://it.slashdot.org/article.pl?sid=05/01/19/017229]

Link to a webpage produced by Panix to explain the situation: www.panix.com [http://www.panix.com/hijack-faq.html]

In general I was impressed by how many commentators made the point that panix is highly respected.

Someone on Slashdot correctly pointed out that since the bad guys owned the domain, they could have set up dummy mailservers which did nothing but record the username and password of people who attempted to download mail from them.

It is certainly possible to set your email client to do encrypted logins, but I don't know the details of what happens there, and it may well be possible for a server reached through the fraudulent DNS to grab your account info (ie, the encryption scheme may only encrypt the data on the link *between* your client and the mailserver).

Personally I would like to get *multiple* logins with any account, some of which can *only* be used for email, some for shell, etc.

In my own case I don't think I was compromised because I read email via ssh and *pay attention* to the warning strings that give the server ID. Hmmm... I wonder if an attacker can somehow replay those? Gulp...

2005 Jan 17 [ Mon ]

Peculiar problem with the link to Slashdot

For a long time I have had a link to Slashdot, the forum website for computer geeks, on my blog pages. Recently it stopped working for me. Instead of opening up a regular browser window, it opens up a grey warning box for "File Download", saying "Some files can harm your computer..." and offering to download the filename "slashdot" from domain "slashdot.org".

I wondered if I had been tightening security settings on the client too much but experimenting got nowhere. I figured out however that I *could* reach eg apple.slashdot.org correctly.

Then I noticed that this current machine shows slashdot.org as a *trusted* site. When I checked, there were a bunch of obvious trojan and adware sites in the "trusted" group. Even after I deleted them (and Slashdot was not among them) it still was shown as "trusted"!

I changed the settings under "trusted" to my usual suspicious level, but I think this machine (in an internet cafe) is hosed. However, I have been noticing this problem on several *other* machines. I saw nothing obviously wrong in the process list. "netstat -an" also showed nothing suspicious.

I changed the link to "main.slashdot.org" just for my own convenience. I'm guessing that on this machine, all page requests are going through some sort of redirector which is not correctly programmed to handle urls which have no machine name in front of the domain name.

2004 Nov 11 [ Thu ]

Only one article posted in a month

My faithful fan may be relieved to know that I was not enjoying an expenses-paid stay at a government institution. The explanation is this: two private projects – stuff I don't talk about on the weblog because even *I* think it's too dull to post – have been taking up all my attention.

I am starting to wind down on both these projects and hope to be back in a week or two posting as frequently as before. With a bit of luck, I may feel energized enough to do a major redesign of the website, too – I plan to make the pages faster to load, and cut down unnecessary hits by webcrawlers.

2004 Oct 02 [ Sat ]

I have started checking my page with w3c's "validator"

Among my "favorite links" is now a link which tells the w3c to validate my webpage. validator.w3.org [http://validator.w3.org/check?uri=http%3A%2F%2Fwww.panix.com/~dannyw/weblog/]

The first time I tried it I got an eye-popping 174 errors – more than Slashdot! A lot of them were mismatched p elements, which I already knew about, but most of them were just flubs which I hope to hack away. It's really amazing what IE manages to display apparently cleanly. Right now I'm at about 130 errors...

2004 Aug 07 [ Sat ]

Still bugs in my print format

The issue now is actually one which has puzzled me from the beginning: exactly what sets the displayed right margin?

It seems that IE refuses to wrap a word if it cannot find whitespace. So because I display very long URLs occasionally, they may force the right margin to extend farther than I thought I was setting. Once the right margin has been pushed out, all the rest of the text expands to match, which allows too many words per line and damages readability, as well as sometimes making it necessary to scroll horizontally. (For purists who feel that the user/browser should control layout parameters: I am struggling to do this in *stylesheets*, so it can be overridden when desired.)

The problem can also be triggered by "pre": this HTML keyword seems to prevent IE from breaking the line even at spaces.

The problem is worse in the printable format, because while IE still refuses to break the line, it simply *throws away* text outside the right (physical) margin of the paper, with no error messages. As I often use "pre" to display code, this can be disastrous.

I am currently trying to find this issue addressed on the web, with no success, despite much tinkering with .css files. I apologize.

2004 Jul 31 [ Sat ]

Irritating problem with the HTML "pre" tag

I tried to create a simple table today using the "pre" tag, which my .css file defines as monospaced, but it just would not work.

For some reason the machine I have been using in this internet cafe displays monospaced using some sort of OCR font, not Courier. I wondered if this was because it's Thai Win98, and tried defining the language as "en-US" in the metatags, but that was no help.

Eventually I gave up and used a table, but of course this was no fun because it interacts with the code which automatically creates paragraphs out of plain text. Eventually it sorta worked. I need to see what happens with non-Thai setups.

Added feature to display writeback Google hits as .html

A couple of days ago I realized Google was indexing a lot of my writeback pages, and even causing users to prefer them (because they are smaller).

I don't like this because the writeback pages have no navigation. So I've added a gadget to my (now bloated) .htaccess file to rewrite incoming hits via Google from .writeback to .html. I haven't added a rule for ".trackback" yet: I'll wait and see if the rule seems to work for ".writeback".

2004 Jul 30 [ Fri ]

Checking how Google is indexing my site

While reading a Slashdot article about a book called "Google: The Missing Manual" it occurred to me that it would be useful to do a Google search which returned *all* Google's entries for my site.

The search term I used was this: "dannyw site:www.panix.com" (because Google does not seem to recognize the "/~dannyw" part with the "site:" operator).

When I ran it today it returned 713 hits when there are 699 separate articles. That might be OK – I want Google not to index compilation pages; but unfortunately Google are including ".writeback" pages in that total. Worse, I now see that at least one of those ".writeback" pages has "noindex,nofollow". Presumably that means I have the syntax wrong somehow.

Also, there are plenty of compilation and daterange pages in the list.

I think what I need to do is use Apache to automatically convert people coming in from Google to a .writeback page to go to the .html version. The heck with what Google says.
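The redirect idea can be sketched in Python. This is not Apache configuration, just the logic such a rule would need; the function name rewrite_for_google and the test for a Google referer are assumptions.

```python
from urllib.parse import urlparse

def rewrite_for_google(path, referer):
    """Send visitors who arrive from Google at a .writeback page to the .html page.

    Any other request keeps its original path.
    """
    host = urlparse(referer or "").hostname or ""
    from_google = host == "google.com" or host.endswith(".google.com")
    if from_google and path.endswith(".writeback"):
        return path[: -len(".writeback")] + ".html"
    return path
```

Visitors who click "View/add responses" from inside the site have a non-Google referer, so they still reach the writeback page.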

Since non-".html" pages are included in the total of 713, that means Google *still* has not indexed my entire site. Sheesh. I suppose it doesn't help that every time they decide to index the site it triggers the overload limit, but you'd think their recovery algorithm would handle that better. Looking at the logs though, I can't make out any plan to what pages they hit. In particular, they hit the same damn page over and over again. F'petesake.

2004 Jul 29 [ Thu ]

My site did indeed get spammed

A couple of days ago I noticed that some twerp had added half a dozen spam messages to this site. The messages were all essentially meaningless and thus did not have any relationship to the article they purported to comment on (or so I like to think). The common factor was that they all contained the url of a commercial site (all the same one).

This type of spam is usually done to add links back to the spamming website: Google thinks that that means this site is saying that site is interesting, so it (infinitesimally in my case) improves the search ranking of that site.

Here is some of my log output:

7/27/2004|19:57:35|151.38.236.213|libwww-perl/5.800|/~dannyw/weblog/Opinions/Soc iety/mmoore01.writeback| 7/27/2004|19:57:48|151.38.236.213|libwww-perl/5.800|/~dannyw/weblog/Opinions/Soc iety/mmoore01.writeback| 7/27/2004|20:2:28|151.38.236.213|libwww-perl/5.800|/~dannyw/weblog/Opinions/Poli tics/Iraq/saddam01.writeback| 7/27/2004|20:2:35|151.38.236.213|libwww-perl/5.800|/~dannyw/weblog/Opinions/Poli tics/Iraq/saddam01.writeback| 7/27/2004|20:3:1|151.38.236.213|libwww-perl/5.800|/~dannyw/weblog/Opinions/Polit ics/Iraq/thewarinusa02.writeback| 7/27/2004|20:3:4|151.38.236.213|libwww-perl/5.800|/~dannyw/weblog/Opinions/Polit ics/Iraq/thewarinusa02.writeback| 7/27/2004|20:3:37|151.38.236.213|libwww-perl/5.800|/~dannyw/weblog/Computers/Int ernet/sitecert01.writeback| 7/27/2004|20:3:40|151.38.236.213|libwww-perl/5.800|/~dannyw/weblog/Computers/Int ernet/sitecert01.writeback| 7/27/2004|20:4:5|151.38.236.213|libwww-perl/5.800|/~dannyw/weblog/Computers/Inte rnet/png03.writeback| 7/27/2004|20:4:9|151.38.236.213|libwww-perl/5.800|/~dannyw/weblog/Computers/Inte rnet/png03.writeback|

The "libwww-perl" user agent worried me as soon as I saw it. The ip number looks up as "adsl-213-236.38-151.net24.it". I'm not going to quote the url he wanted to spam because a) I don't want to give him any more publicity and b) it could be a joe job, but maybe I just have a suspicious mind.
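A log in this pipe-delimited layout is easy to summarize. The Python sketch below assumes the fields are date, time, ip, user agent and path, in that order (inferred from the log output above); the function name is hypothetical.

```python
from collections import Counter

def count_writeback_hits(lines):
    """Count requests for .writeback pages per (ip, user agent) pair."""
    hits = Counter()
    for line in lines:
        fields = line.strip().split("|")
        if len(fields) < 5:
            continue  # skip malformed or truncated lines
        ip, agent, path = fields[2], fields[3], fields[4]
        if path.endswith(".writeback"):
            hits[(ip, agent)] += 1
    return hits
```

Sorting the result with hits.most_common() puts the busiest ip and user agent pair at the top, which is usually enough to spot a script at work.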

I don't know why he bothered because my site displays writebacks without any html: ie, the page does not have an active link to the url, just the *text* of that url. Presumably Google does not rate that very highly. Maybe he added the msgs before checking how they would be displayed.

It occurs to me that the perp probably used Google to search for "writeback" in order to find victims. I should probably rename the feature somehow to make it harder to search for.

He was clever enough to add writebacks to older pages that would not display on the current start page, so that I would not detect it, but as it happens I had written a little batch file to easily check for recent writebacks (in the vague hope that anyone was actually using it as intended) so I spotted the misuse as soon as I logged in. I left it for a while as the result was not much of a problem. Today I wrote a little batch file to snip out the spam and store it in a zip file (basically "zip -rtmT" – "zip" has a lot of handy features – although it took me a while to figure out you have to use "unzip -l" to list the contents).

It occurs to me that his bot could be set up to create spam on *every single* page that Google has indexed with writeback. Wow! On a low-volume site like mine that's easy to clean up but it would be a huge mess on an active site. Maybe I need to add a field to a posted writeback which contains the incoming ip, to make it easier to filter if necessary.

Again, it seems to me that this sort of issue needs to be addressed in some sort of overview documentation for the writeback feature.

2004 Jul 27 [ Tue ]

Pesky Google and my writeback pages

Although I recently set my writeback pages to "noindex", Google already indexed plenty of my writeback pages prior to that. So I've noticed in my logs that plenty of people are coming in directly to the writeback page.

The writeback page includes a simple text version of the article, so the user is probably happy. I think the user prefers to click the writeback version because Google seems to have a rule of displaying the writeback link as the main link and the regular page as the inset link. Presumably that's because the writeback link, being simple text, is about half the size in kB.

Of course the problem for me is that the writeback page has no navigation links to the rest of my site, so users who come in to the writeback page never check out anything else.

Hopefully this problem will subside as the indexed writeback pages slowly age out of Google. If only there were some automated way to *set* stuff like this in Google.

2004 Jul 26 [ Mon ]

Printing-format "flavor" seems OK now

A few days ago I implemented a fix for the print-format "flavor" feature. Mindful of my previous problems with this, I decided then not to say it was fixed until I'd tried it for a few days. It looks like it is OK now. I think I'll leave the link to it saying "testing" for a while yet though!

2004 Jul 25 [ Sun ]

Have set my individual postings to use the "nofollow" tag

Yesterday I changed the template for this site so that the single-story pages will include the following line:

<meta name="robots" content="index,nofollow">

Obviously the reason for this is to prevent robots like the Googlebot from following links on the page. The reason I need to do that is because I've added links to trackback/writeback features, and of course for Google to index those is a waste of time for me and Google.

This fix is not perfect, because *compilation pages* also include the writeback/trackback links, and I don't want to add "nofollow" to them because then Google would never reach my single-story pages.

The only clean way to do it would be to have no stories on the front page, but include a link to a page listing links to *all* the stories (maybe visible only to robots). Everything else would have "nofollow,noindex" except the individual stories which would be "nofollow,index". But I dislike home pages which are effectively just a "splash" page. And I'm pretty sure they discourage people from going further into the site.
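That scheme amounts to choosing the robots tag by page type. A minimal Python sketch follows; the page-kind names are hypothetical, and "noindex,follow" for the robots-only listing page is an assumption, since the text above only says that robots must be able to reach the stories from it.

```python
def robots_meta(page_kind):
    """Return the robots meta tag for a page of the given kind.

    page_kind is 'story', 'story_list', or anything else (compilation
    pages, date ranges, writebacks and so on).
    """
    if page_kind == "story":
        content = "index,nofollow"      # index the article, ignore its links
    elif page_kind == "story_list":
        content = "noindex,follow"      # assumption: follow the links, skip the list itself
    else:
        content = "noindex,nofollow"    # everything else stays out of the index
    return '<meta name="robots" content="%s">' % content
```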

Incidentally *most* of the tweaks I have applied to the site have been directed at robots. I haven't bothered listing them as they are (hopefully) invisible to users. I'm listing this one because I'm irritated that this issue isn't mentioned in the docs for the trackback/writeback module of Blosxom. For instance, I just realized as I was writing this that I *also* need to set "noindex,nofollow" in the writeback pages themselves.

2004 Jul 12 [ Mon ]

Another bug in my print-format flavour

My code does not grab the current path and filename properly, so the link provided when you click on "formatted for printing" is incorrect, unless you are looking at a topic rather than a single file or date range. (Guess what I originally tested it with.)

I took a shot at fixing it last night but I haven't sorted it out yet. Meanwhile, you can get the basic feature if you want just by changing the file extension on the normal URL to ".prn". For instance, if your current page is

www.panix.com [http://www.panix.com/~dannyw/weblog/Computers/Opsystems/Windows/filext01.html]

you can manually edit the end of that URL to make:

www.panix.com [http://www.panix.com/~dannyw/weblog/Computers/Opsystems/Windows/filext01.prn]

If there is no filename at the end, you can use "index.prn", eg:

www.panix.com [http://www.panix.com/~dannyw/weblog/2004/02/index.prn]
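The manual edit described above can be written as a small Python function. This is a sketch of the URL rule only, not the site's own code; the function name is hypothetical, and treating a final path segment with no extension as a directory is an assumption.

```python
import posixpath
from urllib.parse import urlsplit, urlunsplit

def print_flavour_url(url):
    """Turn a normal weblog URL into its ".prn" print-format equivalent."""
    parts = urlsplit(url)
    path = parts.path
    root, ext = posixpath.splitext(path)
    if path.endswith("/"):
        path += "index.prn"        # no filename at the end
    elif ext == "":
        path += "/index.prn"       # assumed to be a directory or date range
    else:
        path = root + ".prn"       # swap the existing extension
    return urlunsplit((parts.scheme, parts.netloc, path, parts.query, parts.fragment))
```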

2004 Jul 10 [ Sat ]

Now added print format "flavour" as option for all blog pages

A few weeks ago I noticed that for some reason IE was truncating the right margin occasionally when I printed out pages from this site. It seemed to be a problem inside IE because the point of truncation was still within the printable area judging by the printed header/footer.

I have now created a ".prn" flavour for the pages. There is a link to this "flavour" in the lefthand sidebar. It seems to print out better than the standard format; at least it doesn't waste time on navigation elements.

It's not super-clever. In particular, I would like the code which expands URLs to know whether it is being called inside a regular (.html) page or within a .prn page and adjust appropriately.

Also, there's a slight bug. Because of the way I detect whether the displayed page is the home page or a different page, the home page in print mode displays differently from the home page in normal mode – at present the main difference is it shows 50 articles instead of 15.

2004 Jun 30 [ Wed ]

Some info on trackbacks

Because I never used or read blogs before I set up this one, I don't understand a lot of things about them. In particular I don't really get trackbacks.

Google brought up the following links:

Movable Type has a nicely-formatted explanation which is unfortunately very much aimed at their own product, so I found it quite opaque: www.movabletype.org [http://www.movabletype.org/trackback/beginners/]

More readable explanation, regrettably also oriented to Movable Type: www.cruftbox.com [http://www.cruftbox.com/cruft/docs/trackback.html]

More technical but far more informative explanation of how someone programmed it by himself (otoh, he says at the top dated 2003-03 that his system no longer works and he hasn't figured it out yet): www.hitormiss.org [http://www.hitormiss.org/projects/trackback/]

It would seem from the above link that one could send a ping to my weblog using an URL such as the following (warning: my basic display code automatically formats URLs to be clickable, so I have had to change http to dttp here):

[2005-10-25: added whitespace to avoid long-line issues] dttp://www.panix.com/~dannyw/weblog/Asia/Cambodia/Miscellaneous/ twobros01.trackback?dttp://www.panix.com/~dannyw/weblog/ &blog_name=danny+test&title=My+first+trackback+test

It returned the following XML page, which does not have an explicit error msg but is otherwise not very encouraging:

  <?xml version="1.0" encoding="iso-8859-1" ?>
  <response>
  <error />
  <message />
  </response>
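For comparison, the Movable Type TrackBack specification has the server put 0 in the error element on success, and 1 plus a message on failure. The Python sketch below classifies a reply on that basis; the function name and the "unknown" label for an empty error element are assumptions.

```python
import xml.etree.ElementTree as ET

def read_trackback_response(xml_text):
    """Classify a TrackBack reply as 'ok', 'failed' or 'unknown'.

    Returns a (status, message) pair.
    """
    root = ET.fromstring(xml_text)
    error = (root.findtext("error") or "").strip()
    message = (root.findtext("message") or "").strip()
    if error == "0":
        return ("ok", message)
    if error == "":
        return ("unknown", message)   # neither success nor a stated failure
    return ("failed", message)
```

A reply with an empty error element and an empty message, like the one above, comes out as "unknown": the server answered but gave no verdict either way.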

...Hmmm. It appears the creator of Blosxom has a page on this: www.raelity.org [http://www.raelity.org/archives/2002/09/06#computers/internet/weblogs/blosxom/trackbacks_in_blosxom]

Surprisingly, it appears to advise you to download code from Movable Type for this! And it only supports *receiving* trackback pings. Hmmm.

He says: MMmmm... you just gotta love that simplicity of integration.

I can't tell if he's joking or not. And how is it supposed to work with the existing trackback/writeback module??

This guy grumbles that trackbacks are *easy* to understand. Ten people provide comments that they're difficult, but he refuses to believe them: nslog.com [http://nslog.com/archives/2003/03/31/trackbacks_tough_to_understand.php]

The "hitormiss" link above includes the following link to a technical spec at Movable Type which makes far more sense than their overview above: www.movabletype.org [http://www.movabletype.org/docs/mttrackback.html]

2004 Jun 27 [ Sun ]

Writebacks enabled again

"Writebacks" are the name of the feature in the Blosxom software which runs this weblog which allows readers to add comments. Previously I ran into some problems, as well as security concerns, so I had left the feature disabled.

I took another shot at enabling writebacks – mainly because the feature is intertwined with trackbacks – and it seems to be working OK. However I am very nervous about allowing random twerps to add junk to the site when I'm perfectly happy with *my own* junk. So I may well disable it again.

Btw, when you click on "View/add responses" it shows the entire text of the article you're responding to, along with a form to fill in with your comment. You need to fill in the form and then click the "Post" button (way at the bottom). The system will send back the same page with your posting at the bottom of any previous postings.

I'm actually very vague on how the trackback feature works *at all*, so if you think it's not working, you're probably right. Hopefully you can now let me know.

2004 Jun 26 [ Sat ]

An introduction to Danny's Weblog

I realized recently that the top right-hand graphic of my weblog page – it reads "Danny's Weblog" with an arrow pointing to "Opinions, Languages, Links, Computing, Asia, Reviews" – may mislead people into thinking it's clickable, and when they try clicking on it and nothing happens, they may conclude the site is broken.

I considered adding a click action that just says "Please *don't* click here" but (unusually for me) reconsidered.

So now clicking that graphic brings up this: an intro to the site.

I have always provided the "Chrome" subtopic for info about the site:

www.panix.com [http://www.panix.com/~dannyw/weblog/Chrome/]

and you should look there also.

I didn't originally intend to make the graphic clickable because I mildly disapprove of using graphics to provide basic site navigation. (I am also too lazy to keep updating the graphic with new topics.) That's why I use the Blosxom plugin which provides a clickable text tree showing all the subtopics on the site (see "Site Map" in the left column).

Note that when you click on a subtopic, Blosxom also displays postings from *sub-sub-topics*, in reverse date order, up to some limit – currently fifty. The number of articles in each subtopic *and its sub-sub-topics* is displayed in parentheses next to the topic on the site map.

Many of my opinions may appear paranoid. My basic justification for that is the following thought experiment: If you have to choose one of 3 possible explanations A, B and C for something, and you assign probabilities of 15%, 10% and 5% to them, and the government is pushing "B", what is the most productive (minimax) response? It seems to me that most people choose one of two strategies: they actually *believe* the government explanation (because it causes least worry and trouble – I suppose the psychological process involved is rather like the interrogation sequence in Orwell's "1984"), or they choose no explanation at all.

I happen to think – in my paranoid way – that the government itself creates most of possibilities A and C, and D-K. So people who choose *no* explanation allow the government to continue to rule via a pyramid of lies.

"Our so-called leaders speak."

2004 May 17 [ Mon ]

No, I *didn't* fix the quotes problem

It turns out that my regex at least sometimes fails, apparently because the main regex which detects a para and wraps a para format around it *actually triggers at the end of the para*.

Grumble grumble... too lazy to fix this today.

2004 May 16 [ Sun ]

Fixed quotes in Mozilla?

A few days ago I noticed that Mozilla was not displaying quoted text from my site. It transpired that was my fault: the code which wraps paragraphs in a style was also wrapping a bare quote command, which meant that it did not nest properly. As usual, that caused Mozilla to ignore the quote command completely, but IE tried to DWIM, and succeeded.

After some head-scratching, I've succeeded in patching the perl. Most of my confusion was due to forgetting a /g at the end of the regex: at that point the entire file is in a single variable, so my regex would only work on the first quote command! Sheesh. I hope recruiters never read blogs.
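The same trap can be shown in Python, where the behaviour is reversed: re.sub replaces every match by default, and count=1 mimics a Perl substitution without /g. The sample text and the replacement string are made up for illustration and are not the actual patch.

```python
import re

# The whole "file" held in one string, with two quote commands in it.
text = "<q>first quote</q> some text <q>second quote</q>"

# Like s/// without /g: only the first match is replaced.
first_only = re.sub(r"<q>", '<q class="quote">', text, count=1)

# Like s///g: every match is replaced (the default for re.sub).
all_matches = re.sub(r"<q>", '<q class="quote">', text)
```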

I also hope my patch doesn't ding something else...

2004 May 14 [ Fri ]

Fixes I need to carry out

1. I still haven't fixed the busted quote problem in Mozilla. I found the right place to fix it in the code, but my regexes didn't work right – *blush* – will get back to this when I feel energetic (if ever).

2. I found why the "writeback" stuff was showing up in the HTML source, and in Mozilla – because I'd absent-mindedly left it in the story flavor file, duh.

3. On the other hand I'd like to add trackback features (not that I incestuously quote other blogs much, but it's neat conceptually). And they're intertwined with the "writeback" feature. So I may get back to that.

Have now set site to reject some robots

Although I am happy that people want to read my many insights of genius, it sometimes bugs me when people download my *entire* site. I have to wonder if they're just using my text to set up a plausible-looking fake website so that they can fill it with spamlinks. (Although everything on my site is of interest to *me*, there is very probably no other person in the solar system with the same *spectrum* of interests.) Additionally, I have to pay for traffic, so I don't like it when people hit the site too often.

Accordingly I have started to set up some triggers which may cause you to receive a "blocked warning" under some circumstances, such as:

1. You set your newsreader to check the site more than once per day

2. You hit the site more than ten times in 24 hours (under some circumstances)

For the latter condition, I intend to set up a rule which will actually allow normal browsing, under most circumstances, to proceed, but which will catch robots. I don't intend to describe this rule in detail.
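For illustration only, here is how the literal "more than ten hits in 24 hours" condition could be counted in Python. This is not the rule used on this site, which is deliberately left undescribed; the class name and the choice of a sliding window are assumptions.

```python
from collections import defaultdict, deque

WINDOW_SECONDS = 24 * 60 * 60   # 24 hours
MAX_HITS = 10                   # the limit quoted in condition 2

class HitLimiter:
    """Track recent hits per client over a sliding 24-hour window."""

    def __init__(self):
        self.hits = defaultdict(deque)

    def allow(self, client, now):
        """Record a hit at time `now` (in seconds).

        Returns False once the client has made more than MAX_HITS hits
        within the window.
        """
        recent = self.hits[client]
        # Forget hits that have aged out of the window.
        while recent and now - recent[0] >= WINDOW_SECONDS:
            recent.popleft()
        recent.append(now)
        return len(recent) <= MAX_HITS
```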

2004 May 08 [ Sat ]

How to attract people to your site

Back in February I quoted a Slashdot poster who was talking about a security problem in Windows: because Windows by default hides file extensions, someone will eagerly click on boobies.jpg when it in fact is boobies.jpg.exe.

Now I find the following in my log file:

5/7/2004|10:6:20|12.36.152.153|Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0 ; FunWebProducts; .NET CLR 1.1.4322)|/~dannyw/weblog/Computers/Opsystems/Windows /filext01.html|www.google.com [http://www.google.com/search?q=boobies+.jpg&hl=en&lr=&ie=UTF-8&oe] =UTF-8&start=40&sa=N

which tends to suggest that the Slashdot poster was very correct. How desperate would someone have to be to click on Google's listing for *my* site for boobies? It must list around 612,497.

Btw, since *this* file has "boobies" in it 5 times, maybe it will attract more hapless victims. (Dr Evil laugh)

2004 May 06 [ Thu ]

Bug in this site's display code for comments using <q>

I must admit I hadn't bothered to check out whether my site looks OK in anything other than Lynx and IE 5/6. I assumed that since I was using rather basic formatting not much could go wrong.

It turns out that Blosxom handles paragraph formatting in a way which does not allow my own formatting to nest. Specifically, at the beginning of each paragraph, Blosxom puts in a "<p class=story_para>" even though my own HTML may come after this. At the end of the para, Blosxom closes the <p>, violating the nesting rule for my own formatting!
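A nesting violation of this kind can be detected mechanically. The Python sketch below is an illustration only, not part of Blosxom or of this site's code; the class and function names are hypothetical, and it ignores void elements such as br, which never get a closing tag.

```python
from html.parser import HTMLParser

class NestingChecker(HTMLParser):
    """Record closing tags that do not match the most recently opened tag."""

    def __init__(self):
        super().__init__()
        self.stack = []      # tags currently open, innermost last
        self.problems = []   # (closing tag, tag that was actually open)

    def handle_starttag(self, tag, attrs):
        self.stack.append(tag)

    def handle_endtag(self, tag):
        if self.stack and self.stack[-1] == tag:
            self.stack.pop()
            return
        self.problems.append((tag, self.stack[-1] if self.stack else None))
        if tag in self.stack:
            # Discard everything that was opened inside the tag being closed.
            while self.stack.pop() != tag:
                pass

def nesting_problems(html):
    """Return the list of mis-nested closing tags found in html."""
    checker = NestingChecker()
    checker.feed(html)
    checker.close()
    return checker.problems
```

Feeding it a quote that spans two wrapped paragraphs reports the first closing p as arriving while q is still open, which is the violation described above.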

I only use the "<q>" element for formatting more than a single line, so this bug only causes a problem with quoted text. And IE doesn't show the problem: as is generally the case, IE takes a shot at displaying defective code, whereas Netscape descendants like Mozilla etc just ignore the offending tags. Anyhow, if you use Linux, or a Mac, or you just prefer Mozilla, that's why quoted text always ends after a single para. At least until I figure out a fix. (I would much rather not have to screw with the source files, but I'm not looking forward to trying to get Blosxom's para formatting to safely intertwine with my own.)

Btw, when I looked at the Mozilla output, it also shows an empty <q class=writeback></q> pair. I don't know why that's in there: I thought I'd disabled this.



I hope this information was useful. There may be a great deal more information on this site that is relevant to what you need. Take a look at the "site map" display at left; you can click on a topic to see many recent items on that topic.
