Tuesday, 29 March 2005

Zoominfo people search

Zoominfo, a search engine for info about people, was launched a week ago. Its slogan is "people information summarised" and indeed, one may well take the p.i.s. Why? For starters, search results aren't exactly the best. And the way ZoomInfo have set out their site raises my "consumer protection" and "user freedom" hackles.

Try Zoominfo's people search tool out for yourself (it's free for personal use) and you'll see what I mean. For instance, here's what you get (if they don't tweak it [added 1 April 2005: i.e. tweak their algorithms] after this post!) when you search for "Tony Blair" - who as you'll know, unless you've been trapped in the Dungeon Dimension for the last decade, happens to be Prime Minister here in the UK (at least for another month or so, anyway). There's just ONE entry in the results list:
Tony Blair at Salina Human Relations Commission
First Sergeant (Ret) Tony Blair enlisted in the Army in 1975 at Ft. Polk, LA where he attended Basic and Advanced Individual…

And when you click on the name, the summary on "Tony Blair" says:
Mr Tony Blair
Salina Human Relations Commission

Now try searching for "Anthony Blair" (by the way, whether you use quotes or not makes no difference to the search results, I've found, so you can save your fingers and omit them. Hey, every little helps...). The Salina bloke still comes up first (search results are ranked by "Web Popularity" - the number of mentions on the Web). Ah, but then next in the search results list we get:
Mr Anthony Charles Blair
Biographer and Political Editor
New Statesman magazine

whose "Additional Current Employment" is listed as:
Labour Representation Committee Leader Youngest British Prime Minister

And he's also down against "WIP Inc" as "Prime Minister". That sure beats "Director" doesn't it, as a corporate executive's title. C'mon, wouldn't "Imperial Chemical Industries PLC, Prime Minister" sound much better than "Chairman"?

Funnier still, the third entry in the search results list is "Tony Blair at Number", which you figure out is the site for No. 10 Downing Street. At which "company", if you click his name in that list, he's said to be "Top Economist". His "Other Titles" are "Position In Office , Foreign and Commonwealth Office". And the extract from the top source for that web summary includes "Exploring Leonardo" and - wait for it - " Parliament- explained to children". If you'd like to explore further, there's also a summary of the company (No. 10's summary starts with "We have made some progress. Such as reducing payroll burdens for 1.2 million businesses through our plans to pay Working Tax Credit direct to individuals rather than through employers, in response to CBI concerns").

Links are provided to the web sources for these people/company summaries, and you can even forward a particular person's summary to a friend (the Tony one's probably worth a laugh or two), link to it, or send ZoomInfo feedback on it - though what action they'd take in response to feedback I don't know, especially if the person sending them feedback isn't the person summarised. [Added 1 April 2005: thanks to Brian Payea, Director of Corporate Communications at Zoominfo, who in his comment on this post explained what they use this kind of feedback for - to edit and improve their search and recombination algorithms.]

Gordon Brown however fares rather better, Tony's arch-rival coming up first in the "Gordon Brown" search results list (so clearly Tony B's a bit behind Gordon in the "Web Popularity" stakes) and being described as:
British Chancellor
United Kingdom

(But as for "Exchequer", sadly " No additional information is available about this company.")

Our Tone must be well pleased about that: some other "Tony Blair" more popular on the Web than him, and his summary not quite right, while Gordie B is the first Gordon Brown and more accurately summarised?

The New Scientist article where I first read about Zoominfo also points out that "President George W Bush is listed as the British Prime Minister, the governor of Florida and the Governor of Massachusetts, as well as the president of the US." (And the UK is in Washington, didn't you know?) Quite. Not far off, actually, some might say. Don't believe that? See for yourself here.

I noticed that when I tried searching for his name, on clicking the name in the single search result all that came up were news item links and extracts (the sources for the automated summary). So there's an inconsistency in the search results display for different searches - when you click the name, you're supposed to get the summary. In George W Bush's case, you don't - you just get the "Sources" page, then from there you have to click the "View: Summary" link to see the summary. At first I thought there was no summary and that George W had actually got them to remove his entry, but not so.

Same with Michael Jackson. There's one and only one entry in the results: "Michael Joseph Jackson" (the "Michael Jackson" who used to be controller of the BBC doesn't even get a look in). Click the name, and you don't get the summary but you do get the Web sources, as of today mostly about the singer's molestation trial. Click to View Summary, and then you get:
Mr. Michael Joseph Jackson Sr.
District Attorney
Santa Barbara County

(Other Titles - Sherriff!)

So, you can tell I'm not too impressed so far. I found one friend accurately described at his company. Others didn't figure at all. My own name came up with nothing, even though I'm the first person on the list if you search for my real name on Google. "Improbulus" of course produced zero results on Zoominfo.

There's an Advanced Search page which lets you narrow your search down by the person's first or last name or by company (the search is quite fuzzy by the way, it seems to pick up different spellings of names or the closest it can find to what you enter). The advanced search page also sports features (in beta) such as finding employees of a particular company (lots of results if you try Microsoft for instance), or alumni of a university - or people mentioned on a specific website. On trying my blog URL for that last one, I got "No Search Parameters Specified" - sniff - so I count as a zero to them do I, my blog URL immediately ignored in their search box as if it doesn't exist? My blog doesn't even seem to merit a "No Web Summaries were found for…" - which is what I got when I tried entering that obscure site, That's right. NO people are mentioned on, according to Zoominfo.

The setup of the advanced search page is a bit of a clue that they're really mainly interested in people associated with companies or universities and use corporate and official news sites as their primary sources. Which is not surprising, as they offer a paid-for service targeted at businesses, and tout its benefits for recruiting, sales intelligence and research etc: "The patented search technology continually scans millions of corporate Web sites, press releases, electronic news services, SEC filings and other online sources. Then, it intelligently compiles a concise summary about a specific individual or company." "Use ZoomInfo to instantly gather background information on business people and companies, prepare for meetings, assemble executive briefing books, or use it simply to build better business relationships." Their "professional" version offers the additional ability to search by job title, industry, and geography too plus ability to sort, save and export your results. They seem to have a pretty impressive list of customers too including Adobe, Amazon, Apple, Dell, Ebay, Google (?!), Microsoft and Yahoo.

While the searching is free for personal users, there are Google ads in the search results (which I don't mind). What I do mind is the extent of the restrictions they seek to impose on users. I like to control my own surfing, thank you very much. So, there is no FAQ on the front page, nor any Help link or Privacy Policy link. You only get offered those links AFTER you have started using their service, i.e. after you've carried out a search. Now that's a minor thing perhaps, but I want to be able to read and consider the info about a service before I decide to use it or not - and I mean info on its features, functions, privacy policy etc, not just bland marketing info. I guess the lack of those links wouldn't bother me so much if I hadn't found out that, when I tried to rightclick on the FAQ etc. links to open them in a background tab in Firefox (or a new window in IE) for convenience, it wouldn't let me - it just opened exactly the same page I was on before. That sort of thing really gets my goat, as you can tell. So here are the links - yes, they work if you go there directly, they just can't be rightclicked from the search results pages (or accessed at all from their home page): FAQ, Privacy Policy, Help.

Feedback and Contact etc links also work in the same way. But at least they do include all those links on the Advanced Search page (hey that's right, only people who do advanced searches might need help or a FAQ - though guess what, you still can't rightclick those links effectively from that page.) So maybe it's just an oversight that they don't include them on the front page or allow rightclicking, maybe I'm being unfair in disliking what they've done. Or maybe not.

Furthermore, did you know you can "edit" and consolidate the Web Summary of your own personal information on Zoominfo's site (and then link to your Web Summary page from elsewhere)? "ZoomInfo allows you to control your Web Identity by editing and securing your Web Summary. Now, when others search for you, you have some control of what the Web says about you." But you can register only after you've done a search (and been offered the chance to give them even more of your personal information like work and home addresses, your company's information, your work and education history, languages/skills, bio, phone numbers) - you can't register first, then try to update your Web Summary. At least they don't ask for mother's maiden name or birthdate! Oh, and if you're not on their database already, you can if you wish add yourself and give them your personal details. No thanks, I don't think so. Personal safety, identity theft, no I'm not taking the chance.

"Further, we give our users control of their Web Summaries through the My Web Summary. Once their identity is verified, anyone can edit their own Web Summary and thus control what others can view about them when searched." What this means is that you can't hijack your friend's identity and edit their Web Summary to produce some amusing results (not even with 1 April approaching). So there go my nefarious plans to pretend to be some famous person! No, what they do to get you to verify your identity is to ask for your credit card details, home address and other personal info - they say in their privacy policy that they won't charge your card and they'll only use the info to check you're who you say you are. And they also say they won't let your "update" go through until they've confirmed you are the person whose details you're trying to edit. So that's one good thing - at least they have some sort of "identity theft protection" mechanism in place, though I wouldn't be too comfortable about giving them my card details (even though their privacy policy says they don't store numbers given to them). I happily shop online, so I'm not sure why, it's probably the whole general issue about the privacy of my personal details.

Another good thing is that they allow you to request removal of your Web Summary:"If you want to remove your Web Summary completely, please send an email to" Though whether they'll accede to your request or not, who knows. And how are they going to verify your identity when you do that? Seems to me that the registration (by clicking Update Web Summary) and credit card method, much as it grates, may be the easiest way to get them to remove your Web Summary (though not the search results, of course, which come from public Web sources).

In summary, although to be fair some of it still seems to be in beta, Zoominfo's automated summarisation obviously has a long way to go. I don't think very much of it as a search/summarisation tool for us mere personal users, though it could be fun as an ego tool to see how it summarises you, or to while away a few minutes seeing how they describe your celebrity of the moment. It's more the combination of the unnecessarily controlling way they've set up their pages, plus the way they've also set it up to get info out of people first, that bugs me - even though it's not surprising, as that's what the site is for.

As you'll know if you read my ID cards posts (here and here), I feel quite strongly about privacy and data protection, and while the information that ZoomInfo searches and gathers together is public, I am not very happy about the idea of being encouraged to give them even more personal information (like home address, for goodness' sake! Gift for stalkers, helloooo?). Of course it will be up to the individual as to how much of their personal info they want to add to the site. I can see that if you are jobhunting, it might be useful to put at least your bio and work history on there. I guess I'm a little sceptical though about how much businesses would use ZoomInfo as a recruiting tool. And maybe if you're desperate for old schoolfriends to be able to find you, you might add a Web Summary for yourself, though there are better ways of doing that. Otherwise, though, I don't see why people would want to put their private details on there for anyone in the world to find.


Sunday, 27 March 2005

Of apples and oysters

Can an apple a day keep the doctor away? Do oysters really act as aphrodisiacs? Well I happen to think there's a grain of truth in many folk remedies and old proverbs, particularly those relating to health, the body, the things that were important to people's lives and livelihoods (consider for instance "Red sky at night, shepherd's delight, red sky in morning, shepherd's warning").

Now the 26 March 2005 issue of my favourite read, New Scientist magazine, reports in two separate items, coincidentally both in the same issue, that apples do help prevent clogged arteries and reduce cholesterol, while clams (not unrelated to oysters) are rich in a chemical which boosts the sex drive by raising testosterone levels. Certainly, from my own experience, an apple (or at least half an apple) a day is the best thing for, well, keeping me comfortably regular, shall we say. Some people swear by apple cider vinegar for all sorts of ailments as recommended by their grandma.

Often this kind of folk wisdom has been dismissed as superstitious nonsense. But increasingly, when objective scientific research has been carried out to test the efficacy (or not) of, say, certain kinds of herbal medicine, traditional remedies, spices and the like, more often than not it's been found that there is in fact something to it. For example garlic, known for warding off vampires (symbolic for bad stuff?) and generally being good for you, has been shown to be an antibiotic (see this or this) and kills bad stomach bugs too while leaving the good guys alone. The anti-malarial drug artemether was derived from qinghaosu (wormwood extract) which has been used since 168 BC or earlier in traditional Chinese medicine, and has also been used in African medicine. Turmeric, a spice used in curry, is an antibiotic, reduces the risk of cancer and liver damage, and may help fight malaria, cystic fibrosis and leukaemia too - not to mention inflammatory bowel disease and Alzheimer's. And good ol' fashioned honey has been found to be better than conventional antibiotics e.g. in treating burns, and maybe even in fighting antibiotic-resistant superbugs.

None of that should be surprising. Our ancestors, who didn't have the benefit of modern science or medicine, had to learn the hard way what worked and what didn't, through pure trial and error, perhaps over generations - the ultimate field test, with human beings as the experimental subjects, often sick and facing the very real possibility of death if it didn't work, and desperate enough to try anything until something did. Now they may not have understood the theoretical basis for say cinchona bark (from which quinine is derived) curing malaria, but they sure as hell knew it did the job, and when it's a question of surviving or not surviving, that's really all they needed to know. There's a review in New Scientist of a book "Plants, People and Culture" by ethnobotanists Balick and Cox which covers amongst other things how many common drugs today were derived from folk remedies. These days "old wives' tales" is used in a negative, dismissive sense, but if people in those tough times survived long enough to become old wives, then I for one would think it would be well worth listening to what those old wives had learned.
As an aside, I am similarly very interested in nursery rhymes - how they represent another form of oral tradition, and the surprising facts often embedded in them (see my previous post on the subject).

True, various ancient attempts to explain why something works don't accord with modern science (such as the traditional Chinese idea of things being "heating" or "cooling" and counteracting say a fever with "cooling" food; or the notion of chi energy, and energy paths in the body being disrupted by illness which acupuncture could help restore). But in my view, frankly that's just not good enough a reason to laugh off the underlying fact that there is clearly empirical evidence that it does work.

What scientists and doctors should be doing more of is collecting traditional wisdom from all cultures across the globe (yes, and checking out the medicinal properties of "natural remedies" that animals and birds use, too - we could also learn a thing or two from them, and there is even a new science called zoopharmacognosy devoted to animal self-medication); investigating what really works and what is simply wishful thinking (for I don't believe in gullibly taking it all as gospel either, I just feel that the scientific establishment should be more open-minded about these things); figuring out why and how it works, in modern terms - what chemicals or enzymes etc may be doing the trick, in the case of say folk remedies; and putting it to use in the here and now to help people, perhaps by synthesising drugs with similar effects. Many folk beliefs may be just superstition - but without meticulously checking them out, we're not going to be able to sort the wheat from the chaff (to use another old saying).

If I were a billionaire (helloooo, Billy G?), I'd fund that research, before it's too late. The increasing Westernisation of societies across this planet means that too much traditional knowledge is being lost, forgotten, devalued and dismissed by the Nintendo-playing descendants of those same people who, in some cases, literally gave their lives to discover the very real truths which are embodied in many of the sayings or proverbs which have hitherto been passed down through the generations. The loss saddens and frightens me - it seems such a waste of wisdom and lives.

I've looked from time to time for a cross-cultural collection of medical sayings or proverbs about health or illness, call them what you will, but the best I have been able to find on the Net is an (apparently much linked to) article on English medical proverbs (whose title, appropriately enough, starts with "An Apple a Day Keeps the Doctor Away..."). I think this kind of folklore would be a great starting point for investigations - though I don't know if the books and articles mentioned in that extract are still in print.

At least some people are now researching herbal medicine (there's potentially shedloads of money to be made by the drugs industry, so once they cottoned on, the pharmaceutical companies started weighing in; and there's even a book about the hunt for medicines in the rainforests, Earthly Goods: Medicine Hunting in the Rainforest by Christopher Joyce, reviewed in New Scientist). (One problematic area here is the ownership and exploitation of the intellectual property derived from traditional knowledge about the use of medicinal plants. It is important that indigenous communities receive acknowledgement for imparting that knowledge, and also that they share in the profits which the drugs companies make from them. Which could be the subject of several articles in itself, and has been - see this, for instance.)

The site for the National Center for Complementary and Alternative Medicine (CAM) has free info on clinical trials of alternative therapies and the like. I also found a database Herbmed which summarises research findings, clinical trials and traditional use etc. of selected plants. The alphabetical list is here. Free public access is limited to 40 herbs (though by searching "clinical" from the main page I found 80 I could view, go figure!) and the searching is odd in that it turns up items where the search term doesn't seem to feature, but it's interesting to browse through. Many of the herbs/plants and foods traditionally considered to be good for the health are covered in that database, like garlic, echinacea, and cranberry.

You'll notice I don't mention any of the zillions of sites on alternative therapies, complementary medicine etc which abound on the Net. That's because, as I mentioned above, I don't believe in complementary medicine just for the sake of it. I feel strongly that it's very well worth testing scientifically whether what people say works really does work, but I won't just take their word for it. I know for instance that Alexander Technique is effective, because I've tried it (STAT provides a list of teachers in the UK); I know research is being carried out on homeopathy and the "memory of water" but I'll reserve judgement until more is known, because it's never done a thing for me; and I know that acupuncture has been proven to work, though they are still not quite sure how according to Western medical theories.

I am a true Fortean at heart - I believe there are more things in heaven and earth etc, but I won't swallow something unthinkingly just because it uses the word "alternative" (or indeed "alien"!). I believe that keeping an open mind and looking at all the facts (not just the facts that fit a pet theory) is the true scientific method, not just in this area but in all areas; whereas, to paraphrase Fortean Times magazine, too many so-called "scientists" argue according to their own beliefs rather than the rules of evidence and ignore, suppress, discredit or explain away inconvenient data (which is quite different from explaining a thing). If more scientists and doctors were more Fortean and less dismissive in their attitudes, science and medicine might be further advanced.


Death of a language

I read in the print edition of Fortean Times recently that the last fluent writer and speaker of nushu died in autumn 2004.

This language - nushu means "women's writing" - was unique in that it was invented and used only by women in an area of Hunan, a province in China, who were denied the education available only to boys, and so learned from each other (there seem to have been links between the language and an interesting tradition of "sworn sisters" too). They could use nushu to communicate secretly in a way that men would not be able to follow (hey, I know some men say they can't understand women's conversations even now, but...!).

I've not heard of any other language which is specific to a gender, rather than an ethnic group. Some scholars fortunately recorded as much as they could from the few surviving users, but it seems to me particularly tragic when, with the death of one person, an entire once-living, rich and unique language also dies.

This page seems to be the most comprehensive, though quite academic, collection of articles etc. on the Web on nushu, and includes links to items which show the fluid pictograms used in nushu, such as this paper. A more accessible, readable article on nushu is here.


Nursery rhymes - and history

Nursery rhymes may seem mundane. But what is interesting to me is how they originate, what they really encapsulate, how they are passed down, their transmutation from warnings, stories or commentary on real (and often gruesome) historical events into children's games, or songs people use to sing their children to sleep - "Ring a ring a roses" and the Black Death being the most famous one. In a very entertaining and well written novel, The Bridge of Birds by Barry Hughart, the key to an ancient secret is even hidden in a children's nursery rhyme to ensure it doesn't get forgotten.

So this site I've found on the origins of individual nursery rhymes is particularly fascinating, and well worth a browse if you're interested in oral history and folklore.


Tuesday, 22 March 2005

ID cards: more views against

You'll know if you read my previous post about identity cards in the UK that I think they'll be a waste of time and money, won't effectively fight terrorism and on the contrary will represent a huge threat to personal privacy and security.

"The Identity Project: an assessment of the UK Identity Cards Bill and its implications", a report published yesterday by the well-respected London School of Economics and Political Science, just confirms my views: "The consequences of the current proposals might include 'failure of systems, unforeseen financial costs, increased security threats and unacceptable imposition on citizens'", according to the LSE press release summarising the findings.

Coincidentally, but not surprisingly, on the same day the BBC reported that an MP had called for government guidelines to be issued about how individuals and businesses could protect themselves against identity theft, following a poll finding that "almost 80%" of Londoners fear identity theft. Now there might have been a degree of self-interest there as the poll was carried out by shredder manufacturer Fellowes (and I couldn't find info on their site about the poll, unless it was this one from way back in September 2004: warning, the link opens a Word document!). All the same, if identity fraud is a worry, then how much worse will the risk be if our important personal information is compulsorily kept in a government database (which no doubt organised crime will be able to access more easily than some government departments)?

(By the way - I had a Fellowes shredder - cross cut, of course! - and it gave up the ghost pretty quickly. I now have a much better, yet cheaper (on sale), one from Maplin. And it's all nice and light and silver meshy too, unlike the photo, which doesn't do it justice, awwwww.)


Technorati: cosmos, and watchlists

This post is an introduction to "Cosmos" and "watchlists" as used by the blogosphere search engine Technorati, and how you can employ them in your own blog - you may have noticed the two extra links at the tops of my posts from a week or so ago. (I'm writing this as demand was overwhelmingly greatest for more on things Technorati according to my survey - which is still open by the way if anyone would like to influence my posting priorities!).

Technorati cosmos

I couldn't find "Cosmos" described on the Technorati help pages as such, but from the use of the term on the Technorati site as well as in blogs such as Technorati boss Dave Sifry's, I figured out what it means, and then I found some info about it in the Technorati developers' section.

"Cosmos" is Technorati speak for "the blogs which link to a particular URL". So the Technorati cosmos for your blog comprises all the blogs that link to your blog, found by searching for your blog's URL on Technorati. You can see the list for my blog, for example, by clicking the "Cosmos" link (the blue bubble icon) at the bottom of my sidebar, or by clicking here. Similarly, the cosmos for a post will be all the blog posts that link to that particular post, again found by searching on Technorati for the post's permalink (the unique URL for the post's individual webpage). Many posts may not get linked to as such, but for one example you can go to what's been my most visited (and linked to) post so far - my intro to Technorati tags - and click the Cosmos link at the top of that post.

Yes, you could find out a blog or post's cosmos by going to Technorati and typing the URL in the search box there, but isn't it much more convenient to provide a link so that people (including you!) can do that search with just one click from your main blog page or individual post page?
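
If you'd rather generate such links with a script than type them into a template by hand, the idea can be sketched in a few lines of Python. This is only an illustration under assumptions: the `SEARCH_BASE` address is a made-up placeholder (not Technorati's real search address), `cosmos_link` is a name I've invented for this example, and percent-encoding the target URL is my assumption about the safest way to pass it along.

```python
from urllib.parse import quote

# Hypothetical placeholder, NOT a real Technorati address:
# swap in the actual search address of whatever service you use.
SEARCH_BASE = "https://search.example/search/"


def cosmos_link(target_url: str, label: str = "Cosmos") -> str:
    """Build an HTML link that searches for pages linking to target_url."""
    # Percent-encode every reserved character (":" and "/" included) so the
    # whole target URL travels as a single piece of the search address.
    href = SEARCH_BASE + quote(target_url, safe="")
    title = "Search for blog posts linking to this page"
    return f'<a href="{href}" title="{title}">{label}</a>'


if __name__ == "__main__":
    # Example permalink is invented for the demonstration.
    print(cosmos_link("http://myblog.example/2005/03/my-post.html"))
```

The same function would serve for a whole blog or a single post: all that changes is the URL you pass in, which is exactly the difference between the two template snippets.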

To add a Cosmos search for your blog, just put this code in your template wherever you want it (at the top of your main page maybe or, as in my case, in the sidebar). It's Blogger-specific but with other blogging platforms you can just change "<$BlogURL$>" to your blog's URL (or whatever platform-specific code represents your main blog URL).

[Added 18 August 2005 for new Technorati search format] (New code from July 2005)
<a href="<$BlogURL$>" title="Cosmos - search Technorati for blog posts linking to this blog"><img src="" alt="Cosmos - search Technorati for blog posts linking to this blog" /></a>

(Old code)
<a href="<$BlogURL$>" title="Cosmos - search Technorati for blog posts linking to this blog"><img src="" alt="Cosmos - search Technorati for blog posts linking to this blog" /></a>

Obviously you can change the descriptive text ("Cosmos - search Technorati for blog posts linking to this blog") that pops up when someone hovers their mouse over the link, to anything you want. I just thought it would be useful to include that as not everyone is familiar with the "cosmos" term yet, or at least I wasn't. The same will apply for the rest of the code in this post, just change the "title=..." text to whatever you want.

To add a Cosmos search for an individual post use this code in your Blogger template between the <Blogger> and </Blogger> tags for main page, item page or archive page as you prefer (i.e. within the MainPage or MainOrArchivePage or ItemPage tags):

[Added 18 August 2005 for new Technorati search format] (New code from July 2005)
<a href="<$BlogItemPermalinkURL$>" title="Search Technorati for blog posts linking to this post">Cosmos</a>

(Old code)
<a href="<$BlogItemPermalinkURL$>" title="Search Technorati for blog posts linking to this post">Cosmos</a>
I'm assuming whoever reads this understands Blogger's main page, item page etc conditional tags, but if not see this; if anyone wants more explanation please leave a comment or email me.

Again if you don't use Blogger change <$BlogItemPermalinkURL$> to whatever represents a post's permalink on your platform e.g. <$MTEntryPermalink$> for Movable Type (if I have that right from Nick Chase's MT adaptation of my language translation code).

Technorati watchlists


A Technorati watchlist for a URL (or a search word or phrase) is an automatic periodic search on Technorati for that URL or search term, where the live updated search results are brought to you via an RSS feed, so that you don't have to keep going back to Technorati to repeat the search.
RSS feeds are beyond the scope of this post, most bloggers seem to know about them; I do plan to write a basic intro to feeds and their use for bloggers one day, sooner if there's demand from those who answer my survey...

To use Technorati watchlists, you have to become a Technorati member (and of course have a feed reader or aggregator - such as Firefox's Live Bookmarks, Bloglines, NetNewsWire (for Mac), Newsgator (which integrates with Outlook), MyYahoo! etc - and know how to use newsfeeds). Technorati has a page where you can manage your watchlists - add, delete etc. (The few test ones I set up were labelled "Free", which rather suggests there's a limit to how many watchlists you can have for free, but I haven't found anything further on that yet.)

It's free to join Technorati - you just need to give them your name and email address (there are other benefits to being a member, but I'll leave that to another post).

The standard way to create a watchlist is to search on Technorati for a particular URL, word or phrase; then, in the search results list, it'll give you an option via a "Make this a Watchlink" link to create a watchlist for that URL or search term (you'll be asked to sign in first if you haven't already). If you click that link, Technorati takes you to its "Add a Watchlist" page with your search term displayed in a box, and if you check it's correct and then click the Add button, it cleverly creates a unique RSS feed for that search and displays the URL for that feed (which will be something like this: ""). Just copy the feed URL given and paste it into your feed reader in the usual way, and there you go.

What I've done is to add code to my blog template so that with one click a reader can create a watchlist for a particular post or for my blog generally, if they wish, right from my blog page.

Here's the code to create a watchlist for a post which you can just copy and paste as is into your template (again for Blogger, but easy enough to adapt):

<a href="<$BlogItemPermalinkURL$>" title="Create Technorati watchlist to track conversations about this post">Create watchlist</a>

I put it in my template just after the code for the cosmos search, with a little "|" separator between them. (Again this code has to go between the <Blogger> and </Blogger> tags, and you can substitute the <$BlogItemPermalinkURL$> with whatever represents a post's permalink on your platform e.g. <$MTEntryPermalink$> for Movable Type).

And here's the code to paste into your template for a link to create a watchlist for your blog (Blogger-specific but adaptable, again):

<a href="<$BlogURL$>" title="Create Technorati watchlist to track conversations about this blog">Watchlist</a>

I suspect that watchlists are probably more useful to track bloggers' discussions about a particular topic (like ID cards, or latex) than to track what's said about a specific blog or post. Let's face it, most people will be more interested in a general subject than in one blog, so watchlists for blogs/posts are probably more of an ego tool for the author than anything else. But they do no harm, provide a handy shortcut for you (and anyone else interested) to see what others are saying about your blog or posts (if anything!), and the links take up little space, so I figure why not include them?

Coming up with the idea

This one's for Mud (I will get to the Delicious code post soon, I promise - I assume you were the one who voted for it in the survey?!).

I got the idea for the Cosmos links from seeing them on Dave Sifry's blog, clicking a few links to see what they did, noticing the pattern in the structure of the links, and just using that same pattern for the links in my own blog, substituting my blog or permalink URL in the appropriate place in the string. (Much the same way as I came up with my first ever blog trick, the code for a form to search Blogger profiles, which you can see in my sidebar and try out if you wish).

Similarly, when I tried out Technorati watchlists I saw and followed the pattern used in the URLs of the "Add a Watchlist" pages. That's all, no particular magic to it!


Thursday, 17 March 2005

Future posts: you decide! (survey tool)

This week I entered an unusually busy period, both at work and outside it, hence I won't be tinkering with my blog or posting as frequently as I have been, until about mid-April. But there are lots of things on my to-do list for what I want to do with my blog and what I would like to write about.

I've rustled up a little survey of possible initial subjects at the end of this post - and if anyone would like to try out the survey tool (and help me decide what to focus on first in my limited time, too!), I would really appreciate it. (And of course I'll share thoughts on the tool, its use and how well it works).

My to-do list for my blog includes:
  • Blogroll, or at least a list of links to blogs I read or which link to me (or both) - so if you link to me, please let me know (I've spotted a few but I may have missed someone)
  • Popular posts list in my sidebar so the Technorati tags intro and others won't get lost
  • Restore Blogger comments (but using the Metempsychosis hack) with Haloscan trackback as I don't want to lose my comments after 4 months!
  • Recent posts list - if it can be automated in a dropdown menu, don't know...
  • Tag my old posts on as per unscathed's idea
  • Maybe, maybe, convert to a 3 column blog
  • Investigate Wordpress particularly given the recent Blogger access problems etc
And my to-do list (for now anyway) for blog posts is in the survey below (if you'd rather open it in a separate window where you don't have to scroll, please click here). All questions are optional, please feel free to skip anything you like.


Saturday, 12 March 2005

The last shall be the first?

I'm always fascinated by simple rules of thumb you can use in everyday life which are based on scientific principles or research - such as the mathematically-proven Colley's Rule on how to make the best choice in situations where there is no going back on your decision.

Another possible rule of this kind is: in contests where each person takes a turn to do their thing (like a talent show), it's generally best to go last.

That seemed to be the case when Carnegie Mellon University researcher Wändi Bruine de Bruin looked at competition scores, finding that those who went later generally had incrementally improved chances of getting better scores - according to an article in the latest New Scientist magazine (12 March 2005). (I've found other articles online about this too, e.g. in the Telegraph, the Sunday Times). Correlation doesn't necessarily mean cause and effect, but there are possible explanations for these findings, e.g. that judges remember the performance of later contestants more clearly, or some other similar factor affecting their decision making.

I might be more inclined to start maneuvering for the last place in any contest I enter if more research confirmed this "rule", based on results from a greater number of competitions of many more different types. In this case, they studied - wait for it - figure skating championships and the Eurovision Song Contest...


Thursday, 10 March 2005

Parking in London

Badly planned. Badly signed. Appallingly enforced, and aimed at private sub-contractors making money rather than (as it should be) efficiency and safety. That, in my view, is parking in London.

The London Assembly are apparently consulting on parking in London, though I couldn't find anything about the consultation on their website itself.

Take part, if you can, while the survey is still open: the online survey is here and is only a couple of pages long. It asks how strict parking controls should be, and how well they are being enforced (including one question about whether enforcement is over-zealous, duh and hah!).

It may make no difference what we say. We are only the little people. We are only taxpayers and voters. Why should they listen to us?

But if you don't say what you think, then your views definitely won't be taken into account.

So - it's only a relatively minor form of activism, but I think it's worth taking the trouble to answer these kinds of surveys and polls. And, of course, to vote.


Wednesday, 9 March 2005

Blogging - and your job

Bloggers have lost their jobs or got into trouble with their employers in the past, because of what they've said in their personal blogs.

The most recent kerfuffle involved a company I've been keeping an eye on because of my interest in their service: Technorati. But unlike in previous situations (which I haven't time to go into now, and they're well known - Apple etc), it seems to me that those involved have come out of this one smelling, if not exactly of roses, much better than anyone else has who's been in a similar situation before.

To paraphrase a comment I just left on David Sifry's blog post on what happened, it seems to me that all concerned have tried to deal with the situation promptly, with sensitivity and tact, the desire to explain fully the background, intentions and feelings of those involved, and the objective of trying to recognise and address the implications, understand all points of view and to strike a fair balance - which makes it much more likely that everyone will come out of this relatively intact, and certainly wiser and more aware.

That attitude, that approach, is all too rare in this world and, in my view anyway, almost unheard of in the world of work. It's a great example of how mistakes should be dealt with, all round (and I'm not just saying this out of some bias in favour of Technorati, in fact they still haven't responded to most of my questions about the original problems I had with tagging my posts). If only more employers and employees behaved like that. If only more people behaved like that generally, period.

I think that the way that this situation has been tackled makes it much more likely that, with Technorati, the rest of the world will be able to move on, too. But having said I think others will be able to learn from it, I'm a little cynical about that - I think on reflection that most companies have their own particular culture, and if it isn't open and sympathetic and flexible, hearing that such an attitude has helped Technorati isn't going to make certain companies change the way they behave towards their employees. Sadly.

I do feel though that the incident, while handled as well as it could have been, does illustrate something I've always felt about blogging, indeed about any public activity. What you say in your blog reflects on you, and, whether you like it or not, can reflect on your employers, particularly if what you say is relevant to your work or to your position with your employers.

That's why I don't use my real name here (nor do I intend to). That's why, even when I don't use my real name, I am never ever going to talk about my job, my workplace, or my employers. And I intend to stick to those cardinal rules, which I think are the bare minimum for any personal blogger. I think it's entirely reasonable for companies to consider that what their employees do and say in their personal life can affect them, depending on the employee's role in the company. Look at the recent resignation of Harry Stonecipher the Boeing CEO, for instance. He had an affair with an executive. Maybe that's no big deal, these days - except that that had violated the company's code of conduct, of which he'd been the "staunchest supporter" and in relation to which he'd drawn a "bright line" that not even minor violations would be tolerated.

I've seen an interesting list of do's and don'ts about "corporate blogging" (though I don't agree with that term - in Technorati's case it wasn't a corporate blog, it was a personal one) and about dealing with the situation after the personal space has crossed over into the work space. I have to say I don't agree with the suggestion about getting into "real space" and doing an audio or video podcast. Written words if used well can be just as effective a means of communication as speech (and, not wanting to sound old fogeyish, but Shakespeare never needed a podcast did he?). Plus, good actors can convey fake sincerity probably better than I can convey real sincerity! I'm also uncertain about the value of "overcommunication", I think that saying too much too often can risk a "they doth protest too much" reaction plus communication fatigue if it's not carefully managed and well expressed.

Personally I just think, as I've said before, that rather than try to rescue things after the event, prevention is far better than cure. I feel it's risky to blog about your job or your employers in the first place, particularly if they're identifiable. That may take the fun out of it for some, but hey, that's me, better safe than sorry is my motto. I like this pithy statement of the point in Tony Pierce's blogging advice (much of which I agree with generally, too):
"25. dont use your real name. dont write about your work unless you dont care about getting fired."

Technorati Tags: , , , , ,, , ,

Tuesday, 8 March 2005

Posting code: the secret

OK. I've got it figured out from too much time spent trying to get the code in my last post to display properly. It's not only those blasted "less than" and "greater than" symbols, but also the ampersand.

So, a note while it's fresh - both for me and in case it may help anyone else.

1. Replace all the "&" with "&# 38;" (without the space) (or with "&amp;")
2. Replace all the "<" with "&lt;"
3. Replace all the ">" with "&gt;"

(I'm getting the sequence at the end of no. 1 to display properly by actually typing in this sequence into my post (without the spaces): "& amp; 38;")

The order is important - replace any "&" in the code FIRST (i.e. do step no. 1 first) otherwise you could inadvertently replace the "&" that you need as part of the "&lt;" and "&gt;" replacements.
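For anyone who'd rather script it than use a Word macro, here's a minimal sketch of the same three steps in Javascript. The function names are just ones I've made up for illustration, and it uses "&amp;" rather than the numeric form for the ampersand; the second function is deliberately in the wrong order, to show what goes wrong if the "&" step isn't done first.

```javascript
// Encode code so it displays literally in a post.
// The "&" replacement must run first; compare with the wrong-order version below.
function encodeForPost(code) {
  return code
    .replace(/&/g, "&amp;") // step 1: ampersands
    .replace(/</g, "&lt;")  // step 2: less-than
    .replace(/>/g, "&gt;"); // step 3: greater-than
}

// Deliberately wrong order, for comparison only.
function encodeWrongOrder(code) {
  return code
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/&/g, "&amp;"); // also hits the "&" just added in "&lt;" and "&gt;"
}

var sample = '<a href="page?a=1&b=2">';
console.log(encodeForPost(sample));    // brackets and ampersand each encoded once
console.log(encodeWrongOrder(sample)); // brackets end up double-encoded
```

Run it in any browser console or in Node; the second line of output shows the double-encoding that the "do step 1 first" rule avoids.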

I do that in Word, with a macro I've now whipped up. Then, just to get rid of any horrid hidden codes or changes introduced by Word (it does terrible things to double quotation marks), I copy/paste the changed code into Notepad. Then I copy/paste from Notepad to my post. (Yes, I know I need to get a proper text editor!).

Conversion tool

[Added 21 May 2005:] Found a site that provides an automatic converter, yay: Centricle.

You just copy/paste your code into the first box and hit "Encode" and it does a search and replace for you automatically of the tricky characters you have to encode in order for code to show up properly. Just copy the code from the box for pasting into your post or webpage. It also converts the other way round - i.e. it decodes as well as encodes.

Now if only I can get long code to wrap in my posts without mucking up its workings... ([added 21 May 2005:] which I now can thanks to redryder52, another yay!)


Auto-translating Webpages/blogs: code

As requested after my previous post on “automatic” translation, in this post I set out the code you can include in your Web page or blog template which allows visitors, without leaving your site or blog, to translate your Webpages or blog posts from English into German, Spanish, French, Italian, Portuguese, Japanese, Korean or Chinese (using Google's language translation tools). I can't get it to display nicely and still work, so it's very long horizontally and cuts across the sidebar, but copy/paste should be fine. (Must try to get a textarea working properly in Blogger!).

I’m very glad I introduced this for my own blog. Every day since I’ve done so, there have been at least one or two people who have used it, I have noticed from my logs. It’s all about making a blog or Website more accessible to a wider range of readers.

WARNING: Blogger keeps doing weird things to the code I included below, with extra "amp;amp;amp;" appearing after ampersands, and "SPECIAL REMOVE" stuff creeping in within the Korean, Japanese and Chinese code. Please remove them if they crop up... sorry, can't sort it, it keeps recurring...

Javascript version

I produced the Javascript version with some help from the inestimable redryder52. Any mistakes are of course mine alone. (There's also an HTML version, see later). Here's the Javascript to enable translation of the Webpage on which the code appears:
<div style="border-style:none; font-size: 1; text-align: center">
<script type="text/javascript">
//By Improbulus,
//licensed under Creative Commons License
//with thanks to redryder52,
var Location = document.location;
document.write (
//German starts here
'<a href="'+Location+'&langpair=en%7Cde&hl=de&ie=UTF-8&ie=UTF-8&oe=UTF-8&prev=%2Flanguage_tools" target="_blank">Deutsch</a>'
+' | '+ //spacer and divider
//German ends here
//Spanish starts here
'<a href="'+Location+'&langpair=en%7Ces&hl=es&ie=UTF-8&ie=UTF-8&oe=UTF-8&prev=%2Flanguage_tools" target="_blank">Español</a>'
+' | '+ //spacer and divider
//Spanish ends here
//French starts here
'<a href="'+Location+'&langpair=en%7Cfr&hl=fr&ie=UTF-8&ie=UTF-8&oe=UTF-8&prev=%2Flanguage_tools" target="_blank">Français</a>'
+' | '+ //spacer and divider
//French ends here
//Italian starts here
'<a href="'+Location+'&langpair=en%7Cit&hl=it&ie=UTF-8&ie=UTF-8&oe=UTF-8&prev=%2Flanguage_tools" target="_blank">Italiano</a>'
+' | '+ //spacer and divider
//Italian ends here
//Portuguese starts here
'<a href="'+Location+'&langpair=en%7Cpt&hl=pt&ie=UTF-8&ie=UTF-8&oe=UTF-8&prev=%2Flanguage_tools" target="_blank">Português</a>'
+' | '+ //spacer and divider
//Portuguese ends here
//Japanese starts here
'<a href="'+Location+'&langpair=en%7Cja&hl=ja&ie=UTF-8&oe=UTF-8&prev=%2Flanguage_tools" target="_blank">日本語</a>'
+' | '+ //spacer and divider
//Japanese ends here
//Korean starts here
'<a href="'+Location+'&langpair=en%7Cko&hl=ko&ie=UTF-8&oe=UTF-8&prev=%2Flanguage_tools" target="_blank">한국어</a>'
+' | '+ //spacer and divider
//Korean ends here
//Chinese starts here
'<a href="'+Location+'&langpair=en%7Czh-CN&hl=zh&ie=UTF-8&oe=UTF-8&prev=%2Flanguage_tools" target="_blank">汉语</a>'
//Chinese ends here
);
</script>
</div>


To offer translation into those languages, all you need to do is copy and paste the code above into your HTML or blog template; you don't have to do anything else, except (if you wish) use standard HTML/CSS to style it to fit the design of your own site or blog, tweak the spacing before and after by using <br> or <p> tags, etc. (You can delete or amend the <div> and </div> lines if desired; they're just to set the style I decided to use for the translation links in my own blog: small font, centered etc. I use “1” (instead of “10px”) for font-size where the code is at the top of a page instead of under the title of a particular post, for instance).

Blogger users can use that block of code for their main page, archive pages or item pages - it will work, as is, on all of them (if you're new to Blogger and haven't figured out conditional tags yet, read this). A good place to put it is just before the "<Blogger>" tags for the appropriate pages in the template. I've done that and as you can see this inserts the language translation links just under the blog description on my main page and my archive pages e.g. for January 2005.

Deleting or changing a language

If you don't want to offer translation into any of the languages listed, delete the section for that language (the section of code for a particular language is indicated by the note e.g. "//Japanese starts here" on the line just before that section, and "//Japanese ends here" on the line just after it). So, if I don't want to offer French, I'd delete these lines:

//French starts here
'<a href="'+Location+'&langpair=en%7Cfr&amp;amp;#38;hl=fr&ie=UTF-8&amp;ie=UTF-8&oe=UTF-8&prev=%2Flanguage_tools" target="_blank">Français</a>'
//French ends here

Notice that you should then also delete the "+' | '+" line just before or just after the deleted section, as appropriate, because (as indicated by the "//spacer and divider" comment) that's what produces the dividing " | " between the names of the languages in the displayed list. (If you don't like "|" as a divider, replace each occurrence of "+' | '+" in the code with anything you like e.g. just "+' '+" for a blank space).
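If juggling the dividers by hand sounds fiddly, one alternative is to build the links from a list, so that dropping a language just means deleting one entry. This is only a rough sketch, not the code I actually use: buildTranslationLinks and its translateBase parameter are hypothetical names, translateBase stands for whatever translation address goes in front of the page URL in your copy of the code, and unlike the code above it URL-encodes the page address before adding it to the link.

```javascript
// Hypothetical helper: builds the row of translation links from a list,
// so removing a language means deleting one array entry (no stray dividers).
// translateBase is an assumption: the translation-service prefix that
// precedes the page address in your own copy of the code.
function buildTranslationLinks(translateBase, pageUrl, languages, divider) {
  var links = [];
  for (var i = 0; i < languages.length; i++) {
    var lang = languages[i];
    var query =
      "&langpair=" + encodeURIComponent("en|" + lang.pair) + // e.g. en%7Cde
      "&hl=" + lang.hl +
      "&ie=UTF-8&oe=UTF-8" +
      "&prev=" + encodeURIComponent("/language_tools");
    links.push(
      '<a href="' + translateBase + encodeURIComponent(pageUrl) + query +
      '" target="_blank">' + lang.label + "</a>"
    );
  }
  // join() only puts the divider BETWEEN links, never before or after.
  return links.join(divider);
}

var languages = [
  { pair: "de", hl: "de", label: "Deutsch" },
  { pair: "es", hl: "es", label: "Español" },
  { pair: "zh-CN", hl: "zh", label: "汉语" }
];

// "TRANSLATE-PREFIX?u=" is a placeholder, not a real address.
console.log(buildTranslationLinks("TRANSLATE-PREFIX?u=", "http://example.com/post.html", languages, " | "));
```

In a page you'd pass document.location as pageUrl and hand the result to document.write, the same as in the code above.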

To offer translations from languages other than English, I suggest you go to Google's language translation tools page, try doing some translations of Website or blog pages manually from and to the desired languages, and then view what is in the address bar of the results (it will look similar to parts of the above code). You can then use what’s there to adapt the code.

New window

The translation opens in a new browser window. If you want it to open in the same window, obviously you can delete each "target="_blank"" in the code.

HTML version

If you prefer to avoid Javascript (e.g. for speed of loading reasons), you can still offer translations of your home Web or blog page using HTML. In fact that's what I've done for my own blog, reserving the Javascript for my archive pages only. (I've just mentioned the Javascript version first in this post as it's the simplest to implement on any Webpage or blog.)

For HTML only, try the following code, changing "<$BlogURL$>" to the URL of your site or blog (e.g. “” in my case – without any quotation marks), or if you're on Blogger you can just use this code as is (again, just before the <Blogger> tag is a good place):

<!-- By Improbulus, licensed under Creative Commons License -->
<a target="_blank" href="<$BlogURL$>&langpair=en%7Cde&hl=de&ie=UTF-8&ie=UTF-8&oe=UTF-8&prev=%2Flanguage_tools">Deutsch</a> | <a target="_blank" href="<$BlogURL$>&langpair=en%7Ces&hl=es&ie=UTF-8&ie=UTF-8&oe=UTF-8&prev=%2Flanguage_tools">Español</a> | <a target="_blank" href="<$BlogURL$>&langpair=en%7Cfr&hl=fr&ie=UTF-8&ie=UTF-8&oe=UTF-8&prev=%2Flanguage_tools">Français</a> | <a target="_blank" href="<$BlogURL$>&langpair=en%7Cit&hl=it&ie=UTF-8&ie=UTF-8&oe=UTF-8&prev=%2Flanguage_tools">Italiano</a> | <a target="_blank" href="<$BlogURL$>&langpair=en%7Cpt&hl=pt&ie=UTF-8&ie=UTF-8&oe=UTF-8&prev=%2Flanguage_tools">Português</a> | <a target="_blank" href="<$BlogURL$>&langpair=en%7Cja&hl=ja&ie=UTF-8&oe=UTF-8&prev=%2Flanguage_tools"> 日本語</a> | <a target="_blank" href="<$BlogURL$>&langpair=en%7Cko&hl=ko&ie=UTF-8&oe=UTF-8&prev=%2Flanguage_tools"> 한국어</a> | <a target="_blank" href="<$BlogURL$>&langpair=en%7Czh-CN&hl=zh&ie=UTF-8&oe=UTF-8&prev=%2Flanguage_tools"> 汉语</a>

(You'll have to style it yourself, something goes weird with the closing /div tag in Blogger - or just use the same div tags as I included around the Javascript version).

This will work the same as the Javascript version above, except that it won't work on your archive pages or on Webpages other than your main page - it will only translate the page whose URL is inserted into the code. You could manually change the URLs in the code to that of the page where you've added the code, but it can be a pain to do that for every one of your site pages, which is why the Javascript version is more useful - it automatically fills in the URL of whatever page you're on.

Also the HTML version won't work on Blogger archive pages (unless someone else can figure out a way?) - you'll have to use the Javascript version for those pages.

Again you can style it as you wish, delete languages (hopefully after the explanation about the Javascript version you can figure out which chunks to delete now).

Twist for Blogger and other blogs

For pages with more than one post, e.g. the main blog page and archive pages, you can offer translations of just the individual post.

This could be useful especially if, as with my blog, the page is long (which means that the last section of the full page may not get translated, see the next section below).

To do this (for blogs), just add the above HTML code to your template, but change "<$BlogURL$>", every time it appears in the code, to <$BlogItemPermalinkURL$>. (This makes use of the special <$BlogItemPermalinkURL$> Blogger template tag which pulls in the URLs of individual post pages. You may well be able to adapt the code for non-Blogger blogs but I'm not familiar with other platforms - [Added 13 March 2005:] Nick Chase has since provided copy and paste code for Movable Type). So for example the bit for Italian which reads

<a target="_blank" href="<$BlogURL$>&langpair=en%7Cit&hl=it&ie=UTF-8&ie=UTF-8&oe=UTF-8&prev=%2Flanguage_tools">Italiano</a>

would be changed to read

<a target="_blank" href="<$BlogItemPermalinkURL$>&langpair=en%7Cit&hl=it&ie=UTF-8&ie=UTF-8&oe=UTF-8&prev=%2Flanguage_tools">Italiano</a>

- and so on for the other languages.

The code should be inserted between the <Blogger> and </Blogger> tags for your main page or archive page (or main/archive page, depending on how your template is set up). On my blog I have it just after the Blogger template tags for post title and date used in my template (the <$BlogItemTitle$> and <$BlogDateHeaderDate$> tags) – your template may be different.

As mentioned before, you can style it as you wish, delete the bits for any languages you don't want, and delete each "target="_blank"" if you prefer the translation to open in the same window.

I include that code on my archive pages even though the archive page (which I've tweaked in my blog to just list the titles of my posts for that month, rather than the full posts) can be translated as one - because it lets people go straight to the translation of a post they're interested in, without having to open the English version of the post first.


The code uses Google's translation tools, so their restrictions apply. It’s only a word for word translation so don’t expect perfection, it’s just to get a general idea (and some words will remain untranslated particularly slang).

Also it only translates a limited number of characters. As mentioned in my previous post, you can get round that by manually using Google to translate the remaining untranslated bit, breaking it into chunks and feeding one chunk at a time into Google's tool if the untranslated bit is itself too long.
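If you'd rather not chop a long post up by hand, something like the following could do the splitting for you. It's only a sketch: splitIntoChunks is a name I've made up, and maxChars is a placeholder you'd have to set yourself by trial and error, as I don't know what the actual limit is.

```javascript
// Split text into pieces no longer than maxChars, breaking at spaces
// where possible so words aren't cut in half.
// maxChars (must be at least 1) is a placeholder; the real limit isn't known here.
function splitIntoChunks(text, maxChars) {
  var words = text.split(/\s+/).filter(function (w) { return w.length > 0; });
  var chunks = [];
  var current = "";
  for (var i = 0; i < words.length; i++) {
    var word = words[i];
    // A single word longer than the limit is cut into fixed-size pieces.
    while (word.length > maxChars) {
      if (current) { chunks.push(current); current = ""; }
      chunks.push(word.slice(0, maxChars));
      word = word.slice(maxChars);
    }
    if (!word) continue;
    var candidate = current ? current + " " + word : word;
    if (candidate.length <= maxChars) {
      current = candidate;   // the word still fits in the current chunk
    } else {
      chunks.push(current);  // chunk is full: start a new one with this word
      current = word;
    }
  }
  if (current) chunks.push(current);
  return chunks;
}

console.log(splitIntoChunks("one two three four", 9));
```

Each chunk can then be pasted into the translation box in turn. Note that this collapses line breaks and runs of spaces into single spaces, which shouldn't matter for getting the gist of a translation.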

Also, of course, if Google change the way their translation tool works, all bets will be off, the code above may stop working, and we'll all then have to figure out a way to use the new version...


Monday, 7 March 2005

Google Desktop Search 1.0 officially launched

Today, Google officially released their Desktop Search program for searching files on your own computer, including Web history and chat files (combined, if you choose, with a general Google search of the Web). It's downloadable from

I've not had the chance to check it out properly yet so I don't know how it compares to the beta version which I'd been trying out and found quite erratic in its indexing, especially of Outlook email (see my previous post on GDS). Search results have often been numerous and hard to wade through, even when ranked by "relevance" rather than date, which sort of makes sense as, unlike their Net search, this tool can't (I assume) rank results according to the number of links made between internal documents on your own hard drive.

However, according to the press release it now searches PDFs too, and works with Firefox and Thunderbird at last. (I've certainly noticed that on going to the standard Google search page on the Net via Firefox as well as IE, there's a "Desktop" link over the search box. It would make sense if that link's only there for people who have GDS already installed.)

Google also say it now searches metadata stored with music, image and video files (title, artist etc). And there's a free standing search box you can put on your desktop.

Other very good news - "Google Desktop Search will also provide application programming interfaces (APIs) that enable software developers to create new and innovative applications using the desktop search product. Plug-ins developed with these APIs will be made available for download at, enabling users to search new content types such as Trillian chats and the full-text of scanned images, such as faxes. More information on the Desktop Search APIs can be found on the web at"

Sadly it's very Microsoft and PC-centric - it's not available for Macs or Linux, for instance.

There are of course serious security and privacy implications, as I mentioned in my previous post, and therefore some essential precautions you should take when installing it (which I intend to post more on once I've figured out the differences - some are listed in my previous post). At least password-protected documents are no longer automatically indexed. But when you delete a non-passworded document, it can still be recovered via GDS - which could be good, or could be bad, depending. Let's hope they've introduced an easy way to purge selected documents.

(It's interesting that the top 5 questions on the GDS help pages include "How can I uninstall Desktop Search"! Though that may be because you have to uninstall the beta before you can install 1.0, probably losing your old index in the process, which I won't be very happy about if true).

I'll report more when I've had the chance to see what version 1.0 now does and how it performs.

I do know that I want wildcard searching and tags, and I want to be able to save certain Web pages in the cache forever, not until the GDS cache fills up!


Sunday, 6 March 2005

Interesting names

More nominative determinism, a topic I never tire of.

From New Scientist again (generally its Feedback column and Letters are the best sources for reports on this topic) - a project at New Zealand's University of Otago on "The role of HCO3 in the secretory response of the human colon" is run by a Dr Butt; while one on "Cardiorespiratory and renal changes in acclimatization to high altitudes" includes as a supervisor a Dr Cragg.

And for a variation on the "great names" theme, Fortean Times mentions a report on an interestingly avian wedding where a Gemma Bird married Graham Robins (a vet), attended by best man William Finch and bridesmaid Stella Rook.

A news item last week on fly tipping was covered by a journalist called Heap, while I have heard that (in a reversal of the aptronym phenomenon) Thames Water numbers amongst its employees one Mr Leakey.

If anyone knows any other interestingly appropriate (or indeed inappropriate) names, I'd love to hear them!


Saturday, 5 March 2005

Technorati tags: "related tags", tag spamming, etc

[Added 10 April 2005: It looks like Technorati have officially launched their related tags now, see this post.]

Related Tags

I've just noticed that Technorati have now introduced a "Related Tags" feature, though I can't yet find anything on their site mentioning it.

After you search Technorati for a tag (for how, see e.g. my intro to Technorati tags), in your search results under "X posts from Y blogs match this tag" there's now a line that says:
Related Tags: a, b, c, d
(where a, b, c, and d are just clickable links to similar tags).

I mentioned the difficulty of finding tags on related subjects towards the end of my previous post (in the "Any downside..." section) - e.g. a search for "Humour" won't find things tagged with "Humor"; and I made the point that in my view Technorati needs some kind of thesaurus of synonyms or the like. It's good to know they've obviously been thinking along similar lines.

What I'd really be interested to find out is, how are those lists produced? Is there a Technorati human-maintained "thesaurus" of synonyms behind the scenes which gets looked up when you search for a tag? Or are Technorati running some whizzy software which looks at how people tag their own blogs, pics and bookmarks, and decides that if one person tags an item or post with a, b, c and d, then those tags must be related - and then it takes into account what tags other people are using as associated tags for their own items too, to come up with some kind of average weighted by "popular vote" as to which tags are related to each other?

I suspect it may be something like the latter, because the list of related tags is variable. If you have an idle moment or two, it can be fun doing random searches and seeing what comes up as related tags.

For instance, when searching for the tag "Funny", the related tags are said to be "News, Politics, random, Humor, Humour, Pictures, Blog, Music, Web, Gaming." I kinda like the association between "Politics" and "Funny"!

What's even more interesting is that if you search for one of the tags in the "Related Tags" list, you won't necessarily get back the other words that were on the same list (as you might expect with a pure thesaurus lookup system).

So, the tag for "Categories" gives you as related tags: Tags, wordpress, Coding, Plugins, Wanted, Weblog Technology, PHP. But the tag for "Tags" brings up as related tags, not "Categories", wordpress, Coding, etc - but: technorati, folksonomy, Blogging, Blog, pivot, Taggerati.
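To make my guess a bit more concrete, here's a rough sketch in Python of how a "popular vote" approach might work. To be clear, this is purely my own illustration and not Technorati's actual method (which I don't know): the function name `related_tags`, the `top_n` cut-off and the sample posts are all made up. It counts how often two tags appear on the same item, then lists each tag's most frequent companions.

```python
from collections import Counter, defaultdict
from itertools import permutations


def related_tags(tagged_items, top_n=2):
    """Guess related tags from co-occurrence counts.

    Hypothetical sketch only: NOT Technorati's real algorithm.
    tagged_items is a list of tag lists, one list per post/photo/bookmark.
    """
    cooccur = defaultdict(Counter)
    for tags in tagged_items:
        # Ignore case and duplicate tags within a single item.
        unique = {t.lower() for t in tags}
        # Every ordered pair on the same item is one "vote" that a relates to b.
        for a, b in permutations(unique, 2):
            cooccur[a][b] += 1

    related = {}
    for tag, counts in cooccur.items():
        # Most votes first; ties broken alphabetically so results are stable.
        ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
        related[tag] = [other for other, _ in ranked[:top_n]]
    return related


# Made-up sample data, loosely echoing the tags mentioned above.
posts = [
    ["Categories", "Tags", "wordpress"],
    ["Tags", "technorati", "folksonomy"],
    ["Tags", "technorati", "Blogging"],
    ["Tags", "folksonomy", "technorati"],
]

related = related_tags(posts, top_n=2)
print(related["categories"])
print(related["tags"])
```

With this sample data, "tags" shows up as related to "categories", but "categories" doesn't make the cut for "tags", because other tags get more votes there. A system along these lines would naturally give the lopsided lists described above, which a straight thesaurus lookup wouldn't.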

It's all early days still, of course, but if I'm right in my guess as to how Technorati are coming up with those related tags lists, this may be the start of possibly the next stage of development in folksonomies and tagging - namely, synonyms generated automatically by looking at what people consider to be synonyms, weighted according to the number of people who make the same word associations (or maybe combined with some other kind of weighting, who knows?). Fascinating stuff.

Tagging old posts; tag spamming

While I'm on Technorati, just a warning note: I've still not heard back from Technorati on why my original attempts at using their tags failed. But from correspondence about the possibility of tagging old posts (if I have the energy and time!), and how to then get those tagged posts re-indexed, I've found out that (as you'd expect) Technorati do have some kind of mechanism in place to pick out possible tag spammers, or at least link spammers.

And while I don't know how that works, if you have lots of Technorati tags on the same page (e.g. your main blog page), you do run the risk of being tagged (if you'll forgive the pun) as a link spammer.

Now I know Technorati can't give away the details of how they suss out link spam, or else the spammers could use that to circumvent their system, but all the same I personally would like some rough guidelines from them as to what we legit bloggers should or shouldn't do, in very general terms, to avoid being considered spammers.

The only thing I know for sure is, the fewer tags per page, the less likely you are to be blacklisted.

Which is not very good news for people like me who, trying to get around the "lack of synonyms" issue, tag with singular and plural variations of the same word plus related words in order to increase the likelihood of people finding the right information.

Old posts - final note: do NOT try to tag old posts and then ping Technorati with the exact URL of each post or your archive directory, in an attempt to get them to re-index your newly-tagged old posts. They say that that will mess up your blog listing on Technorati as their spider may then think each old post page is a separate blog...
