Updated: 24/9/05; 10:41:29

David Davies' Radio Weblog

 Saturday, March 29, 2003

David Wiley over at reusability.org is making up a guest list for a meeting to discuss the intersection between reusable learning objects and community. He's asked for nominations for the meeting. The rules are to list 5 people who should attend other than yourself. Hmm, only 5 people, eh? Who would be the 5 people I'd most like to meet? In no particular order...

Stephen Downes
David Carter-Tod
Sebastian Fiedler
Andy Powell and
D'Arcy Norman

There are of course loads of other people, including Raymond Yee, Ben Toth, Oliver Wrede, George Siemens and Sébastien Paquet, who are deep thinkers in this area, and of course I'd love to have the chance to meet all of them at a meeting like this, too.

Who would be on your list?

Posted 2:32:23 AM
 Monday, March 17, 2003

Here are some more RSS search engines that people have let me know about over the last 24 hours:

Feeder by Ben Nolan
rssSearch by François Schiettecatte
Snarf by Brady Gaster

I'm sure this list will grow quite rapidly.

As a service to the RSS community I'll happily maintain a list of these engines as an OPML file so that Dave can link it into the RSS directory. Please let me know if you are working on a search engine or know of one not yet on the list.

Postscript: Dave has created a new node in the directory.

Posted 3:14:24 PM

How do RSS search engines differ from 'conventional' HTML search engines such as Google? Ignore for now that Google can index other data sources such as PDF files.

The following are not criticisms of Feedster or any RSS search engine. Instead I see them as challenges and without challenges there'd be no progress. Some have easy solutions, others not. In this piece I've used Feedster as the canonical example of an RSS search engine. The questions raised are relevant to all RSS search engines but as Feedster is the nom du jour I've taken the liberty of using it as a convenient reference.

In no particular order (which of these matters most depends upon what you think is important):

1. Relevance ranking.

Compare these two searches for 'RSS search engines', firstly using Google then using Feedster. I know Feedster is just starting up but we need to know how it's searching its data and what the relevance ranking model is.

2. Finding and searching legacy data.

How will Feedster cope with legacy data? Take a look at my weblog home page RSS feed:

http://daviddavies.name/rss.xml

It lists only 3 items yet I've been writing this weblog for a couple of years. How will Feedster discover the rest of my written content? If the date of the first post listed in any one site's RSS file is taken as year 0 for that site, then fair enough, at least it'll be current. But as I write more, older posts will fall off the end of my RSS feed. Will these items be stored in Feedster's database or will the engine only ever search what's in the RSS feed at any one time? If the latter, then a lot of relevant content will always be hidden from Feedster's gaze (see also Walking the web - content scalability).
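To make the question concrete, here's a rough sketch of the "store it" option: each time a feed is fetched, any item not seen before is added to an archive and nothing is ever removed, so posts that fall off the end of the feed stay searchable. This is only an illustration and not how Feedster actually works. The function names and the choice of guid, then link, then title as the item key are my own assumptions, and it only handles RSS 0.9x/2.0 style feeds with a channel element.

```python
from xml.etree import ElementTree


def items_in_feed(rss_text):
    """Return (key, title) pairs for each <item> in an RSS 0.9x/2.0 document."""
    channel = ElementTree.fromstring(rss_text).find("channel")
    pairs = []
    for item in channel.findall("item"):
        title = item.findtext("title", default="")
        # Assumed key choice: prefer <guid>, then <link>, then the title.
        key = item.findtext("guid") or item.findtext("link") or title
        pairs.append((key, title))
    return pairs


def merge_into_archive(archive, rss_text):
    """Add unseen items to the archive dict and return how many were new.

    Items already in the archive are left alone, and items that have
    dropped out of the feed are never deleted.
    """
    added = 0
    for key, title in items_in_feed(rss_text):
        if key not in archive:
            archive[key] = title
            added += 1
    return added
```

Calling merge_into_archive on every fetch means the archive only ever grows. It still can't recover posts that had already dropped out of the feed before the first fetch, which is the legacy problem described above.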

3. Walking the web - content scalability.

Google can find new content by walking the web. It picks up a link and walks it to find pages with yet more links. Eventually the whole web, at least in theory, can be walked as Google plays the ultimate six degrees of Kevin Bacon game (everything is related to everything else by a number of links).

Feedster can walk this walk to some extent using RSS autodiscovery but sooner or later (most likely sooner) it'll come up against a page with no associated RSS file and presumably the walk stops there. One day maybe all pages will be part of an RSS feed somewhere but not for a while yet I suspect.
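As an illustration of the autodiscovery step, here's a minimal sketch that looks through a page's HTML for link elements with rel="alternate" and type="application/rss+xml" and returns the feed URLs it finds. An empty result is the point where the walk stops. The class and function names are made up for this example, and I'm not claiming this is how Feedster does it.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class FeedLinkFinder(HTMLParser):
    """Collects RSS autodiscovery links from an HTML page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.feeds = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attributes = dict(attrs)
        rel = (attributes.get("rel") or "").lower().split()
        mime = (attributes.get("type") or "").lower()
        href = attributes.get("href")
        if "alternate" in rel and mime == "application/rss+xml" and href:
            # Resolve relative hrefs against the page's own URL.
            self.feeds.append(urljoin(self.base_url, href))


def discover_feeds(html, base_url):
    """Return the RSS feed URLs advertised by a page (possibly none)."""
    finder = FeedLinkFinder(base_url)
    finder.feed(html)
    return finder.feeds
```

A crawler built on this could follow ordinary links from page to page but would only have something to index where discover_feeds returns a non-empty list.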

Here's my weblog:

http://daviddavies.name/

I actually use my weblog as 3 separate weblogs, the home page and 2 categories (in Radio UserLand parlance). So in Feedster presumably my weblog has 3 instances:

http://daviddavies.name/rss.xml
http://daviddavies.name/categories/smsblog/rss.xml
http://daviddavies.name/categories/theviewfromhere/rss.xml

Now could Feedster guess that? Well yes, to some extent, it could walk my weblog and discover the RSS links from each page coming up with these 3 unique RSS URLs. But could Feedster use my weblog's domain name to discover more RSS feeds from other people's weblogs? Probably not. The domain name where my weblog resides is:

http://radio.weblogs.com/

How can you infer from this what other weblogs exist in this domain, let alone what their home URLs are?

So my guess is that right now Feedster has a more complex database than Google. It probably has a table listing all unique RSS feed URLs then a larger database of each RSS item. The RSS items database is probably what gets searched. My guess is that Google doesn't maintain a separate table of top-level domains. Is this a problem for either system? Well, I guess it depends upon what you want to achieve. Feedster can use these extra data to its advantage in its advanced search. But so can Google.
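Purely to illustrate that guess, here's what a two-table layout might look like: one table of unique feed URLs and a larger table of items that is the thing actually searched. The table names, columns and the simple LIKE-based search are all my assumptions. I have no knowledge of Feedster's real schema.

```python
import sqlite3


def open_index():
    """Create an in-memory database with a feeds table and an items table."""
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE feeds (
            id  INTEGER PRIMARY KEY,
            url TEXT UNIQUE NOT NULL
        );
        CREATE TABLE items (
            id      INTEGER PRIMARY KEY,
            feed_id INTEGER NOT NULL REFERENCES feeds(id),
            title   TEXT,
            body    TEXT
        );
    """)
    return db


def add_item(db, feed_url, title, body):
    """Store one RSS item, registering its feed URL if it is new."""
    db.execute("INSERT OR IGNORE INTO feeds (url) VALUES (?)", (feed_url,))
    feed_id = db.execute(
        "SELECT id FROM feeds WHERE url = ?", (feed_url,)
    ).fetchone()[0]
    db.execute(
        "INSERT INTO items (feed_id, title, body) VALUES (?, ?, ?)",
        (feed_id, title, body),
    )


def search(db, term):
    """Return (feed URL, item title) pairs whose title or body contains term."""
    pattern = "%" + term + "%"
    rows = db.execute(
        "SELECT feeds.url, items.title FROM items "
        "JOIN feeds ON feeds.id = items.feed_id "
        "WHERE items.title LIKE ? OR items.body LIKE ? "
        "ORDER BY items.id",
        (pattern, pattern),
    )
    return rows.fetchall()
```

Keeping the feed URL in its own table is what would let an advanced search restrict results to one weblog or one category feed.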

Feedster can only discover content if it's part of an RSS feed. Google isn't limited in this respect and can find any page on the web providing it has a link to it from some page already in the Google database (you can of course suggest a link to both Google and Feedster to get a new site into their databases). Right now it would seem that RSS search engines are limited to a theoretically finite data-set of pages with an associated RSS feed. This may change with time (see also Finding and searching legacy data).

Posted 12:04:13 AM
 Sunday, March 16, 2003

Feedster is the latest RSS search service. As a friend of mine once said (when talking about girlfriends as it happens), this is only the latest and not the last. There were RSS feed search engines around some time ago. But the permanence of relationships aside, what makes a good RSS search engine, or any search engine for that matter? For me it's immediacy.

Google has a pretty quick turnaround such that when a new web page is available Google will index it pretty quickly. But quick in this context is a few days (at least for most sites - Google probably scans some sources much more frequently, e.g. Google News sites). This is fine for many web pages but not for current news, or news as it happens. For example, a community of weblogs in Iraq could be a vital source of information providing news as it happens. Couple that to a picture or video weblog and you've got a powerful voice. I want to search for these kinds of events now, not in a day or so. OK, I know I could bookmark a weblog and view it often if I wanted up-to-the-minute news but what if I didn't know a site existed and wanted to find it? Particularly just after an event has happened. Sometimes, even with more mundane searches, you want to find something that's just happened minutes ago.

Weblogs are a good source of up-to-the-minute news as well as RSS feeds. An advantage of searching an aggregation of RSS feeds is that prior art has created a system whereby when a weblog is updated it can ping a central server or servers to notify them that it has changed. A search engine that knows to index a weblog (or any other site) that has just changed will always be right up to date. Google uses other tricks to increase the relevance of a search result, for example by looking at how many links point to that page. An RSS search engine will need to give some thought to how to create a relevance ranking in the absence of a rich set of inbound links (maybe provenance of a site or weblog will increase ranking?). If Feedster uses the same ping protocols as weblogs.com then it's got the jump on other RSS search engines and search engines in general as it'll always have access to the most recent information, in theory information only minutes old. If it hasn't then maybe Scott should look into this because Google has bought Blogger so will likely be looking at something like this (and if not they should be!).
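Here's a small sketch of how a search engine could act on pings. A ping handler takes a weblog name and URL (the same two arguments as the weblogUpdates.ping XML-RPC call used by weblogs.com) and queues the site for immediate re-indexing. The PingQueue class, the reply text and the serve helper are my own inventions for illustration, and the serve function is included only to show one way the handler might be exposed.

```python
from collections import deque
from xmlrpc.server import SimpleXMLRPCServer


class PingQueue:
    """Queue of recently changed sites, fed by update pings."""

    def __init__(self):
        self.pending = deque()   # URLs waiting to be re-indexed, oldest first
        self.queued = set()      # the same URLs, for duplicate checks

    def ping(self, weblog_name, weblog_url):
        """Record that a weblog has changed; repeat pings are not queued twice."""
        if weblog_url not in self.queued:
            self.queued.add(weblog_url)
            self.pending.append(weblog_url)
        # Reply shape assumed here: an error flag plus a short message.
        return {"flerror": False, "message": "Thanks for the ping."}

    def next_to_index(self):
        """Return the next changed site to fetch, or None if nothing is waiting."""
        if not self.pending:
            return None
        url = self.pending.popleft()
        self.queued.discard(url)
        return url


def serve(queue, host="localhost", port=8080):
    """Expose the handler over XML-RPC (hypothetical host and port)."""
    server = SimpleXMLRPCServer((host, port))
    server.register_function(queue.ping, "weblogUpdates.ping")
    server.serve_forever()
```

The indexer would then pull from next_to_index in a loop, so a site gets fetched within moments of saying it has changed rather than waiting for the next scheduled crawl.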

Good luck Feedster!

Posted 5:10:10 PM
 Thursday, March 6, 2003

I should have checked I suppose but Bluetooth has its own home page. There's a list of gadgets currently using Bluetooth though the cameras page is uselessly empty.
Posted 11:32:40 PM

Yeah, I know I'm probably way behind but I just got Bluetooth on my iMac. Now my phone talks to my Mac. Cool. The connection is pretty quick and the fact that it's wireless is of course the big deal. As I sit here typing this my desk drawer is full of cables for all the phones and cameras I've had in the past. I can now ditch them all. But I need a new digital camera as my Nikon Coolpix uses a Compact Flash card and although I have a card reader a Bluetooth camera would be terrific. Are there any Bluetooth digital cameras out yet?
Posted 10:42:30 PM
 Monday, March 3, 2003

How do I convince my friend that she really needs a Radio Userland weblog?
Posted 10:45:04 PM
 Sunday, March 2, 2003

It makes sense that mobile weblogs should take advantage of the current crop of phone technologies to enrich posts. They started with plain text. Then text plus pictures and now text plus video! The Nokia 7650 is not only a great phone but thanks to the amazing (and free) T-Mobile video application it's also a video recorder. Sure, the videos are small, but who can remember the very first QuickTime video clips, not that much bigger than a postage stamp on screen, and how amazed we were? Truly breakthrough technology and look what QuickTime has developed into. Videos recorded on phones will get better and better in a very short time.

I also predict that mobile weblogs will soon fade, at least in name, as weblogs in general become enhanced by all the technologies available for posting. Soon it'll make no more sense to talk about having a mobile weblog than it would to call your regular weblog a stationary blog. You'll just have a weblog and how you post to it, and what you post to it, will depend upon where you are, what you're doing and what you want to say.

So I've made my first mobile video post. In days others will join in and very soon a new medium for personal expression will evolve. If the true value of weblogs is freedom of expression then come one, come all. In no time at all we'll look back and ask what all the fuss was about.

Note to eager m-bloggers: the assetManager tool now handles multimedia. An updated version will be available for you to download in a few days.

Posted 11:25:38 PM
 Saturday, March 1, 2003

The only way to keep up with Adam was to go out and buy a picture phone, so I got myself a Nokia 7650. Cool phone with lots of features.
Posted 11:03:52 PM