May 15, 2003
The Doc Searls Weblog : Thursday, May 15, 2003
http://doc.weblogs.com/

"Web search is the most challenging field in computer science," says Flake. It calls on skills in operating systems, network architecture, artificial intelligence, linguistics, probability theory and fuzzy logic. A relevance ranking may evaluate relationships among words, page links, even a page's word count. He aspires to a search system with the ability to index 100 billion documents without falling apart.
Flake plans to add machine learning, which improves each search by drawing on past efforts. Data from human editors who currently review key words for their relevancy to Web pages will be keyed into the machine learning process.

Interestingly enough, just this morning I was perusing a spreadsheet sent from Overture to show how the 100+ keywords we currently maintain could be augmented by additional keywords. The spreadsheet was well done and included different options for appearing 1st, 2nd, and 3rd in the listings. It included some very savvy keyword combinations based on differing formulas revolving around the current group. It was essentially a pitch to double our keywords while getting them at a substantially lower cost than our current average. Upon looking at it, I was certain that it had been produced by machine logic. Others were certain it was done by hand by a real individual. I don't know for sure, but Doc's quote makes me think I was probably right.

Posted by filchyboy at May 15, 2003 12:00 AM
colophon
The tools used at any one point in time for this document are hard to pin down. How I publish changes on a regular basis, as I publish from several different platforms and in many different contexts. As I learn, I am slowly building a network of my own so that I can publish to my brain dump anytime, anywhere. The following list is a good stab at the tools and responsible parties:
perl, php, rss, opml, radio, activeRenderer, blogger pro, netnewswire, google, apple, adobe, microsoft, winer, zerolag, cornerhost.