Five Thousand Years (Cherishing One's Own Humble Work)

Topic: A challenger to Google: Clusty -- 林小筑

家园 A challenger to Google: Clusty

http://clusty.com/

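Simply put, this search engine's selling point is that it automatically organizes search results into categories (clustering). For example, search for "java" and it automatically groups the results into the following categories.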

⇨Technology (32)

⇨Open Source (16)

⇨FAQ, Java programming (16)

⇨JavaScript (22)

⇨Tutorials (14)

⇨Java Applets (17)

⇨Games (13)

⇨Download Java (6)

⇨Reviews (9)

⇨Class (8)

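Some of these categories can be expanded and split into smaller subcategories. For example, expanding the Technology category gives the following.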

Technology (32)

⇨Developer Forums (2)

⇨Mobile, Information Device Profile (3)

⇨Marketplace For Java Technology (2)

⇨Servlets, XML (3)

⇨Microsoft (2)

⇨Apple, Mac (2)

⇨Certification Java (2)

⇨Java Programming (3)

⇨Other Topics (13)

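This is done with artificial intelligence techniques rather than manual classification by humans, so the results are naturally far from perfect. But it reflects a fresh idea: what should be done once the amount of information online reaches the point of overload? Computers should be used to help people filter and organize it.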
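To make the idea of grouping results concrete, here is a minimal sketch in Python. It is not Clusty's or Vivisimo's actual method, which is not described anywhere on the site; it simply assumes each result is a short title and labels it with the non-query word it shares with the most other titles. The function names, the stopword list, and the sample titles are all made up for illustration.

```python
from collections import Counter, defaultdict
import re

# Small illustrative stopword list (an assumption, not from any real engine).
STOPWORDS = {"a", "an", "and", "for", "in", "of", "the", "to", "with"}

def tokenize(text):
    """Lowercase a title and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())

def cluster_results(query, titles, min_size=2):
    """Group titles under the most widely shared term each one contains.

    Returns a dict mapping a label to the list of titles in that group;
    titles that share no frequent term land under "Other Topics".
    """
    ignored = STOPWORDS | set(tokenize(query))
    # Document frequency: in how many titles does each term appear?
    doc_freq = Counter()
    term_sets = []
    for title in titles:
        terms = set(tokenize(title)) - ignored
        term_sets.append(terms)
        doc_freq.update(terms)

    clusters = defaultdict(list)
    for title, terms in zip(titles, term_sets):
        # Candidate labels are terms shared by at least min_size titles.
        candidates = [t for t in terms if doc_freq[t] >= min_size]
        if candidates:
            # Prefer the most common term; break ties alphabetically.
            label = min(candidates, key=lambda t: (-doc_freq[t], t))
        else:
            label = "Other Topics"
        clusters[label].append(title)
    return dict(clusters)

if __name__ == "__main__":
    sample_titles = [
        "Java applets gallery",
        "Writing Java applets",
        "Java tutorials for beginners",
        "Free Java tutorials",
        "Download Java runtime",
    ]
    for label, members in cluster_results("java", sample_titles).items():
        print(f"{label} ({len(members)})")
```

A real system would need much more than this (phrases as labels, nested subcategories, overlapping groups), but the sketch shows why the output looks like a list of labels with counts plus a catch-all "Other Topics" bucket.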

Come to think of it, Google already has something similar: its news aggregator. http://news.google.com.hk/news?ned=cn&hl=zh-CN

http://clusty.com/

New Company Starts Up a Challenge to Google

September 30, 2004

By JOHN MARKOFF

SAN FRANCISCO, Sept. 29 - Google executives have long conceded that one of their great fears is to be overtaken by a more advanced Internet search technology. Vivisimo, a company founded by three former Carnegie Mellon University computer scientists, is hoping to prove that Google's worries are well founded.

Four-year-old Vivisimo plans to start Clusty, a free, consumer search service based on results from Yahoo's Overture engine, Thursday.

Vivisimo already offers a search service for corporate customers, which clusters results into categories to make them easier to sort through. Search "swift boat," for example, and Vivisimo returns 149 results - listing them one by one, and also as a table of categories, like "Swift Boat Veterans," "John Kerry" and "Patrol Craft Fast" on the left-hand side of the Web page.

The new Clusty service for consumers, which will be free and supported by advertising revenue, uses a similar organizational structure. But it also presents a series of tabs enabling the user to see results from sources besides the general Web, including shopping information, yellow pages, news, blogs, and images.

Vivisimo, which is privately held and is profitable, according to its executives, has been selling its clustering technology to corporations for research by their employees. Now Vivisimo is making an effort to compete more broadly by attracting consumers to its Web site, clusty.com.

The service is meant to address the confusion that can be created when search engines return huge lists. Clustering is also intended to help users find related material they may overlook when they employ services that utilize page ranking methods. Such methods employ a variety of software algorithms to rank Web pages by their perceived relevance to a query.

Many search experts say that clustering offers a better way of looking at information than Google's page ranking system.

"As databases get larger, trying to pull the proverbial needle out of the haystack gets tougher and tougher," said Gary Price, a librarian who is also the news editor at SearchEngineWatch, a Web site that covers the industry. "Here, you're getting a bit of extra help."

Vivisimo's co-founder and chief executive, Raul Valdes-Perez, was a protégé of Herbert A. Simon, a Nobel laureate who was a pioneer in artificial intelligence research. Before co-founding Vivisimo, Mr. Valdes-Perez was a computer scientist at Carnegie Mellon University. He professes that the way to deal with information overload is with information "overlook" - techniques that strip away extraneous information.

Clusty would generate money for Vivisimo by placing several search-related advertisements from Overture on the right-hand side of each page. Revenue from the ads would be shared by Vivisimo and Overture.

Unlike many start-ups, which are launched with venture capital financing, Vivisimo was created with help from a $1 million grant from the National Science Foundation Small Business Innovation Research program, which is intended to stimulate innovation by new companies.

Vivisimo is not the first to introduce clustering for Web surfers. Northern Light, a search engine company founded in 1996, had offered a consumer service featuring what it called "custom search folders." But that company is now focused on corporate applications.

Google is also using clustering technology, but in a more limited fashion: its news page provides links to topics that appear on news sites.

Microsoft and Yahoo have been drawn into the search business in part because of Google's profitability and rapidly growing revenue - $962 million for the quarter that ended in June, up from $389 million in the previous quarter.

The introduction of Clusty comes two weeks after A9, a subsidiary of Amazon.com, introduced a service focused on organizing information retrieved during various Web searches.

"Search will look more like the magazine business than the soda market," said Oren Etzioni, a computer scientist at University of Washington and an advisory board member of Vivisimo. He predicts that users might select from a variety of services, rather than from a few dominant players.

"The competition has shifted from crawling the Web and returning an answer quickly," Mr. Etzioni said, "to adding value to the information that has been retrieved."

A Google spokesman declined to comment on the service.

Vivisimo's executives are betting that there is an audience for providing a different view of Web search results.

"Google is excellent at crawling as much of the Web as they can; we don't do that," said Mr. Valdes-Perez. Instead, Vivisimo tackles the question, "How do you solve the problem of information overload?"

http://www.nytimes.com/2004/09/30/technology/30search.html?ex=1097903707&ei=1&en=87e20490beecdd4b

家园 Gave it a try

It feels better organized than Google, so now we have one more tool in our arsenal.

But it doesn't support Chinese!

家园 Not bad, it has some new ideas!
家园 I'll give it a try too.
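家园 Probably a real-world version of Automated Categorization

Judging from the description, it is presumably still based on text/context and keywords. Research in this area has been going on for a long time (in fact, since well before search engines), but it has never been put into practice across the entire Web the way Google has.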

A recent project is doing something similar: adding a categorization option to a search engine.
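For readers unfamiliar with the term, keyword-based categorization can be sketched roughly as below. This is only an illustration under my own assumptions, not the project mentioned above and not anything Clusty has published: the category table, the function names, and the "count overlapping keywords" scoring rule are all hypothetical.

```python
import re

# Hypothetical category -> keyword table, invented for illustration.
CATEGORY_KEYWORDS = {
    "Tutorials": {"tutorial", "tutorials", "guide", "learn"},
    "Games": {"game", "games", "arcade"},
    "Download": {"download", "installer", "runtime"},
}

def categorize(text, categories=CATEGORY_KEYWORDS, default="Other Topics"):
    """Return the category whose keywords overlap most with the text."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    best, best_score = default, 0
    for name, keywords in categories.items():
        score = len(words & keywords)
        # Strictly greater, so earlier categories win ties.
        if score > best_score:
            best, best_score = name, score
    return best

def search_with_categories(results):
    """Attach a category to each result, as a search engine option might."""
    return [(categorize(r), r) for r in results]
```

Unlike clustering, the categories here are fixed in advance, which is the main difference between categorization and what Clusty appears to do.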

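家园 Could you point me to the description you mentioned?

I didn't see any technical description on Clusty. Did you find one?

Could you also point me to some material on the techniques you mentioned?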

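家园 It does seem to support Chinese after all

I just ran a search in Chinese. Its Chinese content still can't compare with Google or Baidu. It also has no cached pages, and some of the links turn out to be long dead when you click them, whereas with Google and Baidu you can still learn something of the content from the cached copy.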

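家园 Gave it a try; it seems to return fewer results than Google

I think Google is enough for the ordinary dummy user. This new search engine may be more useful for people who want to do research; for example, they can use the categories to do a thorough review of the search results.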

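家园 Right, it is really a repackaging of Overture, so it is handicapped from the start

Overture's results are of course no match for Google's.

Clusty merely points out a direction: because there is simply too much information, the computer has to do some filtering and distilling before returning search results if they are to be more useful to people. This shows that search engines still have plenty of room to grow.

Google has a deep talent pool. Building something similar to Clusty (which presumably relies on natural language processing and machine learning, both Google strengths) should not be hard for them, and they could probably do it even better.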


Copyright © cchere 西西河