These tasks are becoming increasingly difficult as the Web grows. However,
hardware performance and cost have improved dramatically to partially offset the
difficulty. There are, however, several notable exceptions to this progress such as
disk seek time and operating system robustness. In designing Google, we have
considered both the rate of growth of the Web and technological changes. Google is
designed to scale well to extremely large data sets. It makes efficient use of storage
space to store the index. Its data structures are optimized for fast and efficient access.
Further, we expect that the cost to index and store text or HTML will eventually
decline relative to the amount that will be available. This will result in favorable
scaling properties for centralized systems like Google.
TEXT 8
Divide the text into paragraphs. Express the main idea of each paragraph.
Google is designed to be a scalable search engine. The primary goal is to
provide high-quality search results over a rapidly growing World Wide Web. Google
employs a number of techniques to improve search quality, including page rank,
anchor text, and proximity information. Furthermore, Google is a complete
architecture for gathering web pages, indexing them, and performing search queries
over them. A large-scale web search engine is a complex system and much remains to
be done. Our immediate goals are to improve search efficiency and to scale to
approximately 100 million web pages. Some simple improvements to efficiency
include query caching, smart disk allocation and subindices. Another area which
requires much research is updates. We must have smart algorithms to decide what old
web pages should be recrawled and what new ones should be crawled. Work toward
this goal has already been done. One promising area of research is using proxy caches to
build search databases, since they are demand driven. We are planning to add simple
features supported by commercial search engines, like Boolean operators, negation,
and stemming. However, other features, such as relevance feedback and clustering,
are just starting to be explored (Google currently supports a simple hostname-based
clustering). We also plan to support user context (like the user's location) and
result summarization. We are also working to extend the use of link structure and link
text. Simple experiments indicate PageRank can be personalized by increasing the
weight of a user's home page or bookmarks. As for link text, we are experimenting
with using text surrounding links in addition to the link text itself. A Web search
engine is a very rich environment for research ideas. We have far too many to list
here, so we do not expect this Future Work section to become much shorter in the
near future. The biggest problem facing users of web search engines today is the
quality of the results they get back. While the results are often amusing and expand
users' horizons, they are often frustrating and consume precious time. For example,
the top result for a search for Bill Clinton on one of the most popular commercial
search engines was the Bill Clinton Joke of the Day: April 14, 1997. Google is
designed to provide higher-quality search so that, as the Web continues to grow rapidly,
information can be found easily. In order to accomplish this, Google makes heavy use