Commit b7f05445c00f has added WWW entries to port Makefiles based on WWW: lines in pkg-descr files. This commit removes the WWW: lines of moved-over URLs from these pkg-descr files. Approved by: portmgr (tcberner)
This module takes a list of documents (in English) and
builds a simple in-memory search engine using a vector
space model. Documents are stored as PDL objects, and
after the initial indexing phase, the search should be
very fast. This implementation applies a rudimentary
stop list to filter out very common words, and uses a
cosine measure to calculate document similarity.
All documents above a user-configurable similarity
threshold are returned.
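
The approach described above can be illustrated with a minimal sketch. It is
written in Python rather than Perl and does not use PDL or the module's own
API; the function names, the tiny stop list, and the default threshold of 0.5
are all assumptions made for illustration. It only shows the general idea:
index documents as term-count vectors, drop stop words, and return documents
whose cosine similarity to the query exceeds a threshold.

```python
import math
import re
from collections import Counter

# Hypothetical, deliberately tiny stop list (the module's real list is not shown here).
STOP_WORDS = {"a", "an", "and", "the", "of", "in", "on", "is", "to"}


def tokenize(text):
    """Lowercase the text, keep alphabetic words, and drop stop words."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOP_WORDS]


def normalize(vec):
    """Scale a vector to unit length; an all-zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else vec


def build_index(docs):
    """Indexing phase: one unit-length term-count vector per document."""
    counts = [Counter(tokenize(d)) for d in docs]
    vocab = sorted(set().union(*counts))
    vectors = [normalize([c[t] for t in vocab]) for c in counts]
    return vocab, vectors


def search(query, docs, vocab, vectors, threshold=0.5):
    """Return (document, score) pairs whose cosine similarity exceeds threshold."""
    q = Counter(tokenize(query))
    qv = normalize([q[t] for t in vocab])
    hits = []
    for doc, dv in zip(docs, vectors):
        # Both vectors are unit length, so the dot product is the cosine.
        score = sum(a * b for a, b in zip(qv, dv))
        if score > threshold:
            hits.append((doc, score))
    return sorted(hits, key=lambda h: h[1], reverse=True)


docs = [
    "The cat sat on the mat",
    "Dogs chase cats in the park",
    "Stock markets fell sharply",
]
vocab, vectors = build_index(docs)
for doc, score in search("cat mat", docs, vocab, vectors):
    print(f"{score:.3f}  {doc}")
```

Because the document vectors are normalized once during indexing, each query
costs only one dot product per document. The sketch does no stemming, so
"cat" and "cats" are treated as different terms.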