The smallest unit of top-level text in an HTML page; that is, a token of text, produced by an
HTML parse, that lives outside of HTML tags -- not a tag name, not an attribute, and not part of
a style or of JavaScript.
After a word has been stemmed, it can be retrieved by toString(),
or a reference to the internal buffer can be retrieved by getResultBuffer()
and getResultLength(), which is generally more efficient.
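
The two retrieval styles can be sketched as follows. This is only an illustration of the access pattern, not the library's stemmer: the class name SketchStemmer, the add() and stem() methods, and the one-rule "drop a trailing s" stemming are all assumptions made up for the example. Only toString(), getResultBuffer(), and getResultLength() are named in the description above.

```java
import java.util.Arrays;

/** Hypothetical stand-in for a stemmer; only the result-access pattern matters here. */
final class SketchStemmer {
    private char[] buffer = new char[16];
    private int length;

    /** Appends one character of the word to be stemmed (assumed method). */
    void add(char ch) {
        if (length == buffer.length) {
            buffer = Arrays.copyOf(buffer, buffer.length * 2);
        }
        buffer[length++] = ch;
    }

    /** Placeholder rule, NOT a real stemming algorithm: drop one trailing 's'. */
    void stem() {
        if (length > 1 && buffer[length - 1] == 's') {
            length--;
        }
    }

    /** Copies the stemmed word into a new String, which allocates. */
    @Override
    public String toString() {
        return new String(buffer, 0, length);
    }

    /** Returns the internal buffer itself; only the first getResultLength() chars are valid. */
    char[] getResultBuffer() {
        return buffer;
    }

    /** Number of valid characters in the result buffer. */
    int getResultLength() {
        return length;
    }

    /** Counts the vowels of the stem by reading the buffer directly, with no String copy. */
    static int vowelsInStem(String word) {
        SketchStemmer stemmer = new SketchStemmer();
        for (int i = 0; i < word.length(); i++) {
            stemmer.add(word.charAt(i));
        }
        stemmer.stem();
        char[] chars = stemmer.getResultBuffer();
        int count = 0;
        for (int i = 0; i < stemmer.getResultLength(); i++) {
            if ("aeiou".indexOf(chars[i]) >= 0) {
                count++;
            }
        }
        return count;
    }
}
```

The buffer route avoids creating a String per word, which is presumably why the description calls it more efficient; the caller must not read past getResultLength(), since the buffer may be longer than the stem.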
toString() -
Method in class ecologylab.semantics.model.text.Term
Add the directory that this URL references to the traversable set; that is, to the bounding set
of path prefixes that we are willing to download from, given "limit traversal." Because this is
called automatically, as well as through traversable|, it parses the URL to the directory level,
removing any filename portion.
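
A minimal sketch of the idea, assuming hierarchical http(s) URLs: reduce a URL to its directory, store that as a prefix, and accept only URLs that start with a stored prefix. The class TraversableSketch and its methods are hypothetical names for illustration and are not taken from the library. As a simplification, a final path segment without a trailing slash is always treated as a filename.

```java
import java.net.URI;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical illustration of a bounding set of directory-level URL prefixes. */
final class TraversableSketch {
    private final Set<String> prefixes = new HashSet<>();

    /** Reduces a URL to its directory level by dropping everything after the last '/' of the path. */
    static String directoryOf(String url) {
        URI uri = URI.create(url);
        String path = uri.getPath();
        if (path == null || path.isEmpty()) {
            path = "/"; // a bare host such as http://example.com maps to its root directory
        }
        String directory = path.substring(0, path.lastIndexOf('/') + 1);
        // Query string and fragment are deliberately dropped.
        return uri.getScheme() + "://" + uri.getAuthority() + directory;
    }

    /** Adds the directory that the URL references to the set of allowed prefixes. */
    void addTraversable(String url) {
        prefixes.add(directoryOf(url));
    }

    /** True if the URL lies under any stored directory prefix. */
    boolean isTraversable(String url) {
        for (String prefix : prefixes) {
            if (url.startsWith(prefix)) {
                return true;
            }
        }
        return false;
    }
}
```

With this sketch, adding http://example.com/a/b/page.html stores the prefix http://example.com/a/b/, so pages deeper in that directory are accepted while http://example.com/a/x.html is not.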
TRAVERSABLE -
Static variable in class ecologylab.semantics.seeding.Seed