Representativity of Web as Corpus
Much ill-formed or fragmentary language
Domain only a rough clue to provenance
Numbers vs. Statistics
Search engines report the number of pages matching a query, not actual citations
One page may contain alternate usages
Narrower filters may eliminate some pages
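
The gap between page counts and actual citations can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the three "pages" are invented toy texts, and the function names (page_hits, occurrences, exclusive_hits) are hypothetical helpers, not part of any search engine's interface.

```python
import re

# Hypothetical toy "web pages"; the texts are invented for illustration only.
PAGES = [
    "We must differentiate between the two. Learn to differentiate between them.",
    "Some writers differentiate among options, others differentiate between them.",
    "Nothing relevant here.",
]


def _pattern(phrase):
    """Case-insensitive literal match for a phrase."""
    return re.compile(re.escape(phrase), re.IGNORECASE)


def page_hits(pages, phrase):
    """Number of pages containing the phrase at least once.

    This is the kind of figure a search-engine hit count approximates.
    """
    pat = _pattern(phrase)
    return sum(1 for page in pages if pat.search(page))


def occurrences(pages, phrase):
    """Total number of times the phrase occurs across all pages."""
    pat = _pattern(phrase)
    return sum(len(pat.findall(page)) for page in pages)


def exclusive_hits(pages, phrase, alternate):
    """Pages containing the phrase but not the alternate usage (a narrower filter)."""
    pat, alt = _pattern(phrase), _pattern(alternate)
    return sum(1 for page in pages if pat.search(page) and not alt.search(page))


if __name__ == "__main__":
    between, among = "differentiate between", "differentiate among"
    print("pages matching:", page_hits(PAGES, between))
    print("actual occurrences:", occurrences(PAGES, between))
    print("pages after excluding the alternate:", exclusive_hits(PAGES, between, among))
```

On this toy data, the page count and the occurrence count differ because one page uses the phrase twice, and the second page contains both usages. Filtering out pages with the alternate usage also discards that page's legitimate instance of the phrase.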