With the Internet query parser, users can search entire documents or parts of documents (zones and fields) by entering words, phrases, and plain language similar to that used by many web search engines. ColdFusion supports two Internet query parsers in the cfsearch type attribute.
Internet: Uses standard, web-style query syntax. For more information, see Query syntax.
Internet_basic: Similar to Internet. This query parser enhances performance, but produces less accurate relevancy statistics.
In a search form enabled with the Internet query parser, users can enter words, phrases, and plain language. The Internet parser does not support the Verity query language (VQL).
To search for multiple words, separate them with spaces.
To search for an exact phrase, enclose it in double quotation marks. A string of capitalized words is assumed to be a name. Separate a series of names with commas. Commas aren't needed when the phrases are enclosed in quotation marks.
The following example searches for a document that contains the phrases "San Francisco" and "sourdough bread":
"San Francisco" "sourdough bread"
To search with plain language, enter a question or concept. The Internet query parser identifies the important words and searches for them. For example, enter a question such as:
Where is the sales office in San Francisco?
This query produces the same results as entering:
sales office San Francisco
You can limit searches by excluding or requiring search terms, or by limiting the areas of the document that are searched.
A minus sign (-) immediately preceding a search term (word or phrase) excludes documents containing the term.
A plus sign (+) immediately preceding a search term (word or phrase) requires that every returned document contain the term.
If neither sign is associated with the search term, the results may include documents that do not contain the specified term as long as they meet other search criteria.
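The include and exclude rules above can be sketched in Python as a toy model. This is not how Verity implements the parser: the names classify, contains, and matches are hypothetical, matching is plain case-insensitive whole-word text matching with no relevancy ranking, and the rule that at least one unsigned term must appear when nothing is required is an assumption made for the illustration.

```python
import re

# An optional sign, then either a quoted phrase or a single word.
TERM_RE = re.compile(r'([+-]?)(?:"([^"]*)"|(\S+))')

def classify(query):
    """Split a query into (required, excluded, optional) term lists."""
    required, excluded, optional = [], [], []
    for sign, phrase, word in TERM_RE.findall(query):
        term = phrase or word
        if sign == "+":
            required.append(term)
        elif sign == "-":
            excluded.append(term)
        else:
            optional.append(term)
    return required, excluded, optional

def contains(text, term):
    """Case-insensitive whole-word (or whole-phrase) test."""
    pattern = r"\b" + re.escape(term) + r"\b"
    return re.search(pattern, text, re.IGNORECASE) is not None

def matches(text, query):
    """Toy decision: does this document text satisfy the query?"""
    required, excluded, optional = classify(query)
    if any(contains(text, t) for t in excluded):
        return False  # a minus-signed term is present
    if not all(contains(text, t) for t in required):
        return False  # a plus-signed term is missing
    if required:
        return True   # unsigned terms are optional here
    # Assumption: with no required terms, at least one unsigned term must appear.
    return any(contains(text, t) for t in optional)
```

For example, matches("Rum cake recipes", "cake recipes -rum") returns False because the excluded term is present, while matches("Chocolate cake recipes", "cake recipes +chocolate") returns True.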
The Internet parser lets users perform field searches. The fields that are available for searching depend on field extraction rules based on the document type of the documents in the collection.
To search a document field, type the name of the field, a colon (:), and the search term with no spaces.
field:term
If you enter a minus sign (-) immediately preceding the field search specification, such as -field:term, documents that contain the specified term in the specified field are excluded from the search results.
If you enter a plus sign (+) immediately preceding the field search specification, such as +field:term, documents are included in the search results only if the search term is present in the specified field.
Field searches are enabled by the enableField parameter in a template file. This parameter, set to 0 by default, must be set to 1 to allow searching a document field.
The query syntax is similar to the syntax that users expect to use on the web. The following examples show typical queries:
cake recipes
"chocolate cake" recipe
cake recipes -rum
cake recipes NOT rum
cake recipes +chocolate
You can search fields or zones by specifying name:term, where:
name is the name of the field or zone
term is an individual search term or phrase
For example:
bakery city:"San Francisco"
bakery city:Sunnyvale
For more information, see Refining your searches with zones and fields.
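To make the name:term form concrete, the following Python sketch breaks a query into its sign, field or zone name, and term. It only illustrates the syntax described here and is not the parser's actual logic; the parse_query function and the dictionary keys are hypothetical, and field names are assumed to consist of letters, digits, and underscores.

```python
import re

# Optional sign, optional "name:" prefix, then a quoted phrase or a single word.
FIELD_RE = re.compile(r'([+-]?)(?:(\w+):)?(?:"([^"]*)"|(\S+))')

def parse_query(query):
    """Return one dict per search term: its sign, field (or None), and term."""
    parts = []
    for sign, field, phrase, word in FIELD_RE.findall(query):
        parts.append({
            "sign": sign,             # "+", "-", or "" when no sign was given
            "field": field or None,   # None means the term is not tied to a field
            "term": phrase or word,
        })
    return parts
```

With this sketch, parse_query('bakery city:"San Francisco"') yields an unrestricted term bakery and the phrase San Francisco restricted to the city field.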
Search terms are passed through to the VDK level and are interpreted as Verity Query Language (VQL) syntax. No issues arise if the terms contain only alphabetic or numeric characters. Other kinds of characters might be interpreted by the language you're using. If a term contains a character that is not handled by the specified language, it might be interpreted as VQL. For example, a search term that includes an asterisk (*) might be interpreted as a wildcard.
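An application that builds search criteria from user input might want to flag terms that contain anything other than letters and digits before passing them on. The Python sketch below shows one way to do that check. The special_characters helper is hypothetical, and the check only reports such characters; it does not know which of them Verity or the configured language actually treats specially.

```python
def special_characters(term):
    """Return the characters in term that are neither letters nor digits."""
    return [ch for ch in term if not ch.isalnum()]

def needs_review(term):
    """True when a term contains any character that might not be taken literally."""
    return bool(special_characters(term))
```

For example, special_characters("recipe*") returns the asterisk, while needs_review("cake") is False.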
The configurable Internet query parser uses its own stop-word list, qp_inet.stp, to specify terms to ignore for natural language processing.
For example, the following stop words are provided in the query parser's stop-word file for the English (Basic) template:
a | did | i | or | what
also | do | i'm | should | when
an | does | if | so | where
and | find | in | than | whether
any | for | is | that | which
am | from | it | the | who
are | get | its | there | whose
as | got | it's | to | why
at | had | like | too | will
be | has | not | want | with
but | have | of | was | would
can | how | on | were | <or>
Verity provides a populated stop-word file for the English and English (Advanced) languages. You do not need to modify the qp_inet.stp file for these languages. If you use the configurable Internet query parser for another language, you must provide your own qp_inet.stp file that contains the stop words that you want to ignore in that language. This stop-word file must contain, at a minimum, the language-equivalent words for or and <or>.
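The effect of a stop-word list on a plain-language query can be sketched in Python. The word list below is copied from the English (Basic) list in this section; the important_words function and the simple tokenizing pattern are assumptions made for illustration and do not reflect how the parser reads qp_inet.stp or processes natural language internally.

```python
import re

# Stop words from the English (Basic) list shown in this section.
STOP_WORDS = set("""
a did i or what also do i'm should when an does if so where
and find in than whether any for is that which am from it the who
are get its there whose as got it's to why at had like too will
be has not want with but have of was would can how on were <or>
""".split())

def important_words(question):
    """Drop stop words and punctuation, keeping the remaining words in order."""
    words = re.findall(r"[A-Za-z0-9']+", question)
    return [w for w in words if w.lower() not in STOP_WORDS]
```

Under these assumptions, important_words("Where is the sales office in San Francisco?") keeps only sales, office, San, and Francisco, which mirrors the equivalent query shown earlier for that question.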