The following sections describe the Verity Spider path and URL options.
-auth path_and_filename
Specifies an authorization file to support authentication for secure paths.
The authorization file contains one record per line. Each record consists of four fields, in this order: server, realm, username, and password. The fields are separated by whitespace.
The following is a sample authorization file:
# This is the Authorization file for HTTP's Basic Authentication
#server realm username password
doleary MACR my_username my_password
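To make the record layout concrete, the following is a minimal Python sketch of a reader for a file in this format. It is not Verity Spider's own parser. The names `AuthRecord` and `parse_auth_file` are hypothetical, and the sketch assumes that blank lines and lines beginning with `#` are ignored.

```python
from typing import List, NamedTuple


class AuthRecord(NamedTuple):
    """One authorization record: four whitespace-separated fields."""
    server: str
    realm: str
    username: str
    password: str


def parse_auth_file(text: str) -> List[AuthRecord]:
    """Parse authorization records, one per line."""
    records = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        stripped = line.strip()
        # Assumption: blank lines and '#' comment lines carry no record.
        if not stripped or stripped.startswith("#"):
            continue
        fields = stripped.split()
        if len(fields) != 4:
            raise ValueError(
                f"line {line_number}: expected 4 fields, got {len(fields)}"
            )
        records.append(AuthRecord(*fields))
    return records


SAMPLE = """\
# This is the Authorization file for HTTP's Basic Authentication
#server realm username password
doleary MACR my_username my_password
"""

for record in parse_auth_file(SAMPLE):
    print(record.server, record.realm, record.username)
```

Because the fields are split on whitespace, a value that itself contains a space would not survive this sketch intact.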
-cgiok
Web crawling only
Lets you index URLs that contain query strings, that is, a question mark (?) followed by additional information. Such a URL typically leads to a CGI or other processing program.
The document that the web server returns is indexed and parsed for document links, which are followed and, in turn, indexed and parsed. However, if the web server does not return a page, for example because the URL is missing parameters that the program requires to produce one, there is nothing to index or parse.
The following is a URL without parameters:
http://server.com/cgi-bin/program?
If you include parameters in the URL to be indexed, as specified with the -start option, those parameters are processed and any resulting pages are indexed and parsed.
By default, a URL with a question mark (?) is skipped.
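The default skip rule can be sketched in Python as follows. This is an illustration of the behavior described above, not Verity Spider code. The function `should_index` and its `allow_query_urls` flag are hypothetical names that stand in for enabling this option.

```python
def has_query_marker(url: str) -> bool:
    """Return True if the URL contains a question mark (?)."""
    return "?" in url


def should_index(url: str, allow_query_urls: bool = False) -> bool:
    """Decide whether a crawler would attempt to index the URL.

    By default, any URL with a question mark is skipped. When
    allow_query_urls is True (hypothetical flag), such URLs are attempted.
    """
    return allow_query_urls or not has_query_marker(url)


for candidate in (
    "http://server.com/index.html",
    "http://server.com/cgi-bin/program?",
):
    print(candidate, should_index(candidate), should_index(candidate, True))
```

Passing the check only means the URL is requested. Whether anything is indexed still depends on the server returning a page.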
-domain name_1 [name_n] ...
Web crawling only
Limits indexing to the specified domain(s). You must use only complete text strings for domains. You cannot use wildcard expressions. URLs not in the specified domain(s) are not downloaded or parsed.
You can list multiple domains by separating each one with a single space.
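A domain filter of this kind might look like the Python sketch below. The source does not say how a host name is compared with a listed domain, so the matching rule here is an assumption: a host matches if it equals the domain or ends with a dot followed by the domain. The names `parse_domains` and `is_in_domain` are hypothetical.

```python
from typing import Set
from urllib.parse import urlsplit


def parse_domains(argument: str) -> Set[str]:
    """Split a space-separated domain list into a set of lowercase names."""
    return {name.lower() for name in argument.split()}


def is_in_domain(url: str, allowed: Set[str]) -> bool:
    """Return True if the URL's host belongs to one of the allowed domains.

    Assumption: a host matches a domain when it is the domain itself or a
    subdomain of it. No wildcard syntax is interpreted.
    """
    host = urlsplit(url).hostname
    if host is None:
        return False
    host = host.lower()
    return any(host == d or host.endswith("." + d) for d in allowed)


allowed = parse_domains("server.com example.org")
for url in ("http://www.server.com/a.html", "http://other.net/b.html"):
    print(url, is_in_domain(url, allowed))
```

URLs that fail the check would be neither downloaded nor parsed.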
-followdup
Specifies that Verity Spider follows links within duplicate documents, although only the first instance of any duplicate document is indexed.
You might find this option useful if you use the same home page on multiple sites. By default, only the first instance of the document is indexed, while subsequent instances are skipped. If you have different secondary documents on the different sites, using the -followdup option lets you get to them for indexing, while still indexing the common home page only once.
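The home page scenario can be modeled with a small Python sketch over an in-memory set of pages. It is a simplified illustration, not Verity Spider's algorithm. It assumes that duplicates are detected by hashing page content, and the `crawl` function, its `follow_dup` parameter, and the example URLs are all hypothetical.

```python
import hashlib
from collections import deque
from typing import Dict, List, Tuple

# Each URL maps to (content, resolved links). The two home pages share the
# same content, but their relative links resolve to different sites.
PAGES: Dict[str, Tuple[str, List[str]]] = {
    "http://a.example/": ("<home>", ["http://a.example/products.html"]),
    "http://b.example/": ("<home>", ["http://b.example/support.html"]),
    "http://a.example/products.html": ("products", []),
    "http://b.example/support.html": ("support", []),
}


def crawl(pages, start_urls, follow_dup=False) -> List[str]:
    """Return the URLs that get indexed, in crawl order."""
    seen_urls, seen_digests, indexed = set(), set(), []
    queue = deque(start_urls)
    while queue:
        url = queue.popleft()
        if url in seen_urls or url not in pages:
            continue
        seen_urls.add(url)
        content, links = pages[url]
        digest = hashlib.sha256(content.encode("utf-8")).hexdigest()
        is_duplicate = digest in seen_digests
        if not is_duplicate:
            # Only the first instance of a document is indexed.
            seen_digests.add(digest)
            indexed.append(url)
        if not is_duplicate or follow_dup:
            # Links in a duplicate are followed only when follow_dup is set.
            queue.extend(links)
    return indexed


starts = ["http://a.example/", "http://b.example/"]
print(crawl(PAGES, starts))
print(crawl(PAGES, starts, follow_dup=True))
```

Without `follow_dup`, the second site's support page is never reached because its only entry point is a skipped duplicate. With it, the page is indexed while the shared home page is still indexed once.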
-followsymlink
File system only
Specifies that Verity Spider follows symbolic links when indexing UNIX file systems.
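For comparison, the same choice exists in Python's standard `os.walk`, which descends into symbolically linked directories only when `followlinks=True`. The sketch below is an analogy for the behavior described above, not Verity Spider code. The `list_files` function and its `follow_symlinks` parameter are hypothetical names.

```python
import os
from typing import List


def list_files(root: str, follow_symlinks: bool = False) -> List[str]:
    """Return sorted paths, relative to root, of files found under root.

    With follow_symlinks=False, symbolically linked directories are not
    entered and symbolically linked files are skipped.
    """
    found = []
    for dirpath, _dirnames, filenames in os.walk(root, followlinks=follow_symlinks):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if not follow_symlinks and os.path.islink(path):
                continue  # Skip file symlinks when links are not followed.
            found.append(os.path.relpath(path, root))
    return sorted(found)


if __name__ == "__main__":
    print(list_files(".", follow_symlinks=False))
```

Following links can lead to endless traversal if a link points back to one of its own parent directories, so a traversal of this kind needs some protection against cycles.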