Users browsing a Web site should be able to locate the information they need in just a few minutes. Information at a Web site is of little use to anyone if it is hard to reach. For a moment, think what the World Wide Web would be if not for various Web site cataloging databases. Without these cataloging and search databases, the Internet community would be unable to locate the information it needs. The same concept applies to your Web site on a smaller scale.
In this chapter, you learn how you can make your Web site searchable by setting up a Web search engine. Several search engines are available for Windows NT. The first section discusses how you can use the Verity topicSEARCH engine to make a Web site searchable. This search engine is an Internet Server Application Programming Interface (ISAPI) application that is designed to take advantage of features of Internet Information Server (IIS) and make a Web site searchable while conserving system resources.
In the other sections, you learn how you can use the built-in search engines of WebSite and the Netscape server to make a Web site searchable. After reading this chapter, you will be able to make a Web site hosted with IIS, WebSite, or Netscape searchable in a matter of minutes.
Purveyor also is shipped with Verity's search engine. Although you do not specifically learn how a Web site hosted with Purveyor can be made searchable in the following section, you do get an idea how to do so. Refer to Purveyor documentation for additional information about setting up and using the Verity search engine.
In this section, you learn how you can make a Web site hosted
with IIS searchable using the Verity topicSEARCH engine. You can
download the topicSEARCH engine from Verity's Web site. Note that
you might need to fill in a form and submit it before you are
given a username and a password to download the search engine.
URL |
Verity topicSEARCH engine download site: http://www.verity.com/products/topicSEARCH.html |
After you download the Verity search engine, copy it to a temporary directory and decompress it. Then run the file setup.exe to begin installing the Verity search engine. The installation program first gathers some information from you, such as your name and company name. Then a dialog box similar to the one shown in Figure 15.1 appears. In the Choose Destination Location dialog box, you specify the target directory of the Verity search engine. After you select a directory, click the Next button to continue. In the next dialog box that appears, you select an NT Start Menu program folder for the search engine. Either select an existing folder or type in the name of a new folder, and click the Next button to continue.
Figure 15.1: Choosing a destination location of topicSEARCH.
Security |
The Verity search engine should be installed on an NTFS partition. If it is installed on a FAT partition, unauthorized users can access the search engine's administration menu and potentially abuse its functionality. |
The installation program then installs the topicSEARCH search engine and displays a Setup Complete dialog box like the one shown in Figure 15.2. Checking both checkboxes in this dialog box is a good idea. You then can learn more about the search engine and immediately begin using it with Internet Explorer. Note that before you can use the search engine, you need to configure it with Internet Explorer by indexing your Web site.
Figure 15.2: The Setup Complete dialog box.
After you install the search engine, you can test the search engine installation by looking up the URL http://<your server name>/topic/admin/qstart.htm, as shown in Figure 15.3. Replace <your server name> with the Internet address of your server. Note that you might need to stop and restart IIS before you can look up this URL. It is not a very good idea to allow users browsing your Web site to configure your search engine if they stumble onto the Web page shown in Figure 15.3. Refer to Chapter 6, "Installing and Using Microsoft Internet Information Server," for more information about how you can restrict access to certain parts of a Web site with a username and a password. Note that you can restrict access to certain parts of a Web site only if the search engine is installed in an NTFS partition.
Figure 15.3: The topicSEARCH welcome page.
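The installation check described above can also be scripted. The following is a minimal sketch in Python, not part of topicSEARCH. The path /topic/admin/qstart.htm comes from the text, while the function names and the server name used in the usage note are illustrative assumptions.

```python
from urllib.error import URLError
from urllib.request import urlopen

# Path of the quick start page, as given in the text.
QUICK_START_PATH = "/topic/admin/qstart.htm"


def quick_start_url(server_name):
    """Build the quick start URL for the given server name."""
    return f"http://{server_name}{QUICK_START_PATH}"


def is_reachable(url, timeout=5.0):
    """Return True if the server answers the request with a success status.

    Any failure (server down, page missing, access denied) returns False.
    """
    try:
        with urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except (URLError, OSError):
        return False
```

For example, is_reachable(quick_start_url("www.example.com")) probes a placeholder server name; substitute your own. After you restrict access to the administration directory, an anonymous probe like this one is expected to return False, because the server refuses the request until a username and password are supplied.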
The Verity search engine is easy to configure. As you learn later in this section, you can make a Web site searchable in just a few minutes by using the Web page shown in Figure 15.4. You invoke this Web page by clicking the quick start tab of the Web page. Before you continue to build a search index, you should test the search engine to make sure that it is properly installed. To do so, click option 1, Diagnose your system.
Figure 15.4: The topicSEARCH Quick Start page.
When the Web page shown in Figure 15.5 appears, you can diagnose the topicSEARCH installation by clicking the diagnose button. The Verity search engine then diagnoses the topicSEARCH installation and displays a Web page similar to the one shown in Figure 15.6.
Figure 15.5: The topicSEARCH diagnostics Web page.
Figure 15.6: The topicSEARCH diagnostics completed Web page.
If the Verity search engine is installed properly, a Web page similar to the one shown in Figure 15.6 appears. If no errors are found, proceed to create a new search index by clicking option 2, Index your local Web site.
Before you can search a Web site, you have to index it using the Web page shown in Figure 15.7. Indexing your entire Web site is generally a good idea. Users browsing the Web site then can easily locate information they need by executing a search. After you type the URL indicating where topicSEARCH should begin indexing, click the index button. topicSEARCH then begins with the URL specified and creates a search index indexing all local Web pages linked to that URL.
Figure 15.7: Creating a search index.
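To make the idea of starting at one URL and indexing all local pages linked to it more concrete, here is a minimal sketch in Python. It is not topicSEARCH's implementation, which the text does not describe. The names PageParser, build_index, and fetch are hypothetical, and the pages are supplied from memory rather than read from a real server.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class PageParser(HTMLParser):
    """Collect link targets and words from one HTML page."""

    def __init__(self):
        super().__init__()
        self.links = []
        self.words = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)

    def handle_data(self, data):
        self.words.extend(word.lower() for word in data.split())


def build_index(start_url, fetch):
    """Index every local page reachable from start_url.

    fetch is a caller-supplied function that returns the HTML for a URL,
    or None if the page does not exist.
    """
    host = urlparse(start_url).netloc
    index = {}                      # word -> set of URLs containing it
    queue = deque([start_url])
    seen = {start_url}
    while queue:
        url = queue.popleft()
        html = fetch(url)
        if html is None:
            continue
        parser = PageParser()
        parser.feed(html)
        for word in parser.words:
            index.setdefault(word, set()).add(url)
        for href in parser.links:
            target = urljoin(url, href)
            # Follow only links that stay on the same server.
            if urlparse(target).netloc == host and target not in seen:
                seen.add(target)
                queue.append(target)
    return index


# A two-page site held in memory stands in for a real Web server.
pages = {
    "http://host/": '<a href="parks.htm">Parks</a>',
    "http://host/parks.htm": "<p>Great Falls</p>",
}
index = build_index("http://host/", pages.get)
```

The sketch splits text on white space only, so punctuation stays attached to words. A real indexer would normalize words far more carefully.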
After you click the index button on this page, topicSEARCH indexes the specified URL and then displays a Web page similar to the one shown in Figure 15.8. Depending on the number of documents at your Web site, indexing the entire Web site might take a while. You can click the terminate button to cancel indexing.
Figure 15.8: The indexing status Web page.
You can click the recent button on this Web page to check the status of indexing the URL specified earlier. After you click this button, the last few Web pages indexed by topicSEARCH are displayed, as shown in Figure 15.9. After you index the Web site, you can search it by clicking the search tab.
Figure 15.9: Viewing the status of indexing a Web site.
You can use the search tab in the Web page shown in Figure 15.10 to test the topicSEARCH engine by executing a search. Click the Power Search link to initiate a search.
Figure 15.10: Selecting a search form.
Security |
Do not create any links to http://<your.server.com>/topic/admin/search.htm from other pages at your Web site. Note that Web pages under /topic/admin/ are generally used only for administration purposes, and you should not expose this directory to users. Later in this chapter, you learn how you can use NTFS security to secure this directory. |
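One way to follow this advice is to check your own pages for links that point into the administration directory. The sketch below, in Python, is an illustration only. The prefix /topic/admin/ comes from the text, and the names LinkCollector and admin_links are hypothetical.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse

# Administration directory named in the text.
ADMIN_PREFIX = "/topic/admin/"


class LinkCollector(HTMLParser):
    """Collect the href value of every anchor tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)


def admin_links(html):
    """Return the links in html whose path falls under the admin directory."""
    collector = LinkCollector()
    collector.feed(html)
    return [href for href in collector.links
            if urlparse(href).path.lower().startswith(ADMIN_PREFIX)]
```

The check sees only absolute URLs and server-relative paths. A relative link such as admin/search.htm, placed in a page under /topic/, would need to be resolved against the page's own URL before it could be detected.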
You can use the Web page shown in Figure 15.11 to initiate a search. For the purpose of this example, search the Web site indexed earlier for the string "Great Falls". Great Falls is a national park in Virginia, and several Web pages contain this string.
Figure 15.11: Initiating a search.
The results of the search initiated in Figure 15.11 are displayed in Figure 15.12. As you can see, topicSEARCH has successfully indexed the Web site and matched the string "Great Falls" with several personal Web pages. When search results are displayed, topicSEARCH assigns a score to each document based on the number of matches and other criteria. Refer to topicSEARCH documentation for additional information about customizing searches.
Figure 15.12: The search results.
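The text does not document how topicSEARCH computes its scores, so the following Python sketch is only a toy illustration of the general idea of ranking documents by the number of matches. The functions score and rank are hypothetical and do not reproduce Verity's method.

```python
def score(document_text, query):
    """Return a toy relevance score: total occurrences of the query terms."""
    words = document_text.lower().split()
    return sum(words.count(term) for term in query.lower().split())


def rank(documents, query):
    """Return (url, score) pairs for matching documents, best match first."""
    scored = [(url, score(text, query)) for url, text in documents.items()]
    matches = [pair for pair in scored if pair[1] > 0]
    return sorted(matches, key=lambda pair: pair[1], reverse=True)
```

Documents that contain none of the query terms are dropped from the results, and the remaining documents are listed with the highest score first.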
As I mentioned previously, you should use URLs containing the directory /topic/admin/ only for administration purposes. Users browsing a Web site can use the URL http://<your.server.com>/topic/docs/search3.htm to search for various keywords. Be sure to create several links to this Web page from various Web pages at your site.
It is a good idea to create a standard button bar and include it in all your Web pages. You can map one of these buttons to the URL just mentioned. The other buttons can be used to provide feedback, return to the main Web page, and so on.
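A button bar of this kind can be generated once and pasted into every page. The Python sketch below is one hedged way to do it. Only the search page path /topic/docs/search3.htm comes from the text. The other targets (/index.htm and /feedback.htm) and the function name button_bar are hypothetical.

```python
from html import escape

# Only the search page path comes from the text; the other two targets
# are hypothetical placeholders for your own pages.
BUTTONS = [
    ("Home", "/index.htm"),
    ("Search", "/topic/docs/search3.htm"),
    ("Feedback", "/feedback.htm"),
]


def button_bar(buttons=BUTTONS):
    """Return an HTML fragment with one text link per button."""
    links = [
        f'<a href="{escape(url, quote=True)}">{escape(label)}</a>'
        for label, url in buttons
    ]
    return "<p>" + " | ".join(links) + "</p>"
```

Text links are used here for brevity. Replacing each label with an image tag gives the graphical buttons the paragraph describes.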
Limiting access to the \Verity\topic\admin directory is crucial so that unauthorized users cannot access Web pages in this directory to configure the search engine. Users with malicious intent can potentially abuse various search engine configuration settings and bring about undesired results.
Because IIS uses NTFS file permissions, you can use the File Manager to restrict access to the Verity\topic\admin directory. To do so, first invoke the File Manager and select the Verity\topic\admin directory. Then choose Security|Permissions from the main menu to invoke the Directory Permissions dialog box, as shown in Figure 15.13. In this dialog box, you can restrict the access granted to the Internet guest account that is used by IIS. Assign "No Access" to the Internet guest account and "Full Control" to the Administrators group. Next, check the two checkboxes to replace permissions of subdirectories and existing files; then click the OK button.
Figure 15.13: File Manager's Directory Permissions dialog box.
The Verity topicSEARCH engine administration menu is now accessible only to members of the Administrators group. At this time, if you restart your Web browser and try to connect to http://<your.server.com>/topic/admin/search.htm, you are asked for a username and a password. A username and password of a user who is part of the Administrators group are now required to access files in the Verity\topic\admin directory. To make sure that users of the Administrators group can log on, you should check the Basic (Clear Text) checkbox of the IIS properties dialog box.
You also can check the Windows NT Challenge/Response checkbox to enable users of the Administrators group to log on and configure the search engine. The Windows NT Challenge/Response authentication mechanism automatically encrypts usernames and passwords. Only Internet Explorer version 2.0 and later, however, supports this password authentication scheme. Refer to Chapter 6 for more information about securing a Web site hosted with IIS.
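The "Clear Text" label on the Basic option is literal: with HTTP Basic authentication, the browser sends the username and password encoded in Base64, which anyone who captures the request can reverse. The Python sketch below illustrates this. The function names and the sample credentials are made up for the example.

```python
import base64


def basic_auth_header(username, password):
    """Build the Authorization header value a browser sends for Basic auth."""
    credentials = f"{username}:{password}".encode("utf-8")
    return "Basic " + base64.b64encode(credentials).decode("ascii")


def decode_basic_auth(header_value):
    """Recover the username and password from a Basic Authorization value."""
    encoded = header_value.split(" ", 1)[1]
    decoded = base64.b64decode(encoded).decode("utf-8")
    username, password = decoded.split(":", 1)
    return username, password


# Sample credentials, for illustration only.
header = basic_auth_header("admin", "secret")
print(header)                     # prints: Basic YWRtaW46c2VjcmV0
print(decode_basic_auth(header))  # prints: ('admin', 'secret')
```

Because decoding needs no key, Basic authentication protects the administration pages from casual visitors but not from someone who can monitor network traffic.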
In a matter of minutes, you can make a Web site hosted with the Netscape Enterprise Server searchable. Netscape Enterprise Server ships with the Verity topicSEARCH engine, and you can configure it using the Web page shown in Figure 15.14. Refer to Chapter 7, "Publishing on the Web with WebSite, Purveyor, and Netscape," for more information about installing the Netscape Enterprise Server and accessing the configuration Web page shown in the figure.
Figure 15.14: The Netscape search engine configuration Web page.
Using the Create a Collection Web page shown in Figure 15.14, you can create a collection by specifying a collection name, description, and directory. A collection is a set of documents that can be searched as a unit. A Web site, for example, might have a directory for technical support and another directory for sales reports. A user executing a search for technical support is probably not interested in documents in the sales reports directory. You can use collections to refine searches by indexing similar documents together. You can create, for example, one collection (or search index) for the sales reports directory and another for the technical support directory. After you type in the information requested, click the On button to index the Web site. Click the Help button at the end of the page if additional help is needed.
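The effect of collections can be pictured with a small Python sketch. It is an illustration of the concept only and says nothing about how the Netscape server stores its indexes. The directory names, document texts, and function names are all hypothetical.

```python
def build_collections(documents, collection_dirs):
    """Map each collection name to the documents stored under its directory."""
    return {
        name: {path: text for path, text in documents.items()
               if path.startswith(directory)}
        for name, directory in collection_dirs.items()
    }


def search(collections, name, keyword):
    """Return paths in the named collection whose text contains keyword."""
    keyword = keyword.lower()
    return sorted(path for path, text in collections[name].items()
                  if keyword in text.lower())


# Hypothetical documents in two directories of one Web site.
documents = {
    "/support/install.htm": "Installation problems and printer setup",
    "/sales/q1.htm": "Printer sales for the first quarter",
}
collections = build_collections(
    documents, {"support": "/support/", "sales": "/sales/"}
)
print(search(collections, "support", "printer"))  # prints: ['/support/install.htm']
```

Both documents mention printers, but a search limited to the support collection returns only the technical support page.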
After you create a search index, the Netscape server displays a Web page similar to the one shown in Figure 15.15. As described in this Web page, the directory that was indexed is now searchable. All that you need to do is copy the HTML code in this page to another Web page and enable users to use the form in the HTML code to search the directory that was indexed. Users also can search the Web site by looking up the URL http://<your.server.com>/search/iaquery.exe, as shown in Figure 15.16.
Figure 15.15: Results of successful search index creation.
Figure 15.16: Initiating a search.
Using the Web page shown in Figure 15.16, you can initiate a search by typing in several keywords and selecting a subject. Subjects correspond to search indexes created using the Web page shown in Figure 15.14. After you type in the search criteria, click on the Search button to initiate the search.
After Netscape's search engine performs the search, Web pages matching the search criteria are displayed in a Web page similar to the one shown in Figure 15.17. This Web page lists titles of Web pages matched along with their URLs. Depending on the contents of the Web page, a score also is given to each Web page to make it easier to locate the most relevant Web pages. You can use the input controls in the search results page either to refine the search by typing in additional keywords or to perform a new search altogether.
Figure 15.17: Search results Web page.
A search engine also is included with WebSite. Refer to Chapter 7 for more information about installing WebSite and accessing the WebIndex dialog box shown in Figure 15.18. To invoke the WebIndex dialog box, double-click on the WebIndex icon in the WebSite applications folder. In this dialog box, you can select Web pages to index.
Figure 15.18: Selecting Web pages to index.
On the Merge Indexes tab of the WebIndex dialog box, you can merge search indexes to create larger search indexes. As shown in the dialog box in Figure 15.19, you can merge several search indexes by selecting them and giving a name to the new search index by typing it in the Merged Index Name text box.
Figure 15.19: Merging search indexes.
After you select Web pages on the Create Index tab, you can configure search index settings on the Preferences tab of the WebIndex dialog box, as shown in Figure 15.20. Then click the OK button to create a search index. After you create a search index, users browsing the Web site can use the Web page shown in Figure 15.21 to search and locate Web pages in which they are interested.
Figure 15.20: Configuring search index settings.
Figure 15.21: Initiating a search.
Using the Web page shown in Figure 15.21, you can search a Web site indexed with WebIndex. You can use this Web page to select an index and initiate a search for several keywords. After you generate a search index, create links to the URL http://<your.server.com>/cgi-bin/WebFind.exe from other Web pages to enable users browsing the Web site to search for various keywords. Because index names are exposed to users, giving descriptive names to search indexes is a good idea. This way, users can easily locate Web pages they are interested in without being confused about various index names.
After it searches for the search criteria specified in Figure 15.21, WebIndex displays Web pages that match the search criteria, as shown in Figure 15.22.
Figure 15.22: Search results Web page.
Making your Web site searchable is very important so that users browsing it can search and locate information they need without unnecessarily browsing hierarchies of Web pages. As you learned in this chapter, you can make a Web site searchable in a matter of minutes. As demonstrated, search engines are available for all four Web servers covered in this book: Internet Information Server, WebSite, Purveyor, and Netscape Enterprise Server.
Although in this and preceding chapters you learned how you can set up a Web server to publish information, none of these chapters discussed how to make it provide dynamic content based on various needs of users browsing a Web site. The next chapter introduces you to Windows NT CGI programming and demonstrates how you can make your Web site interactive with the use of CGI programs.
One of the best things about the World Wide Web is how it can be used to distribute information to millions of people. CGI enables you to interact with this large audience of people. The next chapter covers all the fundamentals of CGI programming. First, you learn various aspects of adding CGI applications to a Web site and then how you can write CGI applications in Perl (Practical Extraction and Report Language) and C/C++.
After you read the next chapter, be sure to read Chapter 17, "Advanced Windows NT CGI Applications," to learn how you can use various CGI programs to create innovative server-side applications to provide dynamic content to users browsing a Web site. The next few chapters demonstrate how you can exploit various capabilities of CGI to create information-rich, active Web sites that are interesting and easy to navigate.