Registering Search Interface to SAS® Content as Google OneBox Module

Search Interface to SAS Content supports two kinds of search results:

* Reports search - supports searching of SAS BI Dashboard 4.3 (and later), SAS Web Report Studio reports, Stored Process reports, and images.
* Data search - supports searching of Information Maps.

To search both types of content together, Search Interface to SAS Content must be registered as two separate OneBox modules. The sections below describe how to use the Google Search Appliance (GSA) Administrative User Interface to perform the registration.

Define the OneBox Module to the GSA

Go to the OneBox Module definition screen in the Serving section of the interface. At the bottom of this page is an entry form for creating a new OneBox module definition. Enter any name.

Suggestion: Select a name that is generic enough to cover the triggers and items to be searched, but unique enough that you can easily identify this module later when selecting it from a list.

After you create the OneBox module definition, a screen appears where you enter the details about the module. Fill in the following fields.

Trigger

The only supported option at this point is Regular expression. In the Regular expression field, enter a regular expression. When the input search matches the regular expression, this module is called. The regular expression must contain a parenthesized field, which is passed as the search term to the module.

Note: Regular expression processing can be expensive, so try to keep the regular expression simple.

Examples:

* (.*) report - matches any search phrase that has the word “report” in it, and passes as the search string the phrase before that word. “sales report” searches for reports that have the word “sales” in them. “new sales report” searches for reports that have the phrase “new sales” in them.
* ^report (.*) - matches any search phrase where the first word in the search is “report”, and passes as the search string the phrase after that word. “report sales” searches for reports that have the word “sales” in them. “report new sales” searches for reports that have the phrase “new sales” in them. Note that this definition is equivalent to using the keyword trigger option; however, the keyword trigger option does not pass the appropriate phrase to the module, and thus cannot be used as a replacement for this syntax.
* ^(.*sales.*) - matches any search phrase that has the word “sales” in it, and passes the entire entered search string to the module. Thus, if the user enters “new sales in US”, that entire phrase is passed, and only results that match the entire phrase are returned.
* ^promotions? (.*) - matches any search phrase where the first word in the search is either “promotion” or “promotions”, and passes as the search string the phrase after that word.

Provider

In the Provider section, select External Provider and enter a URL that points to the location where you deployed the sas.searchsas.ear file. For example, if you have deployed the EAR file to a Web server on yyy.mycompany.com that is listening on port 8080, you would enter:

* for reports: http://yyy.mycompany.com:8080/SASSearchService/Controller?forward=Search&searchType=reports
* for information maps: http://yyy.mycompany.com:8080/SASSearchService/Controller?forward=Search&searchType=data

Use the HTTP protocol only if you are using None for the Authentication value. If you are using Basic Authentication, HTTPS is preferable (HTTP would also work, but the credentials would not be encrypted in transit), such as:

https://yyy.mycompany.com:8443/SASSearchService/Controller?forward=Search&searchType=data

Authentication

The GSA supports multiple types of authentication for OneBox providers.
At this time, only None and Basic are supported for the Search Interface to SAS OneBox provider.

* None - No authentication is required to use this module. Thus, when a user does a search with only “Public Content” selected, this module is called. If you choose this option, Search Interface to SAS Content provides search results for the SAS Web Anonymous user. To return results for a different user instead, append the user name and password as parameters, as follows:

  http://yyy.mycompany.com:8080/SASSearchService/Controller?forward=Search&searchType=reports&userName=sasguest@saspw&password=xxxxxxx

  Note: If you provide the user name and password in the URL, all users can see all of the results that are accessible to this user.

* Basic - The GSA prompts the user for a user ID and password and passes these values to the provider. For this module to be invoked, the user must have selected “Public and Secure content” for the search. This authentication scheme allows the results to be limited to just what that individual user is authorized to access.

When you are done with the ‘Security settings’, click Save OneBox Definition. You return to the main page, where your newly added module appears in the list. To edit the ‘OneBox Stylesheet Template’, click the ‘Edit’ link for the module in the list. At the bottom of the configuration page is an ‘Edit XSL’ link.

Note: When using Basic Authentication, Search Interface to SAS Content should use the HTTPS protocol to protect the user ID and password that are passed to the provider.

OneBox Stylesheet Template

When the results are returned from the module to the GSA, the GSA can apply an XSL style sheet to the results to format them for display. You should provide a style sheet to make the user experience more robust than the one provided by the GSA’s default XSL.
To edit the XSL, use the Edit XSL link at the bottom of this area of the administrative page. The following is a sample XSL:
[Sample XSL not reproduced: the style sheet markup is missing from this copy. Only the text fragments “Powered by SAS” and “_parent” remain.]
Note: There are limitations imposed by the GSA on the XSL used here. For more information, see the Google Enterprise Web site, www.google.com/Enterprise.

Additional Authentication Steps

If you selected Basic for the Authentication option, you need to perform an additional step to set up the OneBox module to use Basic Authentication. In the Google Admin UI->Crawl and Index->Crawler Access, make sure that the URL pattern of the OneBox provider is listed, with a user ID and with the "public" check box NOT selected. For example, if the web application is deployed to http://yyy.mycompany.com:8080/SASSearchService, then this same string must be used in the URL pattern line.

Add the OneBox Module Definition to a Front End

For the OneBox module to be called when a user does a query, the OneBox module must be attached to a Front End (for more information about what a Front End is, see www.google.com/Enterprise). In the Google Admin UI->Serving->Front Ends, edit the Front End to which you want to add this OneBox module: go to the OneBox Modules tab and select your OneBox module from the list.

Feeding SAS Content to the Index of Google Search Appliance using Search Interface to SAS Content

Search Interface to SAS Content supports feeding SAS content to the index of Google Search Appliance. To feed the content, follow the steps below.

Step 1 - Revise the Configuration Information in url_list.txt

To load SAS content to the index of Google Search Appliance, the index loading application requires that the Search Interface to SAS Content URL be configured in the url_list.txt file, which is located in the Search Interface to SAS Content installation home directory. When configuring the URL, you can either specify user credentials in the URL or configure the application to load only public reports to the index of Google Search Appliance.
If you specify user credentials in the URL, the index loading application loads all content that is accessible to that user to the index. If you configure the application to load the public reports, the index loading application loads all content that is accessible to the SAS Web Anonymous user to the index.

Note: If you provide the user name and password in the URL, all users can see all of the results that are accessible to this user.

To configure the URL for public reports or for a specific user, follow the respective section below.

For public user: Uncomment the URL in url_list.txt that has the searchClient=gsafeeder and authType=none parameters appended to it. Ensure that all other URLs are commented out if you need to provide the feed only to Google Search Appliance with the public user.

For a specific user: Uncomment the URL in url_list.txt that has the searchClient=gsafeeder and userName=&password= parameters appended to it. Ensure that all other URLs are commented out if you need to provide the feed only to Google Search Appliance for a specific user. Set the userName parameter to the metadata user name. Set the password parameter to the password for the specified user.

Note: It is recommended that the password be given in encoded format, produced with the PWENCODE procedure. For example, if the password is Welcome123, execute the following command in a SAS session to encode the password:

proc pwencode IN = 'Welcome123'; run;

Use the output produced by this procedure for the password.

After you have uncommented the appropriate URL for a public user or a specific user, modify the host name and port in the URL to match the location where Search Interface to SAS Content is deployed.
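The URL entries described above can also be assembled programmatically, which helps avoid typing mistakes in long query strings. The following Python sketch is illustrative only: the function name build_feed_url is hypothetical, the parameter order is an assumption, and leaving the braces of an encoded password unescaped is an assumption based on the form the encoded password takes in this document. The commented-out entries shipped in url_list.txt remain the authoritative templates.

```python
from urllib.parse import urlencode


def build_feed_url(host, port, user=None, password=None):
    """Return a url_list.txt entry for the index loading application.

    Hypothetical helper: only the parameter names come from this document.
    """
    params = {"forward": "Search"}
    if user is not None:
        # Feed everything that this metadata user is allowed to see.
        params["userName"] = user
        params["password"] = password
    params["searchClient"] = "gsafeeder"
    if user is None:
        # Public feed: content visible to the SAS Web Anonymous user.
        params["authType"] = "none"
    # Assumption: '{', '}' and '@' are left literal rather than percent-encoded.
    query = urlencode(params, safe="{}@")
    return f"http://{host}:{port}/SASSearchService/Controller?{query}"


if __name__ == "__main__":
    # Public feed entry for a deployment on searchsas.mycompany.com:8080
    print(build_feed_url("searchsas.mycompany.com", 8080))
```

Passing a user name and an encoded password produces the specific-user form of the entry; omitting both produces the public form.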
For example, suppose that you configure the URL for a specific user with user name “myuser” and password “Welcome123”, you decide to encode the password, the host name for Search Interface to SAS Content is searchsas.mycompany.com, and the port is 8080. The configured URL is then as follows:

http://searchsas.mycompany.com:8080/SASSearchService/Controller?forward=Search&userName=myuser&password={sas002}835DA53542E057BA07B02C302BDC76360963D41A&searchClient=gsafeeder

Step 2 - Run the loadindex Script with Required Parameters

After the URL has been modified in url_list.txt, you can run the loadindex script, which is located in the installation home directory of Search Interface to SAS Content. The extension of this script file varies based on the platform on which Search Interface to SAS Content is installed: loadindex.exe for Windows, loadindex for UNIX-based systems, and loadindex.rexx for z/OS.

The loadindex script accepts the following command line arguments:

-filename : the name of the configuration file containing the URL of Search Interface to SAS Content. This parameter needs to be specified only if you have changed the configuration file name to a name other than url_list.txt.

-gsaurl : the URL for the Google Search Appliance. The following is a sample URL, in which the host name of your Google Search Appliance goes between http:// and :19900:

http://:19900/xmlfeed

Note: Refer to the Google Search Appliance Feeds Protocol Developer's Guide for the version-specific URL for your Google Search Appliance.
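When loadindex is launched from another program, for example a scheduled job, the two arguments described above can be passed as a list rather than as a single command string. The following Python sketch is a minimal illustration under stated assumptions: the function names are hypothetical, the argument order is assumed not to matter, and the executable name must be adjusted per platform (loadindex.exe, loadindex, or loadindex.rexx). Only the -gsaurl and -filename arguments come from this document.

```python
import subprocess


def build_loadindex_command(gsa_url, filename=None, executable="loadindex"):
    """Return the argument list for one loadindex run (hypothetical helper)."""
    command = [executable, "-gsaurl", gsa_url]
    if filename is not None:
        # Needed only when the configuration file is not named url_list.txt.
        command += ["-filename", filename]
    return command


def run_loadindex(gsa_url, filename=None, executable="loadindex"):
    """Run loadindex and raise CalledProcessError on a non-zero exit code."""
    command = build_loadindex_command(gsa_url, filename, executable)
    # Each list element is passed as one argument, so a file path that
    # contains spaces needs no additional quoting here.
    return subprocess.run(command, check=True)


if __name__ == "__main__":
    print(build_loadindex_command("http://gsa.mycompany.com:19900/xmlfeed"))
```

Building the argument list separately from running it makes the command easy to log or inspect before the index load starts.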
If the Google Search Appliance URL is http://gsa.mycompany.com:19900/xmlfeed and the configuration file is url_list.txt (the default name), then run the following command at the command line:

loadindex -gsaurl http://gsa.mycompany.com:19900/xmlfeed

Note: You can schedule a job to load the content to the Google Search Appliance periodically by using Windows Scheduler or crontab on UNIX-based systems.

To get help about the command line options for loadindex, run the following command:

Windows: loadindex.exe -help
UNIX-based systems: loadindex -help
z/OS: loadindex.rexx -help

When the index has been loaded with SAS content, you can perform a search in the Google Search Appliance Search UI. Make sure that you select a search string that matches some SAS content.

Note: If the path to the url_list.txt file contains a blank space, enclose the path in double quotation marks. For example, if the path to the file is C:\my files\url_list.txt, then use the following command:

loadindex -filename "C:\my files\url_list.txt"

Advanced Configuration Option

* While running the loadindex application, if there is a large amount of content, you might need to increase the Java heap size to avoid an OutOfMemoryError. To increase the Java heap size, open the loadindex.ini file located in the Search Interface to SAS Content installation home directory and add the following lines:

  JavaArgs_7=-Xmx512m
  JavaArgs_8=-Xms512m

  Note: To provide more Java heap, change the heap size value from 512 to a more appropriate value.

* If Search Interface to SAS Content is deployed in a web application server configured with HTTPS, then the SSL certificate must be imported into the JRE on which the loadindex application runs.

SAS and all other SAS Institute product or service names are registered trademarks or trademarks of SAS Institute Inc. in the USA and other countries. Other brand and product names are registered trademarks or trademarks of their respective companies. ® indicates USA registration.
Copyright (c) 2010 SAS Institute Inc., Cary, NC, USA. All rights reserved. 1 July 2010