JSP XML Fuzzy Search Tool
This installation manual was designed to provide software developers with an easy guide to installing the i-fax.com tools on document processing systems. If you have any questions that are not addressed by this manual or if you have suggestions on how i-fax.com Inc. could serve you better, we welcome your input at service@i-fax.com.
copyright © 2013 i-fax.com Inc. All rights reserved.
JSP XML Fuzzy Search Tool
This tool provides an example JSP page that uses the i-fax.com XML Fuzzy Search stylesheet to perform full-text fuzzy searches on text nodes in XML documents. It uses an n-gram algorithm to "score" records in the document and outputs a list of the records that score highest. The output of the template is a sorted XML document in the following form:

<result_set>
  <result>
    <ID>12356</ID>
    <score>0.14</score>
  </result>
  <result>
    <ID>101</ID>
    <score>0.091</score>
  </result>
</result_set>

The ID node is simply the content of whichever node you want to identify the results with, such as a record ID. The score itself doesn't say very much on its own: it is the count of n-grams found in the text divided by the length of the text, and it simply forms the basis for identifying the most likely candidates for the search.

The JSP page wraps the XSL stylesheet with a front end that allows it to be used like a regular web search engine.

Installing the JSP XML Fuzzy Search Tool
Using the JSP XML Fuzzy Search Tool
The example application works like this. First, the JSP prepares the data source. In this case it is an XML file, but the XML can come from any source you like. Next, it gets the search criteria, passes them to the search utility's stylesheet, and uses that stylesheet to transform the XML data. The output of the stylesheet is an XML dataset: a list of record IDs sorted by their "score", which is a measure of how closely the text in each record matches the search criteria.

Once you have this sorted list of IDs, you can do whatever your application requires. This example builds a table of results with snippets of the search text and links to images of the pages. To do so, it uses the formatting stylesheet to combine the original XML dataset with the score results to create an HTML table.

For more information on implementing the fuzzy search utility, please see the comments in the enclosed files.
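The flow described above (prepare the XML, pass the search criteria to a stylesheet, run the transform) can be sketched with the standard javax.xml.transform API. This is only an illustration under stated assumptions: the stylesheet below is a stand-in that does a plain substring match, not the i-fax.com fuzzy search stylesheet, and the parameter name "criteria" and the record/id/body element names are hypothetical. See the comments in the enclosed files for the real names.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class FuzzySearchSketch {

    // Stand-in stylesheet (an assumption, NOT the i-fax.com stylesheet).
    // It only shows how a search parameter reaches the transform; it selects
    // records whose body contains the criteria as a plain substring.
    static final String STAND_IN_XSL =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='xml' omit-xml-declaration='yes'/>"
      + "<xsl:param name='criteria'/>"
      + "<xsl:template match='/'>"
      + "<result_set>"
      + "<xsl:for-each select='//record[contains(body, $criteria)]'>"
      + "<result><ID><xsl:value-of select='id'/></ID></result>"
      + "</xsl:for-each>"
      + "</result_set>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Applies the stylesheet to the XML data, passing the search criteria
    // as a stylesheet parameter, and returns the resulting XML as a string.
    static String search(String xml, String xsl, String criteria) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(xsl)));
            transformer.setParameter("criteria", criteria); // hypothetical parameter name
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)),
                                  new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("Transform failed", e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical data source; in the example application this is an XML file.
        String xml =
            "<records>"
          + "<record><id>101</id><body>invoice for paper</body></record>"
          + "<record><id>102</id><body>shipping notice</body></record>"
          + "</records>";
        System.out.println(search(xml, STAND_IN_XSL, "invoice"));
    }
}
```

In a JSP page, the same calls would typically sit in a scriptlet or a helper class, with the criteria read from the request and the result set handed to a second, formatting transform.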
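The score this manual describes, the count of n-grams found in the text divided by the length of the text, can be illustrated in a few lines of Java. This is a minimal sketch, not the stylesheet's actual implementation: the n-gram size of 3, the lower-casing, and the choice to count each distinct query n-gram once are all assumptions made for the example.

```java
import java.util.LinkedHashSet;
import java.util.Set;

class NGramScoreSketch {

    // Breaks a string into its distinct overlapping n-grams, lower-cased.
    static Set<String> ngrams(String s, int n) {
        Set<String> grams = new LinkedHashSet<>();
        String lower = s.toLowerCase();
        for (int i = 0; i + n <= lower.length(); i++) {
            grams.add(lower.substring(i, i + n));
        }
        return grams;
    }

    // Number of distinct query n-grams found in the text, divided by the
    // length of the text. Returns 0 for empty text to avoid dividing by zero.
    static double score(String query, String text, int n) {
        if (text.isEmpty()) {
            return 0.0;
        }
        String lower = text.toLowerCase();
        int found = 0;
        for (String gram : ngrams(query, n)) {
            if (lower.contains(gram)) {
                found++;
            }
        }
        return (double) found / text.length();
    }

    public static void main(String[] args) {
        // A misspelled query still shares trigrams ("inv", "nvo", "voi")
        // with the first record, so it scores above the second record.
        System.out.println(score("invoise", "invoice for paper", 3));
        System.out.println(score("invoise", "shipping notice", 3));
    }
}
```

Because the score is divided by the text length, it is only meaningful for ranking records against each other for the same query, which matches the manual's remark that the number says little on its own.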