Google Search Appliance Protocol Reference User Manual

Google Search Appliance: Search Protocol Reference
Dynamic Result Clustering Service /cluster Protocol
The top-level entries are described in the following table.
The dynamic result clustering service’s default JavaScript client ignores the <document> element and
does not use the <doc> array. The XML response is very basic and is not validated against a DTD or
an XML schema.
The following DTD describes the structure of the XML; the XML output, however, is not validated against these rules:
<?xml version="1.0"?>
<!ELEMENT toplevel (Response, t_fetch, document+)>
<!ELEMENT Response (algorithm, t_cluster, cluster)>
<!ELEMENT cluster (gcluster+)>
<!-- each gcluster element is an alternate query and its location indexes from the
     top results -->
<!ELEMENT gcluster (label, doc+)>
<!-- each document element is a search result, complete with url, title, and snippet -->
<!ELEMENT document (url, title, snippet)>
<!ELEMENT algorithm EMPTY>
<!ELEMENT t_cluster EMPTY>
<!ELEMENT t_fetch EMPTY>
<!ELEMENT label EMPTY>
<!ELEMENT doc EMPTY>
<!ELEMENT url EMPTY>
<!ELEMENT title EMPTY>
<!ELEMENT snippet EMPTY>
<!ATTLIST algorithm
    data (Concepts) #REQUIRED>
<!ATTLIST t_cluster
    int CDATA #REQUIRED>
<!ATTLIST label
    data CDATA #REQUIRED>
<!ATTLIST doc
    int CDATA #REQUIRED>
<!ATTLIST url
    data CDATA #REQUIRED>
<!ATTLIST title
    data CDATA #REQUIRED>
<!ATTLIST snippet
    data CDATA #REQUIRED>
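Because the appliance does not validate its output against the DTD, a client that depends on the structure may want to check it before use. The following is a minimal sketch in Python using only the standard library, which has no DTD validator, so the rules are checked by hand. The SAMPLE response is hand-written from the DTD rather than captured from an appliance, and the name check_structure is hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical response, hand-written to follow the DTD; not real appliance output.
SAMPLE = """
<toplevel>
  <Response>
    <algorithm data="Concepts"/>
    <t_cluster int="12"/>
    <cluster>
      <gcluster>
        <label data="search appliance admin"/>
        <doc int="0"/>
      </gcluster>
    </cluster>
  </Response>
  <t_fetch/>
  <document>
    <url data="http://example.com/admin"/>
    <title data="Admin guide"/>
    <snippet data="How to administer the appliance."/>
  </document>
</toplevel>
"""


def check_structure(xml_text):
    """Return a list of problems; an empty list means every check passed."""
    problems = []
    root = ET.fromstring(xml_text)
    if root.tag != "toplevel":
        problems.append(f"root is <{root.tag}>, expected <toplevel>")

    response = root.find("Response")
    if response is None:
        problems.append("missing <Response>")
        return problems

    # Only one clustering algorithm is supported, so data must be "Concepts".
    algorithm = response.find("algorithm")
    if algorithm is None or algorithm.get("data") != "Concepts":
        problems.append("<algorithm data> is not 'Concepts'")

    # <cluster> must hold one or more <gcluster>, each with a label and docs.
    gclusters = response.findall("cluster/gcluster")
    if not gclusters:
        problems.append("no <gcluster> elements")
    for i, gcluster in enumerate(gclusters):
        label = gcluster.find("label")
        if label is None or label.get("data") is None:
            problems.append(f"gcluster {i}: missing <label data>")
        docs = gcluster.findall("doc")
        if not docs:
            problems.append(f"gcluster {i}: no <doc> elements")
        for doc in docs:
            if not (doc.get("int") or "").isdigit():
                problems.append(f"gcluster {i}: <doc int> is not a number")

    # Each <document> needs url, title, and snippet, each with a data attribute.
    documents = root.findall("document")
    if not documents:
        problems.append("no <document> elements")
    for i, document in enumerate(documents):
        for name in ("url", "title", "snippet"):
            child = document.find(name)
            if child is None or child.get("data") is None:
                problems.append(f"document {i}: missing <{name} data>")
    return problems
```

Calling check_structure(SAMPLE) returns an empty list for the sample above. The sketch does not check element order, which the DTD also constrains.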
Entry: <cluster>
Description: The output from different clustering algorithms. There is only one supported cluster
algorithm, so the value of <algorithm> must be Concepts.
The <cluster> category consists of:
• A series of <algorithm> and subordinate <gcluster> pairs.
• The subordinate <gcluster> is a series of <label> statements and the array of
  <doc> elements that have that label.
• The label is a query suggestion. The <doc> statements are indexes into the
  <document> section that follows.
Each <label> provides an alternative query, and each <doc> array provides the
document location indices.

Entry: <document>
Description: A sequence of the URL, title, and snippet for each of up to 100 top search results
from a search query. The search appliance creates the <doc> arrays from the
<document> list.
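The relationship between <label>, <doc>, and <document> can be sketched in Python as follows. This is an illustration, not the appliance's client code: the SAMPLE response is hand-written, the name suggestions is hypothetical, and the sketch assumes that each <doc int> value is a zero-based position in the <document> list. The source does not state the index base, so it is exposed as the index_base parameter.

```python
import xml.etree.ElementTree as ET

# Hypothetical response, hand-written for illustration; not real appliance output.
SAMPLE = """
<toplevel>
  <Response>
    <algorithm data="Concepts"/>
    <t_cluster int="12"/>
    <cluster>
      <gcluster>
        <label data="search appliance admin"/>
        <doc int="0"/>
        <doc int="2"/>
      </gcluster>
      <gcluster>
        <label data="search appliance api"/>
        <doc int="1"/>
      </gcluster>
    </cluster>
  </Response>
  <t_fetch/>
  <document>
    <url data="http://example.com/admin"/>
    <title data="Admin guide"/>
    <snippet data="How to administer the appliance."/>
  </document>
  <document>
    <url data="http://example.com/api"/>
    <title data="API reference"/>
    <snippet data="Request and response formats."/>
  </document>
  <document>
    <url data="http://example.com/console"/>
    <title data="Admin console help"/>
    <snippet data="Using the admin console."/>
  </document>
</toplevel>
"""


def suggestions(xml_text, index_base=0):
    """Map each query suggestion (label) to the documents it points at.

    index_base is an assumption: 0 treats <doc int> as zero-based.
    """
    root = ET.fromstring(xml_text)

    # Collect the <document> entries in the order they appear.
    documents = [
        {
            "url": d.find("url").get("data"),
            "title": d.find("title").get("data"),
            "snippet": d.find("snippet").get("data"),
        }
        for d in root.findall("document")
    ]

    result = {}
    for gcluster in root.findall("Response/cluster/gcluster"):
        label = gcluster.find("label").get("data")
        hits = []
        for doc in gcluster.findall("doc"):
            position = int(doc.get("int")) - index_base
            # Skip indexes that fall outside the <document> list.
            if 0 <= position < len(documents):
                hits.append(documents[position])
        result[label] = hits
    return result
```

With the sample above, suggestions(SAMPLE) maps "search appliance admin" to the first and third documents and "search appliance api" to the second. If an appliance turns out to number documents from 1, passing index_base=1 adjusts the lookup.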