Feeds protocol developer’s guide, Overview – Google Search Appliance Feeds Protocol Developers Guide User Manual



Feeds Protocol Developer’s Guide

This document is for developers who use the Google Search Appliance Feeds Protocol to develop
custom feed clients that push content and metadata to the search appliance for processing, indexing,
and serving as search results.

To push content to the search appliance, you need a feed and a feed client:

The feed is an XML document that tells the search appliance about the content that you want to
push.

The feed client is the application or web page that pushes the feed to a feeder process on the search
appliance.
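As a rough illustration of what a feed client does, the following minimal Python sketch builds, but does not send, an HTTP POST that carries a feed to the appliance. The port (19900), the `/xmlfeed` path, the form field names (`datasource`, `feedtype`, `data`), and the host name `gsa.example.com` are assumptions that this section does not state; check them against the rest of the guide before relying on them. The helper name `build_feed_request` is hypothetical.

```python
import uuid
import urllib.request


def build_feed_request(appliance_host, datasource, feedtype, feed_xml, port=19900):
    """Build (but do not send) a multipart/form-data POST that carries a feed.

    Assumptions, not confirmed by this section: the feeder process listens on
    port 19900 at /xmlfeed and expects the form fields used below.
    """
    boundary = uuid.uuid4().hex
    fields = {"datasource": datasource, "feedtype": feedtype, "data": feed_xml}

    # Each form field becomes one part of the multipart body.
    parts = []
    for name, value in fields.items():
        parts.append(
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
            f"{value}\r\n"
        )
    parts.append(f"--{boundary}--\r\n")
    body = "".join(parts).encode("utf-8")

    return urllib.request.Request(
        url=f"http://{appliance_host}:{port}/xmlfeed",
        data=body,
        headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
        method="POST",
    )


# Hypothetical host and a placeholder feed document.
request = build_feed_request("gsa.example.com", "web", "metadata-and-url", "<gsafeed/>")
```

Passing `request` to `urllib.request.urlopen` would perform the actual push; the sketch stops short of that so it can be run without an appliance.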

This document explains how feeds work and shows you how to write a basic feed client.

Overview

You can use feeds to push data into the index on the search appliance. There are two types of feeds:

A web feed provides the search appliance with a list of URLs. A web feed:

Must be named “web”, or have its feed type set to “metadata-and-url”.

May include metadata, if the feed type is set to “metadata-and-url”.

Does not provide content. Instead, the crawler queues the URLs and fetches the content of
each document listed in the feed.

Is incremental.

Is recrawled periodically, based on the crawl settings for your search appliance.
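The properties of a web feed listed above can be sketched as a small Python function that generates the feed XML. The element and attribute names (`gsafeed`, `header`, `datasource`, `feedtype`, `group`, `record`, `metadata`, `meta`) and the example URLs are assumptions here, since this section does not show the feed schema; the function name `build_web_feed` is hypothetical.

```python
import xml.etree.ElementTree as ET


def build_web_feed(urls_with_metadata):
    """Return web feed XML: URLs plus optional metadata, and no document content.

    Element names are assumed for illustration; they are not given in this section.
    """
    root = ET.Element("gsafeed")
    header = ET.SubElement(root, "header")
    ET.SubElement(header, "datasource").text = "web"
    ET.SubElement(header, "feedtype").text = "metadata-and-url"

    group = ET.SubElement(root, "group")
    for url, metadata in urls_with_metadata.items():
        # Only the URL is supplied; the crawler fetches the document itself.
        record = ET.SubElement(group, "record", url=url, mimetype="text/html")
        if metadata:
            meta_parent = ET.SubElement(record, "metadata")
            for name, content in metadata.items():
                ET.SubElement(meta_parent, "meta", name=name, content=content)

    return ET.tostring(root, encoding="unicode")


web_feed_xml = build_web_feed({
    "http://intranet.example.com/a.html": {"department": "sales"},
    "http://intranet.example.com/b.html": {},
})
```

Note that no record carries a content element: a web feed only lists URLs, and the appliance crawls them according to its crawl settings.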

A content feed provides the search appliance with both URLs and their content. A content feed:

Can have any name except “web”.

Provides content for each URL.

May include metadata.

Can be either full or incremental.

Is indexed only when the feed is received; the content and metadata are analyzed and added to
the index. The URLs submitted in a content feed are not crawled by the search appliance. Any
URLs found in the content that have not been submitted in a content feed are extracted and
scheduled for crawling if they match the crawling rules.
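For comparison, a content feed carries the document content itself. The minimal Python sketch below enforces the two constraints stated above (any name except "web", and a feed type of full or incremental). The element names (`gsafeed`, `header`, `datasource`, `feedtype`, `group`, `record`, `content`), the feed name `product_docs`, and the function name `build_content_feed` are assumptions made for illustration, not taken from this section.

```python
import xml.etree.ElementTree as ET


def build_content_feed(name, feedtype, documents):
    """Return content feed XML that supplies content for every URL.

    Element names are assumed for illustration; they are not given in this section.
    """
    if name == "web":
        raise ValueError('a content feed can have any name except "web"')
    if feedtype not in ("full", "incremental"):
        raise ValueError("a content feed must be either full or incremental")

    root = ET.Element("gsafeed")
    header = ET.SubElement(root, "header")
    ET.SubElement(header, "datasource").text = name
    ET.SubElement(header, "feedtype").text = feedtype

    group = ET.SubElement(root, "group")
    for url, html in documents.items():
        record = ET.SubElement(group, "record", url=url, mimetype="text/html")
        # ElementTree escapes the markup, so the HTML travels as text.
        ET.SubElement(record, "content").text = html

    return ET.tostring(root, encoding="unicode")


content_feed_xml = build_content_feed(
    "product_docs",
    "incremental",
    {"http://docs.example.com/intro.html": "<html><body>Hello</body></html>"},
)
```

Because the content is pushed rather than fetched, the URL in each record acts as an identifier for the document and is not crawled.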
