2. Overview of filter types

We now briefly consider each of the types of filter supported by the core Metaproxy binary. This overview is intended to give a flavor of the available functionality; more detailed information about each type of filter is included below in Reference.

The filters are here named by the string that is used as the type attribute of a <filter> element in the configuration file to request them, with the name of the class that implements them in parentheses. (The classname is not needed for normal configuration and use of Metaproxy; it is useful only to developers.)

The filters are here listed in alphabetical order:

2.1. auth_simple (mp::filter::AuthSimple)

Simple authentication and authorization. The configuration specifies the name of a file that is the user register, which lists username:password pairs, one per line, colon-separated. When a session begins, it is rejected unless username and password are supplied, and match a pair in the register. The configuration file may also specify the name of another file that is the target register: this lists username:dbname,dbname... sets, one per line, with multiple database names separated by commas. When a search is processed, it is rejected unless the database to be searched is one of those listed as available to the user.
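The two registers and the two checks described above can be sketched as follows. This is only an illustration of the logic, not Metaproxy's implementation: the function names are hypothetical, and skipping blank lines is an assumption about the file format.

```python
def parse_user_register(text):
    """Map username -> password from 'username:password' lines."""
    users = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:  # assumption: blank lines are ignored
            continue
        username, _, password = line.partition(":")
        users[username] = password
    return users


def parse_target_register(text):
    """Map username -> set of database names from 'username:db1,db2' lines."""
    targets = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:  # assumption: blank lines are ignored
            continue
        username, _, dbs = line.partition(":")
        targets[username] = {db.strip() for db in dbs.split(",") if db.strip()}
    return targets


def session_allowed(users, username, password):
    # Reject unless both credentials are supplied and match a registered pair.
    if not username or not password:
        return False
    return users.get(username) == password


def search_allowed(targets, username, database):
    # Reject unless the database is listed as available to this user.
    return database in targets.get(username, set())
```

With a user register containing alice:secret and a target register containing alice:marc,books, a session for alice with the right password is accepted, and her searches are allowed against marc and books only.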

2.2. backend_test (mp::filter::Backend_test)

A partial sink that provides dummy responses in the manner of the yaz-ztest Z39.50 server. This is useful only for testing. Seriously, you don't need this. Pretend you didn't even read this section.

2.3. bounce (mp::filter::Bounce)

A sink that swallows all packages and returns them almost unprocessed. It never sends a package of any type further down the route; instead it turns Z39.50 packages into Z_Close packages and HTTP_Request packages into HTTP_Response packages with error code 400, adding a suitable bounce message. The bounce filter is usually added at the end of each route to prevent, for example, HTTP request packages from hanging indefinitely when the only sink found in the route is the Z39.50 client partial sink.
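As a rough illustration of this behavior, the sketch below models a package as a small record and answers it directly instead of forwarding it. The Package class, the kind tags, and the message text are all hypothetical stand-ins, not Metaproxy types.

```python
from dataclasses import dataclass


@dataclass
class Package:
    kind: str          # hypothetical tag, e.g. "z3950" or "http_request"
    payload: str = ""


def bounce(package):
    """Answer the package directly; nothing is passed further down the route."""
    if package.kind == "z3950":
        # Z39.50 packages are answered with a close.
        return Package("z3950_close", "bounced: no filter handled this package")
    if package.kind == "http_request":
        # HTTP requests are answered with an error 400 response.
        return Package("http_response_400", "bounced: no filter handled this request")
    # Assumption: any other package type is returned as it came in.
    return package
```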

2.4. cql_rpn (mp::filter::CQLtoRPN)

A query-language-transforming filter that catches Z39.50 searchRequest packages containing CQL queries, transforms those to RPN queries, and sends the searchRequests on to the next filters. It is, among other things, useful in an SRU context.

2.5. frontend_net (mp::filter::FrontendNet)

A source that accepts Z39.50 connections from a port specified in the configuration, reads protocol units, and feeds them into the next filter in the route. When the result is received, it is returned to the origin.

2.6. http_file (mp::filter::HttpFile)

A partial sink which swallows only HTTP_Request packages, and returns the contents of files from the local filesystem in response to HTTP requests. It lets Z39.50 packages and all other forthcoming package types pass untouched. (Yes, Virginia, this does mean that Metaproxy is also a Web-server in its spare time. So far it does not contain either an email-reader or a Lisp interpreter, but that day is surely coming.)

2.7. http_rewrite (mp::filter::HttpRewrite)

A true filter that can rewrite HTTP requests and responses. Passes all other types through unmodified. There is a configuration example file config-rewrite.xml under the etc directory.

Warning

This filter is somewhat experimental. Its main use is in connection with our cf-proxy filter, which unfortunately cannot be released as Open Source.

2.8. load_balance (mp::filter::LoadBalance)

Performs load balancing for incoming Z39.50 init requests. It is used together with the virt_db filter but, unlike the multi filter, it sends an entire session to only one of the virtual backends. The load_balance filter assumes that all backend targets have equal content, and chooses the backend with the least load cost for a new session.

Warning

This filter is experimental and not yet mature for heavy load production sites.
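The selection rule described for load_balance can be sketched as below. How Metaproxy actually computes load cost is not stated here, so the sketch assumes a hypothetical cost model in which cost is simply the number of sessions currently open on a backend; the class and method names are illustrative only.

```python
class LoadBalancer:
    def __init__(self, backends):
        # Hypothetical cost model: number of sessions currently open per backend.
        self.cost = {backend: 0 for backend in backends}
        self.session_backend = {}

    def on_init(self, session_id):
        # Choose the backend with the least cost. min() returns the first
        # minimum, so ties go to the earliest-listed backend.
        backend = min(self.cost, key=self.cost.get)
        self.cost[backend] += 1
        self.session_backend[session_id] = backend
        return backend

    def backend_for(self, session_id):
        # The whole session stays on the backend chosen at init time.
        return self.session_backend[session_id]

    def on_close(self, session_id):
        backend = self.session_backend.pop(session_id)
        self.cost[backend] -= 1
```

The choice is made once per session, at init time, which matches the description that an entire session goes to only one backend.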

2.9. log (mp::filter::Log)

Writes logging information to standard output, and passes on the package unchanged. A log file name can be specified, as well as multiple different logging formats.

2.10. multi (mp::filter::Multi)

Performs multi-database searching. See the extended discussion of virtual databases and multi-database searching below.

2.11. query_rewrite (mp::filter::QueryRewrite)

Rewrites Z39.50 Type-1 and Type-101 ("RPN") queries by a three-step process: the query is transliterated from Z39.50 packet structures into an XML representation; that XML representation is transformed by an XSLT stylesheet; and the resulting XML is transliterated back into the Z39.50 packet structure.
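The three-step shape of this process can be sketched in miniature. Everything concrete here is an assumption made for illustration: the query is modeled as nested tuples rather than Z39.50 packet structures, the XML vocabulary (operator and term elements) is invented, and a plain Python function stands in for the XSLT stylesheet.

```python
import xml.etree.ElementTree as ET


def rpn_to_xml(node):
    """Step 1: turn a nested-tuple query into an XML tree."""
    if node[0] == "term":
        _, use, text = node
        elem = ET.Element("term", {"use": str(use)})
        elem.text = text
        return elem
    op, left, right = node
    elem = ET.Element("operator", {"type": op})
    elem.append(rpn_to_xml(left))
    elem.append(rpn_to_xml(right))
    return elem


def transform(root):
    """Step 2: stand-in for the XSLT step; remaps use attribute 4 to 1016."""
    for term in root.iter("term"):
        if term.get("use") == "4":
            term.set("use", "1016")
    return root


def xml_to_rpn(elem):
    """Step 3: turn the XML tree back into a nested-tuple query."""
    if elem.tag == "term":
        return ("term", int(elem.get("use")), elem.text)
    left, right = list(elem)
    return (elem.get("type"), xml_to_rpn(left), xml_to_rpn(right))


def rewrite(query):
    return xml_to_rpn(transform(rpn_to_xml(query)))
```

The point of routing the rewrite through XML is that the transformation itself lives in a stylesheet, so it can be changed without touching the code that converts to and from the packet structure.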

2.12. record_transform (mp::filter::RecordTransform)

This filter acts only on Z39.50 present requests, and lets all other types of packages and requests pass untouched. Its use is twofold: blocking Z39.50 present requests that the backend server does not understand and cannot honor, and transforming the present syntax and elementset name according to the rules specified, so as to fetch only existing record formats and transform them on the fly to the requested record syntaxes.

2.13. session_shared (mp::filter::SessionShared)

This filter implements global sharing of result sets (i.e. between threads and therefore between clients), yielding performance improvements by clever resource pooling.

2.14. sru_z3950 (mp::filter::SRUtoZ3950)

This filter transforms valid SRU GET/POST/SOAP searchRetrieve requests to Z39.50 init, search, and present requests, and wraps the received hit counts and XML records into suitable SRU response messages. The sru_z3950 filter also processes SRU GET/POST/SOAP explain requests, returning either the absolute minimum required by the standard, or a full pre-defined ZeeReX explain record. See the ZeeReX Explain standard pages and the SRU Explain pages for more information on the correct explain syntax. SRU scan requests are not supported yet.

2.15. template (mp::filter::Template)

Does nothing at all, merely passing the packet on. (Maybe it should be called nop or passthrough?) This exists not to be used, but to be copied - to become the skeleton of new filters as they are written. As with backend_test, this is not intended for civilians.

2.16. virt_db (mp::filter::VirtualDB)

Performs virtual database selection: based on the name of the database in the search request, a server is selected, and its address added to the request in a VAL_PROXY otherInfo packet. It will subsequently be used by a z3950_client filter. See the extended discussion of virtual databases and multi-database searching below.
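A minimal sketch of this selection step follows. The request is modeled as a plain dictionary, the database names and addresses are made up, and raising an error for an unknown database is an assumption; only the idea of attaching the chosen address as VAL_PROXY otherInfo comes from the description above.

```python
# Hypothetical mapping from virtual database name to backend address.
VIRTUAL_DATABASES = {
    "books": "z3950.example.org:210/books",
    "articles": "z3950.example.net:9999/articles",
}


def select_backend(search_request):
    """Return a copy of the request with the chosen address attached."""
    database = search_request["database"]
    target = VIRTUAL_DATABASES.get(database)
    if target is None:
        # Assumption: an unknown database name is treated as an error.
        raise LookupError(f"unknown database: {database}")
    routed = dict(search_request)
    # The address travels with the request for a later filter to use.
    routed["other_info"] = {"VAL_PROXY": target}
    return routed
```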

2.17. z3950_client (mp::filter::Z3950Client)

A partial sink which swallows only Z39.50 packages. It performs Z39.50 searching and retrieval by proxying the packages that are passed to it. Init requests are sent to the address specified in the VAL_PROXY otherInfo attached to the request: this may have been specified by the client, or generated by a virt_db filter earlier in the route. Subsequent requests are sent to the same address, which is remembered at Init time in a Session object. HTTP_Request packages and all other forthcoming package types are passed untouched.
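The routing behavior described above (address taken from VAL_PROXY at Init, then reused for the rest of the session) might be sketched like this. The class, the dictionary-based packages, and the returned tuples are hypothetical; no network traffic is performed, and the sketch assumes every Z39.50 session starts with an init request.

```python
class Z3950ClientSketch:
    def __init__(self):
        # Session id -> backend address remembered at Init time.
        self.sessions = {}

    def handle(self, session_id, package):
        """Return ("proxy", address) for Z39.50 or ("pass", package) otherwise."""
        if package.get("protocol") != "z3950":
            # HTTP and other package types are passed on untouched.
            return ("pass", package)
        if package.get("type") == "init":
            # Remember the address carried in the VAL_PROXY otherInfo.
            self.sessions[session_id] = package["other_info"]["VAL_PROXY"]
        # Init and all subsequent requests go to the remembered address.
        return ("proxy", self.sessions[session_id])
```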

2.18. zeerex_explain (mp::filter::ZeerexExplain)

This filter acts as a sink for Z39.50 explain requests, returning a static ZeeReX Explain XML record from the config section. All other packages are passed through. See the ZeeReX Explain standard pages for more information on the correct explain syntax.

Warning

This filter is not yet completed.