NAME

zoom — Metaproxy ZOOM Module

DESCRIPTION

This filter implements a generic client based on the ZOOM API of YAZ. The client supports the same protocols as ZOOM C: Z39.50, SRU (GET, POST, SOAP) and Solr.

This filter only deals with Z39.50 on input. The following services are supported: init, search, present and close. The backend target is selected based on the database as part of search and not as part of init.

This filter is an alternative to the z3950_client filter, but it also shares properties with the virt_db filter, in that the target is selected for a specific database.

The ZOOM filter relies on a target profile description, which is XML based. The profile for a given database is either fetched from a web service or given locally for each unique database (known as a virtual database in virt_db). Target profiles are given directly or indirectly as part of the torus element in the configuration.

CONFIGURATION

The configuration consists of six parts: torus, fieldmap, cclmap, contentProxy, log and zoom.

torus

The torus element specifies target profiles and takes the following content:

attribute url

URL of Web service to be used to fetch target profiles from a remote service (Torus normally).

The sequence %query is replaced with a CQL query for the Torus search.

The special sequence %realm is replaced by the value of attribute realm or by the realm DATABASE argument.

The special sequence %db is replaced with a single database while searching. Note that this sequence is no longer needed, because the %query can already query for a single database by using CQL query udb==....

attribute content_url

URL of Web service to be used to fetch target profile for a given database (udb) of type content. Semantics are otherwise like url attribute above.

attribute auth_url

URL of Web service to be used for auth/IP lookup. If this is defined, all access is granted or denied as part of Z39.50 Init by the ZOOM module, and the use of database parameters realm and torus_url is not allowed. If this setting is not defined, all access is allowed and realm and/or torus_url may be used.

attribute auth_hostname

Limits IP lookup to a given logical hostname.

attribute realm

The default realm value. Used for %realm in URL, unless specified in DATABASE parameter.

attribute proxy

HTTP proxy to be used for fetching target profiles.

attribute xsldir

Directory that is searched for XSL stylesheets. Stylesheets are specified in the target profile by the transform element.

attribute element_transform

Specifies the element that triggers retrieval and transform using the parameters elementSet, recordEncoding, requestSyntax and transform from the target profile. The default value is "pz2" because, for historical reasons, the common format is the one used in Pazpar2.

attribute element_raw

Specifies an element that triggers retrieval using the parameters elementSet, recordEncoding, requestSyntax from the target profile. Same actions as for element_transform, but without the XSL transform. Useful for debugging. The default value is "raw".

attribute explain_xsl

Specifies a stylesheet that converts one or more Torus records to ZeeRex Explain records. The content of recordData is assumed to hold each Explain record.

attribute record_xsl

Specifies a stylesheet that converts retrieval records after transform/literal operations.

When Metaproxy creates a content proxy session, the XSL parameter cproxyhost is passed to the transform.

element records

Local target profiles. This element may include zero or more record elements (one per target profile). See section TARGET PROFILE.
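
A minimal sketch of a torus element that uses a remote service follows. The host name torus.example.org, the URL path layout, the realm name and the stylesheet directory are illustrative assumptions and are not taken from a real installation.

    <filter type="zoom">
      <!-- %realm and %query are filled in by the ZOOM filter at run time -->
      <torus
         url="http://torus.example.org/records/%realm/?query=%query"
         realm="mylibrary"
         xsldir="/usr/local/share/metaproxy/xsl"
         element_transform="pz2"
         element_raw="raw"
         />
    </filter>

With this configuration, %realm would be replaced by mylibrary unless the client supplies a realm DATABASE parameter.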

fieldmap

The fieldmap may be specified zero or more times. It specifies the map from CQL fields to CCL fields, and takes the following content:

attribute cql

CQL field that we are mapping "from".

attribute ccl

CCL field that we are mapping "to".

cclmap

The third part of the configuration consists of zero or more cclmap elements that specify the base CCL profile to be used for all targets. This base profile is combined with the cclmap definitions from the target profile.
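
As an illustration of how fieldmap and cclmap work together, the fragment below maps one CQL index to a CCL field and gives that CCL field a base definition. The choice of dc.creator and au is only an example; 1003 is the Bib-1 use attribute for author.

    <filter type="zoom">
      <!-- CQL index dc.creator is rewritten to the CCL field au -->
      <fieldmap cql="dc.creator" ccl="au"/>
      <!-- base CCL profile, applied to all targets -->
      <cclmap>
        <qual name="au">
          <attr type="u" value="1003"/>
        </qual>
      </cclmap>
    </filter>

A target profile can still supply its own definition for the same field through a cclmap_au element.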

contentProxy

The contentProxy element controls content proxying. This section is optional and should be defined only if content proxying is enabled.

attribute config_file

Specifies the file that configures the cf-proxy system. Metaproxy uses the settings sessiondir and proxyhostname from that file to configure the name of the proxy host and the directory of parameter files for the cf-proxy.

attribute server

Specifies the content proxy host. The host is of the form host[:port], that is, without a scheme prefix such as http://. The port number is optional.

Note

This setting is deprecated. Use the config_file (above) to inform about the proxy server.

attribute tmp_file

Specifies the filename of a session file for content proxying. The file should be an absolute filename that includes XXXXXX which is replaced by a unique filename using the mkstemp(3) system call. The default value of this setting is /tmp/cf.XXXXXX.p.

Note

This setting is deprecated. Use the config_file (above) to inform about the session file area.
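
A sketch of the non-deprecated form follows. The torus URL and the path of the cf-proxy configuration file are hypothetical; the file itself is expected to carry the sessiondir and proxyhostname settings mentioned above.

    <filter type="zoom">
      <torus url="http://torus.example.org/records/?query=%query"/>
      <!-- hypothetical path; Metaproxy reads sessiondir and
           proxyhostname from this file -->
      <contentProxy config_file="/etc/cf-proxy/cproxy.cfg"/>
    </filter>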

log

The log element controls logging for the ZOOM filter.

attribute apdu

If the value of apdu is "true", then protocol packages (APDUs and HTTP packages) from the ZOOM filter will be logged to the yaz_log system. A value of "false" will not perform logging of protocol packages (the default behavior).

zoom

The zoom element controls settings for ZOOM.

attribute timeout

An integer that specifies, in seconds, how long an operation may take before ZOOM gives up. The default value is 40.

attribute proxy_timeout

An integer that specifies, in seconds, how long a proxy check will wait before giving up. The default value is 1.
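
For instance, the following fragment enables protocol logging and overrides both timeouts. The values 20 and 2 are arbitrary and only show the syntax.

    <filter type="zoom">
      <!-- log APDUs and HTTP packages via yaz_log -->
      <log apdu="true"/>
      <!-- give up on an operation after 20 seconds,
           on a proxy check after 2 seconds -->
      <zoom timeout="20" proxy_timeout="2"/>
    </filter>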

QUERY HANDLING

The ZOOM filter accepts three query types: RPN (Type-1), CCL and CQL.

Queries are converted in two separate steps. In the first step the input query is converted to RPN/Type-1. This is always the common internal format between step 1 and step 2. In step 2 the query is converted to the native query type of the target.

Step 1: for RPN, the query is passed unmodified to step 2, since RPN is already the internal format.

Step 1: for CCL, the query is converted to RPN via the cclmap elements that are part of the target profile, combined with the base CCL maps.

Step 1: for CQL, the query is converted to CCL. The mappings of CQL fields to CCL fields are handled by the fieldmap elements of the configuration. The resulting CCL query is then converted to RPN in the same way as CCL input (via cclmap).

Step 2: if the target is Z39.50-based, the RPN query is passed verbatim. If the target is SRU-based, the RPN is converted to CQL. If the target is Solr-based, the RPN is converted to Solr's query type.
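
The two steps can be traced with a small configuration. This is a sketch only: the database name sru-demo and the host sru.example.org are hypothetical, and the string use attribute "title" is borrowed from the Solr example in EXAMPLES.

    <filter type="zoom">
      <torus>
        <records>
          <record>
            <udb>sru-demo</udb>
            <!-- step 1, CCL to RPN: ti gets the string use attribute title -->
            <cclmap_ti>u=title</cclmap_ti>
            <!-- step 2: the target is SRU, so RPN is converted to CQL -->
            <sru>get</sru>
            <zurl>sru.example.org/db</zurl>
          </record>
        </records>
      </torus>
      <!-- step 1, CQL to CCL: dc.title becomes the CCL field ti -->
      <fieldmap cql="dc.title" ccl="ti"/>
    </filter>

Under these assumptions, a CQL search on dc.title against sru-demo would first become a CCL query on ti, then an RPN query with the use attribute title, and finally a CQL query again for the SRU target. A string-based use attribute is chosen because the RPN to CQL conversion relies on it (see cclmap_field).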

SORTING

The ZOOM module actively handles CQL sorting, using the SORTBY parameter that was introduced in SRU version 1.2. The conversion from a SORTBY clause to the native sort of a target is driven by two parameters: sortStrategy and sortmap_field.

If a sort field does not have an equivalent sortmap_ mapping, it is passed unmodified through the conversion. No diagnostic is returned.
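
A sketch of a target profile with sort settings follows. The database name and host are hypothetical, and the assumption that a sortmap value for the cql strategy is a CQL index name is not confirmed by this page; the exact form depends on sortStrategy.

    <filter type="zoom">
      <torus>
        <records>
          <record>
            <udb>sort-demo</udb>
            <sru>get</sru>
            <zurl>sru.example.org/db</zurl>
            <!-- pass sorting to the target as a CQL sortby clause -->
            <sortStrategy>cql</sortStrategy>
            <!-- assumed form: native CQL index used when sorting by title -->
            <sortmap_title>dc.title</sortmap_title>
          </record>
        </records>
      </torus>
    </filter>

A sort field without a sortmap_ entry in this profile, for example date, would be passed through unmodified.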

TARGET PROFILE

The ZOOM module is driven by a number of settings that specify how to handle each target. Note that unknown elements are silently ignored.

The elements, in alphabetical order, are:

authentication

Authentication parameters to be sent to the target. For Z39.50 targets, this will be sent as part of the Init Request. Authentication consists of two components: username and password, separated by a slash.

If this value is omitted or empty, no authentication information is sent.

authenticationMode

Specifies how authentication parameters are passed to the server for SRU. Possible values are: url and basic. In url mode, username and password are carried in the URL arguments x-username and x-password. In basic mode, HTTP basic authentication is used. This setting only takes effect if authentication is set.

If this value is omitted, HTTP basic authentication is used.

cclmap_field

This value specifies the CCL field (qualifier) definition for some field. For Z39.50 targets this most likely will specify the mapping to a numeric use attribute + a structure attribute. For SRU targets, the use attribute should be string based, in order to make the RPN to CQL conversion work properly (step 2).

cfAuth

When cfAuth is defined, its value will be used as authentication to the backend target, and the authentication setting will be specified as part of a database. This is like a "proxy" for authentication and is used for Connector Framework based targets.

cfProxy

Specifies HTTP proxy for the target in the form host:port.

cfSubDB

Specifies sub database for a Connector Framework based target.

contentAuthentication

Specifies authentication info to be passed to a content connector. This is only used if content-user and content-password are omitted.

contentConnector

Specifies a database for content-based proxying.

elementSet

Specifies the elementSet to be sent to the target if record transform is enabled (not to be confused with the record_transform module). The record transform is enabled only if the client uses record syntax = XML and an element set determined by the element_transform / element_raw from the configuration. By default that is the element sets pz2 and raw. If record transform is not enabled, this setting is not used and the element set specified by the client is passed verbatim.

literalTransform

Specifies an XSL stylesheet to be used if record transform is enabled; see description of elementSet. The XSL transform is only used if the element set is set to the value of element_transform in the configuration.

The value of literalTransform is the XSL - string encoded.

piggyback

A value of 1/true is a hint to the ZOOM module that this Z39.50 target supports piggyback searches, i.e. Search Response with records. Any other value (false) prevents the ZOOM module from using piggyback, so that all records are returned as part of Present Response.

queryEncoding

If this value is defined, all queries will be converted to this encoding. This should be used for all Z39.50 targets that do not use UTF-8 for query terms.

recordEncoding

Specifies the character encoding of records that are returned by the target. This is primarily used for targets where records are not already UTF-8 encoded. This setting is only used if the record transform is enabled (see description of elementSet).

requestSyntax

Specifies the record syntax to be specified for the target if record transform is enabled; see description of elementSet. If record transform is not enabled, the record syntax of the client is passed verbatim to the target.

sortmap_field

This value specifies the native field for a target. The form of the value is given by sortStrategy.

sortStrategy

Specifies the sort strategy for a target. One of: z3950, type7, cql, sru11 or embed. The embed strategy chooses type-7 or CQL sortby, depending on whether Type-1 or CQL is actually sent to the target.

sru

If this setting is set, it specifies that the target is web service based. The value must be one of: get, post, soap or solr.

sruVersion

Specifies the SRU version to use. If unset, version 1.2 will be used. Some servers do not support this version, in which case version 1.1 or even 1.0 could be set.

transform

Specifies an XSL stylesheet filename to be used if record transform is enabled; see description of elementSet. The XSL transform is only used if the element set is set to the value of element_transform in the configuration.

udb

This value is required and specifies the unique database for this profile. All target profiles should hold a unique database.

urlRecipe

The value of this field is a string that generates a dynamic link based on record content. If the resulting string is non-empty, a new metadata field with attribute type="generated-url" is generated. The content of this field is the result of the URL recipe conversion. The urlRecipe value may refer to an existing metadata element by ${field[pattern/result/flags]}, which takes the content of the field and performs a regular expression conversion using the pattern given. For example: ${md-title[\s+/+/g]} takes metadata element title and converts one or more spaces to a plus character.

zurl

This setting is mandatory. It specifies the ZURL of the target in the form of host/database. The HTTP method prefix (such as http://) should not be provided, as it is inferred from the value of the sru setting.
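
To show how several of these elements fit together, here is a sketch with one Z39.50 target and one SRU target. All host names, database names, credentials, the stylesheet name marc21.xsl and the link in urlRecipe are hypothetical; the encoding, syntax and element set values are merely typical choices. Elements are given in the order used by the schema in section SCHEMA.

    <filter type="zoom">
      <torus xsldir="/usr/local/share/metaproxy/xsl">
        <records>
          <record>
            <!-- username and password, separated by a slash -->
            <authentication>guest/secret</authentication>
            <piggyback>1</piggyback>
            <queryEncoding>iso-8859-1</queryEncoding>
            <udb>z-demo</udb>
            <cclmap_ti>u=4</cclmap_ti>
            <!-- the next four apply only when record transform is enabled -->
            <elementSet>F</elementSet>
            <recordEncoding>marc-8</recordEncoding>
            <requestSyntax>usmarc</requestSyntax>
            <transform>marc21.xsl</transform>
            <zurl>z3950.example.org:210/demo</zurl>
          </record>
          <record>
            <authentication>guest/secret</authentication>
            <!-- send credentials as x-username and x-password -->
            <authenticationMode>url</authenticationMode>
            <udb>sru-demo</udb>
            <cclmap_ti>u=title</cclmap_ti>
            <sru>get</sru>
            <sruVersion>1.1</sruVersion>
            <!-- spaces in the title are turned into plus characters -->
            <urlRecipe>http://www.example.org/find/${md-title[\s+/+/g]}</urlRecipe>
            <zurl>sru.example.org/db</zurl>
          </record>
        </records>
      </torus>
    </filter>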

DATABASE parameters

Extra information may be carried in the Z39.50 Database or SRU path, such as authentication to be passed to backend etc. Some of the parameters override TARGET profile values. The format is:

udb,parm1=value1&parm2=value2&...

Here udb is the unique database recognised by the backend, and parm1=value1, parm2=value2 and so on are the parameters to be passed. As with form values in HTTP, parameter names and values are URL encoded. The separator between udb and the parameters, however, is a comma rather than a question mark. In an SRU request, what follows the question mark are the regular HTTP arguments (in this case SRU arguments).

The database parameters, in alphabetical order, are:

content-password

The password to be used for the content proxy session. If this parameter is not given, the value of parameter password is passed to the content proxy session.

content-proxy

Specifies the proxy to be used for the content proxy session. If this parameter is not given, the value of parameter proxy is passed to the content proxy session.

content-user

The user to be used for the content proxy session. If this parameter is not given, the value of parameter user is passed to the content proxy session.

cproxysession

Specifies the session ID for content proxy. This parameter is, generally, not used by anything but the content proxy itself when invoking Metaproxy via SRU.

nocproxy

If this parameter is specified, content-proxying is disabled for the search.

password

Specifies the password to be passed to the backend. It is also passed to the content proxy session, unless overridden by content-password. If this parameter is omitted, the password will be taken from the TARGET profile setting authentication.

proxy

Specifies one or more proxies for backend. If this parameter is omitted, the proxy will be taken from TARGET profile setting cfProxy. The parameter is a list of comma-separated host:port entries. Both host and port must be given for each proxy.

realm

Session realm to be used for this target. It changes the URL used for fetching a target profile by changing the value that is substituted for the %realm sequence. This parameter is not allowed if access is controlled by auth_url in the configuration.

retry

Optional parameter. If the value is 0, retry on failure is disabled for the ZOOM module. Any other value enables retry on failure. If this parameter is omitted, then the value of retryOnFailure from the Torus record is used (same values).

torus_url

Sets the URL from which Torus records are fetched, overriding the value of the url attribute of the torus element in the zoom configuration. This parameter is not allowed if access is controlled by auth_url in the configuration.

user

Specifies the user to be passed to the backend. It is also passed to the content proxy session unless overridden by content-user. If this parameter is omitted, the user will be taken from the TARGET profile setting authentication.

x-parm

All parameters that have prefix "x-" are passed verbatim to the backend.
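
The relation between DATABASE parameters and target profile settings can be sketched as follows. The profile values, host names and credentials are hypothetical; the comments show database names a client might send and which setting each parameter would replace.

    <filter type="zoom">
      <torus>
        <records>
          <record>
            <!-- default credentials;
                 db-demo,user=alice&password=s%26cret
                 sends alice and s&cret to the backend instead -->
            <authentication>guest/secret</authentication>
            <udb>db-demo</udb>
            <zurl>z3950.example.org:210/demo</zurl>
            <!-- default proxy;
                 db-demo,proxy=proxy2.example.org:3128 overrides it -->
            <cfProxy>proxy1.example.org:3128</cfProxy>
            <!-- default retry behaviour;
                 db-demo,retry=0 disables retry for that search -->
            <retryOnFailure>1</retryOnFailure>
          </record>
        </records>
      </torus>
    </filter>

Note that the ampersand inside the password is written as %26, since parameter values are URL encoded and a literal ampersand separates parameters.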

SCHEMA

# Metaproxy XML config file schemas
#
#   Copyright (C) Index Data
#   See the LICENSE file for details.

namespace mp = "http://indexdata.com/metaproxy"

filter_zoom =
  attribute type { "zoom" },
  attribute id { xsd:NCName }?,
  attribute name { xsd:NCName }?,
  element mp:torus {
    attribute allow_ip { xsd:string }?,
    attribute auth_url { xsd:string }?,
    attribute url { xsd:string }?,
    attribute content_url { xsd:string }?,
    attribute realm { xsd:string }?,
    attribute xsldir { xsd:string }?,
    attribute element_transform { xsd:string }?,
    attribute element_raw { xsd:string }?,
    attribute element_passthru { xsd:string }?,
    attribute proxy { xsd:string }?,
    attribute explain_xsl { xsd:string }?,
    attribute record_xsl { xsd:string }?,
    element mp:records {
      element mp:record {
        element mp:authentication { xsd:string }?,
        element mp:authenticationMode { xsd:string }?,
        element mp:piggyback { xsd:string }?,
        element mp:queryEncoding { xsd:string }?,
        element mp:udb { xsd:string },
        element mp:cclmap_au { xsd:string }?,
        element mp:cclmap_date { xsd:string }?,
        element mp:cclmap_isbn { xsd:string }?,
        element mp:cclmap_su { xsd:string }?,
        element mp:cclmap_term { xsd:string }?,
        element mp:cclmap_ti { xsd:string }?,
        element mp:contentAuthentication { xsd:string }?,
        element mp:elementSet { xsd:string }?,
        element mp:recordEncoding { xsd:string }?,
        element mp:requestSyntax { xsd:string }?,
        element mp:sru { xsd:string }?,
        element mp:sruVersion { xsd:string }?,
        element mp:transform { xsd:string }?,
        element mp:literalTransform { xsd:string }?,
        element mp:urlRecipe { xsd:string }?,
        element mp:zurl { xsd:string },
        element mp:cfAuth { xsd:string }?,
        element mp:cfProxy { xsd:string }?,
        element mp:cfSubDB { xsd:string }?,
        element mp:contentConnector { xsd:string }?,
        element mp:sortStrategy { xsd:string }?,
        element mp:sortmap_author { xsd:string }?,
        element mp:sortmap_date { xsd:string }?,
        element mp:sortmap_title { xsd:string }?,
        element mp:extraArgs { xsd:string }?,
        element mp:rpn2cql { xsd:string }?,
        element mp:retryOnFailure { xsd:string }?
      }*
    }?
  }?,
  element mp:fieldmap {
    attribute cql { xsd:string },
    attribute ccl { xsd:string }?
  }*,
  element mp:cclmap {
    element mp:qual {
      attribute name { xsd:string },
      element mp:attr {
        attribute type { xsd:string },
        attribute value { xsd:string }
      }+
    }*
  }?,
  element mp:contentProxy {
    attribute config_file { xsd:string }?,
    attribute server { xsd:string }?,
    attribute tmp_file { xsd:string }?
  }?,
  element mp:log {
    attribute apdu { xsd:boolean }?
  }?,
  element mp:zoom {
    attribute timeout { xsd:integer }?,
    attribute proxy_timeout { xsd:integer }?
  }?

  

EXAMPLES

In the example below, target definitions (Torus records) are fetched from a web service via a proxy. A CQL profile is configured which maps to a set of CCL fields ("no field", au, ti and su). Presumably, the fetched target definitions map the CCL fields to their native RPN. A CCL field "ocn" is mapped for all targets. Logging of APDUs is enabled, and a timeout is given.

    <filter type="zoom">
      <torus
         url="http://torus.indexdata.com/src/records/?query=%query"
         proxy="localhost:3128"
         />
      <fieldmap cql="cql.anywhere"/>
      <fieldmap cql="cql.serverChoice"/>
      <fieldmap cql="dc.creator" ccl="au"/>
      <fieldmap cql="dc.title" ccl="ti"/>
      <fieldmap cql="dc.subject" ccl="su"/>

      <cclmap>
        <qual name="ocn">
          <attr type="u" value="12"/>
          <attr type="s" value="107"/>
        </qual>
      </cclmap>
      <log apdu="true"/>
      <zoom timeout="40"/>
    </filter>

   

Here is another example with two locally defined targets: a Solr target and a Z39.50 target.

      <filter type="zoom">
        <torus>
          <records>
            <record>
              <udb>ocs-test</udb>
              <cclmap_term>t=z</cclmap_term>
              <cclmap_ti>u=title t=z</cclmap_ti>
              <sru>solr</sru>
              <zurl>ocs-test.indexdata.com/solr/select</zurl>
            </record>
            <record>
              <udb>loc</udb>
              <cclmap_term>t=l,r</cclmap_term>
              <cclmap_ti>u=4 t=l,r</cclmap_ti>
              <zurl>lx2.loc.gov:210/LCDB_MARC8</zurl>
            </record>
          </records>
        </torus>
        <fieldmap cql="cql.serverChoice"/>
        <fieldmap cql="dc.title" ccl="ti"/>
      </filter>

   

SEE ALSO

metaproxy(1)

virt_db(3mp)

COPYRIGHT

Copyright (C) 2005-2023 Index Data