cf_webservice — Connector Framework Web service
cf_webservice is a Metaproxy filter which offers a Web service for the Connector Framework.
The module may also provide a Z39.50 server for dealing with search-based connectors. For a description of the Z39.50 server functionality, refer to the cf-zserver(8) manual page. The remainder of this man page focuses on the Web service.
The Web service uses JSON content for responses. A future version may also support XML.
HTTP clients must use Content-Type application/json to post JSON content. The Web service will use the same Content-Type for JSON content responses as well. HTTP clients must use Content-Type text/xml to post XML.
The cf_webservice module filters only HTTP requests with a certain prefix. The default prefix is "connector" and is used in the description that follows.
The following requests are offered by the webservice:
Creates a Connector Framework session. The content is a Connector Framework file (XML).
Session data passed as a JSON string in the X-CF-Args header will be decoded and available to the connector in the $.session object.
If successful, the response includes a JSON object with a single member "id" holding an integer session value. This session value must be used in subsequent requests to refer to this connector.
If the content is empty, no connector is loaded into the engine. In this case only the engine session is established. A connector may be loaded later with the load_cf operation (see below).
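The session creation step described above can be sketched as follows. This is a minimal sketch, not a definitive client: parse_session_id and create_session are hypothetical helper names, the base URL is taken from the example script in this page and depends on the actual Metaproxy configuration, and the member "user" in the usage line is invented for illustration.

```shell
#!/bin/sh
# Extract the integer session id from a response such as {"id":7}.
# Assumes the response is a JSON object with one integer member "id".
parse_session_id() {
    printf '%s' "$1" |
        sed -n 's/.*"id"[[:space:]]*:[[:space:]]*\([0-9][0-9]*\).*/\1/p'
}

# Create an engine session with empty content (no connector loaded yet).
# $1 is the base URL; $2 is optional session data as a JSON string,
# sent in the X-CF-Args header.
create_session() {
    if [ -n "${2:-}" ]; then
        response=$(curl --silent --header "X-CF-Args: $2" \
            --data-binary "" "$1") || return 1
    else
        response=$(curl --silent --data-binary "" "$1") || return 1
    fi
    parse_session_id "$response"
}
```

Usage might look like ID=$(create_session http://localhost:9070/connector '{"user":"demo"}'), after which $ID identifies the session in later requests.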
One or more arguments may be given for the POST in the form of name=value pairs, separated by &.
proxy=IP
Specifies the HTTP proxy for the session.
thread=0|1
Enables threaded mode (value 1), or forked mode (value 0). If thread is not given, forked mode is used.
loglevel=level
Specifies log level for the engine session. The following names are recognized: DEBUG, INFO, WARN, ERROR.
logmodules=modules
Enables logging only for a subset of modules to be retrieved by the log webservice command. The modules list is a comma-separated list of named modules. The available modules are: runtime (JavaScript runtime logger), engine (Engine encapsulating browser), timing (timing for tasks), stdout (unstructured text printed to standard output in various places). By default logging is enabled for all modules.
timeout=seconds
Sets the timeout for each task in the session. Any task that takes longer than this amount will be aborted and the session will be terminated. By default, the timeout is 120 seconds (2 minutes).
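The name=value arguments listed above can be combined as in the sketch below. It assumes that the pairs are passed as a query string on the POST URL, which the text above does not state explicitly; session_url is a hypothetical helper name and the base URL is the one used in the example script in this page.

```shell
#!/bin/sh
# Join name=value pairs with "&" and append them to the base URL.
# Assumption: the pairs travel in the query string of the POST URL.
session_url() {
    base=$1
    shift
    query=
    for pair in "$@"; do
        if [ -z "$query" ]; then
            query=$pair
        else
            query="$query&$pair"
        fi
    done
    if [ -n "$query" ]; then
        printf '%s?%s\n' "$base" "$query"
    else
        printf '%s\n' "$base"
    fi
}
```

For example, curl --data-binary "" "$(session_url http://localhost:9070/connector thread=1 loglevel=DEBUG timeout=60)" would request a threaded session with debug logging and a 60 second task timeout. The URL is quoted so that the shell does not interpret the & characters.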
id/op/opargs
Performs an operation op with arguments opargs on the connector identified by id.
The following operations, op, are supported: run_task, run_task_opt, run_tests, screen_shot, load_cf, log and dom_string.
For operations run_task and run_task_opt, the opargs is the name of the task to run and the POSTed content is the task parameters. The POSTed content must be JSON.
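The request path for an operation can be assembled from the prefix, the session id, the operation and its arguments. The following is a minimal sketch: op_url and run_task are hypothetical helper names, and the task name and parameters in the usage line are taken from the example script in this page.

```shell
#!/bin/sh
# Build the URL for an operation: base/id/op[/opargs].
op_url() {
    if [ -n "${4:-}" ]; then
        printf '%s/%s/%s/%s\n' "$1" "$2" "$3" "$4"
    else
        printf '%s/%s/%s\n' "$1" "$2" "$3"
    fi
}

# Run a task: $1 base URL, $2 session id, $3 task name, $4 JSON parameters.
run_task() {
    curl --silent --header "Content-Type: application/json" \
        --data-binary "$4" "$(op_url "$1" "$2" run_task "$3")"
}
```

A call such as run_task http://localhost:9070/connector "$ID" search '{"keyword":"water"}' posts the JSON parameters to the search task of session $ID.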
For operation run_tests, the opargs is the comma-separated list of test tasks to run.
Operation screen_shot returns a window dump of the current browser in PNG format. The Content-Type of the HTTP response is image/png.
Operation log returns the log for the connection session as it is produced by the Engine as well as the shared Java runtime. It may be limited to certain modules by the logmodules argument when POSTing a connector.
The log operation may optionally be followed by ?clear=1 which will clear the log upon completion. Thus a subsequent log operation returns only log material produced since the most recent log operation.
log is a special operation, where the POSTed content and Content-Type are ignored.
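Fetching the log, with or without clearing it, might be sketched like this. log_url and fetch_log are hypothetical helper names; the "{}" body mirrors the example script in this page and is ignored by the service according to the description above.

```shell
#!/bin/sh
# Build the log URL for a session: $1 base URL, $2 session id.
# Pass "clear" as $3 to append ?clear=1 so the log is cleared afterwards.
log_url() {
    if [ "${3:-}" = "clear" ]; then
        printf '%s/%s/log?clear=1\n' "$1" "$2"
    else
        printf '%s/%s/log\n' "$1" "$2"
    fi
}

# POST to the log operation; the posted content is ignored by the service.
fetch_log() {
    curl --silent --data-binary "{}" "$(log_url "$@")"
}
```

Calling fetch_log with the clear argument after each task would give the log output of that task alone, since each call empties the log for the next one.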
Operation dom_string returns the current DOM for the session rendered as a string. The POSTed content and Content-Type are ignored.
Operation load_cf loads the connector posted (XML). Currently the Content-Type is ignored. It should be text/xml.
If an operation is successfully completed (HTTP status 200), the HTTP response contains the result. For run_task, run_task_opt and run_tests the response is a JSON document. For run_tests, however, the response is simply a JSON object with a single member "result" whose boolean value is true for success and false for failure.
For operation log the response is text and the Content-Type is set to text/plain.
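A script can turn the run_tests response into an exit status. The sketch below assumes the response has the shape described above, for example {"result":true}; tests_passed and run_tests are hypothetical helper names.

```shell
#!/bin/sh
# Succeed (exit status 0) only if the response carries "result": true.
tests_passed() {
    printf '%s' "$1" | grep -Eq '"result"[[:space:]]*:[[:space:]]*true'
}

# Run test tasks: $1 base URL, $2 session id, $3 comma-separated task list.
run_tests() {
    response=$(curl --silent --header "Content-Type: application/json" \
        --data-binary "{}" "$1/$2/run_tests/$3") || return 1
    tests_passed "$response"
}
```

With these helpers, run_tests http://localhost:9070/connector "$ID" search,parse || echo "tests failed" >&2 reports a failure on standard error.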
id
Deletes the connector identified by id.
The webservice is implemented as a shared object for the Metaproxy server. The module ID is simply "cf".
The following elements may be given as part of the module configuration:
Specifies various settings for the environment in which the module is run. These are the values that were previously controlled by environment variables for cf-zserver. The env element takes several attributes. These are:
Same as CF_TMP_DIR.
Same as CF_APP_PATH.
Same as CF_MODULE_PATH.
Same as CF_DISPLAY_LOCK.
Same as CF_DISPLAY_CMD.
Same as CF_BASE_PATH.
Same as CF_CONNECTOR_PATH.
Same as CF_REPO_AUTH_URL.
Same as CF_REPO_FETCH_URL.
Specifies the HTTP path for the Web service. By default it is connector. If an HTTP request does not use the prefix given, the cf module will pass the request to the next module in the chain of modules defined by the Metaproxy configuration.
Controls the Z39.50 server interface of the module. This element takes one attribute, enable, which has values "false" to disable the Z39.50 server (default) or "true" to enable the Z39.50 server.
Below is shown a small Metaproxy configuration file which loads the CF Web service module:
<?xml version="1.0"?>
<metaproxy xmlns="http://indexdata.com/metaproxy" version="1.0">
  <dlpath>.</dlpath>
  <start route="start"/>
  <filters>
    <filter id="frontend" type="frontend_net">
      <port>@:9000</port>
      <threads>50</threads>
    </filter>
  </filters>
  <routes>
    <route id="start">
      <filter refid="frontend"/>
      <filter type="log">
        <category user-access="true" apdu="true" />
      </filter>
      <filter type="cf">
        <env app_path="/var/cache/cf"
             module_path="/usr/share/cf/modules"
             display=":1.0"
             tmpdir="/tmp/cfengine" />
        <url_prefix>connector</url_prefix>
      </filter>
      <filter type="bounce"/>
    </route>
  </routes>
</metaproxy>
The dlpath must be set to the directory containing Metaproxy modules, in particular the CF module metaproxy_filter_cf.so.
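Before starting Metaproxy, it can be useful to confirm that the directory given as dlpath really holds the CF module. A minimal sketch follows; has_cf_module is a hypothetical helper name and the directory in the usage line is a placeholder, not a documented install location.

```shell
#!/bin/sh
# Succeed if the CF module file exists in the given dlpath directory.
has_cf_module() {
    [ -f "$1/metaproxy_filter_cf.so" ]
}
```

For example, has_cf_module /path/to/metaproxy/modules || echo "metaproxy_filter_cf.so not found" >&2 prints a warning when the module is missing.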
#!/bin/sh
C=/usr/share/cf/connectors/inactive/doaj.cf
if test "$1"; then
    C=$1
fi
H=http://localhost:9070/connector

# Create session (empty content)
curl --output ws.log --data-binary "" $H

# Parse it
ID=`cat ws.log | cut -d":" -f 2|cut -d"}" -f 1`

# Load connector file
curl --data-binary @$C $H/$ID/load_cf

# Run a set of tests
curl --header "Content-Type: application/json; charset=UTF-8" --data-binary "{}" \
    $H/$ID/run_tests/search,parse,next,parse

# Run task search
curl --header "Content-Type: application/json" \
    --data-binary "{\"keyword\":\"water\"}" \
    $H/$ID/run_task/search

# Run task parse
curl --header "Content-Type: application/json" --data-binary "{}" \
    $H/$ID/run_task/parse

# Take screen shot (requires pnmtopng, xwdtopnm)
if test -x /usr/bin/pnmtopng; then
    curl --output screen.png \
        --data-binary "{}" $H/$ID/screen_shot
fi

# Run opt task init
curl --output init.log --header "Content-Type: application/json" --data-binary "{}" \
    $H/$ID/run_task_opt/init

# Get log
curl --header "Content-Type: application/json" --data-binary "{}" \
    $H/$ID/log

# Get dom
curl --header "Content-Type: text/html" --data-binary "{}" \
    $H/$ID/dom_string

# Delete the connector
curl --request DELETE $H/$ID