Chapter 1. Introduction

Table of Contents

1. Overview
2. Zebra Features Overview
2.1. Zebra Document Model
2.2. Zebra Search Features
2.3. Zebra Index Scanning
2.4. Zebra Document Presentation
2.5. Zebra Sorting and Ranking
2.6. Zebra Live Updates
2.7. Zebra Networked Protocols
2.8. Zebra Data Size and Scalability
2.9. Zebra Supported Platforms
3. References and Zebra based Applications
3.1. Koha free open-source ILS
3.2. Kete Open Source Digital Library and Archiving software
3.3. ReIndex.Net web based ILS
3.4. DADS - the DTV Article Database Service
3.5. ULS (Union List of Serials)
3.6. Various web indexes
4. Support

1. Overview

Zebra is a free, fast, friendly information management system. It can index records in XML/SGML, MARC, e-mail archives and many other formats, and quickly find them using a combination of boolean searching and relevance ranking. Search-and-retrieve applications can be written using APIs in a wide variety of languages, communicating with the Zebra server using industry-standard information-retrieval protocols or web services.

Zebra is open-source software and can be deployed by anyone for any purpose without license fees. The C source code is open to anybody to read and change under the GPL.

Zebra is a networked component which acts as a reliable Z39.50 server for record/document search, presentation, insert, update and delete operations. In addition, it understands the SRU family of web services, which exist in REST GET/POST and SOAP flavors.
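
As a rough illustration of the REST GET flavor, the sketch below builds an SRU 1.1 searchRetrieve URL in Python. The parameter names (version, operation, query, startRecord, maximumRecords) come from the SRU specification; the host, port, database name and the CQL index dc.title are assumptions chosen for the example and depend on how a particular server is configured. The helper build_sru_url is hypothetical and not part of Zebra.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def build_sru_url(base_url, cql_query, start_record=1, maximum_records=10):
    """Return an SRU 1.1 searchRetrieve URL suitable for an HTTP GET request.

    base_url is the server and database address (assumed, not Zebra-defined);
    cql_query is a CQL query string.
    """
    params = {
        "version": "1.1",
        "operation": "searchRetrieve",
        "query": cql_query,
        "startRecord": start_record,
        "maximumRecords": maximum_records,
    }
    # urlencode percent-encodes quotes, '=' and spaces in the CQL query.
    return base_url + "?" + urlencode(params)


if __name__ == "__main__":
    # Hypothetical server address and database name.
    url = build_sru_url("http://localhost:9999/Default", 'dc.title = "zebra"')
    print(url)

    # Decoding the query string recovers the original CQL query.
    decoded = parse_qs(urlsplit(url).query)
    print(decoded["query"][0])
```

The resulting URL can be fetched with any HTTP client; the server answers with an XML searchRetrieve response. Which CQL indexes are accepted is a matter of server configuration.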

Zebra is available as an MS Windows 2003 Server (32 bit) self-extracting package as well as Debian GNU/Linux (32 bit and 64 bit) precompiled packages. It has been deployed successfully on other Unix systems, including Sun Sparc, HP Unix, and many variants of Linux and BSD-based systems.

Zebra is a high-performance, general-purpose structured text indexing and retrieval engine. It reads records in a variety of input formats (e.g. email, XML, MARC) and provides access to them through a powerful combination of boolean search expressions and relevance-ranked free-text queries.

Zebra supports large databases (tens of millions of records, tens of gigabytes of data). It allows safe, incremental database updates on live systems. Because Zebra supports the industry-standard information retrieval protocol, Z39.50, you can search Zebra databases using an enormous variety of programs and toolkits, both commercial and free, which understand this protocol. Application libraries are available to allow bespoke clients to be written in Perl, C, C++, Java, Tcl, Visual Basic, Python, PHP and more; see the ZOOM web site for more information on some of these client toolkits.

This document is an introduction to the Zebra system. It explains how to compile the software, how to prepare your first database, and how to configure the server to give you the functionality that you need.