Controlled Digital Lending

Last week, I attended the 2018 Open Libraries Forum at the Internet Archive’s remarkable headquarters in San Francisco. The focus of the Forum was twofold: to learn firsthand from some of the authors of the newly minted position statement on controlled digital lending (CDL), and to help provide input for the Archive on their own digital lending platform, which precedes the position statement but has the potential to be an important part of a global CDL infrastructure (with some caveats that I will return to). Two of the authors of the CDL position statement have created a companion white paper which lays out the legal argument for CDL in greater detail. Anyone considering engaging in CDL should consult these resources as well as their own legal counsel, but having said that, the documents are admirably clear and readable. It is important to stress again that the position statement on CDL and the associated white paper are not about the Internet Archive’s platform: they describe a practice which could be adopted by any library.

I think that CDL has the potential to be a big deal for libraries. So much of the transformation from print to digital has been accompanied by an erosion of the rights of content consumers. The balance of rights and responsibilities historically associated with copyright law has been replaced by digital licenses and DRM systems which reserve virtually all privileges and affordances for the licensor. CDL develops an argument, grounded in copyright law, for libraries to claim some territory of their own in the digital realm. This offers new degrees of freedom, and new opportunities for innovative library services or simply for reducing material handling costs.

Only time will tell how this will shake out, how libraries will come to apply these new possibilities. Will they be the domain of commercial vendors or large organizations, or might they become a mainstream part of library practice? I would like to consider what it would look like for libraries to extend current models of circulation and resource sharing to encompass CDL, what tools and infrastructure exist and what new ones might help libraries realize the greatest benefits for their patrons.

What is CDL?

CDL is a blueprint for an emerging approach for libraries to share copyrighted material. In a nutshell, CDL allows a library to digitally lend one “copy” of a title for every physical copy that they have acquired, provided that the total number of concurrent physical and electronic lends does not exceed the number of copies owned, and provided that reasonable steps are taken to prevent borrowers from creating their own copies. CDL extends the rights of a library to circulate physical books into the digital realm, provided that the book is physically owned by the library (and not licensed). The CDL position statement was crafted by a group of legal experts, and it has been endorsed by a rapidly growing list of organizations and individual experts.

CDL defines digital lending in terms of six specific controls that a library must put in place to protect the rights of the copyright owner, and it lays out the arguments, with references to laws and legal precedents, for why such lending is within the fair use rights of the library. Because copyright law is subject to interpretation and legal challenge, no ironclad guarantees against challenge or litigation are provided. However, some additional steps are offered for libraries that wish to reduce their liability even further, and it is suggested that ultimately, libraries are granted a level of legal shelter which makes the risk of litigation and any possible penalties quite low. But again, as with any activity, it is advisable for a library to engage legal counsel before making an investment in new services.

These are the six controls a library must put in place to provide controlled digital lending (from the white paper):

  1. ensure that original works are acquired lawfully;
  2. apply CDL only to works that are owned and not licensed;
  3. limit the total number of copies in any format in circulation at any time to the number of physical copies the library lawfully owns (maintain an “owned to loaned” ratio);
  4. lend each digital version only to a single user at a time just as a physical copy would be loaned;
  5. limit the time period for each lend to one that is analogous to physical lending; and
  6. use digital rights management to prevent wholesale copying and redistribution.

Note that the first two points deal with the selection of material, and only the last four points deal with “technical” measures which are intended to mimic the constraints of a physical book to a reasonable degree.
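
To make the “owned to loaned” ratio concrete, here is a minimal sketch in JavaScript of the bookkeeping involved. The class and method names are my own invention for illustration, not part of any existing library system, but the sketch shows how controls 3 and 4 reduce to a simple check at lending time (a real implementation would also track due dates for control 5):

// Hypothetical sketch of enforcing the "owned to loaned" ratio.
// All names here are illustrative, not from any real library system.
class CdlTitle {
  constructor(ownedPhysicalCopies) {
    this.owned = ownedPhysicalCopies; // copies the library lawfully owns
    this.physicalLoans = 0;           // physical copies currently checked out
    this.digitalLoans = new Set();    // patron ids with an active digital loan
  }

  // Control 3: total concurrent loans (physical + digital) may not exceed
  // the number of owned copies. Control 4: one user per digital copy.
  lendDigital(patronId) {
    const totalLoans = this.physicalLoans + this.digitalLoans.size;
    if (totalLoans >= this.owned) return false;
    if (this.digitalLoans.has(patronId)) return false;
    this.digitalLoans.add(patronId);
    return true; // a real system would also record a due date (control 5)
  }

  returnDigital(patronId) {
    this.digitalLoans.delete(patronId);
  }
}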

If a library implements these measures, then it is doing CDL in accordance with the position statement, and it is free to lend and circulate parts of its collection electronically. If a library feels queasy and wishes to further reduce its risks, the authors suggest additional steps that might be considered including: limiting CDL to older or even out of copyright materials; focusing on non-fiction and steering away from new/current materials; implementing practices that further emulate the “transactional friction” of printed books, like artificial delays between lends or a maximum number of loans per copy to emulate a book wearing out over time. Again, these “refinements” are optional with respect to the CDL model, but they might help reduce concerns in first-time CDL practitioners.

Possible applications of CDL

CDL adds new degrees of freedom, new tools to help libraries do their work. I am no librarian, and only time will tell what creative uses people make of these tools; but it may be helpful to try to imagine some of them, to guide an exploration into possible technical approaches.

  • Libraries can bring new life to marginal parts of their collection; items available for easy online use or download to a tablet may lend themselves far better to serendipitous discovery than those buried in closed stacks.
  • Items that are very rare or costly, which might not be available for physical lending, can be made available for digital lending, either directly or through interlibrary loan.
  • For libraries with far-flung branches or for interlibrary loan in time-critical situations, digital lending may sometimes be preferable to shipping the item out and back.
  • Libraries can create thematic online exhibits or collections which cross the boundaries of institutions. If items have already been digitized and are available for digital lending (direct or through interlibrary loan), new possibilities are opened for engaging presentations.
  • CDL can be part of a cross-institutional collection management strategy. Libraries can collaborate to establish electronic as well as physical coverage in a virtualized collection.
  • CDL can be an alternative to permanently weeding a collection. Physical items of marginal value can be put into permanent storage or destroyed, yet remain available to patrons in their digital form.

What’s next?

I should state clearly that hardly any of the thoughts in this post belong to me. I have discussed CDL with a number of people over the past several months, brainstorming ideas, thinking about concerns, technical implications, etc. I have done my best to synthesize what I have learned and what makes me excited about this development, in the hope of exciting and inspiring others.

In the next post, I will explore some practical approaches to deploying CDL in a library, and think about implications for technology choices as well as the role of services like the Internet Archive.

Sebastian Hammer is co-founder and president of Index Data.

Reflections on the European BIBFRAME Workshop

Attending the European BIBFRAME Workshop in Florence, Italy, was a great way to wrap up my first month with Index Data — and it wasn’t just about the hills, art, and food. The workshop proved to be an excellent introduction to the BIBFRAME community and the variety of exciting initiatives taking place around the globe.

The first thing that struck me was the truly international nature of BIBFRAME. Of course, the fact that there is a European BIBFRAME Workshop at all goes to show that BIBFRAME has grown well beyond its origins at the Library of Congress. This year’s workshop included more than 80 participants from countries across Europe, as well as a handful from North America and Asia. In his introduction to the meeting, Leif Andresen of the Royal Danish Library echoed this observation, saying he believes BIBFRAME has the potential to become more international and more collaborative than MARC.

If the workshop was any indication, that’s already well on its way to being true. While BIBFRAME originated in the States, it’s the European national libraries — with their centralized models and willingness to take risks — who have really taken the BIBFRAME baton and run with it. For me their work was a great illustration of the way that many nations working together, while still playing to their own strengths, can make real progress in moving the library profession forward.

This success was especially obvious in the discussions surrounding the National Library of Sweden’s recent move to a full linked data environment within its union catalog. Attendees were eager to learn more from Sweden’s implementation, but they were equally inspired by the fact that it had been done at all. Niklas Lindstrom, who presented on the project, really summed up the mood of the room when he described Sweden’s efforts as “Not done, just real.”

A similar emphasis on the value of learning through implementation flowed through many presentations. Philip E. Schreur from Stanford University said that Phase 2 of the LD4P project would be strongly focused on implementation, with 20 new partner institutions exploring everything from record creation to discovery. And Richard Wallis presented a range of actionable possibilities for libraries interested in exploring linked data, from starting points like adding URIs to MARC to as-yet untackled challenges like converting BIBFRAME to Schema.org.

Wallis also echoed one of the other major themes of the conference, the importance of developing a true community approach to BIBFRAME. With individual libraries implementing projects on their own, BIBFRAME approaches are often too different to allow for real collaboration. The community needs to work on consistency, cut down on duplication, and focus on creating connections between the most well-established projects.

Many other presenters addressed this same issue, offering up potential ideas and solutions. Sally McCallum of the Library of Congress described plans to harness external authority data, including standards like ISNI. Schreur talked about the role that a Wikimedian-in-Residence will play at LD4P, working to figure out how Wikidata can be used to support library linked data. And Miklós Lendvay of the National Széchényi Library of Hungary shared his library’s decision to implement the FOLIO library services platform in the hopes that it will extend the sharing mindset and allow for possible interactions between the BIBFRAME and FOLIO communities.

Having worked extensively with BIBFRAME and FOLIO, Index Data is especially excited about this last possibility, and I had the opportunity to present a lightning talk outlining some of the ways this might be achieved. So far it’s just a starting point, but I’m confident that the experiences and ideas I heard at the workshop will help shape and inspire Index Data’s future work with the BIBFRAME community.

Kristen Wilson is a project manager / business analyst working on efforts related to BIBFRAME, FOLIO, and resource sharing. She joined Index Data in August 2018 after more than a decade of academic library experience.

Machine learning in libraries: profiling research projects rather than people

Machine learning in libraries, as in many other contexts, will often rely on data about people and their activities. Data in a library system can be made available for use with machine learning algorithms to develop predictive models, which have the potential to help patrons in their research. Of course, the data might also be used for the benefit of others without the patrons’ permission, either in the present or at some future time. In particular, the data can serve as a basis for creating profiles of individuals, which may be used for undue advantage. If “knowledge is power,” then it is worth considering whether that power is authorized and what its limits are.

If libraries were to avoid keeping a record of patron activities, then it would be much more difficult to build profiles of patrons. However, libraries need to track some basic information, such as circulation transactions, and can offer better services to patrons if they track and analyze how resources are being used. In addition, they may be obligated to record information about access to electronic resources. Then there are the benefits of machine learning itself to weigh: while “opt-in” or “opt-out” models can limit profiling, many machine learning algorithms will only work well if most people participate.

Suppose, however, that libraries were to request that every patron specify one or more “research projects” corresponding to the patron’s interests, broadly defined, and to select one of these projects when logging in to the library system. Within the library’s database, most patron activities could then be associated with a project rather than with a patron.

For example, Jenny might create a project called “Information theory” for her graduate study, a second project called “Cooking” to pursue her passion for learning new cuisines, and a third called “Reading” to represent reading for pleasure. The library system now stores an association between Jenny and her three projects, but this association would not be shared outside of the library or with machine learning algorithms, except with specific consent in cases where it is absolutely necessary. Elsewhere in the database, Jenny’s data would be associated not directly with her but with the project she has selected to work on.
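
As a concrete sketch of that separation (all of the identifiers here are hypothetical, invented for illustration), the library might keep two stores: a private mapping from patrons to projects, and an activity log keyed only by project:

// Hypothetical sketch in JavaScript; all names are illustrative.
// Store 1: the patron-to-project mapping, kept private to the library and
// never shared with machine learning algorithms or outside parties.
const patronProjects = {
  'patron-jenny': ['project-101', 'project-102', 'project-103'],
};

// Store 2: the activity log, keyed by project only. A recommender system or
// outside analyst sees which project did what, but not which patron.
const activityLog = [
  { project: 'project-101', action: 'checkout', item: 'bib-4031' },
  { project: 'project-102', action: 'search', query: 'regional cuisines' },
];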

When Jenny is logged into the library system, she might see a dashboard for the project that she is currently working on, and can easily switch to another project. This may make a certain sense to Jenny, as it would allow her to view and manage related information together. She probably would not care to see articles about information theory suggested by a recommender system while browsing books of recipes.

With some exceptions, there is no need for machine learning to profile people in order to help with their research interests. Jenny’s selection of a project could even help the algorithms be more accurate, by indicating what she is working on. In any case, it would be up to Jenny how she chooses to organize her projects and to set boundaries for how data are used, based on which of her interests she thinks would benefit. She could begin to use machine learning selectively as a tool, rather than being pressured into an all-or-nothing choice.

Another advantage of this approach could be seen in cases where data are anonymized and exported from the library system to be analyzed by someone outside the library. Anonymized data can sometimes be “re-identified” because the data may reveal a combination of specific activities or interests that can be linked to a person. If the library were to track projects rather than patrons, and assuming the patron-project groupings were not disclosed with the anonymized data, then patron data would be fragmented by project, potentially making re-identification more difficult.

Nassib Nassar joined Index Data in 2015 as senior product manager and software engineer.


The State of FOLIO: Numbers and Muses


As the calendar turned to a new year, we took the opportunity to reflect on the state of FOLIO today.

The Index Data team members are excited to have jumpstarted this open source effort, and the numbers tell a story of a growing and highly engaged community.

FOLIO Community


FOLIO community adoption


To get beyond the numbers, we asked some of the project leaders to share their thoughts on where FOLIO is today and where it’s going.

The 5-minute version

You can watch the entire interview here.

2017 was a year of remarkable progress for FOLIO, and we have good reasons to believe that 2018 will be even more exciting!  Index Data is eager to engage with the community and be at the forefront of welcoming new participants to the FOLIO project.

Bibliotech Education Offers Node-based ZOOM Client Based on Index Data’s YAZ Toolkit

Daniel Engelke, chief technology officer and co-founder of Bibliotech Education Ltd, notified us about their release of an open source Z39.50 toolkit for Node.js that uses Index Data’s YAZ toolkit. The source code is available on GitHub. Daniel said, “Having the YAZ toolkit available, specifically the libyaz5-dev package made developing a zoom client in Node.js extremely easy for us!”

npm package

Bibliotech is an online platform that provides students with access to their textbooks and libraries with affordable textbook packages.

RA21 project aims to ease remote access to licensed content

In the two decades since electronic journals started replacing print journals as the primary access to article content, libraries have faced the quandary of how to ensure proper access to electronic articles that they license and pay for (Note 1). To address what has been termed the “off-campus problem”, libraries have employed numerous techniques and technologies to enable access for authorized users when they are not at their institutions. Access from on campus is easy — the publisher’s system recognizes the network address of the computer requesting access and allows the access to happen. Requests from network addresses that are not recognized are met with “access denied” messages and/or requirements to pay for one-off access to articles. To get around this problem, libraries have deployed web proxy servers, virtual private network (VPN) gateways, and federated access control mechanisms (like Shibboleth and Athens) to enable users “off campus” to access content. These techniques and technologies are not perfect, though (what happens when you get to a journal article from a search engine, for instance), and this is all well known.

Stepping into this space is the STM Association — a trade association for academic and professional publishers — with a project they are calling RA21: Resource Access in the 21st Century. The project website (www.stm-assoc.org/standards-technology/ra21-resource-access-21st-century/) describes the effort as:

Resource Access for the 21st Century (RA21) is an STM initiative aimed at optimizing protocols across key stakeholder groups, with a goal of facilitating a seamless user experience for consumers of scientific communication. In addition, this comprehensive initiative is working to solve long standing, complex, and broadly distributed challenges in the areas of network security and user privacy. Community conversations and consensus building to engage all stakeholders is currently underway in order to explore potential alternatives to IP-authentication, and to build momentum toward testing alternatives among researcher, customer, vendor, and publisher partners.

Last week and earlier this week there were two in-person meetings where representatives from publishers, libraries, and service providers came together to discuss the initiative. Two points were put forward as the grounding principles of the effort:

  1. In part, the ease of resource access within IP ranges is what makes off-campus access so difficult
  2. In part, the difficulty of resource access outside IP ranges encourages legitimate users to resort to illegitimate means of resource access

What struck me was the importance of the first one, and its corollary: to make off-campus access much easier we might have to make on-campus access a little harder. That is, if we ask all users to authenticate themselves with their institution’s accounts no matter where they are, then the mode of access becomes seamless whether you are “on-campus” or “off-campus”.

The key, of course, is to lower that common barrier of personal authentication so far that no one thinks of it as a burden. And that is the focus of the RA21 effort. Take a look at the slides [PowerPoint] from the outreach meeting for the full story. The parts that I’m most excited about are:

  • Research into addressing the “Where Are You From” (WAYF) problem — how to make the leap from the publisher’s site to the institution’s sign-on portal as seamless as possible. If the user is from a recognized campus network address range, the publisher can link directly to the portal. Can clues such as geo-location also be used to reduce the number of institutions the user has to pick from? Can the user’s affiliated institution(s) be saved in the browser, so the publisher knows where to send the user without prompting them? (See the sketch after this list.)
  • User experience design and usability testing for authentication screens. Can publishers agree on common page layout, wording, and graphics to provide the necessary clues for the user to access the content?
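
On that last question, here is a minimal sketch in JavaScript of how a browser could remember a user’s home institution. The storage key and the redirect path are assumptions made for illustration; they are not part of any RA21 specification:

// Hypothetical sketch: remembering a user's WAYF choice in the browser.
// The key name and redirect path are illustrative, not RA21-defined.
function rememberInstitution(entityId) {
  // entityId would be the SAML entityID of the user's home institution
  localStorage.setItem('preferredIdP', entityId);
}

function getRememberedInstitution() {
  return localStorage.getItem('preferredIdP'); // null if never set
}

// A publisher's sign-in page could then skip the institution picker:
const idp = getRememberedInstitution();
if (idp) {
  window.location = '/saml/login?idp=' + encodeURIComponent(idp);
}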

The RA21 group is leveraging two technologies, SAML and Shibboleth (Note 2), to accomplish the project’s goals. There are some nice side effects to this choice, notably:

  • privacy aware: the publisher trusts the institution’s identity system to properly authorize users, while providing hooks for the publisher to offer personalized service if the user elects to do so.
  • enhanced reporting: the institution can send general tags (user type, department/project affiliation, etc.) to the publisher that can be turned into reporting categories in reports back to the institution.

Beginning next year, organizations will work on pilot projects towards the RA21 goals. One pilot already known is a group of pharmaceutical companies working with a subset of publishers on the WAYF experience issue. The group is looking for others as well, and they have teamed up with NISO to help facilitate the conversations and dissemination of the findings. If you are interested, check out the how to participate page for more details.

Within Index Data, we’re looking at RA21’s impact on the FOLIO project. FOLIO is starting up a special interest group that is charged with exploring these areas of authentication and privacy. I talk more about the intersection of RA21 and FOLIO on the FOLIO Discuss site.

Note 1: I am going to set aside, for the sake of this discussion, the argument that open access publishing is a better model in the digital age. That is probably true, and any resources expended towards a goal of appropriately limiting access to subscribed users would be better spent towards turning the information dissemination process into fully open access. The resource access project described here does exist, though, and is worthy of further discussion and exploration.

Note 2: SAML (Security Assertion Markup Language) is a standard for exchanging authentication and authorization information, while Shibboleth is an implementation of SAML popular in higher education.

Index Data Staff Offer a Primer on the FOLIO Code

Coinciding with the public release of the FOLIO code repositories, Index Data staff offered a primer to developers on the context around how the FOLIO platform provides integration points for module development through the Okapi layer and what that means for developing modules in FOLIO. In the 90-minute presentation (embedded below), Peter Murray (open source community advocate) and Jakub Skoczen (architect and developer) walk through how the pieces come together at a high level, and Kurt Nordstrom (software engineer) demonstrates how to build a back-end module for Okapi using NodeJS.
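
For a taste of what such a module involves before you watch, an Okapi back-end module is at heart an HTTP service that Okapi proxies requests to, passing tenant context along in headers. Here is a minimal sketch in Node.js; the endpoint path and response payload are illustrative assumptions, not taken from the presentation:

// Minimal sketch of an Okapi-style back-end module in Node.js.
// The endpoint path and payload are illustrative assumptions.
const http = require('http');

http.createServer((req, res) => {
  // Okapi forwards the tenant identifier with each proxied request
  const tenant = req.headers['x-okapi-tenant'] || 'unknown';
  if (req.url === '/hello') {
    res.writeHead(200, { 'Content-Type': 'application/json' });
    res.end(JSON.stringify({ message: 'Hello from a FOLIO module', tenant }));
  } else {
    res.writeHead(404);
    res.end();
  }
}).listen(8080);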


The video is available for download from the Open Library Environment website.

Index Data Turns 20

Twenty years ago today, Adam Dickmeiss and I founded Index Data in Copenhagen. There was a bottle of champagne, and our parents shared the moment with us along with our wives, because, honestly, we were little more than big kids at the time. We were a little scared, but we were also in the fortunate position of being young, still without kids or debt. Oh, and our wives had steady jobs. Let the adventure begin!

We had met just a few years earlier, as interns at the State Library Service (Statens Bibliotekstjeneste), looking to make some extra money for college. We were hired during a tumultuous period, both organizationally and technologically. In Denmark, libraries benefit from substantial support from national and municipal authorities. At that time, there was an effort afoot to consolidate the services offered to the public and research libraries, respectively, into a single organization, the Danish Library Center (DBC), with a single software platform, including a centralized, national union catalog and interlibrary loan platform. Eventually, Adam and I, along with a team of young programmers, were given the task of creating an indexing and search engine for the new system. The task took about a year, and I still look back on it as one of the most exciting projects I have worked on. Somewhere in there, Adam managed to graduate and I managed to forget all about my studies, but we both felt like we knew everything there was to know about library technology (remember, we were just big kids!).

The new system went into production on schedule and was a tremendous success. Today, it forms the basis for a unique, patron-centered ‘national OPAC’ that gives any citizen access to the collection of every library in the nation. But Adam and I had developed a taste for big, ambitious projects. In a sense, our shared experience had made us into entrepreneurs, and we felt hungry for more.

For our first day of work at Index Data, we each brought a chair, our PC from home, and a thousand dollars which formed the entirety of the operating capital of the company. The goal of the business, we decided, would first and foremost be to provide a good place of work for us and any colleagues who might one day join us. The purpose was to have fun doing work that we loved. The business model was to create cutting-edge, client/server-oriented software (buzzword of the era) for libraries, and to finance the development by offering our services as consultants in whatever areas we could. We felt that the best place to start would be to build a complete Integrated Library System (I did mention we were just big kids).

Our workplace was a small room in a disused factory building which had been turned into rental offices. But not in the fashionable, expensive way it’s being done today. This place was rough. Our neighbors were bohemian artists and tiny film production companies hoping to make it big. Break-ins were a continuing concern, so one of our first purchases was a large steel grid that we padlocked to our door at night. At one point during a rainstorm, water started coming down through the ceiling, so we draped plastic sheets to keep our computers dry. After that, strange mushrooms would sometimes grow out of our walls.

The original artwork for our first Christmas card, by Adam’s brother Otto Dickmeiss

In between consulting gigs, we worked steadily on our own software, building components that we thought we’d need for our big library system. At one point, we started releasing our software under Open Source licenses. We reasoned that someone might see the software and decide to ask us to help them work with it. Fresh out of an academic environment where Open Source projects were enormously influential (Linux was still new, then, but getting lots of attention), it felt natural to us, but it was still a relatively unknown phenomenon in the larger industry, and we suffered a good deal of friendly ribbing from our friends. We also endured some more pointed questions from our wives, who were still carrying the brunt of the household expenses.

But something cool happened; people did find our software, and the consulting work increasingly involved integration and enhancements to our growing family of software components. Along the way, the building blocks we’d been creating took on a life of their own and became a focal point of our work; we never did build that library system, but our software components have been integrated into the vast majority of library systems out there in various roles, and we have enjoyed two absolutely remarkable decades of working relationships with exciting organizations and brilliant people all over the globe. We moved out of the mushroom-infested office and were joined by coworkers. We had kids.

The Europagate project team in 1995. Adam and Sebastian in the back row

Ten years ago this summer, my family and I moved to the US. Our business gradually shifted away from Denmark and Europe, but we struggled to maintain our old, informal and very personal company culture with me way over in New England and the rest of the team in Copenhagen. In 2007, Adam and I made a decision that in some ways was as dramatic as quitting our jobs and founding the company. We hired Lynn Bailey to be our CEO and re-configured the company mentally and structurally to be a US-based company which just happened to have its core development team in Copenhagen. Soon, they were joined by colleagues in many locations as we made a policy of hiring the most talented people with a strong interest in search and library technology, no matter where they lived. Today, we are a virtual company with colleagues in six different countries (Denmark, Sweden, Germany, the UK, Canada, and the US, spread across four different states). After writing the book on operating a commercial business around Open Source Software, we had to learn how to be a tiny multinational company, and how to work well together as a team while scattered across the globe.

The company that existed ten years ago, with a jolly group of Danes hanging out in the middle of downtown Copenhagen, has been transformed almost beyond recognition. But what has arisen in its stead is in many ways more vital and exciting. Our team is passionate about their work: We swim in a ridiculously specialized area of the sea of information technology, but we do so with tremendous pride and passion.

Index Data, at a recent team meeting in New England

It has been an amazing 20-year journey. We were successful in creating a fun and supportive work environment for ourselves and our colleagues. I couldn’t be more proud and grateful, both for my great coworkers and for the remarkable people I have had the good fortune to do business with.

Let the adventure continue!

Adding Discovery and More to Koha with Smart Widgets

Previous post: Using Smart Widgets to Integrate Information Access

This is the second in a series of posts about our Smart Widget platform. You can also read the first post or find background material about the technology.

In our introduction to Smart Widgets, I said that part of our purpose in developing the technology was to move away from the search box as the primary paradigm for accessing information: to give librarians more tools to organize and present information for their consumers/patrons. But the widgets can also be used to IMPROVE the capabilities of the search boxes that we already have — to offer new functions beyond what your existing software is capable of. In this post, we will show a couple of different examples of how Smart Widgets can be used to add functionality to Koha, but the same principles apply to any system that allows you to customize the HTML structure of a search results page.

Previously, I showed an example of a search result widget which simply executed what you might call a ‘canned’ search, and displayed the results of that search whenever the page was loaded. The HTML code for such a widget might look something like this:

<div class='mkwsRecords' autosearch='american political history'>

This widget will show a list of matching records for the query ‘american political history’ using the set of databases that has been configured into the MasterKey back-end for the library (literally anything searchable on the web can be accessed in this way). But what if we were to put such a widget on, say, the search results page of an OPAC, and have it search for whatever query the user has input? The Smart Widgets allow us to do exactly that; the syntax looks like this:

<div class='mkwsRecords' autosearch='param!query!'>

Where ‘query’ is whatever HTTP parameter name the particular search interface uses to carry the search term.
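
For example, Koha’s OPAC carries the user’s search in a parameter named q (worth verifying against your own installation), so a widget for a Koha results page might look like this:

<div class='mkwsRecords' autosearch='param!q!'>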

In Koha, there is a function on the staff page that allows the administrator to add extra markup to the end of the facet column on the left-hand side of the display. That is an ideal place for us to slip in a little extra functionality. The screen looks like this:

There’s a little more to this than the simple examples I showed in the last post. That is because the initial integration was so easy that we decided to see if we could add an entire Discovery function to Koha (spoiler: We could!). Embedding this markup in the facet bar gives us the following display:

If you look at the facet column to the left, you will see, below the normal facets, a list of different databases and their hit-count for the given query. The list is updated as results come in so the normal Koha response time is not affected in any way whatsoever.

We also added a separate tab to the Koha OPAC with the Discovery function, so that if you click on the widget above, you will get to this page.

Pretty cool, right?

We’ll go through the steps needed to add this functionality in a later, more technical blog post, but before we get to that, I want to show you another application of the Smart Widgets which isn’t about Discovery/metasearching.

We have been thinking that there might be useful functions that an OPAC could perform beyond merely providing a peek into the physical holdings of a library (or physical/electronic in the case of Discovery platforms). What if the OPAC could evolve into a kind of information center, a front door to the library as a facilitator of learning or research?

We thought that one way to explore this idea further was to surface reference content right into the OPAC itself, to supplement the usual bibliographic results. Wikipedia is a good subject for this experiment, since it is free and quite often provides relevant information to a query. So we went back to the Koha administration console and replaced the Discovery widget with a special Wikipedia widget, and this is what we got for a search for ‘nuclear power’.
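
The markup follows the same pattern as the earlier widgets. As a sketch of what it might look like (the target name here is an assumption about how the back-end would be configured, not a documented value):

<div class='mkwsRecords' autosearch='param!q!' target='wikipedia'>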

Is this useful? You’ll have to be the judge, but I think it often could be: As a way to provide another angle on the user’s query (another ‘facet’), and possibly inspiration/guidance for further research. Now obviously, Wikipedia is far from being the only possible source: Commercial reference sources or even locally maintained knowledge bases might be more obvious candidates in some settings. The widget approach would work with just about any source or combination of sources imaginable.

So, in this post, we have shown how you can add significant functionality to Koha without having to install local software and without complex programming. We happen to think that these Smart Widgets are a natural outgrowth of the move towards cloud-based services and I believe you’ll be seeing a lot of them show up in the years to come from all kinds of data and service providers. But if you’re interested in learning more about our take on them, or possibly trying them out in your OPAC, please get in touch.

Using Smart Widgets to Integrate Information Access

Next post: Adding Discovery and more to Koha with Smart Widgets.

This is the first of a series of blog posts in which we will talk about a concept that we have been developing over the past few years. We call it ‘smart widgets’ to distinguish our approach to widgets from the almost ubiquitous notion of ‘widgets’ meaning little search boxes that you insert into your page, but which ultimately send your users to some remote site.

How do they work? Well, for a really simple example, consider this search box:

You can put in a search and you will search across a collection of different resources, in real time. You might like to look at the HTML source code of this blog post to see how it was implemented, but to save you the trouble, the bit that does the searching looks like this:

<link rel="stylesheet" type="text/css" href="//mkws.indexdata.com/mkws.css" />
<script type="text/javascript" src="//mkws.indexdata.com/mkws-complete.js"></script>
<div class="mkwsSearch"></div>
<div class="mkwsResults"></div>

That’s all. You can take that bit of HTML and put it in your home page, and it should do the same thing. The code uses Ajax to communicate with our SaaS MasterKey back-end, and it will search virtually any combination of resources that you can imagine if you have an account.

In a nutshell, our Smart Widgets are intended to make access to information a fluid thing — something that can be easily manipulated and surfaced just about anywhere you can imagine, from a blog post to a library home page. In that sense, our widgets are two things:

  • A technology platform that uses dynamic HTML together with our SaaS back-end to make it incredibly easy to embed access to almost any combination of resources into almost any page.
  • A whole new way of thinking about how information sources are used in the service of library patrons, and how the library can project its services into the surrounding community (whether that is a town, a school, or a business).

In a way, that second point arose from the realization that “Searching” — i.e. providing mechanisms by which patrons could access pre-indexed collections of materials by entering search terms — has become a commodity. Librarians have spent decades (if not centuries) thinking of mechanisms to make things findable. Today, the Internet, and Google in particular, have made that function utterly mainstream — it is part of the fabric of the Internet, and so much a part of people’s everyday experience that it has become very hard for the library community to convince anyone that we have a better solution. This is pleasing in a way — it has been cool to watch something as esoteric as searching become just an everyday part of our culture. But it also presents new challenges. Ironically, while easier access to massive piles of information has led some decision makers to question the continued value of libraries and librarianship, at the same time people are struggling with information overload: how to filter and select the best sources to answer a given question. Information is not the same as knowledge, and access to too much information may in fact impede the acquisition of knowledge.

We in the library community are partly to blame for this. We have pursued the vision of the ‘universal search box’ for so long and with such ardor that it is only now, as we’re finally reaching that goal, that some people are asking if this really was such a good idea after all. I think certainly for some tasks, single search boxes are a great solution (the success of Google makes this clear), but I don’t think it’s the right answer for every problem. We believe that libraries have more important roles to play than merely managing search boxes, and we would like to use our widget platform to support those roles, by organizing and enabling access to information, and ultimately by facilitating the creation of knowledge from that information.

Below is a widget that surfaces the current results from the Digital Public Library of America for ‘american political history’. There is no search box: The widget retrieves the most current information based on a search that has been prepared by the page author (me, in this case).

DPLA results will appear here

The HTML source code for this widget looks like this:

<div class='mkwsRecords mkwsTeam_dpla'
autosearch='american political history'
sort='position:1'
target='dpla_api'>

Why is such a widget useful? Well, the widget can be used to surface results from virtually any combination of resources (up to 100+ databases per widget), ordered in any way desirable, for any given search. The sources for the widget can include subscription databases and open access sources. Different widgets can be combined together on a page to illuminate a current event, a certain genre of literature, or a subject of research. The widgets can be a powerful tool to build information sources of all kinds for the users of a library.
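
Combining widgets on one page is where the mkwsTeam_dpla class shown above comes in: giving each widget its own team name keeps its search and results independent of the others. As a sketch (the second team and target names are hypothetical):

<div class='mkwsRecords mkwsTeam_dpla' autosearch='american political history' target='dpla_api'>
<div class='mkwsRecords mkwsTeam_reference' autosearch='american political history' target='reference_db'>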

Over the coming days and weeks, we will be discussing various applications and uses of the widgets. We hope you’ll agree they present some pretty exciting possibilities.

As always, feel free to contact us if you have any questions or if you are interested in using the widgets in your own applications or site. You can also find more information at this site.
