Code4lib 2011 report -- part I

Code4lib 2011 in Bloomington, IN

So it took me a while to process everything that happened during the conference and come up with a short summary. I am not aiming to be anywhere near comprehensive; Code4lib grows fast, and there’s quite a lot of stuff going on each year! This year’s talks covered a vast selection of subjects, ranging from back-end software topics (databases and search engines with the ubiquitous Solr; tuning, ranking, and merging results from different sources) to front-end user interfaces (usability, testing, data visualization and mash-ups).

Code4lib 2011 took place in Bloomington, Indiana on the Indiana University campus (the Memorial Union building to be exact). It was quite a successful choice of venue, in my opinion. Having the hotel and the lecture rooms (and the food hall with Starbucks to get your daily coffee fix :) all in one place encouraged attendees and speakers alike to share ideas and discuss things long after the scheduled talks had ended.

One thing that I felt worked really well was the rather short duration of the talks. About 20-30 minutes per talk seems perfectly balanced, giving the speaker enough time to present the material without having the audience yawning :). As usual, each day’s session concluded with the Lightning Talks: five-minute-long ramblings on random subjects. I must say I enjoyed them quite a bit, with some being even more educational and entertaining than the scheduled lectures.

Pre-conference day

The conference started in earnest on Tuesday the 5th, but most people showed up on Monday to attend the pre-conference sessions. There was quite a lot going on at once, with at least four or five parallel tracks in both the morning and afternoon slots. I chose to attend the “What’s new in Solr” talk by Erik Hatcher and the “Pre-conference Un-conference”… well, not a talk so much as a free-form discussion and brainstorming session among the attendees.

Erik Hatcher is a frequent Code4lib speaker (I remember him giving a talk at the 2009 Code4lib), always covering the latest advances in Lucene and Solr. He works for Lucid Imagination, a company that provides commercial support for Solr and has a few paid developers on the project, so he’s really a first-hand source of information on cutting-edge Solr development efforts. This and other talks only solidified my belief that Solr is a huge thing in library-land at the moment, with pretty much every next-generation catalog using it (Blacklight, VuFind) and lots of other smaller or bigger archive projects building on top of it (e.g. the Smithsonian/NASA Astrophysics Data System covered in one of the talks). Anyway, some of the more notable new Solr features include the so-called field collapsing, or in other words, record grouping. This is a pretty powerful feature that lets you group results sharing the same field value into one (or more) entries, so that each group appears as a single document. There are many ways to use this feature; one example is document de-duplication. It’s easy to think of it as a faceted search with the top documents for each facet (or group query) returned right away. There have also been a bunch of improvements to faceting in general: pivot aka “hierarchical” facets, a “term” type for filter/facet queries and many more. Solr development is thriving, and so many new things and improvements were mentioned (ICU filters, spatial queries, edismax, spell checking, auto-suggest, UIMA) that it’s impossible to cover them all in one paragraph. For more information check Erik’s slides, available on-line.
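
To make field collapsing a bit more concrete, here is a minimal sketch of what a grouping request to Solr looks like. This is my own illustration, not from the talk: the Solr URL, core name and the `title_dedup` field are made-up assumptions.

```python
# Hedged sketch: building a Solr field-collapsing (result grouping) query.
# The base URL and the field name "title_dedup" are hypothetical.
from urllib.parse import urlencode

def build_group_query(base_url, q, group_field, limit=1):
    """Build a Solr select URL that collapses results on group_field,
    returning the top `limit` documents per group."""
    params = {
        "q": q,
        "group": "true",            # enable result grouping
        "group.field": group_field, # collapse docs sharing this field value
        "group.limit": limit,       # top documents returned per group
        "wt": "json",
    }
    return f"{base_url}/select?{urlencode(params)}"

url = build_group_query("http://localhost:8983/solr/catalog",
                        "dickens", "title_dedup")
```

With `group.limit=1` each group comes back as if it were a single document, which is exactly the de-duplication use case mentioned above.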

The “Pre-conference Un-conference” was led by Julie Meloni. Well, at least she tried to put some Law and Order into the otherwise completely unstructured thing. The idea of an un-conference is best explained by Wikipedia: “…a facilitated, participant-driven conference centered around a theme or purpose.” We started off by putting some discussion topics on the whiteboard and voting on them. The winning topics were then divided between two groups of participants, and from there it was up to the attendees to take over. The topics ranged quite a bit: from user interface usability (UX) to Solr indexing and federated searching. In the UX testing discussion there were some very useful insights from people doing that sort of thing. It appears that getting representative test subjects is quite an achievement, and encouraging them to take part in the test is an art in itself (sweets and other “bribes” were mentioned :). Also, some of the usability statistics mentioned came as quite a shock to me: only about 2-5% of users ever use facets! Surprising, especially considering how much work goes into supporting them in search engines (e.g. Solr) and the effort that’s put into building more advanced search interfaces. It’s definitely an over-simplification, but it does seem that all people want is a Google-like single search box and good ranking. Ah yes, ranking. We touched on it during the search engine/indexing discussion in the context of federated searching, the general conclusion being that it’s impossible to get it right. Still, people try to at least do it “good enough” (don’t we too at Index Data?). The folks from the State Library of Denmark actually managed to persuade Summon to make the term weights available with the search results and used them to combine the catalog results in a smoother fashion.
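
The “combine results using exposed weights” idea can be sketched roughly like this. This is my own hedged illustration of score-normalized merging, not the Danish library’s actual code; the normalization scheme is an assumption.

```python
# Hedged sketch: merging ranked result lists from several search backends
# when each backend exposes comparable relevance scores. Normalizing each
# source's scores to [0, 1] keeps one backend from dominating the ranking.
def merge_results(*sources):
    """Merge lists of (doc_id, score) pairs into one ranked list."""
    merged = []
    for results in sources:
        if not results:
            continue
        top = max(score for _, score in results) or 1.0
        merged.extend((doc, score / top) for doc, score in results)
    return sorted(merged, key=lambda pair: pair[1], reverse=True)

ranked = merge_results([("a", 10.0), ("b", 5.0)],   # e.g. local catalog
                       [("c", 0.9), ("d", 0.3)])    # e.g. remote index
```

As the discussion concluded, this is at best “good enough” – raw scores from different engines are not truly comparable, which is why getting the weights from the vendor (as with Summon) helps so much.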

Conference Day 1

The conference proper started on Tuesday with two welcoming talks from our IU hosts: Brenda Johnson, the Dean of Libraries, and Brad Wheeler, the Vice-President for Information Technology. After those short introductions we listened to a keynote speech from Diane I. Hillmann, a metadata expert and the Director of Metadata Initiatives for the Information Institute of Syracuse. Diane talked about the (tough) relations between cataloguers and system librarians or programmers (it wasn’t easy giving that talk, considering that the room was full of the latter :) and touched on the history of cataloging and metadata management (with some cute pictures!), which I’m sure made everyone feel nostalgic.

But let’s get to the meat, shall we? :) Karen Coombs from OCLC started off with a presentation on “Visualizing Library Data”. As always, her talk was full of interesting ideas and creative ways of showing otherwise boring data (if you’re like me, spending most of your time in the terminal, you tend to neglect this stuff – don’t!). She showed some nice uses of the Google Maps API to geo-locate libraries close to the user (the coordinates are part of the WorldCat records), timelines to chronologically organize the bibliography of a given author, charts and graphs to show relations, and whatnot. The cool thing is that all the demos are available on-line along with the source code.

Next up was Thomas Barker from the University of Pennsylvania. He talked about MetriDoc, an open-source tool developed at UPenn thanks to funding from the Institute of Museum and Library Services. MetriDoc is meant to be a buzzword-free (no SOA!) answer to the data integration problems (think flat files, DBs, Web Services) within libraries. It uses a Domain Specific Language for expressing workflows and will eventually include a dashboard to assist with monitoring and management. If data integration is your bread and butter you should check out the project – but be careful, it will soon change its name, as “MetriDoc” has been used before. For now you can find it here.

Just before lunch we heard two more presentations. Brad Skiles from the Kuali project talked about OLE: Oh-Libraries-in-the-Enterprise (officially, the Open Library Environment), which aims to deliver an all-in-one, enterprise-ready software package for libraries but has so far only gotten as far as defining its requirements and architecture. Coding is (was?) supposed to start at the beginning of this year. Finally, Cary Gordon from the Drupal Association gave an overview of the exciting new features and changes coming in Drupal 7.

With a full stomach I was delighted to listen to a talk by Josh Bishof, a local developer at the IU library. He shared his experiences with making the library website accessible from mobile devices. Given the proliferation of mobile devices on the market, they went for a pure Web (HTML/CSS/JS) solution to support as wide a spectrum of them as possible, and it seems to work quite well for them. But he didn’t only talk about making websites look nice on small screens; he gave examples of how to utilize the capabilities of mobile devices (e.g. using GPS to find your way to the library) and how to make the library website a gateway to seemingly unrelated things like information on campus bus stops. All intended to make the website a bit more interesting for students, who seem to, in most cases, visit it mostly to find the library opening hours.

Next on the schedule was a talk of a different sort: a report from a sociological experiment. I quote: “In summer 2010, the Center for History and New Media at George Mason University, supported by an NEH Summer Institute grant, gathered 12 ‘digital humanists’ for an intense week of collaboration they dubbed ‘One Week | One Tool: a digital humanities barn raising’.” So they spent the week together brainstorming, designing, coding and finally releasing to the public Anthologize, a WordPress plugin that transforms this popular CMS into a platform for publishing electronic texts in various formats, including ePub, PDF and TEI. How is that for a project management approach? Check the slides here.

Demian Katz, core VuFind developer, talked a bit about what’s been done in VuFind to make it more MARC-agnostic. I guess VuFind doesn’t need an introduction, as it’s currently one of the most popular next-gen, open-source OPACs available (along with Blacklight), and it’s definitely nice to hear that it is becoming less and less MARC-dependent. Why do I have this weird feeling that catalogers love MARC and programmers hate its guts? The afternoon session concluded with Jay Luker and Benoit Thiell talking about migrating the search infrastructure of the Smithsonian/NASA Astrophysics Data System to Solr. Besides the standard integration problems that any migration of this size (9 million metadata records) brings, and users expecting at least the existing UI functionality to be re-implemented, they were constrained by the fact that the records were stored and maintained in Invenio. Invenio is a good institutional repository and digital library but a poor search server, having trouble indexing and quickly searching data sets of this size. Their idea was to offload searching to Solr and keep the two in sync. If you’re curious whether it worked, read the slides. (Okay, it did. :)
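
The keep-in-sync part can be sketched roughly as follows. This is my own hedged illustration of the general pattern (repository stays the system of record, an indexer periodically pushes changed records to Solr), not the actual ADS code; the record structure and field names are assumptions.

```python
# Hedged sketch: incremental sync from a repository (Invenio in the talk)
# to Solr. Records changed since the last run are converted into a Solr
# JSON "add" payload, which would then be POSTed to Solr's /update handler.
import json

def to_solr_payload(records, last_sync):
    """Build a Solr JSON add payload for records modified since last_sync.
    Each record is assumed to be a dict with recid/title/modified keys."""
    changed = [r for r in records if r["modified"] > last_sync]
    return json.dumps([
        {"id": r["recid"], "title": r["title"]}
        for r in changed
    ])

records = [
    {"recid": 1, "title": "Quasar surveys", "modified": 5},
    {"recid": 2, "title": "Stellar spectra", "modified": 1},
]
payload = to_solr_payload(records, last_sync=2)
```

The hard part in practice is not the payload but deciding *when* records changed (timestamps, queues, or full re-scans) and handling deletions, which is presumably where most of the ADS effort went.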

Before heading off to socialize at the reception (with Index Data being one of the sponsors, yay!) we chatted a bit during the breakout sessions. I attended the Drupal one and had a useful chat with Cary Gordon about porting Drupal 6 modules to Drupal 7. Oh, and not to forget a Code4lib tradition: the Lightning Talks! A lot of condensed information, way too much to present here; what etched itself into my memory were some pretty unexpected numbers on usage statistics for library websites (afterwards corrected on the mailing list :) – well, it happens; we all know that there are lies, damn lies and statistics anyway.

That’s it for now, folks – stay tuned for the next part covering the last two days of Code4lib 2011. And don’t forget to check out the video archives from the conference.


Nice report, Jakub!

Don't forget about the last two days. :)

Re usage statistics, you wrote: "Also, some usabilty statistic mentioned were quite a shock for me: about 2-5% users ever uses facets!"

Did the presenter make it clear that the 2-5% refers to users, as in actual people? Or might it refer to usages, as in search sessions, or even to actual searches? The implications are very different depending on what is actually being measured.

And did the presenter say what the range represents? Margin of error based on the number of samples--or something else?

It is hard for me to interpret what the 2-5% might mean without getting more background information.

Anyway, thanks for the report.
