Making a New Books RSS Feed for Millennium

I wanted to be able to display newly cataloged books on our website. We use Millennium. I know Innovative sells an RSS module, but we avoid making lots of extra purchases on our ILS. So here’s how you do it using Create Lists, Yahoo! Pipes and Feed2JS. (Ok, so these tools are all old and this post could basically have been written in 2007, but still, I haven’t seen it explained elsewhere!)

Step 1: Create Lists

The first step is to use Millennium’s “Create Lists” feature to catch newly cataloged books. I’m not going to go through the process of creating the query because 1) I don’t really know how to do it, I just used a query a colleague had saved, and 2) even if I did, much depends on how your library has set things up. But probably somebody at your library already knows how to do this. In my case, I looked at saved queries and found something that seemed to fit. (In addition, you could potentially limit by subject heading, call number range, etc.)

Once you’ve created the list, save the query for future use; this will allow you to repeat the process quickly on a schedule (e.g. weekly, monthly, etc.).

After running the list, export it. Here you’ll need to export any fields that you want to be in the feed. I chose (all of these come from the bib record, not the item record):

  • Title
  • Created
  • Record # (this allows me to create the link later)
  • Descript
  • Note
  • ISBN/ISSN (this is connected to embedding the book jacket later)

Exporting a list from Millennium

(As you can see, I also exported the Author, but I didn’t list it above because I didn’t end up using it.)

I chose not to include the call number, location etc., but of course you could—just depends what you want to display later.

After you’ve created the CSV (which by default has a .txt extension, which is fine, no need to change) you need to host it online. This could be somewhere on your library website if you’re able to host files there, or it could be a Dropbox public folder, Google Drive (this is what I’m using—note that you need to use the “web hosting” feature, not just upload the file normally), whatever—the main requirement is that it be publicly accessible, since it will be accessed periodically by Yahoo! Pipes.

Step 2: Yahoo! Pipes

Again, this makes me seem pretty behind the times, but I only realized in the last year or so how useful Yahoo! Pipes can be. (How long until Marissa Mayer kills it?) The reason I use Pipes here is to convert the CSV file I’ve exported to an RSS feed that has everything it needs, including a certain amount of HTML, to be displayed nicely on a website. A word of warning: this is the part that might take a while.

Note: probably anything that Pipes does could also be done via PHP or some other programming language. The advantage of Pipes is its easy interface, which allows non-programmers to do these sorts of things.

Pipes allows you to start with any of a number of sources. For many projects, you begin with an RSS feed. But you can also begin with a CSV file if you choose the “Fetch CSV” module. Filling out this module is fairly intuitive: you need to tell it where your file is hosted, what character separates the columns and to use the first line for column names.

Our goal is to output an RSS feed. The fields we exported from Millennium, though, do not map neatly onto the basic RSS fields of Title, Description, PubDate, and Link. So we need to do some work to get those fields in order. Pipes allows us to modify the data, in several steps if necessary, using modules such as Regex, which allows us to substitute characters according to some pattern, and Create RSS, which allows us to take the fields, modified to our liking, and insert them into RSS-standard fields.

What exactly you do depends on what you want the feed to look like, so I’m not going to explain what I did step-by-step. I wanted the RSS title to be the record’s Title (245) field, since that contains both the title and author or editor; the link to be a permalink to the catalog record; and the description to contain the Created field (in parentheses, preceded by the phrase “added on”) followed by the Note field. I also needed to include a little HTML (including, I’m afraid, a bit of inline styling, since I can’t modify the stylesheet in our CMS). I’ve published my pipe in case it’s helpful to look at it as a reference. I’m sure there are better and simpler ways to do some of the things I did there! (If you see anything obvious I should change, please let me know.) The point is that you don’t need to worry too much about doing it the best way—lots of things will work. Here’s a screenshot (fragment, click for the full screenshot):

screenshot of new books feed in Yahoo Pipes
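The mapping described above can be sketched in plain JavaScript. This is just an illustration of the idea, not what Pipes actually runs; the column names ("Title", "Record #", "Created", "Note") and the WebPAC-style permalink pattern are assumptions based on my export, and your Millennium setup will differ.

```javascript
// Sketch of the CSV-field-to-RSS-item mapping that Pipes performs.
// Column names and the permalink pattern are hypothetical; adjust for your export.
function rowToRssItem(row, catalogBase) {
  return {
    // RSS title: the 245 field, which already includes the author/editor
    title: row["Title"],
    // permalink built from the bib record number (pattern is an assumption)
    link: catalogBase + "/record=" + row["Record #"],
    // description: created date in parentheses, then the note field
    description: "(added on " + row["Created"] + ") " + (row["Note"] || "")
  };
}
```

Once each row is shaped like this, the Create RSS module's job is just picking which field goes where.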

I hit a couple snags worth mentioning:

  1. Permalinks in our OPAC are formed from the record number, but for some reason, the record numbers that were exported contained an extra digit at the end, breaking the link. So I had to create a regular expression that looked for the last digit and lopped it off. (I’m no regex expert, so I found searching Stack Overflow and testing things out in Rubular extremely useful.) Of course your setup may be different—the point is that Pipes is flexible enough to deal with this sort of snag.
  2. I had to make some transformations to the dates, first flipping the month and day, and then using the Pipes “date formatter” to make the RSS pubDate come out accurately.
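Both snags boil down to simple regex substitutions. Here is roughly what they look like outside of Pipes; the record-number format and the date format shown are from my setup and may well differ in yours.

```javascript
// Strip the extra trailing digit that, in my export, Millennium appended to
// record numbers, breaking the permalink. (Your export may not need this.)
function fixRecordNumber(exported) {
  return exported.replace(/\d$/, "");
}

// Flip "DD-MM-YYYY" to "MM-DD-YYYY" so the date formatter can parse it into
// a pubDate. The input format here is an assumption; check what your export produces.
function flipDayMonth(created) {
  return created.replace(/^(\d{2})-(\d{2})-(\d{4})$/, "$2-$1-$3");
}
```

Testing patterns like these in Rubular before pasting them into the Pipes Regex module saves a lot of trial and error.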

I also thought it might be nice to include cover images. I did a very small amount of research on options here and decided to try the Open Library Covers API, since it has very permissive terms of service and relies on ISBNs, which were part of my export. Unfortunately, Millennium outputs a series of ISBNs, so I had to use a regular expression to delete all but the first. Even then, far from every cover is in Open Library, but I think it works ok to have covers show for only some books. I simply included it in the description field, floating it to the right of the text.
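The cover step works out to two small operations: keep only the first ISBN, then drop it into an Open Library Covers API URL. A sketch, assuming the ISBNs arrive semicolon-separated (check how your export actually delimits them):

```javascript
// Keep only the first ISBN from Millennium's list of ISBNs, stripping
// qualifiers like "(hbk.)". The ";" delimiter is an assumption.
function firstIsbn(isbnField) {
  return isbnField.split(";")[0].trim().replace(/[^0-9Xx]/g, "");
}

// Open Library Covers API: /b/isbn/<ISBN>-M.jpg returns a medium-size jacket,
// or (depending on settings) a blank image when no cover exists.
function coverUrl(isbnField) {
  return "http://covers.openlibrary.org/b/isbn/" + firstIsbn(isbnField) + "-M.jpg";
}
```

The resulting URL can go straight into an img tag in the description field, floated right of the text.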

Once the feed has what you want it to have, you’re nearly done. (You can always fiddle with your Pipe later on.)

Step 3: Feed2JS (or alternative)

It seems like Feed2JS has been around forever, and guess what? It still works. It’s a really simple way of outputting RSS as an HTML list: give it the feed URL, fill out the form, preview, and copy the output, a short bit of JavaScript that will render the feed as a list. Make sure, if you’ve used HTML in the RSS Description field, that you select “Yes” on “Use HTML in item display”. If you’d rather not rely on the Feed2JS server, you can download the script and self-host it, which is what I’ve done. You can also set Feed2JS to cache the feed at long intervals (assuming you’re only updating the feed weekly or so), which should help with performance.

On the page I’m using for this, at least so far, the feed is at the bottom of the page. If you are putting your new books feed higher up, it is probably better not to use Feed2JS, and instead to use a script that waits until the page loads to execute—sometimes Feed2JS lags a bit, and that can hold up the rest of your page from loading. I recently used Zazar’s RSS Feeds Reader jQuery plugin for a different page and it works superbly. Or, you could probably use a LibGuides box for this: put the feed into an RSS box and then use the API tool to host it elsewhere (or, much simpler, display your new books feed in a LibGuide).

So here’s what the result looks like, for now (website is transitioning next year, not sure if this sort of thing makes the cut), or see the live page:

Recently cataloged books

Last step: do it regularly

So ideally, this last step wouldn’t be necessary. But once everything is set up, it’s no big deal to maintain. The query can be saved in Millennium and run as needed (you just need to adjust its date boundaries before searching so they cover the recent period), and the newly exported CSV file needs to overwrite the previous one wherever it is being hosted. Millennium seems to save the export setup; I’m not sure if there’s some better way to preserve it, but so far it’s been stable. With instructions, this process could be farmed out to a student or other helper. The complicated part is in Yahoo! Pipes, and that does not need to be modified at all, at least until Marissa Mayer decides to pull a Google Reader on us…

Getting the most out of JSTOR’s ‘Register and Read’

JSTOR’s “Register & Read” program is great for those of us who can’t afford a full set of JSTOR collections. (For several years we only had two collections, and recently doubled that to four.) I suppose JSTOR is mostly thinking about non-academic populations with this program, but it can also be a boon to community colleges and other lower-budget operations.

But I’m not sure how our users are supposed to find it. JSTOR by default searches only content in your institution’s JSTOR collections (this can be manually disabled, but I don’t expect our users to do that), and links that go there from EBSCO only go to those same collections. So how do users know that there are some journals not in our collections to which they nevertheless have access?

I’ve been working with EBSCO’s “customlink” system quite a bit over the last year or so. It’s a very convenient means of linking from EBSCO’s bib records directly to a full-text source, especially with EBSCO Discovery Service, although it’s very useful with regular EBSCOhost databases, too. Since we have a relatively small number of resources, I think it makes more sense for us than using an all-out link resolver. It’s not without its quirks—sort of a strange syntax (not that I code competently enough to be able to discriminate much in that area), and it has a habit of breaking periodically—but when it works, it’s super. Lots of JSTOR bib records are available in various EBSCOhost databases, and all of them are in EDS, so I figured this would be a way to do it.

JSTOR provides a downloadable spreadsheet of Register & Read journals on the about R&R page. I uploaded that to EBSCOadmin as a “local collection”, which required only a minimal amount of tweaking. Here’s what it looks like once it’s uploaded:


Of course, like almost everything in JSTOR, Reg & Read journals have a significant embargo period. It would be pretty time-consuming to set the dates for each journal, so I just had them all set to include no journal issues published later than 2008 (might need to change this to 2007, though). Also, ideally I would have taken out all the titles that we already have access to, since in most EBSCOhost databases (everything except EDS) the Register & Read link will show in addition to the regular JSTOR link; that seemed too time-consuming, though, so I’m just taking them out as I notice the duplication.

Once you’ve got a local collection, you can use EBSCO’s prefab JSTOR customlink to create a link that is displayed only when the journal is one from that collection. What’s more, the record will now count as “full text”; when a user clicks the full-text limiter in EBSCO, the record will display even though the full text is not in EBSCOhost “natively” (this “external full text” feature is, incidentally, one of the things about customlinks that breaks periodically).

JSTOR Reg & Read results on EBSCO results screen:


One last issue emerged. Normally we link directly to full text, either licensed or open-access. This is neither—the user needs to create a MyJSTOR account and understand the various limitations of the program. I think JSTOR does a good job of explaining this to the user as s/he goes along, but still I thought it required some warning on our part. So I created an intermediary page that explains what is happening to the user. This was kind of neat, since you can use the customlink to pass information about the article, such as the title and journal, into the URL and then grab them using PHP, and then put the URL into a “go to the article” button. This also allows me to log the clicks, both by sending some basic info to a text file on the Web server and via Google Analytics. This might help us understand which JSTOR collection we ought to think about getting next, should the opportunity arise. Here’s what the page the user sees currently looks like:

To make this work, I needed to modify EBSCO’s customlink. Instead of pointing directly to JSTOR, the link points to a PHP file on our Web server and includes the original JSTOR linking info after a “url=” parameter. I also included some extra info in the link that allows me to include the article- and journal-level information in the PHP page and to log what’s being requested. Here’s basically what the link now looks like:

http://[path to our file is here].php?url={ISSN1}({YEAR}){VOLUME}%3a{ISSUE}%3c{STRTPAGE}%3a{JOURNAL}%3e2.0.TX%3b2-2&origin=EBSCO&title={TITLE}&journal={JOURNAL}&collection={DBCODE}

So, again, not sure this is exactly what JSTOR had in mind, but it seems to be helping a few people find what they’re looking for.

EBSCO ebooks: an update

It seems like my lament about a big gap in EBSCO’s migration of netLibrary ebooks to their own platform generates a good amount of the hits this humble, neglected blog gets. For that reason, I feel like I need to provide an update: the issue I was complaining about has, according to EBSCO, been fixed for months.

As part of EBSCO’s system upgrade during summer 2012, they are supposed to have made it possible to “close” ebooks, so that titles with simultaneous-user limits become available again rather than remaining locked until the user session expires. I learned about the fix through an email from EBSCO support, with my ticket number for the original complaint (which had been classified as an “enhancement request”) in the subject header. I didn’t see the ebook enhancement listed in EBSCO’s official posts about its update, which included the vastly improved mobile interface.

EBSCO now says that clicking any of the links you see on ebook screens—for instance, Results list, New Search, Back, Exit, Detailed Record, etc.—will “close” the book, although, I was told in a follow-up email, the book might actually remain unavailable to others for an additional 5 minutes. What you do NOT want to do is hit the browser’s “Back” button—if you do that, the book will remain open until the end of the browser session, which could be set anywhere from 15 minutes to 2 hours, based on the settings in EBSCOAdmin.

I haven’t taken the time to thoroughly test this. Soon after I got the email I did a few tests, opening a book in one browser, closing it, then opening it in another, but it seemed inconsistent. In some cases the book was soon available, in others, not. And it seemed like the Chrome browser had more trouble closing the book than Firefox. I keep thinking this is something that I need to make time to test more systematically, but then I never do… Still, best to tell people to use those links, or better yet, find ebooks that don’t have these blasted single-concurrent-user limitations on them.

Edit: found an EBSCO support article on this.

Adapting the “Best Search Box Ever”

Eric Frierson of St. Edward’s University recently wrote about his efforts to make library search boxes more responsive to users. We know that many users don’t see our search boxes as keyword search receptacles—they see them the same way they see Google search boxes: a place you just throw in a question or anything you want in order to get some results. So what does his search box do?

It defaults to their discovery system (EBSCO Discovery Service, or EDS). But before taking the user there, a PHP script parses the search, aiming to correct for certain user behaviors. If a user types the name of a database, e.g. JSTOR, they might get taken directly to that database. If someone types "Sociology books," they go to a page giving them a choice ("Did you mean…") of going to a book-limited search for sociology OR a search for "sociology books" in the discovery system. If they search for "hours," they get a choice of a link to a page on library hours or a discovery search for "hours". Brilliant, right? St. Edward’s features the search box on their visually appealing library home page.
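The routing idea can be boiled down to a few lines. This is a stripped-down sketch, not St. Edward’s actual script (which is PHP), and the URLs and keyword list are placeholders:

```javascript
// Match the query against a few known keywords before falling through to the
// discovery system. Real versions would offer a "Did you mean…" page rather
// than redirecting outright; all URLs here are placeholders.
function routeSearch(query) {
  const q = query.trim().toLowerCase();
  const shortcuts = {
    "hours": "/about/hours",       // library hours page
    "jstor": "http://www.jstor.org/" // database by name
  };
  if (shortcuts[q]) {
    return shortcuts[q];
  }
  // default: send the query to the discovery system
  return "/discovery/search?q=" + encodeURIComponent(query);
}
```

The nice thing is that once the skeleton exists, adding a new shortcut is just one more entry in the table.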

Eric was kind enough to share his PHP file, so I tried my hand at adapting it to our own context. Search for "hours" and you’ll get a link to our hours page. Our information literacy tutorial is called PILOT, so I added that. It’s terrific, since I could never have made this PHP script from scratch, but once it’s set up I can adjust the variables. It’s now live on the Library at the Davis Center page. I’m also intending to incorporate it into the next version of our Desire2Learn widget.

I also switched the HTML template on the "Did you mean" page to the Twitter Bootstrap framework, which I had been wanting to work with, just to figure out how the whole grid-based responsive design thing works. It’s quite easy to work with, and the result is that the elements on the page reflow nicely as the screen resolution drops.

response to search for "reserves", horizontal flow response to search for "reserves", vertical flow

One thing I really liked in the St. Edward’s search box was that it suggested questions from their LibAnswers site as the user types. As I recently posted, we decided to go with an open-source Q&A solution, so I couldn’t use Springshare’s handy API. I realized though that I could include arguments in the PHP file that would direct particular questions toward particular pages in our Q&A site—more labor-intensive to set up, but maybe worth it—if I could find a way to auto-suggest them. Twitter Bootstrap actually has a handy typeahead script that works great, but required access to the CSS, which in our CMS I simply don’t have.

Another auto-suggest that appears to be widely used comes from jQuery UI, and that one actually includes some styles in the script itself. This allowed me to do some styling even without access to the CSS.

jquery ui code
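The gist of the auto-suggest is just filtering a canned list of questions as the user types. jQuery UI’s autocomplete widget handles the matching and rendering; here is the matching step on its own, with a placeholder question list:

```javascript
// Placeholder list of Q&A questions; in practice these would point at pages
// on our Q&A site via the PHP script's arguments.
const questions = [
  "How do I renew a book?",
  "How do I print from a laptop?",
  "Where are the course reserves?"
];

// Case-insensitive substring match, which is what the default widget does.
function suggest(typed) {
  const t = typed.toLowerCase();
  return questions.filter(function (q) {
    return q.toLowerCase().indexOf(t) !== -1;
  });
}
```

With jQuery UI, you would hand the same array to the widget, e.g. `$("#search").autocomplete({ source: questions })`, and it takes care of the dropdown.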

So it still looks a little awkward (and I haven’t been able to make it look uniform in different browsers), but you’ll see that if you start typing e.g. "how do i" into the search box, suggestions begin to appear.

search box with search suggestions appearing automatically

It took me a while to find a way to log search queries to a text file, but finally I hit upon a page from the University of Tennessee that had some usable code. I know it’s better to save queries to a database than a flat file, but I don’t have access to a database on the server, so this will have to do.

One question that I’m reluctant to face is just how much overhead one can justify including for this. I’m using:

  • jQuery, which is not a big deal because most people will already have it cached (I use the Google-hosted version);
  • jQuery UI for the search suggestions, which I had to customize, so I can’t use the Google-hosted version. I removed the parts that I didn’t need, and will minify it, but still;
  • Bootstrap, which is a very large CSS file to include for the very simple "did you mean?" page, though I’ve removed large parts of that too and likely can do more;
  • another jQuery plugin to put the dummy text in the search box (can’t do without that!);
  • and then the PHP script could start to get pretty long as well…

I’m a little afraid to count up how much data is needed for this search box to be so fabulous.

But in any case, three cheers for librarians who get great ideas, have the skills and talent to execute them, and then, crucially, share their work. Again, I could never have written this script from scratch. I might add that this is exactly how I like to see universities and community colleges working together—that is to say, community colleges learning from and adapting universities’ innovations, since it’s so rare for us to have UX specialists and the like. Unequal exchange? Maybe…

California’s Open Textbook Bill

Interesting legislation coming out of the California State Senate (PDFs, SB1052 and SB1053). President Pro Tem Darrell Steinberg has brought forward two bills to try to address students’ textbook woes. Here’s what they do:

The first step (cue Republicans warning of bureaucracy) is to create a commission of 9 faculty members, 3 each from University of California, California State University, and California Community Colleges. This commission is tasked with creating a list of the 50 most widely taken lower-division courses (there will be some work here in trying to figure out whether overlapping courses are the same or different). Once this list is drawn up, in order for a bookstore to carry a book for one of these courses, the publisher must provide at least 3 copies for the library. OK, this part is a little underwhelming;  it’s always nice to have reserve copies, but that’s just a band-aid.

Flatworld Knowledge, All Access Pass (PDF, epub etc.) for $34.95

The more interesting part of the bill is the attempt to drive production of one open, Creative Commons-licensed textbook for each of these classes. The bill’s drafters clearly took some time to figure parts of this out. There would be an RFP and grant process administered by the commission—they don’t assume that these textbooks would come from nothing. Traditional publishers would be invited to submit bids along with everyone else. Digital would be free, print copies would need to be provided at low cost.

Textbooks would be licensed under Creative Commons, allowing derivative works, so you could remix/adapt content. There’s also a bit about the textbooks needing to be provided in XML or a similar format. This is smart; you don’t want publishers providing something free but then charging for every alternative format (Flatworld Knowledge does this, providing a Web-based version and then charging for PDF, ePub etc., though their prices are low). A companion bill provides for the establishment of a California Open Source Digital Library, where all this stuff would presumably be made easily accessible and preserved.

So, yes, interesting stuff. The risks are pretty clear:

  • You can’t always commission a good textbook. Publishers in the current system don’t win every time. Someone gets a book contract, writes a textbook, and nobody adopts it. When publishers get a hit, they milk it, with a new edition every couple years. So it could be that the RFP looks good, but nobody likes the textbook once it’s written.
  • Publishers could just start to milk the ancillary materials, and charge $150 for access to online problem sets, quizzes etc. They’ve already started this process, right? Many students can’t buy used textbooks for a lot of classes because they need a bundled code for the online portion. There’s nothing in the bill about the CC license being non-commercial, so students could continue getting reamed, just not by the textbook itself. Presumably this question would be dealt with by the commission in the RFP process.

We’ll see. The publishers don’t like it, but then they might end up competing to get the grants—that’s always motivational. The college bookstores won’t like it for sure, but there’s probably not a lot they can do. In any case, it’s great to see people looking for solutions here. Links to the bills are at the top of the page. I didn’t see anything in depth in the LA Times, Sac Bee or other papers, but UC Berkeley’s Daily Cal had a good piece back when the legislation was first introduced.

The bill is now out of the Senate, so it then needs to pass the Assembly. And there’s no funding in the bill itself—that would need to be part of California’s always smooth-sailing budget process.
