Retro Formats

Saturday, November 11, 2006

Time to head home

So here I am at an airport again, starting the trek back to Australia. I have a 4:35pm flight that gets into LAX at 8pm local time. Then my plane to Sydney leaves at 10:30pm. While airports are not generally welcoming, I have that feeling of contentment that I'm going home.

This morning I did a lot of walking. After packing up everything at the hotel room, I walked down towards the river. The riverside is beautiful - similar to Lake Burley Griffin in the sense of people walking, jogging, rollerblading and cycling around it. It is a nice day in Boston today, so it was also beautiful in terms of the scenery.



This was probably one of the most enjoyable experiences of the trip in terms of leisure time. I walked back through the government centre area of town alongside the Boston Common, up Newbury Street and then up to Copley Plaza. From there I made my way back to the hotel to check out.

I left my luggage at the hotel and went to have lunch, before heading to the airport. The cab ride was interesting - the driver was of Russian descent and was telling me all about the differences between Russia and other countries. Essentially he was saying that you can't compare them because they're so different, which I guess in a sense is true. I hadn't realised that the cabs don't take credit cards, so I had to run in to the terminal to get some more cash to pay him!

Now I'm just waiting for the flight sitting in one of these rocking chairs that are so quaintly placed overlooking the runway.



Well, I guess it adds character to the airport terminal anyway!

Friday, November 10, 2006

DLF Fall Forum - Day 3

Today I attended a panel session on Mass Digitisation and the Collective Collection, which (again) wasn't exactly what I expected. Discussion mainly focused on large-scale digitisation of books, and integration with Google Books. I was hoping for some focus on newspapers/journals, but alas it was not to be.

The other session I attended contained two OAIS-related presentations as well as one about the design of a DSpace-based preservation repository from NYU. This latter presentation went into some detail about their component-based approach to the solution (Java, Ruby, SRU/W, xmlrpc, Shibboleth, Handle, METS, MARC/XML, etc.). It would be interesting to take a look at in the context of our own plans to implement an SOA-based approach to software solutions.

I also met with Tim DiLauro to chat about the NLA TakePart service. We had a good chat about it & other NLA initiatives. He was apologetic about not being able to show me something from his side!

I was supposed to catch up with the people from OCLC to demo PANDAS 3, but they disappeared before I got to catch up with them today. Never mind - I can always give them a login to the test or evaluation system at a later date.

The DLF forum finished at about 12:30pm and I got away about 1pm. I arranged to meet Gordon Mohr (Internet Archive) & head out to the Museum of Science with him. He was going to see a special exhibition called BodyWorlds. Unfortunately, I couldn't get a ticket to see it with him, so I wandered around the museum for a bit, then decided to go to the New England Aquarium.

I really enjoyed the aquarium. There was an interesting display on jellyfish. There was a central cylindrical tank with a ramp running around it. The tank had heaps of fish in it, plus a shark and a huge turtle. It was all pretty impressive.

Then I wandered around the wharf as the sun was going down. There are some really beautiful buildings around there, and a nice view across the bay. The mix of historic and modern architecture is really interesting.



I wandered to South Station, caught a train out to Central Square, and walked along Massachusetts Avenue to Harvard Square where I heard the rhythmic sounds of percussion. It turned out to be these awesome guys playing various pieces of junk with sticks of wood, who called themselves "Junkyard Jazz".



They were incredible. I was there for about 20 minutes, just mesmerised by it and they didn't stop the whole time I was there - amazing!

Thursday, November 09, 2006

DLF Fall Forum - Day 2

Today the forum continued on. We had a great keynote talk from Anurag Acharya (Google), talking about the concepts of simplicity that Google have applied in developing their services. He talked about the continued development/refinement of Google Scholar, which he is directly involved in. His list of lessons learned & opinions:
  • Discovery must be universal.
  • Make the common case easy.
  • Interfaces need reliable semantics.
  • Discovery is best centralised ("metasearch is a dead-end").
  • Opportunity to help users:
    • Libraries are wonderful repositories.
    • Help users find the wealth.
    • "if you link it, they will come..."
Following the keynote, I attended a session on distributed storage networks, which wasn't quite what I had expected, but it was interesting all the same - a lot of talk about people allowing their "spare" laptop hard disk space to be used as a distributed storage node. There are a lot of issues to be hurdled before you can see it working in practice.

I ended up going to lunch with a bunch of people, including Anurag Acharya (Google), Kris Carpenter (Internet Archive), SJ Klein (OLPC), Gordon Mohr (Internet Archive), John Kunze (California Digital Library), Shigeo Sugimoto (University of Tsukuba), Clay Shirky, David Weinberger and Kristine Hanna (Internet Archive). We had a brainstorming session about providing a distributed computer network for uploading and downloading created content to a world-wide distributed storage network. I mainly listened with interest, since I didn't have any prior background knowledge.

After lunch I attended a session on some of the work Harvard has been doing with their digital collections access, followed by the Birds of a Feather (BOF) sessions. The first BOF session I went to was the PREMIS with METS one. This wasn't all that exciting but there were some general ideas coming out of the discussion, such as the consensus that duplicated fields in the different schemas should be redundantly populated. The main reason behind this seemed to be that people were interested in using the separate schema data separately in some cases. There was also the idea of using an XML schema include to specify dynamically what the controlled lists' vocabularies should be, and the idea of splitting the different objects of the schema into separate namespaces to circumvent issues.

I also went to a late BOF session (5:30 to 6:30pm) on the direction RLG is heading in as part of OCLC. This was interesting from an observer's point of view. They had a brochure describing the kind of things they think they will work on, and the DLF partner institution representatives were supposed to be feeding in on what they thought the priorities might be. However, there seemed to be some tension, or at least confusion, over how RLG Programs and Research fit into the OCLC picture and whether what is now a subdivision of OCLC was still going to be able to provide some of the services that are considered to be important by the Library research community.

After the meeting, Tim DiLauro (Johns Hopkins University) caught me in the corridor. He heard my summary of what NLA is doing with our Take Part spaces and wants to have a chat about it. We're going to get together tomorrow & take a look at it. I also have to catch up with the OCLC people & try to give them a glimpse of PANDAS 3, which they're interested in. They have a web archiving product and are interested to see others. They're also keeping an eye on the Web Curator Tool that NLNZ & BL have been developing.

I caught up with the OCLC guys again over dinner. We went to the Turner Fisheries restaurant, just across the street from the forum hotel. The food was delicious. I have to say that all the food I've had in Boston has been pretty good.

We had some really good conversation - almost completely not work-related - and I think I've convinced Andreas to come & work at the NLA for a while. Well, we'll see...

Wednesday, November 08, 2006

DLF Fall Forum - Day 1

Today was the beginning of the DLF Fall Forum, which is being held at the Fairmont Copley Plaza Hotel. I attended the Developers' Forum in the morning, as the actual forum didn't officially start until 1pm. The Developers' Forum is a group of technologists from DLF member institutions that get together at most DLF forums to talk about common issues and potential for collaboration. It was a great opportunity to hear what many libraries in the US are currently wrestling with.

I talked to Geneva Henry from Rice University, as previously arranged, about the Library Services Framework development that she's a part of. They've done lots of background work, but there's not a lot to show for it yet. I was unable to get to the Services Framework group meeting this week, since I was attending the GDFR TWG meeting on Tuesday, but according to Geneva it was a very useful meeting with very good outcomes. She has the task of writing up the results from butcher's paper. I think she will keep me in the loop regarding the future progress.

I also chatted to Morgan Cundiff (Library of Congress) who works in the area of standards. He was very interested in some of the work we've been doing at NLA. He said they had stumbled across our Lucene prototype on the web, so he was asking a little bit more about that, as well as other things we're working on. It turns out they've been doing some similar thinking regarding splitting out of their OPAC web interface.

Dan Chudnov gave a very short presentation on Zero Config, a concept that I'm not sure I completely understand. However, the essence of it was making Library search services as accessible and easy to use as iTunes. I indicated to him that we would be interested in the project and might be able to be a collaborative partner. He's going to send me some more details so I can follow it up at NLA.

Ran into James Bullen (New York University Library, previously NLA), who was very surprised to see me there. He's looking well, enjoying being a dad, and enjoying working on the digital library project at NYU from the sound of it.

The very interesting presentations I attended this afternoon were:

  • Collex: NINES in the Semantic Web - University of Virginia - web-based technology for collecting and exhibiting digital resources described with RDF (uses Solr, MySQL and Ruby on Rails under the hood)

  • Swimming in the Resource Pool: The USC Libraries' Gandhara Project - University of Southern California - federated search

  • SIMILE - Semantic Web browsing in DSpace - MIT - open-source tools including faceted browsing

  • PennTags: Social Bookmarking in an Academic Environment - University of Pennsylvania - social bookmarking system for use in academic environment

  • Cooperative Architecture and Cooperative Development of a Course Reserves Tool - Harvard University - automation for processing and student display of course reserves reading lists

  • Open URL Unleashed: Six Questions (Q6) and the Open URL Object Model (OOM) - OCLC - skins over the top of a services layer



The forum reception was great. At the beginning I was having a chat with Adam Farquhar (British Library), and then I ran into Gordon Mohr (Internet Archive). He recognised me from a visit to Australia a couple of years ago. So, of course, I had to be introduced to the whole Internet Archive crew - Kris Carpenter, Kristine Hanna - as well as Michelle Kimpton (previously of IA). I spent quite a long time talking with Kris Carpenter about a variety of topics including some of the tools we're using in common like Confluence and JIRA.

After that I ran into the guys from OCLC - Andreas Stanescu and Brian Clark - and we had a brief chat. Also talked to Jeffrey Barnett (Yale University), spent some time talking with Rachel Gollub (Stanford University) and then John Mark Ockerbloom (University of Pennsylvania) right at the end. It was a very enjoyable evening and great networking time.

Tuesday, November 07, 2006

TWG meeting - Day 2

Today the meeting started at 9am, a little more of a shock to the system than yesterday. However, I managed to get there a few minutes early.

Most of the day was spent with Andreas Stanescu (OCLC) walking us through the analysis model. This was a useful process, as some of the document made a lot more sense after further description. There was detailed discussion about the domain model, which was reasonably self-explanatory, but it was decided that including the cardinalities on the relationships would provide extra useful clarification.

There was also a detailed discussion around the technical architecture of the system to be developed. OCLC have already developed an underlying service framework that they intend to use in the development, including use of the Berkeley DBXML database. Animated discussion ensued about whether it was a good idea to use LOCKSS as the synchronisation technology.

Much of the discussion ended in the realisation of the need for more work or further investigation before a decision can be made. Further discussion of the issues will be facilitated by the project wiki, and decisions will be taken from there.

At the end of the day several of us headed over to a nearby pub, the Harvard Inn, for a well-earned beer. After a couple of hours of more social discussion the group headed in separate directions. I went back to the hotel (on the "T" again) before heading out to find somewhere to eat dinner.

I discovered a shopping mall, which has been right under my nose all along (attached to the hotel building), and found it contained Legal Sea Foods - Restaurant & Oyster Bar, which had been recommended to me, though I hadn't found one that wasn't too busy on previous nights. The food was amazing! If you ever visit Boston, eating at this local restaurant chain is a must! The restaurants are scattered around the city and east coast USA. Apparently, the seafood around the Boston area is mainly clams and scallops, so they are the best/freshest to eat. I had a mixture of seafoods and I have to say it was all pretty good.

Monday, November 06, 2006

TWG meeting - Day 1

The first main agenda item covered was the Format Model document, which includes the concept of a four-layer approach to the concepts: Abstract Information Model (AIM); Format Coded Set (FCS); Format Encoding Form (FEF); and Format Encoding Scheme (FES). Each layer represents different aspects of a format, and each layer encapsulates the previous layer.
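To picture the "each layer encapsulates the previous layer" idea, here is a minimal sketch in Python. Only the four layer names come from the Format Model document as described above; the `Layer` class, `build_layers` and `stack` are hypothetical names made up for illustration and are not part of any GDFR specification.

```python
from dataclasses import dataclass
from typing import List, Optional

# The four layer names are from the post; the ordering is the order they were listed in.
LAYER_ORDER = [
    "Abstract Information Model",
    "Format Coded Set",
    "Format Encoding Form",
    "Format Encoding Scheme",
]


@dataclass(frozen=True)
class Layer:
    """One layer of the model (hypothetical class, for illustration only)."""
    name: str
    inner: Optional["Layer"] = None  # the previous layer that this one encapsulates

    def stack(self) -> List[str]:
        """Return layer names from the innermost layer out to this one."""
        names = [] if self.inner is None else self.inner.stack()
        return names + [self.name]


def build_layers() -> Layer:
    """Wrap each layer around the previous one, following LAYER_ORDER."""
    layer = None
    for name in LAYER_ORDER:
        layer = Layer(name, layer)
    return layer


outermost = build_layers()
print(" > ".join(outermost.stack()))
```

Asking the outermost layer for its stack walks inwards through FEF and FCS to the AIM, which is all the encapsulation claim amounts to in this sketch.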

There was detailed discussion about the relationships between formats, which are also described in the document; especially the extension versus the subtype relationship. The distinction here is that the extension adds information that would not be recognised by the tools for the extended format, whereas the subtype only represents a derived format for which there is no change to the information the format stores – hence the same tools can be used. Discussion about the value of storing these relationships followed. There seemed to be a lot of confusion about the relationships in terms of the conceptual layer model. The meeting decided that the document required more work and that further examples of the various relationships would add clarity.
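One way to see the practical difference between the two relationships is to ask which tools can be relied on for a given format. The following is only a rough sketch under my own assumptions: the `Format` class, the `tools_for` function, and the format and tool names are all hypothetical, and the rule that a subtype inherits its base format's tools while an extension does not is my reading of the distinction described above, not something taken from the GDFR data model.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Optional, Set


class Relation(Enum):
    EXTENSION = "extension"  # adds information the base format's tools would not recognise
    SUBTYPE = "subtype"      # derived format; the information stored is unchanged


@dataclass
class Format:
    """A format plus an optional relationship to a base format (hypothetical)."""
    name: str
    tools: Set[str] = field(default_factory=set)  # tools written for this format itself
    base: Optional["Format"] = None
    relation: Optional[Relation] = None


def tools_for(fmt: Format) -> Set[str]:
    """Tools expected to understand everything stored in fmt (sketch only)."""
    tools = set(fmt.tools)
    # Only a subtype can safely reuse the base format's tools.
    if fmt.base is not None and fmt.relation is Relation.SUBTYPE:
        tools |= tools_for(fmt.base)
    return tools


# Made-up formats and tools, purely for illustration.
base = Format("BaseFormat", tools={"base-viewer"})
profile = Format("BaseFormat-Profile", base=base, relation=Relation.SUBTYPE)
plus = Format("BaseFormat-Plus", tools={"plus-viewer"}, base=base, relation=Relation.EXTENSION)

print(sorted(tools_for(profile)))  # subtype: the base format's tools still apply
print(sorted(tools_for(plus)))     # extension: only tools that know the added information
```

If storing these relationships has value, it is probably in queries like this one, where a registry can answer "what can I open this with?" without every format listing every tool.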

Then we moved on to the data model. There was a lot of in-depth discussion on this. We only got through looking at the various aspects of the Format object (and sub-objects). Some of the fields were eliminated in favour of further relationships between Format objects, where the values could be represented as a Format object. This was particularly true of some enumerations that had been proposed. I think we went too far with this. For example, I think perhaps the character encodings are not actually formats themselves, and should be recognised as a different entity altogether. I’m also starting to wonder if different objects such as this are representative of the four different layers described in the Format Model document. Anyway, I’ll do some more thinking on this & feed back into the TWG discussions.

The group then headed out to dinner at Sandrine’s Bistro, a very nice French restaurant right down the street from the HUL offices. The food was excellent, and it was good to talk socially for a while. I spent most of the evening talking with Adrian Brown (The National Archives, UK) and Andy Boyko (Library of Congress), who were sitting down my end of the table. Andy & I headed back to our respective hotels on the T (the Boston subway system).

Meeting with HUL staff

Today I headed out to Harvard University Library (HUL) offices for the first day of the Technical Working Group meeting. The meeting didn’t start until 1pm; however, I had arranged to meet with a few HUL people regarding their digital library software infrastructure from 10am till noon.

I found out some interesting things:
  • Each Harvard faculty has its own library or libraries, and HUL is responsible for services that involve bringing the information together and delivering it.
  • HUL uses many off-the-shelf products to deliver their systems. I was surprised at the number of commercial products used. They also use many technologies that NLA use (Linux, Solaris, Oracle, Java).
  • HUL is beginning to make some of their catalogue data available externally via OAI-PMH.
  • They have a geospatial delivery system for maps.
  • They have services that associate streaming audio files with finding aids, but they didn’t indicate they had any transcripts.
  • They have the concept of an archival master and print master in their digital repository. The archival master is the only one treated as a preservation copy. They allow any file type to be uploaded as an archival master, with the caveat that they will not guarantee the ongoing accessibility of non-supported formats.
  • They called their digital project the Library Digital Initiative (not the Digital Library Initiative) from the beginning, recognising that the physical and digital collections data should be one set of data.
They also indicated they would be interested in collaborating on the implementation of things. They generally seem to be at a similar stage to NLA with regard to digital repository things, and perhaps a little ahead/behind us in various aspects of digital delivery and management.
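For anyone unfamiliar with OAI-PMH (mentioned in the list above), harvesting boils down to plain HTTP requests with a `verb` parameter. Below is a minimal sketch of building a `ListRecords` request URL. The base URL is a placeholder, not HUL's actual endpoint (which I don't have), and the helper name `oai_list_records_url` is made up; the `verb`, `metadataPrefix`, `set` and `from` parameters and the `oai_dc` prefix are standard OAI-PMH.

```python
from urllib.parse import urlencode


def oai_list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None, from_date=None):
    """Build an OAI-PMH ListRecords request URL (sketch; no request is sent)."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec    # optional: restrict the harvest to one set
    if from_date:
        params["from"] = from_date  # optional: only records changed since this date
    return f"{base_url}?{urlencode(params)}"


# Placeholder endpoint, assumed purely for illustration.
print(oai_list_records_url("https://example.org/oai", from_date="2006-11-01"))
```

A harvester would fetch that URL, parse the XML response, and keep requesting with the returned resumption token until the repository reports no more records.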

Lunch was provided following my meeting with HUL staff, and the TWG meeting commenced afterwards.

Sunday, November 05, 2006

IFLA publication - Networking for Digital Preservation: Current Practice in 15 National Libraries

I read Networking for Digital Preservation: Current Practice in 15 National Libraries on the flights over to the US. It’s fantastic stuff – a really informative read if you’re interested in what various National Libraries are doing in digital preservation of their collections. It’s a lengthy document, but I recommend it if you need to know what’s going on in digital preservation around the world.

The document has a section for the 15 National Libraries included in the report, each describing their structure and resourcing for digital preservation, plus outlining their current systems & processes, and their future plans in this area.

While this is not GDFR-specific, it is in the digital preservation realm and I thought it worth mentioning.

Rest day

OK, so today was supposed to be a rest day, but I have done lots of walking. I had a bit of a lazy morning, but after that I didn't stop. I ordered breakfast from room service, which turned up earlier than requested at 8:40. They delivered the paper to me, which I didn't ask for so I'm not sure if I will be charged for it or not. I flicked through it & the junk-mail while I had breakfast. I had a banana! For the first time in about six months, I think. I thought I would really enjoy it since it's been so long, but it wasn't all that spectacular a reaction from my taste buds.

Anyway, I think I left the hotel about 10:00 - thought I'd better make sure I could find the church for mass on time. I found the church very easily from the Google map I had printed; found the conference venue on the way too. It's autumn here so the leaves on the trees that line the streets have turned and not all fallen yet. Only some of the sidewalks have trees, and there are various little rectangular parks (similar to Glebe Park in Canberra) scattered around the place. I got to the church, Our Lady of Victories, too early (around 10:30); so I wandered around for another little while.

Saw some beautiful things and some sad things. There are a few homeless people living on the streets - their presence is something of a blemish on the city parks, as they have slept with blankets on the park benches overnight. I also saw blankets and possessions (person was missing) in the corner of the entrance-way to a cathedral-like church. A modern sign had been posted on the old stone of the area - to the effect of no camping, etc. It seemed a bit hypocritical.

During my wanderings, I passed one of the newer buildings in Boston (also one of the tallest I think). You can see a reflection of the historic building/s on the other side of the street.



Back to the church: it didn't look much from the outside, but apparently was the first French Catholic church in Boston. It is run by the Marist fathers. I was surprised when I walked in. What I had expected from the outward appearance was something smaller, but this place was fairly large - maybe half the size of the cathedral in Canberra - with high ceilings, lots of ornate sculpted stone and stained glass windows behind the sanctuary. It was quite beautiful. Unfortunately, I didn't take any photos of it.

The priest was very friendly, and was welcoming people as they entered the pews. The parish is apparently quite small in terms of local people (about 140 parishioners in total), so they depend on visitors to fill the church up a bit, and no doubt to provide financial support too.

The mass was quite nice. There was a lot of cantoring, which wasn't that crash hot, but the organist was pretty good. We sang many old hymns. The recessional hymn spoiled it - far too patriotic for me. It was "Beautiful America" or some such - I have to say I was a bit turned off. However, the homily was very good, and only a few words of the prayers of the mass varied from what we are used to in Australia.

Then I walked back to the hotel and got changed into some more comfortable clothes. It was quite cold out and about. I'm wishing I had brought my scarf now - never mind. Temperatures are between 1 and 15 degrees Celsius as far as I can work out. It was about 7 degrees when I got off the plane last night.

It was then that I went wandering through the shops and the streets of Boston city. I have to say the shops were a little disappointing - nothing much interested me, many of the shops weren't open, and a lot of those that were open were what I'd describe as "upper class". I did, however, enjoy my stroll through the Boston Public Garden and the Boston Common. There are lots of historic statues in both of these park areas that are situated centrally in Boston. The colours of the trees were fabulous and I took some nice photos.





I had also almost forgotten how funny squirrels are. I was quite amused at the keep-off-the-grass signs - thought perhaps they were meant to be for the squirrels!



I think I've got the hang of looking left for traffic first after all my walking around today, but there are a lot of one-way streets here, so that gets a bit confusing too.

I came back to my hotel room around 3:30pm, and checked out the T (subway) on the way, so that I know where I'm going tomorrow. There was a cranky security guy there, but after he got over me nearly pushing the Info button, he was quite helpful in telling me how to get to Harvard station.

I hooked up to the internet in my hotel room so I could check emails and some other bits and pieces, and will head out to dinner soon (it's now 5:45pm). Housekeeping turned up at 4pm, which seemed a bit of an odd time.

Saturday, November 04, 2006

Going to Boston

Well, here I am sitting at the Sydney international airport waiting to board my flight to LA. From LA I have a connection to Boston, where I’m attending the GDFR meeting and DLF Fall Forum later in the week. I arrive in Boston on Saturday night local time, which will also give me some opportunity to see the city of Boston on Sunday.

The weather, as usual for my travels, is crappy (see picture of Sydney city background below). However, it won’t make any difference since I’ll be on the plane soon.



I’ve got some more GDFR documents to review before the meeting, so I’m aiming to get through that on the flight over. I have some other related documents to read too, so at least I’ll have plenty to do on the plane.