Saturday, December 1, 2012

Week 13 - Legal Rights and the Future of Digital Libraries

Lesk - Chapter 11, Intellectual Property Rights

One bit of insightful information in this piece was its articulation of Bridgeman Art Library, Ltd. v. Corel Corp., 36 F. Supp. 2d 191. Lesk perceives this case as holding a photograph of an artwork is not copyrightable if the photograph was routinely taken just to depict the artwork. This is especially important for digital libraries which routinely digitize images of important works of art.

Another potential pitfall for English-language digital libraries is the United Kingdom's protection over "typographic arrangement." Because a public domain book may be set in a protected typeset, one may need to look into legal issues before digitizing certain books.

Muddiest Point:  What is the best resource/book for librarians to turn to for copyright help on large digitization projects?

Stiglitz - Intellectual Property Rights and Wrongs

Stiglitz reminds us that when designing a digital library, we must take into account how users from less advantaged regions may access our material. We should champion open-source software and open-access scholarship while maintaining high quality. In fact, this has already been done. DSpace, for example, is open-source software available to any librarian with an Internet connection. (ContentDM, by contrast, is a proprietary OCLC product.)

Lynch - Where Do We Go from Here?

Lynch does an excellent job of summarizing the history of digital libraries. I am glad he acknowledges the crucial role played by government-funded and led initiatives. Very often, a national government can play a pivotal role in innovation. Innovation is not the exclusive domain of decentralized lone rangers.

Second, Lynch raises many good fields of research for others to pursue in the future. However, because this article is a little old, 2005, many of these fields of research -- personal information management, long term relationships between humans and information collections and systems (e.g., human computer interaction) -- have already become well developed.

Knowledge Lost in Information - Report

The Ubiquitous Knowledge Environment, or "information ether," is now becoming a reality. I believe the scope of digital libraries includes cloud-supported libraries of videos and music. These are now readily accessible at a moment's touch from any area with wireless coverage. Also, Google Cloud Print permits one to add devices and then to print from one's Gmail account anywhere one is able to access it.

Even though these may not seem research-related, the technologies above represent "individualized, customized, human-centric computing."



Wednesday, November 21, 2012

Week 12 - Security and Economics

Arms - Economic and Legal Issues

This article captures the crux of the economics of digital information in this quote:

With digital information, . . . once the first copy has been mounted online, the distribution costs are tiny. In economic terms, the cost is almost entirely fixed cost. The marginal cost is near zero. As a result, once sales of a digital product reach the level that covers the costs of creation, all additional sales are pure profit [emphasis mine].

This made me think of all the profit that Apple is making from selling music and videos through iTunes! When the content is good and the interface is popular, people are willing to pay =) .
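The arithmetic behind the quote can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the price, the fixed cost, and the function names are made up and do not come from Arms.

```python
import math


def profit(units_sold, price, fixed_cost, marginal_cost=0.0):
    """Profit after paying the one-time creation cost and any per-copy cost."""
    return units_sold * (price - marginal_cost) - fixed_cost


def break_even_units(price, fixed_cost, marginal_cost=0.0):
    """Smallest number of sales that covers the cost of creation."""
    return math.ceil(fixed_cost / (price - marginal_cost))


# Hypothetical figures: 10,000 to create the work, sold at 1.00 per copy,
# with a marginal cost of zero (the "near zero" case in the quote).
FIXED_COST = 10_000.0
PRICE = 1.0

needed = break_even_units(PRICE, FIXED_COST)
print(needed)                                   # sales needed to break even
print(profit(needed, PRICE, FIXED_COST))        # profit exactly at break-even
print(profit(needed + 5_000, PRICE, FIXED_COST))  # every later sale is profit
```

With a marginal cost of zero, each sale past the break-even point adds the full price to profit, which is the point Arms is making.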

Arms - Implementing Policies for Access Management

I really enjoyed this piece's explanation of security and implementation. It offered a very good "design best practice" for user interface authentication, embodied in this quote:

The least intrusive situation is when authentication is keyed to some hidden information, such as the IP address of the user's computer, or where the user logs in once and an authentication token is passed onto other computers[.]

I also found the illustrations showing particular roles and attributes for users and institutions to be very helpful.

Lesk - Economics

I greatly appreciated the reference to work done on library ROIs by Jose-Marie Griffiths and Don King. However, because this work is a little old, 2003, I would like to know:

Muddiest Point:  Has there been more recent work on a library's return on investment that gives librarians strategies for showing their worth to their home institution?

Second - I believe there has been more recent work done on how the administrative costs of OA journals can be borne by the authors rather than by the users. This may be a better model:  the prestige of having one's article accepted by a prestigious OA journal outweighs a minimal submission cost of $2.99.

Kohl - Safeguarding Digital Library Contents and Users

Finally, I have a clear understanding of how encryption and decryption keys work in the digital world! These public and private key relationships were explained very clearly.
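To see the public/private key relationship in action, here is a toy sketch in Python. I am assuming RSA as the example scheme (the reading may describe a different one), and the primes are deliberately tiny, so this is for illustration only and offers no real security.

```python
# Toy RSA key pair built from two tiny primes (illustration only).
p, q = 61, 53
n = p * q                    # modulus shared by both keys
phi = (p - 1) * (q - 1)      # used only while generating the keys
e = 17                       # public exponent, coprime with phi
d = pow(e, -1, phi)          # private exponent (needs Python 3.8+)

public_key = (e, n)          # can be published to anyone
private_key = (d, n)         # must be kept secret by the owner


def encrypt(message, key):
    """Encrypt an integer smaller than n with the public key."""
    exponent, modulus = key
    return pow(message, exponent, modulus)


def decrypt(ciphertext, key):
    """Recover the original integer with the matching private key."""
    exponent, modulus = key
    return pow(ciphertext, exponent, modulus)


secret = 65
ciphertext = encrypt(secret, public_key)
assert decrypt(ciphertext, private_key) == secret
```

Anyone holding the public key can encrypt, but only the holder of the private key can reverse the operation.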

Tuesday, November 20, 2012

Week 11 - Social Issues

Borgman - Social Aspects of Digital Libraries

The great value of this early piece is summed up in these two sentences:

#1. Digital libraries are a set of electronic resources and associated technical capabilities for creating, searching, and using information.

#2. Digital libraries are constructed -- collected and organized -- by a community of users, and their functional capabilities support the information needs and uses of that community.

I am glad that, as early as 1996, researchers called for an empirical approach to digital library design, with a focus on users!

Roush - Infinite Library

This piece correctly pointed out three directions in which Google's digitization project could go:

Door One - a private firm begins to purchase rights to things already in the public domain, in order to privatize them

Door Two - Parallel public and private databases coexist peacefully. Google could keep one copy of each library's collection for itself and give away the other copy.

Door Three - Private companies offer commercial access to digital books while public entities, such as libraries, are allowed to provide free access for research and scholarship.

I love this quote: "Libraries and publishing have always existed in the physical world without damaging each other; in fact, they support each other. What we would like to see is this tradition not die with this digital transformation."

Arms - A Viewpoint Analysis of the Digital Library

Again, a great emphasis on the user's perspective in light of interoperability.

Muddiest Point:  I have no questions this week - everything was easy to understand.

Thursday, November 1, 2012

Week 10 - Interaction and Evaluation

Arms - Chapter 8

One of the best insights from this reading is captured in this quote: "The functions offered to the user depend upon . . . structural metadata." Because this reading comes from 1999, I wonder whether structural metadata is still crucial for user functionality in digital libraries.

Muddiest Point:  In 2012, which structural metadata is most crucial for user functionality?

This reading also gave me a good sense of what Java is, its distinction from JavaScript, and how Java functions.

Muddiest Point:  Dr. He, could you please tell us some best practices for selecting servers, middleware, and CMSes? For a medium-sized digital library, what type of server, database language, and middleware should a library purchase? Is there a website or journal which a librarian should follow regarding this type of selection?

Kling & Elliott - Digital Library Design for Usability

I felt that this article was too vague. Maybe it is because it is a little outdated. I appreciate the authors outlining in Section 6.3 a usability engineering life cycle model, but I want to know concrete steps:  how to conduct a user study, how to develop a questionnaire, how one should go about creating a prototype. Too vague, too little.

Saracevic - Evaluation of Digital Libraries, An Overview

This article raised more questions than it gave answers. That is fine, because these were questions which needed to be asked. Just one example would be:

To what extent are user studies also evaluation studies?
To what extent are studies of specific user behavior in digital libraries also evaluation studies?

Also, I believe his list of factors under Section 6, "Criteria," can act as a checklist for what digital library designers should keep in mind when they begin development.

Hearst - Search User Interfaces

This reading gave solid practical advice on designing search interfaces from a user-based perspective. Its assertions were backed up with stats and studies. Thanks for having us read this article!

Thursday, October 25, 2012

Week 9 - Reading Notes

Hedstrom - Research Challenges in Digital Archiving and Long-term Preservation

This short essay summarizes the main challenges to digital preservation:  (a) the collections are heterogeneous and ever-growing; (b) one must digitally preserve for the long-term; (c) both infrastructure and technologies must be affordable.

Because this was so short, I wonder whether this was an old paper given at a workshop before the advent of OAIS.

Lavoie - The Open Archival Information System Reference Model: Introductory Guide

This has to be one of the best pieces I have read this semester. First, it clearly sets forth how OAIS got started. Second, it explains to a would-be architect the different steps and components one may need to build an OAIS-compliant archive. I give three examples below:

"The first responsibility of an OAIS-type archive is to establish criteria for determining which materials are appropriate for inclusion in the archival store." (page 4)

"The second responsibility emphasizes the need for the OAIS to obtain sufficient intellectual property rights [ . . . ]" (page 4)

"Another responsibility of an OAIS-type archive is to determine the scope of its primary user community."

I especially appreciated the visual diagrams which showed the actors and different stages on page eight.

Preservation Management of Digital Materials: The Handbook

I felt that this reading seemed to repeat much of the material covered in the lecture and in previous readings. Because I actually studied Digital Preservation in an Archives context before, I already knew much of the material.

Littman - Actualized Preservation Threats

Muddiest Point - I know MODS in passing, but I would appreciate a more in-depth explanation and demonstration. Thank you!

I find it very helpful to know in advance some of the failures that took place. However, I wonder whether the utility of this paper is limited, because it was published in 2007, and most technology has now moved on.

Monday, October 22, 2012

Week 8 - Reading Notes

OAI for Beginners - I greatly enjoyed reading about the history of how OAI developed. I especially appreciated the clear distinction between Data Providers and Service Providers, and the pictures which illustrated the functions of and interrelationship between these two types of Providers. However, because OAI depends heavily on HTTP, and because I have not yet learned HTTP, there were some parts of this tutorial which I did not understand.
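For what it is worth, the HTTP side of OAI is fairly small: a Service Provider sends an ordinary HTTP request whose query string names a "verb" and a few arguments, and the Data Provider answers with XML. Below is a minimal Python sketch of how such a request URL is put together. The repository address and the helper name build_oai_request are my own inventions for illustration; the verbs (Identify, ListRecords) and the metadataPrefix argument come from the OAI protocol itself. Nothing is actually sent over the network.

```python
from urllib.parse import urlencode

# Hypothetical Data Provider address, for illustration only.
BASE_URL = "https://repository.example.org/oai"


def build_oai_request(base_url, verb, **arguments):
    """Return the URL a harvester would fetch for the given OAI verb."""
    params = {"verb": verb, **arguments}
    return f"{base_url}?{urlencode(params)}"


# Ask the repository to describe itself.
identify_url = build_oai_request(BASE_URL, "Identify")

# Ask for all records in unqualified Dublin Core.
list_records_url = build_oai_request(
    BASE_URL, "ListRecords", metadataPrefix="oai_dc"
)

print(identify_url)
print(list_records_url)
```

The two printed URLs are https://repository.example.org/oai?verb=Identify and https://repository.example.org/oai?verb=ListRecords&metadataPrefix=oai_dc, which is all a harvester needs to request over HTTP.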

Muddiest Point:  Because we are already learning HTML, XML, DTDs, and XML Schema in library school, why don't we also learn some basic HTTP? If such important metadata standards rely on HTTP, then we should learn this in library school.

The Truth About Federated Searching - This article held some very valuable insights for me. First, Hane reminds the reader that federated search engines must demonstrate to the library that they can search the library's databases using the library's own authentication, both locally and remotely. Second, I was surprised to learn that federated searching cannot improve on a native database's search capabilities. A federated search engine can only use the capabilities of the native database.
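The second point can be sketched in Python. Everything here is hypothetical (the two tiny "databases", their search functions, and the federated_search helper); the sketch only shows that the federated layer passes the query along and merges whatever each native search returns.

```python
# Two made-up native databases with different search capabilities.
CATALOG = ["Digital Libraries", "Understanding Digital Libraries", "Web Search Basics"]
ARTICLES = ["digital preservation threats", "federated searching myths"]


def search_catalog(query):
    """Native catalog search: case-insensitive substring match."""
    return [title for title in CATALOG if query.lower() in title.lower()]


def search_articles(query):
    """Native article search: case-sensitive substring match only."""
    return [title for title in ARTICLES if query in title]


def federated_search(query, sources):
    """Send the same query to every native search and merge the hits."""
    hits = []
    for name, native_search in sources.items():
        for title in native_search(query):
            hits.append((name, title))
    return hits


SOURCES = {"catalog": search_catalog, "articles": search_articles}

# The article database cannot ignore case, so the federated layer
# cannot do it on that database's behalf either.
print(federated_search("Digital", SOURCES))
print(federated_search("digital", SOURCES))
```

Searching for "Digital" finds nothing in the article database, because the federated layer inherits that database's case-sensitive matching rather than improving on it.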

Muddiest Point:  PittCat subscribes to Summon. Has Summon demonstrated its value by effectively employing authentication and the capabilities of Pitt's subscribed databases?

Z39.50 Information Retrieval Standard - For the most part, I enjoyed how this article explained history. I appreciated knowing the origins of Z39.50, even knowing that NISO was once the Z39 committee of ANSI.

However, the article also assumed a lot of background knowledge of TCP/IP, protocols upon which Z39.50 seems to be based. Again, as with the OAI for Beginners article, we at Pitt's iSchool probably do not have an adequate background in TCP/IP. We should. I believe Pitt should teach this to us. I will most likely go learn it on my own via Lynda, but I think that Pitt should teach TCP/IP to us if an important standard employs these protocols.

I was disappointed that this article did not use helpful diagrams, as the OAI article had. Therefore, many of the complex relationships between server and client were a little hard to visualize.

Lossau - Search Engine Technology and Digital Libraries - This article correctly points out that a library's vision should not be focused on its own collection, but should be broader. A library should focus on building search services targeting virtual collections of material even within the deep web. However, this drives home the importance of accepted, interoperable standards across all types of digital objects and their repositories.




Friday, October 5, 2012

Week 7 - Reading Notes

Lesk - Chapter Four

Frankly, I found this chapter to be outdated. This is because the technology which this chapter describes seems to be obsolete.

For example, in its section about pictures, only GIF and JPEG were described. However, from my training as an archivist and amateur photographer, it is common knowledge that TIFF files are preferred over either of these formats for preservation. This is because JPEG uses lossy compression, so image quality degrades each time a JPEG file is edited and re-saved, and GIF is limited to 256 colors. A TIFF file can be saved without this kind of loss. Could it be that TIFF files were not around when this book was published in 2005? No: TIFF dates back to the 1980s, and its current specification (6.0) was published in 1992, so it was well established by then.

With respect to Automatic Speech Recognition, the greatest contemporary example in everyday life seems to be Apple's Siri. Even though this is proprietary, I would love to know how this works. Because Siri is extremely new, Lesk does not mention it in this 2005 book.

Hawking - Web Search Engines, Parts 1 & 2

These two articles were fantastic. These are the most clearly written articles I have ever read about what exactly a web search engine does.

That being said, there were some muddy points where I did not know what Hawking was talking about -

Muddiest Point #1:  From Part 1, page 87: "Excluded Content. Before fetching a page from a site, a crawler must fetch that site's robots.txt file to determine whether the webmaster has specified that some or all of the site should not be crawled." What are the reasons that a webmaster would specify certain sections not be crawled? What would those sections be generally?
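To make the mechanism concrete for myself, here is a small Python sketch using the standard library's robots.txt parser. The robots.txt content, the site address, and the crawler name "ExampleBot" are all made up; I am only guessing that administrative pages and server scripts are the kind of sections a webmaster might exclude.

```python
from urllib.robotparser import RobotFileParser

# A made-up robots.txt file that excludes two sections for every crawler.
ROBOTS_TXT = """\
User-agent: *
Disallow: /admin/
Disallow: /cgi-bin/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def may_crawl(url, user_agent="ExampleBot"):
    """Return True if the rules above allow this crawler to fetch the URL."""
    return parser.can_fetch(user_agent, url)


print(may_crawl("https://www.example.org/index.html"))   # allowed
print(may_crawl("https://www.example.org/admin/login"))  # excluded
```

A well-behaved crawler would run a check like this before fetching each page from the site.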

Muddiest Point #2: From Part 2, page 88: "An inverted file is a concatenation of the postings list for each distinct term." Would we be able to see a visual example of this list in class? I did look up the definition of the term "concatenation," but I am having a hard time visualizing this. Also, Hawking did not define what "postings list" is, so I need clarification on that, too.
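While waiting for an answer, I tried to sketch my own understanding in Python. My assumptions: a "postings list" is the list of document numbers in which a term appears, and the inverted file is simply all of those lists laid end to end, with a separate dictionary recording where each term's list starts and how long it is. The three tiny documents and all function names are made up, and a real search engine surely stores more (such as word positions).

```python
from collections import defaultdict

# Three made-up documents, keyed by document number.
DOCS = {
    1: "digital libraries preserve digital objects",
    2: "search engines index web pages",
    3: "libraries index digital collections",
}


def build_postings(docs):
    """Map each distinct term to the sorted list of documents containing it."""
    postings = defaultdict(list)
    for doc_id in sorted(docs):
        for term in set(docs[doc_id].split()):
            postings[term].append(doc_id)
    return dict(postings)


def build_inverted_file(postings):
    """Concatenate the postings lists and remember each list's offset."""
    inverted_file = []
    offsets = {}
    for term in sorted(postings):
        offsets[term] = (len(inverted_file), len(postings[term]))
        inverted_file.extend(postings[term])
    return inverted_file, offsets


def lookup(term, inverted_file, offsets):
    """Slice one term's postings list back out of the inverted file."""
    start, length = offsets.get(term, (0, 0))
    return inverted_file[start:start + length]


postings = build_postings(DOCS)
inverted_file, offsets = build_inverted_file(postings)

print(inverted_file)                               # one flat list of doc numbers
print(lookup("digital", inverted_file, offsets))   # documents containing "digital"
```

Here the postings list for "digital" is [1, 3], and the inverted file is one long list in which that postings list occupies a known stretch.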

Henzinger et al. - Challenges in Web Search Engines

This was an extremely good piece of writing. All of the terms were well-defined by the authors before being used in an explanation. I especially enjoyed the descriptions of text spam, link spam, and cloaking. I had never known exactly how website creators try to improve their rankings in search results; now I know!