The Scholar’s Space

Communicating research findings in a networked world

Problems and Costs of Data Preservation.

Posted by Alex Bienkowski on Apr 9th, 2008

In today’s New York Times, there is an article on the problem of long-term data preservation, and on some of the cost factors involved in any scheme to do this. Librarians are very familiar with the problems of preservation, and for a while there, a number of articles appeared in our press and in other venues, discussing what was involved. Then the topic seemed to disappear, at least in regard to preserving the scholarly record as we have come to understand it. But the same animal has returned, wearing a somewhat different hide and coloration. Interest is being focused on the enormous quantities of data produced as a result of research in the physical and social sciences. And some investigators want to “repurpose” data generated in earlier experiments, their own or someone else’s, with different endpoints or outcome measures, different analytical techniques and so forth. But as the enthusiasm for such efforts grows, the facts about data preservation emerge, or rather, re-emerge with discouraging force: long-term data preservation, in an accessible and useful format, is a real technological challenge. Moreover, whatever measures might be suggested as solutions, it’s all going to cost a great deal of money. Preservation also means more than merely dumping files someplace, even with something like intelligent and conscientious curation. A good deal of what we would call subject analysis and description (OK, metadata in today’s lingo) will be necessary. Institutional repositories, “Long Tail” marketing, Publishing on Demand (POD), digital publication in general and a lot of other goodies existing now or promised all depend on a reliable and secure “data base”, and I’m wondering if we have it or can get it at a price our institutions can, or will, pay. It’s probably time for a serious assessment of what is possible and what the price tag will be.


Encrypted Data Not Safe After All?

Posted by Alex Bienkowski on Feb 27th, 2008

In the continuing arms race between data protectors and those who don’t like things that way, encryption has been the trick-taking card. Packages that allow users to encrypt data are really quite capable, more than enough to scare off the casual cracker and quite difficult to break through even for powerful and dedicated systems. But it seems that experiments at Princeton have shown that skilled use of some simple tools can allow a hacker to recover the encryption keys from the DRAM chips in the machine. The thought used to be that the chips lost data as soon as power was shut off. But some DRAMs retain the information for seconds or even minutes after the loss of power. And chilling the chip with a blast from a can of air duster or a commercial freeze spray can extend this period even longer, long enough for a hacker to tap the chip and recover the keys. With the keys, the hacker can read the cipher with relative ease. I’m sure there will be more comment on this in the future. It all adds to the drama.


free*the*books

Posted by Georgia Harper on Dec 12th, 2007

Well, it’s official: We have launched our documentary blog for our public domain and orphan works project, free*the*books. We invite you to view and post comments! Our new blog is focused on research by the University of Texas Libraries about international copyright laws that control the use and distribution of digitized books online.

As a Google Library Partner, UT Libraries will digitize over a million books from its rich collections within the next six years. Digitization of 800,000 books in the Benson Latin American Collection began in June of this year followed by this companion project to develop an authoritative process for determining the copyright status of books published in various Latin American countries and to identify foreign works in the public domain.

We have found little guidance to help us reliably identify which of our books are already in the public domain so we are piloting a project to develop new tools for ourselves and for anyone who wants to tackle these difficult public domain problems. We will document our process, our progress and our results on the blog’s pages along with links to web resources we find useful.

The initial pages of the blog include online resources to determine critical author birth and death data, prototypes of legal evidence tables and draft guidelines by which books, wherever published, may be determined to be in the public domain.

We will be adding features, more pages and new posts to the blog on a regular basis and from time to time will also have guest contributors to add variety and fresh perspectives. We invite suggestions and comments from other Google Library Partners and anyone undertaking similar or related projects.

Email us at freethebooks@gmail.com or IM us at our Meebo widget in the sidebar of the blog. We are here; we are building an evidence base and we are looking for virtual partners!


Do You Believe in Ghosts?

Posted by Alex Bienkowski on Nov 27th, 2007

Scholars are supposed to consult “the Literature” and contribute to it as part of their professional duties. Library types are supposed to preserve it, and make sure its treasures are available for future generations. In that subsection of scholarly publishing that relates to medicine, however, there is growing reason to suspect that much of “the Literature” is simply drug company flackery; very well conceived, slickly executed flackery, it is true, but flackery nonetheless. And concern about preserving it is, to put it mildly, misplaced. “Ghost” authoring of journal articles has been known for a while, but studying it has been, for obvious reasons, rather difficult. “Opinion Leaders” are shy about admitting they allowed their names to be attached to articles that they not only didn’t write, but that were prepared by a Pharma company or its contractual agent. An article in PLoS Medicine raises the ante a good bit. Why “ghost” just the article? Why not “ghost” the whole business: trial, data gathering, write-up, the works? A little judicious steering here and there, to make sure the right things get said and the wrong things left out, can do wonders for a product launch. Read the article by Dr. Sismondo, and pay special attention to the section discussing the Sertraline trials. On the Scholar’s Space we spend a lot of time fretting about technology, assuming, operationally at least, that everything else in the hallowed research/publishing cycle is fine. What if we have it all backwards? What if the technology is the easiest, most tractable part, and it’s the rest of the process that needs worry and work?
Boo!

PS. In the context of all the hooha raised by the PRISM crowd about Peer Review and OA, notice in the PLoS article how neatly peer review has been co-opted into the “marketing process”. It’s no threat to the ghosters, and beating it is not only easy, but necessary for a successful “placement”.


Caveat Lector » Less cognitive load, faster deposit

Posted by Georgia Harper on Nov 4th, 2007

Dorothea Salo makes a great argument for streamlining the submission process to upload things to institutional archives at Caveat Lector » Less cognitive load, faster deposit. I hope our own Texas Digital Library designers are on to this one. But she also identifies the licenses as a major area for improvement. That’s something I can help with, and I’ve made a mental note about it. We don’t need no stinkin’ licenses! At least not at the item level as concerns the relationship between the scholar and the library. She’s absolutely right. We do, however, have to find a way to get Creative Commons license terms that match faculty preferences affixed to documents, but that seems like something that could be batched. But then I’m not a techie. Darn, I wish I were a techie.

She even goes so far as to suggest that maybe navigating submission processes should be offered as a service, at least in the backfile context. We are taking that approach to current submissions in a project with the School of Nursing here at UT Austin. We (Lexie, Roxanne and I) are working with the administration and several faculty members to create a process that would allow the School to submit faculty papers to PubMed Central quickly and efficiently. Faculty only publish one or two papers a year. A process you have to repeat that infrequently, especially one that can be a bit complicated at times, has to practically be “relearned” each time. On the other hand, if a departmental administrator deposits everyone’s papers as published (with required embargoes implemented for access rights), we create a more efficient process, one that faculty are more likely to take advantage of. Admittedly, there are faculty who will deposit their papers themselves and that’s fine. But there are also a lot of faculty who would be happy to hand the process off to someone else. In light of the possibility that the NIH policy suggesting submission to PubMed Central could become mandatory, if not this legislative session then sooner or later, exploring how to streamline submission is an important consideration now. Whether it’s PubMed Central or our institutional repository, there’s opportunity here to increase submissions and we ought to be taking advantage of it.

As for backfiles, she is right again. The need to batch-process these is critical. That’s the kind of thing that we can do and we can’t expect faculty to do in any effective or efficient way. Comments from our techies? Are we on this?


sustainable digital preservation and access

Posted by Roxanne Bogucka on Sep 24th, 2007

Fran Berman, of the San Diego Supercomputer Center at UC San Diego, and Brian Lavoie, a research scientist and economist with OCLC Online Computer Library Center, Inc., will co-chair the newly formed international Blue Ribbon Task Force on Sustainable Digital Preservation and Access. The NSF and the Mellon Foundation are the funders.