Under the category of “And Now For Something Completely Different,” I thought I’d just note in passing that I have a new publication up, not about e-Science or Libraries or IP, but about Television.

Yes, Television.

More specifically, the paper is about streaming television and the social affordances of the viewing space.  I started thinking about this stuff a few years ago at Michigan, and it kept popping into my head, so when the opportunity presented itself, well… I now have a paper about TV.

As an added bonus, if you happen to be in the Boston area this weekend, I’ll be presenting this paper at a free conference called Media in Transition 6, on Saturday at 1:30, on the MIT campus.

Whether or not you can make it, here’s the abstract, along with the link to the full paper, from the conference site:

Network Television Streaming Technologies and the Shifting Television Social Sphere, Elisabeth Jones

This paper builds upon and updates previous work on the social influence of television viewing to account for the novel forms of viewing provided by streaming television services like the ABC Full Episode Player and Hulu. Based on examples taken from those two interfaces, the paper details the relative affordances and constraints that streaming interfaces offer the television viewer, and points to the ways in which those factors might reshape the social impact of television. In particular, the paper highlights three potential impacts of streaming technologies: increasing television’s spatiotemporal ubiquity, shifting the social-spatial dynamics of the viewing area, and encouraging more selective – or perhaps biased – viewing behavior. These thematic findings emphasize the distinctiveness of the social phenomena surrounding streaming television relative to broadcast television and, as such, underline the need for further empirical work on the user-end impacts of streaming.

About half a decade ago, my fascination with (and anger about) issues arising from the growing push toward licensing access to library materials, rather than purchasing them outright, pushed me out of a traditional library career path and into one that would allow me to focus more deeply on explication of and advocacy for the liberties libraries have traditionally advanced.

This week, writing for the Christian Science Monitor, librarian Emily Walshe astutely and passionately points out that these issues of access vs. ownership have not disappeared in the five years since I got mad about them.  Indeed, if anything, they have only grown more widespread. In particular, Walshe compares the way that Amazon’s Kindle substitutes access for ownership to the catastrophic financial policies that have led to the recent global recession:

If our flailing economy is to teach us anything, it might be that an on-demand world of universal access (with words like lease, licensure, and liquidity) gets us into trouble. Amazon and other e-media aggregators know that digital text is the irrational exuberance of the day, and so are seizing the opportunity to codify, commodify, and control access for tomorrow. But access doesn’t “look and read” like printed paper at all – just ask any forlorn investor. Access is useless currency.

You should definitely read the article. It’s excellent.

Still, I do have one nit to pick.  That is, I’m not convinced that Ms. Walshe’s other central analogy – based on elementary school lunch-trading – quite works.  She compares trading actual books for Kindle e-books to trading a baloney sandwich for a spoon (and no pudding to eat with it).  But is that really the tradeoff? Keeping her analogical pieces, it would seem to me that the Kindle itself is the spoon – it’s the tool through which access is granted.  But the content…I’m not sure you can actually make a food analogy to the content.  Because unlike food, information is inherently non-rival – if information is pudding, everybody can simultaneously have the same pudding, and eat it too.

I pick this nit not to diminish the point made by the article (with which I entirely agree), but because I think the failure of this analogy points to a further injury to intellectual freedom caused by the trends the article describes.  Walshe’s analogy fails because information, unlike pudding, persists beyond its consumption.  If books are pudding, I could eat all the pudding in my cup a dozen times over, and then sell that pudding-cup to a used bookstore, or donate it to a library, and others could consume the contents of that very same cup hundreds or thousands more times.  If books are pudding, then, they are bottomless cups of it.

When conceived in this way, it becomes even clearer what monumental and egregious harm DRM systems like the Kindle’s may do to intellectual freedom and the maintenance of an informed populace as they become commonly accepted. When we consent to have mere access to a book rather than full rights to its contents, we turn a bottomless pudding-cup, capable of feeding limitless numbers of pudding-lovers, into a tiny, single-serve container, barely sufficient to feed even one.

This cloistering of intellectual wealth; this abridgement of our existing rights to share, lend, and resell the intellectual goods we legally purchase – it should make us angry.  We should understand what we are losing, and we should be furious. We should learn our rights, and demand that they not be taken away from us.

Again: read the article. And then consider whether by accepting a single-serve pudding cup from Amazon, we might not be imperiling the world of bottomless pudding cups we currently take for granted.

A few weeks ago, I gave two days’ worth of lectures on intellectual property to the undergraduate class I’m helping teach this quarter. They seemed to like it, and a few people have asked me to send this to them, so I figured I’d post it up here for other folks to peruse.

Note that the lecture is CC licensed the same way as this blog – I welcome reuse, in noncommercial settings, so long as I’m attributed. In that spirit, if you’d like the actual slides, I’d be happy to send them – just pop me an email (eaj6 [at] u [dot] washington [dot] edu).

[Update (same day, 7:30 pm): a friend recommended SlideShare, and I concur. New slideshow below.]

I’ve got a piece in the latest issue (the first issue, really) of Research Library Issues (previously the ARL Bimonthly Report) covering the takeaway points from last October’s forum on e-science and the future of science librarianship. The PDF has embedded audio from presentations by Rick Luce, Liz Lyon, and Cliff Lynch (oooo…multimeeeedia…) – that’s available under a CC license here.  For maximal accessibility, I’ve also HTML-ized it, below (minus the sound clips and footnotes).


Reinventing Science Librarianship:  Themes from the ARL-CNI Forum

Elisabeth Jones, PhD Student, University of Washington Information School, and Research Assistant on e-Science and Cyberinfrastructure, University of Washington Libraries

On October 16–17, 2008, more than 230 science librarians and library directors gathered at the ARL-CNI Fall Forum in Arlington, Virginia, to consider the implications of e-science and e-research for science librarians and the changing nature of their work.  The forum, “Reinventing Science Librarianship:  Models for the Future,” was orchestrated by the ARL E-Science Working Group and brought together panels of scientists, science librarians, and research library directors to address the needs of scientists working in distributed and collaborative networked environments, the priorities for retraining science librarians, and the importance of new directions in library practices.  A comprehensive collection of forum resources is available from the ARL Web site and the author’s blog; this article focuses on three thematic threads woven throughout the various panels and presentations:

  1. The Process of Reinventing Science Librarianship
  2. Serving Future Generations of Users
  3. The Librarian as Middleware

Each of these themes recurred frequently at the forum, and each represents an area of particular relevance for science librarians—and in many cases, for research librarians more generally.  For this author, the themes represent the substantive takeaway messages from the forum that should influence libraries’ next steps in responding to the needs of scientific researchers.

The Process of Reinventing Science Librarianship

Several speakers put forth ideas about what the science librarian of the near future may look like in terms of skills, capacities, and institutional positioning. Three points of general consensus emerged:  first, because scientific research is itself being transformed, science librarians (and their libraries) need to become more adaptable to changing conditions; second, in order to understand changing conditions and respond to evolving user needs, science librarians need to focus more on strategies for library service assessment, evaluation, and improvement; and finally, the fundamental role of the science librarian needs to expand to incorporate skills related to organizing and manipulating data and data sets.

At the outset of the forum, Richard (Rick) Luce emphasized that, in an era of e-science, research libraries need to become nimbler, allowing for more fluid and dynamic allocation of staff resources. Emerging forms of scientific practice will require different kinds of library support at different times.  He envisioned future science libraries that have the capacity to create multi-skilled information-management teams on the fly, embedding librarians within research teams or departments.  Science libraries must develop more flexible staffing structures in order to be more responsive to the needs of this kind of research.  This will, in turn, require highly adaptable science librarians, in terms of both skill set and attitude.

Further, as Sayeed Choudhury, Fran Berman, and others suggested, successful adaptability requires a clear sense of direction, and successful direction requires effective application of library service assessment and evaluation procedures. Institutional requirements are diverse, and ever changing.  Becky Lyon quipped, “When you’ve seen one research library, you’ve seen one research library.”  In other words, in order to know how best to serve one’s own institution, one must understand the particular needs and features of that institution.  What works at one research library will not necessarily port directly to another.  Still, as Neil Rambo suggested later in the forum, librarians should not let their institutional differences get in the way of learning from one another’s experiences.  For example, helpful models may be found in health science and medical library settings.  All of these speakers suggested that science librarians must engage in an ongoing process of measurement, assessment, and revision with regard to the services they provide—learning from and building upon the experiences of others where it is reasonable to do so.

Finally, as emphasized in particular by Liz Lyon, Catherine Blake, and Carole Palmer, many of the roles that science librarians will be called upon to play focus on data, as science becomes more data-driven itself.  Science librarians will need to become data consultants, data distributors, data service providers, data analysts, data miners, and data curators.  They will be called upon to enforce data quality, aid in data retrieval, construct data applications, and ensure that data collections are properly annotated and preserved.  This will require science librarians to repurpose and expand upon their existing competencies—especially information organization and retrieval—to meet the challenges of managing data in addition to literature and other more traditional research products.

Serving Future Generations of Users

A second recurring theme of the forum was the need to create sustainable models for data preservation and reuse.  The explosion in the volume of scientific data entails a need to both determine data selection and preservation procedures and find ways of maintaining access and usability as data management systems change.  Furthermore, lurking beneath all of these issues lies another:  how to financially sustain complex data systems over long periods of time.

One compelling strategy for developing sustainable data life-cycle solutions was voiced by William Michener early in the conference, and reiterated frequently thereafter:  discussing the issue of long-term support for scientific research, Michener asserted the need for “domain-agnostic solutions.”  That is, he contended that a single cyberinfrastructure system should be capable of supporting a range of disciplines, so that each discipline would not need to develop its own system.  Such an adaptable system would reduce the cost of both up-front development—which would require less duplication of effort—and ongoing support—since one support structure could serve many fields. Furthermore, a standardized, domain-agnostic solution would help to enhance data interoperability across domains, thus facilitating future collaboration within and across disciplines.

On a more general level, other speakers—particularly Fran Berman and Clifford Lynch—emphasized that preservation is not an end in itself, but is rather a step on the path to future reuse. Reuse of data created by others (or even by oneself) can accelerate advancement and discovery—purposes that should resonate with researchers and funders alike.  Thus, characterizing data curation in terms of reuse has two advantages:  first, it more accurately reflects the ultimate goal of such practices, elevating access and retrieval over static storage; and second, it enhances the appeal of data curation initiatives to those who are asked to contribute data and/or funding in order for those initiatives to succeed.

The Librarian as Middleware

A third theme—the librarian as middleware—was pervasive at the forum.  Rick Luce introduced the idea (and the phrase) on the first panel, and subsequent speakers offered a number of variations and elaborations on it as the forum progressed.  For the panelists, librarians became “bridges,” “facilitators,” “trusted arbiters,” and “relationship builders,” negotiating not just between people and systems, but also between systems and systems, and between people and people.

Mediating between people and systems is (or should be) a familiar role for librarians.  Whether they are helping an elementary-schooler learn to use a call number system, or assisting a chemistry professor in navigating Beilstein CrossFire, librarians serve this “middleware” role every day.  One sees a parallel, if more complex, role for science librarians in supporting e-science.  Medha Devare emphasized the key role that librarians will play in mediating between e-science systems and their users, helping individuals to effectively utilize the collaborative data sets, online simulations, virtual environments, and other technological and/or networked resources that e-science will create.  Further, as noted by Sayeed Choudhury, greater public access will entail a greater need for the mediation librarians can provide.  As more scientific data is made freely available through research enterprises like the Human Genome Project or the Sloan Digital Sky Survey, data will reach larger numbers of users dispersed across non-traditional audiences—undergraduates, K–12 students, and interested members of the public.  This expansion in access will create a parallel expansion in users’ need for help with data navigation across a range of library settings.

Somewhat less obvious, perhaps, are the ways that librarians could become middleware agents between systems and systems, and between people and people.

Several presenters, including Catherine Blake, Fran Berman, and William Michener, pointed to the need for mediation between different systems, and indicated that librarians will have an opportunity to play a strong role in this area.  In order to do so, however, librarians will need the skills to negotiate between different data systems and between different sorts and compilations of data sets.  Some key concerns in this area will be interoperability, migration, and emulation—all points at which humans must take action in order for systems to begin to talk with each other, and to remain interoperable over time.

Arguably the most important role for librarians as middleware in the e-science context, however, is mediation between people and people.  As Sayeed Choudhury pointed out, “human interoperability is more difficult than technical interoperability.”  It requires trust, common vocabulary, and negotiation of values.  And often—though not always—research librarians are uniquely well positioned to negotiate such issues within and beyond their institutions:  they can inspire the trust of a variety of actors, thus enabling them to develop a shared vocabulary and value set.  In an increasingly interdisciplinary and collaborative research environment, the capacity for expert mediation will become very important.  Indeed, some panelists’ stories suggest that it already has:  James Mullins recounted a situation at Purdue in which librarians were able to “bridge the gap” between researchers who did not have a “shared vocabulary.”  Medha Devare characterized Cornell Library’s successful leadership role in the VIVO project as a consequence of its reputation as a “trusted arbiter of information.”  Interdisciplinary collaboration among researchers is increasingly important in the virtual communities formed by networked science, but that does not mean that it will be easy. To the extent that science librarians hold positions of trust within their communities, they will be in a unique position to play mediating and facilitating roles within and between those communities.

Conclusion

Closing speaker Clifford Lynch reminded the audience that what began only a few years ago as a more limited discussion of science data curation has expanded to include the reuse of data, data management skills, cyberinfrastructure planning, interinstitutional collaboration, incorporation of smaller-scale e-science activities, and discussions of values and policies.  Rather than imagine that science librarians will have to become experts in each of these areas, however, Lynch contended that many individuals may become proficient at one or two of these newly valuable skills.

The speakers and panelists outlined an array of perspectives and issues that could redefine the roles of science libraries and librarianship, and emphasized the potentially enormous benefits of librarians becoming more familiar and engaged with the new and evolving practices of scientists and researchers.  In the near future, however, librarians’ support for e-science will most likely be defined by their “middleware” role.  By forming a bridge between and among researchers, systems, and data, librarians have an opportunity to make a significant contribution to advancement in science, e-scholarship, and research in general.

To cite this article: Elisabeth Jones.  “Reinventing Science Librarianship:  Themes from the ARL-CNI Forum.”  Research Library Issues:  A Bimonthly Report from ARL, CNI, and SPARC, no. 262 (February 2009):  12–17. http://www.arl.org/resources/pubs/rli/.

James Boyle of the Duke Law School has an awesome op-ed in today’s Financial Times about the ridiculously mislabeled “Fair Copyright in Research Works Act” (introduced, I am chagrined to say, by a Representative from a state in which I used to live – but one for whom I never voted, at least…).

Given that I’m working on a study of scientists’ perceptions of the Human Genome Project’s open data policy, and given that all of my participants so far have basically told me that open data is the greatest thing since sliced bread, I vehemently oppose this absurd piece of legislation.  But at the moment, at least, I cannot think of any more eloquent or entertaining way to make the case than Boyle’s piece.  Here’s my favorite part:

As a copyright professor, I have to say the bill is a nightmare. For reasons I won’t bore you with, its limitations on Federal agencies are completely unworkable. And as a scholar who writes about innovation, I have to say that it flies in the face of decades of research which shows the extraordinary multiplier effect of free access to information on the speed of scientific development. But speaking as a human being, I just have to wonder what could be going through a politician’s head at a moment like this. How did the dialogue go?

Staff: “Hey. Here is a Bill that 33 Nobel prize winners say will dramatically harm science. The current and former heads of NIH agree.”

Representative: “What do they know about science! Let’s endorse it.”

Staff: “A group of legal scholars says that it will mess up copyright law and undercut a central tenet of Federal information policy.”

Representative: “Pshaw. We got our copyright opinion directly from the commercial publishers. They say it will be great! Why would they lie?”

Staff: “And the patients’ rights groups say it will tragically limit patients’ access to medical studies that their own tax dollars have funded, and slow down research that could provide a cure more quickly.”

Representative: “Whiners. Since when have sick people had anything useful to teach us about medical research?”

I mean seriously, right?  What do Nobel Laureates, NIH administrators, and patients’ rights groups know about science policy, after all? It’s not like they, I don’t know, deal with these issues every day, or anything.

You should read the rest.

Indexed <3 Libraries

I always love how Jessica at Indexed makes complicated things simple – and often hilarious.  Today, Indexed takes on censorship, in a post entitled “Keep libraries free!”

Hear hear!

I heart InfoViz

This is a visualization of all the words I’ve used recently on this blog, courtesy of Wordle.net.  At the moment it’s pretty skewed towards e-Science – should be interesting to see how it shifts if I start posting more about digitization, libraries, copyright…

I think the e-Science Forum stacked the deck.

Also, I like that “Hey” is on a par with “organizations” and “virtual.” Ha!

We just added a CC license to these ‘ere talking points (Attribution-Noncommercial-Sharealike 3.0), so I thought I’d throw them up here in HTML, to complement the PDF available from ARL.  I hear they’re also being translated into Japanese!

~~~~~~~~~~~~~~~~~~~

E-SCIENCE TALKING POINTS FOR ARL DEANS AND DIRECTORS

by Elisabeth Jones, University of Washington
with contributions from
Wendy Lougee, University of Minnesota
Neil Rambo, University of Washington
Eric Celeste, Consultant to the ARL E-Science Working Group
and guidance from other members of the ARL E-Science Working Group

24 October 2008
Association of Research Libraries

TABLE OF CONTENTS

    1.    What is e-science (or e-research)?
    2.    What are the key components of the developing cyberinfrastructure?
    3.    What are the most relevant areas for library involvement in e-science projects?
    4.    What are the key issues surrounding data?
    5.    What are some examples of library involvement in the data arena? What roles are librarians and library staff fulfilling?
    6.    What impact might the rise of “virtual organizations” such as those championed by NSF have on the provision of library services?
    7.    What are the data policies of the major funding agencies?
    8.    What is the connection between Open Access and Open Data?

1.    WHAT IS E-SCIENCE (OR E-RESEARCH)?

The term “e-science” is roughly—though not precisely—synonymous with “Cyberinfrastructure;” where the latter term is prevalent in the United States, e-science predominates in the United Kingdom and elsewhere in Europe. Both terms refer to the use of networked computing technologies to enhance collaboration and innovative methods in research. “e-science,” however, has a more specific focus on scientific research, whereas Cyberinfrastructure is more inclusive of fields outside the sciences and engineering, and incorporates greater emphasis on supercomputing resources and innovation.

Some researchers favor a third term for similar efforts: “e-Research.” E-Research is more inclusive of the social sciences and humanities fields, which have also benefited from networked collaboration and investigative resources in recent years.

For e-science in particular, a frequently cited definition appears in a 2006 article by Tony Hey and Jessie Hey:

e-Science is not a new scientific discipline in its own right: e-Science is shorthand for the set of tools and technologies required to support collaborative, networked science. The entire e-Science infrastructure is intended to empower scientists to do their research in faster, better and different ways.

Further reading:

Related terms: e-Research, e-Science, Cyberinfrastructure

2.    WHAT ARE THE KEY COMPONENTS OF THE DEVELOPING CYBERINFRASTRUCTURE?

Cyberinfrastructure (CI), according to the NSF report that popularized the term, is composed of “hardware, software, services, personnel, [and] organizations” (Atkins et al., 2003: 13). That is, it incorporates not only physical technologies, but also human processes and social structures; together, these components provide a socio-technical basis for collaboration across geographic, disciplinary, and temporal divides.

A central element on the technical side of this emerging infrastructure is high-performance computing (HPC). HPC involves the use of advanced computing structures with huge amounts of processing power to churn through complex data sets and computational problems. The current state of the art in HPC includes grid computing and cloud computing, the latter built on technical foundations laid by the former.

Still, such computing infrastructure would be useless without the social elements of CI: people to develop useful systems, maintain those systems once built, and work with end-users on employing those systems efficiently in their research work. The idea of CI is to use advanced networking technologies to facilitate collaboration, data management, data analysis, communication, and dissemination across institutional and geographic borders; such technological facilitation will require a significant investment of individual and institutional commitment to system building, maintenance, and support.

Further reading:

Related terms: Cyberinfrastructure, Distributed Computing, Grid Computing, Cloud Computing

3.    WHAT ARE THE MOST RELEVANT AREAS FOR LIBRARY INVOLVEMENT IN E-SCIENCE PROJECTS?

Perhaps unsurprisingly, many of the relevant areas for library involvement in CI projects have to do with managing the large amounts of information that such projects produce. CI projects often reside in departments or institutes that lack any specific data management expertise. An important library role moving forward could be to help such departments and institutes with efficient storage, preservation, metadata creation, and access provision for the data they generate. Beyond this, libraries can develop methods for maintaining increasingly important chains of connection between publications and their data and between data and scientific workflows.

Libraries can provide researchers with valuable policy and content-management consulting services. Librarians increasingly will need to develop expertise in the areas of open access/open data issues, licensing, and data policy management in order to address the challenges that we face; this expertise will in turn become a valuable resource for researchers and research teams with questions in these areas. This would build on the expertise librarians have already developed in the area of content management and the implementation of robust models for long-term data preservation such as the Open Archival Information System (OAIS). This base of expertise could help to make librarians an excellent resource for researchers in need of centralized data support for distributed, multi-institutional teams.

The emerging cyberinfrastructure also provides excellent opportunities to build partnerships between libraries and other university units, including science research teams, campus IT, offices of sponsored research, and offices for copyright or rights management. Depending on the particular structures in place at a given university, the library could even come to play a bridging role between different stakeholders in this area, as has occurred in the context of Cornell’s VIVO project. Since science is often practiced by teams that cross disciplinary and institutional boundaries, it will also be important for libraries to help their institutions meet the needs of interdisciplinary and multi-institutional research teams.

Further reading:

Related terms: Collaborative Working Environment (CWE), Computer-Supported Collaboration (CSC), Computer-Supported Cooperative Work (CSCW), Digital Preservation, Metadata

4.    WHAT ARE THE KEY ISSUES SURROUNDING DATA?

Key issues surrounding data and e-science include:

  • Discovery and Identification: What data exist? Where are the data and how can they be accessed?
  • Access: Who has access? How will the privacy of both users and research subjects be protected? What kinds of rights management structures need to be established, if any?
  • Interoperability: In what formats will data be stored and presented? What kinds of metadata will be applied? How will variables be described? What data models apply?
  • Retention Criteria: Is the data likely to be reused? Will another researcher be able to reasonably replicate or build upon the original results using this data? What is the cost of metadata creation, and how does that compare to the expected value of the data to other researchers?
  • Migration/Preservation: Will data need to be converted or migrated in order to be usable? Will legacy system configurations need to be preserved or emulated in order to ensure long-term usability of this data?
  • Idiosyncratic practices for data management: How was the data managed in the laboratory environment? If researchers developed their own ad hoc systems, what impact will this have on how the data will need to be stored for future usability?
  • Culture of “data as private good”: On what grounds do researchers and institutions object to data sharing? Is there a sense that the data is personally or institutionally owned? Is this the case legally or ethically?

Further reading:

Related terms: Data Access, Data Management, Data Sharing, Scientific Data Archiving

5.    WHAT ARE SOME EXAMPLES OF LIBRARY INVOLVEMENT IN THE DATA ARENA? WHAT ROLES ARE LIBRARIANS AND LIBRARY STAFF FULFILLING?

Roles that libraries are already playing in the data arena:

  • Data management, including collection, organization, description, curation, archiving, and dissemination.
  • Creation of new data- and scholarship-based electronic resources for university and/or public use.
  • Development of new models, standards, and architectures for various aspects of data management, description, etc.
  • Building accessible linkages between all the components and stages of research, from data to researchers to publications.
  • Bridging institutional hierarchies and departmental divisions in service of interdisciplinary initiatives.

This is by no means an exhaustive list. A growing body of work assessing possible library roles in e-science and data initiatives, as well as the professional skill base that will be necessary to successfully perform these roles, continues to emerge from libraries and library organizations worldwide, especially in Canada, the UK, and Australia; a few recent exemplars are cited below in greater detail and in the further readings. The NSF-sponsored Science Data Literacy project at Syracuse University provides one list of opportunities.

The VIVO project emerged at Cornell in 2003, born out of a set of initiatives geared toward increasing interdisciplinary work at the university. The library became a leader in this collaborative effort, acting as a bridge between Cornell’s strongly hierarchical administration, academic departments, and research centers. In the spirit of this leadership role, the library’s Life Sciences Working Group developed VIVO as a discovery tool for both resources and potential collaborators; that is, VIVO includes not only traditional library materials like journal articles, but also links from those materials to pages for the faculty who produced them, other materials produced by the same researchers, and events related to the topic area the materials cover. To bring all of these resources together, the architects of VIVO scoured the university for datasets that they could mine and cross-reference. For example, grants information from Cornell’s Office of Sponsored Programs, journal citations from BioSis and PubMed, and researcher department and contact information from Cornell’s PeopleSoft human resources database all became part of VIVO.

The Distributed Data Curation Center (D2C2) at Purdue had a somewhat different genesis and development. Purdue University has a strong institutional orientation toward science, technology, and engineering disciplines. The D2C2 initiative sprang out of a recognition that the university’s librarians were well positioned to help researchers and interdisciplinary groups in these disciplines manage their data needs. Purdue librarians are tenure-track faculty, and this not only gained them credibility among the departmental faculty, but also made it reasonable for them to do things like sign on as co–Principal Investigators for grant proposals requiring a data sharing component. The D2C2 initiative has also led to the creation of tangible technical products such as the distributed institutional repository (DIR) framework, which “supports discovery and access to digital objects of e-research, including data and documents in various forms, formats and locations,” interoperating with other information systems and repositories through an OAI-based architecture. An especially visible output of the D2C2 efforts, Purdue e-Scholar, was built on this DIR framework; it acts as an umbrella service, including a document repository, a special collections repository, and a federation of data repositories.

Further reading:

6.    WHAT IMPACT MIGHT THE RISE OF “VIRTUAL ORGANIZATIONS” SUCH AS THOSE CHAMPIONED BY NSF HAVE ON THE PROVISION OF LIBRARY SERVICES?

Scientists have begun to work across institutional boundaries through inter-institutional or even international “collaboratories,” which provide network-enabled environments for executing particular kinds of research. A few examples:

  • The Southern California Earthquake Center (SCEC) gathers seismic data from hundreds of scientists at 46 institutions, and provides shared resources like a community modeling environment for visualizing quake impacts.
  • The NSF’s nanoHUB project provides a venue for sharing nanotechnology research resources, including simulations, presentations, and teaching tools, freely over the TeraGrid, and for communally filtering these resources so that the most useful will “rise to the top.”
  • The Humanities, Arts, Science, and Technology Advanced Collaboratory (HASTAC) links together a diverse set of more than 80 institutions—from supercomputing centers and grid infrastructure groups to museums and humanities institutes—to support education, archiving, and collaboration among those interested in the historical, social, and humanistic implications of digital technology use.

When research projects are composed of hundreds of researchers from dozens of universities, as many projects supported by virtual organizations are, librarians must work to establish services that are untethered from location, accessible broadly to researchers collaborating over the Internet. Libraries can establish their own presence in the virtual organizations relevant to their institutions (perhaps embedding chat reference services or data or repository linkages on collaboratory sites like nanoHUB), or establish “reference desks” in virtual worlds like Second Life. We can also continue promoting researcher participation in open access repositories, since these help to remove the institutional subscription barriers to electronic resource access, providing a common literature on which multi-institutional collaborations can draw.

More examples of virtual organizations

Further reading:

7.    WHAT ARE THE DATA POLICIES OF THE MAJOR FUNDING AGENCIES?

Funding agency data policies, especially in the United States, are highly dispersed, variable in their scope and specificity, and in many cases difficult to even locate. Some policies mandate data archiving, while others call only for data sharing; some exist at the highest agency level, while others are specific to particular departments or even specific projects. In May 2008 the president’s Office of Science and Technology Policy promulgated “Principles for the Release of Scientific Research Results” that may, in time, drive agencies to develop clearer policy.

The NIH policy is quite detailed for a US agency, but has raised political objections from scholarly science publishers who feel that it tramples their publication rights. The NSF’s policy has been less controversial, but remains extremely general, and lacks any specifics on archiving, metadata, or policy enforcement. The earth sciences have a reasonably well-established protocol for data sharing, thanks in part to an existing global system of data centers for this kind of information (and, one suspects, in part to the fact that geospatial data tends not to implicate human subjects issues).

Human subjects issues and proprietary data sets create larger roadblocks to data sharing in other research disciplines, particularly the health and social sciences. Nevertheless, the major US federal supporters of these types of research, NIH and NSF, continue to push forward in developing data sharing policies.

Abroad, the situation is quite different. In some countries, data policies have become national priorities: Australia, for example, recently implemented a nationwide mandate for data sharing within state-funded research.

Further reading:

A selection of data policies and similar documentation:

United States

Europe/Australia/International

  • Listing available at SHERPA JULIET, a project of Research Libraries UK (formerly CURL)

8.    WHAT IS THE CONNECTION BETWEEN OPEN ACCESS AND OPEN DATA?

Open Access and Open Data share strong ideological ties, but diverge in the content being shared and the arguments for and against such sharing.

In Open Access, the object of sharing is generally scholarly literature, conventionally defined: that is, journal articles, conference presentations, and other more or less “finished” scholarship. In Open Data the focus is different; as the name suggests, open data policies and initiatives focus on increasing access to data—that is, the underlying geospatial codes, laboratory measurements, and other “raw” information produced in the course of conducting research—so that others can review, repurpose, and/or aggregate that information to improve the quality, utility, and reach of the underlying research, or to build it into something new.

Like Open Access, Open Data has proven controversial, yet the sources of controversy differ between the two movements. For Open Access, the most forceful objections have been raised by the existing scholarly publishing industry, which objects to policies that it sees as a challenge to its business model. For Open Data, the complaints emerge not from the publishing industry, but from researchers and research institutions. Their objections are quite distinct from those leveled against Open Access. Among them:

  • Having to share data before the individual researcher/research group/institution has fully exploited it might reduce the incentive to produce the data in the first place.
  • Different legal systems afford different protections for databases and datasets; effective sharing creates thorny international intellectual property issues, and in some cases may directly clash with particular pieces of database protection legislation.
  • Particularly in medical fields and others dealing with human subjects, data sharing creates complicated confidentiality issues.
  • The formats of research datasets are insufficiently standardized to enable their integration, and attempting to increase standardization might create a disincentive for healthy variation in methodological choices.

Though the two movements arise from a common desire to broaden access to scientific work, the obstacles that they face—and the parties raising concerns about them—could hardly be more different.

Further reading:

Related terms: CODATA, Open Data

ARL E-SCIENCE WORKING GROUP (2008–09)
Wendy Pradt Lougee, Chair, University of Minnesota
Pam Bjornson, Canada Institute for Scientific and Technical Information (CISTI)
Clifford Lynch, Coalition for Networked Information
Becky Lyon, National Library of Medicine
Carol Mandel, New York University
James Mullins, Purdue University
Gary Strong, University of California, Los Angeles
Betsy Wilson, University of Washington
Eric Celeste, Consultant to the Working Group
ARL Staff Liaisons, Crit Stuart & Julia Blixrud

Hey Santa

For Christmas, can I have a copy of this?

 

Yes, this academic book. Yes, I am a nerd.

Growth

It seems like such a short time ago that I was sitting around a little round table in UM Media Relations & Public Affairs, sifting through blog posts and working out talking points “buckets” about Google Book Search, with Molly Kleinman sitting right next to me doing much the same thing.  At the time, we were both first-year Master’s students, with extraordinarily similar interests; sometimes it actually seemed like we were following each other around, because we’d end up taking the same classes, attending the same talks, arguing about the same issues with many of the same friends…

Since then, though, we’ve followed pretty different paths: I went towards academia, entering my current doctoral program and generally committing to several more years of education; Molly (before she’d even graduated, mind you) went to work for the University of Michigan Libraries, first as a Copyright Specialist, and now also as a Special Projects Librarian. 

Why am I telling you all of this?

Because Molly has been doing some really cool things lately (particularly about Creative Commons licensing in academia), and as a result has started popping up all over the web (especially in my Google Reader).  And since, ironically, I haven’t yet published much (or, ahem, at all) in my own interest area since leaving Michigan, I thought I’d put in a plug for the awesome stuff that Molly has been doing. 

In particular: 

One of these days, I hope to have a few of these of my own to point to.  And it shouldn’t be too long.  I’m working on two potentially publishable things right now: a paper/article on the ethics of Open Data mandates in science, and a summary article on Reinventing Science Librarianship. The latter should appear in the ARL Bimonthly Report before too long; the former…who knows? But I hope somewhere, if it turns out alright.

But in the meantime, read Molly’s!
