Thursday, November 8, 2012

Obama's Super Power: Compatible Data

Campaign artwork, photographed at
180 Avenue of the Americas, New York
photo: Micki McGee
Many stories will be told about Tuesday's Democratic victories across the United States, but for those of us interested in data interoperability, compatible databases are the big story. Michael Scherer at Time magazine tells the story:
"For all the praise Obama’s team won in 2008 for its high-tech wizardry, its success masked a huge weakness: too many databases. Back then, volunteers making phone calls through the Obama website were working off lists that differed from the lists used by callers in the campaign office. Get-out-the-vote lists were never reconciled with fundraising lists. It was like the FBI and the CIA before 9/11: the two camps never shared data. “We analyzed very early that the problem in Democratic politics was you had databases all over the place,” said one of the officials. “None of them talked to each other.” So over the first 18 months, the campaign started over, creating a single massive system that could merge the information collected from pollsters, fundraisers, field workers and consumer databases as well as social-media and mobile contacts with the main Democratic voter files in the swing states."
If data interoperability and analytics can help elect the president, what will they be able to do for historical and sociological research? Super powers indeed.

Monday, August 6, 2012

From SNAC to NAAC, or Toward a National Archival Authorities Infrastructure

It’s not often that one sits in a room and feels that history is being made.  But back in May I had the tingly history-is-happening-here feeling.

David Ferriero, Archivist of the United States,
introduces Daniel Pitti, seated, and launches the
National Archival Authorities Cooperative meeting.
May 21, 2012.


The occasion was a meeting at the National Archives organized by archivist Daniel Pitti and his collaborators on the SNAC project. Its purpose: to establish the critical mass to develop a National Archival Authorities Cooperative (NAAC) – a professional group to develop and maintain what librarians and archivists call “controlled vocabularies.” NAAC aims to develop an archival authorities infrastructure, or authoritative records, for archives in the United States and abroad.

On the face of it, this may not seem like a history-makin’ moment.  In fact, for most people, it might elicit a long, refreshing yawn.  But stay with me here a moment.

Controlled vocabularies are the standard names that have long allowed researchers to look up someone or something in a card catalog and have a reasonably good chance of finding relevant materials. They also serve to provide the historical context, for example, information such as a birth date that allows for disambiguation. With authoritative records, you can be sure that the John Muir you’re looking up was the grandfather of the American environmental movement, rather than the fellow who published a Volkswagen repair manual (to use an example provided by SNAC collaborator Ray Larson in his talk). 
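For the programmers in the audience, here is a minimal sketch of how an authority record can disambiguate two people who share a name. Everything in it is an assumption made for illustration: the `AuthorityRecord` class, the `auth-001`/`auth-002` identifiers, and the `lookup` function are hypothetical and do not reflect the actual SNAC or NAAC data model. The naturalist's life dates are well documented; the dates on the second record are placeholders only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AuthorityRecord:
    authority_id: str  # hypothetical identifier, not a real authority file ID
    heading: str       # the standardized ("controlled") form of the name
    born: int
    died: int
    note: str


RECORDS = [
    AuthorityRecord("auth-001", "Muir, John", 1838, 1914,
                    "naturalist and conservationist"),
    # Dates below are illustrative placeholders, not verified facts.
    AuthorityRecord("auth-002", "Muir, John", 1918, 1977,
                    "author of a Volkswagen repair manual"),
]


def lookup(heading, active_year=None):
    """Return records matching a heading, optionally narrowed by a year
    in which the person is known to have been alive."""
    matches = [r for r in RECORDS if r.heading == heading]
    if active_year is not None:
        matches = [r for r in matches if r.born <= active_year <= r.died]
    return matches


# The bare name is ambiguous; a piece of historical context resolves it.
ambiguous = lookup("Muir, John")
resolved = lookup("Muir, John", active_year=1900)
```

The name alone returns both records, while adding a single contextual fact (a year in which the person was alive) narrows the result to one. That is the job the birth date does in a card catalog.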

Ray Larson, University of California Berkeley,
describes the role of archival authorities in
disambiguating names.

So what would happen if there were uniform names for the folks in all the archival records in the United States? You’d be able to see how everyone (or at least everyone who’s recorded in an archival record) is connected to everyone else. Many projects, including SNAC, The Crowded Page, Linked Jazz, and Yaddo Circles, are working toward creating the maps (and underlying data structures) that will help us visualize the intricate links in various communities. Then – to paraphrase John Muir (the environmentalist, not the Volkswagen repair guru) – when we pick out one person by themselves we'll find them hitched to everyone else in the universe.
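To make the idea concrete, here is a rough sketch of how shared identifiers let records from separate collections join into one network. The collections, document names, and `person-N` identifiers are invented for illustration, and the functions are hypothetical; none of the projects named above is known to work this way.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical records: (collection, document, identifiers of people named).
# Because both collections use the same identifier for "person-2",
# their records can be joined without any name matching.
DOCUMENTS = [
    ("collection-a", "letter-1", ["person-1", "person-2"]),
    ("collection-b", "guest-list-7", ["person-2", "person-3"]),
]


def build_links(documents):
    """Link every pair of people who appear in the same document."""
    links = defaultdict(set)
    for _collection, _document, people in documents:
        for a, b in combinations(sorted(set(people)), 2):
            links[a].add(b)
            links[b].add(a)
    return links


def connected(links, start):
    """Return everyone reachable from `start` through any chain of links."""
    seen = {start}
    stack = [start]
    while stack:
        for neighbor in links.get(stack.pop(), ()):
            if neighbor not in seen:
                seen.add(neighbor)
                stack.append(neighbor)
    return seen - {start}


links = build_links(DOCUMENTS)
reachable = connected(links, "person-1")
```

Here `person-1` and `person-3` never appear in the same document, or even the same collection, yet they end up connected through `person-2`. If the two collections had recorded that shared person under different names, the chain would break.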

Thursday, February 16, 2012

METRO, NYPL Labs, and NYU to Collaborate on Linked Open Data Meeting


Photo via @samuel-huron on Flickr and the METRO

METRO, the Metropolitan New York Library Council, along with NYPL Labs and NYU, is hosting a Linked Open Data Meeting next Thursday, February 23rd, at the NYPL's Berger Forum Room (42nd and Fifth). For anyone interested in interoperability of archival and other records, this meeting is not to be missed. It picks up where the amazing LOD-LAM Summit of last spring left off.

Wednesday, November 16, 2011

Compatible Data Roundup

Working through ideas in the NYPL Berger Forum room: Left to right at tablet: Katy Börner (Indiana University-Bloomington), Susan Brown (University of Victoria), Rob Weidman (Lehigh University), Aditi Muralidharan (UC Berkeley), and Charles Forcey (Historicus, Inc), September 24, 2011.

The inaugural meeting of the Compatible Data Initiative at Fordham University and the New York Public Library led to a number of posts around the web from meeting participants Craig Dietrich (USC), Jon Ippolito (University of Maine) and Elizabeth Cornell (Fordham University/HASTAC scholar).  Have a look at them for reporting on that September event. Since then we've been working on next steps, so stay tuned for updates in the coming month!

Saturday, September 10, 2011

Compatible Data Meeting to Highlight Network Visualization Challenges

The Compatible Databases Initiative will meet September 23-25th in New York City to consider how to foster interoperable data for network mapping and data visualizations of intellectual and creative communities.

Data visualization leader Katy Börner, Victor H. Yngve Professor of Information Science at the School of Library and Information Science at Indiana University, will launch the weekend meeting with a public keynote address on “Envisioning Scholarly Data” on Friday evening, September 23rd at Fordham’s Lincoln Center campus.

Saturday will kick off with a keynote by Daniel Pitti, Associate Director of the Institute for Advanced Technology in the Humanities at the University of Virginia, focused on two projects he leads: the Encoded Archival Contexts Initiative and the Social Networks Archival Context Project. Pitti's talk will be followed by presentations on several data visualization and research platforms, including The Crowded Page, The Orlando Project, Phylo Project, Project RoSE, and Yaddo Circles. The weekend will conclude with a breakfast planning session on Sunday morning.

Yaddo Circles alpha. 
Programming/design: Aditi Muralidharan, Charles Forcey,
and Asik Pradhan. Project director:  Micki McGee.
Why compatible data?

At a recent meeting of digital humanities scholars, a representative of a major university library noted: “We no longer do projects based on scholars’ databases because we want to do projects that are first of a kind, not one of a kind.” The multiplicity and incompatibility of data and database formats that individuals and scholarly teams have independently developed have begun to encumber the development of the field. There is a pressing need to bring together the architects of these projects to consider how to address these questions of data(base) interoperability. The Compatible Data Initiative aims to begin this conversation.

What are our goals for this meeting?

The Compatible Databases Initiative inaugural meeting aims to identify and prioritize the key issues and needs facing the architects of database-driven digital humanities projects mapping intellectual and creative communities and other key figures in database architecture, especially: (1) archivists involved in the development of the recently released standards for archival contextual information, and (2) data visualization experts, regarding data design standards and practices. This working group will exchange best practices in this growing and innovative field and encourage all stakeholders in this area—scholars, archivists, designers, and programmers—to employ interoperable data in their new and ongoing projects.

The Crowded Page proof of concept, developed by
Andrew Jewell, Edward Whitley, and Jeff Heflin.
Questions that may be considered in the working meeting include: While many scholars continue to work in databases, what new possibilities are opened by considering data design within the context of the emerging possibilities for the semantic web and the linked open data movement? How are relationships of artistic, literary, or intellectual influence represented or recorded in a data set? What standards of evidence are required for suggesting a relationship of influence? When multiple relationships exist between two individuals, are all the relationships recorded, or are some privileged over others? How are digitally encoded data sourced back to primary documents?

Participants include:
  • Katy Börner, Indiana University-Bloomington (Spaces and Places and multiple projects)
  • Susan Brown, University of Alberta (Orlando Project, Mandala)
  • Terry Catapano, Columbia University
  • Craig Dietrich, University of Southern California (Scalar, ThoughtMESH)
  • Richard Edwards, Indiana University (Yaddo Circles, Yaddocast)
  • Charles Forcey, Historicus, Inc. (Exploring Thomas Cole, Yaddo Circles)
  • Jon Ippolito, University of Maine (ThoughtMESH, The Pool)
  • Alan Liu, UC Santa Barbara (RoSE Project)
  • Micki McGee, Fordham University (Yaddo Circles)
  • John Melson, Brown University (Women Writers Project)*
  • Aditi Muralidharan, UC Berkeley (WordSeer, Yaddo Circles, New York Times Visual Explorer)
  • Daniel Pitti, University of Virginia (SNAC: Social Network Archival Contexts Project)
  • Asik Pradhan, Indiana University (Yaddo Circles)
  • Harvey Quamen, University of Alberta*
  • Doug Reside, New York Public Library*
  • William Stingone, New York Public Library
  • Chris Alen Sula, Pratt Institute School of Library and Information Science (Phylo Project)
  • Ben Vershbow, New York Public Library
  • Robert Weidman, Lehigh University (The Crowded Page, The Vault at Pfaff’s)
  • Edward Whitley, Lehigh University (The Crowded Page, The Vault at Pfaff’s)

The Compatible Databases Initiative meeting is hosted by Fordham University and the New York Public Library in collaboration with The Corporation of Yaddo. The meeting is made possible by the generous support of the National Endowment for the Humanities, through a Digital Startup Grant to Fordham University; the New York Council for the Humanities, through a grant to the Corporation of Yaddo; Fordham University’s Dean of the Arts and Sciences Faculty; and the generosity of the New York Public Library's NYPL Labs.

For additional information, please contact Micki McGee at mmcgee [at] fordham [dot] edu.

* not confirmed

Thursday, September 1, 2011

Information Visualization Leader Katy Börner to Keynote Compatible Data Meeting

Information visualization leader
Katy Börner will keynote the
Compatible Data Meeting
Katy Börner, an international leader in information visualization, will present the keynote lecture on "Envisioning Scholarly Data" at the Compatible Data Initiative meetings at Fordham University, September 23-25th.  Börner's keynote will focus on developing visual representations of intellectual and creative communities.

Dr. Börner is the Victor H. Yngve Professor of Information Science at the School of Library and Information Science at Indiana University. She also serves as Adjunct Professor at the School of Informatics and Computing, Adjunct Professor at the Department of Statistics in the College of Arts and Sciences, Core Faculty of Cognitive Science, Research Affiliate of the Biocomplexity Institute, Fellow of the Center for Research on Learning and Technology, Member of the Advanced Visualization Laboratory, and Founding Director of the Cyberinfrastructure for Network Science Center at Indiana University.

Dr. Börner is also the curator of the Places & Spaces: Mapping Science exhibit. Her research focuses on the development of data analysis and visualization techniques for information access, understanding, and management. She is particularly interested in the study of the structure and evolution of scientific disciplines; the analysis and visualization of online activity; and the development of cyberinfrastructures for large-scale scientific collaboration and computation. Börner is the co-editor of the Springer book “Visual Interfaces to Digital Libraries” and of a special issue of PNAS on “Mapping Knowledge Domains” (2004). Her book “Atlas of Science: Guiding the Navigation and Management of Scholarly Knowledge” was published by MIT Press in 2010. She holds an MS in Electrical Engineering from the University of Technology in Leipzig (1991) and a Ph.D. in Computer Science from the University of Kaiserslautern (1997).

Dr. Börner's lecture will take place at Fordham's Lincoln Center Campus, Lowenstein Building, 12th floor on Friday evening, September 23rd at 6:30pm. The Lowenstein Building is on the northwest corner of 60th Street and 9th Avenue. ID is required for entry. This event is free, but registration is recommended. Map: Fordham University–Lincoln Center.

This event is made possible by the support of the National Endowment for the Humanities, a federal agency, through a grant to the Compatible Databases Initiative, and by the Office of the Dean of Faculty, Fordham University.

Wednesday, May 11, 2011

National Humanities Endowment Supports Compatible Data Meeting

The National Endowment for the Humanities Office of Digital Humanities awarded a Digital Start-Up grant to convene a September 2011 meeting on fostering open access interoperable data. The Compatible Data Initiative, or CompDB, was one of just 22 projects funded in the competition. CompDB aims to focus scholars working in digital network mapping projects on developing conventions that will make their data interoperable and allow for cross-project connections.

Humanities scholars, information scientists, librarians and archivists from the University of Nebraska, the University of Southern California, the University of California-Berkeley, Indiana University-Bloomington, Lehigh University, and the University of Virginia will meet to brainstorm on data standards. This project has been developed by Micki McGee and Richard Edwards in collaboration with The Corporation of Yaddo, one of America's oldest and most distinguished artists' retreats, and the New York Public Library's Division of Manuscripts and Archives, where Yaddo's records are housed.