Raymond Yee led an introductory group session on mashups and APIs, drawing examples from Google Maps and Flickr. We also had a round of lightning presentations.
How do we allow ordinary users to make mashups?
Commercial APIs are not purely open. Often they encourage a read-only mode rather than a read-write mode. They constrain opportunities for use, selectively closing things down.
It really matters what you deliver in terms of data because that constrains what people can do when they share or reuse. e.g., using a date stamp.
Why isn’t it as fun to play with humanities resources as it is to play with Flickr? We are behind the times in still trying to brand data and close off access, and this is holding us back. What need is there for resource ownership? Is it a need for authority? Identity? The question goes to the heart of the funding system and the ways that academics earn a living from decade to decade. How do we give things away and still benefit? Still maintain a brand or an identity? Researchers create data and then sit on it because they are in competition with other researchers. There are also issues of licensing, copyright, and orphan works: for repositories that hold material they do not own, providing access is one thing, but sharing is another.
Data may not be well documented. Often we don’t know the assumptions under which it was created; we don’t have data dictionaries. e.g., Wordstar format, 80 columns, ad hoc compression schemes.
Reliability, continuity, robustness. Google can do it, but we can’t. Project management and sustainability are key issues.
If we give away data we can monitor how it is used. When other people hack it, we can’t monitor that as easily. Sometimes you give away something that no one wants.
We have the opportunity to bring in philanthropists to teach us how to give things away (and maybe give us some money in the process).
Working with colleagues on Recovery.gov (US Government transparency re: Recovery Act). Web services, syndication feeds: how to help Americans understand what is happening to their money.
- Cohen: a successful API enables uses of content or services that the producer of that content or service hasn’t anticipated. Ideally, it also enables a wider range of developers to produce those new applications.
- Krishnan: the best APIs are two-way; Google Maps, for example, doesn’t allow data to flow back into its service.
- Ramsay: most humanities content providers are overly protective of their content.
- Hitchcock: agrees that there’s an economic/political issue here before you get to the tech.
- Chudnov: Library of Congress is in the business of giving everything away.
- Rockwell: worried about the robustness of academic APIs. Can we depend on them in the long run?
Our Ontario (inherited from Alouette Canada). Plug-in widgets, e.g., take lat/lon from records and send KML to Google Earth. Canadian index, Ontario view. Change URL to change the view: HTML, RSS, XML Dublin Core, RDF, JSON, MODS. Solr, UnAPI, Apache Lucene. Solr is a layer on top of Lucene that allows efficient faceting.
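The lat/lon-to-KML widget idea can be sketched in a few lines. This is a minimal illustration only, not the Our Ontario plug-in code: the record fields (`title`, `lat`, `lon`), the sample values, and the function name `records_to_kml` are all assumptions made for the example.

```python
import xml.etree.ElementTree as ET

KML_NS = "http://www.opengis.net/kml/2.2"


def records_to_kml(records):
    """Build a KML document string with one Placemark per record."""
    # Serialize with KML as the default namespace (no ns0: prefixes).
    ET.register_namespace("", KML_NS)
    kml = ET.Element(f"{{{KML_NS}}}kml")
    doc = ET.SubElement(kml, f"{{{KML_NS}}}Document")
    for rec in records:
        placemark = ET.SubElement(doc, f"{{{KML_NS}}}Placemark")
        ET.SubElement(placemark, f"{{{KML_NS}}}name").text = rec["title"]
        point = ET.SubElement(placemark, f"{{{KML_NS}}}Point")
        # KML orders coordinates as longitude,latitude (not lat,lon).
        coords = f"{rec['lon']},{rec['lat']}"
        ET.SubElement(point, f"{{{KML_NS}}}coordinates").text = coords
    return ET.tostring(kml, encoding="unicode")


# Hypothetical records; a real index would supply its own fields.
sample_records = [
    {"title": "Sample record A", "lat": 43.65, "lon": -79.38},
    {"title": "Sample record B", "lat": 45.42, "lon": -75.70},
]

kml_text = records_to_kml(sample_records)
```

The resulting string can be saved with a .kml extension and opened in Google Earth. The main pitfall is coordinate order: records usually store latitude first, while KML expects longitude first.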
Omeka is ‘WordPress for digital collections’. Plug-ins. PHP. Ask Jeremy Boggs for help with Omeka plug-ins. Data model, Dublin Core, modeled around items. Output JSON, XML, RSS2. Import CSV.
Josh Greenberg and Shekhar Krishnan
1) Digital Gallery. Undocumented JSON, Atom API. 750,000 digitized images with robust metadata.
2) Geodata. NYPL map digitizer / rectifier / warper. Easy to pull data out of the system in KML etc. GeoRSS feed. Third phase gazetteer, consume GeoRSS from Flickr, Zotero. Voicethread.com for K-12 education allows students to make annotated presentations.
3) Relation. Yaddo exhibit never really went anywhere. FOAF; relationship and social network views.
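Consuming a GeoRSS feed, as mentioned under Geodata above, might look roughly like the following. It is a sketch under stated assumptions: the feed is taken to be Atom with GeoRSS-Simple `georss:point` elements ("lat lon" separated by whitespace), the sample feed is invented for illustration, and `extract_points` is a hypothetical helper, not part of any NYPL system.

```python
import xml.etree.ElementTree as ET

NS = {
    "atom": "http://www.w3.org/2005/Atom",
    "georss": "http://www.georss.org/georss",
}

# Hypothetical Atom feed carrying GeoRSS-Simple points.
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom"
      xmlns:georss="http://www.georss.org/georss">
  <title>Sample feed</title>
  <entry>
    <title>Rectified map 1</title>
    <georss:point>40.7532 -73.9822</georss:point>
  </entry>
  <entry>
    <title>Entry without a location</title>
  </entry>
</feed>
"""


def extract_points(feed_xml):
    """Return (title, lat, lon) for each entry that has a georss:point."""
    root = ET.fromstring(feed_xml)
    points = []
    for entry in root.findall("atom:entry", NS):
        point = entry.find("georss:point", NS)
        if point is None or not point.text:
            continue  # skip entries with no location
        # GeoRSS-Simple writes a point as "latitude longitude".
        lat, lon = (float(v) for v in point.text.split())
        title = entry.findtext("atom:title", default="", namespaces=NS)
        points.append((title, lat, lon))
    return points


points = extract_points(SAMPLE_FEED)
```

Entries without a location are skipped rather than treated as errors, since feeds from general-purpose services often mix located and unlocated items.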
1) O3D. API for 3D within the browser. Needs a pretty good graphics card. Build something like Wikipedia for 3D models of historic theatres (cf. SketchUp).
Stéfan Sinclair and Geoffrey Rockwell
VOYEUR. Content-oriented (e.g., Flickr, Twitter) vs. tool-oriented (TAPoR) APIs. Large scale. First, create or customize tool. Export as HTML code that can be embedded in an iframe.
“Web as API”. The web makes a really good API all by itself. If you treat it that way, you find yourself drawn to linked data. Build apps that scale in weird ways, e.g., World Digital Library (wdl.org): 9000 requests per second, 1.5 Gigabits throughput. Chronicling America makes newspaper data more generally available: 140,000 title records and 1.5 million pages, with both OCR and images. On Chronicling America, try searching for content, finding thumbnails and viewing source. Clean URIs spell out what they do and are guaranteed to be stable. They also make the site amenable to caching and friendly to bots. There are interesting links in the header when you view source: rel links, HTML standards. Subscribe to the batch feed, which allows text mining and crawling and shows a completely raw view. You can use wget -m to slurp the raw data. Data views, alternate views: all pages that make an issue, all issues that make a batch, etc.
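The rel links in the page header are what let a program discover the alternate data views without any separate API documentation. A minimal sketch of that discovery step follows; the sample page and its example.org URLs are invented for illustration and are not the actual Chronicling America markup, and `alternate_views` is a hypothetical helper.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect (rel, type, href) from every <link> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr = dict(attrs)
        if "rel" in attr and "href" in attr:
            self.links.append((attr["rel"], attr.get("type"), attr["href"]))


# Hypothetical page head; a real site would list its own views.
SAMPLE_PAGE = """<html><head>
<title>Sample newspaper page</title>
<link rel="alternate" type="application/rdf+xml" href="https://example.org/page/1.rdf">
<link rel="alternate" type="application/json" href="https://example.org/page/1.json">
<link rel="stylesheet" href="/static/site.css">
</head><body></body></html>"""


def alternate_views(html_text):
    """Map MIME type to URL for each rel="alternate" link."""
    collector = LinkCollector()
    collector.feed(html_text)
    return {
        mime: href
        for rel, mime, href in collector.links
        if rel == "alternate"
    }


views = alternate_views(SAMPLE_PAGE)
```

A crawler can then follow whichever view it understands (RDF, JSON, and so on), which is the sense in which the web page itself acts as the API.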
Search for Tim Berners-Lee’s “Linked Data” design issues document. It is important to use URIs as names of things, use HTTP URIs, give people useful info when they visit, and give them links to related / interesting things.