[This article is also published on the Trading Consequences blog]
On June 1st, Colin Coates, Bea Alex, Jim Clifford and I presented the Trading Consequences project as part of CHESS’13, the Canadian History & Environment Summer School that took place in Nanaimo on Vancouver Island, Canada, from May 31 to June 2, 2013.
CHESS provided us with a unique opportunity to present our progress on Trading Consequences to a wider audience of environmental historians, to gain feedback on our current prototype, and to engage in a broader discussion on our general approach of combining text mining and information visualization to support research in environmental history.
As part of CHESS, we ran a half-day workshop. We first presented the goals of Trading Consequences and introduced the idea of leveraging computational methods (in our case, text mining and information visualization) to support history research and research in the humanities in general.
We then introduced our current visualization prototype to the CHESS participants (all environmental historians). We explained the visualizations’ core functionalities and how the underlying document corpus can be explored along geographical, temporal, and topical (i.e., commodity terms) dimensions.
For the rest of the workshop, the historians freely interacted with the visualization prototype in groups of two or three people. We gave them a few pointers on what to focus their exploration on. For instance, we asked them to explore the commodities “cinchona” and “cheese” and to zoom into locations that seemed of interest in the context of these commodities. Explorations were always followed by brief discussions with the entire group.
As part of their exploration, some historians immediately started to focus on Vancouver Island as the geographic location where CHESS took place, and verified the mention of commodities there that had been discussed as part of other workshop presentations. Others experimented with commodities and locations related to their own research and, from there, tried to assess the capabilities of the visualization and underlying data.
Workshop discussions focused mostly on three themes: (1) the general functionality of the prototype and what the visualizations actually represent, (2) the underlying dataset and, closely connected to this, what kind of insights can be drawn from the visualizations, and (3) the potential of our approach in general.
Comments about the Visualizations: The historians quickly understood the general purpose and functionality of the visualizations. The basic visualization components (the geographic map, the temporal bar chart, the commodity tag cloud, and the commodity graphs) were easily understood from a high level. There was some confusion, however, about lower-level details represented in the visualizations. For instance, the meaning of the size and number of clusters in the map was unclear (e.g., do they represent the number of documents, the number of occurrences of a particular commodity, or the number of commodity mentions?). Some historians tried to drill down further into the visualizations and watch the changes to answer these questions; sometimes this strategy clarified things, sometimes it added to the confusion. We gathered all comments and suggestions regarding the visualization design and are currently working on improving the prototype. One important part will be the addition of tooltips and legends to clarify the meaning of the representations.
Insights Gathered from the Visualizations: A large part of the discussions focused on what kind of insights can be gathered from the visualizations and from the data set that we are generating in Trading Consequences. Some historians made the point that what the visualizations really represent is the rhetoric around commodity trading in the 19th century: what is shown is where and when a dialogue about particular commodities took place; the visualizations do not necessarily provide information about the occurrence of commodities in certain locations or the amounts that were traded from one location to another. This raises the question of how we can clarify what exactly the visualizations represent and what kind of data they are based on (e.g., by adding more elaborate legends). One perceived strength of the visualizations that was mentioned is that they provide an overview of the documents from a meta-level, at a scale that humans do not have the capacity to take in on their own.
Reactions to our Approach: The historians at CHESS were generally positive about our approach of combining text mining and visualization to help research processes in environmental history, and they clearly saw its potential. There was some skepticism about how far a tool like this can actually produce profound outcomes (e.g., because of the noise in the data), and the stability and performance of the visualization prototype have to be improved to support a fluid “dialogue” with the data. Some historians appreciated the use of visualizations as a visual search engine that can help to identify relevant documents in the corpus. Others suggested adding visualizations that can help to analyze particular patterns in the data (e.g., relations between different commodity terms and how these have changed over time). We are currently working on visualization prototypes that focus on this latter aspect.