A central component of my CHI project is working with historical data. Creating a database from historical documents is a long and tedious process, so I have decided to use one that is already available online. A group of economic historians, working with the Rural and Environmental History Group at Wageningen University, published the African Commodity Trade Database (ACTD). One of their datasets includes more than nine thousand entries for commodities exported from Africa between the early nineteenth century and World War II. I plan to build on their work by transforming some of their data into visualizations on a website. This process involves several decisions that will shape my final visualizations and the historical context I provide to explain them. I’ve described part of that process below by highlighting some of the decisions I’ve made in selecting which data to use.

The dataset, published as an Excel spreadsheet, contains the nine thousand entries along with summaries, analysis, and associated metadata. My interest is in the primary data: the nine thousand entries, contained in one sheet, that record annual trade information for different commodities exported from African colonies. Not every entry, however, contains complete information; some lack the cost or the quantity of the commodity, for example. These omissions illustrate a challenge of creating a database from historical documents, which are rarely consistent over long periods of time. Nonetheless, the records are consistent enough that the publishers were able to convert several different currencies into British pounds. They have also standardized non-metric units of quantity into kilograms where possible.

The cost information is important to economic historians who measure the imbalance of trade between Europe and Africa in monetary terms. I have instead decided to focus on quantity information to understand how much of a commodity was exported from Africa, or from a specific colony, over time. The three main commodities that interest me (i.e. animal hides, peanuts, and kola) include metric measurements of quantity, so I decided to standardize my visualizations by using only observations with metric conversions. Of the 1,030 observations related to animal hides, for example, 774 (75%) have quantities in kilograms. Restricting the data to a single unit will let my visualizations show how export amounts changed over time on a comparable scale. However, excluding non-converted quantities also excludes specific regions. Nearly all of the animal hide observations for the Cape of Good Hope (99%) and all of those for Somalia lack the metric conversion.
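
To make this selection step concrete, here is a minimal Python sketch of how the filter and its regional side effects could be checked. It is not the ACTD's own tooling: the field names (`commodity`, `region`, `quantity_kg`), the function name `metric_coverage`, and the sample rows are all hypothetical, and the real spreadsheet columns may be named and structured differently. The sketch keeps only the observations that have a quantity in kilograms and reports, for each region, what share of its observations survive the filter.

```python
from collections import defaultdict


def metric_coverage(observations, commodity):
    """Keep rows for one commodity that have a kilogram quantity.

    Each observation is a dict with hypothetical keys 'commodity',
    'region', and 'quantity_kg' (None when no metric conversion exists).
    Returns (kept_rows, coverage), where coverage maps each region to
    the fraction of its rows that have a kilogram quantity.
    """
    totals = defaultdict(int)     # all rows per region for this commodity
    converted = defaultdict(int)  # rows per region with a kg quantity
    kept = []
    for row in observations:
        if row["commodity"] != commodity:
            continue
        totals[row["region"]] += 1
        if row["quantity_kg"] is not None:
            converted[row["region"]] += 1
            kept.append(row)
    coverage = {region: converted[region] / totals[region] for region in totals}
    return kept, coverage


# Made-up rows for illustration only; these are not values from the ACTD.
sample = [
    {"commodity": "hides", "region": "Colony A", "year": 1900, "quantity_kg": 1200.0},
    {"commodity": "hides", "region": "Colony A", "year": 1901, "quantity_kg": 1500.0},
    {"commodity": "hides", "region": "Colony A", "year": 1902, "quantity_kg": None},
    {"commodity": "hides", "region": "Colony B", "year": 1900, "quantity_kg": None},
    {"commodity": "hides", "region": "Colony B", "year": 1901, "quantity_kg": None},
    {"commodity": "peanuts", "region": "Colony A", "year": 1900, "quantity_kg": 800.0},
]

kept, coverage = metric_coverage(sample, "hides")
for region, share in sorted(coverage.items()):
    print(f"{region}: {share:.0%} of hide observations have kilograms")
```

A region whose share is zero, like the invented Colony B here, would disappear from the visualizations entirely. Running the same kind of check on the real data is how exclusions such as those for the Cape of Good Hope and Somalia become visible. The overall figure works the same way: 774 of 1,030 is about 0.75, or 75%.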

My CHI project will include a discussion of the limitations of working with the ACTD and with historical databases more generally. One clear limitation from the above example is the difficulty of standardizing measurable quantities across different regions, export practices, and documentary regimes. Although standardization is necessary for creating easy-to-read data visualizations, it inevitably privileges some of the available information while excluding the rest.