May 5, 2017

Launching the database

Before I embarked on this project, Dr. Watrall said that making a database in SQL and taking it online with PHP would involve too steep a learning curve to climb within the context of my participation in CHI this year. He was right. Ultimately, we decided that the most realistic goal would be to complete the database. The database provides a unique basis for thinking about the history of hydroelectricity in East Africa – namely, a quantitative basis – but unfortunately it is not yet freely available. Additionally, I can now use this SQL code as a template for expanding the database to include information about related environmental and economic phenomena in the Lake Victoria basin. I have not abandoned the hope of making the database accessible online, but I think that the only time-efficient way to get it there is for me to collaborate with someone who is specifically trained to do this kind of work.


Apart from the technical challenges associated with the project, there was also conceptual work to be done in order to digitize and integrate the array of sources that I used: in other words, data cleaning. Cleaning the data involved a few steps. First, I had to simplify most of the tables by disassembling them into discrete data points and reassembling them into multiple new tables. This step was necessary because the tabular data that British colonial technocrats sent to each other often consisted of complex, multifactorial tables that correlated various arrays of keys across time and space. I examined the individual data points that constituted these tables in order to distill a set of recurring keys. I used these keys as my basis for rebuilding the corpus into a database that would be amenable to computerized analysis.
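To make the disassembly step concrete, here is a minimal Python sketch of the general technique, not the code used for this project. It assumes a hypothetical wide table in which each row is a user category in a given year and each remaining column is a key; the column names (`power_hp`, `light_kw`) and every number are placeholders, not archival data.

```python
# Hypothetical wide table: one row per user category and year.
# All names and values are illustrative placeholders, not archival data.
wide_table = [
    {"user_category": "government", "year": 1, "power_hp": 10.0, "light_kw": 5.0},
    {"user_category": "railway", "year": 1, "power_hp": 20.0, "light_kw": None},
]

# Columns that identify a row rather than record a measurement (an assumption).
ID_COLUMNS = ("user_category", "year")


def to_data_points(rows, id_columns=ID_COLUMNS):
    """Disassemble wide rows into one record per (identifiers, key, value)."""
    points = []
    for row in rows:
        ids = {col: row[col] for col in id_columns}
        for key, value in row.items():
            if key in id_columns or value is None:
                continue  # skip identifier columns and empty cells
            points.append({**ids, "key": key, "value": value})
    return points


data_points = to_data_points(wide_table)
for point in data_points:
    print(point)
```

Each resulting record carries exactly one value along with the keys that locate it, which is what makes it possible to regroup the same data points into several simpler tables.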

The keys map onto four axes of differentiation. The first axis compares different categories of users, with the most significant categories being “government,” “commercial,” and “railway.” It also distinguishes between the specific users that were active in the industrial towns that, in the late colonial period, began to dominate the northern end of Lake Victoria. These users included mines, factories, and housing complexes.

The second axis of comparison distinguishes between two different ways to use electricity: power, measured in horsepower, and light, measured in kilowatts. Many electricity users deployed both forms, but many others restricted their use to only one. This axis also shows the database user that people used electricity to generate power for three specific purposes – namely, to run arc furnaces, to provide motive power, and to raise steam.

The third axis provides a basis for differentiating between production and consumption, by reminding the database user that separate entities were responsible for each action and could only ever have partial knowledge of each other. The concept of “demand” is the key link included in this database between consumers and planners.

The fourth axis includes the terms that the hydroelectricity industry used to grapple with the difficulties of managing change over time in their development projects. Industry planners used several scales of temporal cycles in order to make these changes legible. These scales included variation within a year, which could be measured in terms of variables like “peak” and “average” use. Note that some keys force the database user to consider multiple axes of differentiation at the same time. Consider “estimated_demand,” which includes a relational element as well as a temporal element; this particular key offers to the database user a quantitative description of how people thought their relationships with each other would change over time.

Taken together, this set of keys, shared across the corpus of statistical tables, can give the database user multiple points of leverage over the social and technical contours of the hydroelectric industry and its variation across time and space. The user can produce quantitative representations of changing relationships in the consumption and production of hydroelectricity during the final years of colonial rule in East Africa.
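As an illustration of the kind of query these keys make possible, the following is a minimal sketch using Python's built-in `sqlite3` module with an in-memory database. The table name `data_point`, its column names, and all of the values are assumptions made for the example; they do not reproduce the project's actual schema or any archival figures. Only the key “estimated_demand” and the user categories come from the discussion above.

```python
import sqlite3

# In-memory database so the sketch leaves nothing on disk.
conn = sqlite3.connect(":memory:")

# Hypothetical schema: one row per data point, located by its keys.
conn.execute(
    """
    CREATE TABLE data_point (
        user_category TEXT NOT NULL,
        key_name TEXT NOT NULL,
        year INTEGER NOT NULL,
        value REAL NOT NULL
    )
    """
)

# Placeholder rows; the years and values are invented for illustration.
rows = [
    ("government", "estimated_demand", 1, 100.0),
    ("commercial", "estimated_demand", 1, 50.0),
    ("government", "estimated_demand", 2, 120.0),
    ("commercial", "estimated_demand", 2, 80.0),
    ("railway", "peak", 1, 30.0),
]
conn.executemany("INSERT INTO data_point VALUES (?, ?, ?, ?)", rows)

# Total estimated demand per year across all user categories.
query = """
    SELECT year, SUM(value) AS total
    FROM data_point
    WHERE key_name = ?
    GROUP BY year
    ORDER BY year
"""
totals = conn.execute(query, ("estimated_demand",)).fetchall()
for year, total in totals:
    print(year, total)
```

Swapping the `GROUP BY` column for `user_category` would compare categories of users instead of years, which is the sense in which a single set of keys offers several points of leverage over the same data.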

