It seems like common sense to say that version control is both important and relevant in cultural heritage projects, as it is in digital projects generally.
Version control can bring greater efficiency through better organization. In recent exercises, we used GitHub to collaborate in small groups on creating websites. GitHub systematized the drafts of our work product, making it easy to see a chronology of the different drafts, the changes from draft to draft, and the individual contributions of group members in the drafting process. Additionally, by allowing controlled updates to our main drafts, GitHub’s workflow helped minimize erroneous overwrites. When erroneous overwrites did happen, the chronological record of drafts made it relatively quick to identify what went wrong and how to fix it. All these features helped improve our organization and thereby our efficiency.
In thinking about my own research, I cannot help but think that an earlier familiarity with version control would have led to better organization and efficiency there as well. In tracing legal changes, I look at federal bills, statutes, policies, and correspondence. Organizing sources by the chronology of their creation is probably the quickest and easiest way to find them later when I need to review them. However, creation dates and file numbers alone tell me nothing about the type or contents of a source. I have tried to remedy this by adding more information to file names and by placing the files in nested folders, and it has worked to some extent. Yet Microsoft Windows limits the length of a file path (260 characters by default), and I have often had to change my organizational approach because I reached that limit. At times I have also used too many folders, leading me to forget how I have organized things and which files are in which folders.
While ease of access is an important principle for organizing sources, there is another, equally important principle: replicating the locations of sources in archives. The position of a source within and among archival files can help reveal how historical actors thought about and used it. If source files are renamed and reorganized by chronology just to improve ease of access, much of that useful information will be lost to researchers. Accordingly, I have had to keep a second set of files that replicates the archival structure in which I found the sources.
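One possible way to reconcile the two principles, offered only as a minimal sketch, is to store each file once in a folder layout that mirrors the archive and to record its date, type, and a short description in a separate index, from which a chronological view can be generated on demand. Everything named here (the `Source` record, the `chronological` and `of_kind` helpers, and the sample paths and dates) is hypothetical and invented for illustration; it does not describe the actual files or method discussed above.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Source:
    """One index entry; the file itself stays where the archive put it."""
    archival_path: str  # location mirroring the archival structure
    created: date       # creation date of the original document
    kind: str           # e.g. "bill", "statute", "policy", "correspondence"
    summary: str        # short note on the contents


def chronological(sources):
    """Return a new list ordered by creation date; no files are moved or renamed."""
    return sorted(sources, key=lambda s: s.created)


def of_kind(sources, kind):
    """Return only the sources of one type, in chronological order."""
    return [s for s in chronological(sources) if s.kind == kind]


# Hypothetical entries, made up for this example.
SOURCES = [
    Source("fonds-A/vol-12/file-3/item-07.pdf", date(1927, 3, 14),
           "correspondence", "Letter on a draft amendment"),
    Source("fonds-A/vol-02/file-1/item-01.pdf", date(1920, 6, 1),
           "bill", "First reading text"),
    Source("fonds-B/vol-05/file-9/item-22.pdf", date(1924, 11, 30),
           "policy", "Departmental circular"),
]

# Print a chronological listing that still points at the archival locations.
for s in chronological(SOURCES):
    print(s.created.isoformat(), s.kind, s.archival_path)
```

Because the descriptive details live in the index rather than in file names, paths can stay short, and the index itself is a small text-like record whose changes could be tracked under version control.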
It would have been great to have learned about version control before my research project started and to have developed an efficient plan to meet my multiple organizational needs from the outset. The still-imperfect set of records that exists today is the product of trial and error, though it is functional. As I have acquired new sources, my methods of organizing them have changed, sometimes considerably. At times this organizational process has felt like substantially more work than the actual analysis of sources and the writing. Given this, it seems plain that some foundational knowledge of version control would have been both important and relevant.
There is no question that learning how to use GitHub involved a learning curve and some initial inefficiency. From a new user's standpoint, it was a lot to have to become familiar with the web, desktop, and mobile versions of GitHub, and to understand which version was needed for which task. Understanding the workflow and the jargon used to describe it also took some effort. Despite this, I think that the potential utility of GitHub far outweighs the time lost as a new user.
I maintain that version control in cultural heritage projects is both important and relevant. That said, some things about GitHub concern me. Microsoft’s ownership of GitHub does not yet seem to have negatively affected GitHub, but I would generally rather not use the products of tech giants if I can avoid it. Microsoft seems intent on pressuring users of Word and Outlook (among other products) to use its AI and to become more dependent on Microsoft products; I am interested in neither, and I wonder how much and for how long GitHub will be able to act independently of Microsoft. Additionally, the more that data is consolidated on platforms owned by tech giants, the more uncomfortable I am. While GitHub users of course retain local copies of their repositories on their hard drives in case cloud-based data becomes unavailable, I do not know enough to feel assured that Microsoft and GitHub are responsibly securing data, including from their own access to it. (GitHub urged me to submit a copy of my transcript to verify that I was a student in order to get free access to a student developer pack. What happened to this file after submission, for example?) Moreover, as GitHub continues to gain a more dominant market share (in part through Microsoft’s ownership of the company), what alternative version control services are we forgoing through the continued use of GitHub? For instance, I know many people who use the Meta-owned WhatsApp not because it is the best platform or because they trust Meta’s management, but merely because they adopted it early, most of the people they know now use it, and it is easier to stay on the same platform than to persuade everyone else to switch. How much of GitHub’s user activity is similar?
None of this is to say that GitHub is a bad option or that it should be avoided. Rather, as with any digital tool, we should be thoughtful about which tools we choose to use and what the consequences of using them are. I undoubtedly need to learn more about GitHub as well as about what other options are available. Irrespective of this, it seems clear to me that version control in cultural heritage projects is both important and relevant.