The Process of Digitization
My past couple of posts have been more on the political and ethical side of digitizing materials for the Namibia Digital Repository. This post will approach the project from the other side: the process of digitization. For those who are conducting historical research, digitizing materials is a necessity if we are ever going to finish these dissertations in an organized and structured manner. So even for those who aren’t pursuing a digitization project for their CHI Fellowship, this blog post may help you in other ways.
As of 28 January 2016, when this post was written, I have the following materials digitized and uploaded to the repository with proper annotations:
Oral History Interviews (Filmed in Digital) – 11
Historical Films and Documentaries (Digitized) – 25
Photo Prints (Scanned) – 11
Missionary Reports and Memoirs (Scanned) – 1
NGO Reports and Working Papers (Scanned) – 3
Namibian Autobiographies (Scanned) – 2
Government Publications and Political Documents (Scanned) – 10
Out of Print Books and Secondary Resources (Scanned) – 29
Dissertations (Scanned) – 3
Magazine Articles (Scanned) – 1
The following were agglomerated from other websites and uploaded to the repository with proper annotations. I will dedicate a separate post to the process of finding existing Namibiana materials on the web, so stay tuned for the February blog post.
Namibian NGO and Government Reports (Agglomerated) – 44
Foreign Institute Publications (Agglomerated) – 30
Dissertations (Agglomerated) – 2
Digitizing is a very time-consuming and labor-intensive process. While some time can be saved through clever use of software, this sort of project must be a labor of love. Otherwise you’ll just go mad.
First and foremost, you must track down these rare and out-of-print materials. As we are based in Michigan (and not Namibia), our first port of call is the MSU collections. MSU Library has one of the largest Africana collections in the USA (and even the world), and a large amount of material can be taken straight from the shelves. Alternatively, use MSU’s WorldCAT subscription to find more materials. Built into WorldCAT is an ILLiad request form, which lets you request the book or multimedia material through Interlibrary Loan. It seems absurd, but in the digital age of ProQuest, JSTOR, ProjectMUSE, etc., Interlibrary Loan has become one of the most under-utilized resources for graduate students conducting research.
I also find time to digitize materials I pick up in Namibian used bookstores, not to mention the published materials available in the National Archives of Namibia (NAN), the Namibia Wissenschaftliche Gesellschaft (NWG), and the Basler Afrika Bibliographien (BAB). I generally don’t have problems digitizing printed materials that are published and out of print, but I never make digitized archival materials and documents available. That is highly unethical unless permission is granted by the archive itself, as visits to the archive are one of the main sources of revenue for the institution. Digitized archival materials (unless produced under the auspices of the institution itself) decrease visits to the archive.
Once you have a big stack of books, magazines, films, photos, etc., the process of scanning and digitizing begins. MSU’s LEADR Lab has some of the equipment to help you digitize tapes. Last year I donated, on permanent loan, a multi-region international VHS player (these are surprisingly hard to come by), so tapes from abroad won’t be an issue. Install the ION Video 2 PC software, hook up the VCR to the computer using the USB component, and let the tape play; the output is an .avi file. Because Omeka repositories have a 1 GB file size limit, longer tapes will require post-digitization editing in Adobe Premiere Pro to reduce the file size to something more appropriate. You can also clean up the sound, picture, and static if needed.
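To get a feel for how much a long tape has to be compressed, here is a minimal Python sketch of the arithmetic. It is not part of my actual workflow, and the specifics are assumptions: the limit is treated as exactly 1000**3 bytes, 5% is held back for container overhead, and audio is assumed to be encoded at 128 kbit/s. The function name is made up for illustration.

```python
# Rough bitrate budget for keeping a digitized tape under an upload limit.
# Assumptions (not from the workflow above): 1 GB = 1000**3 bytes,
# a 5% safety margin for container overhead, and 128 kbit/s audio.

def video_bitrate_kbps(duration_minutes, limit_bytes=1000**3,
                       audio_kbps=128, margin=0.05):
    """Return the highest video bitrate (kbit/s) that fits under the limit."""
    usable_bits = limit_bytes * 8 * (1 - margin)
    total_kbps = usable_bits / (duration_minutes * 60) / 1000
    # If even the audio track alone would not fit, report zero.
    return max(total_kbps - audio_kbps, 0)


if __name__ == "__main__":
    for minutes in (30, 60, 120):
        rate = video_bitrate_kbps(minutes)
        print(f"{minutes:>3} min tape: about {rate:.0f} kbit/s for video")
```

The resulting figure is only a starting point for the export settings in whatever editor you use; the longer the tape, the lower the bitrate and the more visible the quality loss.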
With old films, I’ve run into some trouble entering proper Dublin Core bibliographic data into the Omeka repository. If you were able to find the tapes through WorldCAT, some of the data will already be in the record, although it is often incomplete. What you tend to have to do is actually watch the opening and closing credits of each tape in order to paint the fullest bibliographic picture for each entry. Sometimes you have to leave some fields blank (the director or producer is often unknown), but the more information the better.
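One way to keep track of which fields still need attention is a small checklist script. The sketch below is only an illustration, not something Omeka provides: it lists the fifteen elements of the Dublin Core Metadata Element Set and reports which ones are absent or blank in a record. The record itself is a hypothetical placeholder, not an actual repository entry.

```python
# The fifteen elements of the Dublin Core Metadata Element Set.
DUBLIN_CORE_ELEMENTS = [
    "Title", "Creator", "Subject", "Description", "Publisher",
    "Contributor", "Date", "Type", "Format", "Identifier",
    "Source", "Language", "Relation", "Coverage", "Rights",
]


def missing_elements(record):
    """Return the Dublin Core elements that are absent or blank in a record."""
    return [name for name in DUBLIN_CORE_ELEMENTS
            if not str(record.get(name) or "").strip()]


# Hypothetical entry for a digitized tape; every value is a placeholder.
example_record = {
    "Title": "Example documentary title",
    "Creator": "",          # director unknown even after watching the credits
    "Date": "1989",
    "Type": "Moving Image",
    "Format": "AVI",
    "Language": "en",
}

if __name__ == "__main__":
    print("Still to fill in:", ", ".join(missing_elements(example_record)))
```

Running a check like this before uploading makes it obvious which blanks are deliberate (an unknown director) and which were simply overlooked.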
DVDs can be digitized a bit more easily and at higher quality. A licensed copy of DVDFab can help immensely.
Books and documents are far more time-consuming and labor-intensive. While proper flatbed scanners provide higher-quality scans, they are far too slow for mass digitization. I prefer photocopy machines that have a scan function. This speeds things up enormously, although a small amount of quality might be lost. Most copiers can scan to email or USB in either greyscale or full color. I should note that these will be full-spread scans (i.e. both pages are in view). Some institutions have top-down scanners, which often have settings for page splitting; this is preferred for PDF creation. These are hard to come by, though, especially at MSU.
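If you end up with full-spread scans and no page-splitting scanner, the split can be scripted. The sketch below is a hedged illustration rather than my own method: it only computes the two crop rectangles for a spread, assuming the fold sits exactly in the middle of the image. The function name and the gutter parameter are invented for this example. The boxes use the (left, upper, right, lower) order that an imaging library such as Pillow accepts for cropping.

```python
def split_spread(width, height, gutter=0):
    """Return crop boxes (left, upper, right, lower) for the two pages.

    Assumes the fold is at the horizontal centre of the scan. `gutter` is
    the number of pixels trimmed on each side of the fold.
    """
    middle = width // 2
    if gutter < 0 or gutter >= middle:
        raise ValueError("gutter must be non-negative and less than half the width")
    left_page = (0, 0, middle - gutter, height)
    right_page = (middle + gutter, 0, width, height)
    return left_page, right_page


if __name__ == "__main__":
    # Example dimensions only; use the pixel size of your own scans.
    left, right = split_spread(3000, 2000, gutter=20)
    print("Left page box: ", left)
    print("Right page box:", right)
```

Copier scans are rarely perfectly centred, so in practice the fold position may need adjusting per book.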
One piece of software that has revolutionized mobile digitization is Office Lens (credit must go to Ethan for putting me on to it). This app flattens and auto-crops images to fit the page. For example, look at the recent repository dissertation entry titled “Landwirtschaft und ihre Nebenbetriebe in Südwestafrika”. This was a 1913 doctoral thesis from Universität zu Heidelberg on agricultural policies and production during the German colonial period. I scanned it using Office Lens while researching at the NWG in Windhoek. Rather than my having to manually crop each page and split the spread in two, Office Lens automatically detected the boundaries of each page and split the image accordingly. This Android app has also worked very well for my archival work, especially when papers are bound awkwardly. Using Office Lens reduces the amount of time I need to spend editing my files in Adobe Acrobat Pro.
It is important to be as specific as possible in entering your Dublin Core bibliographic data and in assigning search tags to each repository entry. Digitizing materials is a long, labor-intensive battle, and if the items are not organized properly, that time may be wasted.