Final Narrative Report:
INSTITUTE OF MUSEUM AND LIBRARY SERVICES
LEADERSHIP AWARD LL80066
MISSOURI BOTANICAL GARDEN
PRESERVING AND DIGITIZING COLLECTIONS IMAGES:
LINKING COLLECTIONS IMAGES AND DATABASES FOR PUBLIC ACCESS
Project Dates: December 1998 through November 2000
Narrative and Financial Reports Submitted February 05, 2001
The Missouri Botanical Garden successfully completed its IMLS project to preserve and digitize collections images and link them to a database for public access. The Research Division Web Group digitized more than 21,000 botanical images and linked them to their associated data in TROPICOS, our nomenclatural database. We are expanding the initial project into a long-term digitization program through use of equipment and staff acquired during the grant period. This program will continue to provide worldwide access to major portions of the Garden's image collections.
The Missouri Botanical Garden's herbarium contains over 80,000 types, which are specimens upon which a unique plant name is based. As of November 30, 2000, approximately 6,000 type specimens were digitized from a wide variety of neotropical plant families including the Bignoniaceae, Fabaceae, Apocynaceae and Myrsinaceae. Furthermore, over 700 new neotropical Orchidaceae types and line drawings were scanned, as well as all types included in incoming or outgoing loan material to help protect against loss of the material.
Specimen Imaging Process
The Missouri Botanical Garden began photographing type specimens from its herbarium in 1997 in an effort to link images to TROPICOS. At that time the specimens were photographed on a copystand using a 35mm camera and were then digitized and stored on a Kodak PhotoCD. Each image was edited using Adobe Photoshop to correct brightness, contrast, sharpness, and color fidelity. To make them available on the web, each image was saved as a 72 dpi JPG. These images had an average file size of 200 KB, which is relatively large for a web image. Not only did each image take a long time to download, but its pixel dimensions also exceeded those of a typical computer screen.
We began investigating high-resolution digital cameras to provide us with a much more detailed image and a more efficient capture process. We decided to purchase the Kaiser Scando dyna A+ because of its 3648 x 4625 maximum resolution and relatively low price. This camera can produce images with a file size up to 75 MB, so we also had to invest in additional hard drive space for the computer to which the camera is attached.
To make an image captured with the Kaiser camera available on the web, we had two options: either compress and shrink the image to make it a reasonable download for our users, or find a software solution that would allow us to serve the image in a multi-resolution format. We investigated several server software packages and browser plug-ins and decided to go with the MrSID technology from LizardTech, Inc. MrSID, which stands for Multiresolution Seamless Image Database, is an image format that compresses an image up to 50:1 and uses a combination of client-side and server-side software to display an image on which users can pan and zoom. We no longer have to sacrifice image quality to provide users with a highly detailed image, so we believe that the MrSID software package has been a great addition to our project.
One of our main goals at the outset of this project was to digitize slides from the collections of several prominent researchers at the Garden, including Alwyn Gentry, Ph.D., Thomas Croat, Ph.D., and Peter Goldblatt, Ph.D., among others. The slides were first examined for both their scientific importance and image clarity so that only the best images would be digitized. The slides were duplicated to create a copy we could scan, and the originals were put into cold storage. As of November 30, 2000, over 9,000 slides from the following collections had been scanned and linked to their associated data in TROPICOS. What follows is a short description of each slide collection digitized as well as a representative image.
Slide Imaging Process
Our original intent was to send the slides to a photo lab to be digitized and stored on Kodak PhotoCD. Shortly into the project we had the opportunity to borrow a slide scanner from another department and decided that it would be more cost effective if we scanned the slides ourselves. We then purchased a Nikon Coolscan 2000 with an autofeeder that allowed our Imaging Technicians to batch scan 40 slides per hour. Each image was saved as a separate, uncompressed TIFF with a resolution of 2700 dpi (the maximum output available from the scanner) and a file size of 25 MB.
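The file size reported above is roughly what the scan geometry predicts. The following is a minimal sketch, assuming a nominal 24 x 36 mm frame for a 35mm slide and 24-bit RGB color (3 bytes per pixel); neither assumption is stated in this report, and the function names are illustrative only.

```python
MM_PER_INCH = 25.4


def scan_dimensions(width_mm, height_mm, dpi):
    """Return (width_px, height_px) for scanning an area of the given size."""
    width_px = round(width_mm / MM_PER_INCH * dpi)
    height_px = round(height_mm / MM_PER_INCH * dpi)
    return width_px, height_px


def uncompressed_bytes(width_px, height_px, bytes_per_pixel=3):
    """Return the raw pixel data size, ignoring any file header overhead."""
    return width_px * height_px * bytes_per_pixel


# Assumed nominal 35mm frame (36 x 24 mm) scanned at the reported 2700 dpi.
width_px, height_px = scan_dimensions(36, 24, 2700)
size_bytes = uncompressed_bytes(width_px, height_px)
print(width_px, height_px, size_bytes)
```

Under these assumptions the full frame comes to about 29 million bytes, somewhat above the reported 25 MB. A slightly smaller scanned area, such as the portion visible inside a slide mount, could account for the difference.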
We used Adobe Photoshop to correct brightness, contrast, sharpness, and color fidelity and to save the image as a 72 dpi JPG. After we had begun using the MrSID software on type specimen images, our Imaging Coordinator and Technicians tested it on the slide images and found that not every slide collection was suitable for multi-resolution imaging. This is due to several factors, the most important being the quality of the film stock used to capture the image and the level of detail and sharpness present in the original slide. So far, only the Solomon, Croat, and Denison slide collections have warranted a multi-resolution format.
Illustrations and Additional Images
MBG Press has recently published several volumes of illustrations to accompany its Flora of China series. These illustrations were sent to us on CD by the firm printing the books and each image was cropped and saved as a 72 dpi GIF. These 3,000 images were linked to their associated data in TROPICOS in the same manner as the slides and specimens.
In addition, over 3,000 images of live plants taken in the field by various researchers were edited and put online. These images cover a wide variety of families and geographic locations.
Because our slide collections are used by staff, students, and visiting researchers, it was important to make duplicate slides for circulation and to store the originals in the appropriate manner. We purchased a Kenmore Elite 20.3 cu. ft. Upright, Frost Free Freezer to house the original slides, which were stored in archival cardboard containers, double sealed in polyethylene bags, and placed in the freezer with a humidity indicator visible inside each wrapped container.
We had planned on storing our digital master files on Kodak PhotoCD, but once we began digitizing the specimens and slides ourselves we had to implement our own archiving strategy. Our department already had a CD-RW drive at its disposal, but since we were capturing our digital master files at such a high resolution we would have had over 2,000 CDs at the completion of the project. This was not a viable option because the CD burning process was too time-consuming and unreliable for the volume at which we were producing images. Therefore, we decided to look for another media option. We were already using DAT tapes with a maximum capacity of 24 GB per tape to back up our workstations, so we devised an additional backup routine to include the images generated from this project. We know that this is not a permanent solution because of the relative instability of DAT tapes, but it was the most efficient and cost-effective solution for us based on the available software, hardware, and media when we began archiving. We have begun looking at other possibilities, which include network storage, magneto-optical drives, and DVD-RAM. We are committed to preserving our archived digital images and will continue to migrate them until we find a reliable medium.
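The difference between the two media can be illustrated with a small calculation of how many master files fit on one disc or tape. This is a minimal sketch: the 25 MB slide scans, the 75 MB maximum camera files, and the 24 GB tape capacity come from this report, while the 650 MB CD capacity and the use of decimal megabytes are assumptions. The function name is illustrative only.

```python
def files_per_medium(capacity_mb, file_size_mb):
    """Return how many whole files of file_size_mb fit in capacity_mb."""
    if file_size_mb <= 0:
        raise ValueError("file_size_mb must be positive")
    return capacity_mb // file_size_mb


CD_CAPACITY_MB = 650        # assumption: a common recordable CD capacity
DAT_CAPACITY_MB = 24_000    # 24 GB per tape, as reported

SLIDE_SCAN_MB = 25          # uncompressed TIFF from the slide scanner
SPECIMEN_IMAGE_MB = 75      # maximum file size from the camera

for label, size_mb in (("slide scan", SLIDE_SCAN_MB),
                       ("specimen image", SPECIMEN_IMAGE_MB)):
    per_cd = files_per_medium(CD_CAPACITY_MB, size_mb)
    per_tape = files_per_medium(DAT_CAPACITY_MB, size_mb)
    print(f"{label}: {per_cd} per CD, {per_tape} per DAT tape")
```

With these figures a single CD holds only a few dozen slide scans or a handful of full-size specimen images, whereas one tape holds hundreds, which is consistent with the decision described above.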
A major component of our project was the creation of an image database application that can be downloaded and installed on a PC or web server. The application stores information about each image in the collection in an Access database and uses Active Server Pages (ASP) scripts to retrieve and display the information through a web browser. The database contains 15 fields that correspond to the Dublin Core Metadata Element Set, Version 1.1 and 2 additional fields describing the image's width and height.
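The record layout described above can be sketched as a single table with one column per Dublin Core element plus the two image dimensions. This sketch is not the Garden's application: it substitutes Python and SQLite for ASP and Access so that it is self-contained, and the table name, the `dc_` column prefix, the function names, and the sample record are all hypothetical. The 15 element names are those of the Dublin Core Metadata Element Set, Version 1.1.

```python
import sqlite3

# The 15 elements of the Dublin Core Metadata Element Set, Version 1.1.
DUBLIN_CORE_ELEMENTS = (
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
)


def create_image_table(conn):
    """Create a hypothetical 'images' table: 15 DC columns plus width, height."""
    dc_columns = ", ".join(f"dc_{name} TEXT" for name in DUBLIN_CORE_ELEMENTS)
    conn.execute(
        f"CREATE TABLE images (id INTEGER PRIMARY KEY, {dc_columns}, "
        "width INTEGER, height INTEGER)"
    )


def add_image(conn, metadata, width, height):
    """Insert one image record; metadata maps DC element names to values."""
    unknown = set(metadata) - set(DUBLIN_CORE_ELEMENTS)
    if unknown:
        raise ValueError(f"not Dublin Core elements: {sorted(unknown)}")
    columns = [f"dc_{name}" for name in metadata] + ["width", "height"]
    placeholders = ", ".join("?" for _ in columns)
    cursor = conn.execute(
        f"INSERT INTO images ({', '.join(columns)}) VALUES ({placeholders})",
        [*metadata.values(), width, height],
    )
    return cursor.lastrowid


conn = sqlite3.connect(":memory:")
create_image_table(conn)
# Hypothetical sample record; the dimensions are the camera's maximum resolution.
row_id = add_image(
    conn,
    {"title": "Example type specimen", "format": "image/jpeg"},
    width=3648,
    height=4625,
)
print(conn.execute(
    "SELECT dc_title, width, height FROM images WHERE id = ?", (row_id,)
).fetchone())
```

Keeping the element names in one tuple means the table definition and the validation in `add_image` cannot drift apart, which is one reasonable way to keep such a schema aligned with the standard.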
There are very few options for writing an application that is platform independent. Java is one option since instructions are not generated for a specific platform when the program is compiled. We were looking to go one step further, though, and write an application that didn't have to be compiled at all and could easily be ported from a user's desktop to a web server. To accomplish this task, we designed an application that uses ASP technology to display content from an Access database.
This system can be downloaded from our IMLS Project web site at http://www.mobot.org/mobot/imls/database.asp and installed on a PC running either Personal Web Server or Internet Information Server/Services (free add-ons for all Windows users). A thorough analysis of our web server log files shows that over 75% of our users are using the Windows operating system. While the application we developed is not completely platform independent, it does meet the system requirements of the vast majority of our users and can be used either on a single workstation or on a web server.
We are deeply grateful to IMLS for giving us this excellent opportunity to initiate our digitization program. We hope to continue this mutually beneficial relationship by lending our expertise in reviewing proposals, assisting IMLS in any other way we can, keeping our program open to other institutions for site visits, and participating in future grant programs. We also plan to maintain the relationships we established with other institutions during this project and to form new ones as our digitization program continues.