Organization
Software
We propose that the collection be organized using the BioCASe Provider Software, as advised by the preservation plan (see Preservation section). The software is 'an XML data binding middleware for publishing data from a relational database to an information network'. Once BioCASe is installed and configured for our database, the published information will be accessible as a BioCASe web service: it can be retrieved with BioCASe protocol requests and will be freely and universally accessible on the Internet. The Provider Software is 'agnostic of the data model used for data publication and can be used in conjunction with any conceptual schema'. Its core component is the 'PyWrapper, an XML/CGI database interface written in Python that allows a standardized access to a variety of database management systems and arbitrarily structured databases'.
This tool was chosen because 'even though BioCASe can be used for any conceptual XML data schema, its main field of application is the publication of occurrence data from specimen or observational databases to primary biodiversity information networks such as the Biological Collection Access Service BioCASe network, a transnational network of primary biodiversity repositories'. The network links together 'specimen data from natural history collections, botanical/zoological gardens and research institutions worldwide with information from huge observation databases'. The data will also appear in the Global Biodiversity Information Facility (GBIF), an 'international open data infrastructure, funded by governments that allows anyone, anywhere to access data about all types of life on Earth, shared across national boundaries via the Internet'.
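To make the idea of a "BioCASe protocol request" more concrete, the following is a minimal sketch of how the simplest kind of request, one asking a provider for its capabilities, might be built in Python. The namespace URI and the element names (request, header, type) reflect our understanding of version 1.3 of the protocol and should be checked against the official protocol documentation before use. The function name is hypothetical, and no request is actually sent.

```python
import xml.etree.ElementTree as ET

# Assumption: namespace URI of BioCASe protocol version 1.3.
# Verify against the official protocol documentation before relying on it.
BIOCASE_NS = "http://www.biocase.org/schemas/protocol/1.3"


def build_capabilities_request() -> str:
    """Return the XML text of a minimal 'capabilities' request (hypothetical helper)."""
    # Serialize with a default namespace instead of an auto-generated "ns0:" prefix.
    ET.register_namespace("", BIOCASE_NS)

    request = ET.Element(f"{{{BIOCASE_NS}}}request")
    header = ET.SubElement(request, f"{{{BIOCASE_NS}}}header")
    request_type = ET.SubElement(header, f"{{{BIOCASE_NS}}}type")
    request_type.text = "capabilities"

    # encoding="unicode" yields a str rather than bytes.
    return ET.tostring(request, encoding="unicode")


if __name__ == "__main__":
    print(build_capabilities_request())
```

In practice the resulting XML would be submitted to the web service endpoint of our own BioCASe installation, whose address depends on how the Provider Software is deployed.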
Metadata
BioCASe uses a more extensive metadata schema, which should return a reasonable and satisfactory number of search results (Kennedy, 2008). The BioCASe developers have produced an MS Access-based application, the National Node Data Input Tool, to manage collection metadata.
The application collects metadata about the collection.
It also collects information about the organization doing the collecting.
Users can also add any keywords that they would like to link to their collection for searching.
Furthermore, the application takes in other related information, such as links, papers, or even other collections.
We shall adhere to the Guidelines for National Node data entry when entering data, striving to structure our data in a logical, easily understandable, and uniform manner. Another goal is to capture the hierarchical structure of the data, for example when a collection houses a sub-collection within itself. Null fields, should they exist, will be left empty as advised. The National Node Data Input Tool also allows for quality control after the initial data entry, which will help ensure the future accuracy of the collection.
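The data entry principles above (a hierarchy of collections and sub-collections, null fields left empty, and a quality-control pass after entry) can be sketched in a few lines of Python. This is only an illustration under our own assumptions: the class name CollectionRecord and its fields are hypothetical and do not reproduce the actual schema of the National Node Data Input Tool.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Iterator, Optional


@dataclass
class CollectionRecord:
    """Hypothetical collection-level metadata record (not the tool's real schema)."""

    name: str
    institution: Optional[str] = None  # None models a null field left empty
    keywords: list[str] = field(default_factory=list)
    sub_collections: list[CollectionRecord] = field(default_factory=list)

    def walk(self, depth: int = 0) -> Iterator[tuple[int, CollectionRecord]]:
        """Yield this record and all nested sub-collections with their depth."""
        yield depth, self
        for child in self.sub_collections:
            yield from child.walk(depth + 1)

    def empty_fields(self) -> list[str]:
        """List optional fields left empty, for review during quality control."""
        return [name for name in ("institution",) if getattr(self, name) is None]


def quality_report(root: CollectionRecord) -> dict[str, list[str]]:
    """Map each record name to its empty fields, skipping complete records."""
    return {
        record.name: record.empty_fields()
        for _, record in root.walk()
        if record.empty_fields()
    }
```

A parent record with one sub-collection appended to its sub_collections list would be traversed by walk() at depths 0 and 1, and quality_report() would flag only the records whose optional fields were left empty.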
The collection developer hired should be familiar with the subject material, vocabulary, and schema, as well as the content management system (Kennedy, 2008). If possible, given the scientific nature of the collection, more than one cataloger should be hired: 1) a Lepidoptera specialist for vocabulary and material knowledge, and 2) a content management system and schema expert.
We hope to use this Access database as a basis for developing a web-based metadata creation system that connects directly to the Medusa preservation system (see Preservation section) and allows for a seamless description and preservation workflow.
------
References
Kennedy, M. R. (2008). Nine questions to guide you in choosing a metadata schema. Journal of Digital Information, 9(1). Retrieved from https://journals.tdl.org/jodi/index.php/jodi/article/view/226