Small GLAM Slam Pilot 1 project update
This is a project update for the SGS Pilot 1 project, a WMF-funded project (ID: 22444585).
Two relevant spaces have been created for it:
- Phabricator: VerySmallGLAM · Workboard
- Meta: Very Small GLAM
You may notice that we are now using the newly coined term «Very Small GLAM». This will be the scope of our activity from now on, as we consider it short and precise. The Small GLAM Slam name will be kept for project ID 22444585.
What does the term Very Small GLAM refer to? It identifies GLAM entities of very small size. How small? We took inspiration from the concept of VSE (very small entities), coined in the ISO/IEC 29110 series for software development entities of up to 25 people. To set a focus we chose, somewhat arbitrarily, the number 5: an institution or team working in GLAM with up to 5 members. Very Small GLAM is not restricted to Open GLAM, but Open GLAM would probably be the best approach or complement for these teams with limited resources.
Wikimedia LEADS spin-off
An unexpected spin-off has been the conceptualization of a new initiative, Wikimedia LEADS (Learning Ecosystems and Ameliorating Data Space), which emerged while responding to a funding call from the EU’s Next Generation Internet (NGI) initiative. The goal is to develop an advanced free/open learning data space and software ecosystem for the Wikimedia Movement.
The first goal of Wikimedia LEADS is to address the learning needs of GLAM Wiki. GLAM Wiki also has a lot in common with the Europeana community.
By extension, all the practices and tools, and much of the specific content, will be applicable to all other areas of human knowledge.
Project’s activity areas
The SGS Pilot 1 is now structured into these work packages:
- WP1—IT system developed with the NAS killer concept with a free software stack for GLAM;
- WP2—configuration for a locally installed Wikibase suite;
- WP3—GLAM ontologies and vocabularies;
- WP4—GLAM practices.
- WP5—data import.
WP1—IT system, update
The selected operating system is UNRAID. The technical justifications are:
- it’s a Linux distribution;
- it features the ZFS file system, probably the best alternative for data preservation;
- it has a graphical administration interface;
- and it runs on any PC-compatible hardware.
At this point, the most important update about UNRAID is that, since the grant was approved, it has changed its licensing model and fees.
Current project hardware details are:
- we received a donated HP Z400 system, which happily includes an SSD;
- we procured 24 GB of ECC RAM (PC3-10600E), the maximum supported by the board.
In the coming days we’ll buy the three HGST hard drives and other minor components for the first tests.
For the system configuration development phase, we’ll use another, borrowed computer as a test server.
Software systems
This is a proposal of software architecture for a local installation of Wikibase in a GLAM context:
There is no progress to report on the software side yet.
WP2—Wikibase suite configuration, update
As we are not yet familiar with the Wikibase ecosystem, we are practicing by setting up some instances on Wikibase.cloud. We are also starting to identify relevant Wikibase features. We are using wikibase.world as an inventory of:
WP3—GLAM ontologies and vocabularies, update
Here we have the juiciest results so far. After a literature review, we identified the CIDOC Conceptual Reference Model (CIDOC-CRM) as an international reference model for museums. It becomes even more relevant once you find that it is being used as a reference for mapping or extension to other domains, such as CRMdig (digitization) and CRMsoc (social phenomena and constructs), which are relevant for the «Memorias del Cine» archive. Also very relevant is the availability of a CRM OWL ontology (unofficial, but apparently up to date) and some minor Wikidata mapping (https://w.wiki/9r$s, 24 items).
We also identified the Records in Contexts–Conceptual Model (RiC-CM), whose ontology is also published in OWL format. RiC-CM is a reference model for archives, and we found initial work on CRM <-> RiC-CM mapping. The current mapping with Wikidata is anecdotal (https://w.wiki/9sAh, 4 items).
In the context of practice models, we are learning about SEMAT Essence. The formal specifications are expressed in text and in a UML metamodel file (.xmi). The concept of a metamodel is practically equivalent to that of an LOD ontology. It took a while, but we now know more about how to manage these XMI formats using Magic Draw. The plan is to import the ontology into Wikibase using the same tools as for CRM and related models. We found that the Essence «Package Competency» model is relevant for populating a map of competences/abilities for the Movement, as Wikimedia LEADS proposes.
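As a rough illustration of what inspecting such an OWL file before an import could look like, here is a minimal Python sketch that lists the classes declared in an RDF/XML ontology using only the standard library. The inline sample is a hand-written fragment under a placeholder example.org namespace, not the real CIDOC-CRM file, and the function name list_owl_classes is hypothetical. A real workflow would more likely rely on dedicated RDF tooling.

```python
import xml.etree.ElementTree as ET

# Standard W3C namespaces used by OWL ontologies serialized as RDF/XML.
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
OWL = "http://www.w3.org/2002/07/owl#"

# Hand-written sample (assumption): two CRM-style classes under a
# placeholder namespace, only to exercise the parser.
SAMPLE_OWL = """<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
    xmlns:owl="http://www.w3.org/2002/07/owl#">
  <owl:Class rdf:about="http://example.org/crm/E39_Actor">
    <rdfs:label xml:lang="en">Actor</rdfs:label>
  </owl:Class>
  <owl:Class rdf:about="http://example.org/crm/E21_Person">
    <rdfs:label xml:lang="en">Person</rdfs:label>
    <rdfs:subClassOf rdf:resource="http://example.org/crm/E39_Actor"/>
  </owl:Class>
</rdf:RDF>
"""


def list_owl_classes(rdf_xml):
    """Return one dict per named owl:Class found at the top level."""
    root = ET.fromstring(rdf_xml)
    classes = []
    for node in root.findall(f"{{{OWL}}}Class"):
        iri = node.get(f"{{{RDF}}}about")
        if iri is None:
            continue  # skip anonymous classes
        label_node = node.find(f"{{{RDFS}}}label")
        label = label_node.text if label_node is not None else None
        # Keep only parents referenced by IRI (ignores nested restrictions).
        parents = [
            parent.get(f"{{{RDF}}}resource")
            for parent in node.findall(f"{{{RDFS}}}subClassOf")
            if parent.get(f"{{{RDF}}}resource") is not None
        ]
        classes.append({"iri": iri, "label": label, "parents": parents})
    return classes


for entry in list_owl_classes(SAMPLE_OWL):
    print(entry["iri"], entry["label"], entry["parents"])
```

A flat list like this is only a starting point: each entry would still need to be matched to a Wikibase item and to any existing Wikidata mapping.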
For managing this information we are getting familiar with tools like Protégé, Fuseki, Magic Draw and some others.
A very happy discovery has been a couple of tool sets for mapping and importing ontologies based on CIDOC-CRM into Wikibase. They are outputs of the SAF-Lux and GeoKB projects. We expect to make intensive use of them or their derivatives.
An open question is whether we need to create a new Wikidata property for the CRM identifier. Probably yes. I’m keeping some notes about ontologies for archives on my Wikidata user page.
WP4—GLAM practices, update
There is not much real work on this side yet, since we are not ready to start modeling with Essence. But we are collecting some relevant bibliography for the project scope:
- C. Matos, Manual práctico para la digitalización de colecciones para difusión digital, 2022.
- A. Salvador Benítez, Ed., Patrimonio fotográfico: de la visibilidad a la gestión. In Biblioteconomía y administración cultural, no. 280. Gijón: Trea, 2015.
- J. M. Sánchez Vigil, A. Salvador Benítez, and M. Olivera Zaldua, Colecciones y fondos fotográficos: criterios metodológicos, estrategias y protocolos de actuación, 1st ed. In Museología y patrimonio cultural. Gijón: Ediciones Trea, 2022.
- Collections Trust, Spectrum 5.1: UK Collections Management Standard, 2022.
- Centro de Fotografía de Montevideo, Guía del archivo fotográfico, 2017.
- L. Bountouri, Archives in the digital age: standards, policies and tools. In Chandos Information Professional Series. Cambridge, MA: Chandos Publishing, an imprint of Elsevier, 2017.
WP5—data import, update
There has been no activity. We’ll start the data import once we have a first server operating.
Dissemination
Strictly focused on the SGS Pilot 1:
- a Wikibase workshop at the University of Murcia;
- a poster session proposal for Wikimania: Towards a Very Small GLAM entities solution;
- and some proposed activities for the Wikimedia Hackathon 2024:
  - Towards a Very Small GLAM entities solution, a presentation;
  - curate features for Wikibase, a hacking session;
  - importing ontologies/vocabularies into Wikibase and Wikidata, another hacking session.
Related to Wikimedia LEADS:
- a poster session proposal for Wikimania: Wikimedia LEADS a Learning Ecosystem and Ameliorating Data Space;
- and a session proposal for the Wikimedia Hackathon: Wikimedia LEADS a Learning Ecosystem and Ameliorating Data Space.
What’s next
In the coming days we’ll procure the pending hardware components to set up the server prototype. Then we’ll define the configuration and procedure to set up an UNRAID server instance ready for data preservation tasks. After that, we’ll migrate the multimedia archive of «Memorias del Cine» to the server. The fun part, cataloging the archive in Wikibase, will start as soon as we have a stable ontology model for digital archives.
Also, in May I’ll be attending the Wikimedia Hackathon in Tallinn and the AI Sauna in Helsinki. Reach out to me there in person if you are interested in our work.
PS: Adding references to WP5 (20240430).