Topic Modeling and Facet Analysis of an Emerging Domain: Research Data Management and Data Curation
DOI: https://doi.org/10.7152/nasko.v7i1.15623
Abstract
Research data management (RDM) is often seen as the overarching field that permits research data to be managed, and is related to the field of data curation (DC), a subset of digital curation. Together, RDM and DC (RDM/DC) allow information professionals to work with clients and each other to make data available in support of the research enterprise. An emerging area of scholarly communication, RDM/DC represents a rich area of study from the perspective of knowledge organization (KO). This paper explores the following research question: What can facet analysis tell us about the emerging field of RDM/DC? First, the MAchine Learning for LanguagE Toolkit (MALLET) implementation of Latent Dirichlet Allocation (LDA) is used for topic modelling of abstracts of the RDM/DC scholarly literature. A preliminary analysis of this empirical data by the research team yields a number of topics and, when possible, their relevant aspects or contexts. Facet analysis principles are next applied to these results, producing four general facets: Practice, Stakeholders, Resources, and Study of RDM/DC; however, complex notions infused throughout the field such as “services” and “metadata” do not appear outright in the analysis. Each facet is then further explored through logical division, and the resulting system is encoded in Protégé and visualized using WebVOWL. We conclude that the major areas of emphasis in this data-intensive field will be fundamentally of interest to those in LIS, in scholarly communication, and perhaps increasingly, in KO and other fields that manage and make available data of all kinds.
Published
2019-09-23
Section
Papers
License
Authors who publish with this journal agree to the following terms:
- Authors retain copyright and grant the journal right of first publication with the work simultaneously licensed under a Creative Commons Attribution License that allows others to share the work with an acknowledgement of the work's authorship and initial publication in this journal.
- Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgement of its initial publication in this journal.
- Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published work (See The Effect of Open Access).