Repositioning the Base Level of Bibliographic Relationships: or, A Cataloguer, a Post-Modernist and a Chatbot Walk Into a Bar

Designers and maintainers of library catalogues are facing fresh challenges representing bibliographic relationships, due both to changes in cataloguing standards and to a broader information environment that has grown increasingly diverse, sophisticated and complex. This paper presents three different paradigms, drawn from three different fields of study, for representing relationships between bibliographic entities beyond the FRBR/LRM models: superworks, as developed in information studies; adaptation, as developed in literary studies; and artificial intelligence, as developed in computer science. Theories of literary adaptation remain focused on “the work,” as traditionally conceived. The concept of the superwork reminds us that there are some works which serve as ancestors for entire families of works, and that those familial relationships are still useful. Crowd-sourcing projects often make more granular connections, a trend which has escalated significantly with current and emerging artificial intelligence systems. While the artificial intelligence paradigm is proving more pervasive outside conventional library systems, it could lead to a seismic shift in knowledge organization, a shift in which the power both to arrange information and to use it is moving beyond the control of users and intermediaries alike.


Introduction
Designers and maintainers of library catalogues are facing fresh challenges representing bibliographic relationships, due both to changes in cataloguing standards and to a broader information environment that has grown increasingly diverse, sophisticated and complex. In this paper we present three different paradigms, drawn from three different fields of study, for representing relationships between bibliographic entities beyond the FRBR/LRM models: superworks, as developed in information studies; adaptation, as developed in literary studies; and artificial intelligence, as developed in computer science. While the superwork and adaptation paradigms are more consistent with current cataloguing practice, the artificial intelligence paradigm is proving more pervasive outside conventional library systems, and could lead to a seismic shift in the way professional library cataloguers identify themselves and their professional skills.

Document Families, Description and Exploitation
In addition to issues of identity as related to our understanding of ourselves and other individuals, groups, and entire cultures, knowledge organization also deals with powerful but elusive forces of identity within information itself, particularly the identity of what Smiraglia has called "document families" (Smiraglia & Leazer 1999). Ever since the earliest catalogues, libraries have dealt with the unavoidable fact that the works in their collections cohere in networks of relationships, and have striven to do justice to those relationships through cross-references, controlled access points and name authority control. These practices constitute what Patrick Wilson has called "descriptive" power: "an ability to line up a population of writings in any arbitrary order, to make the population march to one's command" (Wilson 1968, 25). Libraries exert this ability in the service of what Wilson calls "exploitative power": the ability "to make the best use of a body of writings" (25). Ever since Panizzi, libraries in the Anglo-American tradition have used a constantly evolving suite of tools and rules to heighten the user's exploitative power by creating links and connections among both bibliographic records and authority records to cause "familial" relationships to emerge.
Nonetheless, times are changing, and our understanding of bibliographic relationships is facing fresh challenges. The affordances of current large information systems (wiki resources, recommender systems in e-commerce, and search engines) have dramatically changed the exploitative power that users expect, and consequently what they need from catalogues. In particular, we've come to expect information systems to answer questions directly, rather than directing us to documents that contain those answers, resulting in what David Weinberger has perceived as a shift in the relationship between data and metadata (Weinberger 2006). Whereas we used to use the library catalogue in order to locate a copy of King Lear and subsequently encounter the phrase "How sharper than a serpent's tooth," we now also search that phrase as metadata to locate the bibliographic record that will tell us where it came from.
Along with this shift in the relationship between data and metadata, we are finding a shift in the base level of categorization in bibliographic description. In Cutter's time, "the book" was held to be the key unit; with the influence of Julia Pettee (1936) and Patrick Wilson (1968), the "book" transformed into two entities: the "bibliographic unit" and the "literary unit," or "the work." With FRBR, the two entities transformed into four: the "work," the "expression," the "manifestation" and the "item." But now we are transforming yet again: the massive increases in computing power and full-text access have exposed the wealth of relationships that lie beneath the level of the document. Patterns of quotation, citation, allusion, repetition, homage, echo, parody, pastiche and even coincidental resemblance, which were formerly far outside the cataloguer's purview, are increasingly accessible, and an increasingly important facet of exploitative power, requiring descriptive power to match.
Where is this power to come from? Some of it may indeed come from the field of knowledge organization, by stimulating potential that lies dormant in our traditions. But other fields may provide sources as well. This paper presents three possible paradigms for these bibliographic relationships, drawn from three different disciplines: knowledge organization, literary theory and computer science. In each case, we find that the "family" metaphor looks slightly different, due to differences in the relationship between descriptive and exploitative power.

Knowledge Organization and the "Superwork"
Within the field of knowledge organization, scholars have developed the concept of the "superwork," based on Ákos Domanovszky's principle of works that proceed from a common origin (1973). Extending this concept with Julia Pettee's concept of the "literary unit" (1936), Elaine Svenonius defines the superwork as an entity that contains "any number of works as subsets, the members of which while not sharing essentially the same information content are nevertheless similar by virtue of emanating from the same ur-work" (2000, 38). Smiraglia explicitly relates the superwork concept to that of the document family, suggesting that in cataloguing practice, creating a superwork is akin to creating a "family ancestor" around which the various offspring can be collected (Smiraglia 2007, 74). The superwork concept makes the most immediate sense within the context of FRBR: it provides a meaningful way to link together the different expressions and manifestations of a work in a manner that is more powerful than was possible in traditional catalogues, but which is fully consistent with the principles of traditional catalogues at least since the Paris Principles of 1961.
D. Grant Campbell & Alex Mayhew. 2023. Repositioning the Base Level of Bibliographic Relationships: or, A Cataloguer, a Post-Modernist and a Chatbot Walk Into a Bar. NASKO.
If we expand our view of relationships beyond the strict confines of FRBR, particularly with adaptations of works, the superwork takes on additional, albeit familiar, complexity. As seasoned cataloguers know, bibliographic relationships can extend beyond the traditional limit of "the work," and our older catalogue rules governing the choice of main entry were devoted to negotiating that boundary between one work and another. A French translation of Samuel Richardson's novel Clarissa, for instance, would be filed under Richardson as an expression of the work, while Margaret Doody's stage adaptation of the novel would be considered a new work, filed under Doody's name.
The superwork concept can survive this complexity, as long as the family "ancestor," as Smiraglia puts it, is sufficiently obvious and clear to function as the basis of the superwork. This is not always the case. As with humans, even the most influential works prove to be the offspring of others. Shakespeare's As You Like It is a retelling of Thomas Lodge's Rosalynde, and The Winter's Tale of Robert Greene's Pandosto. Furthermore, as with humans, works are rarely the offspring of a single ancestor. While we could justly call the stage musical Spamalot the offspring of the film Monty Python and the Holy Grail, the movie itself is the parodic offspring of multiple works, including Sir Gawain and the Green Knight and Le Morte d'Arthur. The concept of the superwork, as originally envisioned, presupposes a single work sufficiently influential and sufficiently primary to be conceived of as the source for the wealth of works and expressions that follow after it.
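The difficulty of naming a single family ancestor can be made concrete with a small sketch. The Python below is only an illustration: the derivation links are the examples mentioned above, while the dictionary layout and the function names `root_ancestors` and `has_single_ancestor` are assumptions introduced here, not part of any cataloguing standard or system.

```python
from typing import Dict, List, Set

# Derivation links taken from the examples in the text: work -> its sources.
# The structure itself is a hypothetical illustration, not a catalogue format.
DERIVED_FROM: Dict[str, List[str]] = {
    "Spamalot": ["Monty Python and the Holy Grail"],
    "Monty Python and the Holy Grail": [
        "Sir Gawain and the Green Knight",
        "Le Morte d'Arthur",
    ],
    "As You Like It": ["Rosalynde"],
    "The Winter's Tale": ["Pandosto"],
}


def root_ancestors(work: str, links: Dict[str, List[str]]) -> Set[str]:
    """Return the earliest recorded ancestors of `work`.

    A work with no recorded source counts as its own root.
    Assumes the links contain no cycles.
    """
    sources = links.get(work, [])
    if not sources:
        return {work}
    roots: Set[str] = set()
    for source in sources:
        roots |= root_ancestors(source, links)
    return roots


def has_single_ancestor(work: str, links: Dict[str, List[str]]) -> bool:
    """True when the family tree of `work` converges on exactly one root."""
    return len(root_ancestors(work, links)) == 1


if __name__ == "__main__":
    for title in ("As You Like It", "Spamalot"):
        print(title, "->", sorted(root_ancestors(title, DERIVED_FROM)))
```

On this toy data, As You Like It resolves to one root while Spamalot resolves to two, which is the case that the original superwork definition does not accommodate.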
Recent work on the superwork, moreover, is taking the concept in fresh directions. Smiraglia and Lee suggest that the "superwork" could take many forms: as a collective work under a uniform title; an original text that spawns a series of other documents; or even, in the case of the Chinese catalogue, The Seven Epitomes, a moral principle:
"The deep conviction of authorship identification in the Seven Epitomes does not extend into an organizing principle. That is, Author or Title in no manner serves as a function in organizing entries for retrieval. The only retrieval mechanism in the catalogue is a classification that is predominantly a ranking of moral values." (Smiraglia & Lee 2012)
Imagining a moral principle as the basis of a superwork takes us into very new territory, and one more consistent with modern granular principles of collection. In the case of The Seven Epitomes, the "superwork" definition holds, because the moral principle invoked in the documents can be traced back to a bibliographic entity that can stand as the superwork. But not all granular relationships do so. As we will see later, the explosion of widespread communication through social media has created a wealth of recurring patterns generated from less prominent or important sources. The image of Bernie Sanders and his mittens that saturated Facebook after the 2021 Inauguration Day is better characterized as a meme, as the term was originally coined by Richard Dawkins in The Selfish Gene (1989): an abstract concept that becomes embodied in a series of documents, in which the impact rests on the multiplying instances, rather than its reference to a significant original instance.

Literary Studies and the Theory of Adaptation
The field of literary studies has made an industry of tracing themes and motifs across time and text. Ulysses, the peripatetic hero of Homer's Odyssey, reappears in Dante's Inferno and Tennyson's Ulysses and countless other references to wandering across the sea. We have studies of the picaresque figure in fiction, of weather in Romantic poetry, of classical references in neoclassical poetry and the heroic ideal in drama. Many of them adopt the notion of the superwork: Cervantes's Don Quixote is often held as the progenitor of picaresque novels, and Richardson's Sir Charles Grandison as the embodiment of masculine virtue in novels of the succeeding decades, including those of Jane Austen. In this sense, literary scholars have spent a long time in the granular details that knowledge organization practitioners, particularly cataloguers, have only recently been able to explore.
More recent literary scholarship, however, has paradoxically stepped back from these details to look at the practice of adaptation, in which an entire text is, to some degree, a full-scale response to a previous text. Many texts have echoes of and allusions to The Odyssey, but few have actually recast the entire story within a single day's wandering through the streets of Dublin, as did James Joyce in Ulysses. In the field of literary studies, Linda Hutcheon has proposed a theory of adaptation that works on this scale, in which she attempts to challenge a long-held assumption that adaptations, by their nature, are less worthy of consideration than their originals. Hutcheon identifies three distinct but related descriptions of adaptation:
• An acknowledged transposition of a recognizable other work or works;
• A creative and an interpretive act of appropriation/salvaging;
• An extended intertextual engagement with the adapted work (Hutcheon 2006, 8)
Hutcheon's theory differs from the superwork theory in two ways. First, Hutcheon remains stubbornly focused on works in their entirety. While superwork scholars like Smiraglia and Lee are investigating the world beyond FRBR, literary adaptation theory situates itself at the boundaries that were covered by our older rules, probing the decisions and the problems that cataloguers face as a matter of course. Does Pride and Prejudice and Zombies actually share a fundamental identity with Pride and Prejudice? To what extent would we consider a printed book, its audiobook recording, its ebook manifestation and its film adaptation the same work?
Second, Hutcheon, like many literary theorists, rests her theory on a fanciful notion of dialogue: not between authors, necessarily, but between texts, suggesting that literary works, as we experience them, engage in conversation with each other. While one text may be derived from the other, the notions of "origin" and "derivation" are far less important than the concepts of interpretation and intertextual engagement. The television series The Last of Us may derive from a video game, but for post-modernist theorists like Hutcheon, the interest lies, not in seeing the video game as the origin, but rather in studying the TV series as a related entity that preserves a common identity with the video game while making deliberate alterations, resulting in a "conversation" of sorts between the two. Cataloguers have traditionally been forced to treat such relationships as problems in consistently sorting and assigning entries. Post-modern theorists like Hutcheon offer up these relationships as a creative, imaginative and often playful interaction between texts across time and space.

Computer Science and Artificial Intelligence
Artificial intelligence, in the form of intelligent agents and recently released chatbots such as ChatGPT, takes inter-textual relationships in the opposite direction: mining the data within existing documents to the point where the very definition of "document" begins to break down. In Large Language Models, the core unit is the "token": a basic unit of text or code that the model uses to process and generate language (Maeda & Bolanos 2023). Tokens are not works; nor are they expressions, manifestations or items. What is more, they are assembled not by human choice but by abstract correlation. They divide those assemblages that we call "documents" (be they items, manifestations, expressions or works) into constituent parts that can be combined and re-combined on the fly in response to direct prompts and questions. Earlier AI manifestations such as Google Home, Siri and Alexa promote an information interaction that is closer to conversation than traditional retrieval: instead of asking for a document that could answer our questions, we pose our questions directly to the agent, which extracts data from multiple sources to provide an answer. Chatbots extend this functionality with sentiment analysis and text classification to simulate human conversation (Iuchanka 2022). If the superwork concept traditionally rests on the notion of families and family ancestors, current AI treatments of "inheritance" more closely resemble the memetic concept defined by Dawkins and explored by Smiraglia and Lee. However, current Large Language Models form relationships in a way that has little to do with knowledge organization in the conventional sense. What useful connections, therefore, might we find?
There are precedents for such an approach in knowledge organization. Since the emergence of online databases in the 1970s, pre-coordinate indexing designed for printed catalogues has increasingly given way to post-coordinate indexes that enable users to combine concepts at the point of document retrieval, rather than having them combined for us by the cataloguer. Similarly, the rise of structured linked data attempts to shift the core retrieval unit from the document to the data element. In such cases, the many granular index terms that denote single concepts are ordered around a predefined structure of relationships in the form of thesauri or ontologies which enable the user to navigate along principles of hierarchy and association.
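A minimal sketch may help to show what combining concepts at the point of retrieval involves. In the Python below, each record carries only single-concept index terms, and the combination happens when the user searches. The record identifiers, the index terms and the function names `build_index` and `search_all` are hypothetical, chosen only for illustration; real retrieval systems are considerably more elaborate.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set

# Hypothetical records, each indexed with single-concept terms only.
RECORDS: Dict[str, Iterable[str]] = {
    "rec1": ["Libraries", "Automation"],
    "rec2": ["Libraries", "History"],
    "rec3": ["Automation", "Manufacturing"],
}


def build_index(records: Dict[str, Iterable[str]]) -> Dict[str, Set[str]]:
    """Map each index term (lower-cased) to the set of records that carry it."""
    index: Dict[str, Set[str]] = defaultdict(set)
    for record_id, terms in records.items():
        for term in terms:
            index[term.lower()].add(record_id)
    return dict(index)


def search_all(index: Dict[str, Set[str]], *terms: str) -> Set[str]:
    """Post-coordinate search: intersect the record sets for every term given."""
    postings = [index.get(term.lower(), set()) for term in terms]
    if not postings:
        return set()
    return set.intersection(*postings)


if __name__ == "__main__":
    index = build_index(RECORDS)
    # The user, not the cataloguer, decides to combine these two concepts.
    print(search_all(index, "Libraries", "Automation"))
```

A pre-coordinate approach would instead require the cataloguer to have anticipated the combination in advance, for instance by assigning a single compound heading to the first record.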
In addition to these professional precedents, we are witnessing a dramatic rise in the size and importance of crowd-sourced online resources such as Wikipedia: resources which link data elements together, not through the professional scaffolding of a thesaurus but through the importance and relevance perceived by those who participate in the common enterprise. The wiki site TV Tropes, for example, has grown into an enormous archive of tropes extracted from a range of media including not just live-action television but also such sources as film, theatre, anime, comic books and video games. The primary index divides tropes into four categories: Genre, Media, Narrative and Topical. Among the narrative tropes we find a "Character Flaw" index, leading to the trope of "Age Insecurity," a trope which the industrious enthusiasts have populated copiously, enabling a connection to be forged between Citizen of the Galaxy, Golden Girls, Interview with the Vampire and Auntie Mame. Such connections are inconceivable in any library catalogue of current design. They are made possible only by the active participation of interested end users, who not only contribute material but share in the tasks of curating it. In that sense, crowd-sourced projects place both exploitative and descriptive power, at least to some extent, in the hands of end users, although most of them rely on at least some intervention from information intermediaries.
However, resources such as TV Tropes and Wikipedia are only the beginning. Algorithmic methods of artificial intelligence absorb these texts, and massive amounts of other text besides, from a multitude of sources. Through a training process, the system learns to predict, given a starting token, the most likely next token, based on the statistical correlation between various strings of tokens. In so doing, such systems draw on multiple sources of information as well as sophisticated programs for ranking and weighting data and learning from previous iterations and uses. While the text itself may come from human beings, both the power to describe it and the power to exploit it have been relocated within learning systems that effectively combine these two kinds of power according to statistical processes of correlation that, while they mimic human reasoning to some extent, remain fundamentally different from human reasoning.
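The principle of predicting a next token from statistical correlation can be illustrated with a deliberately tiny sketch. The Python below counts which token follows which in a short text and predicts the most frequent follower. It is a toy bigram counter and should not be mistaken for a Large Language Model: actual systems use neural networks, subword tokens and far longer contexts. The sample text and the function names `train_bigrams` and `predict_next` are assumptions made for this illustration.

```python
from collections import Counter, defaultdict
from typing import Dict, List, Optional

# A hypothetical miniature "training corpus"; whole words stand in for tokens.
CORPUS: List[str] = (
    "the work is the work and the expression is the work".split()
)


def train_bigrams(tokens: List[str]) -> Dict[str, Counter]:
    """Count, for every token, how often each other token follows it."""
    counts: Dict[str, Counter] = defaultdict(Counter)
    for current, following in zip(tokens, tokens[1:]):
        counts[current][following] += 1
    return dict(counts)


def predict_next(counts: Dict[str, Counter], token: str) -> Optional[str]:
    """Return the most frequent follower of `token`, or None if it was never seen."""
    followers = counts.get(token)
    if not followers:
        return None
    return followers.most_common(1)[0][0]


if __name__ == "__main__":
    model = train_bigrams(CORPUS)
    print(predict_next(model, "the"))  # "work" follows "the" most often here
```

Even at this scale, the prediction depends on nothing but counted co-occurrence: no notion of work, expression, manifestation or item enters into it.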
Librarians, then, have always known that documents have family resemblances and family relationships. Literary theorists have always held that documents have conversations as well: intertextual interactions that play out among those who read them in various sequences and at various times. Crowd-sourcing projects detect and enumerate relationships on a scale beyond the scope of bibliographic description. Artificial intelligence practice has shown us also that documents can emerge, adapt and change, through systems that can mimic both description and exploitation in ways that we do not fully understand.

Implications: Adaptation
In terms of current, conventional cataloguing practice, Hutcheon's literary model of adaptation is by far the easiest to implement, and indeed has been implemented at least since the days of the Anglo-American Cataloguing Rules. Cataloguers have always been able to provide cross-references in bibliographic records to records for related works such as adaptations. With the advent of RDA, relationship designators provided cataloguers with the means of far more informative cross-references; the latest iteration of RDA has gone further and turned relationships into entities in their own right. It is possible to create semantically meaningful cross-references between texts, without suggesting that one has predominance over the other.
The adaptation paradigm, therefore, represents an enhanced descriptive power: the capacity to create links between texts that would enhance the exploitative power of catalogue users. However, this descriptive power typically lies beyond the formal training of most cataloguers; it evolves within specialized knowledge domains, such as literary studies. Some form of collaboration between cataloguers and subject or domain specialists would be needed to implement such connections in a meaningful and consistent way.

Implications: Superworks
Superworks have always had a steady supply of devices within cataloguing rules to support them: devices which online environments have frequently failed to exploit effectively. In card cataloguing environments, the main entry rules were designed to ensure that works were collocated under the author, together with the editions of each work. While online catalogue interfaces have provided uneven support for such collocation, the FRBR-ization of cataloguing has created support for future interfaces that can represent works, particularly those that stand in the relation of superwork to others, more effectively. RDA's definition of "entity supertype" (Oliver 2021) can potentially serve as the means of defining a superwork. With new interfaces and new encoding methods, it is possible that the relationships within the FRBR paradigm (collecting expressions of works and manifestations of expressions) will gain a richer descriptive power, enhancing the navigation of the catalogue that forms the basis of users' exploitative power. Such a practice, however, if extended beyond the boundaries of FRBR to embrace wider family relationships, would also require further articulation of policies. What constitutes the official "family ancestor" that begins it all? How far back does one go in a bibliographic universe in which every work, like Dickens's Chuzzlewit family, can boast of an inheritance that stretches back to Adam and Eve?

Implications: AI
Adoption of AI paradigms presents by far the most formidable challenge to current cataloguing practice, and one which lies beyond our current practices of descriptive cataloguing. Adopting a crowd-sourcing approach would involve identifying recurring component memes and tropes within documents, in a fashion similar to Vladimir Propp's enumeration of features of fairy tales (1968), or the Aarne-Thompson-Uther Index of folktale types. Such components would be encoded into the records to enable intelligent agents to construct flexible and intricate relationship networks.
While this clearly lies beyond the capabilities of current technical services departments, there are precedents for alternative configurations that might make such a scenario possible. The possibility of integrating linked data resources into bibliographic description has been a topic of interest for some time (Campbell & Fast 2004). The advent of wiki technology makes it much easier to create RDF-enabled data stores through crowd-sourced environments such as TV Tropes. If it were possible to use a lamination service to orient bibliographic records around such data when needed (Campbell & Mayhew 2020), the task of cataloguers might conceivably change to one of curating and analyzing the credibility and reliability of such data sources, rather than laboriously attempting to replicate that data within catalogue records.
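The idea of encoding recurring tropes so that software can assemble relationship networks from them can be sketched in a few lines. The Python below uses the "Age Insecurity" example cited earlier from TV Tropes; the fifth record, its trope, the data layout and the function names `trope_index` and `related_works` are hypothetical additions for illustration, and the sketch makes no claim about how TV Tropes, RDF stores or a lamination service actually hold such data.

```python
from collections import defaultdict
from typing import Dict, Set

# Titles and the "Age Insecurity" trope come from the example in the text.
# The last entry is a hypothetical record added to show a non-match.
RECORD_TROPES: Dict[str, Set[str]] = {
    "Citizen of the Galaxy": {"Age Insecurity"},
    "Golden Girls": {"Age Insecurity"},
    "Interview with the Vampire": {"Age Insecurity"},
    "Auntie Mame": {"Age Insecurity"},
    "Hypothetical Title": {"Hypothetical Trope"},
}


def trope_index(records: Dict[str, Set[str]]) -> Dict[str, Set[str]]:
    """Invert the records: map each trope to the titles that exhibit it."""
    index: Dict[str, Set[str]] = defaultdict(set)
    for title, tropes in records.items():
        for trope in tropes:
            index[trope].add(title)
    return dict(index)


def related_works(records: Dict[str, Set[str]], title: str) -> Dict[str, Set[str]]:
    """Return every other title sharing a trope with `title`, with the shared tropes."""
    index = trope_index(records)
    related: Dict[str, Set[str]] = defaultdict(set)
    for trope in records.get(title, set()):
        for other in index[trope]:
            if other != title:
                related[other].add(trope)
    return dict(related)


if __name__ == "__main__":
    for other, shared in sorted(related_works(RECORD_TROPES, "Auntie Mame").items()):
        print(other, sorted(shared))
```

The links are never stored in the records themselves; they are computed from shared components, which is what distinguishes this approach from a cross-reference entered by a cataloguer.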
Perhaps the most important link between such crowd-sourced techniques and traditional cataloguing lies in the fact that they preserve the visibility of both descriptive and exploitative power. Unlike the adaptation paradigm, which places descriptive power in the hands of subject specialists, or the superwork paradigm, which places it in the hands of descriptive cataloguers, resources which rely on very simple technology to foster widespread user participation place both descriptive power and exploitative power in the hands of the users who create, contribute to and maintain such resources. And because of the visibility of this power, it is possible, at least in theory, for information professionals to serve as intermediaries that can facilitate the best of group commitment and creativity while curtailing some of its negative effects.
Resources based on algorithms, such as intelligent agents and other AI innovations that are now surfacing, bear a certain resemblance to crowd-sourced resources and linked data archives, in the sense that they too work on a far more granular level than does conventional cataloguing, parsing larger document assemblages and extracting smaller units of data to be combined in various ways. However, the opacity of many algorithms, and the inability of even their creators to explain how they work or predict their behaviour, means that the descriptive power that they wield, the power to arrange and present information, is outside our grasp or control. Furthermore, many of these algorithmically based programs have been built for the purpose of advertising, or otherwise enriching their corporate owners, rather than serving the end user's own goals. As such, the exploitative power of such programs, the ultimate effective use of the information, is equally elusive.

Conclusion
Knowledge organization professionals have various options for responding to recent changes, some developed within their own discipline, and some evolving beyond it. In this paper, we would argue that however we choose to represent and facilitate bibliographic relationships that extend beyond the FRBR paradigm (through the superwork, through theories of adaptation, through crowd-sourcing projects or through AI developments), it is essential that we monitor who has the power to describe information, and who has the power to exploit it. Equally important, we must ensure, as far as possible, that such power remains visible. Theories of literary adaptation remind us that, despite the current fascination with granular manipulation and access, there are occasions when "the work," as traditionally conceived, remains deeply useful and relevant. The concept of the superwork reminds us that there are some works which serve as ancestors for entire families of works, and that those familial relationships are still useful. Crowd-sourcing projects have, with the right structure and design, significant potential for harnessing widespread participation.
And what of the new AI developments? In a recent article for The Walrus, Nicholas Hune-Brown writes of how libraries defied predictions of "the death of the library" in the wake of new technologies and transformed themselves to meet new challenges and fill new roles (2023, 63). With the advent of these new developments in information generation and information control, it is tempting to cite such precedents as a means of reassuring ourselves that we've met challenges before, and will do so again. At the same time, the Future of Life Institute, in its celebrated open letter, warns us that developments in AI could constitute "a profound change in the history of life on Earth" (2022). We are facing an information environment in which both of Wilson's two kinds of power, the power to arrange information and the power to use it, are sliding out of our grip.