Large Language Models (LLMs) and Cataloging: Exploring How ChatGPT and Copilot Assign Subject Headings and Call Numbers

Authors

  • Chris Holstrom, University of Washington

DOI:

https://doi.org/10.7152/nasko.v7i1.95648

Abstract

Large Language Models (LLMs) have demonstrated some facility in language- and knowledge-intensive tasks that require domain knowledge, such as writing and computer programming. These syntactic facilities suggest that general-purpose LLMs might be able to perform subject cataloging tasks like assigning subject headings and class numbers. This paper investigates how two commercially available LLMs (ChatGPT and Copilot) assign subject headings using the Library of Congress Subject Headings (LCSH) and the Sears List of Subject Headings, class numbers using Library of Congress Classification (LCC) and Dewey Decimal Classification (DDC), item numbers using Cutter numbers, and MARC fields for subject headings and call numbers. The paper finds that the LLMs show promise as automated catalogers but exhibit numerous shortcomings, including lacking specificity, using unauthorized terms, incorrectly assembling synthetic headings, assigning inaccurate headings and classes, and formatting MARC records incorrectly. Based on these findings, potential cataloging applications for current LLMs are primarily as aids and teaching tools, not as fully automated cataloging solutions. Additionally, collections considering LLMs for cataloging tasks should be aware of issues associated with these technologies, including environmental harm, de-skilling, intellectual theft, and bias.

Published

2025-09-05