The Future of Undifferentiated Personal Name Authority Records and Other Implications for PCC Authority Work

John J. Riemer
Philip E. Schreur

February 14, 2012; end notes updated May 8, 2012

Part One: Desirability of Splitting Up the Records

When a unique heading cannot be achieved for an entity in the LC Name Authority File (NAF), the result has been the creation of, or addition to, an undifferentiated personal name authority record. The question of ending the practice and splitting up such authority records into individual, non-unique headings arose last year on the RDA-L listserv. The problems posed by these authority records to those managing bibliographic data were noted years ago. But before any assessment of the degree of difficulty involved in breaking up undifferentiated personal name authority records is begun, a more basic question should be addressed: Is it a worthy goal to achieve?

The library community has always valued differentiating names. Undifferentiated authority records have been created only when insufficient information of the correct type was available to formulate a unique text string for the heading, and personal names are the only category of heading for which undifferentiated name authority records (NARs) are allowed. This occasional inability to create separate identities/NARs for each person included in an undifferentiated authority record is increasingly interfering with the human and machine uses of authority data, such as:

Inclusion of all personal names in the LC NAF in the Virtual International Authority File (VIAF). Undifferentiated personal NARs are rejected by VIAF because they do not refer to a unique entity.

Population of authority records with additional/new MARC fields (e.g., 046, 053, 37X-38X, 856, 880). These additional fields can help us do such things as bridge between the forms of an author’s name found on a book title page and on an article, for those searching in a broader discovery environment. Chinese names in particular can pose additional problems,
but those not differentiated in the Romanized heading string could often be distinguished by the original script form of heading in an 880 field. These MARC fields, however, can only be used in authority records for unique entities.

Connection of the headings in bibliographic descriptions to related authority data. The “Control Headings” functionality in OCLC’s Connexion facilitates keeping headings in bibliographic data synchronized with changes in the associated authority record. The software will not control a heading to an undifferentiated authority record, so the multiple entities contained within such a record cannot be synchronized. A similar linked-data strategy could maintain the currency of access points in digital library project metadata, but again a distinct heading must be available for each entity to be synchronized.

Supporting the semantic web through addressable authority data (http://id.loc.gov). Humanities scholars marking up documents via the Text Encoding Initiative (TEI) have expressed a desire to embed demographic characteristics with the names they encounter [4]; this aim could be achieved much more efficiently if they could point to library authority data that includes the new 37X fields. Once again, however, these new MARC fields can only be used in authority records for unique entities.

Recommendation: Pursue the break-up of undifferentiated name headings according to the techniques proposed in Appendix C of the final report of the PCC Task Group on AACR2 & RDA Acceptable Heading Categories, in conjunction with the PCC Day One for RDA Authority Records.

Part Two: Implementation Challenges and Implications of the New Policy

A. Long-standing NACO normalization rules [6] would need to change. Currently, the 1XX field in an authority record must not exactly match that in any other NAR (or a 4XX field in any record). The detection of non-unique heading fields shared by multiple NARs is a key means of uncovering the presence of duplicate records for the same entity.

Resolution: As John
Attig has proposed, “The Undifferentiated Personal Name Indicator which is currently used to indicate that the authority record contains information about more than one person would be used to indicate that the 1XX heading is not unique within the NACO file.” By redefining the meaning of value ‘b’ in 008/32 in this way, we could both identify those headings whose 1XX forms are knowingly identical and preserve the capacity to detect inadvertent duplication.

B. A number of stakeholders currently assume or depend on a unique text string in the 1XX of authority data in order to perform work, including providers and users of ILS software, authority vendors, and bibliographic utilities/library cooperatives.

Resolution: Explore the use of the ID number of the separate authority record as “the differentiating characteristic of last resort.” There are many precedents for the use of qualifiers. Non-unique serial titles have been qualified in 130 fields of CONSER records. Topical LC Subject Headings have been qualified as needed, e.g., Power (Christian theology), Power (Mechanics), Power (Philosophy), Power (Psychology), Power (Social Sciences). One ILS software provider deals with the challenge of the same 4XX strings pointing to different 1XXs (e.g., the initialism ‘ALA’ pointing to more than one full form of name) by adding the 1XX form to the non-unique 4XX [9]. The ID number could be added permanently in a subfield to the 1XX in the authority record, or it could be added as a preprocessing step only in the files or environments where it is needed. The ID number need not be displayed to users.

Recommendation: Consult with vendors and large file providers as to the best method of implementing undifferentiated headings into their systems by the PCC Day One for RDA Authority Records.

Part Three: Paradigm Shift in the Nature of Authority Work

Historian of science Thomas S. Kuhn pointed to anomalies that existing theories cannot adequately explain as occasionally triggering a crisis that results in a
paradigm shift or a revolution [10]. A scientific revolution occurs, according to Kuhn, when scientists encounter anomalies which cannot be explained by the universally accepted paradigm within which scientific progress has thereto been made. The paradigm, in Kuhn’s view, is not simply the current theory, but the entire worldview in which it exists, and all of the implications which come with it [11]. Perhaps the anomaly of undifferentiated personal name authority records will be the catalyst to push us to consider a major change in the nature of authority work.

Historically, the primary focus of authority work has been on heading construction, but this may become secondary to the function of identifying and differentiating entities. With recent changes to the MARC format for authorities, additional information may be added to an authority record in machine-actionable fields without the need to alter the heading itself. This shift of focus from correct heading formation and updating (with attendant bibliographic maintenance) to entity registration and the storing of identifying information in the authority record itself may resolve a growing number of problems, some of which are articulated below:

Case 1: In the summer of 2005, the Library of Congress called for comments on the advisability of allowing the addition of dates to existing personal name headings. A number of respondents suggested placing information elsewhere in the authority record, or only adding death dates to name headings with open dates, or limiting the addition of dates to headings for “prominent” individuals [12]. Concerns about the workload of bibliographic file maintenance (i.e., keeping headings up to date on numerous copies of bibliographic records) seriously competed with the library community’s desire and ability to present users with headings that showed an awareness of recent obituaries.

Case 2: Some libraries are seeking ways to collect names from institutional repositories and other campus sources and to
integrate them with those found in online catalogs (Massachusetts Institute of Technology). Because these names have not been formulated according to defined NACO principles, they cannot be easily integrated into the library’s more structured data.

Case 3: There is interest in the re-use of library authority data in semantic-web applications like VIVO (Cornell) [13], but in order for this approach to be successful, many more local names would need to be registered in the library’s authority file. The California Institute of Technology is seeking to expose name identifiers for its current faculty by establishing all of the names through NACO [14]. As linked data comes to the fore on the semantic web, ever larger, enriched authority files become increasingly important.

Case 4: The growing information recorded in authority structures is attracting interest in its own right. Considering presentations of authority data such as WorldCat Identities [15], it is easy to envision authority records attracting direct searches from users, just as bibliographic records always have.

Case 5: The shift of focus away from a technically correct, tightly controlled form of heading would open participation in identity registration to a much broader group of individuals and present new opportunities for collaboration. Recently, the OCLC Office of Research has reported the emergence of a new “Syriac Funnel” in which scholars are able to annotate and contribute non-Roman script to the VIAF cluster for particular names [16]. Finding new constituents for the recording of authority data makes this activity more valuable and more supportable as an essential activity.

All of these cases suggest a tremendous role for authority data as library metadata transitions to the web. An increasing demand for authoritative match-points makes the inclusiveness of the authority file imperative. By no longer focusing on the uniqueness of the text string of an authority heading itself but rather on the uniqueness of the
entity it represents, the creation of an authority record becomes much simpler and the ability to create headings can be opened to a much wider audience. This broader participation will become essential as authority data comes under increasing demand and the completeness of the file becomes of crucial importance.

Recommendation: Have the PCC NACO program review the impacts that the elimination of the need for a unique text string in an authority record would have on training and program participation. Evaluate the possibility of incorporating these changes into program documentation being done to support the RDA transition and preparation for the PCC Day One for RDA Authority Records.

[1] John Attig, message to RDA-L, April 26, 2011. http://www.mail-archive.com/rda-l@listserv.lacbac.gc.ca/msg05306.html

[2] PCC Standing Committee on Standards Task Group on the Function of the Authority File. Final Report, April 1, 2003. http://www.loc.gov/aba/pcc/scs/documents/tgauthrpt_fin.pdf “The dynamic nature of undifferentiated name records results in extremely fluid relationships between LCCNs and bibliographic identities that wreak havoc in systems with linked authorities functionality. Bibliographic identities can be added to, and removed from, undifferentiated name records over time, causing these records to toggle back and forth between undifferentiated status and unique status.” (p. 16)

[3] Program for Cooperative Cataloging. “Frequently Asked Questions on Creating Personal Name Authority Records (NARs) for NACO,” last updated April 14, 2011. http://www.loc.gov/aba/pcc/naco/personnamefaq.html#21

[4] TEI-XML Workshop/Seminar led by Julia Flanders and Syd Bauman of Brown University, at UCLA, April 21-22, 2011. http://www.humanities.ucla.edu/eventstalks/icalrepeat.detail/2011/04/21/411/-/Yzg0NzY1NWI0MDNiYzAxYjU2ZTg4NTAzODRiNDJiMzg=

[5] PCC Task Group on AACR2 & RDA Acceptable Heading Categories. Final report, Aug. 2011.
http://www.loc.gov/aba/pcc/rda/RDA%20Task%20groups%20and%20charges/Report%20of%20the%20Task%20Group%20on%20AACR2%20&%20RDA%20Acceptable%20Headings-1.docx

[6] Program for Cooperative Cataloging. “Authority File Comparison Rules (NACO Normalization),” revision approved Nov. 2007. http://www.loc.gov/aba/pcc/naco/normrule-2.html

[7] Attig, RDA-L message.

[8] Attig, ibid.

[9] ExLibris. ALEPH Authority Headings in Conflict: Functional Specification Description, Apr. 4, 2005, p. http://www.google.com/url?sa=t&rct=j&q=site%3Afclaweb.fcla.edu%2Fuploads%20ambiguous&source=web&cd=2&ved=0CCcQFjAB&url=http%3A%2F%2Ffclaweb.fcla.edu%2Fuploads%2FLynda%2520Preston%2FAmbiguous_headings.doc&ei=bJknT5a7Bcik2gXUrNm8Ag&usg=AFQjCNGTTNeyJmC7ftmUpWvrU4GXuKIFA

[10] Kuhn, Thomas S. The Structure of Scientific Revolutions, 2nd ed., enlarged. University of Chicago Press, 1970, chapters 6-7.

[11] Wikipedia article “Paradigm Shift,” http://en.wikipedia.org/wiki/Paradigm_shift (viewed Feb. 5, 2012). Alexander Bird notes, “The revolutionary search for a replacement paradigm is driven by the failure of the existing paradigm to solve certain important anomalies,” in his entry “Thomas Kuhn” in The Stanford Encyclopedia of Philosophy (Winter 2011 Edition), Edward N. Zalta (ed.).
http://plato.stanford.edu/archives/win2011/entries/thomas-kuhn

[12] Library of Congress Cataloging Policy and Support Office. “Summary Analysis of Comments Received on ‘Dates in Personal Name Headings’ Proposal and Corresponding Decisions,” Sept. 26, 2005. http://www.loc.gov/catdir/cpso/deathdates.pdf

[13] Responses to OCLC Research Library Partners Metadata Managers Focus Group round robin question on authority data uses in new discovery environments, discussed at ALA Midwinter Meeting, Jan. 20, 2012.

[14] Laura Smart’s presentation at the PCC Participants’ Meeting program, “Eyeing the Future: What’s on the Horizon for NACO?” Jan. 22, 2012.

[15] OCLC Office of Research. WorldCat Identities. http://www.worldcat.org/identities The displays are created through data mining. http://www.oclc.org/research/activities/identities/default.htm

[16] Karen Smith-Yoshimura, speaking at the ALA Midwinter meeting of the OCLC Research Library Partners Metadata Managers Focus Group, Jan. 20, 2012. See also slides 32-33 of Thomas Hickey’s presentation “Authorities in a Connected World” at the Indiana Library Federation, Nov. 16, 2011. www.oclc.org/research/presentations/hickey/indiana-lf2011.pptx