Date: Thu, 22 Apr 1993 17:04:51 ECT Reply-To: Indexer's Discussion Group Sender: Indexer's Discussion Group From: Carol Roberts Subject: Re: Trademarks ----------------------------Original message---------------------------- >----------------------------Original message---------------------------- >Is it only $5? >========= Yes, it's only $5. -- Carol Roberts Publications Services Cornell University cjr2@cornell.edu 607 255-9454 Practice random kindness and senseless acts of beauty. ========================================================================= Date: Thu, 22 Apr 1993 17:05:38 ECT Reply-To: Indexer's Discussion Group Sender: Indexer's Discussion Group From: Jean Dartnall Subject: Re: Signatures on EMail sent to INDEX-L In-Reply-To: <199304211412.AA18365@jculib.jcu.edu.au> ----------------------------Original message---------------------------- This complaint may have been addressed to me, in part. If this is so I apologise. I did not realise that my "from" address did not appear in your received messages since yours does appear in mine. I have been signing myself Jean Jean Dartnall Information Services Librarian James Cook University Townsville Queensland Australia lbjad@jculib.jcu.edu.au ========================================================================= Date: Fri, 23 Apr 1993 14:09:41 ECT Reply-To: Indexer's Discussion Group Sender: Indexer's Discussion Group From: Carol Roberts Subject: Fax Machines ----------------------------Original message---------------------------- I was just wondering whether indexers (especially those who are exclusively indexers) have much use for fax machines. (A friend has suggested I should have a fax-modem if I'm operating a small business.) I can't imagine the indexer or the publisher wanting to fax an entire page-proof set. I *suppose* you might be in a hurry during the negotiating process and want to fax an agreement/specs for approval. Am I missing something? 
Feel free to respond privately so as not to pester everybody with trivial mail. -- Carol Roberts Publications Services Cornell University cjr2@cornell.edu 607 255-9454 Practice random kindness and senseless acts of beauty. ========================================================================= Date: Tue, 27 Apr 1993 10:12:10 ECT Reply-To: Indexer's Discussion Group Sender: Indexer's Discussion Group From: Charlotte Skuster Subject: Fax Machines > I was just wondering whether indexers (especially those who are exclusively > indexers) have much use for fax machines. (A friend has suggested I should > have a fax-modem if I'm operating a small business.) I can't imagine the > indexer or the publisher wanting to fax an entire page-proof set. I > *suppose* you might be in a hurry during the negotiating process and want > to fax an agreement/specs for approval. Am I missing something? Feel free > to respond privately so as not to pester everybody with trivial mail. > -- Fax is fine for communication and correspondence, but not too good for proofs, etc. - and think of the cost in fax paper if someone tried to fax you a whole book. Unless you have a plain paper fax (expensive), faxed material is horrible to work on, for editing or indexing. Also, especially for technical material, the quality is often too poor to be properly legible. I'd come down on the side of having the fax available, for speed of communication and receiving last-minute alterations, etc., but avoid use of it for working material except in the direst emergency. ====================================================================== Kathleen M. 
Lyle Technical Editor, Applied Probability Trust, Hicks Building, The University, Sheffield S3 7RH, UK Phone +742 824269 Fax +742 729782
======================================================================
=========================================================================
Date: Tue, 27 Apr 1993 10:15:21 ECT
Reply-To: Indexer's Discussion Group
Sender: Indexer's Discussion Group
From: "R.S. Etheredge"
Subject: Re: Fax Machines
----------------------------Original message----------------------------
Howdee,

A fax modem would allow one to download a file from some remote site via modem connection, or upload a file from the local site to some remote site via modem connection. For the local site, this fax modem allows the received file to be printed on the local printer, at whatever the local printer quality.

Have a happy weekend...

Rusty Etheredge rse8135@dewie.tamu.edu
=========================================================================
Date: Tue, 27 Apr 1993 10:15:50 ECT
Reply-To: Indexer's Discussion Group
Sender: Indexer's Discussion Group
From: robert hadden
Subject: Re: Indexing research needs
----------------------------Original message----------------------------
To see what is needed to make an index useful, try looking through older indexed materials, and see how things were done in the past. To find a name in a long list, such as a regimental roll from Army records in the 1880s, all the names were grouped first by company, and then by first letter of the last name after rank. To find someone, the searcher needed to know the company, and rank, as well as the name; otherwise the whole list would have to be examined. Within the letter grouping, there was no obvious way to group names: all S's were together, and jumbled up so that Smith comes before Sith but sometimes after Sword. By seeing the evolution of indexing to something more useable to the reader or searcher, you can see the trend of indexing.
From that point on, it is a matter of extrapolating into the future.... lee hadden usgs library ========================================================================= Date: Tue, 27 Apr 1993 10:17:15 ECT Reply-To: Indexer's Discussion Group Sender: Indexer's Discussion Group From: Carol Roberts Subject: Bridge Burning ----------------------------Original message---------------------------- Hi, everybody. Please excuse me for taking up bandwidth with this personal posting. But I wanted to let you know that I've taken the plunge: I've given my notice at my full-time (editing) job so I can establish myself as a full-time freelance indexer, now that I've discovered how much I love indexing. It was only with the guidance, interest, and support of you indexers (and my husband) that I was able to make this important decision; I hope I will someday get to turn around and help another indexer get started. Many, many thanks to you all!!!! I hope to meet you at the upcoming ASI meeting in Alexandria. -- Carol Roberts Publications Services Cornell University cjr2@cornell.edu 607 255-9454 Practice random kindness and senseless acts of beauty. ========================================================================= Date: Tue, 27 Apr 1993 10:22:43 ECT Reply-To: Indexer's Discussion Group Sender: Indexer's Discussion Group From: Charlotte Skuster Subject: Draft of Indexing Standards Jim Anderson has kindly shared with us draft 3.1 of the NISO standards for indexes. As with the 2.1 version, I have divided the document into three parts (of approx. 1400 lines each) and will distribute one part per day---starting today. 
Charlotte Skuster Index-l Moderator ========================================================================= Date: Tue, 27 Apr 1993 10:42:36 ECT Reply-To: Indexer's Discussion Group Sender: Indexer's Discussion Group From: Charlotte Skuster Subject: Standards draft 3.1 (Part 1) Here is a copy of draft 3.1 of the NISO standards guidelines for Indexes Your comments are solicited! [Jim Anderson] * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * Standards for Libraries * Information Sciences * Publishing National Information Standards Organization (Z39) ANSI/NISO Z39.4-199X Proposed American National Standard Guidelines for Indexes for Information Retrieval *****Alternative title: Proposed American National Standard Guidelines for Indexes and Associated Devices for Information Retrieval***** Draft #3.1, prepared by James D. Anderson, Chairperson, based on committee recommendations and discussion as of November 6, 1992 and subsequent comments and suggestions from committee members. 20 April 1993 DISTRIBUTION -- This draft is available for comment to all members of the indexing community. An electronic copy may be obtained via email from janderson@zodiac.rutgers.edu A paper copy may be obtained from Rutgers University for the cost of copying, postage and handling ($12.00). Make checks payable to Rutgers The State University. See the addresses in the COMMENTS section below. COMMENTS -- Please send comments regarding this draft to James D. Anderson, School of Communication, Information, and Library Studies, Rutgers the State University of NJ, 4 Huntington St., New Brunswick, NJ 08903, 908/932-7501, FAX 932-6916, internet janderson@zodiac.rutgers.edu NOTE -- @ has been used for accent codes; @@ has been used to mark italics; @@@ has been used to mark boldface type. After editing, these codes will be replaced with the appropriate accents or type-faces. 
* * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * Committee Members James D. Anderson, Chairperson School of Communication, Information, and Library Studies Rutgers the State University of NJ New Brunswick, NJ 08903 Barbara Anderson DIALOG Information Services Palo Alto, California Catherine Grissom Department of Energy Office of Scientific & Technical Information Oak Ridge, Tennessee Nancy Mulvany Bayside Indexing Service Kensington, California Barbara Preschel Public Affairs Information Service (PAIS) New York, New York Deborah Swain IBM Research Triangle Park, North Carolina Hans Wellisch College Park, Maryland * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * Abstract This standard provides guidelines for the content, organization, and presentation of indexes used for the retrieval of documents and parts of documents. It deals with the principles of indexing, regardless of the type of material indexed, the indexing method used (intellectual analysis, machine algorithm, or both), the medium of the index, or the method of presentation for searching. It includes definitions of indexes and of their parts, attributes and aspects; a uniform vocabulary; treatment of the nature and variety of indexes; and recommendations regarding the design, organization, and presentation of indexes. It does not attempt to set standards for every detail or technique of indexing. These can be determined for each index on the basis of factors covered in the standard, including the type of material indexed, the medium of the index, the method of presentation for searching, and the type of user for whom the index is designed. * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * Table of Contents Committee Members Abstract Table of Contents Guide to the Standard Normative References Bibliography 0. Proposed title change 1. Scope of the standard 1.1. General statement 1.2. Types of documents 1.3. 
Presentation of indexes
1.4. Choice of terms
1.5. Method of preparation
2. Definitions
2.1. document.
2.2. documentary unit.
2.3. index.
2.4. indexing.
2.5. displayed index.
2.6. non-displayed index.
2.7. term.
2.8. descriptor.
2.9. main heading.
2.10. entry.
2.11. locator.
2.12. cross-reference.
3. Function of an index
4. Types of index
4.1. Indexes by type of referent
4.2. Indexes by type or extent of indexable matter used to produce the index
4.3. Indexes by arrangement of entries
4.4. Indexes by method of term coordination for searching
4.5. Indexes by type, format, or genre of document or media indexed
4.6. Indexes by medium of index
4.7. Indexes by periodicity of the index
4.8. Indexes by authorship
5. Design of indexes
5.1. Subject scope
5.2. Documentary scope
5.3. Domain
5.4. Multiple versus unified indexes
5.5. Codes and symbols
5.6. Display media
5.7. Documentary units
5.8. Indexable matter
5.9. Analysis method
5.10. Exhaustivity
5.11. Specificity
5.12. Syntax
5.13. Vocabulary management
5.14. Documentary unit surrogation; locators
5.15. Display of documentary unit surrogates
5.16. Index display and arrangement
5.17. Search interface
6. Vocabulary
6.1. Summary
6.2. Sources of vocabulary
6.3. Forms of terms
6.3.1. Parts of speech
6.3.2. Spelling
6.3.3. Capitalization
6.3.4. Singular and plural forms
6.3.5. Articles
6.3.6. Compound terms
6.3.7. Antonyms and associated terms
6.3.8. Terms consisting of more than one word
6.3.9. Proper names and titles of documents
6.3.9.1. Personal names
6.3.9.2. Corporate body names
6.3.9.3. Geographical names
6.3.9.4. Titles of documents
6.3.9.5. First lines
6.3.10. Romanization
6.4. Term weights
6.5. Homographs
6.6. Synonymous and equivalent terms
6.7. Hierarchical relationships among terms
6.8. Other relationships
6.9. Changes in terminology
6.10. Display of vocabulary in indexes
6.10.1. Vocabulary information in displayed indexes
6.10.1.1. Cross-references versus double entries
6.10.1.2.
Cross-references to multiple terms or headings
6.10.1.3. Location of "see also" cross-references
6.10.2. Vocabulary information in non-displayed indexes
6.10.3. Scope and history notes
7. Syntax for entries, headings, and search statements
7.1. Summary
7.2. Entries in displayed indexes
7.3. Syntax in displayed indexes
7.3.1. Ad hoc syntax
7.3.2. Natural language syntax
7.3.2.1. KWIC indexes
7.3.2.2. KWOC indexes
7.3.2.3. KWAC indexes
7.3.3. Subject heading lists
7.3.4. Permuted indexes
7.3.5. String indexing
7.3.5.1. Rotated terms
7.3.5.2. Faceted indexing
7.3.5.3. Ad hoc coding
7.3.5.4. Chain indexing
7.3.6. Syntactic cross-references
7.4. Locators in displayed indexes
7.4.1. Locators for printed documents
7.4.2. Locators for documents in other media
7.4.3. Multiple locators in print indexes to single documents
7.4.4. Methods of emphasizing locators in print indexes
7.4.5. Presentation of locators in print indexes
7.4.6. Presentation of other identifying data in print indexes
7.5. Syntax in non-displayed indexes
7.5.1. Boolean syntax
7.5.2. Weighted term combinations
7.5.3. Proximity operators, stemming, and truncation
7.5.4. Links and role indicators
8. Display of index arrays
8.1. Introductory note
8.2. Index display in print media
8.2.1. Arrangement of entries
8.2.1.1. Alphanumeric displays
8.2.1.2. Classified or relational displays
8.2.2. Recurring elements
8.2.3. Vertical spacing
8.2.4. Entry layout
8.2.4.1. Indented layout
8.2.4.2. Run-on layout
8.2.4.3. Hybrid indented/run-on layout
8.2.5. Running headlines
8.2.6. Scope headlines
8.2.7. Continuation lines
8.2.8. Typography
8.2.9. Columns
8.3. Index display in electronic media
8.4. Electronic manuscripts
9. Alphanumeric order
9.1. Standards
9.2. Basic order
9.3. Initial articles
9.4. Subheadings
9.5. Headings with the same initial term
9.6. Cross-references
9.7. Word by word versus letter by letter arrangement
9.8. Numerals
9.9.
Comprehensive example Glossary Index * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * Guide to the Standard This standard consists of 9 sections. They are briefly summarized here: 1. Scope of the standard: describes aspects of index preparation and presentation addressed by the standard. Encompassed are principles, rather than detailed procedures, for the presentation of print and electronic indexes compiled by human analysis and by computer algorithm for the retrieval of all types of documents. Both displayed indexes, designed for searching by means of human visual inspection, and non-displayed indexes, designed for searching by means of electronic comparison and matching are included. 2. Definitions: only major and essential terms are defined here. Additional terms are listed and defined in a glossary appended to the standard. 3. Function of an index: gives an expanded definition of "index" in the context of information retrieval, in terms of the minimum functions an index ought to perform. 4. Types of index: continues and expands the definition of "index" in terms of the variety of types of index. 5. Design of indexes: summarizes the design of indexes in terms of decision options for 17 key features and attributes of indexes. For the most part, the standard does not favor particular choices or options, but instead states that decisions on options should be based primarily on needs, habits, and preferences of users; that publishers and producers of indexes should agree on feature and attribute options prior to the production of an index; and that all special or unusual features should be made clear to index users. 6. Vocabulary: recommends sources for and forms of terms used in indexes. The standard emphasizes the importance of linking alternative terms and forms of terms for the same or similar concepts. It recommends linking of terms for related concepts as well. 
In displayed indexes, the display of vocabulary information should be integrated into the display of the index. In non-displayed indexes, the search interface should provide for the display of vocabulary information and relationships at the time a search statement is created.

7. Syntax for entries, headings, and search statements: describes a wide variety of syntactic methods and styles for the combination of terms to create index headings and entries in displayed indexes and search statements for non-displayed indexes. The principal recommendation states that such combination is absolutely essential, regardless of the type of index.

8. Display of index arrays: lists options and recommendations for the display of indexes or parts of indexes, including arrays of retrieved entries or records from non-displayed indexes.

9. Alphanumeric order: contains rules for the arrangement of alphanumeric indexes.

Appended to the standard proper are a comprehensive glossary of terms related to indexes and indexing, and a detailed index to the standard.

* * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *

Normative References

The following American National Standards Institute/National Information Standards Organization (ANSI/NISO) standards contain provisions that, through reference in this text, constitute provisions of this NISO standard. At the time of publication, the editions indicated were valid. All standards are subject to revision, and users of this standard are encouraged to investigate the possibility of applying the most recent editions of the standards indicated below.

ANSI/NISO Z39.19 -- 199x. @@American National Standard Guidelines for the Construction, Format and Management of Monolingual Thesauri.@@

ANSI Z39.29 -- 1977. @@American National Standard for Bibliographic References.@@ (Currently under revision.)

ANSI/NISO Z39.59 -- 1988.
@@American National Standard for Electronic Manuscript Preparation and Markup.@@ Also ANSI/NISO standards for romanization of Japanese (Z39.11-1972(R1989)), Arabic (Z39.12-1972(R1989)). The following International Organization for Standardization (ISO) standards and drafts are cited: ISO 690: 1987 (E) -- @@Documentation -- Bibliographic references -- Content, Form and Structure.@@ ISO 9115: 1987 (E) -- @@Documentation -- Bibliographic Identification (biblid) of Contributions in Serials and Books.@@ ISO/CD 999.4 @@Documentation -- Guidelines for the Content, Organization and Presentation of Indexes.@@ [*****Insert latest citation for ISO index standard] Note: There are no ANSI/NISO standards for the ordering of alphanumeric characters or other signs and symbols. The following rules for filing from the American Library Association and the Library of Congress function as de facto standards in the United States, but they are incompatible with each other. Rules in this standard are closest to the ALA Filing Rules. American Library Association, Filing Committee. @@ALA Filing Rules.@@ Chicago: American Library Association; 1980. ix, 50 p. Library of Congress, Processing Services. @@Library of Congress Filing Rules,@@ prepared by John C. Rather and Susan C. Biebel. Washington: Library of Congress; 1980. vii, 111 p. The de facto standard for the formulation of name headings, both personal and corporate, is the following: @@Anglo-American Cataloguing Rules,@@ 2d edition, 1988 revision. Prepared by the Joint Steering Committee for Revision of AACR; ed. by Michael Gorman and Paul W. Winkler. Chicago: American Library Association; 1988. For languages for which NISO has no romanization standard, ALA-LC Romanization tables may be used. * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * Bibliography This standard assumes basic understanding of indexing and indexes. The following publications will be helpful in providing background information. 
They are arranged in inverse chronological order. Mulvany, Nancy C. @@Indexing books@@. Chicago: University of Chicago Press, 1993. Bell, Hazel K. @@Indexing biographies and other stories of human lives@@. London: Society of Indexers, 1992. Fetters, Linda K. @@A guide to indexing software@@. 4th ed. Port Aransas, TX: American Society of Indexers, 1992. @@Index evaluation checklist: a guide for authors, editors, publishers, reviews, librarians@@. Port Aransas, TX: American Society of Indexers, 1991. Lancaster, F. W. @@Indexing and abstracting in theory and practice@@. Champaign, IL: Graduate School of Library and Information Service, University of Illinois, 1991. Wellisch, Hans H. @@Indexing from A to Z@@. New York: H. W. Wilson, 1991. @@Indexing: the state of our knowledge and the state of our ignorance@@. Medford, NJ: Learned Information, 1989. Salton, Gerard. @@Automatic text processing: the transformation, analysis and retrieval of information by computer@@. Reading, MA: Addison-Wesley, 1989. Rowley, Jennifer E. @@Abstracting and indexing@@. 2nd ed. London: Bingley, 1988. Craven, Timothy C. @@String indexing@@. Orlando, FL: Academic Press, 1986. Soergel, Dagobert. @@Organizing information: principles of data base and retrieval systems.@@ Orlando, FL: Academic Press, 1985. Milstead, Jessica L. @@Subject access systems: alternatives in design@@. Orlando, FL: Academic Press, 1984. Knight, G. N. @@Indexing, the art of@@. London: Allen & Unwin, 1979. Borko, Harold; Bernier, Charles L. @@Indexing concepts and methods@@. Orlando, FL: Academic Press, 1978. @@UNISIST: indexing principles@@. Paris: Unesco, 1975. * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * 0. Proposed title change: from "Basic Criteria for Indexes" to "Standard Guidelines for Indexes for Information Retrieval." Or: Standard Guidelines for Indexes and Associated Devices for Information Retrieval 1. Scope of the standard. 1.1. General statement. 
This standard provides guidelines for the content, organization, and presentation of indexes used for the retrieval of documents and parts of documents. It deals with the principles of indexing, regardless of the type of material indexed, the indexing method used (intellectual analysis, machine algorithm, or both), the medium of the index, or the method of presentation for searching. It includes definitions of indexes and of their parts, attributes and aspects; a uniform vocabulary; treatment of the nature and variety of indexes; and recommendations regarding the design, organization, and presentation of indexes. It does not attempt to set standards for every detail or technique of indexing. These can be determined for each index on the basis of factors covered in the standard, including the type of material indexed, the medium of the index, the method of presentation for searching, and the type of user for whom the index is designed. Note: In other contexts, the term "index" can be used to indicate other phenomena, for example, a consumer price index indicates the rise and fall of prices. The construction and display of such indexes that do not refer to the retrieval of documents is not covered by this standard. 1.2. Types of documents. This standard applies to indexes for single documents and for collections of documents. "Document" is used in the broadest possible sense. (See 2.1 "document" below.) The term "documentary unit" as used throughout this standard refers to complete documents or to parts, sections, paragraphs or sentences within documents, as well as collections of documents, depending on the design and purpose of a particular index. 1.3. Presentation of indexes. 
This standard is concerned with basic indexing principles and practices as they affect the presentation of an index, whether the index is a displayed index designed for searching by means of visual inspection or a non-displayed index designed for searching by means of electronic comparison and matching. Emphasis is on presentation of the index to human users, rather than on the way it is structured or stored electronically. All kinds of indexes for human use are considered, regardless of the medium on which the index is displayed or the method by which the index is presented for searching. The internal representation of computer-readable indexes (inverted files, for example) designed for electronic comparison and matching rather than human visual inspection is not directly addressed. Examples are illustrative, not prescriptive.

1.4. Choice of terms. This standard covers criteria for the choice and form of terms to be used in headings in displayed indexes, as descriptors in non-displayed indexes, and in the vocabulary management component of indexes. The standard permits the use of natural language terms extracted from natural language text, but it calls for the display of relationships among terms, whether natural language keywords or controlled descriptors or headings, in order to indicate synonymous, equivalent, hierarchical and associative relationships among the concepts represented. In indexes using natural language keywords, recommendations on the choice and form of terms should guide the selection of terms to be used as preferred terms around which to gather synonymous, equivalent, and related keywords. (For the compilation of thesauri that may be used to facilitate the display of term relations, see NISO Z39.19 @@Guidelines for the Construction, Format, and Management of Monolingual Thesauri.@@)

1.5. Method of preparation.
This standard is relevant to the preparation of all types of indexes for information retrieval, regardless of whether they are produced on the basis of human intellectual analysis or by automatic or computer-assisted methods, whether they are searched by visual inspection or by electronic algorithm, and whether they are compiled by one indexer or by teams of indexers. This standard does not address indexing software. Such an attempt might have been contemplated if the standard addressed only a narrow spectrum of index types, for example, closed-end indexes for single documents such as back-of-the-book indexes. The enormous variety of computer software for the compilation and indexing of textual databases, for natural language processing and automatic indexing of text surrogates or full-text, for the conversion of single-concept descriptors into coordinated index headings (string indexing), for electronic searching, and for vocabulary management is beyond the scope of this standard.

2. Definitions. Only the most important terms used in this standard are listed and defined here as part of the standard. They are presented in a conceptual sequence based on the order in which they appear in the indexing process. First comes the document to be indexed, then the documentary unit to which index terms refer. Next comes the index itself, followed by the indexing process, and two fundamental types of index (displayed and non-displayed). Index elements conclude the list: term, descriptor, heading, entry, locator, and cross-reference. Other terms are listed and defined in an appended glossary. Terms in ALL CAPS have their own definition, entered under the singular noun form, in this section or in the appended glossary. [Note: ALL CAPS are used for this draft -- defined terms will be converted to italics in order to conform to NISO practice when the draft reaches its final form.]

2.1. document.
A MEDIUM on or in which a MESSAGE is encoded; thus, the combination of message and medium. The term applies not only to written and printed materials on paper or microforms (for example, books, journals, maps, diagrams), but also to nonprint media (for example, machine-readable records, transparencies, audio recordings, video recordings, and film) and, by extension, to three-dimensional objects or realia -- encompassing every kind of format and genre, including but not limited to treatises, literary works, patents, technical reports, charts, diagrams, tables, illustrations, music, performances, artistic works, and multimedia texts. [Note to committee: I have deleted the last sentence: "The term 'document' as used throughout this standard implies also parts, sections, even paragraphs or sentences within documents and to collections of documents, depending on the DOCUMENTARY UNIT." Throughout this revision I have tried to use "documentary unit" when that is what is meant in order to preserve the idea of "document" as a complete message unit. This was urged by Jessica Milstead and Bella Weinberg along with others. -- JDA]

2.2. documentary unit. The DOCUMENT, document segment, or collection of documents to which index ENTRIES refer and on which they are based. Examples of verbal documentary units include sentences, paragraphs, pages, articles, book-length monographs, complete serial runs, or entire library collections. The documentary unit determines the relative size of document to which an INDEX will point.

2.3. index. A systematic guide designed to indicate TOPICS or FEATURES of DOCUMENTS in order to facilitate retrieval of documents or parts of documents.
Indexes include the following major components: (1) terms representing the topics or features of documentary units; (2) a syntax for combining terms into headings and ARTICULATED ENTRIES (in DISPLAYED INDEXES) or search statements (in NON-DISPLAYED INDEXES) in order to represent compound or complex topics, features, and/or queries; (3) links or cross-references among synonymous, equivalent, and related terms; (4) a procedure for linking headings (in displayed indexes) or search statements (in non-displayed indexes) with particular documentary units; and (5) a systematic ordering of headings (in displayed indexes) or a search procedure (in non-displayed indexes). 2.4. indexing. The operation of creating an INDEX for information retrieval. Indexing involves the selection and assignment of TERMS to, or the extraction of terms from, a DOCUMENTARY UNIT in order to indicate TOPICS, FEATURES, or possible uses of the unit; the combination of terms into headings and ARTICULATED ENTRIES or the tagging of terms for subsequent combination (in DISPLAYED INDEXES); the linking of synonymous, equivalent and related terms or headings; the linking of terms or headings to documentary units; and the arrangement of headings in a systematic order (in displayed indexes). 2.5. displayed index. An INDEX that is displayed in print, microform, or electronically for searching by means of human visual inspection. Electronically stored indexes may be displayed for human visual searching or may be NON-DISPLAYED INDEXES that are searched by means of computer algorithms. 2.6. non-displayed index. An INDEX that is searched by means of electronic comparison and matching controlled by computer algorithms. The complete index itself is not displayed for searching by means of visual inspection. Such an index will, however, provide a display and procedure for recording and entering search terms or requests and for reviewing descriptions or texts of retrieved documentary units. 
Word or term lists may be displayed to facilitate term selection, but these lists do not constitute a complete index if term combinations (ARTICULATED ENTRIES) are not also displayed. The crucial distinction between displayed and non-displayed indexes lies in the method of searching, whether by visual inspection or by electronic comparison and matching.

2.7. term. A word or phrase used to represent a TOPIC or FEATURE of a DOCUMENTARY UNIT in an INDEX. Topics include concepts treated; features include aspects such as authors and other persons or organizations responsible for the documentary unit, sources, publishers, style, methodology, quality, usefulness, level of complexity, language, format, and date of creation or publication.

2.8. descriptor. A TERM chosen as the preferred representation for a CONCEPT or FEATURE in an INDEX.

2.9. main heading. One or more TERMS representing a TOPIC or FEATURE of a DOCUMENT in a DISPLAYED INDEX; the first element of an index ENTRY in a DISPLAYED INDEX. A main heading followed by a SUBHEADING and, possibly, by one or more SUB-SUBHEADINGS constitutes an ARTICULATED ENTRY. When "heading" is used without modification in this standard, it may refer to a main heading alone; a main heading and subheading; or a main heading, subheading and sub-subheading, depending on the context.

2.10. entry. The representation of a DOCUMENTARY UNIT in a DISPLAYED INDEX. Consists of at least a MAIN HEADING and a LOCATOR. More than one LOCATOR may follow a given heading in a display, but each locator, in combination with the main heading, represents a single entry. In an ARTICULATED ENTRY, the main heading is modified by a SUBHEADING, and possibly by one or more SUB-SUBHEADINGS. [Hans suggests "and one or more LOCATORS", but this would violate what I consider to be an important and confusing point. For me, every locator represents one entry.
It is simply display convention that we do not repeat identical headings (main, sub and sub-sub) in succeeding entries, but nevertheless, when a heading is followed by 2 locators, it is two entries. Thus an entry consists "of at least a MAIN HEADING and a LOCATOR". (Think of entries as they are made, before sorting and merging!) Similarly, I am opposed to the idea, suggested by Bella Weinberg as I recall, of "main headings followed by multiple subheadings." Those are actually multiple entries, each of which consists of a main heading and a subheading (and perhaps a sub-subheading). The main heading is simply not repeated in the display when it is the same as the previous main headings, but it is very much a part of each and every entry! -- what do you think???? -- JDA] 2.11. locator. The part of an ENTRY in a DISPLAYED INDEX that indicates the location of the DOCUMENTARY UNIT to which the entry refers. Locators range from brief notations, such as page numbers, to full bibliographic citations. 2.12. cross-reference. A link between two or more terms or headings in an INDEX. Cross-references can be categorized into three types according to the relationship indicated: (a) an equivalence relationship among synonymous or equivalent terms or headings, (b) an associative relationship, indicating an unspecified relationship among terms or headings, and (c) a hierarchical relationship, indicating a broader/narrower relationship among terms or headings. 3. Function of an index. The function of an index is to provide users with an efficient and systematic means for locating documentary units (complete documents or parts of documents) that may address information needs or requests. An index should therefore: a. Identify documentary units that treat particular topics or possess particular features. b. Indicate all important topics or features of documentary units in accordance with the level of exhaustivity appropriate for the index. c. 
Discriminate between major and minor treatments of particular topics or manifestations of particular features. d. Provide access to topics or features using the terminology of prospective users. e. Provide access to topics or features using the terminology of verbal texts in the same language as the index. f. Use terminology that is as specific as the nature of documentary units and the size of the indexing language permit. g. Provide access through synonymous and equivalent terms. h. Guide users to terms representing related concepts. i. Provide for the combination of terms to facilitate the identification of particular types or aspects of topics or features and to eliminate unwanted types or aspects. j. Provide a means for searching for particular topics or features by means of a systematic arrangement of entries in displayed indexes or by means of a clearly documented and displayed method for entering, combining, and modifying terms to create search statements and for reviewing retrieved items. 4. Types of index. Indexes may be categorized by type of object to which headings refer (referent); by type or extent of indexable matter used to produce the index; by method of arranging entries; by method of term coordination; by type, format, genre, or medium of document being indexed; by medium of the index; by mode of publication; by whether the index is a one-time (closed-end, monographic) index or a continuing (open-end, serial) index; and by type of authorship. The following examples illustrate common types of indexes. They are by no means exhaustive. 4.1. Indexes by type of referent. a. authors: include all types of document creators, such as writers, composers, illustrators, translators, editors, choreographers, artists, sculptors, painters, inventors. b. topics or features: refer to topics (subjects) treated in documents and/or documentary units possessing particular features. c. 
names: limited to proper names, such as names of persons, places, corporate bodies. d. numbers or notations: provide access by means of numerical or coded designations, such as classification notation, patent number, ISBN, date. 4.2. Indexes by type or extent of indexable matter used to produce the index. a. titles: based on titles of documents. b. first lines: based on first lines of poetry. c. citations: based on reference citations to other documents. d. full-text of a document: based on the full text of documents. 4.3. Indexes by arrangement of entries. a. alphabetical or alphanumeric: Headings arranged according to the commonly accepted order of letters and numerals. b. classified: Headings arranged on the basis of relations among headings, for example, hierarchy, inclusion, chronology, or association. Classified indexes are often based on pre-existing classification schemes, such as the Dewey Decimal Classification. c. alphabetico-classed: Broad headings arranged alphabetically. Narrower headings grouped under broad headings and arranged alphanumerically or relationally on the basis of hierarchy, inclusion, chronology, or other association. Note: Electronic indexes often have no arrangement that is apparent to the user. However, indexes designed for human scanning, browsing and examination must have some arrangement, regardless of medium. 4.4. Indexes by method of document analysis. a. human intellectual analysis and identification of topics and concepts expressed and/or features manifested. b. computer algorithms designed to identify useful terms or phrases. 4.5. Indexes by method of term selection. a. assignment of terms to represent topics and features (whether or not the term is in the documentary unit being indexed). b. extraction of terms from the documentary unit. c. a combination of assignment and extraction methods. 4.6. Indexes by method of term coordination. a. 
pre-coordinate combination (articulated indexes), such as subject heading indexes, string indexes, chain indexes, keyword indexes (including KWIC, KWOC, KWAC indexes), rotated and permuted indexes. b. post-coordinate combination. Includes the use of Boolean operators, proximity measures, and the unordered combination of weighted terms. 4.7. Indexes by type, periodicity, format, genre, or medium of document(s) being indexed. a. books Note: An index to a single book is often referred to as a "back-of-the-book index," although such an index may be placed at the beginning or elsewhere in a book. b. periodicals, serials c. monographs d. poetry e. fiction, short stories f. films, videos g. illustrations, pictures, paintings h. artifacts i. software j. computer-readable texts k. maps l. sound recordings 4.8. Indexes by medium of index. a. printed or written. b. microform. c. electronic media, including online, CD-ROM. d. braille. 4.9. Indexes by mode of publication. a. indexes published together with the documentary units to which they refer. b. indexes published separately from the documentary units to which they refer. 4.10. Indexes by periodicity of the index. a. one-time, closed-end indexes. b. continuing, open-end indexes. 4.11. Indexes by authorship: An authored index is a separately authored document distinct from the authorship of the document(s) that is(are) indexed. It is created independently by one or more persons through human intellectual analysis of text, as distinguished from indexes that are created solely through algorithmic analysis of text carried out electronically and from those that are ongoing products of corporate activity, for example, by indexing and abstracting services and database producers. 5. Design of indexes. In the design of indexes, decisions must be made concerning key features and attributes. 
Careful consideration of options for each feature or attribute will contribute to a better index, since each feature or attribute will influence overall quality and performance of the index. Decisions on options should be based primarily on needs, habits, and preferences of users. Publishers and producers of indexes should agree on feature and attribute options prior to the production of an index. All special or unusual features should be made clear to users in an introductory statement in print indexes or in on-screen and off-screen documentation for electronic indexes (see 8.1.5. Introductory note). The following key features and attributes are present in most indexes. 5.1. Subject scope. The scope of a subject index to a single document should be the same as the subject scope of the document. The subject scope of an index covering multiple documents may not necessarily be the same as the subject scope of the indexed document(s). When this is the case, the subject scope should be clearly stated in terms of the kinds of topics or features indexed. Subject scope can be stated in terms of facets or categories covered by the index. Examples of facets or categories include, but are not limited to: entities concrete entities persons individuals groups institutions artifacts natural objects abstract entities belief systems disciplines (areas and methods of study) theories, hypotheses imaginary entities attributes and properties materials operations, processes, events, conditions places times historical periods Appropriate facets or categories are frequently unique to a particular field. 
For example, facets that may be indexed in a document related to literature include: specific literatures (by nationality or culture) performance media languages periods individuals (for example, authors) groups/movements genres works features literary techniques themes/motifs/figures/characters influences (recipients) sources processes types of scholarship methodological approaches theories devices/tools disciplines scholars document types Examples of facets related to soil science include: kinds of soil soil structure soil constituents soil properties processes in soil operations on soil laboratory techniques 5.2. Documentary scope. Indexes are also defined by categories of indexed documents. The documentary scope or coverage of an index to a single document is obvious. For indexes to multiple documents, such as those provided by indexing and abstracting services or textual databases, it is important to state explicitly the kinds of documents included within the documentary scope of the index with respect to such criteria as: medium format periodicity (monographs, serials) audience or level language nationality (place of publication) time (date of publication or date of receipt) specific titles (when scope is limited to a stated list of documents). 5.3. Domain. Domain refers to the "territory" covered in order to produce an index. The domain for an index to a single document is obvious, but it is not obvious for indexes to multiple documents and should be clearly stated. a. Locational limits. Within a particular subject and documentary scope, index producers can limit domain to a single collection or several collections of documents, in which case the index may be called a "catalog" or a "union catalog." Similarly, a domain can be limited to documents located in particular places or countries, or, it can be universal, attempting to cover documents wherever they are located. b. Primary versus secondary sources. 
As a general rule, indexes should be based on primary sources, that is, the actual documents being indexed. When indexes are compiled on the basis of secondary sources (descriptions of documents such as abstracts, reviews, other indexes, databases, or catalogs) rather than the documents themselves, this practice should be clearly stated and the sources of data described. c. Selection criteria. When a domain is further limited by qualitative selection criteria, these criteria should be stated. 5.4. Multiple versus unified indexes Unified indexes should generally be preferred, but separate indexes may be justified when particular aspects are especially important such as authors or persons as subjects, animal species, products, ingredients, or particular types of documents, such as statutes, legal cases, reviews, maps, illustrations, or advertisements. Separate indexes may also be desirable when it is awkward to assimilate verbal terms (using natural language words) with non-verbal terms, such as chemical formulae or patent numbers, or terms in different writing systems, such as the Roman alphabet and non-Roman scripts. Separate indexes for particular subject facets or documentary types are often desirable in electronic indexes to facilitate targeted searches. Such separate indexes should also allow for global searches across all indexes. 5.5. Codes and symbols. Most indexes within the scope of these guidelines will use the standard Roman alphabet, punctuation symbols, and Arabic and Roman numerals in accordance with standards of English language usage. Whenever any other symbols are used, for example, for music, choreography, chemistry, mathematics, or non-Roman writing systems, these symbols, the codes that govern their use, and the method for arranging non-alphanumeric symbols in displays should be described. 5.6. Display media. 
Indexes may be displayed in a wide range of media, including but not limited to print on paper, cards, microforms, or electronic displays linked to online databases or to indexes stored in such media as CD-ROMs or optical disks. Each medium has particular advantages and disadvantages, which need to be considered in relation to the needs, habits, and preferences of users. The medium will influence most other options regarding access to the index. 5.7. Documentary units. The size and type of documentary units to which an index refers directly determines what can be retrieved. For indexes to verbal documents, documentary units can range from lines, statements, paragraphs, pages, sections, articles, chapters, monographs, serials or series, to entire collections. Analogous units apply to non-verbal documents. The smaller the documentary unit, the more direct the referral to a particular topic or feature is likely to be. Inherent documentary units should be preferred over physical medium units. Whenever possible, numbered or otherwise specified paragraphs or sections of a printed verbal text should be preferred to pages, since paragraphs or sections are more likely to constitute conceptual units. There usually is no particular or enduring relationship between a physical page in a book and a particular part of a text. Indexes that refer to inherent documentary units may be used without change when a text appears in a variety of formats. (See also 7.4. Locators.) 5.8. Indexable matter. Indexable matter consists of the portions of documents that are actually analyzed and indexed. Not all portions may be equally important. For example, introductory matter, appendixes, bibliographies, glossaries, illustrations, tables, advertisements, letters, and reviews may or may not need to be indexed, or they may be indexed at different levels of exhaustivity or specificity. 
Indexing also may be limited to specific portions of text (for example, abstracts, first and/or last paragraphs, or captions). Decisions on appropriate indexable matter should be based on perceived importance to users of documentary units and should be explicitly stated. 5.9. Analysis method. Documents may be indexed through human intellectual analysis, algorithmic machine analysis, or combinations of human and machine analysis. The method of analysis used to produce an index should be stated. When indexes are created by particular individuals, their contribution should be acknowledged at the head of a displayed index or in the documentation or explanatory matter accompanying a non-displayed index (see also 4.11. Indexes by authorship). 5.10. Exhaustivity. Exhaustivity of indexing is the detail with which topics or features of a documentary unit are analyzed and described. Exhaustivity may be described as the number of unique terms assigned to or extracted, on average, from a documentary unit. It can range from summary indexing in which only a few terms are assigned per documentary unit, to highly exhaustive indexing in which hundreds of terms may be assigned or extracted. (Note that in a displayed index, a single heading often consists of multiple terms.) Summary indexing tends to favor high-precision retrieval. Only documents that are closely and centrally related to a particular index term or heading are retrieved. On the other hand, highly exhaustive or detailed indexing tends to favor high-recall retrieval. More documents are retrieved, including those in which a desired topic or feature may be peripheral. The best level of exhaustivity clearly depends on the needs of users, whether they want only "a few good items" at one extreme, or "everything" at the other. Multiple levels of exhaustivity are advantageous and can be implemented by marking terms as primary or secondary. 
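The effect of marking terms as primary or secondary can be illustrated with a small calculation. The following is only a sketch under stated assumptions: the postings for "trucks", the unit identifiers, and the set of units judged relevant are all invented for illustration, and precision and recall are computed in the usual way (relevant retrieved divided by retrieved, and relevant retrieved divided by all relevant).

```python
# Hypothetical postings: each unit indexed under "trucks" is marked
# "primary" (central treatment) or "secondary" (peripheral treatment).
POSTINGS = {
    "trucks": [("d1", "primary"), ("d2", "primary"),
               ("d3", "secondary"), ("d4", "secondary")],
}

# Hypothetical relevance judgments for one user's request.
RELEVANT = {"d1", "d2", "d3"}


def search(term, include_secondary):
    """Return the units posted under term, optionally with secondary postings."""
    return {unit for unit, level in POSTINGS.get(term, [])
            if level == "primary" or include_secondary}


def precision_recall(retrieved, relevant):
    """Compute (precision, recall) for a retrieved set against a relevant set."""
    hits = retrieved & relevant
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


summary = precision_recall(search("trucks", False), RELEVANT)
exhaustive = precision_recall(search("trucks", True), RELEVANT)
print("primary only:", summary)
print("primary + secondary:", exhaustive)
```

With these invented data, restricting the search to primary postings retrieves fewer units but all of them are relevant, while adding secondary postings retrieves every relevant unit at the cost of one irrelevant one.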
In electronic indexes, weights may be assigned to terms so that exhaustivity may be adjusted to the particular needs of a user. Exhaustivity combines with the specificity of index terminology to determine the depth of indexing. Exhaustivity, along with term specificity, syntax, and extent of vocabulary management (number of cross-references), are primary determinants of the size of a displayed index. 5.11. Specificity. Specificity refers to the closeness of fit between index terms and the topics or features they represent. For example, is "pick-up trucks" used to represent that type of truck, or is "trucks" used for all types of truck? "Specific" does not necessarily mean "narrow," since a specific term may be broad or narrow depending on the topic or feature to which it refers. Specific indexing provides specific terms for all or most topics and features and results in a larger indexing vocabulary than more generic indexing. Greater specificity tends to improve the precision of searches, but it tends to make high recall searches more difficult. Categories of topics or features may be indexed by several specific terms (for example, pick-up trucks, dump trucks, van trucks, tractor trucks), rather than subsumed under a single more generic term (for example, trucks). Links between broader and narrower terms can facilitate more comprehensive, higher-recall searches (see also 5.13. Vocabulary management). Up-posting, the assignment of both specific and generic terms, is also used for this purpose. One option is to use high specificity indexing in core areas of the subject scope of the index and more generic indexing in peripheral areas. Together, exhaustivity and specificity determine the depth of indexing. 5.12. 
Syntax. Index syntax provides the capability and the procedure for combining individual terms to form headings, subheadings and sub-subheadings -- articulated entries -- in order to provide context for the initial term in an entry in displayed indexes and for combining individual terms into search statements for searching non-displayed indexes. Syntax is essential in both types of index because a primary purpose of an index is to enable a user to eliminate documentary units that are irrelevant for a particular need. Single term entries in displayed indexes or single term searches of non-displayed indexes often cannot provide a sufficient basis for the selection of potentially relevant items or the elimination of potentially irrelevant items. Unless a term is very specific and the context of the index is narrowly defined, some context is required within the entry or search statement itself. Consequently, every index should provide for the combination of terms in headings (displayed indexes) or in search statements (non-displayed indexes). For pre-coordinated index headings and articulated entries in displayed indexes, syntax patterns also determine the number and style of direct access points provided for multi-term headings. Access to all substantive terms in pre-coordinated headings is essential. This access can be achieved either through rotated arrangement of terms within the heading so that each term becomes a lead term or through cross-references. When pre-coordinated headings are searched electronically in a non-displayed mode, individual terms may be accessed in the same way as in post-coordinate indexes. For non-displayed indexes that are searched by means of electronic comparison or matching, rules of syntax are usually based on Boolean operators (AND, OR, NOT), proximity operators, and/or the combination of weighted terms or vectors. 
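Boolean combination in a non-displayed index can be sketched with ordinary set operations. This is a minimal illustration, not a description of any particular retrieval system: the terms, the unit identifiers, and the helper names (units, op_and, op_or, op_not) are assumptions made up for the example.

```python
# Hypothetical inverted file: each term maps to the documentary units
# to which it has been assigned or from which it was extracted.
POSTINGS = {
    "soil": {"u1", "u2", "u3"},
    "erosion": {"u2", "u3", "u4"},
    "laboratory": {"u3", "u5"},
}

# NOT needs a universe to subtract from: every unit known to the index.
ALL_UNITS = set().union(*POSTINGS.values())


def units(term):
    """Units posted under a term (empty set if the term is unknown)."""
    return POSTINGS.get(term, set())


def op_and(a, b):
    return a & b


def op_or(a, b):
    return a | b


def op_not(a):
    return ALL_UNITS - a


# Search statement: soil AND erosion AND NOT laboratory
result = op_and(op_and(units("soil"), units("erosion")),
                op_not(units("laboratory")))
print(sorted(result))
```

The search statement here narrows two broad single-term results to the units that treat both topics and then eliminates an unwanted aspect, which is the role the draft assigns to syntax in non-displayed indexes.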
The use of role indicators or the linking of terms that represent associated topics or features may be used to reduce false relationships. Other important considerations for syntax and searching of non-displayed indexes are text parsing procedures that delimit terms or words or permit the identification of phrases. The usual definition of a word or term in such cases is a string of alphanumeric characters delimited by spaces or punctuation. But certain punctuation characters cause special problems, for example the hyphen. Non-displayed indexes will be very different depending on whether the hyphen is treated as a word delimiter or not. Examples of syntax are provided in section 7. Syntax for entries, headings, and search statements. 5.13. Vocabulary management. The vocabulary of an index should match the vocabulary of users. When documentary units consist of verbal texts in the same language as the index, the index should also link the vocabulary of documentary units to the vocabulary of users. Accommodating the vocabulary of users is a difficult objective, since research and experience have shown that language use is extremely diverse and constantly changing. Therefore, a large lead-in vocabulary is beneficial, with cross-references linking synonymous and equivalent terms. An index should also assist users in adjusting the level of specificity of their requests to that of the index and documentary units by providing links between broader and narrower terms. An index can also suggest other avenues of search by linking related or associated terms. In displayed indexes, access by means of alternative terms and guides to related terms should be integrated into the index itself through the use of cross-references that are interfiled with index entries. Users should not have to consult a separate list or publication. 
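Interfiling cross-references with index entries can be sketched as a single sorted display. The headings, page-number locators, "See" wording, and the function name display_lines below are all hypothetical, chosen only to show lead-in terms appearing in the same alphabetical sequence as the entries themselves.

```python
# Hypothetical entries: preferred heading -> page-number locators.
ENTRIES = {
    "trucks": [40, 12],
    "soil erosion": [7],
}

# Hypothetical lead-in vocabulary: nonpreferred term -> preferred heading.
SEE_REFS = {
    "lorries": "trucks",
    "erosion of soil": "soil erosion",
}


def display_lines(entries, see_refs):
    """Merge entries and 'See' references into one alphabetical display."""
    lines = {}
    for heading, locators in entries.items():
        pages = ", ".join(str(n) for n in sorted(locators))
        lines[heading] = f"{heading}, {pages}"
    for nonpreferred, preferred in see_refs.items():
        lines[nonpreferred] = f"{nonpreferred}. See {preferred}"
    # Sorting on the lead term interfiles references with entries.
    return [lines[key] for key in sorted(lines)]


for line in display_lines(ENTRIES, SEE_REFS):
    print(line)
```

A user who looks under "lorries" finds the reference in the place where the entry would have been, without turning to a separate list.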
In non-displayed indexes, access by means of alternative terms and guides to related terms should be provided as part of the search interface through which search terms are entered and search statements composed. Users should not have to consult vocabulary guides that are not accessible through the search interface. If a controlled vocabulary is used, the replacement of nonpreferred terms by preferred terms should be automatic. If index vocabulary is not controlled, users should be given the option of automatically adding all equivalent terms to the search. Related, broader and narrower terms should be displayed so that users have the option of adding them or using them instead of other terms. Details of vocabulary management are treated more fully in section 6. Vocabulary. 5.14. Documentary unit surrogation; locators. Unless index terms or headings are attached to or embedded in the full text of a verbal document, indexes must include surrogates that represent or describe the documentary units to which they refer and locators that point to the location of the documentary units. In many indexes the same representation serves as both surrogate and locator, especially when the full text of the documentary unit is present in the publication, as in back-of-the-book indexes, where page, column or paragraph numbers both represent the documentary unit (as surrogate) and point to its location (as locator). In other indexes, especially those that point to documents not physically present, the surrogate may consist of an abstract and bibliographic citation. The locator, strictly speaking, consists of the part of the citation that points to the location of the documentary unit. Some indexes use a series of surrogates and locators, for example, a brief entry number to link a term or articulated entry to a fuller surrogate and locator, which may include a citation, abstract, and subject or feature terms. 
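The two-step arrangement just described, in which a brief entry number leads to a fuller surrogate, can be sketched as two lookup tables. Everything in this example is invented for illustration: the headings, the entry numbers, the placeholder citations and abstracts, and the function name lookup.

```python
# Step 1 (hypothetical): index headings point to brief entry numbers.
INDEX = {
    "soil erosion": [101, 205],
    "trucks": [205],
}

# Step 2 (hypothetical): each entry number leads to a fuller surrogate.
# Citations and abstracts are placeholders, not real publications.
SURROGATES = {
    101: {"citation": "Example citation A", "abstract": "Example abstract A",
          "terms": ["soil erosion"]},
    205: {"citation": "Example citation B", "abstract": "Example abstract B",
          "terms": ["soil erosion", "trucks"]},
}


def lookup(heading):
    """Follow a heading to its entry numbers, then to the full surrogates."""
    return [SURROGATES[number] for number in INDEX.get(heading, [])]


for surrogate in lookup("soil erosion"):
    print(surrogate["citation"], "|", ", ".join(surrogate["terms"]))
```

Keeping the surrogate in one place means each heading needs to carry only a short number, while the citation, abstract, and full set of terms are stored once per documentary unit.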
Locators must also be used in non-displayed indexes, but they are often not displayed to the user. Surrogates, such as citations and abstracts, are widely used in non-displayed indexes. Since the text of these surrogates may often be searched, the use of abbreviations should be avoided. Locators are treated more fully in section 7. Syntax for entries, headings, and search statements. 5.15. Display of documentary unit surrogates. Some indexes use substantial surrogates, such as abstracts or annotations, to represent documentary units, especially when the full text is not present. Such indexes employ a variety of methods for displaying these surrogates. In displayed indexes, a comprehensive display brings together in one place all the descriptions of a documentary unit that are scattered in individual index entries or contained in inverted files, so that index headings or terms are displayed together with a citation or abstract. Comprehensive displays provide the maximum amount of information and can therefore improve preliminary decisions about the potential relevance of documentary units. Electronically stored indexes, both displayed and non-displayed, can provide options for a range of surrogate displays ranging from a brief citation through full description, including abstract and all index headings or terms. Indexes that are attached directly to the full text of documentary units, such as back-of-the-book indexes or indexes to full-text electronic databases, eliminate intermediate surrogates altogether. 5.16. Index display and arrangement. For displayed indexes, the manner in which entries are arranged and formatted for display is vitally important since access to the index is often dependent on this display. The arrangement of entries directly affects access, and the clarity of display format and guides directly affect ease of searching and comprehension. Displayed indexes may be arranged in alphanumeric, classified, or relational arrays. 
Classified or relational arrangements are used to bring related entries together, but they usually need their own alphanumeric indexes to facilitate access to relevant sections of the index. Non-displayed indexes may be complemented with a displayed version of the index for visual browsing or scanning of entries. These topics are treated more fully in sections 8. Display of index arrays, and 9. Alphanumeric order. 5.17. Search interfaces. The format and arrangement of a displayed index constitutes its search interface. For non-displayed indexes, the search interface is a computer program that provides the means for entering search terms or requests, for composing search statements, for exploring alternative terms, and for reviewing surrogates or the actual text of retrieved documentary units. Such search interfaces are relatively new, as compared to displayed indexes, and they are still very much the subject of experimentation and testing. Therefore, this standard will not address details of search interface design or implementation, other than to suggest that such interfaces should provide clear descriptions of: a. search capability -- what a user can and can't do; b. the means for exploring vocabulary options; c. the means for carrying out search capabilities, for example, the way in which terms may be combined and truncated and the use of search commands; d. the means for displaying and reviewing retrieved items and for modifying a search; e. the advantages and disadvantages of alternative search strategies. The search interface is an essential component of a non-displayed index. It is by means of the search interface that essential standards for any information retrieval index are implemented for non-displayed indexes, including the capability of combining terms to specify desired topics or features and of exploring alternative and related search terminology. 
It is recognized that different persons or organizations are often responsible for the provision of index terms, for the management of vocabulary, and for the design and implementation of the search interface for non-displayed indexes, making it especially difficult to achieve a high quality information retrieval index without maximum cooperation. 6. Vocabulary. 6.1. Summary. A major recommendation of this standard is that an index should provide access to topics or features of documents using the terminology of the documentary unit (when possible) and the terminology of prospective users. Terms or headings assigned to or extracted from documentary units should be linked with alternative synonymous, equivalent and related terms or headings by means of cross-references in displayed indexes. Similar links are also required in non-displayed indexes. Synonymous and equivalent terms may be substituted or added automatically to the search query or may be displayed, together with related terms, for selection when the search query is being composed. (See 3. Function of an index, items d-h.) After describing desirable sources of terms for indexes (6.2), the next section (6.3) treats standardized forms of terms. In displayed indexes, such standardized terms should serve as the preferred terms, to which all alternative forms and related terms or headings are linked by means of cross-references. Similarly, in non-displayed indexes, such standardized preferred terms should serve as the anchors around which to gather alternative and related terms. Sections 6.4 through 6.8 treat the various relationships among terms that should be made available to index users, and section 6.9 describes desirable methods for making these relationships available in displayed and non-displayed indexes. 6.2. Sources of vocabulary. 
The vocabulary for indexes may come from indexed documents, index users, human indexers, or compilations of vocabulary, such as thesauri, dictionaries, handbooks, and textbooks. The best source is often the text of indexed documents. Users of indexes are another valuable source, but it is often difficult or impossible to access their vocabulary directly. When it is possible to collect search terms employed by users, their terms should be incorporated into the index vocabulary. To the extent possible, indexes should link the vocabulary of users to the vocabulary of documents. Expert indexers may be aware of user vocabulary that is not present in indexed documents. Their vocabulary expertise should be used to the fullest extent possible. Compilations of vocabulary (thesauri, subject heading lists, etc.) can also be useful. However, restricting vocabulary only to such collections of terms is usually not advisable, since it may lead to unnecessary constraints on access. 6.3. Forms of terms. Conventions and customs for the form of index terms have developed for English language indexes, as well as for other natural languages used for indexing. These conventions should be observed in the establishment of preferred terms for the convenience of users, unless there are overriding conventions in a particular discipline, field, or application. In this standard, only U.S. English language conventions and customs are cited. ========================================================================= Date: Tue, 27 Apr 1993 16:18:59 ECT Reply-To: Indexer's Discussion Group Sender: Indexer's Discussion Group From: Pam Rider Organization: NetLink Online Communications, San Diego CA Subject: Re: Fax Machines In-Reply-To: ----------------------------Original message---------------------------- Kathleen, I had the same questions. Then a friend of a friend had a fax for sale. The day after installation, I was able to receive 10 changed pages while working on the index. 
I had been paying up to $4/page for single revised pages before that. The fax has been VERY worthwhile. -- "Trying to walk cheerfully on the earth." INTERNET: prider@netlink.cts.com (Pam Rider) UUCP: ...!ryptyde!netlink!prider ========================================================================= Date: Tue, 27 Apr 1993 16:19:32 ECT Reply-To: Indexer's Discussion Group Sender: Indexer's Discussion Group From: Carol Roberts Subject: Re: Fax Machines ----------------------------Original message---------------------------- >----------------------------Original message---------------------------- >Howdee, >A fax modem would allow one to download a file from some site remote >via modem connection, or upload a file from local site to some remote >site via modem connection. For the local site, this fax modem allows >the received file to be printed on the local printer, at whatever the >local printer quality. > >Have a happy weekend... >Rusty Etheredge >rse8135@dewie.tamu.edu I thought a plain old modem does all that. Where does the fax come in? Carol -- Carol Roberts Publications Services Cornell University cjr2@cornell.edu 607 255-9454 Practice random kindness and senseless acts of beauty. ========================================================================= Date: Tue, 27 Apr 1993 16:20:08 ECT Reply-To: Indexer's Discussion Group Sender: Indexer's Discussion Group From: Paula Presley Subject: Bridge Burning In-Reply-To: In reply to your message of TUE 27 APR 1993 09:16:34 CST ----------------------------Original message---------------------------- Congratulations...more power to you! Will we still see you on the internet? Does your institution give "guest access"? 
(Some do, you know--get friendly with a librarian or a computer services person) I'll miss your postings--but looking forward to seeing you at ASI Our press director just got back from an ACLS meeting in Williamsburg...he was impressed with all the flowers there...I hope Alexandria is in full bloom when we arrive. Carol, let us hear from you often. Paula Presley Assoc. Editor, The Thomas Jefferson University Press Copy Editor, The Sixteenth Century Journal Northeast Missouri State University McClain Hall 111L Kirksville, MO 63501 (816) 785-4525 FAX (816) 785-4181 Bitnet: AD15@NEMOMUS Internet: AD15%NEMOMUS@Academic.NEMOState.EDU ========================================================================= Date: Wed, 28 Apr 1993 14:40:39 ECT Reply-To: Indexer's Discussion Group Sender: Indexer's Discussion Group From: Jessica Milstead <76440.2356@CompuServe.COM> Subject: Arrangement standard ----------------------------Original message---------------------------- Thanks to all who replied to my posting about the need for a standard for arrangement. Here is my promised summary. A total of ten replies were received from all sources. Five of these felt there was a need for a standard, two did not, and three made comments without taking a position. There seems to be a sense that some guidance would be helpful, and we (NISO's SDC) have decided to determine if the voting members of NISO feel a standard is warranted. Thanks again for your responses. Jessica Milstead The JELEM Co. 
Member, Standards Development Committee, National Information Standards Organization 76440.2356@compuserve.com ========================================================================= Date: Wed, 28 Apr 1993 14:41:04 ECT Reply-To: Indexer's Discussion Group Sender: Indexer's Discussion Group From: Jessica Milstead <76440.2356@CompuServe.COM> Subject: Research in indexing ----------------------------Original message---------------------------- Many thanks to the numerous individuals who replied to my posting about the needs for research in indexing. Here is the promised summary of the responses. The dominant responses all concerned users, with such ideas as usability testing, or the need to aid users in formulating their queries. Other ideas included the integration of automated and manual methods, indexing of non-text materials, and the application of theories from other fields to indexing. Thanks for all your assistance. I received a number of ideas that enlarged my own thinking. Jessica Milstead The JELEM Co. 76440.2356@compuserve.com ========================================================================= Date: Wed, 28 Apr 1993 14:41:50 ECT Reply-To: Indexer's Discussion Group Sender: Indexer's Discussion Group From: "Nancy C. Mulvany" Subject: Fax Machines ----------------------------Original message---------------------------- Re: Faxes & an Indexing Business I've had a standalone fax in my office for at least three years. Most of its use has nothing to do with my indexing business. 
The main benefits for indexing for me are: - can quickly send contract agreements to the client for negotiation & inevitable changes before sending a final copy for signature - when changes have been made in pages that I'm indexing the client can quickly send the revised page - although my clients know that I would be happy to review final copy of the index, few of them send it to me; however there are some indexes I *insist* on seeing final copy of; so they can send the index pages by fax; the quality of the fax is not great but at that point I'm just looking for gross layout errors (like the time one of my clients didn't quite understand what sub-subentries were, so they formatted them in place as main headings!) - in large indexing projects many queries can be handled effectively by fax -nancy nmulvany@well.sf.ca.us ========================================================================= Date: Wed, 28 Apr 1993 14:44:09 ECT Reply-To: Indexer's Discussion Group Sender: Indexer's Discussion Group From: Charlotte Skuster Subject: Standards draft 3.1 (part 2) 6.3.1. Parts of speech. Nouns, including verbal nouns (gerunds) and noun phrases, are the preferred parts of speech for terms. Adjectives are often used to modify nouns; they are rarely used alone. Prepositional phrases are often used as subheadings to modify main headings or within headings to modify lead terms. They are also used as role indicators to link terms in string indexes. For example: parts of speech [prepositional phrase modifying lead term] singular forms [lead term is adjective modifying a noun] spelling [a verbal noun (gerund)] word order in multi-word terms [prepositional phrase as subheading] advertising. Japanese cars. Germany. effects of advertising [preposition as role indicator in string indexing] sales. Japanese cars. Germany effects on advertising [preposition as role indicator in string indexing] 6.3.2. Spelling. For U.S. indexes, standard U.S. spelling should be used.
If there is more than one standard spelling (for example, groundwater, ground-water, ground water), the one used in the indexed document(s) should be preferred if used consistently. Otherwise, one spelling should be chosen and employed consistently. Alternative spellings should be linked to the standard term. This is especially important in non-displayed electronic indexes, since even minor variations in spelling (for example, aluminum / aluminium) may lead to the loss of access. Common contractions, abbreviations, and acronyms should be used as terms or linked to terms. Their spelling should conform to common usage (see also 6.3.3. Capitalization; 6.5. Synonymous and equivalent terms). 6.3.3. Capitalization. All terms should be written with lower-case letters with the exception of proper nouns and acronyms. In proper nouns, the first letter of the first word and the first letter of each succeeding word, other than conjunctions, prepositions, and articles, should be capitalized. Acronyms of names of organizations should follow usage of the organization (for example, NATO, Unesco). Other acronyms should follow conventional capitalization (for example, radar, COBOL). 6.3.4. Singular and plural forms. In English, the convention and custom in most indexing situations is to use the plural form for terms denoting discrete objects (countables) and the singular form for non-countables (mass words). The plural is used when the question as to quantity asks "How many?" The singular is used when the question as to quantity asks "How much?". If the singular and plural forms have different meanings, both forms should be used if both are needed to represent topics or features of a text: memories memory building buildings 6.3.5. Articles. The use of articles, especially initial articles, should be avoided in index terms. Initial articles should be omitted from corporate body names when possible, but not from place names or titles. Articles should not be transposed (@@see@@ 6.3.9.2.
Corporate body names, 6.3.9.3. Geographical names, 6.3.9.4. Titles of documents, 6.3.9.5. First lines). 6.3.6. Compound terms. As a general rule, a single term (as opposed to a pre-coordinated heading or articulated entry) should represent a single concept. What constitutes a single concept will vary from situation to situation. Frequently two or more concepts become "bound" together and are commonly expressed by a "compound term," such as "information science", "birth control", or "form of government". When such compound terms become established, they should be preferred to the alternative of forcing the combination of two separate terms, for example, "science" and "information" or "control" and "birth" or "conception" at the time of searching or when combining terms into headings and entries. Use of compound terms also helps to avoid "false drops," such as the retrieval of documents on "library schools" when "school libraries" is intended. Similarly, terms like "information" and "science" can occur in many contexts where "information science" is not discussed. 6.3.7. Antonyms and associated terms. When antonyms and other closely associated terms (for example, honors and awards) are combined to form compound terms, the constituent terms should be linked to the compound term by a cross-reference. [Note: at Raya's suggestion, I have removed the term "combined term," since we did not define it and use it nowhere else. -- JDA] awards @@see@@ honors and awards evil @@see@@ good and evil Note: The form and presentation of the cross-references will differ in displayed and non-displayed indexes. 6.3.8. Word order in multi-word terms. Terms consisting of more than one word, including compound terms, should be used in natural language order without inversion. For example: personal names [not: names, personal] 6.3.9. Proper names and titles of documents.
Names of persons, corporate bodies, and places should be established in accordance with standards used in library practice, since it is advantageous for users to experience a measure of uniformity across information retrieval systems. @@The Anglo-American Cataloguing Rules@@, 2d edition (AACR2), provides detailed guidance for the establishment of names. Note: This standard provides only a cursory summary of the provisions of AACR2. 6.3.9.1. Personal names. Personal names should be provided in the form most commonly used, and in as full a form as possible when there is more than one common form. Limiting forenames to initials invites confusion, unless initials are part of the commonly used form of a name (for example, D. H. Lawrence). When more than one name or form of name is in use, they should be linked as synonymous terms. Where surnames are in common use, names should be entered under surname, followed by a comma and any given names or initials: Lee, Kuan Yew Wheatley, Henry B. Persons identified only by a given name or forename should be entered under that name, qualified if necessary by a title of office or other distinguishing epithet: Boudicca, Queen of the Iceni Leonardo da Vinci Ethelred the Unready Persons normally identified by a title of honor or nobility should be entered under that title, expanded if necessary by their family name: Dalai Lama Marlborough, John Churchill, first Duke Compound and multiple surnames, whether hyphenated or not, should be entered under the first part, unless usage favors another practice. For example, Portuguese names are customarily entered under the last part. 
Links should be established among all possible forms of entry: Layzell Ward, Patricia [with cross-reference from: Ward, Patricia Layzell] Pe'@rez de Cue'@llar, Javier [with cross-reference from: De Cue'@llar, Javier Pe'@rez; and from: Cue'@llar, Javier Pe'@rez de] When two or more persons have the same name, their names constitute homographs and should be distinguished with qualifiers consisting of a fuller form of name or dates where available; otherwise, use occupation, title, or nationality: Lawrence, D. H. (David Henry) Lawrence, D. H. (Derek Herbert) Butler, Samuel (1612-1680) Butler, Samuel (1835-1902) Rickert, Heinrich (philosopher) Rickert, Heinrich (politician) 6.3.9.2. Corporate body names. Names of corporate bodies should be entered without transposition in the form most commonly used by the body itself. If more than one form is common, the fuller form should be used. If an abbreviation or acronym is the commonly used form, that form (not the full form) should be used: J. Whitaker & Sons [not Whitaker, J., & Sons] H. W. Wilson Company [not Wilson, H. W., Company] Unesco [not: United Nations Educational, Scientific, and Cultural Organization] TRON Project [not: The Realtime Operating System Nucleus Project] Unless abbreviations constitute or are part of the commonly used form, names of corporate bodies should not be abbreviated: U. S. Department of Education [not DOE] University of Nebraska [not Univ. of Nebraska] New York University [not NYU] Omit initial articles unless they are required for grammatical reasons: Club (London) [not: The Club (London)] Library Association (United Kingdom) [not: The Library Association (United Kingdom)] But: Der Blaue Adler (Association) Note: See 9.3. regarding the arrangement of headings with initial articles. Enter corporate bodies that are parts of larger bodies under their own names unless the name is indistinct or implies subordination.
If the name needs the name of a higher body, use the lowest level body that can be entered directly under its own name: Public Library Association. Audiovisual Committee [not: American Library Association. Public Library Association. Audiovisual Committee] U. S. Department of Health and Human Services [not: Department of Health and Human Services] When there are several hierarchical levels, as many intervening bodies as necessary should be included in the name to avoid confusion: American Library Association. Resources and Technical Services Division. Board of Directors. Identical names for different bodies constitute homographs and must be distinguished with qualifiers: Metropolitan Museum of Art (Cleveland, OH) Metropolitan Museum of Art (New York, NY) Cross-references should link different names for the same body and all possible forms of entry, including inverted forms: Medicine, National Library of @@see@@ National Library of Medicine The Realtime Operating System Nucleus Project @@see@@ TRON Project [enter reference under both "The . . ." and "Realtime . . ."] United Nations Educational, Scientific, and Cultural Organization @@see@@ Unesco Whitaker, J., & Sons @@see@@ J. Whitaker & Sons Wilson, H. W., Company @@see@@ H. W. Wilson Company Note: The form and presentation of the cross-references will differ in displayed and non-displayed indexes. 6.3.9.3. Geographical names. Geographical names should be as full as necessary for clarity, with qualifiers to avoid confusion between otherwise identical names: Middletown (CT) Middletown (OH) Middletown (Powys, Wales) Abbreviations should not be used unless there is a commonly accepted standard, such as U. S. Postal Service abbreviations for states of the United States. Prefer the English form if there is one in general use.
Otherwise use the form in the official language of the country: Buenos Aires An article or preposition should be retained in a geographical name of which it forms an integral part: Des Moines Las Vegas Los Angeles The Dalles The Hague Note: See 9.3. regarding the arrangement of headings with initial articles. 6.3.9.4. Titles of documents. To the extent possible within typographic constraints, titles of documents should not be changed or altered. For example, the name of a chemical should not be substituted for a chemical symbol or a numeral replaced with its name. Titles should not be abbreviated unless very long, and any omissions should be indicated: @@Inquiry into the nurturing and elimination of life forms within marginally controlled ecosystems . . .@@ Titles with numerals, especially initial numerals, should be linked with equivalent titles with names of numerals: @@Ten sixty-six and all that@@ @@see@@ 1066 and all that @@Nineteen eighty-four@@ @@see@@ 1984 @@Two thousand and one@@ @@see@@ 2001 If necessary to avoid confusion, qualify the title of a document with a term that will indicate that it is a document: @@Charlemagne@@ (play) @@Genesis@@ (Anglo-Saxon poem) If necessary for identification, names of creators, places of publication, dates, or other qualifiers may be used: @@Ave Maria@@ (Gounod) @@Ave Maria@@ (Schubert) @@Ave Maria@@ (Verdi) @@Natura@@ (Amsterdam) @@Natura@@ (Milan) An initial article should not be omitted or transposed to the end of the title: @@Das Kapital@@ (Marx) [not: @@Kapital@@ (Marx); @@Kapital, Das@@ (Marx)] @@The Tempest@@ [not: @@Tempest@@; @@Tempest, The@@] See section 9.3. regarding the arrangement of headings with initial articles. Prepositions at the beginning of a title should be retained: @@An die Musik@@ @@To the lighthouse@@ 6.3.9.5. First lines. In first line indexes, initial articles should be retained in natural order, not transposed. See section 9.3. regarding the arrangement of headings with initial articles. 
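The qualifier rule that recurs throughout 6.3.9 (personal names, corporate bodies, places, and titles) can be illustrated with a minimal Python sketch. It assumes each name arrives already paired with a candidate qualifier; the function name qualify_homographs and the sample data layout are illustrative only and are not part of the draft. The qualifier is attached only when the bare name occurs more than once, which is one reading of "if necessary for identification."

```python
from collections import Counter

def qualify_homographs(entries):
    """Return display headings for (name, qualifier) pairs.

    Hypothetical helper: a qualifier is appended in parentheses only
    when the bare name is shared by two or more entries.
    """
    counts = Counter(name for name, _ in entries)
    return [
        f"{name} ({qualifier})" if counts[name] > 1 else name
        for name, qualifier in entries
    ]

# Sample data taken from the examples in 6.3.9.4; "Woolf" is supplied
# here only so that every entry has a candidate qualifier.
titles = [
    ("Ave Maria", "Gounod"),
    ("Ave Maria", "Schubert"),
    ("To the lighthouse", "Woolf"),
]
headings = qualify_homographs(titles)
```

Under these assumptions the two "Ave Maria" entries receive their composers as qualifiers, while the unique title is left unqualified.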
6.3.10. Romanization. Names and words rendered into Roman script from another writing system should be based on standard Romanization tables unless a well-established English language form exists. Use Romanization tables published as NISO standards or adopted by the American Library Association and the Library of Congress. Alexander the Great [not: Alexandros ho Megas] Confucius [not: Kung Fu Tzu] Omar Khayyam [not: 'Umar Khayyam] Cross-references should link alternative forms of Romanized names and other terms. 6.4. Term weights. One function of an index is to discriminate between major and minor treatments of particular topics or manifestations of particular features. One method for achieving this goal is to assign weights to terms or to indicate major and minor terms by means of typography or symbols. Another method is to attach a subheading that indicates minor treatment, such as "also mentioned in" or "passing reference". The use of some weighting scheme permits the user who wants only major treatments to eliminate minor treatments, while permitting the user who wants every treatment to find them as well. Weighted terms are especially useful in non-displayed indexes. They can be used as a basis for ranking retrieved records on the basis of estimated relevance. See also 7.4.4. Methods of emphasizing locators; 7.5.2. Weighted term combinations [I'm not happy with this, but if we are going to include "discrimination between major and minor treatments" as one of our index functions, we had better treat it somewhere!!!!] 6.5. Homographs. Identical terms which represent different concepts or features can cause confusion and should be differentiated by the addition of a qualifier: races (anthropology) races (sport) 6.6. Synonymous and equivalent terms. Research and practice indicate that index users tend not to agree on terms for particular concepts or features.
Therefore, it is essential that indexes provide for the largest possible number of alternative terms, including abbreviations and acronyms. All terms that may be used for the same topic or feature within the context of an index should be linked so that any such term will lead searchers to the same documents. Terms including numerals should be linked with equivalent terms having the names of numerals: nineteenth century @@see@@ 19th century 5 year plans @@see@@ five year plans Small variations in terms that have little or no impact on filing position cause few problems in displayed indexes, but such variations can cause terms to be completely lost in indexes that are searched electronically. Therefore, terms with even small variations in spelling or endings (for example, aluminum, aluminium) should be linked in electronic indexes. All such terms with noncontiguous filing positions should be linked in displayed indexes. What constitutes equivalent terms depends on the level of specificity used in an index. Unused narrower terms need to be linked to broader or related terms that are used. Some indexes make a distinction between synonymous or equivalent terms on the one hand and broader terms that are used in place of unused narrower terms on the other. These indexes use the reference "see" or "use" for the former and "see under" for the latter: cars @@see@@ automobiles convertibles (automobiles) @@see under@@ automobiles Note: The form and presentation of the cross-references will differ in displayed and non-displayed indexes. 6.7. Hierarchical relationships among terms. Links between narrower and broader terms are important to guide the narrowing of a search to particular members of a larger set of terms (for example, from "computers" to particular types of computers) or the broadening of a search to all members of a larger set (for example, from "Labrador retrievers" to all species of dog). Examples of hierarchical relationships include: a. 
genus/species: furniture / chairs behavior / aggression bears / polar bears b. discipline/constituent studies: geology / petrology c. class/individual members: bridges / Golden Gate Bridge standardizing bodies / NISO d. entity/parts or kinds: buildings / rooms United Nations / Unesco population / immigrants chemical industry / petrochemical industry e. larger and smaller geographic units: Europe / France New Jersey / Middlesex County / New Brunswick 6.8. Other relationships. Links between terms having relationships other than hierarchical provide searchers with additional options for improving their searches. For example: a. discipline/objects studied: botany / plants physical chemistry / molecules b. theoretical study/application or technology: dynamics / mechanical engineering state ownership / nationalized industries c. activity/agent: photography / cameras, photographers singing / voice; singers d. activity/thing acted upon: angling / fish dentistry / teeth e. activity/product: aggression / violence cartography / maps f. closely related topics not generally differentiated in common parlance but differentiated in a particular index: boats / ships pottery / porcelain g. related topics separated in a particular index when related nouns and adjectives take different forms: law / legal . . . women / female . . . 6.9. Changes in terminology. In continuing indexes, care should be taken to link older and newer terms that are synonymous, equivalent, or closely related. The date of the change should be indicated. Examples of changing terminology include: a. the introduction of a new term as a substitute for an older term: wireless / radio Negroes / Blacks / African Americans b. name changes: Sri Lanka / Ceylon Harris, Jessica / Milstead, Jessica School of Communication, Information, and Library Studies / Graduate School of Library Service. c. 
the use of additional terms to express narrower topics previously embraced by a broader term: computers / microcomputers, minicomputers 6.10. Display of vocabulary in indexes. Information about vocabulary and relations among terms or headings (for example, synonymy, equivalence, homography, hierarchy, association) should be presented as an integral part of an index. Searchers should not have to consult separate, unconnected files for vocabulary information. How integration of vocabulary information and index terms or headings is accomplished depends largely on the medium of the index and whether or not indexes are displayed for searching. 6.10.1. Vocabulary information in displayed indexes. In print media, indexes must be displayed as ordered arrays of entries, since it is only through such displays that users enter the index. Such displays are becoming more common for electronic indexes as well, especially those designed for non-expert users, such as online public access catalogs (OPACs) in libraries. In displayed indexes, whether in print or electronic media, vocabulary information should be integrated into the same sequence of entries that describe documents, using a variety of notes and cross-references. 6.10.1.1. Cross-references versus double entries. In closed-end, one-time indexes (as opposed to open-end, continuing indexes), a "see" reference should be replaced by a duplicate entry under the alternative heading if the duplicate entry does not occupy more space than the cross-reference would: automobiles 23, 45 cars 23, 45 Not: automobiles 23, 45 cars @@see@@ automobiles 6.10.1.2. Cross-references to multiple terms or headings. When a cross-reference refers to multiple terms or headings, these should be listed by category if the nature of the relationship is indicated (for example, synonymous, equivalent, broader, narrower, related), and within category in alphabetical order, separated by semi-colons.
If the nature of relationships is not indicated, the referenced terms or headings should be in a single alphabetical sequence. sexuality @@used for equivalent term@@ sexual nature. @@see also narrower terms@@ bisexuality; chastity; heterosexuality; homosexuality; incest; necrophilia; sublimation. @@see also related terms@@ gender; sex; sexual identity; sexual problems. @@see also broader terms@@ human nature; behavior. 6.10.1.3. Location of "see also" cross-references. "See also" cross-references should normally follow any locators related to the term or heading from which they refer: bears 100, 217, 923 @@see also@@ badgers; koala bears; raccoons However, since their purpose is not only to suggest additional terms or headings that may be useful, but also to suggest alternative terms or headings, "see also" cross-references should precede subheadings in those indexes in which many headings have a large number of subheadings. Placing "see also" references before subheadings will prevent these references from being overlooked or found only after perusing unwanted subheadings. In these cases, "see also" cross-references should be clearly distinguished from subheadings. They can be displayed on lines indented more deeply than subheadings: economics 144, 195, 229, 363 @@see also@@ assets; banking; business; commerce; firms; transport; wealth bibliographies 208 mathematical models 160 statistics 155 Cross-references should be attached to the heading or the subheading from which they refer: economics statistics 155 @@see also@@ econometrics When a cross-reference leads from a subheading under one main heading to the same subheading under another main heading, the reference should include both the main heading and the subheading referred to: economics statistics 155 @@see also@@ economic policy -- statistics 6.10.2. Vocabulary information in non-displayed indexes. 
To be useful, vocabulary information, including relations among terms, should be displayed to users in conjunction with and in the same medium that is used to search the index. Users should not have to consult a completely separate vocabulary file, copy down terms, and then re-enter them in order to use them for a search. Users should have the option of automatic or selective addition or replacement of synonymous and equivalent terms. In the automatic mode, if index terms are limited to preferred terms, the preferred term should replace any synonymous or equivalent terms. When all such terms may be used, then all should be added to the search. The user should be notified of any modification to the search. In the selective mode, preferred, synonymous, or equivalent terms must be displayed so that they may be selected for replacement or addition. Users should have the option of seeing displays of other vocabulary information and selecting broader, narrower, or other related terms for use in their search, either in addition to terms already selected, or in place of earlier terms. Methods for effectively displaying vocabulary information for non-displayed indexes are not yet well established. The development of such methods should be encouraged, since vocabulary information is essential for efficient use of indexes of all types. 6.10.3. Scope and history notes A scope note helps clarify the scope or application of a term. It should be set off from the term itself by means of type or layout. A history note explains changes in usage over time in a continuing index: "Radio" replaced "wireless" in 1950. 
This information may also be presented in the form of a cross-reference: radio @@in earlier volumes see@@ wireless wireless @@see@@ radio When both old and new entries are present in the same index, "see also" references must be used: radio @@see also@@ wireless @@for references before 1950@@ wireless @@see also@@ radio @@for references from 1950 onward@@ When cross-references refer to newer terms that formerly were subsumed under a broader term, dates should be attached to terms so that users know when such terms were introduced: computers @@see also@@ microcomputers [1977]; minicomputers [1972] microcomputers @@see also@@ computers for entries before 1977 7. Syntax for entries, headings, and search statements. 7.1. Summary. A major recommendation of this standard is that an index make it possible for users to search for multiple topics or features, or aspects of topics or features, in combination. In displayed indexes, this capability is provided by combining terms into headings and headings into articulated entries. In non-displayed indexes, this capability is provided by a search procedure that accepts search statements with multiple terms. The means for combining terms in displayed headings and entries or in search statements is syntax. A variety of syntactic styles and methods are available. This standard does not recommend any particular syntactic method for either displayed or non-displayed indexes; it simply states that every index must incorporate some syntactic method so that terms can be combined for the purposes of searching. The type, format, and size of an index and the needs and preferences of users will govern the choice of appropriate method. The combination of terms takes place in advance of the search in displayed indexes; pre-combined headings and entries will be provided by the index producer. The combination of terms takes place at the time of the search in non-displayed indexes.
The index producer provides the means for combining terms, but the actual combination is performed by the user of the index. This section provides examples of the major types of index syntax available. Locators are an integral part of entries in displayed indexes. They are also discussed in this section. Note: This section includes examples of syntax only. The absence of cross-references in these examples in no way implies that the use of any particular syntactic method or style will by itself fulfill the recommendations of this standard WITHOUT also providing for some method of linking synonymous, equivalent, and related terms. See section 6. Vocabulary. 7.2. Entries in displayed indexes. In displayed indexes, an entry consists of at least a main heading and a locator that identifies or points to the documentary unit that the heading describes. A heading may consist of more than one index term. In an articulated entry, the main heading may be followed by a subheading, which in turn may be followed by a sub-subheading. In displayed indexes, identical headings for subsequent entries are generally not displayed. Nevertheless, each locator represents a single entry. For example, the following display consists of 6 entries. economics 144, 195, 229 bibliographies 208, 244, 363 The actual entries, displayed in full, are: economics 144 economics 195 economics 229 economics \ bibliographies 208 economics \ bibliographies 244 economics \ bibliographies 363 In displayed indexes, identical headings should not be followed by more than 5 locators (that is, 5 or more identical entries should be avoided). Identical entries may be distinguished and made more specific by the addition of further terms or subheadings for context or aspect. Even when single term entries are unique, the addition of terms or the use of subheadings is desirable.
The additional terms or headings can provide additional information to help the user determine whether the documentary unit might be useful. The addition of a term indicating context or aspect can often relieve users of useless pursuits. Such additional terms provide criteria for eliminating irrelevant references without having to consult each one. 7.3. Syntax in displayed indexes. Syntax used in displayed indexes is often called "pre-coordination" syntax because the combination of terms must take place prior to the presentation of the index. The following sections describe examples of pre-coordination syntax. 7.3.1. Ad hoc syntax. Syntax is often applied on a case-by-case, heading-by-heading basis in closed-end indexes such as book indexes. Individual headings are created as appropriate for the nature of the indexed document and for the needs of prospective users. Prepositions should not be used at the end of subheadings unless needed to avoid ambiguity. When a subheading follows a main heading in natural syntactic order, an initial preposition or verb must be used to clarify the meaning: computers compared with abacus for management in hospitals management of food rationing [not: rationing of] land irrigation [not: irrigation of] taxation [not: taxation of] use [not: use of] management of computers management use of computers rationing food fuel health care use land capital labor Note: The above examples are shown as they would appear in a printed index, but other styles of linking lead terms and subheadings are also possible, depending on index format and medium. 7.3.2. Natural language syntax. Some indexes take advantage of the syntax of pre-existing segments of text to provide syntax for index headings. The most common examples are keyword indexes based on document titles, but the text segment could come from other parts of documents as well, for example, section titles or captions. 7.3.2.1. 
KWIC indexes

Key Word in Context (KWIC) indexes are created by computer algorithms that rearrange titles or other brief text segments (usually not exceeding one line of print) in order to highlight keywords by position and sometimes by typography. The natural word order of the title or text segment is preserved on both sides of a keyword, which is displayed in a central column. Headings are arranged on the basis of the keyword and words following the keyword. Text that extends beyond the right margin may be moved to the left-hand portion of the column if there is room, or may be eliminated:

zation and presentation of INDEXES. Guidelines for the cont
al Standard Guidelines for INDEXES in Information Retrieval
 Guidelines for Indexes in INFORMATION Retrieval American N
     dexes in Inf American NATIONAL Standard Guidelines for In+

Note: Word pairs and phrases within the text segment (for example, Information Retrieval; National Standard) are preserved in the ordering of keywords so that such word pairs or phrases may be sought in the alphabetical sequence.

7.3.2.2. KWOC indexes

Key Word Out of Context (KWOC) indexes resemble the traditional format of indexes with a lead term on the left followed by an indented subheading. The lead term is usually not repeated when it is the same for subsequent entries:

Indexes
   American National Standard Guidelines for Indexes in Information Retrieval
   Guidelines for the content, organization and presentation of indexes
Information
   American National Standard Guidelines for Indexes in Information Retrieval
National
   American National Standard Guidelines for Indexes in Information Retrieval

Note: Word pairs and phrases are not preserved in the alphabetical sequence of keywords. It is not possible to look up directly phrases such as "information retrieval," "national standard," etc., unless the second word happens to be the first word of the subheading formed from the title or text segment.
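The KWIC and KWOC arrangements described in 7.3.2.1 and 7.3.2.2 can be sketched roughly as follows. This is a minimal Python illustration, not the algorithm of any particular indexing system: the stoplist, the function names, and the column width are assumptions, and text that overflows a column is simply cut off rather than wrapped to the other side.

```python
import re

STOPWORDS = {"and", "for", "in", "of", "the"}  # assumed stoplist

TITLES = [
    "American National Standard Guidelines for Indexes in Information Retrieval",
    "Guidelines for the content, organization and presentation of indexes",
]


def kwic(titles, stopwords=STOPWORDS):
    """Return (left context, KEYWORD, right context) rows, sorted by the
    keyword and then by the words that follow it."""
    rows = []
    for title in titles:
        words = title.split()
        for i, word in enumerate(words):
            key = re.sub(r"\W", "", word).lower()
            if key and key not in stopwords:
                rows.append((" ".join(words[:i]), key.upper(), " ".join(words[i + 1:])))
    rows.sort(key=lambda row: (row[1].lower(), row[2].lower()))
    return rows


def format_kwic(rows, width=30):
    """Right-align the left context so keywords share a central column;
    anything beyond `width` characters on either side is dropped."""
    return [f"{left[-width:]:>{width}} {kw} {right[:width]}" for left, kw, right in rows]


def kwoc(titles, stopwords=STOPWORDS):
    """Return {lead term: [full titles]}, with each title as a subheading."""
    index = {}
    for title in titles:
        for key in {w.lower() for w in re.findall(r"\w+", title)} - stopwords:
            index.setdefault(key.capitalize(), []).append(title)
    return {lead: sorted(subs) for lead, subs in sorted(index.items())}
```

Because kwic sorts on the words following the keyword, adjacent word pairs stay together in the sequence, whereas kwoc files each whole title under an isolated lead term.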
The loss of direct access to word pairs and phrases is a disadvantage. 7.3.2.3. KWAC indexes Keyword Alongside Context (KWAC) indexes preserve word pairs and phrases in the alphabetical sequence of keywords while at the same time imitating the traditional format with the lead term on the left: Indexes Guidelines for the content, organization and presentation of in Information Retrieval. American National Standard Guidelines for Information Retrieval. American National Standard Guidelines for Indexes in National Standard Guidelines for Indexes in Information Retrieval. American 7.3.3. Subject headings [Hans has crossed out this entire section, saying "not a syntax," but subject headings incorporate rules and patterns of syntax and they are a major source of syntactic devices in many types of index. -- JDA] Syntax may be provided by using pre-established lists of subject headings. Such lists generally include headings consisting of pre-combined terms or provisions for combining terms at the time of indexing in accordance with rules or patterns. Pre-combination may be achieved in two ways: a. linking terms to each other by long dashes: animals -- diseases -- chemotherapy libraries -- New Jersey -- New Brunswick b. modifying the lead term by other terms so that the natural word order is inverted: students, foreign Both methods may be combined, to create headings such as "students, foreign -- statistics". Note: These standards advocate natural language word order for terms and headings. The inverted headings shown here reflect earlier practice, still extant in some lists of subject headings. See 6.3.8. Terms consisting of more than one word. 7.3.4. Permuted indexes. Permuted indexes display every possible combination of words from a segment of text or from a set of terms. The result is a main heading-subheading combination for each word pair. 
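A two-word permuted index of this kind can be sketched in a few lines of Python. The sketch is illustrative only: the stoplist and the names keywords, permuted_index, and PERMUTED are assumptions, and only words that occur together in the same title are paired.

```python
import re
from collections import defaultdict

STOPWORDS = {"and", "for", "in", "of", "the"}  # assumed stoplist

TITLES = [
    "American National Standard Guidelines for Indexes in Information Retrieval",
    "Guidelines for the content, organization and presentation of indexes",
]


def keywords(title):
    """Lower-cased words of a title, minus stopwords."""
    return {w for w in re.findall(r"[a-z]+", title.lower()) if w not in STOPWORDS}


def permuted_index(titles):
    """Map each keyword (main heading) to every other keyword that shares
    a title with it (subheadings), both in alphabetical order."""
    pairs = defaultdict(set)
    for title in titles:
        kws = keywords(title)
        for lead in kws:
            pairs[lead] |= kws - {lead}
    return {lead.upper(): sorted(s.upper() for s in subs)
            for lead, subs in sorted(pairs.items())}


PERMUTED = permuted_index(TITLES)
for lead, subs in PERMUTED.items():
    print(lead)
    for sub in subs:
        print("   " + sub)
```

Every keyword becomes a main heading in turn, which is why the output contains combinations that no human indexer would have chosen.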
Since the number of such combinations grows exponentially as the number of words in each heading increases, permuted indexes are usually restricted to headings consisting of no more than two words. The titles used for the previous examples of keyword indexes would produce the following permuted index headings:

INDEXES
   AMERICAN
   CONTENT
   GUIDELINES
   INFORMATION
   NATIONAL
   ORGANIZATION
   PRESENTATION
   RETRIEVAL
   STANDARD
INFORMATION
   AMERICAN
   GUIDELINES
   INDEXES
   NATIONAL
   RETRIEVAL
   STANDARD
NATIONAL
   AMERICAN
   GUIDELINES
   INDEXES
   INFORMATION
   RETRIEVAL
   STANDARD

[Hans says "bad example", but it applies permutation to the same example as for other keyword indexes. Yes, there are ridiculous entries, but folks tend to ignore or overlook them. That is the price of automatic formatting -- a price that millions are willing to pay, which is how ISI became a multi-million dollar industry!]

7.3.5. String indexing.

String indexing uses computer algorithms to combine multiple terms into multiple headings. Each heading has a different term as its lead or main term. The set of terms is treated like a "string" or sequence of terms that is rearranged under each lead term. The terms themselves may be assigned by human indexers. It is their manipulation into index entries that is governed by computer algorithms.

7.3.5.1. Rotated terms.

The simplest form of string indexing places each term, in turn, in the lead position followed by all other terms in alphanumeric order. In the following examples, numerals are filed after letters:

American Mercury (periodical).
   Editors and Editing. Ku Klux klan. Mencken, H. L. Methodist Episcopal Church (South). Temperance Movements. 1910-33.
Editors and Editing.
   American Mercury (periodical). Ku Klux klan. Mencken, H. L. Methodist Episcopal Church (South). Temperance Movements. 1910-33.
Ku Klux klan.
   American Mercury (periodical). Editors and Editing. Mencken, H. L. Methodist Episcopal Church (South). Temperance Movements. 1910-33.

[Hans says "bad example."
This is taken from ABC Clio's American History and Life. It seems to me a fine example of rotated string syntax. The idea here is that lead terms modified by other terms are better than isolated terms, and users can construct sense out of the conceptual order of the subheadings.] 7.3.5.2. Faceted indexing Faceted indexing arranges terms in entry strings according to facet relationships. Terms are first placed into facets or tagged with facet indicators. A computer algorithm then uses the facet to arrange the terms into subheadings. Faceted indexing that is designed to accommodate broad subject areas uses generic term categories like location, key system or entity, action or effect of action, agent or instrument, viewpoint or aspect, particular instance, document form, and target user. These primary categories are sometimes modified by secondary categories such as part, property, role definer, modifiers, dates, and various connectives. The following coded terms will produce the following headings and subheadings: [location] West Germany [key entity] cars [modifier] Japanese [action] sales [role definer] effects of / on [agent, instrument] advertising Germany Japanese cars. sales. effects of advertising cars. Germany Japanese cars. sales. effects of advertising Japanese cars. Germany sales. effects of advertising Sales. Japanese cars. Germany effects of advertising advertising. Japanese cars. Germany effects on sales When faceted indexing is applied to a narrow subject area, facets tend to be tailored to aspects of particular interest in that subject area. In literature, for example, terms may be placed into facets such as specific literatures, performance media, language, periods, individuals (real), groups/movements, genres, works, literary techniques, themes/motifs/figures/characters, influences, sources, processes, methodological approaches, theories, devices/tools, and disciplines. 
The designated citation order for these facets determines the order of terms in subheadings: HOMOSEXUALITY English literature. short story. 1900-1999. Forster, E. M. "Dr. Woolacott." symbolism. treatment of salvation; HOMOSEXUALITY. SALVATION English literature. short story. 1900-1999. Forster, E. M. "Dr. Woolacott." symbolism. treatment of SALVATION; homosexuality. SYMBOLISM English literature. short story. 1900-1999. Forster, E. M. "Dr. Woolacott." SYMBOLISM. treatment of salvation; homosexuality. 7.3.5.3. Ad hoc coding. Some forms of string indexing require indexers to encode a natural language statement. The statement may be created to describe a document or may already exist as a text segment, such as a title. For example, NEPHIS (Nested Phrase Indexing System developed by Timothy Craven) uses pointed brackets < > to enclose meaningful words or phrases that deserve direct entry; the question mark is used to introduce connectives, usually prepositions; and the symbol '@' is used to turn off otherwise automatically generated headings. The following coded statement will result in the following headings: @effects? of ? on >? in > advertising effects on sales of Japanese cars in Germany cars Japanese -. sales in Germany. effects of advertising Germany sales of Japanese cars. effects of advertising Japanese cars sales in Germany. effects of advertising sales of Japanese cars in Germany. effects of advertising 7.3.5.4. Chain indexing. Chain indexing is based on the terms and the citation order of facets or aspects in a classification scheme. The chain index produces headings that complement the classification scheme by creating a "chain" of terms from the classification heading but reversing the order in which facets or aspects are cited. 
The following headings from the Dewey Decimal Classification would produce the following chain index entries:

100 philosophy
170 ethics
172-179 applied ethics
178 ethics of consumption
178.1 in use of alcoholic beverages

alcoholic beverages: consumption: applied ethics 178.1
applied ethics 172-179
beverages: alcoholic: consumption: applied ethics 178.1
consumption: applied ethics 178
ethics 170
philosophy 100

[Hans says: choose better example (no centered headings). But centered headings are common in classification. Jessica wonders "is this used enough today to warrant inclusion here?" But we say elsewhere that indexes that are displayed in a classified or relational array need an alphabetical index as well, and chain indexing is a natural and efficient method for generating such an index. See 8.1.1.2. Classified or relational displays.]

7.3.6. Syntactic Cross-references.

When syntax rules place terms only in secondary positions in headings or entries, cross-references must be used to provide direct access to such terms:

United States -- History -- Civil War -- Bibliography [established heading]
History -- United States @@see@@ United States -- History
Civil War -- United States -- History @@see@@ United States -- History -- Civil War
Bibliography @@see also@@ particular topics with "bibliography" as subheading, for example: United States -- History -- Civil War -- Bibliography

7.4. Locators in displayed indexes.

The purpose of a locator is to lead the user to the documentary unit or to a description of the documentary unit to which an index entry refers. The nature of the locator depends on the medium and type of index and on the type of documentary units to which the index refers. In electronic indexes, index terms or headings may be linked to documentary units or to their surrogates without visible locators. Locators should refer as directly and succinctly as possible to the documentary units to which index headings refer. (See also 5.7. Documentary units.)
7.4.1. Locators for printed documents. Printed books, pamphlets, periodicals, and similar documents normally consist of numbered pages bound into one or more volumes. Pages are the traditional documentary units for indexes to printed books, pamphlets, and similar documents because pages are usually numbered while inherent textual or conceptual units, such as paragraphs, are usually not numbered. If pages are divided in some way, such as into columns, such smaller units may be used instead of or in addition to pages. With certain classes of printed material, inherent textual units are often numbered and may therefore be used as locators. For example, parts of plays may be referred to by act, scene, and line number, and parts of books of the Bible by chapter and verse number. If documents have numbered paragraphs, then paragraphs rather than pages should be used as documentary units and paragraph numbers should be used as locators. When a document consists of a series of uniquely numbered discrete units, such as abstracts, quotations, or case reports, these units are preferable as locators. When there is more than one numbered sequence, they must be distinguished typographically: Livingstone, Ken 1/3, 1/97, 3/94 or Livingstone, Ken 1:3, 1:97, 3:94 When indexing several issues or volumes of a periodical or serial publication, locators should be based on the numbering of the issues at the time of publication. When documentary units are documents within a collection, for example, articles in a periodical, chapters in a monograph, or letters in an archive, sufficient information must be given to identify the document. For periodical articles, each locator normally consists of: author(s); title of article; title of periodical; volume, issue number, inclusive pagination, and date. 
The content, format, punctuation, and order of elements should conform to relevant standards, such as @@American National Standard for Bibliographic References@@ (ANSI Z39.29-1977); @@International Standard, Documentation -- Bibliographic references -- Content, Form and Structure@@ (ISO 690: 1987 (E)); @@International Standard, Documentation -- Bibliographic Identification (biblid) of Contributions in Serials and Books@@ (ISO 9115: 1987). Abbreviation of names and titles should be avoided, especially if they can be searched electronically. [In the penultimate paragraph, Hans wants to list date before volume in accordance with ISO 9115, but the NISO standard Z39.29 puts date after volume, etc. Shouldn't we follow NISO?] 7.4.2. Locators for documents in other media. Documents in other media may, for indexing purposes, be divided into three types: a. Those consisting of elements that form one or more sequences that are, or may be, continuously numbered and so accessed by the user. Such materials may be treated broadly as in 7.4.1. Examples are a collection of slides, a filmstrip, an audiodisc, or a machine-readable database. Locators would be slide numbers, frame numbers, side and band numbers, and record identifiers respectively. b. Those consisting of one or more sequences of elements that cannot be distinguished numerically or so accessed by the user. Examples are serially accessed materials such as motion picture film, audio, and video recordings. In these cases, relative locators must be devised, such as playing time from a particular point. c. Those not consisting of sequences, such as individual maps, plans, charts, pictures, sculptures, and realia. In some cases specific conventions exist, such as either grid references or coordinates for maps. In other cases, locators must be devised ad hoc. Note: Most machine-readable text files fall into either category (a) or category (b). 
Locators for such files may also take the form of file pointers or embedded terms or tokens. 7.4.3. Multiple locators in print indexes to single documents. If a subject is given continuing treatment in a consecutively numbered sequence, reference should be made to the first and last numbered elements only (for example, 3-11). The first and last element should be given in full in order to avoid ambiguity (for example, 20-25, 103-112, 1014-1027, not 20-5, 103-12, 1014-27). However, when locators are extremely long (5 digits or more) or where space is limited, unchanged initial numerals may be elided (for example, 100026-28). Expressions such as "3ff" or "3 et seq." should not be used because they are confusing to most users and may give incomplete information unless defined for a particular index. If the treatment of a subject appears in a consecutively numbered sequence but consists of separate treatments (as opposed to a single, continuing treatment), individual locators for each numbered element should be used (for example, 3, 4, 5). 7.4.4. Methods of emphasizing locators in print indexes. If an entry includes several locators, the reference leading to the fullest or most significant treatment may be emphasized typographically or by position. Locators that relate to particular types of matter, such as tables and illustrations, may also be emphasized. Locators to illustrations, for example, may be italicized, enclosed in brackets, or prefixed or suffixed with an 'i' or asterisk. Where more than one type of material is indicated, it is preferable to use the same system for all (for example, 'i' for illustrations, 'm' for maps, 't' for tables). economics 144, 195, @@229@@ [major treatment emphasized by typography] bibliographies 208, 244, 363 major university departments 210-212, 211m, 212t [map and table indicated] economics 229; also 144, 195 [major treatment emphasized by location of locator] 7.4.5. Presentation of locators in print indexes. 
Locators should be clearly separated from headings by spacing, punctuation, or both, for example, by two spaces or by a comma or colon plus one space. The method used should depend on the nature of headings and the kind of punctuation used within headings. For example, headings that may end with commas and dates or other numerals should not use a comma plus space to introduce locators:

Paris, 1989 : 1934, 2045 [not Paris, 1989, 1934, 2045]
vitamin B 12: 13, 15 [not vitamin B 12 13, 15]

The method for presenting locators should be consistent throughout an index.

7.4.6. Presentation of other identifying data in print indexes.

Some indexes add information to citation locators indicating the presence of photographs, tables, and other illustrations or features. These indications, like index headings and subheadings, assist users in deciding whether documentary units are likely to be of value to them. They should be placed after the locator, separated from it by a period:

Doe, John. The indexing of pictures. Journal of Indexing. 2: 25-87; 1983. 4 ill. (1 col.), 1 table, bibl.

7.5. Syntax in non-displayed indexes.

In non-displayed indexes, the search statement plays a role similar to the index heading or entry in a displayed index, in the sense that the search statement may combine terms representing the topics and features and their aspects that a user is seeking. Since terms are combined after any initial indexing is done, syntax in non-displayed indexes is often called "post-coordination syntax." Two common approaches in post-coordinate indexing are the use of Boolean operators (AND, OR, NOT) and the use of weighted terms. Post-coordination syntax also includes proximity operators, stemming, and truncation. In addition, such syntactic devices as the use of links and role indicators may influence the application of post-coordinate syntax.
Compared to displayed indexing, non-displayed indexes are new, and additional methods for creating search statements are under development. 7.5.1. Boolean syntax. Boolean syntax combines terms using the operators AND, OR, and NOT. It has become a de facto standard for non-displayed indexes in electronic databases, but it has two major drawbacks: First, the meaning of AND and OR does not correspond to the usual senses of these words. Second, the Boolean search divides a database into two distinct sets, retrieved and not retrieved. Retrieved documents are not ranked in any way on the basis of possible or probable interest. Documents that meet most but not all requirements of a search are not retrieved. In Boolean searches, retrieval is "either/or" -- no "maybes" are considered. In most syntactic systems, the addition of terms serves to limit the scope of a search. The use of the Boolean OR has the opposite effect. It increases the scope of a search. It is often used to combine synonymous or equivalent terms. 7.5.2. Weighted term syntax. Searching by weighted term combination, also called "vector" or "probabilistic" searching, combines terms and retrieves all documents that are represented by one or more of the search terms. Retrieved documents are then ranked, with those having the highest number of terms coming first. Both index terms and search terms may be weighted to reflect importance or interest. Such weights can further influence the calculation of "retrieval scores" for the purposes of ranking documents. Instead of dividing a database into two distinct sets (retrieved and not retrieved), weighted combination searching rearranges the entire database along a continuum of estimated degree of interest based on the search statement. 7.5.3. Proximity operators, stemming, and truncation. Both Boolean and weighted term combination syntax may be combined with a wide variety of methods for broadening or limiting the scope of a search statement. 
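The contrast drawn in 7.5.1 and 7.5.2 between Boolean and weighted term searching can be made concrete with a small Python sketch. The toy database DOCS, the function names, and the scoring rule (sum of the weights of matching terms) are assumptions chosen for illustration. Real retrieval systems use more elaborate weighting, and this sketch leaves out documents that match no term rather than ranking the entire database.

```python
# Hypothetical toy database: document id -> set of index terms.
DOCS = {
    "d1": {"indexes", "syntax", "boolean"},
    "d2": {"indexes", "syntax"},
    "d3": {"indexes"},
    "d4": {"thesauri"},
}


def boolean_and(docs, terms):
    """Retrieve only documents indexed with every search term."""
    return sorted(d for d, t in docs.items() if set(terms) <= t)


def boolean_or(docs, terms):
    """Retrieve documents indexed with at least one search term."""
    return sorted(d for d, t in docs.items() if set(terms) & t)


def weighted_search(docs, weights):
    """Rank documents that match at least one term by the summed weights
    of the matching terms, highest score first."""
    scores = {d: sum(w for term, w in weights.items() if term in t)
              for d, t in docs.items()}
    hits = [(d, s) for d, s in scores.items() if s > 0]
    return sorted(hits, key=lambda pair: (-pair[1], pair[0]))


QUERY = ["indexes", "syntax", "boolean"]
print(boolean_and(DOCS, QUERY))
print(boolean_or(DOCS, QUERY))
print(weighted_search(DOCS, {term: 1 for term in QUERY}))
```

With equal weights, the AND search returns only the document that carries all three terms, while the weighted search returns the same document first and then the partial matches that the AND search silently drops.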
These methods include, but are not limited to, the use of: proximity operators, which specify that two or more terms must be in certain proximity; stemming, which removes certain suffixes and/or prefixes; and truncation, which permits the use of parts of words, including word-roots.

7.5.4. Links and role indicators.

Links and roles are syntactic devices applied at the indexing stage that are designed to make post-coordinate searching more precise. Links are used to indicate terms that may be logically linked to represent topics or features of the documentary unit. Linking eliminates accidental retrieval of documentary units by the combination of terms that individually describe the documentary unit but that have no logical relationship. For example, the following terms for a documentary unit on the poetry of Thomas Hardy and the novels of E. M. Forster would be linked:

Forster, E. M. -- novels
Hardy, Thomas -- poetry

If a search is limited to linked terms, then the term combination "Hardy, Thomas -- novels" would not retrieve this documentary unit.

Roles are used to indicate the roles played by concepts represented by particular terms in particular documentary units, for example:

insulin -- therapeutic use
insulin -- product

These role indicators would prevent the retrieval of a documentary unit treating the manufacture or marketing of insulin as a product when its therapeutic use was wanted. Role indicators can consist of role terms, as in these examples, or of special notation.

8. Display of index arrays.

In print media, individual index entries must be displayed in ordered arrays, which provide the means of access to particular headings and entries. Therefore, the method of ordering entries is absolutely crucial. In electronic media indexes, entries may be sought by means of electronic matching without regard to index order. However, index displays in electronic media may suggest options for searching and permit browsing and scanning.
Such electronic visual index displays also need to be arranged in helpful order. Entries retrieved by means of searching non-displayed electronic indexes are displayed in arrays after retrieval. These arrays, too, should be ordered according to useful criteria.

8.1. Introductory note.

If a displayed index is not straightforward or its conventions are not self-explanatory, an explanatory introductory note should precede the index. Any abbreviations, symbols, or typographical conventions requiring explanation should be included in this note. In the case of separately published indexes, the introductory note should include sufficient bibliographic information (for example, author, title, publisher, place and date of publication or periodical volumes/issues) in order to completely identify the documents indexed. (See also 5. The design of indexes.)

========================================================================= Date: Wed, 28 Apr 1993 16:44:42 ECT Reply-To: Indexer's Discussion Group Sender: Indexer's Discussion Group From: "Neva J. Smith" Subject: Fax Machines or e-mail/ftp In-Reply-To: <9304281851.AA14235@emx.cc.utexas.edu> ----------------------------Original message---------------------------- Has anyone used e-mail or ftp as a substitute for the fax machine or fax modem? What results have you had? Smith, Neva J. dba DataSmiths Information Services njsmith@emx.cc.utexas.edu =========================================================================