Glossary

 

Aboutness – Aboutness is the content or subject of an entity. 


Access points – Access points are portions of a bibliographic record that allow access to the entity corresponding to that record.  In cataloguing, some access points are title, author and ISBN.


Authority files – An authority file contains all the allowed spellings and representations of items that are in the database the file controls.  The authority file is an example of a controlled vocabulary for a catalog.
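
As a rough illustration, an authority file can be pictured as a lookup from variant forms of a name to one authorized heading. The Python sketch below is only that, a sketch: the dictionary contents and the function name authorized_form are chosen for illustration and are not taken from any real authority file.

```python
# Illustrative authority file: each variant form maps to one authorized heading.
# The entries are assumptions made up for this example.
AUTHORITY_FILE = {
    "twain, mark": "Twain, Mark, 1835-1910",
    "clemens, samuel langhorne": "Twain, Mark, 1835-1910",
    "mark twain": "Twain, Mark, 1835-1910",
}


def authorized_form(name):
    """Return the authorized heading for a name, or None if it is not controlled."""
    key = " ".join(name.lower().split())  # normalize case and spacing
    return AUTHORITY_FILE.get(key)


print(authorized_form("Clemens, Samuel Langhorne"))
print(authorized_form("  MARK   TWAIN "))
```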


Basic-Level category – A basic-level category is similar to a natural category: one that most people recognize and understand.  Examples are color or furniture.


Bibliographic control – Bibliographic control is the way that descriptions of information packages are formed or created.  It is the way that knowledge is organized.


Case frames – A case frame is a structural tool for creating a relationship between entities.


Compositionality – In natural language, compositionality is the creation of a different, more complex meaning by combining two words.  Examples are "hot tea" and "piggy bank."

Conceptual schema – A conceptual schema comprises all the content of an information system and the rules for input, querying, and export of that content.

Crosswalk – A crosswalk is a program or system for switching between two classifications, for example moving from Dewey Decimal to Library of Congress.


Data Model – A data model is a system for connecting entities to create a database.  Three common types of data models are relational, entity-relationship, and object-oriented.

 

Data Warehousing – Data warehousing is the way that information is stored for retrieval and searching.  It must be stable and useful.


Descriptive cataloguing – As opposed to subject analysis, descriptive cataloguing identifies and describes information packages.  Descriptive cataloguing is about of-ness, rather than aboutness.


Decision Support Systems (DSS) – A Decision Support System is one type of data warehousing.  It is meant as an information management tool.


Facets – Facets are the individual aspects or characteristics of an entity, each of which can be used to describe it.  In combination, facets classify the entity.


Entities – Entities are objects made up of a set of values within a domain.


Free-Text – Free text is like full text: all the words are searchable and the vocabulary is uncontrolled.


Inheritance – Inheritance is a type of relationship where entities share attributes with levels both above and below them.  It is similar to a family structure.


Knowledge Representation – Knowledge representation is the physical layout of information. 


KWIC Index – A Keyword in Context (KWIC) index displays each keyword together with the words that surround it in the source, such as a title.  All the words of the source are included in the entry.
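
A minimal Python sketch of how KWIC lines might be generated from titles follows. The stop list, the column width, and the function name kwic are assumptions made for this example; actual KWIC indexes differ in layout and in which words they skip.

```python
# Assumed stop list; a real KWIC index would choose its own.
STOP_WORDS = frozenset({"a", "an", "and", "in", "of", "the"})


def kwic(titles, width=20):
    """Return KWIC lines sorted by keyword, with keywords aligned in one column."""
    entries = []
    for title in titles:
        words = title.split()
        for i, word in enumerate(words):
            if word.lower() in STOP_WORDS:
                continue
            # Context to the left is right-aligned so every keyword starts
            # in the same column; context to the right is truncated.
            left = " ".join(words[:i])[-width:].rjust(width)
            right = " ".join(words[i + 1:])[:width]
            entries.append((word.lower(), f"{left}  {word}  {right}".rstrip()))
    entries.sort()
    return [line for _, line in entries]


for line in kwic(["The Organization of Information", "Information Retrieval Systems"]):
    print(line)
```

Each title appears once per keyword, and the surrounding words stay visible on both sides of the keyword column.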


KWOC Index – A Keyword out of Context (KWOC) index pulls each keyword out of its surrounding text and lists it on its own as a heading.  Only the keywords are used as entries.


Literary Warrant – Literary warrant is the idea that one should classify the items one has at hand, and only add more items when they become part of the collection.  The Library of Congress followed this idea when creating its classification system.


Martel's Seven Points – Martel's Seven Points are guidelines for Library of Congress cataloguing.  They serve as a tool for those who are classifying information packages.


MARC records – Machine Readable Cataloguing records are a formal way of representing information packages.  The records include fields with specific attributes of the package.


Parsing – Parsing breaks a sentence into its individual parts and tests the grammar behind it.  It is used to test databases and programs.

  
PMEST – PMEST is the facet formula of the classification system developed by Ranganathan.  It stands for Personality, Matter, Energy, Space, and Time.


Polysemy – Polysemy is a lexical relationship in which a single word has many meanings.


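Precision and Recall – Precision and recall are two measures of relevance in search results.  Precision is the share of retrieved items that are relevant; recall is the share of relevant items that are retrieved.  They tend to have an inverse relationship.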
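
As a small worked example, the two measures can be computed from a set of retrieved items and a set of relevant items, using the standard definitions (precision = relevant retrieved / all retrieved; recall = relevant retrieved / all relevant). The function name and the document ids below are made up for illustration.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one search result set."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = retrieved & relevant  # items that are both retrieved and relevant
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical search: 4 documents retrieved, 3 documents actually relevant.
p, r = precision_recall({"d1", "d2", "d3", "d4"}, {"d1", "d2", "d5"})
print(p, r)
```

Here two of the four retrieved documents are relevant (precision 0.5), and two of the three relevant documents were found (recall about 0.67).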


PRECIS – The Preserved Context Index System (PRECIS) is an alternate indexing system that is not widely accepted.


Pre-coordinated vs. post-coordinated – With pre-coordinated (or pre-combined) indexing, the indexer keeps words together to create more complex concepts.  With post-coordinated (or post-combined) indexing, the searcher combines words to search out complex concepts.
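
Post-coordination can be sketched as the searcher combining single-concept terms at search time, here with a Boolean AND over sets of document ids. The index contents and the function name search_all are hypothetical and serve only to illustrate the idea.

```python
# Hypothetical post-coordinate index: each single term maps to a set of document ids.
INDEX = {
    "libraries": {1, 2, 4},
    "children": {2, 3, 4},
    "history": {4, 5},
}


def search_all(*terms):
    """Return ids of documents indexed under every given term (Boolean AND)."""
    postings = [INDEX.get(term, set()) for term in terms]
    return set.intersection(*postings) if postings else set()


print(search_all("libraries", "children"))
print(search_all("libraries", "children", "history"))
```

In a pre-coordinated system the indexer would instead have assigned a combined heading in advance, so no combining would be needed at search time.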


Primary key – The primary key in a relational database is a unique field that serves as the main access point for an entity.
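
The uniqueness of a primary key can be seen in a short Python sketch using the standard sqlite3 module. The table name, column names, and identifier values are assumptions made up for this example.

```python
import sqlite3

# In-memory database with a hypothetical "book" table keyed on isbn.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE book (isbn TEXT PRIMARY KEY, title TEXT NOT NULL)")


def insert_book(isbn, title):
    """Insert a record; return False if the primary key value already exists."""
    try:
        conn.execute("INSERT INTO book (isbn, title) VALUES (?, ?)", (isbn, title))
        return True
    except sqlite3.IntegrityError:
        return False


print(insert_book("isbn-001", "First example title"))   # new key value
print(insert_book("isbn-001", "Second example title"))  # same key value, rejected
```

Because the key is unique, looking up a record by isbn returns at most one entity.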


Query languages – Query languages are the rules by which information is found or searched within a database.


Schedule (classification) – A classification schedule is a printed or electronic volume that breaks down knowledge into subsets, or classes, for cataloguing purposes.


Shelflist – A shelflist is the order in which items are classified or kept within a collection.


State machine – A state machine models an entity that can be in only one of a fixed set of states at a time and that moves between states by defined transitions.  Examples include a traffic light (red, yellow, green) and a door (open, closed).
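
The traffic light example can be sketched in Python as follows. The class name, the transition table, and the choice of red as the starting state are assumptions for illustration.

```python
# Allowed transitions: each state has exactly one successor.
TRANSITIONS = {"green": "yellow", "yellow": "red", "red": "green"}


class TrafficLight:
    """A light that is always in exactly one of three states."""

    def __init__(self, state="red"):
        if state not in TRANSITIONS:
            raise ValueError(f"unknown state: {state}")
        self.state = state

    def advance(self):
        """Move to the next state and return it."""
        self.state = TRANSITIONS[self.state]
        return self.state


light = TrafficLight()
print(light.advance(), light.advance(), light.advance())
```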


Subject analysis – Subject analysis seeks to classify the aboutness of an entity, its topic or theme.  From this, one can categorize and sort entities.


Subsumption – Subsumption is a type of hierarchy.  General categories subsume, or include, narrower categories.

 

Taxonomies – A taxonomy is a type of knowledge representation that categorizes concepts and entities.  Taxonomies are often hierarchical.

 

Technical reading – Cataloguers perform a technical reading of an information package in order to perform descriptive cataloguing and subject analysis.


Thesaurus – A thesaurus is a type of index, with a controlled vocabulary that uses hierarchy, equivalence and association to show relationships between terms.


Truth Value – A truth value is the value, true or false, assigned to a statement in simple logic.  Boolean searching relies on truth values.


Erika McCoy
eamccoy@jhu.edu
Updated December 17, 2002