Metadata Thesauri
Thesauri and Keywords
Thesauri (or thesauruses) are indexed, predefined collections of keywords. Using a specific list of keywords to describe the characteristics of a dataset makes the dataset easier to manage and, more importantly, easier to retrieve. When several similar words are used to define the same thematic characteristic, searches by theme return non-matching results: data often is not retrieved because the words describing the theme in the dataset and the word used for the search don’t match, even if their meaning is extremely similar. Thesauri contain a limited number of words that uniquely describe specific themes.
A useful comparison is the problem people have when looking up a specific business in the yellow pages. The search can be frustrating because several similar words can describe the same business, and at first it is unclear which one to use. For this reason the yellow pages often open with a list of possible words for businesses, each pointing to one of a few keywords under which all the businesses can be found. The yellow pages are using a specific thesaurus of keywords, and users can search more easily if they have the list of keywords of that thesaurus at hand.
By using keywords from a thesaurus and letting users know which thesaurus has been used, there is a higher chance that data can be retrieved. Thus, ideally, the thematic and geographic content of geospatial data sets should be consistently described using keywords from a thesaurus. The CSDGM “encourages” metadata creators to use theme keywords derived from a formally registered thesaurus or a similar authoritative source of theme keywords. If no thesaurus is used, a “none” value can be entered.
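The idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the mini-thesaurus, the function names, and the thesaurus name "Example Thesaurus" are hypothetical, and the element names `theme`, `themekt` and `themekey` are the CSDGM short names for the theme keyword section. Free-text terms are first mapped to one preferred keyword each, much like the index at the front of the yellow pages, and then written out together with the name of the thesaurus they came from.

```python
import xml.etree.ElementTree as ET

# Hypothetical mini-thesaurus: every entry term points to one preferred keyword.
ENTRY_TERMS = {
    "woods": "forests",
    "woodland": "forests",
    "forests": "forests",
    "streams": "rivers",
    "creeks": "rivers",
    "rivers": "rivers",
}


def to_preferred(term):
    """Return the preferred keyword for a free-text term, or None if unknown."""
    return ENTRY_TERMS.get(term.strip().lower())


def theme_element(terms, thesaurus_name):
    """Build a <theme> element holding the thesaurus name and its keywords."""
    theme = ET.Element("theme")
    ET.SubElement(theme, "themekt").text = thesaurus_name
    seen = []
    for term in terms:
        keyword = to_preferred(term)
        # Skip unknown terms and avoid writing the same keyword twice.
        if keyword is not None and keyword not in seen:
            seen.append(keyword)
            ET.SubElement(theme, "themekey").text = keyword
    return theme


theme = theme_element(["Woodland", "creeks", "woods"], "Example Thesaurus")
xml_text = ET.tostring(theme, encoding="unicode")
print(xml_text)
```

Three differently worded inputs collapse to two keywords, so a search for "forests" finds the record whichever synonym the author originally had in mind.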
Geographic Thesauri Links
Getty Thesaurus of Geographic Names
Excellent tool for collecting several geographic keywords for a given location at different scale levels (e.g. populated place, district, province, country, continent). The Getty Thesaurus of Geographic Names (TGN) is a structured vocabulary currently containing around 1,102,000 names of populated areas, political regions, infrastructure, and hydrographic, hypsographic and vegetation sites around the globe. Linked to each place record are names in English and other languages, geographic coordinates, notes, sources for the data, and place types. The TGN also describes the broader context of the places represented, as part/whole relationships within a hierarchy.
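As a rough illustration of how a part/whole hierarchy yields keywords at several scale levels, the sketch below walks from a place up through its broader places. The records are hand-written samples, not data retrieved from the TGN, and the dictionary layout and function name are assumptions made for this example.

```python
# Hand-written sample records, not real TGN data:
# place name -> (place type, broader place or None at the top)
PLACES = {
    "Firenze": ("populated place", "Toscana"),
    "Toscana": ("region", "Italia"),
    "Italia": ("country", "Europe"),
    "Europe": ("continent", None),
}


def place_keywords(name, places=PLACES):
    """Follow the broader-place links and return (name, type) pairs, narrowest first."""
    chain = []
    visited = set()
    while name is not None:
        if name in visited:
            raise ValueError(f"cycle in hierarchy at {name!r}")
        visited.add(name)
        place_type, broader = places[name]
        chain.append((name, place_type))
        name = broader
    return chain


for place, place_type in place_keywords("Firenze"):
    print(f"{place} ({place_type})")
```

Each pair in the result is a candidate place keyword, so a single lookup documents the location from the local to the continental level.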
GEOnet Names Server
Very extensive collection of geographic keywords. The GEOnet Names Server (GNS) provides access to the National Geospatial-Intelligence Agency's (NGA) and the U.S. Board on Geographic Names' (US BGN) database of foreign geographic feature names. The database includes 4,000,000 locations around the globe, covering populated areas, political regions, infrastructure, and hydrographic, hypsographic and vegetation sites, each with its geographic location.
Thematic Thesauri Links
GCMD - Science Keywords and Associated Directory Keywords
The GCMD Earth Science Keywords were developed and are maintained by NASA to meet the needs of the Earth science community. These keywords are organized in a hierarchy constructed of TOPICs, TERMs, and VARIABLEs. This is one of the most commonly used thesauri for documenting GIS resources related to Earth science.
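A minimal sketch of working with such a three-level hierarchy follows. It assumes keywords are written as a single string with levels separated by ">", which is a convention chosen for this example; the sample values are illustrative and are not quoted from the official keyword list.

```python
from typing import NamedTuple


class ScienceKeyword(NamedTuple):
    topic: str
    term: str
    variable: str


def parse_keyword(path, sep=">"):
    """Split 'TOPIC > TERM > VARIABLE' into its three levels."""
    parts = [part.strip() for part in path.split(sep)]
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"expected TOPIC {sep} TERM {sep} VARIABLE, got {path!r}")
    return ScienceKeyword(*parts)


def group_by_topic(paths):
    """Group parsed keywords under their TOPIC so records can be browsed top-down."""
    grouped = {}
    for path in paths:
        keyword = parse_keyword(path)
        grouped.setdefault(keyword.topic, []).append(keyword)
    return grouped


# Illustrative values only.
sample = [
    "ATMOSPHERE > PRECIPITATION > RAIN",
    "ATMOSPHERE > CLOUDS > CLOUD TYPES",
    "LAND SURFACE > SOILS > SOIL MOISTURE",
]
for topic, keywords in group_by_topic(sample).items():
    print(topic, [kw.variable for kw in keywords])
```

Keeping the levels separate lets a search match at the TOPIC or TERM level even when the exact VARIABLE differs.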
CIESIN - Indexing Vocabulary
The CIESIN Indexing Vocabulary was developed to index data resources and data sets related to human interactions in global change. Metadata records containing CIESIN Indexing Terms appear in the SEDAC Gateway and the Global Change Master Directory (GCMD). CIESIN Indexing Terms consist of a controlled thesaurus of socioeconomic and environmental parameters or indicators arranged according to nine Science Data Domains or "Topics".
CAB - The Thesaurus for the Applied Life Sciences
For indexing animal health and agricultural Internet sites.
CERES/NBII - Thesaurus Partnership Project
CERES and National Biological Information Infrastructure (NBII) Biological Resources Division (BRD) are collaborating on the development of an Integrated Environmental Thesaurus and Thesaurus Networking Tool Set for Metadata Development and Keyword Searching.
Government of Canada - Core Subject Thesaurus
The Government of Canada (GoC) Core Subject Thesaurus Web site has been developed to help content managers, librarians and metadata developers select controlled subject terms for metadata requirements. The GoC Core Subject Thesaurus (CST) is to be used as a source of standardized terminology for the indexing and retrieval of GoC information resources in various formats. Its main function is to standardize the external form and meaning of index terms, thus ensuring that a particular concept or subject will always be represented in the same way in GoC metadata resources.
NASA Thesaurus
The NASA Thesaurus contains the authorized subject terms by which the documents in the NASA Aeronautics and Space Database are indexed and retrieved. The scope of this controlled vocabulary includes not only aerospace engineering, but all supporting areas of engineering and physics, the natural space sciences (astronomy, astrophysics, and planetary science), Earth science, and to some extent, the biological sciences.
Anderson Land Cover Classification System
The Anderson Land Cover classification system was developed to meet the need for an up-to-date overview of land use and land cover throughout the US on a basis that is uniform in categorization at the more generalized first and second levels and that will be receptive to data from satellite and aircraft remote sensors. The proposed system uses the features of existing widely used classification systems that are amenable to data derived from remote sensing sources. It is intentionally left open-ended so that Federal, regional, State, and local agencies can have flexibility in developing more detailed land use classifications at the third and fourth levels.
NLCD Land Cover Class Definitions
The classification system used for NLCD is modified from the Anderson land-use and land-cover classification system. Many of the Anderson classes, especially the Level III classes, are best derived using aerial photography. It is not appropriate to attempt to derive some of these classes using Landsat TM data due to issues of spatial resolution and interpretability of data. Thus, no attempt was made to derive classes that were extremely difficult or “impractical” to obtain using Landsat TM data, such as the Level III urban classes.
Cowardin U.S. Fish and Wildlife Service Wetland Classification System
In 1979, a comprehensive classification system of wetlands and deepwater habitats was developed for the U.S. Fish and Wildlife Service.
NBII Systematics
The NBII serves as a gateway to systematics resources, selecting, annotating, and organizing them according to topic and discipline for ease of discovery and access by NBII users.
Dictionary of Geologic Terms
National Soil Survey Center (NSSC) Soil Taxonomy
Viikki Science Library Thesaurus of agriculture, forestry, food, nutrition, household sciences, consumer research, rural development and environmental sciences
A tool for selecting terms for information retrieval and indexing documents related to agriculture, forestry, food, nutrition, household sciences, consumer research, rural development and environmental sciences.
FAO - Agrovoc database
The AGROVOC Thesaurus was developed by FAO and the Commission of the European Communities in the early 1980s and has been used by the AGRIS and CARIS information systems of FAO for indexing and retrieval since 1986. It is a multilingual structured and controlled vocabulary designed to cover the terminology of all subject fields of agriculture, forestry, fisheries, food and related domains. The last edition of AGROVOC (Third Edition, Version 1997) contains over 46,000 terms (keywords), of which 16,105 are base descriptors (English), 9,480 English synonyms, 8,693 French synonyms and 12,086 Spanish synonyms. Supplements to AGROVOC are produced yearly.
FAO - ASFA (Aquatic Sciences and Fisheries Abstracts) Thesaurus
Aquatic Sciences and Fisheries Abstracts (ASFA) is an International Cooperative Information System which comprises an abstracting and indexing service covering the world's literature on the science, technology, management, and conservation of marine, brackish water, and freshwater resources and environments, including their socio-economic and legal aspects. The ASFA bibliographic database is the principal output of the system and contains over 900,000 references, with coverage since 1971.
Environmental Protection Agency: Terms of Environment
"Terms Of Environment" defines in non-technical language the more commonly used environmental terms appearing in publications, news releases, and other documents of the US Environmental Protection Agency (EPA).
On-Line Glossary of Technical Terms in Plant Pathology
UNESCO Thesaurus
The UNESCO Thesaurus is a controlled vocabulary developed by the United Nations Educational, Scientific and Cultural Organisation which includes subject terms for the following areas of knowledge: education, science, culture, social and human sciences, information and communication, and politics, law and economics. It also includes the names of countries and groupings of countries: political, economic, geographic, ethnic and religious, and linguistic groupings.