CSDGM - The FGDC standard

CSDGM standard - How was the standard developed?

The Federal Geographic Data Committee (FGDC) approved the first version of its content standard for metadata, the Content Standard for Digital Geospatial Metadata (CSDGM), in 1994. Executive Order 12906, "Coordinating Geographic Data Acquisition and Access: The National Spatial Data Infrastructure," signed by President Clinton in 1994, required all Federal agencies to use the CSDGM to document the data they produce, beginning in January 1995. In June 1998, the FGDC endorsed the second version of the standard (FGDC-STD-001-1998). At present, all US federal agencies that produce geospatial data are required to use the CSDGM. The standard is complex, but it provides a common framework on which agencies can build detailed metadata. State and local agencies have been encouraged to adopt it to help support the National Spatial Data Infrastructure (NSDI). Metadata that follow the standard are machine-readable, so they can be searched and parsed on distributed NSDI Clearinghouses.

The standard gives users a consistent way to determine:

  • What data are available
  • Whether the data meet their specific needs
  • Where to find the data
  • How to access the data

The standard is also gradually being adopted by governmental, non-profit, and commercial participants worldwide, who can make their collections of spatial information searchable and accessible on the Internet using free reference implementation software developed by the FGDC.

For more information about FGDC metadata standards, visit the FGDC web page.


Metadata Sections

The Content Standard for Digital Geospatial Metadata (CSDGM) is composed of several numbered chapters, called sections. The sections are organized hierarchically: the highest-level section, named Metadata and numbered 0, is the entry point for the content of the metadata. Beneath it are ten numbered sections, each representing a different concept about the dataset that the metadata describe. The first seven are the main sections, and the remaining three are the supporting sections. Each section describes a geographic dataset as a hierarchy of compound elements, data elements, and information about the values of the data elements. The CSDGM defines data elements as 'logically primitive items of data'; a data element is intended to be the smallest unit that describes a particular characteristic of a spatial dataset. Compound elements contain groups of data elements, and possibly other compound elements, and are used to organize the elements within each section. The sections establish the name and definition of each data and compound element, give information about the values each data element can take, and state which elements are mandatory and which can be repeated.

Each section is organized in three parts: the section definition, the production rules and the component elements list.

  • The section definition gives the name and definition of the compound element that identifies the overall section.
  • The production rules concisely specify the relationships among compound elements, and between compound elements and data elements, using a specific syntax developed by Yourdon. Production rules also indicate whether elements are mandatory and/or repeatable. More detailed information on production rules can be found in Appendix D (Production Rules and Symbols) of the FGDC CSDGM Workbook.
  • The component elements list provides the names, definitions and the information about the allowed types of values that characterize each element of the section.

Main Sections

The first seven sections are the “Primary” metadata sections and are considered the main sections of the metadata. Of these seven, the first (Identification Information) and the seventh (Metadata Reference) are always mandatory and are referred to as the core elements of the metadata. Below is a list of the seven main sections, with a short description of the information each contains.

Section 1 - Identification Information

This section describes basic information about the data. What is the name of the data? What is the purpose of the data? Who developed the data? When was the data produced? What geographic area does it cover? What are the themes associated with the data? Are there any plans to update the data in the future? Are there restrictions on accessing or using the data? What browse graphic files, if any, illustrate the data? What is the native dataset environment?

Section 2 - Data Quality

This section describes the quality of the data. What is the positional and attribute accuracy of the data? What events, parameters and sources were used to build the data? What process steps were used to build the data? Are the data complete and logically consistent? Is information available that allows users to decide whether the data are suitable for their purposes?

Section 3 - Spatial Data Organization

This section describes how spatial information is organized in the data. Are the data in vector or raster format? How many vector or raster object types do the data contain? Are indirect spatial references (indications other than geographic coordinates, such as a city, a country name or a street address) available to encode locations?

Section 4 - Spatial Reference

This section describes the systems used to encode the information as a location on the earth. What reference or projection system is used to encode coordinates in the data set? What ellipsoid model and what horizontal or vertical datum are used? What is the latitude and longitude resolution of the data? What parameters should be used to convert to another coordinate system?

Section 5 - Entity and Attribute Information

This section describes the entity, attribute and attribute value information associated with the data. What features or entities are included in the data, and how many? What attributes does each entity have? What is the definition and domain of each attribute? What is the accuracy of the attribute values? Are attribute values encoded? What do the codes mean?

Section 6 - Distribution

This section provides information about the distributor and how to obtain the data. Who is the main contact for data distribution? What formats and media are available for distribution? What liability is assumed by the distributor? What is the price of the data? What steps are required to obtain the data? Are the data available online, and where?

Section 7 - Metadata Reference

This section provides information on the metadata itself and the party responsible for it. Who compiled the metadata, and when? How can the metadata creator be contacted? What is the name and version of the metadata standard? What restrictions apply to handling the metadata?


Supporting sections

The last three sections are the minor, or supporting, sections of the CSDGM. The information in these sections supports the seven main sections and never stands alone. They define citation, temporal and contact information, much like supporting tables in a relational database: because the main sections refer to this information many times, it is defined once, separately, for convenience. Below are the three supporting sections, with a short description of their content.

Section 8 - Citation Information

This section provides the reference to be used when citing the data. The information is very similar to that found in book citations. Who is the author of the data? When was the data published? What is the official title of the data? What is the version and the publication form? What are the publication details for the data?

Section 9 - Time Period

This section identifies the date and time for events of interest for the data. What is the single year, month, day and time of the day for one or more events? If the event occurred during a range of time, what is the beginning and ending year, month, day and time of the day for one or more events?

Section 10 - Contact Information

This section provides the details needed to contact the person or organization that can supply additional information about the data and its distribution. Who is the person or organization to contact? Where are they located? What are their physical and e-mail addresses? How and when can the person or organization be contacted?


Data types

The CSDGM standard is a framework built upon two different types of elements: compound elements and data elements.

  • A data element is the simplest entity of metadata, a unit of description for any given characteristic of the data.
  • A compound element is a complex entity made up of other compound elements and data elements.

A useful analogy for this concept is the hierarchical system of folders and files. Like compound elements, folders contain other folders and files. Files themselves, like data elements, cannot be subdivided into smaller units and are the sole holders of specific raw data.
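The folder and file analogy can be sketched in a few lines of Python. This is only an illustration of the hierarchy, not part of the standard: the class names and the coordinate values are hypothetical, and the Bounding Coordinates element with the short names "bounding" and "westbc" is an assumption that does not appear in this text.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class DataElement:
    """A leaf: holds one value and cannot be subdivided (like a file)."""
    name: str
    short_name: str
    value: str


@dataclass
class CompoundElement:
    """A container of data elements and other compound elements (like a folder)."""
    name: str
    short_name: str
    children: List[Union["CompoundElement", DataElement]] = field(default_factory=list)


def count_data_elements(element) -> int:
    """Count the data elements beneath an element by walking the hierarchy."""
    if isinstance(element, DataElement):
        return 1
    return sum(count_data_elements(child) for child in element.children)


# "Bounding Coordinates", "bounding" and "westbc" are assumed names;
# the coordinate values are made up for illustration.
spatial_domain = CompoundElement(
    "Spatial Domain", "spdom",
    [
        CompoundElement(
            "Bounding Coordinates", "bounding",
            [
                DataElement("North Bounding Coordinate", "northbc", "9.9"),
                DataElement("West Bounding Coordinate", "westbc", "79.6"),
            ],
        )
    ],
)

print(count_data_elements(spatial_domain))
```

Only the data elements carry values; the compound elements exist purely to group them.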

The outline of a compound element includes a name, a definition and a short name. Compound elements have a single data type, “compound”.

The form of a data element includes the name and definition of the data element, the data type that describes it, the domain of valid values, and the short name. Five data types, describing the kind of value to be provided, can be associated with data elements:

  • Integer for integer numbers
  • Real for real numbers
  • Text for ASCII characters
  • Date for day of the year
  • Time for time of the day

Date and time values must conform to the ANSI standard for dates and times. For more information on the value formats used to describe calendar dates, times of day, latitude/longitude values, network addresses and file names, consult Appendix B (Data Element Forms for Special Values) of the FGDC CSDGM Workbook.

Every compound or data element is assigned a unique short name of eight or fewer characters. Short names function as element tag names in XML; their role is to map the metadata efficiently and to assist users implementing the standard.
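As a rough illustration of how short names can serve as XML tags, the sketch below builds a small fragment with Python's standard xml.etree.ElementTree module. The short names "abstract" and "purpose" come from the examples in this text; the parent element Description with the short name "descript" is an assumption, the helper name to_element is hypothetical, and the element values are invented.

```python
import xml.etree.ElementTree as ET

# Long name -> short name. "descript" is an assumed short name;
# "abstract" and "purpose" appear in the examples in this text.
SHORT_NAMES = {
    "Description": "descript",
    "Abstract": "abstract",
    "Purpose": "purpose",
}


def to_element(name, content):
    """Build an XML element tagged with the short name of `name`.

    `content` is either a string (data element) or a list of
    (name, content) pairs (compound element).
    """
    element = ET.Element(SHORT_NAMES[name])
    if isinstance(content, str):
        element.text = content
    else:
        for child_name, child_content in content:
            element.append(to_element(child_name, child_content))
    return element


description = to_element(
    "Description",
    [("Abstract", "Road centerlines."), ("Purpose", "Base mapping.")],
)
xml_text = ET.tostring(description, encoding="unicode")
print(xml_text)
```

Keeping the long names in a lookup table lets the readable element names stay in the source data while only the short names reach the XML.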

The form for the definition of a compound element is:

Compound element name -- definition.

Type: compound

Short Name: unique code to represent the specific compound element

Examples:

Source Citation -- reference for a source data set.

Type: compound

Short Name: srccite

Spatial Domain -- the geographic areal domain of the data set.

Type: compound

Short Name: spdom

The form for the definition of a data element is:

Data element name -- definition.

Type: (choice of "integer", "real", "text", "date", or "time")

Domain: (describes valid values that can be assigned)

Short Name: unique code to represent the specific data element

Examples:

Purpose -- a summary of the intentions with which the data set was developed.

Type: text

Domain: free text

Short Name: purpose

North Bounding Coordinate -- northern-most coordinate of the limit of coverage expressed in latitude.

Type: real

Domain: -90.0 <= North Bounding Coordinate <= 90.0;

North Bounding Coordinate >= South Bounding Coordinate

Short Name: northbc

Process Date -- the date when the event was completed.

Type: date

Domain: "Unknown" "Not complete" free date

Short Name: procdate


Domain values

Each data element requires a description of the domain of values for its data type. The role of the domain is to specify either a list of valid values or a range of values that can be assigned to the data element. The domain of a data element can be specified in one of the following three ways:

  • Specified only by type. In this case the domain consists of the word “free” followed by the type of the data element, such as free text or free integer. Any valid value of that data type can be assigned to the data element. For example:

Abstract -- a brief narrative summary of the data set.

Type: text

Domain: free text

Short Name: abstract

  • Specified by a list of values, references to a list of values or a range of values. The value assigned to the data element must be selected from the listed values or fall within the given range. For example:

Progress -- the state of the data set.

Type: text

Domain: "Complete" "In work" "Planned"

West Bounding Coordinate -- western-most coordinate of the limit of coverage expressed in longitude.

Type: real

Domain: -180.0 <= West Bounding Coordinate < 180.0

  • Partially specified by a set of values, followed by the “free” convention. In this case, the values can be selected from a reported list or range of values, but if the reported values are not adequate a free value can be used. For example:

Process Date -- the date when the event was completed.

Type: date

Domain: "Unknown" "Not complete" free date
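The three kinds of domain can be expressed as simple validation functions. The sketch below is a hedged illustration rather than an implementation of the standard: the function names are hypothetical, and the assumption that a free date is written as eight digits (YYYYMMDD) is not stated in this text and should be checked against Appendix B of the Workbook.

```python
from datetime import datetime


def is_free_date(value):
    """Assumption: a free date is an eight-digit YYYYMMDD string."""
    if not (isinstance(value, str) and len(value) == 8 and value.isdigit()):
        return False
    try:
        datetime.strptime(value, "%Y%m%d")
    except ValueError:
        return False
    return True


def check_abstract(value):
    """Domain specified only by type: free text."""
    return isinstance(value, str)


def check_progress(value):
    """Domain specified by a list of values."""
    return value in ("Complete", "In work", "Planned")


def check_west_bounding(value):
    """Domain specified by a range: -180.0 <= value < 180.0."""
    return -180.0 <= value < 180.0


def check_process_date(value):
    """Domain partially specified: listed values, otherwise a free date."""
    return value in ("Unknown", "Not complete") or is_free_date(value)


print(check_progress("In work"), check_west_bounding(180.0))
```

Note that the upper bound of the West Bounding Coordinate range is exclusive, so 180.0 is rejected while -180.0 is accepted.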


What is mandatory?

The main role of the CSDGM is to dictate what information should be present in the metadata, and in which format. The standard does not provide guidelines on how geospatial data should be maintained; it specifies which information is mandatory. The most critical step in implementing the standard is therefore to understand which information is required and to ensure that these values are always recorded. The standard defines every metadata element as mandatory, mandatory if applicable, or optional.

Mandatory

All the mandatory elements must be provided. Even if the information is not known for a specific element, the value “Unknown” or a similar value must be recorded for the element. In the graphical representation of the CSDGM, mandatory data elements are represented as non-shaded or yellow boxes.

Optional

Metadata creators can provide optional elements at their discretion. In the graphical representation of the CSDGM, optional data elements are represented as darkly shaded or light-blue boxes. An example of an optional data element is element 10.8, Contact Electronic Mail Address. The metadata producer may or may not include the e-mail address in the metadata. Does the contact person have an e-mail address? Do they wish to provide it? The e-mail address can be provided, but it is not required.

Mandatory if applicable

Some elements are mandatory or not depending on the type of dataset being described. If a dataset shows the characteristic that a mandatory if applicable element describes, that element must be provided. In the graphical representation of the CSDGM, mandatory if applicable data elements are represented as lightly shaded or green boxes. For example, element 4.2, Vertical Coordinate System Definition, is a mandatory if applicable element. If a dataset has a vertical component, such as a Digital Elevation Model, the element is applicable and the relevant information must be provided. If a dataset does not have a vertical component, such as a land use classification, the element is not applicable and its values are not provided. It is important to understand that optional and mandatory if applicable elements are different. Mandatory if applicable elements have often been treated as optional, which distorts their real meaning.
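The difference between the three obligation levels can be made concrete with a small check. The following is a minimal sketch under stated assumptions: the names Obligation, RULES and missing_elements are hypothetical, the rule table covers only three elements, and treating Abstract as mandatory is an assumption not stated in this text.

```python
from enum import Enum


class Obligation(Enum):
    MANDATORY = "mandatory"
    MANDATORY_IF_APPLICABLE = "mandatory if applicable"
    OPTIONAL = "optional"


# Illustrative rule table; "Abstract" being mandatory is an assumption.
RULES = {
    "Abstract": Obligation.MANDATORY,
    "Vertical Coordinate System Definition": Obligation.MANDATORY_IF_APPLICABLE,
    "Contact Electronic Mail Address": Obligation.OPTIONAL,
}


def missing_elements(rules, record, applicable):
    """Return the names of elements that must be present but are absent.

    `applicable` is the set of mandatory-if-applicable elements that
    apply to this particular dataset.
    """
    missing = []
    for name, obligation in rules.items():
        required = obligation is Obligation.MANDATORY or (
            obligation is Obligation.MANDATORY_IF_APPLICABLE and name in applicable
        )
        if required and not record.get(name):
            missing.append(name)
    return missing


# A Digital Elevation Model has a vertical component, so 4.2 applies.
dem_record = {"Abstract": "Elevation grid."}
dem_missing = missing_elements(
    RULES, dem_record, applicable={"Vertical Coordinate System Definition"}
)

# A land use classification has no vertical component.
land_use_record = {"Abstract": "Land use classes."}
land_use_missing = missing_elements(RULES, land_use_record, applicable=set())

print(dem_missing, land_use_missing)
```

The optional e-mail address never appears in the result, whereas the vertical coordinate system is reported only for the dataset to which it applies.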


Repetition of the elements

In the CSDGM standard, several elements can be repeated when more than one entry applies to the same element. The standard defines both which elements are repeatable and the maximum number of times each can recur. Where repetition is allowed, the CSDGM graphical representation shows a label below the element indicating how many times it can be repeated. If there is no label below the element, the element cannot be repeated. Whenever a compound element is repeated, all of its mandatory child elements must be repeated as well.

For example, the compound element 10.4, Contact Address, can be repeated an unlimited number of times. In the example below, Contact Address is repeated twice, and so are its child elements.

Metadata_Contact:
  Contact_Information:
    Contact_Organization_Primary:
      Contact_Organization: IWMI (International Water Management Institute)
      Contact_Person: GIS Laboratory Manager
    Contact_Position: GIS Laboratory Manager
    Contact_Address:
      Address_Type: physical address
      Address: 127, Sunil Mawatha, Pelawatte,
      City: Battaramulla
      State_or_Province: Western Province
      Country: Sri Lanka
    Contact_Address:
      Address_Type: mailing and physical address
      Address: P. O. Box 2075
      City: Colombo
      State_or_Province: Western Province
      Postal_Code: None
      Country: Sri Lanka
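The rule that every repetition of a compound element must carry its mandatory children can be checked mechanically. The sketch below represents each Contact_Address repetition as a dictionary. The function name is hypothetical, and the set of mandatory children used here is an assumption chosen for illustration; the standard itself should be consulted for the actual obligations.

```python
# Assumed for illustration only; check the standard for the real list.
ASSUMED_MANDATORY_CHILDREN = ("Address_Type", "City", "State_or_Province")

# The two Contact_Address repetitions from the example above.
contact_addresses = [
    {
        "Address_Type": "physical address",
        "Address": "127, Sunil Mawatha, Pelawatte,",
        "City": "Battaramulla",
        "State_or_Province": "Western Province",
        "Country": "Sri Lanka",
    },
    {
        "Address_Type": "mailing and physical address",
        "Address": "P. O. Box 2075",
        "City": "Colombo",
        "State_or_Province": "Western Province",
        "Postal_Code": "None",
        "Country": "Sri Lanka",
    },
]


def incomplete_repetitions(repetitions, mandatory_children):
    """Return (index, missing names) for each repetition lacking a mandatory child."""
    problems = []
    for index, repetition in enumerate(repetitions):
        missing = [name for name in mandatory_children if name not in repetition]
        if missing:
            problems.append((index, missing))
    return problems


print(incomplete_repetitions(contact_addresses, ASSUMED_MANDATORY_CHILDREN))
```

A repetition that supplied only an Address_Type would be reported with its index and the names of the missing children.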

Suggestions and Comments: csi@cgiar.org
© 2004. CGIAR - Consortium for Spatial Information (CGIAR-CSI)