CSDGM - The FGDC standard
CSDGM standard - How was the standard developed?
The Federal Geographic Data Committee (FGDC) approved the first
version of a content standard for metadata, the Content Standard
for Digital Geospatial Metadata (CSDGM), in 1994. Executive Order 12906,
"Coordinating Geographic Data Acquisition and Access: The National
Spatial Data Infrastructure," signed by President Clinton in 1994,
requires all Federal agencies to use the CSDGM to document the data
they produce, beginning in January 1995. In June 1998, the FGDC
endorsed the second version of the standard (FGDC-STD-001-1998).
At present, all US federal agencies that produce geospatial data
are required to use the CSDGM. The standard is complex, but it
provides a common framework on which agencies can build detailed
metadata. State and local agencies have been encouraged to adopt
it to help support the National Spatial Data Infrastructure (NSDI).
Metadata that follow the content standard are machine-readable,
so they can be searched and parsed on distributed NSDI Clearinghouses.
The standard provides a consistent way for users to know:
- What data are available
- Whether the data meet their specific needs
- Where to find the data
- How to access the data
These standards are also slowly being adopted by governmental,
non-profit, and commercial participants worldwide, who can make
their collections of spatial information searchable and accessible
on the Internet using free reference implementation software developed
by the FGDC.
For more information about FGDC metadata standards, visit the FGDC
web page.
Metadata Sections
The Content Standard for Digital Geospatial Metadata (CSDGM) is composed
of several numbered chapters, called sections. The sections are
organized hierarchically: the highest-level section, defined as
metadata and numbered 0, is the entry point for the content of the
metadata. Underneath the metadata section there are ten numbered
sections, each representing a different concept about the dataset
that the metadata describes. The first seven are the main sections,
and the remaining three are the supporting sections. The sections
describe a geographic dataset as a hierarchy of compound elements,
data elements and information about the values of the data elements.
The CSDGM defines data elements as 'logically primitive items of data';
each is intended as the smallest unit for describing a particular
characteristic of a spatial dataset. Compound elements contain groups
of data elements and, possibly, other compound elements, and are used
to organize the grouping of elements in each section. The sections
establish the name and definition of each data and compound element,
provide information about the values of data elements, and indicate
which elements are mandatory and which can be repeated.
Each section is organized in three parts: the section definition,
the production rules and the component elements list.
- The section definition gives the name and definition of
the compound element that identifies the overall section.
- The production rules concisely specify the relationships
between compound elements, and between compound elements and
data elements, using a specific syntax developed by Yourdon.
Production rules also indicate whether elements are mandatory
and/or repeatable. More detailed information on production rules
can be found in Appendix D (Production Rules and Symbols) of the
FGDC CSDGM Workbook.
- The component elements list provides the name, the definition
and information about the allowed types of values for each element
of the section.
Main Sections
The first seven sections are the “Primary” metadata
sections and are considered the main sections of the metadata.
Of these seven sections, the first (Identification Information)
and the seventh (Metadata Reference) are always mandatory and are
referred to as the core elements of the metadata.
Below is a list of the seven main sections with a short description
of the information they contain.
Section 1 - Identification Information
This section describes basic information about the data. What is
the name of the data? What is the purpose of the data? Who developed
the data? When was the data developed? What geographic area
does it cover? What are the themes associated with the data?
When was the data produced? Are there any plans to update the
data in the future? Are there restrictions on accessing or using
the data? Which browse graphic files, if any, illustrate the data?
What is the native dataset environment?
Section 2 - Data Quality
This section describes the quality of the data. What is the positional
and attribute accuracy of the data? What events, parameters
and sources were used to build the data? What process steps
were used to build the data? Are the data complete? What is
the logical consistency and completeness of the data? Is information
available that allows users to decide whether the data are suitable
for their purposes?
Section 3 - Spatial Data Organization
This section describes how spatial information is organized in the
data. Are the data in vector or raster format? How many vector or
raster object types do the data contain? Are indirect spatial
references (indications other than geographic coordinates, such as
city, country name, street address) available to encode locations?
Section 4 - Spatial Reference
This section describes the systems used to encode the information
as an earth location. What reference or projection system is used
to encode coordinates in the data set? What ellipsoid model and
horizontal or vertical datum are used? What is the latitude
and longitude resolution of the data? What parameters should
be used to convert to another coordinate system?
Section 5 - Entity and Attribute Information
This section describes the entity, attribute and attribute value
information associated with the data. Which features or entities
are included in the data, and how many? What attributes does each
entity have? What is the definition and domain of each attribute?
What is the accuracy of the attribute values? Are attribute
values encoded? What do these codes mean?
Section 6 - Distribution
This section provides information about the distributor and how to
obtain the data. Who is the main contact for data distribution?
What formats and media are available for distribution? What
is the liability assumed by the distributor? What is the price
of the data? What are the required steps to obtain the data?
Are the data available online, and where?
Section 7 - Metadata Reference
This section provides information on the metadata itself and the
responsible party. Who compiled the metadata, and when? What is the
contact information for the metadata creator? What is the metadata
standard name and version? What are the restrictions on handling the
metadata?
Supporting sections
The last three sections are the minor or supporting sections of the
CSDGM content standard. The information in these sections is
used as support for the seven main sections and never stands alone.
These sections define citation, temporal and contact information.
Like supporting tables in a relational database, they can be referenced
multiple times from the main sections and are therefore defined
separately, once, for convenience. Below are the three supporting
sections and a short description of their content.
Section 8 - Citation Information
This section provides information about the reference to be used for
the data. The information in this section is very similar to
the information found in book citations. Who is the data author?
When was the data published? What is the official title of the
data? What is the version and the publication form? What are
the publication details for the data?
Section 9 - Time Period
This section identifies the date and time of events of interest
for the data. What is the single year, month, day and time of
day for one or more events? If an event occurred over
a range of time, what are the beginning and ending year, month,
day and time of day?
Section 10 - Contact Information
This section provides the details needed to contact the person or
organization that can supply additional information about the data
and its distribution. Who is the person or organization to contact?
Where are they located? What are their physical and e-mail addresses?
How and when should the person or organization be contacted?
Data types
The CSDGM standard is a framework built upon two different types of
elements: compound elements and data elements.
- A data element is the simplest entity of metadata, a unit of
description for a given characteristic of the data.
- A compound element is a complex entity made up of other compound
elements and data elements.
A useful analogy for understanding this concept is the
hierarchical system of folders and files. Like compound elements,
folders contain other folders and files. Files themselves, like data
elements, cannot be subdivided into smaller units and are the only
holders of specific raw data.
The definition of a compound element includes a name, a definition
and a short name. A single type, “compound”, is available for
compound elements.
The definition of a data element includes the name and definition of
the data element, the data type used to describe it, the domain of
the data type and the data element short name. There are five data
types, describing the kind of value to be provided, that can be
associated with data elements:
- Integer for integer numbers
- Real for real numbers
- Text for ASCII characters
- Date for day of the year
- Time for time of the day
The date and time values must conform to the ANSI standard for date
and time. For more information on the specific value formats used to
describe calendar dates, time of day, latitude/longitude values,
network addresses and file names, consult Appendix B (Data Element
Forms for Special Values) of the FGDC CSDGM Workbook.
Every compound or data element is assigned a unique short name of
eight or fewer characters. Short names function as element tag names
in XML. Their role is to map the metadata efficiently and so
assist users in implementing the standard.
The form for the definition of a compound element is:
Compound element name -- definition.
Type: compound
Short Name: unique code to represent the specific compound element
Examples:
Source Citation -- reference for a source data set.
Type: compound
Short Name: srccite
Spatial Domain -- the geographic areal domain of the data set.
Type: compound
Short Name: spdom
The form for the definition of a data element is:
Data element name -- definition.
Type: (choice of "integer", "real", "text", "date", or "time")
Domain: (describes valid values that can be assigned)
Short Name: unique code to represent the specific data element
Examples:
Purpose -- a summary of the intentions with which the data set was developed.
Type: text
Domain: free text
Short Name: purpose
North Bounding Coordinate -- northern-most coordinate of the limit
of coverage expressed in latitude.
Type: real
Domain: -90.0 <= North Bounding Coordinate <= 90.0;
North Bounding Coordinate >= South Bounding Coordinate
Short Name: northbc
Process Date -- the date when the event was completed.
Type: date
Domain: "Unknown" "Not complete" free date
Short Name: procdate
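The two definition forms can be mirrored in code. The following Python sketch is only an illustration and not part of the standard: the class names `CompoundElement` and `DataElement` and the helper `check_short_name` are hypothetical, and the domain is kept as the plain text given in the examples. The sketch enforces two rules stated in this section: a data element has one of five types, and a short name has eight or fewer characters.

```python
from dataclasses import dataclass

# The five data types listed for data elements.
DATA_TYPES = {"integer", "real", "text", "date", "time"}


def check_short_name(short_name: str) -> None:
    """Reject short names longer than eight characters (or empty)."""
    if not 1 <= len(short_name) <= 8:
        raise ValueError(f"short name must have 1 to 8 characters: {short_name!r}")


@dataclass(frozen=True)
class CompoundElement:
    """Hypothetical model of the compound element definition form."""
    name: str
    definition: str
    short_name: str
    type: str = "compound"  # the only type available for compound elements

    def __post_init__(self):
        check_short_name(self.short_name)


@dataclass(frozen=True)
class DataElement:
    """Hypothetical model of the data element definition form."""
    name: str
    definition: str
    type: str
    domain: str  # kept as the textual description used in the standard
    short_name: str

    def __post_init__(self):
        if self.type not in DATA_TYPES:
            raise ValueError(f"unknown data type: {self.type!r}")
        check_short_name(self.short_name)


# The examples given in the text above.
srccite = CompoundElement(
    "Source Citation", "reference for a source data set.", "srccite")
purpose = DataElement(
    "Purpose",
    "a summary of the intentions with which the data set was developed.",
    "text", "free text", "purpose")
procdate = DataElement(
    "Process Date", "the date when the event was completed.",
    "date", '"Unknown" "Not complete" free date', "procdate")

ELEMENTS = [srccite, purpose, procdate]
```

Constructing an element with a type outside the five listed, or with a short name longer than eight characters, raises a `ValueError` in this sketch.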
Domain values
The definition of a data element requires a description of the
domain of values for its data type. The domain specifies either
a list of valid values or a range of values that can be assigned
to a data element. The domain of a data element can be specified
in one of the following three ways:
- Specified only by type. This domain consists of the word “free”
followed by the type of the data element, such as free text or
free integer. Any valid value of the data type can be assigned to
the data element. For example:
Abstract -- a brief narrative summary of the data set.
Type: text
Domain: free text
Short Name: abstract
- Specified by a list of values, references to a list of values or a range
of values. The value assigned to the data element has to be selected
from the listed values or the stated range. For example:
Progress -- the state of the data set.
Type: text
Domain: "Complete" "In work" "Planned"
West Bounding Coordinate -- western-most coordinate of the
limit of coverage expressed in longitude.
Type: real
Domain: -180.0 <= West Bounding Coordinate < 180.0
- Partially specified by a set of values, followed by the “free”
convention. In this case, the value can be selected from the listed
values or range, but if none of these is adequate, a free value
can be used. For example:
Process Date -- the date when the event was completed.
Type: date
Domain: "Unknown" "Not complete" free date
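The three kinds of domain can be expressed as small validation functions. The Python sketch below is a hedged illustration, not an official validator: the function names are hypothetical, and the date check assumes that a "free date" is written as YYYYMMDD. That format is an assumption here, so consult Appendix B of the workbook for the authoritative forms.

```python
from datetime import datetime


def is_free_date(value: str) -> bool:
    """True if value is a calendar date written as YYYYMMDD (assumed format)."""
    if len(value) != 8 or not (value.isascii() and value.isdigit()):
        return False
    try:
        datetime.strptime(value, "%Y%m%d")
    except ValueError:
        return False
    return True


def valid_abstract(value) -> bool:
    """Domain specified only by type: free text, so any text value is accepted."""
    return isinstance(value, str)


def valid_progress(value: str) -> bool:
    """Domain specified by a list of values."""
    return value in {"Complete", "In work", "Planned"}


def valid_west_bounding(value: float) -> bool:
    """Domain specified by a range: -180.0 <= value < 180.0."""
    return -180.0 <= value < 180.0


def valid_process_date(value: str) -> bool:
    """Partially specified domain: two listed values, otherwise a free date."""
    return value in {"Unknown", "Not complete"} or is_free_date(value)
```

Note that the upper bound of the West Bounding Coordinate range is exclusive, so 180.0 is rejected while -180.0 is accepted.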
What is mandatory?
The main role of the CSDGM is to dictate what information should be
present in the metadata, and in which format. The standard does not
provide guidelines on how geospatial data are to be maintained;
it specifies which information is mandatory. Therefore, the most
critical step in implementing the standard is understanding which
information is required and ensuring that these values are always
recorded. The standard defines every metadata element as
mandatory, mandatory if applicable, or optional.
• Mandatory
All the mandatory elements must be provided. Even if the information
is not known for a specific element, the value “Unknown”
or a similar value must be recorded for the element. In the
graphical representation of the CSDGM, mandatory data elements
are represented as non-shaded or yellow boxes.
• Optional
Metadata creators can provide optional elements at their discretion.
In the graphical representation of the CSDGM, optional data
elements are represented as darkly shaded or light-blue boxes.
An example of an optional data element is element 10.8, Contact
Electronic Mail Address. The metadata producer may or may not
include the e-mail address in the metadata, at their discretion.
Does the contact person have an e-mail address? Do they wish
to provide it? The e-mail address can be provided, but it is not
necessary.
• Mandatory if applicable
Some elements are mandatory or not depending on the type of
dataset being described. When a dataset has the characteristic
that such an element describes, the element must be provided.
In the graphical representation of the CSDGM, mandatory
if applicable data elements are represented as lightly shaded
or green boxes. As an example, element 4.2, Vertical Coordinate
System Definition, is a mandatory if applicable element. If
a dataset has a vertical component, such as a Digital Elevation
Model, the element is applicable and the related information
needs to be provided. If a dataset does not have a vertical
component, such as a land use classification, the element
is not applicable and its values are not provided.
It is important to understand that optional and mandatory if
applicable elements are different. Mandatory if applicable
elements have often been treated as optional, which distorts
their real meaning.
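The difference between the three requirement levels can be made concrete with a small completeness check. The Python sketch below is only an illustration under stated assumptions: the `Rule` type, the `has_vertical_component` flag and the four-element rule list are hypothetical, and the short names `vertdef` (element 4.2) and `cntemail` (element 10.8) are given from memory and should be verified against the standard.

```python
from typing import Callable, List, NamedTuple


class Rule(NamedTuple):
    """Hypothetical requirement rule for one element, keyed by short name."""
    short_name: str
    requirement: str  # "mandatory", "mandatory_if_applicable" or "optional"
    applies: Callable[[dict], bool] = lambda dataset: True


# A tiny, illustrative subset of elements (not the full standard).
RULES = [
    Rule("abstract", "mandatory"),
    Rule("purpose", "mandatory"),
    # Element 4.2 applies only to datasets with a vertical component.
    Rule("vertdef", "mandatory_if_applicable",
         lambda dataset: dataset.get("has_vertical_component", False)),
    # Element 10.8 may be omitted at the producer's discretion.
    Rule("cntemail", "optional"),
]


def missing_elements(metadata: dict, dataset: dict) -> List[str]:
    """Return short names of required elements that have no recorded value."""
    missing = []
    for rule in RULES:
        if rule.requirement == "optional":
            continue
        if rule.requirement == "mandatory_if_applicable" and not rule.applies(dataset):
            continue
        # A mandatory element may hold "Unknown", but it may not be empty.
        if not metadata.get(rule.short_name):
            missing.append(rule.short_name)
    return missing


dem = {"has_vertical_component": True}        # e.g. a Digital Elevation Model
land_use = {"has_vertical_component": False}  # e.g. a land use classification
record = {"abstract": "Elevation grid", "purpose": "Unknown"}

print(missing_elements(record, dem))       # the vertical definition is required
print(missing_elements(record, land_use))  # nothing is missing
```

Treating `vertdef` as optional would let the first record pass unnoticed, which is exactly the mistake described above.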
Repetition of the elements
In the CSDGM standard, several elements can be repeated when more
than one entry is needed for the same element. The standard defines
both which elements are repeatable and the maximum number of times
that each element can recur. When the CSDGM allows an element to be
repeated, the CSDGM graphical representation shows a label below
the element indicating how many times it can be repeated. If there
is no label below the element, the element cannot be repeated.
Whenever a compound element is repeated, all the mandatory child
elements that compose it must be repeated as well.
As an example, the compound element 10.4, Contact Address, can be
repeated an unlimited number of times. In the example below, the
compound element Contact Address is repeated twice, and so are its
child elements.
Metadata_Contact:
  Contact_Information:
    Contact_Organization_Primary:
      Contact_Organization: IWMI (International Water Management Institute)
      Contact_Person: GIS Laboratory Manager
    Contact_Position: GIS Laboratory Manager
    Contact_Address:
      Address_Type: physical address
      Address: 127, Sunil Mawatha, Pelawatte,
      City: Battaramulla
      State_or_Province: Western Province
      Country: Sri Lanka
    Contact_Address:
      Address_Type: mailing and physical address
      Address: P. O. Box 2075
      City: Colombo
      State_or_Province: Western Province
      Postal_Code: None
      Country: Sri Lanka
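Because short names function as XML tag names, the same contact record can be sketched as XML with a repeated compound element. The Python example below uses the standard library's `xml.etree.ElementTree`. The short names used as tags (`metc`, `cntinfo`, `cntorgp`, `cntorg`, `cntper`, `cntpos`, `cntaddr`, `addrtype`, `address`, `city`, `state`, `postal`, `country`) are given from memory and are assumptions to be checked against the standard. The helper names `add_address` and `build_contact` are hypothetical.

```python
import xml.etree.ElementTree as ET


def add_address(parent, fields):
    """Append one Contact Address (cntaddr) compound element with its children."""
    cntaddr = ET.SubElement(parent, "cntaddr")
    for tag, text in fields:
        ET.SubElement(cntaddr, tag).text = text
    return cntaddr


def build_contact():
    """Build the Metadata Contact shown above, with Contact Address repeated twice."""
    metc = ET.Element("metc")
    cntinfo = ET.SubElement(metc, "cntinfo")
    cntorgp = ET.SubElement(cntinfo, "cntorgp")
    ET.SubElement(cntorgp, "cntorg").text = (
        "IWMI (International Water Management Institute)")
    ET.SubElement(cntorgp, "cntper").text = "GIS Laboratory Manager"
    ET.SubElement(cntinfo, "cntpos").text = "GIS Laboratory Manager"
    # Each repetition carries its own complete set of child elements.
    add_address(cntinfo, [
        ("addrtype", "physical address"),
        ("address", "127, Sunil Mawatha, Pelawatte,"),
        ("city", "Battaramulla"),
        ("state", "Western Province"),
        ("country", "Sri Lanka"),
    ])
    add_address(cntinfo, [
        ("addrtype", "mailing and physical address"),
        ("address", "P. O. Box 2075"),
        ("city", "Colombo"),
        ("state", "Western Province"),
        ("postal", "None"),
        ("country", "Sri Lanka"),
    ])
    return metc


contact = build_contact()
print(ET.tostring(contact, encoding="unicode"))
```

The element values are copied from the example record. Only the tag names and the tree-building code are additions.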