Data Dictionary
Last Modified: 7/19/99 11:59 AM
Contents:
1. Model DTD and tag description.
Note: the model below assumes that dictionary tags
are defined elsewhere. Variables are referred to by name.
<!ELEMENT data-dictionary ((categorical | ordinal | continuous)+)>
<!ELEMENT categorical (category+)>
<!ATTLIST categorical
    name CDATA #REQUIRED
>
<!ELEMENT category EMPTY>
<!ATTLIST category
    value CDATA #REQUIRED
    display-value CDATA #IMPLIED
    proportion CDATA #IMPLIED
    missing (true | false) "false"
>
<!ELEMENT ordinal (order+)>
<!ATTLIST ordinal
    name CDATA #REQUIRED
>
<!ELEMENT order EMPTY>
<!ATTLIST order
    value CDATA #REQUIRED
    display-value CDATA #IMPLIED
    rank CDATA #REQUIRED
    proportion CDATA #IMPLIED
    missing (true | false) "false"
>
<!-- The predicates indicate the values that represent missing values -->
<!ELEMENT continuous ((%predicates;)*)>
<!ATTLIST continuous
    name CDATA #REQUIRED
    minimum CDATA #IMPLIED
    maximum CDATA #IMPLIED
    mean CDATA #IMPLIED
    median CDATA #IMPLIED
    standard-deviation CDATA #IMPLIED
    inter-quartile-range CDATA #IMPLIED
>
data-dictionary - the container whose contents define the complete set of
fields referenced by any model in the file. Any field referenced anywhere in
the PMML file must be declared here.
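As an illustration, a minimal data dictionary that declares two fields might
look like the sketch below. The field names and values are hypothetical and
are not taken from any real model.

```xml
<!-- Hypothetical example: field names and values are illustrative only -->
<data-dictionary>
  <categorical name="gender">
    <category value="F"/>
    <category value="M"/>
  </categorical>
  <continuous name="income"/>
</data-dictionary>
```

A model elsewhere in the file would then refer to these fields by the names
gender and income.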
categorical - one of the three data types that a field may have. A categorical
field is declared by name and lists one or more category entries.
category - one of the values that a categorical field may take. When exporting
a PMML file, make one entry for each category found in the training data. If
the value represents missing data, set its missing attribute to true (the
default is false).
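A sketch of a categorical field with one category flagged as missing is shown
below. The field name, the values, and the reading of proportion as the
fraction of training records that carry the value are all assumptions made for
illustration; the DTD above only declares proportion as optional character
data.

```xml
<!-- Hypothetical field; proportions are assumed to be fractions of the training records -->
<categorical name="color">
  <category value="R" display-value="Red" proportion="0.6"/>
  <category value="B" display-value="Blue" proportion="0.3"/>
  <!-- "?" is the code that stands for missing data in this made-up data set -->
  <category value="?" display-value="Unknown" proportion="0.1" missing="true"/>
</categorical>
```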
ordinal - one of the three data types that a field may have. An ordinal field
is declared by name and lists one or more order entries.
order - one of the values that an ordinal field may take. When exporting a
PMML file, make one entry for each ordinal value found in the training data;
each entry must carry a rank. If the value represents missing data, set its
missing attribute to true (the default is false).
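An ordinal field could be written along the following lines. The field name and
values are hypothetical, and the convention that a lower rank sorts earlier is
an assumption; the DTD only requires that rank be present on every entry,
including one that marks missing data.

```xml
<!-- Hypothetical field; lower rank is assumed to mean "earlier in the ordering" -->
<ordinal name="size">
  <order value="S" display-value="Small" rank="1"/>
  <order value="M" display-value="Medium" rank="2"/>
  <order value="L" display-value="Large" rank="3"/>
  <!-- rank is #REQUIRED even here; "0" is an arbitrary placeholder -->
  <order value="NA" rank="0" missing="true"/>
</ordinal>
```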
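continuous - one of the three data types that a field may have. A continuous
field is declared by name and may carry optional summary statistics (minimum,
maximum, mean, median, standard-deviation, inter-quartile-range). To indicate
which continuous values represent missing values, list a series of predicates
as the content of the element.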
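A continuous field with its optional summary statistics might be sketched as
follows. The field name and all numbers are invented for illustration. Because
the predicate tags are not defined in this section, the example leaves the
content empty, which the content model permits, and only marks where
predicates would go.

```xml
<!-- Hypothetical field; every statistic below is an illustrative value -->
<continuous name="age"
            minimum="18" maximum="90"
            mean="41.7" median="40"
            standard-deviation="12.3"
            inter-quartile-range="17">
  <!-- predicate elements identifying missing values would be listed here -->
</continuous>
```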