The ArcSDE compressed binary representation is used to store geometry in
binary form. This representation requires that an offset and scale be applied
to the coordinates of a geometric object. The resulting integer coordinates
are then encoded as the delta from the previous coordinate. Optionally,
a CAD or ANNO object is appended to the geometric object.
Coordinate values
Internally, all ArcSDE coordinates are 64-bit non-negative integers between 0 and 2147483647
(if defined using a 32-bit coordinate reference) or between 0 and 9007199254740990 (if defined
using a 64-bit coordinate reference).
Note that 64-bit coordinates are actually limited to a 53-bit range so that no information is
lost when converting to or from double-precision floating-point representation.
This format provides better data accuracy, data integrity, and processing
speed than real numbers. Developers should be aware of the internal integer
representation, because it is possible to attempt to store a number that
is too large in a layer. In that case, the ArcSDE software returns the
error SE_COORD_OUT_OF_BOUNDS. Developers never need to work directly with
the integer values.
Because real-world coordinates are often neither positive nor integer, ArcSDE
data requires an offset distance (a false origin) to ensure numbers are positive
and a minimum resolution multiplier (called the scale) to convert real numbers
to integers. Offset distances are specified in the same units as the data. The scale can be any positive value up to 2147483645 if using a 32-bit
coordinate reference, and up to 9007199254740990 if using a 64-bit coordinate
reference.
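The conversion described above can be sketched as follows. This is an illustration only: the formula (subtract the offset, multiply by the scale), the round-to-nearest behavior, and the names `to_internal`, `to_real`, and `CoordOutOfBounds` are assumptions made for this example and are not part of the ArcSDE API.

```python
# Upper limits quoted in the text for 32-bit and 64-bit coordinate references.
MAX_32BIT = 2147483647
MAX_64BIT = 9007199254740990


class CoordOutOfBounds(ValueError):
    """Hypothetical stand-in for the SE_COORD_OUT_OF_BOUNDS error."""


def to_internal(value, offset, scale, limit=MAX_64BIT):
    """Convert a real-world coordinate to an internal integer (assumed formula)."""
    # Shift by the false origin, then multiply by the scale.
    # Rounding to the nearest integer is an assumption.
    internal = round((value - offset) * scale)
    if not 0 <= internal <= limit:
        raise CoordOutOfBounds(f"{value} maps to {internal}, outside 0..{limit}")
    return internal


def to_real(internal, offset, scale):
    """Invert to_internal, up to the resolution implied by the scale."""
    return internal / scale + offset
```

With an offset of -400 and a scale of 1000, a longitude of -180.0 maps to 220000, while any value below -400 falls outside the valid range and raises the error.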
This section describes the logical view of how an ArcSDE feature's geometry
is represented in a binary stream. There are three issues to present:
coordinate ordering, multipart delineation, and point compression.
Coordinate ordering
An ArcSDE
feature's geometry is represented by one or more coordinates. The coordinates
consist of, at a minimum, an x,y pair. A feature might also have z (zed)
or measure (m) values associated with each x,y pair. Each of these values,
x, y, z, and m, is represented internally as 64-bit integers. The order
in which these coordinates are stored in a binary stream is x/y, x/y,
..., x/y, z, z, ..., z, m, m, ..., m (again, with the z and m values
being optional). A one-to-one correspondence exists between the z or
m values and the x,y pairs. In other words, for each z coordinate or measure
value present in the feature geometry, an x,y pair exists.
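As a small illustration of this ordering, the sketch below flattens a single-part feature into the logical sequence x/y, ..., z, ..., m, .... The function name `order_coordinates` and its argument shapes are hypothetical and chosen only for this example.

```python
def order_coordinates(xy, zs=None, ms=None):
    """Return coordinates in the logical stream order: x/y pairs, then z, then m."""
    stream = []
    for x, y in xy:
        stream.extend((x, y))
    # z and m values are optional, but each must match the x,y pairs one-to-one.
    for extra in (zs, ms):
        if extra is not None:
            if len(extra) != len(xy):
                raise ValueError("one z or m value is required per x,y pair")
            stream.extend(extra)
    return stream
```

For two points with both z and m values, `order_coordinates([(1, 2), (3, 4)], zs=[10, 20], ms=[7, 8])` yields `[1, 2, 3, 4, 10, 20, 7, 8]`.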
Multipart delineation
An ArcSDE
feature may have one or more geometric parts (a single-part or multipart
feature). Each part is delineated by a separator coordinate within the
binary stream which represents a feature's geometry. The separator coordinate
has a predefined value. Beginning with a feature's second part, the separator
is the first coordinate of the part's ordered coordinate list. The coordinate
list of a multipart feature is stored in a binary stream as x/y, x/y,
..., x/y, <separator>, x/y, x/y, ..., x/y, z, z, ..., z, <separator>,
z, z, ..., z, m, m, ..., m, <separator>, m, m, ..., m (again, the
z and m values are optional).
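The multipart layout can be sketched in the same logical terms. Here `SEP` is a symbolic placeholder for the predefined separator coordinate, and `join_parts` and `order_multipart` are hypothetical helper names used only for illustration.

```python
SEP = "<separator>"  # symbolic placeholder for the predefined separator coordinate


def join_parts(parts):
    """Concatenate parts, placing a separator before every part after the first."""
    joined = []
    for index, part in enumerate(parts):
        if index > 0:
            joined.append(SEP)
        joined.extend(part)
    return joined


def order_multipart(xy_parts, z_parts=None, m_parts=None):
    """Return the logical stream order for a multipart feature."""
    stream = []
    for item in join_parts(xy_parts):
        if item is SEP:
            stream.append(SEP)
        else:
            stream.extend(item)  # unpack the (x, y) pair
    # Optional z and m runs repeat the same part boundaries.
    for extra_parts in (z_parts, m_parts):
        if extra_parts is not None:
            stream.extend(join_parts(extra_parts))
    return stream
```

A two-part feature with z values, `order_multipart([[(1, 2), (3, 4)], [(5, 6)]], z_parts=[[10, 20], [30]])`, produces `[1, 2, 3, 4, SEP, 5, 6, 10, 20, SEP, 30]`.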
Point compression
Within the binary stream, each of the x/y, z, and measure values is
compressed in a byte-order independent manner. The compression of feature
coordinates is done in two steps:
- All values are converted to a relative-offset scheme.
- Each relative-offset value is packed into the minimum number of bytes required to represent the value.
This section
describes the physical view of how an ArcSDE feature's geometry is stored
in a binary stream. There are three issues to present: separators, point
compression, and the binary layout.
Part separators
The physical
representation of the separators which delineate the parts of a feature
is an x value of negative one (-1) and a y value of zero (0); the z
and m values are undefined. Separators do not require any special logic
when being compressed.
Point compression
The compression
or decompression of the coordinates stored in the binary stream is a two
step process: the conversion to/from the relative-offset scheme and the
packing/unpacking of bytes. To compress coordinates, the values are converted
to relative-offsets, then packed into a byte array. To decompress coordinates,
the byte array is unpacked, then the values are converted to absolute
values. Each step is described below.
Relative-offset value calculation
The goal
of converting coordinate values to a relative offset scheme is to make
the values as small as possible so that they require fewer bits to represent
them. In an array of relative-offset values, the first value is an absolute
value (stored as a 32-bit integer) while each subsequent value is the
offset, or difference, from the previous absolute value. Therefore, given
N absolute values, the relative-offset values are calculated by:
relative_value[0] = absolute_value[0]
relative_value[1] = absolute_value[1] - absolute_value[0]
[...]
relative_value[N-2] = absolute_value[N-2] - absolute_value[N-3]
relative_value[N-1] = absolute_value[N-1] - absolute_value[N-2]
Given N relative values, the absolute values are calculated by:
absolute_value[0] = relative_value[0]
absolute_value[1] = absolute_value[0] + relative_value[1]
[...]
absolute_value[N-2] = absolute_value[N-3] + relative_value[N-2]
absolute_value[N-1] = absolute_value[N-2] + relative_value[N-1]
This method is efficient because points within a feature are usually close to neighboring points.
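The two calculations above translate directly into code. The sketch below works on a single list of integer values; how x and y values are interleaved before differencing is not covered here, and the function names are illustrative.

```python
from itertools import accumulate


def to_relative(absolute):
    """First value stays absolute; each later value is the difference from its predecessor."""
    relative = []
    previous = 0
    for value in absolute:
        relative.append(value - previous)
        previous = value
    return relative


def to_absolute(relative):
    """Rebuild absolute values with a running sum of the relative offsets."""
    return list(accumulate(relative))
```

For the values 1000, 1003, 1001, 1010 the relative offsets are 1000, 3, -2, 9, and summing them back restores the original list.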
Packing integer values
Relative-offset
values are generally represented with fewer bytes than absolute values.
The relative-offset values are packed into a series of bytes. The high-order
bit of each packed byte acts as a control bit to indicate whether the
(integer) value continues into the next byte. For example, if an integer
value is packed into three bytes, the high-order bit of bytes one and
two is set (indicating the integer value continues into the following
byte) and the high-order bit of byte three is not set (indicating that
it is the last byte of the integer value). The second bit of the first
byte acts as a sign bit. So the first packed byte contains one control
bit, one sign bit, and six data bits. All subsequent packed bytes contain
one control bit and seven data bits. Because fewer bits are available
to represent an integer, up to five packed bytes could be required to
represent an integer value (this is a worst case scenario and would only
occur when the integer value was greater than 134,217,727).
The record layout, by byte, for packed unsigned integers
Byte | Bit | Description
0 | 0 | Control bit (0 = last byte, 1 = integer value continues into the next)
0 | 1 | Sign bit (0 = positive integer, 1 = negative integer)
0 | 2-7 | Next low-order six bits of the integer
1-4 | 0 | Control bit (0 = last byte, 1 = integer value continues into the next)
1-4 | 1-7 | Next low-order seven bits of the integer
Integer values
are packed by taking the low-order six or seven bits (by performing a
binary AND operation between the value to be packed and the hexadecimal
values 3F or 7F, respectively), depending on which packed byte the value
is being stored in, and storing them in the packed byte. The original
value is then shifted to the right (i.e., dividing the value) by six or
seven bits. If the new, shifted value is nonzero, the control bit
in the packed byte is set, and the steps are repeated. This process
continues until the shifted value is zero. Unpacking is done in a similar
manner, but in the reverse order.
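A minimal sketch of this packing scheme follows. It extracts the low-order bits with a bitwise AND mask (hexadecimal 3F for the first byte, 7F for later bytes) and assumes that a negative value is stored as its magnitude with the sign bit set, which the text does not state explicitly. The names `pack_int` and `unpack_int` are illustrative, and no upper bound on the number of bytes is enforced.

```python
CONTROL_BIT = 0x80  # high-order bit: value continues into the next byte
SIGN_BIT = 0x40     # second bit of the first byte: value is negative


def pack_int(value):
    """Pack a signed integer: six data bits in the first byte, seven in the rest."""
    magnitude = abs(value)  # assumption: sign bit plus magnitude
    byte = (magnitude & 0x3F) | (SIGN_BIT if value < 0 else 0)
    magnitude >>= 6
    packed = bytearray()
    while magnitude:
        packed.append(byte | CONTROL_BIT)  # more bytes follow
        byte = magnitude & 0x7F
        magnitude >>= 7
    packed.append(byte)  # last byte, control bit clear
    return bytes(packed)


def unpack_int(buffer, position=0):
    """Return (value, next_position) for the packed integer starting at position."""
    byte = buffer[position]
    position += 1
    negative = bool(byte & SIGN_BIT)
    value = byte & 0x3F
    shift = 6
    while byte & CONTROL_BIT:
        byte = buffer[position]
        position += 1
        value |= (byte & 0x7F) << shift
        shift += 7
    return (-value if negative else value), position
```

Under these assumptions, 134,217,727 packs into four bytes and 134,217,728 needs a fifth, which matches the worst case described above.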
Binary layout
In addition
to the compressed coordinate values, additional information is stored
within the byte stream to provide information about the stored coordinate
values. The first eight bytes of the byte stream are reserved for the
additional information. Currently, two pieces of additional information
are stored within the byte stream: the size of the compressed point byte
stream and the dimension of the stored coordinates. Both values are stored
as packed integer values (as described previously). The length of the
coordinate byte stream is defined as the total length minus the reserved
eight bytes (i.e., the size of the compressed point byte stream) and is
stored in the first five bytes.
The coordinate
dimension indicates whether z and m values are present in the byte stream.
The dimension is a one-byte bit vector and is stored in the sixth byte
of the byte stream. The first low-order bit of the dimension vector indicates
whether z values are present, and the second low-order bit indicates whether
measure values are present. If the bit value is turned off (zero), then
the corresponding values are not present in the byte stream. If the bit
value is turned on (1), then the corresponding values are present. For
example, the dimension vector for two-dimensional coordinates has a hexadecimal
value of zero (0), for three-dimensional coordinates a value of one (1),
for measured two-dimensional coordinates a value of two (2), and for three-dimensional
coordinates with measures a value of three (3). The next two bytes of
the byte stream are not used currently, but are reserved for future use.
The compressed coordinate values are stored in the byte stream following
the reserved eight bytes.
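The dimension byte and the start of the coordinate data can be read as sketched below. The function name `split_stream` and the returned dictionary are hypothetical; the sketch reads the sixth byte directly, which is equivalent to unpacking it for the values 0 through 3, and it leaves the length field in the first five bytes undecoded.

```python
HEADER_SIZE = 8   # reserved bytes at the start of the stream
DIMENSION_AT = 5  # zero-based index of the sixth byte
Z_BIT = 0x01      # first low-order bit: z values present
M_BIT = 0x02      # second low-order bit: measure values present


def split_stream(blob):
    """Split a geometry byte stream into dimension flags and coordinate bytes."""
    if len(blob) < HEADER_SIZE:
        raise ValueError("stream is shorter than the eight reserved bytes")
    dimension = blob[DIMENSION_AT]
    return {
        "has_z": bool(dimension & Z_BIT),
        "has_m": bool(dimension & M_BIT),
        "coordinates": blob[HEADER_SIZE:],
    }
```

A dimension byte of 3 therefore reports both z and measure values, and everything from the ninth byte onward is treated as compressed coordinate data.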
The binary representation of a feature's geometry in ArcSDE
Bytes | Contents
0-4 | Coordinate stream length, packed integer format (byte stream length minus 8 reserved)
5 | Coordinate dimension mask, packed integer format
6 | Annotation dimension and entity type bitmask
7 | Transmitted shape flags bitmask
8+ | Compressed coordinate values, packed relative-offset format