Desktop Help 10.0 - Best practices for building terrain datasets

The following are a few ideas to keep in mind when building a terrain dataset:

As a data delivery format, if you have a choice between ASCII and LAS formatted data for lidar, choose LAS. More information about the data is recorded in the headers of the LAS files, and being binary, they're read more efficiently.
The data should be in a projected coordinate system. Unknown coordinate systems are not recommended. Geographic coordinates, such as decimal degrees, are not supported.
Using the terrain dataset is more straightforward when the z-values are in the same unit of measure as the x,y values.
The data involved should be contiguous. There can be gaps in the sampling, as is typical over water bodies or obscured areas when building topographic models, but the collection should form a logical whole. Disparate collections are best represented as separate terrain datasets. For example, building a terrain dataset with measurements from two neighboring counties is okay. Building a terrain dataset with measurements from two counties on opposite sides of the state is not.
It's best to build a terrain dataset from data gathered based on the same data collection specifications and accuracy requirements. Consistency on the data side will enable consistency, in terms of both performance and pyramid accuracy, on the terrain dataset side.
Disable the enforcement of breaklines in lower level-of-detail pyramid levels. This will improve display performance at smaller scales. Even though the breaklines won't be strictly enforced, their vertices will still contribute to the terrain dataset. A possible exception is breaklines used to delineate water features, since these features need to stand out even at small scales.
Avoid adding polygon features using polygon-based SFTypes where possible; they are more expensive to enforce than breaklines. Often, they can be added as line features. For example, you can add a lake boundary as a hardline rather than a hardreplace polygon. Replace polygons are only needed if you know there are other measurements in the polygon interior that need to be overridden.
Use terrain dataset groups to boost performance at smaller scales. Examples include detail-clip/generalized-clip and road centerline versus edge of pavement (two sided).
Use the fewest feature classes possible. This can improve build performance and, when breaklines are involved, speed up runtime use of the terrain dataset. Merge feature classes where appropriate.
Data should be clean and free from blunders before being used to build a terrain dataset. A terrain dataset is not designed to process raw data.
Review the extent of all participating feature classes before building the terrain dataset. Make sure they are as expected. Blunders (outlying points) are not uncommon and can wreak havoc on the terrain dataset build process.
Set the geoprocessing analysis extent to the study area before importing data. Points outside this extent will be excluded. This can prevent blunders from reaching the database.
Don't use large single-point feature classes. Use a multipoint feature class for large point collections (anything over 500,000 points). Multipoints should be spatially clustered. Limit each shape to 5,000 points if the vertices don't carry point IDs; otherwise, limit each shape to 3,000 points.
Do not use clip polygons as a means for extracting/processing subsets of a terrain dataset. All data gets triangulated/pyramided regardless, so using a clip polygon is not appropriate for this task. Instead, extract subsets into separate feature classes and use those to define the terrain dataset.
Make sure feature classes contributing to the terrain dataset have correct extents. If features have been deleted from a feature class, its extent may be outdated and incorrect. The geodatabase does not automatically recalculate the extent when features are deleted, because doing so is expensive. Use IFeatureClassManage.UpdateExtent to correct this situation before creating a terrain dataset; otherwise, the terrain dataset will define an incorrect tile system and attempt to build many tiles for which there is no data.
When using data containing different resolutions and/or densities, specify the point spacing from the data containing the smallest resolution when defining a terrain dataset that has more than 200,000 points. From a data analysis perspective, the data should not contain varying resolutions.
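
Several of the checks above (reviewing the extent of the input points, excluding points that fall outside the study area, and keeping multipoint shapes spatially clustered and within the per-shape size limits) can be prototyped outside the geodatabase before any data is loaded. The following is a minimal sketch in plain Python, not the ArcGIS API: the function names, the (x, y, z) tuple layout, and the tile_size parameter are all assumptions made for illustration, and grouping by a square grid is only one simple way to approximate spatial clustering.

```python
import math
from collections import defaultdict
from typing import Dict, Iterator, List, Sequence, Tuple

Point = Tuple[float, float, float]          # (x, y, z), hypothetical layout
Extent = Tuple[float, float, float, float]  # (xmin, ymin, xmax, ymax)

# Per-shape limits quoted in the guidelines above.
MAX_POINTS_PER_SHAPE = 5000
MAX_POINTS_PER_SHAPE_WITH_IDS = 3000


def data_extent(points: Sequence[Point]) -> Extent:
    """Return the x,y bounding box of the points so it can be reviewed."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return (min(xs), min(ys), max(xs), max(ys))


def clip_to_extent(points: Sequence[Point], extent: Extent) -> List[Point]:
    """Keep only points inside the study-area extent (drops outlying blunders)."""
    xmin, ymin, xmax, ymax = extent
    return [p for p in points if xmin <= p[0] <= xmax and ymin <= p[1] <= ymax]


def multipoint_batches(
    points: Sequence[Point],
    tile_size: float,
    has_point_ids: bool = False,
) -> Iterator[List[Point]]:
    """Yield spatially clustered batches, each small enough for one multipoint.

    Points are grouped by a square grid of tile_size units (an assumed,
    simplistic clustering scheme), then each group is split at the size limit.
    """
    limit = MAX_POINTS_PER_SHAPE_WITH_IDS if has_point_ids else MAX_POINTS_PER_SHAPE
    tiles: Dict[Tuple[int, int], List[Point]] = defaultdict(list)
    for p in points:
        key = (math.floor(p[0] / tile_size), math.floor(p[1] / tile_size))
        tiles[key].append(p)
    for key in sorted(tiles):
        members = tiles[key]
        for start in range(0, len(members), limit):
            yield members[start:start + limit]
```

A typical sequence would be to call data_extent on the raw points and compare the result with the expected study area, pass the points through clip_to_extent, and then write each batch from multipoint_batches as one multipoint shape. Comparing the reported extent with the study area is a quick way to spot a stray point far from the rest of the collection.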

8/18/2010