XPath Support
To perform an XPath-based XML document search, tags and their data must be indexed. In ArcSDE, tagged data can be indexed according to the data type. Three data types are supported:
The double data type can be used for numeric data, such as 10, 23.285, or 19950603. The string data type can be defined for text strings greater than or equal to 256 characters in length, while the varchar data type is appropriate for text strings shorter than 256 characters, such as coded values or abbreviations (CDC, UNESCO, or ACASIAN). A full-text index is created for the string data type. A B-tree index is created on double and varchar data types. A full-text index is not suitable for coded values (double) or short strings for the following reasons:
Note: Full-text indexes are not available on all database management systems.
The XPath expression is defined according to the tag's data type. For example, consider the following XML document:
<?xml version = "1.0" ?>
In this example, you would index three tags:
Based on the data of these tags, define the first tag as string, the second as double, and the third as the varchar data type. Search the XML document by defining the XPath for these tags:
This is a simple example. A string or varchar data type could also be used for the second tag. However, in that case, you cannot use relational operators (<, >, <=, >=). Once the data type for a tag is defined, there are many ways to formulate an XPath expression to search XML documents.
Location path
XPath is an expression language. The ArcSDE XML type implementation supports a limited number of XPath expressions. One of the main expressions is a location path. A location path is a direction and may consist of a sequence of steps to select a node in a tree. Each step is separated by a forward slash (/). A sequence of steps in the expression forms a relationship between nodes in the document. The relationship is called an axis. Every axis is followed by a node test. A node test indicates which nodes are to be selected. A location path can be abbreviated or unabbreviated. For example:
Only the abbreviated form of location paths is supported. In general, a step in a location path consists of the following:
<axis><node_test><zero_or_many_predicates>
Therefore, an XPath expression can be broken into four parts:
Axes
The axis contains a part of the document, defined from the perspective of the current node (also called the context node). It determines what general category of nodes may be considered for the node tests. The ArcSDE XML API supports the following three XPath axes:
A node test limits the specific elements or attributes that will be addressed. A node test can be a NameTest or TypeTest. A NameTest selects nodes by name. A TypeTest selects nodes based on the type of node. For example: NameTest TypeTest The ArcSDE API supports the following two types of NameTests for node tests:
Note: If the node name is qualified with a namespace or a namespace prefix, that qualifier must be present in the node's tag within the XML document. For example, a search that uses the full namespace will not find documents that use the namespace prefix in the given node's tag. For more information on XML namespaces, see http://www.w3.org/TR/REC-xml-names.
A predicate filters a node set from an XPath expression, thereby reducing the returned node set from an XPath query. Generally, predicates are enclosed in square brackets:
[<node><relational_operator><value>]
A predicate can consist of multiple expressions that are separated by AND or OR. Parentheses can be used to group expressions to change the evaluation order. Positional predicates are not supported. For example, XPath allows an expression of the form /mynode[5] to return the fifth occurrence of the node "mynode". XPath expressions that test the nth occurrence of a tag for a given value are not supported; all occurrences of tags with the given name are searched.
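The predicate form above, and the way several predicates are joined with AND or OR, can be pictured with a small Python sketch. This is only an illustration of the semantics described on this page, not the ArcSDE API: the sample document, the helper name predicate_holds, and the OPS table are assumptions made for the example. In keeping with the text, every occurrence of the named node is tested and no positional selection is done.

```python
import operator
import xml.etree.ElementTree as ET

# Hypothetical sample document; tag names follow the examples on this page.
SAMPLE = """
<metadata>
  <Esri MetaID="1001">
    <idCitation>
      <Title area="9000">Migratory Birds</Title>
      <Date>19940112</Date>
      <RespParty><OrgName>UNESCO</OrgName></RespParty>
    </idCitation>
  </Esri>
</metadata>
"""

# Relational operators mentioned on this page.
OPS = {"=": operator.eq, "<": operator.lt, ">": operator.gt,
       "<=": operator.le, ">=": operator.ge}


def predicate_holds(root, path, node, op, value):
    """Return True if any element at `path` satisfies [<node><op><value>].

    `node` is a child element name, or '@name' for an attribute.
    """
    steps = path.strip("/").split("/")
    if root.tag != steps[0]:
        return False
    rest = "/".join(steps[1:])
    targets = root.findall(rest) if rest else [root]
    numeric = isinstance(value, (int, float))
    for element in targets:
        if node.startswith("@"):
            texts = [element.get(node[1:])]
        else:
            texts = [child.text for child in element.findall(node)]
        for text in texts:  # all occurrences are tested, never just the nth
            if text is None:
                continue
            try:
                left = float(text) if numeric else text.strip()
            except ValueError:
                continue  # non-numeric data cannot satisfy a numeric test
            if OPS[op](left, value):
                return True
    return False


root = ET.fromstring(SAMPLE)
a = predicate_holds(root, "/metadata/Esri", "@MetaID", "=", 1001)
b = predicate_holds(root, "/metadata/Esri/idCitation/RespParty", "OrgName", "=", "UNESCO")
c = predicate_holds(root, "/metadata/Esri/idCitation", "Date", "<", 19900101)

print(a and b)         # A AND B
print(a and (b or c))  # parentheses change the evaluation order
```

Each call stands for one location path with a single predicate, and Python's and/or stand in for the AND and OR that join such paths in a search expression.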
XPath 1.0 provides support for many types of functions.
The majority of these functions are not supported by the ArcSDE XML type implementation. Only one string function, contains(), is supported.
contains(string1, string2)
Returns true if string1 contains string2; otherwise, it returns false.
Note that [contains(Title, "Bird")] is not the same as [Title = "Bird"]. The first expression returns true for all Titles containing the word "Bird" or another form of the word, such as "Birds" or "Birding". The second expression returns true only for the Title "Bird".
The ArcSDE API supports the contains() function in the format [contains(string1, string2)]. Other forms, such as [contains(string1, string2)=boolean] or [contains(string1, string2)!=boolean], are not supported.
Location path aliases
ArcSDE also supports numeric aliases for location paths. Location path aliases do not have to be unique. For example, the FGDC Z39.50 code 31 refers to a Publication Date attribute, which is present in about half a dozen elements in the XML schema for FGDC metadata documents. Hence, a query on a path alias may actually search multiple location paths.
When specifying a search criterion with an alias, the numeric alias value is specified in a location_alias function. The location_alias function is not part of the XPath standard. It is an ArcSDE extension used to specify a location path using a numeric value. For instance, for Example 17 in the "Supported XPath expressions" section, the following tag can be defined as alias 1:
/metadata/Esri/idCitation/Date
Supported XPath expressions
Some examples of XPath expressions are provided here for illustration purposes. These examples cover three tag data types: SE_XML_INDEX_STRING_TYPE, SE_XML_INDEX_DOUBLE_TYPE, and SE_XML_INDEX_VARCHAR_TYPE. Note that SE_XML_INDEX_VARCHAR_TYPE should be used for simple data or single words because it uses a B-tree index, while SE_XML_INDEX_STRING_TYPE should be used for text because it uses full-text indexing.
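The guidance on choosing a tag data type can be summarized as a rough rule of thumb. The following Python sketch is a hypothetical helper, not part of the ArcSDE API: the function name suggest_index_type and the numeric pattern are assumptions, and the 256-character threshold is the guideline given earlier on this page. The returned names are the three index type constants mentioned above.

```python
import re

# Plain decimal numbers such as 10, 23.285, or 19950603 (assumed pattern).
_NUMBER = re.compile(r"[+-]?(\d+(\.\d*)?|\.\d+)")


def suggest_index_type(sample_value):
    """Suggest an index type name for a tag from one sample value."""
    text = sample_value.strip()
    if _NUMBER.fullmatch(text):
        # Numeric data: B-tree index, relational operators available.
        return "SE_XML_INDEX_DOUBLE_TYPE"
    if len(text) >= 256:
        # Long text: full-text index.
        return "SE_XML_INDEX_STRING_TYPE"
    # Short strings, coded values, abbreviations: B-tree index.
    return "SE_XML_INDEX_VARCHAR_TYPE"


for value in ("19950603", "ACASIAN", "word " * 60):
    print(suggest_index_type(value))
```

A single sample value is a weak basis for a real decision; how the tag will be searched (exact match, range comparison, or word search) matters at least as much as its length.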
Examples 9 through 12 demonstrate how you can construct ‘A AND B’, ‘(A AND B)’, ‘A OR B’, and ‘(A OR B)’ types of expressions, respectively. Example 13 shows how to use an ‘OR within contains’ clause. Example 14 demonstrates the use of an ‘OR within predicate’ clause, while Example 15 shows the use of an ‘AND within contains’ clause. Other patterns are also supported. These include:
Each combination can be tested on different tag data types (string, double, and varchar). For example, the (A AND B[C AND D]) combination can be tested as
Use of the string or varchar data type does not restrict the operations on string or varchar data. For example, the contains() function can be used in XPath for a string or varchar data type. Examples 5 and 7 show the use of contains() or the equal (=) operator for the varchar data type. If the contains() function is used in the XPath expression for a tag search, as it is in example 5, the LIKE function is used in the SQL statement. If the contains() function is used in a search that is not restricted to a single tag, as in example 7, the CONTAINS function is used for a full-text search.
The CONTAINS and LIKE functions used in SQL statements are not the same. For example, given a col_name value of 'Bird', these two SQL statements return different results:
SELECT col_name FROM table_name WHERE CONTAINS(col_name, 'Birds');
SELECT col_name FROM table_name WHERE col_name LIKE '%Birds%';
The first statement returns the word 'Bird', whereas the second returns no records.
Examples
Example 1
Find all documents with a /metadata/Esri/idCitation/Title element equal to 'Migratory Birds'.
XPath: /metadata/Esri/idCitation[Title = 'Migratory Birds']
The names metadata, Esri, and idCitation are all node tests. The slashes between them specify child axes. In other words, start at metadata, find its child nodes named 'Esri', and then find their child nodes named 'idCitation'.
Example 2
Find all documents with a /metadata/Esri/idCitation/Title element containing the word 'Bird'.
XPath: /metadata/Esri/idCitation[contains(Title, 'Bird')]
Example 3
Find all documents with a /metadata/Esri/idCitation/Title element containing the word 'Bird' or 'Virus'.
XPath: /metadata/Esri/idCitation[contains(Title, 'Bird') OR contains(Title, 'Virus')]
Note that OR joins the two contains() functions within a single predicate. Similarly, AND can be used if you want to find all documents that contain both words.
Example 4
Find all documents with a /metadata/Esri/idCitation/RespParty/OrgName element equal to the word 'ACASIAN'.
XPath: /metadata/Esri/idCitation/RespParty[OrgName = 'ACASIAN']
Example 5
Find all documents with a /metadata/Esri element with a MetaID attribute equal to 1001 and a /metadata/Esri/idCitation/RespParty/OrgName element equal to the word 'UNESCO'.
XPath: /metadata/Esri[@MetaID = 1001] AND /metadata/Esri/idCitation/RespParty[OrgName = 'UNESCO']
A similar query can combine an attribute test with the contains() function:
XPath: /metadata/Esri/idCitation/Title[@area < 10000] AND /metadata/Esri/idCitation/RespParty[contains(OrgName, 'UNESCO')]
Note: The first query can also be written as follows, but this type of expression is not supported for ArcSDE XML.
XPath: /metadata/Esri[@MetaID = 1001]/idCitation/RespParty[OrgName = 'UNESCO']
Example 6
Find all documents with a /metadata/Esri/idCitation/Title element with an area attribute less than or equal to 250,000, a /metadata/Esri/idCitation/Title element containing the word 'America' or 'Africa', and a /metadata/Esri/idCitation/RespParty/OrgName element equal to the word 'ACASIAN'.
XPath: /metadata/Esri/idCitation/Title[@area <= 250000] AND (/metadata/Esri/idCitation[contains(Title,'America') OR contains(Title,'Africa')]) AND /metadata/Esri/idCitation/RespParty[OrgName = 'ACASIAN']
Similarly, other XPaths can be defined as follows:
XPath: /metadata/Esri/idCitation/Title[@area < 250000] AND (/metadata/Esri/idCitation[contains(Title,'America') OR contains(Title,'Africa')]) AND /metadata/Esri/idCitation[contains(Title,'Virus')]
XPath: /metadata/Esri/idCitation/Title[@area < 250000] AND (/metadata/Esri/idCitation[contains(Title,'America') OR contains(Title,'Africa')]) AND /metadata/Esri[@MetaID >= 1001]
Example 7
Find all documents that contain specified values in any tag.
A. Find all documents that contain the word 'Birds' in any tag.
XPath: //*[contains(., 'Birds')]
B. Find all documents where the value of any tag is equal to 4200.
XPath: //*[. = 4200]
C. Find all documents that have the word 'ACASIAN' in any tag.
XPath: //*[. = 'ACASIAN']
Or you could use:
XPath: //*[contains(., 'ACASIAN')]
Example 8
Find all documents where a /metadata/Esri/idCitation/Date node has a value less than 19950101 and the document contains the phrase 'West Nile Virus'.
XPath: /metadata/Esri/idCitation[Date < 19950101] AND //*[contains(.,'West Nile Virus')]
Example 9
Find all documents where a /metadata/Esri/idCitation/Date node has a date equal to 19940112 and any node contains the word 'UNESCO'.
XPath: /metadata/Esri/idCitation[Date = 19940112] AND //*[contains(.,'UNESCO')]
Example 10
Find all documents where the attribute MetaID of a /metadata/Esri node is equal to 1001 and a /metadata/Esri/idCitation/Title node contains the word 'Africa'.
XPath: (/metadata/Esri[@MetaID = 1001] AND /metadata/Esri/idCitation[contains(Title,'Africa')])
Example 11
Find all documents where any node contains the word 'mosquitoes' or a /metadata/Esri/idCitation/Title node contains the word 'Africa'.
XPath: //*[contains(.,'mosquitoes')] OR /metadata/Esri/idCitation[contains(Title,'Africa')]
Example 12
Find all documents where the attribute MetaID of a /metadata/Esri node has a value of 1003 or a /metadata/Esri/idCitation/Date node has a date equal to 19940112.
XPath: (/metadata/Esri[@MetaID = 1003] OR /metadata/Esri/idCitation[Date = 19940112])
Example 13
Find all documents where any node contains the phrase 'Migratory Birds' or the word 'mosquitoes'.
XPath: (//*[contains(.,'Migratory Birds') or contains(.,'mosquitoes')])
Example 14
Find all documents where a /metadata/Esri/idCitation/RespParty/OrgName node is equal to 'ACASIAN' or 'CDC' and any node contains the word 'mosquitoes'.
XPath: (/metadata/Esri/idCitation/RespParty[OrgName = 'ACASIAN' OR OrgName = 'CDC'] AND //*[contains(.,'mosquitoes')])
Example 15
Find all documents where any node contains the phrase 'Migratory Birds' and the word 'Virus' and a /metadata/Esri/idCitation/Date node has a value less than 20050902.
XPath: //*[contains(.,'Migratory Birds') AND contains(.,'Virus')] AND /metadata/Esri/idCitation[Date < 20050902]
Example 16
Find all documents where a /metadata/Esri/@MetaID node has a value greater than or equal to 1000 and a /metadata/Esri/idCitation/Title node contains 'West Nile Virus' and 'Asia'.
XPath: (/metadata/Esri[@MetaID >= 1000] AND /metadata/Esri/idCitation[contains(Title,'West Nile Virus') AND contains(Title,'Asia')])
Example 17
Using aliases, find all documents with Date 19950603.
XPath: location_alias(1) = 19950603
Unsupported XPaths
The following are a few examples of unsupported XPaths. Do not use these with the ArcSDE XML API.
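Several restrictions stated earlier on this page (positional predicates, unabbreviated location paths, contains() compared with a boolean, and a predicate placed in the middle of a location path) can be screened for before a query is submitted. The following Python sketch is a rough, regex-based check written for illustration; the function name unsupported_features and the patterns are assumptions, it is not part of ArcSDE, and it is not a complete validator.

```python
import re

# Each entry pairs a restriction described on this page with a rough pattern.
# The patterns also scan quoted string literals, so a value such as '[5]'
# inside a title would be flagged by mistake.
CHECKS = [
    ("positional predicate", re.compile(r"\[\s*\d+\s*\]")),
    ("unabbreviated axis", re.compile(r"[A-Za-z-]+::")),
    ("contains() compared with a boolean",
     re.compile(r"contains\s*\([^)]*\)\s*!?=")),
    ("predicate in the middle of a location path", re.compile(r"\]\s*/")),
]


def unsupported_features(xpath):
    """Return labels of the restrictions that the expression appears to break."""
    return [label for label, pattern in CHECKS if pattern.search(xpath)]


for expression in (
    "/metadata/Esri/idCitation[Title = 'Migratory Birds']",
    "/mynode[5]",
    "/metadata/Esri[@MetaID = 1001]/idCitation/RespParty[OrgName = 'UNESCO']",
):
    print(expression, "->", unsupported_features(expression))
```

An empty list means only that none of these four patterns matched; it does not guarantee that ArcSDE accepts the expression.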