Monday, March 16, 2009

XSL/XPATH for BI Publisher

When you start developing RTF Templates and try to do something a bit advanced most likely you start seeing these words, XSL, XSLT, XPATH, XSL-FO…. And they might have scared or confused you. I’ve met many people use those terms without a correct understanding and everybody uses them in different ways so even I get sometimes confused when I talk with them. ;-) I think the part of the reason is coming from its own history.

What is XSL?

XSL stands for ‘Extensible Stylesheet Language’ and it was designed to describe how to transform and format XML files. But it split into different specifications (or languages) as listed below. So this XSL itself actually is really anything today, it rather be a family name if you will, which contains all the three related languages below.

Here is a list of the three languages that comprise the XSL.

  • XPATH – a language for navigating in XML documents
  • XSLT – a language for transforming XML documents
  • XSL-FO – a language for formatting XML documents

Let’s take at look at each language in more detail and why we need to know them for BI Publisher reports development.

Start from XPath…


XPath is a language for finding information in an XML document. XPath is used to navigate through elements and attributes in an XML document. XPath uses path expressions to navigate in XML documents. It is very similar to what we call ‘Path’ on Unix file system. Basically it acts as a navigation in XML files. Let’s say there is a parent node called ‘Department’ and it has a child node called ‘Employee’. Now you’re processing Employee node and want to get some data from the parent node, Department. This is where the XPath comes in. And it is in fact very simple to do this. You can type something like ‘../Department/BUDGET’. Yes, that’s it and it’s very similar to the path we use at the file system, right?

Path Expression


What makes XPath different from the path is its Path Expression. The Path Expression is very powerful and makes it much easier to access to any vaues in the XML files.
For example you can use ‘//’ (double slashes) to indicate that you want to get any Element and it doesn’t matter if what level the element is.

Here is a list of commonly used Path Expressions.

/

Selects from the root node

Example:
/Department (Selects the root element Department)
Department/Employee (Selects all Employee elements that are children of Department)


//

Selects nodes in the document from the current node that match the selection no matter where they are

Example:
//Department (Selects all Department elements no matter where they are in the XML)
Department//Employee (Selects all Employee elements that are descendant of the Department element, no matter where they are under the Department element)

.

Selects the current node

..

Selects the parent of the current node

@

Selects attributes

Example:
//@type (Selects all attributes that are named type)

Predicates

Also, there are some advanced expressions called ‘Predicates’. Predicates are something you might have seen in your template They are presented with square brackets. They are used to find a specific node and can have specific conditions to specify the node. For example, ‘/Department[1]’ will return the first Department node while ‘/Department[2]’ will return the second Department node. You can also have a condition in the predicates to pick a certain set of nodes only when the condition matches. For example, if you specify ‘/Department[Salary>5000]’ then it will return only the Department nodes that contains Salary element whose values are greater than 5000.

You can also specify relative position. The above example of ‘/Department[1]’ will always return the first Department node in the XML file. But if you have many groups or you’re grouping by a certain value (e.g. Department name) and you might want to pick the first node in the each group. In this case you can use ‘Department[first()]’ or ‘Department[position()=1].


Operators (Functions)


Also, XPath has its own Operators (or functions) that you can use to process or calculate your data in the XML file. For example there is a ‘substring’ function, which you can use to get a part of the data you want from a specified Element values. There are many other useful functions and all of the standard XPath functions can be used in the BI Publisher’s RTF Template. Here is a set of XPATH functions that are useful and we use in many cases with the RTF Template.

List of XPATH functions


  • substring()
  • substring-before('12/10','/')
  • replace("Bella Italia", "l", "")
  • upper-case() /lower-case()
  • contains()
  • distinct-values()
  • false()
  • sum()

Why this is for BI Publisher?


Now you have gone though the XPath basic and wondering ‘why do I need to know this?’ Here is list of example use cases where we think it is critical and very useful if you understand the XPath appropriately.

  • With for-each-group you need to specify a parent node and an element node where you want to group by with XPath appropriately
  • When you want to access to a parent node’s element values when you are processing tis child node.
  • For IF condition you might want to process data to build some valid conditions. For example if you want to get Employee name ‘Smith’ regardless whether it’s in upper case, lower case or the combination, you can use ‘upper-case()’ so that every type of ‘Smith’ will be matched.
  • Inside Chart definition you need to specify all the Element names appropriately with XPath. Also you can use the XPath functions to have conditions or process data inside the Chart.

These are just a few examples to list, but there are many other cases you can take advantage of the XPath and make your RTF Template development much easier and more flexible and powerful.

Also, note that using the XPath without an appropriate understanding might cause performance and resource allocation problems. Due to its flexibility you can achieve what you want to do by using many different ways with XPath. However, you might end up typing XPath codes that cause very resource intensive or unnecessary processing. The key is to have a right understanding of XPath and write the code in the most optimized way from maintenance and performance perspective.

At the next session, we’ll cover XSLT & XSL-FO. Stay tuned!

1 comment:

  1. Any real world examples using XPATH in RTF Template would be really appreciated.

    ReplyDelete