XPath treats an XML (or HTML) document as a tree of nodes, so every expression describes a path through a hierarchy.
XPath, the XML Path Language, is a query language for selecting nodes from an XML or HTML document. In addition, XPath may be used to compute values (e.g., strings, numbers, or Boolean values) from the content of an XML document. XPath was defined by the World Wide Web Consortium (W3C).
XPath is used to navigate through elements and attributes in an XML or HTML document.
XPath Syntax:
XPath expressions consist of tag names separated by forward slashes. Because the document is hierarchical, the top level of every HTML page is the <html> tag (note: XPath expressions do not include the angle brackets < and >). The path then moves down the tree structure of the HTML: there is most likely a <body> tag, perhaps followed closely by a <div>, a <ul> and several <li> tags. The html tag is generally considered the root, and the element you want to find is known as the target element. A tag may have one or more children, and a parent tag or node may itself be a child of another node.
We will use the HTML page below to learn each locating strategy.
<html>
<body>
<form id="loginForm" class="login-form">
<input id="uname" name="username" type="text" />
<input name="password" type="password" />
<input name="continue" type="submit" value="Login" />
</form>
<a href="forgot-password.html" id="forgotPassword">Forgot Password</a>
</body>
</html>
There are 2 types of XPath formats,
- Absolute XPath
- Relative XPath
An absolute XPath finds the target element by spelling out the full path from the root to the target. For instance, to find the 'form' element in the above HTML, which is a child of the body tag, the path is,
/html/body/form
A relative XPath drills down directly to the target element, wherever it sits in the document. Like,
//form
Note: If you want to drill down directly to any target element, then you should start with double slashes ('//').
Let's look at some of the cases that come up most often in XPaths.
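As a rough illustration of the two styles, here is a minimal Python sketch using the standard library's xml.etree.ElementTree, which implements only a subset of XPath. It assumes the sample page is well-formed XML. ElementTree evaluates paths relative to the element it is called on, so from the html root the absolute path /html/body/form is written as ./body/form, and //form is written as .//form. The variable names are illustrative only.

```python
import xml.etree.ElementTree as ET

# The sample page from above, parsed as XML (assumes well-formed markup).
PAGE = """<html>
<body>
<form id="loginForm" class="login-form">
<input id="uname" name="username" type="text" />
<input name="password" type="password" />
<input name="continue" type="submit" value="Login" />
</form>
<a href="forgot-password.html" id="forgotPassword">Forgot Password</a>
</body>
</html>"""

root = ET.fromstring(PAGE)  # root is the <html> element

# Absolute style: spell out every step from the root down to the target.
absolute_form = root.find("./body/form")

# Relative style: search the whole tree for the first <form>.
relative_form = root.find(".//form")

print(absolute_form.get("id"), relative_form.get("id"))
```

Both lookups reach the same form here. The absolute path breaks as soon as an extra wrapper element is added to the page, while the relative one keeps working.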
Basic XPath:
An XPath expression selects a node or a list of nodes on the basis of attributes such as id, name, class, etc., as illustrated below.
To find the user name element with the 'name' attribute,
Xpath=//input[@name='username']
To find the password element (it has no 'id' in the above HTML, so we use its 'name' attribute),
Xpath=//input[@name='password']
To find the submit button element with the 'type' attribute,
Xpath=//input[@type='submit']
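Attribute predicates like these can be tried out with a small Python sketch. It uses the standard library's xml.etree.ElementTree (a partial XPath implementation in which // is written as .//) on the form from the sample page; the variable names are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# The login form from the sample page.
FORM = """<form id="loginForm" class="login-form">
<input id="uname" name="username" type="text" />
<input name="password" type="password" />
<input name="continue" type="submit" value="Login" />
</form>"""

form = ET.fromstring(FORM)

# Each predicate [@attribute='value'] keeps only the matching <input>.
username = form.find(".//input[@name='username']")
by_id = form.find(".//input[@id='uname']")
submit = form.find(".//input[@type='submit']")

print(username.get("id"), by_id.get("name"), submit.get("value"))
```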
Multiple tags with same name:
For instance, look at the above HTML. There are 3 input tags, and all of them are children of the form tag. In this case, if you want to drill down directly to the 2nd input tag, you should specify the position of the input tag under its parent tag. Like this,
//input[2]
This finds every input tag that is the 2nd input child of its parent, which in the above HTML is the password field. What if there are more input tags under a different parent element? Then the expression matches one element per parent and is no longer unique. In that case you should always specify a parent tag so that the path from parent to child is unique. Like this,
//form/input[2]
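To see why the parent matters, consider the sketch below. It uses a hypothetical page with two forms (the second form and all names in it are invented for this example) and the standard library's xml.etree.ElementTree, where // is written as .//.

```python
import xml.etree.ElementTree as ET

# Hypothetical page with two forms, to show why the parent matters.
PAGE = """<html>
<body>
<form id="loginForm">
<input name="username" type="text" />
<input name="password" type="password" />
</form>
<form id="searchForm">
<input name="query" type="text" />
<input name="go" type="submit" />
</form>
</body>
</html>"""

root = ET.fromstring(PAGE)

# //input[2]: the 2nd <input> under each parent, so one match per form.
second_inputs = [el.get("name") for el in root.findall(".//input[2]")]

# Anchoring on a uniquely identified parent narrows it to one element.
login_second = [
    el.get("name")
    for el in root.findall(".//form[@id='loginForm']/input[2]")
]

print(second_inputs, login_second)
```

The position is counted among siblings under the same parent, not across the whole document, which is why the first query returns one element per form.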
Contains:
There is also a contains function that checks whether an attribute or the text of a tag contains a given substring. For example, look at the anchor tag in the above HTML. This tag has the text 'Forgot Password'. To find the element whose text contains 'Forgot Password', use,
//a[contains(text(),'Forgot Password')]
AND & OR operators in Xpath:
In an or expression, two conditions are used and at least one of them must be true to find the element. It also matches when both conditions are true. Note that XPath operators are case-sensitive and must be written in lowercase.
The XPath expression below finds the user name field in the above HTML. It identifies elements for which one or both conditions are true, so the user name field is found even if only one of the conditions is met.
Xpath=//input[@name='username' or @id='uname']
In an and expression, two conditions are used and both must be true to find the element. It fails to find the element if either condition is false. As with or, the operator must be written in lowercase.
The XPath expression below finds the user name field in the above HTML. It identifies the element for which both conditions are true, so the user name field is found only if both conditions are met.
Xpath=//input[@name='username' and @id='uname']
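The two operators can be approximated in Python as follows. This is only a sketch: the standard library's xml.etree.ElementTree does not support the operators themselves, so the "and" case is written as two chained predicates (which a full XPath engine treats the same way) and the "or" case is emulated by merging the results of two queries. A complete XPath 1.0 engine would accept the expressions above directly.

```python
import xml.etree.ElementTree as ET

# The login form from the sample page.
FORM = """<form id="loginForm" class="login-form">
<input id="uname" name="username" type="text" />
<input name="password" type="password" />
<input name="continue" type="submit" value="Login" />
</form>"""

form = ET.fromstring(FORM)

# "and": chained predicates must all hold for the same element.
both = form.findall(".//input[@name='username'][@id='uname']")

# One condition is false for every element, so nothing matches.
mismatch = form.findall(".//input[@name='password'][@id='uname']")

# "or": run one query per condition and merge the results,
# skipping elements that were already collected.
either = []
for path in (".//input[@name='username']", ".//input[@id='uname']"):
    for el in form.findall(path):
        if not any(el is seen for seen in either):
            either.append(el)

print(len(both), len(mismatch), len(either))
```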
Starts-with function:
The starts-with function matches elements whose attribute value begins with a given string. It is useful when an attribute value changes on refresh or after some operation on the webpage but keeps a fixed prefix. It works equally well for attributes whose value is static.
For example, suppose the id of a particular element changes on every refresh, so you cannot locate the element by its full id:
Id="username10"
Id="username100"
Id="usernameSomething"
and so on, but the initial text is the same. In this case, we use the starts-with expression.
The expression below finds every input element whose 'id' starts with 'username', which covers all of the ids listed above.
Xpath=//input[starts-with(@id, 'username')]
Of course, you can also use the contains function to match ids that contain 'username'.
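The difference between a prefix match and a substring match can be sketched in Python. The standard library's xml.etree.ElementTree does not implement these two XPath functions, so the sketch emulates them with ordinary string checks. The fragment and its ids (including "old-username" and "password7") are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment: some ids share the prefix "username".
FRAGMENT = """<form>
<input id="username10" type="text" />
<input id="usernameSomething" type="text" />
<input id="old-username" type="text" />
<input id="password7" type="password" />
</form>"""

form = ET.fromstring(FRAGMENT)

# Emulates //input[starts-with(@id, 'username')]
prefix_matches = [
    el.get("id")
    for el in form.iter("input")
    if el.get("id", "").startswith("username")
]

# Emulates //input[contains(@id, 'username')]
substring_matches = [
    el.get("id")
    for el in form.iter("input")
    if "username" in el.get("id", "")
]

print(prefix_matches)
print(substring_matches)
```

The substring check is the looser of the two: it also picks up "old-username", which the prefix check rejects.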
Text():
With the text function, we find an element by an exact match on its text. In our case, we find the anchor element in the above HTML whose text is "Forgot Password".
Xpath=//a[text() = 'Forgot Password']
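An exact text match can be tried with the Python sketch below. The standard library's xml.etree.ElementTree has no text() function, so the sketch uses the predicate [.='...'] (available since Python 3.7), which compares the element's complete text content. For an anchor that holds only plain text, as here, the result is the same.

```python
import xml.etree.ElementTree as ET

# Relevant part of the sample page.
PAGE = """<html>
<body>
<a href="forgot-password.html" id="forgotPassword">Forgot Password</a>
</body>
</html>"""

root = ET.fromstring(PAGE)

# Exact match: the whole text must equal the given string.
exact = root.findall(".//a[.='Forgot Password']")

# A partial string does not count as an exact match.
partial = root.findall(".//a[.='Forgot']")

print([a.get("id") for a in exact], partial)
```

Because the comparison is exact, stray whitespace or a change in capitalisation in the page text is enough to make the lookup fail. That is the usual reason to prefer contains in such cases.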
CSS Selectors:
For CSS identification, see the very useful post "CSS Selectors" from Sauce Labs.