XPath (XML Path Language) allows retrieval of elements and attribute values from tree-structured XML/HTML documents. This article introduces a browser add-in that simplifies checking this information directly in your browser.
What is XPath?
I asked ChatGPT.
ChatGPT is convenient. As it explains, the HTML of a website should have the same structure on every page built from the same template, and XPath can use that structure to retrieve elements from within a page.
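The idea that one path works across every page of a template can be sketched in Python. This is only an illustration: the two pages, the `extract` helper, and the dates are hypothetical, and the markup is kept well-formed so the standard library's `xml.etree.ElementTree` (which supports a subset of XPath) can parse it.

```python
import xml.etree.ElementTree as ET
from typing import Optional

# Two hypothetical pages generated from the same template.
PAGE_A = "<html><body><article><h1>First post</h1><time>2023-01-10</time></article></body></html>"
PAGE_B = "<html><body><article><h1>Second post</h1><time>2023-02-20</time></article></body></html>"

# One path, written relative to the root <html> element.
DATE_PATH = "./body/article/time"


def extract(page: str, path: str) -> Optional[str]:
    """Return the text of the first element matching path, or None if nothing matches."""
    root = ET.fromstring(page)
    node = root.find(path)
    return node.text if node is not None else None


# The same path retrieves the date from both pages.
dates = [extract(page, DATE_PATH) for page in (PAGE_A, PAGE_B)]
print(dates)
```

Because both pages share the same structure, a single path is enough to pull the date out of each of them.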
Using the Developer Tools
Most common browsers come with built-in developer tools. Using them, you can obtain the path to the element where the data is displayed.
For example, if you open the Developer Tools on the XM Cloud - Enable Sitecore Content Hub Connector page, you will see the following screen:
In the Tools panel on the right, click the "Select an element in the page to inspect it" icon and then point to an element on the page to select the corresponding tag.
If you hover the mouse cursor over the item that shows the date on the blog page, the matching source code is also highlighted, as shown in the figure below.
By right-clicking on the relevant source code, you can choose from several ways to copy it from the Copy menu. Here, select Copy XPath.
This gives you the following XPath:
//*[@id="__next"]/main/div[1]/div[2]/p/time
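To see how the copied expression walks the tree, here is a minimal Python sketch that evaluates it against a simplified document. The sample markup and the date in it are invented for illustration and are not the real page source. The sketch uses the standard library's `xml.etree.ElementTree`, which supports only a subset of XPath and requires a leading `.` when searching from an element.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified stand-in for the page structure (not the real source).
SAMPLE = """
<html><body>
  <div id="__next">
    <main>
      <div>
        <div>Breadcrumb</div>
        <div>
          <h1>Sample title</h1>
          <p><time>2023-01-01</time></p>
        </div>
      </div>
    </main>
  </div>
</body></html>
"""

# The expression copied from the developer tools.
COPIED_XPATH = '//*[@id="__next"]/main/div[1]/div[2]/p/time'

root = ET.fromstring(SAMPLE.strip())

# ElementTree does not accept absolute paths on an element, so prefix "."
# to search all descendants of the root instead.
matches = root.findall("." + COPIED_XPATH)
texts = [node.text for node in matches]
print(texts)
```

Reading the expression left to right: find any element whose `id` is `__next`, go into `main`, take its first `div` child, then that element's second `div` child, then `p`, then `time`. Positions such as `div[1]` are counted from 1, not 0.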
Checking in your browser
There are various add-ins that let you check whether an XPath retrieves the expected data from the page you are viewing. Here we introduce Xpath Helper.
- Xpath Helper ( Available in Chrome / Edge )
When this add-in is turned on, two text boxes appear at the top of the browser window.
Enter the XPath obtained in the previous step in the text box on the left, and the result will be displayed in the text box on the right.
It can also be used to check metadata that is not actually displayed on the page. For example, use the following expression:
//meta[@property='og:description']/@content
The result is displayed as follows:
This tool allows you to retrieve and utilize data from web pages.
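The same metadata lookup can also be tried outside the browser. The following is a rough Python sketch under stated assumptions: the sample markup and its description text are hypothetical placeholders. The standard library's `xml.etree.ElementTree` cannot select attribute nodes with `/@content`, so the sketch selects the `meta` element first and then reads its `content` attribute.

```python
import xml.etree.ElementTree as ET

# Hypothetical head section; the content values are placeholders.
SAMPLE_HEAD = """
<html><head>
  <title>Sample page</title>
  <meta property="og:title" content="Sample page" />
  <meta property="og:description" content="A placeholder description." />
</head><body></body></html>
"""

root = ET.fromstring(SAMPLE_HEAD.strip())

# Equivalent of //meta[@property='og:description']/@content:
# match the element with the predicate, then read the attribute.
meta = root.find(".//meta[@property='og:description']")
description = meta.get("content") if meta is not None else None
print(description)
```

Real-world HTML is often not well-formed XML, so a dedicated HTML parser with full XPath support may be needed in practice. This sketch only shows the shape of the lookup.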
Summary
In this article, we have shown how to use the "Xpath Helper" add-in to retrieve data from a web page. The developer tools are enough for a simple check of a page's structure, but this add-in is also useful, and it will be covered further in a separate blog post in the future.