Learning XPath selectors
In the previous section, we learned about CSS selectors and how to use them, along with functions provided by the rvest package, to extract contents from web pages.
CSS selectors are powerful enough to serve most needs of HTML node matching. However, sometimes an even more powerful technique is required to select nodes that meet more specific conditions.
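To give a rough idea of what such a condition might look like, here is a minimal sketch. It assumes the rvest package is installed and uses a small inline HTML snippet that is made up purely for illustration (it is not one of the book's data files). The first two queries select the same nodes with a CSS selector and with an XPath expression. The third selects a node based on the text of its sibling, which is awkward to express with the CSS selectors covered so far:

```r
library(rvest)  # provides html_nodes() and html_text(), re-exports read_html()

# Hypothetical inline page, invented for this illustration only
page <- read_html('<ul>
  <li><span class="name">Item-1</span><span class="price">$10</span></li>
  <li><span class="name">Item-2</span><span class="price">$25</span></li>
</ul>')

# CSS selector: match <span> nodes of class "name" inside <li>
css_names <- html_text(html_nodes(page, "li span.name"))

# XPath: an equivalent expression for the same nodes
xpath_names <- html_text(html_nodes(page, xpath = "//li/span[@class='name']"))

# XPath with a condition on a sibling: the name of the item
# whose price text is exactly "$25"
match_name <- html_text(html_nodes(
  page,
  xpath = "//li[span[@class='price']='$25']/span[@class='name']"
))

print(css_names)
print(xpath_names)
print(match_name)
```

The predicate in square brackets filters the li nodes by the content of their child price node before the path continues to the name node. This is only a sketch of the idea, not a complete treatment of XPath syntax.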
Take a look at the following web page, which is a bit more complex than data/products.html:
This web page is stored as a standalone HTML file at data/new-products.html. The full source code is long, so we will only show the <body> here. Please go through the source code to get an impression of its structure:
<body> <h1>New Products</h1> <p>The following is a list of products</p> <div id="list" class="product-list"> <ul> <li> <span class="name">Product-A</span> <span class="price">$199...