| title | output |
|---|---|
| Craigslist Lost and Found | html_document |
This app is currently deployed here.
Craigslist Lost and Found is a simple app that reads from a Google Sheet that has been "published to the web" (the sheet can be found here). The Google Sheet uses the built-in function `IMPORTXML` to scrape Craigslist's Lost and Found listings. In the app, you can view the posts listed for any one day and also see the overall number and type of posts.
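To illustrate the kind of processing described above, here is a minimal Python sketch that counts posts per day and lists the posts for one day. It assumes the sheet has been exported as CSV text with `date` and `title` columns; those column names, the sample rows, and the helper names `posts_per_day` and `posts_on` are hypothetical and are not taken from the actual sheet or app.

```python
import csv
import io
from collections import Counter

# Hypothetical CSV export; the real sheet's columns and contents may differ.
SAMPLE_CSV = """date,title
2015-03-01,Lost: black cat
2015-03-01,Found: set of keys
2015-03-02,Lost: wallet
"""


def posts_per_day(csv_text: str) -> Counter:
    """Count how many posts appear for each value of the 'date' column."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return Counter(row["date"] for row in reader)


def posts_on(csv_text: str, day: str) -> list[str]:
    """Return the titles of all posts whose 'date' column equals `day`."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [row["title"] for row in reader if row["date"] == day]


counts = posts_per_day(SAMPLE_CSV)
print(counts["2015-03-01"])
print(posts_on(SAMPLE_CSV, "2015-03-02"))
```

Parsing from a string keeps the sketch runnable offline; a real script would first download the published sheet's CSV.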
- Find the URL for the web page that you want to collect data from.
- Our example uses http://vancouver.craigslist.ca/search/laf.
- Get the XPath expression to scrape the elements you want from the page.
- Our example uses `//span[@class='pl']`.
- Enter the formula `=IMPORTXML(URL, XPATH_QUERY)` in a cell. Make sure there are enough blank cells to the right; otherwise the data may not fit and an error is thrown. If all goes well, a table of data populates to the right of the formula cell.
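
To get a feel for what the XPath expression in the steps above selects, here is a minimal Python sketch that runs the same query against a small, hypothetical HTML fragment. The markup is invented for illustration and the real Craigslist page may be structured differently. The standard library's `xml.etree.ElementTree` supports only a subset of XPath and expects a relative path, so the query is written as `.//span[@class='pl']` rather than `//span[@class='pl']`.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified markup; it must be well-formed XML for ElementTree.
SAMPLE_HTML = """
<html>
  <body>
    <p class="row">
      <span class="pl"><a href="/laf/1.html">Lost: black cat</a></span>
      <span class="pnr">(Kitsilano)</span>
    </p>
    <p class="row">
      <span class="pl"><a href="/laf/2.html">Found: set of keys</a></span>
    </p>
  </body>
</html>
"""


def select_post_titles(markup: str) -> list[str]:
    """Return the text of every <span class="pl"> element in the markup."""
    root = ET.fromstring(markup.strip())
    spans = root.findall(".//span[@class='pl']")
    # itertext() gathers text from nested elements such as the <a> link.
    return ["".join(span.itertext()).strip() for span in spans]


titles = select_post_titles(SAMPLE_HTML)
print(titles)
```

Only the `span` elements whose `class` attribute is exactly `pl` are matched, so the `pnr` span is skipped. Real-world HTML is often not well-formed XML, so a script aimed at a live page would likely need a more forgiving HTML parser.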
Resources on getting the XPath expression: it is useful to see the HTML of the page (e.g., in Chrome you can right-click and choose Inspect Element).