
Google Colab Activity

This example is adapted from the HTML scraping tutorial in “The Hitchhiker’s Guide to Python”.

  1. Open Google Colab at https://colab.research.google.com/. Log in with a Google account, then select “File -> New notebook” to get a basic coding environment.

  2. Add a code cell by clicking + Code. Copy the script below into the code cell, then click the play icon to run the code.

Input

from lxml import html
import requests

# Download the example page and parse its HTML into an element tree.
page = requests.get('http://econpy.pythonanywhere.com/ex/001.html')
tree = html.fromstring(page.content)

# Create a list of buyers from the text of each <div title="buyer-name">.
buyers = tree.xpath('//div[@title="buyer-name"]/text()')
# Create a list of prices from the text of each <span class="item-price">.
prices = tree.xpath('//span[@class="item-price"]/text()')

print(buyers)
print(prices)

You should see the following output.

Output

['Carson Busses', 'Earl E. Byrd', 'Patty Cakes', 'Derri Anne Connecticut', 'Moe Dess', 'Leda Doggslife', 'Dan Druff', 'Al Fresco', 'Ido Hoe', 'Howie Kisses', 'Len Lease', 'Phil Meup', 'Ira Pent', 'Ben D. Rules', 'Ave Sectomy', 'Gary Shattire', 'Bobbi Soks', 'Sheila Takya', 'Rose Tattoo', 'Moe Tell']

['$29.95', '$8.37', '$15.26', '$19.25', '$19.25', '$13.99', '$31.57', '$8.49', '$14.47', '$15.86', '$11.11', '$15.98', '$16.27', '$7.50', '$50.85', '$14.26', '$5.68', '$15.00', '$114.07', '$10.09']
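The two lists are parallel: the first price belongs to the first buyer, and so on. As a minimal sketch of a possible next step, the cell below pairs each buyer with a price and converts the price string to a number. The helper name `pair_buyers_with_prices` is hypothetical, and the sample values are simply the first three entries of the output above, so the cell runs without a network connection.

```python
# Sample values copied from the first three entries of the output above.
buyers = ['Carson Busses', 'Earl E. Byrd', 'Patty Cakes']
prices = ['$29.95', '$8.37', '$15.26']


def pair_buyers_with_prices(buyers, prices):
    """Return (buyer, price) tuples with each price converted to a float.

    Hypothetical helper for illustration; zip() stops at the shorter list,
    so unmatched entries are silently dropped.
    """
    return [(name, float(price.lstrip('$'))) for name, price in zip(buyers, prices)]


records = pair_buyers_with_prices(buyers, prices)
for name, amount in records:
    print(f"{name}: {amount:.2f}")
```

In the notebook, the same helper could be called on the full `buyers` and `prices` lists produced by the scraping script.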

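Use the Inspect function in your browser to explore how the script reflects the structure of the page: the XPath expressions select the text of div elements whose title attribute is "buyer-name" and of span elements whose class is "item-price".

[Image: Inspect element example]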
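To see how the XPath expressions line up with the markup without leaving the notebook, the sketch below runs the same two expressions against a small inline HTML string. The markup is a hypothetical, simplified stand-in modelled on the attributes named in the XPath expressions; the real page contains more elements than shown here.

```python
from lxml import html

# Hypothetical, simplified markup: only the attributes that the XPath
# expressions look for are reproduced here.
snippet = """
<html><body>
  <div title="buyer-name">Carson Busses</div>
  <span class="item-price">$29.95</span>
  <div title="buyer-name">Earl E. Byrd</div>
  <span class="item-price">$8.37</span>
</body></html>
"""

tree = html.fromstring(snippet)

# //div[@title="buyer-name"] matches every <div> whose title attribute
# equals "buyer-name"; /text() returns the text inside each match.
buyers = tree.xpath('//div[@title="buyer-name"]/text()')

# //span[@class="item-price"] matches every <span> whose class attribute
# equals "item-price".
prices = tree.xpath('//span[@class="item-price"]/text()')

print(buyers)  # ['Carson Busses', 'Earl E. Byrd']
print(prices)  # ['$29.95', '$8.37']
```

Changing an attribute value in the snippet, for example renaming `buyer-name`, makes the matching list come back empty, which is a quick way to check what an expression depends on.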

Optional - Data Miner Activity

If you haven’t already, install the Data Miner plugin for Chrome.

  1. Working from the “Recipes” tab, go to Google Scholar, do a search, and save the results of your search with a “Google Scholar” Recipe.

  2. Build a new recipe to scrape one of the following: