Showcase
Pascal Raszyk edited this page May 12, 2016
Here are some pretty one-liners (OK, one-statements!) to give you an idea of what Infoboxer can do.
puts Infoboxer.wp.get('Argentina').infobox.fetch('leader_name1')
# Prints:
# Cristina Fernández de Kirchner
Shows:
- simple page and infobox data extraction;
- readable representation of tree nodes.
table = Infoboxer.wp.get('Porsche 991').
  sections('Engines' => 'Performance').
  tables.first
headings = table.heading_row.cells.map(&:to_s)
# => ["Model", "Transmission", "Engine", "Top speed", "Acceleration 0-100", "Emissions"]
table.body_rows.
  map{|tr|
    headings.zip(tr.cells.map(&:to_s)).to_h
  }
# => [{"Model"=>"Carrera", "Transmission"=>"7-speed man", "Engine"=>"3.4", "Top speed"=>"289 km/h", "Acceleration 0-100"=>"4.8", "Emissions"=>"211 g/km"},
# {"Model"=>"Carrera", "Transmission"=>"7-speed PDK", "Engine"=>"3.4", "Top speed"=>"287 km/h", "Acceleration 0-100"=>"4.6", "Emissions"=>"191 g/km"},
# ...and so on...
Shows:
- navigating by sections;
- tables;
- information extraction from tables.
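The zip step in the table example can be tried without network access. The sketch below is plain Ruby and makes no Infoboxer calls: `headings` and `rows` are hand-copied stand-ins (an assumption for illustration) for `table.heading_row.cells.map(&:to_s)` and the cell strings of `table.body_rows`, using the values printed above.

```ruby
# Stand-in data copied from the output above; no Infoboxer call is made here.
headings = ["Model", "Transmission", "Engine", "Top speed", "Acceleration 0-100", "Emissions"]
rows = [
  ["Carrera", "7-speed man", "3.4", "289 km/h", "4.8", "211 g/km"],
  ["Carrera", "7-speed PDK", "3.4", "287 km/h", "4.6", "191 g/km"]
]

# Pair each heading with the cell in the same column, then turn the pairs into a hash.
records = rows.map { |cells| headings.zip(cells).to_h }

puts records.first["Top speed"]
# Prints:
# 289 km/h
```

Note that `Array#zip` follows the length of the receiver: a row with fewer cells than headings gets `nil` for the missing columns, and extra cells are dropped.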
Infoboxer.wp.get('Kilgore Trout').
  sections('"Works" by Kilgore Trout' => /.*/).
  lookup(:ListItem).map{|li|
    {
      title: li.lookup(:Italic).first.text,
      mention: li.lookup(:Italic)[1].text,
      type: li.in_sections.first.heading.text_
    }
  }
# => [{:title=>"Barring-gaffner of Bagnialto or This Year's Masterpiece", :mention=>"Breakfast of Champions", :type=>"Novels"},
# {:title=>"The Big Board", :mention=>"Slaughterhouse-Five", :type=>"Novels"},
# ...and so on
Shows:
- navigating by sections;
- node tree lookup;
- node text extraction.
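To make the idea of a tree lookup concrete, here is a minimal sketch of a depth-first search by node type. `ToyNode` and its `lookup` method are hypothetical and written only for this illustration; they are not Infoboxer's classes, and the sketch makes no claim about how Infoboxer's own `lookup` is implemented.

```ruby
# Hypothetical toy node type for illustration; not part of Infoboxer.
ToyNode = Struct.new(:type, :text, :children) do
  # Collect this node and every descendant whose type matches, depth-first.
  def lookup(wanted)
    found = type == wanted ? [self] : []
    found + children.flat_map { |child| child.lookup(wanted) }
  end
end

# A hand-built list item with two italic fragments, loosely modelled on the output above.
item = ToyNode.new(:ListItem, nil, [
  ToyNode.new(:Italic, "The Big Board", []),
  ToyNode.new(:Text, " (mentioned in ", []),
  ToyNode.new(:Italic, "Slaughterhouse-Five", []),
  ToyNode.new(:Text, ")", [])
])

titles = item.lookup(:Italic).map(&:text)
puts titles.inspect
# Prints:
# ["The Big Board", "Slaughterhouse-Five"]
```

Because the result is an ordinary array in document order, `.first` and `[1]` pick out the first and second italic fragments, which is the shape the Kilgore Trout example relies on.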
Infoboxer.wikivoyage.get('Chiang Mai').
  sections('See' => 'Elephants').templates(name: 'see').
  fetch_hashes('name', 'address', 'price')
# => [{"name"=>#<Var(name): Baanchang Elephant Park>, "address"=>#<Var(address): 147/1 Rachadamnoen Rd>, "price"=>#<Var(price): 4500 baht a day (can be split b...>}
# ...and so on...
Shows:
- using sources other than Wikipedia;
- navigating by sections;
- using templates inside the document body;
- complex fetching from templates.
Infoboxer.wp.
  get('Argentina', 'Bolivia', 'Chile').
  infobox.fetch('leader_name1').
  lookup(:Wikilink).follow.
  infobox.fetch_hashes('name', 'office', 'birth_date')
# => [{"name"=>#<Var(name): Cristina Fernández de Kirchner>, "office"=>#<Var(office): President of Argentina>, "birth_date"=>#<Var(birth_date): 1953-02-19>},
# ...and so on
Shows:
- extracting several pages at once (in one request to Wikipedia API!);
- working with a list of pages, which is as simple as working with a list of nodes;
- following wikilinks and parsing the linked pages.
Next topics: