Scraping Digikey using Ruby and ScRUBYt
Sometimes the information you need in your application is already out there on a web site. In a BOM (Bill of Materials) app I'm building, I want to be able
to look up distributor part numbers from a manufacturer part number.
In this example I use Digikey, as they stock the vast majority of the parts I use in projects.
I use ScRUBYt, a Ruby library by Peter Szinek and Glenn Gillen.
It does the heavy lifting.
The data returned in this example is in the form of a table. ScRUBYt makes
parsing the table straightforward. You can use Firebug in Firefox to find the XPath to your
data if you're scraping a different website.
Here is the base code:
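require 'rubygems'
require 'scrubyt'
require 'pp'

digikey_data = Scrubyt::Extractor.define do
  # Manufacturer part number to search for
  thepartnumber = 'C0805C104K3RACTU'
  # Keep the URL free of line breaks by joining the pieces explicitly
  thesearch = 'http://search.digikey.com/scripts/DkSearch/dksus.dll?' +
              'lang=en&site=US&x=0&y=0&keywords=' + thepartnumber

  fetch thesearch

  # The search results come back as a table, one row per distributor part
  table "/html/body/div[2]/table" do
    row "//tr" do
      dist_pn "/td[1]"         # Digikey part number
      manuf_pn "/td[2]"        # manufacturer part number
      pn_description "/td[3]"  # part description
      pn_oem "/td[5]"          # manufacturer name
    end
  end
end

# Flatten the extracted rows into an array of hashes and print them
part_data = digikey_data.to_flat_hash
pp part_data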
The result:
sin-gwest-laptop:digikey gwest$ ruby digikey.rb
[{:pn_oem=>"Kemet",
:dist_pn=>"399-1168-2-ND",
:manuf_pn=>"C0805C104K3RACTU",
:pn_description=>"CAP .10UF 25V CERAMIC X7R 0805"},
{:pn_oem=>"Kemet",
:dist_pn=>"399-1168-1-ND",
:manuf_pn=>"C0805C104K3RACTU",
:pn_description=>"CAP .10UF 25V CERAMIC X7R 0805"},
{:pn_oem=>"Kemet",
:dist_pn=>"399-1168-6-ND",
:manuf_pn=>"C0805C104K3RACTU",
:pn_description=>"CAP .10UF 25V CERAMIC X7R 0805"}]
sin-gwest-laptop:digikey gwest$
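For the BOM lookup itself, the array of hashes above still has to be reduced to a list of distributor part numbers, and the search URL has to be built for whatever part number comes in. The following is only a rough sketch of how that could look in plain Ruby, with no ScRUBYt involved. The helper names `search_url` and `distributor_pns` are made up for illustration, and `sample_rows` is copied from the output above rather than fetched live.

```ruby
require 'uri'

# Base of the search URL used in the script above.
SEARCH_BASE = 'http://search.digikey.com/scripts/DkSearch/dksus.dll'

# Hypothetical helper: build the search URL for any part number,
# escaping characters that are not safe in a query string.
def search_url(part_number)
  query = URI.encode_www_form(
    'lang' => 'en', 'site' => 'US', 'x' => 0, 'y' => 0,
    'keywords' => part_number
  )
  "#{SEARCH_BASE}?#{query}"
end

# Hypothetical helper: collect the distributor part numbers listed for
# one manufacturer part number, given rows shaped like the output above.
def distributor_pns(rows, manuf_pn)
  rows.select { |row| row[:manuf_pn] == manuf_pn }
      .map { |row| row[:dist_pn] }
      .compact
      .uniq
end

# Sample rows copied from the result above (only the keys used here).
sample_rows = [
  { :dist_pn => '399-1168-2-ND', :manuf_pn => 'C0805C104K3RACTU' },
  { :dist_pn => '399-1168-1-ND', :manuf_pn => 'C0805C104K3RACTU' },
  { :dist_pn => '399-1168-6-ND', :manuf_pn => 'C0805C104K3RACTU' }
]

puts search_url('C0805C104K3RACTU')
puts distributor_pns(sample_rows, 'C0805C104K3RACTU').inspect
```

Building the query with `URI.encode_www_form` instead of string concatenation means a part number containing a space or an ampersand does not corrupt the URL.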