Dell Scraper – NINet

Due to failings in the past, our inventory system needs a revamp: it is missing both systems and information. As part of the revamp we decided to scrap everything we knew and re-inventory everything. We use a lot of Dell systems, and since we had already collected the service tags for a number of them, I figured we could get the information about each system from the Dell site. I wrote a quick Python script that takes a list of service tags and outputs the system info as a CSV file in the format “tag, shipdate, SysType, Ram, HDD, Support”.

E.g.

"4GXCB1H","2008-03-29","Latitude D430","2GB","80GB","3Y NBD INTERNATIONAL (NEXT BUSINESS DAY);UPG TO 4Y NBD INTERNATIONAL (NEXT BUSINE"
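
As a rough illustration of the output format above, here is a minimal sketch that builds one fully quoted line with Python's csv module. It is written for Python 3, not the Python 2 used by the script, and the function name format_row is hypothetical. The script itself joins the fields by hand, so this is an alternative, not what it does.

```python
import csv
import io

def format_row(tag, shipdate, systype, ram, hdd, support):
    """Return one fully quoted CSV line (no trailing newline).

    `support` is a list of warranty descriptions; they are joined
    with ';' to fit into the single Support column.
    """
    buf = io.StringIO()
    # QUOTE_ALL wraps every field in double quotes; the csv module also
    # doubles any quote character that appears inside a field.
    writer = csv.writer(buf, quoting=csv.QUOTE_ALL, lineterminator="")
    writer.writerow([tag, shipdate, systype, ram, hdd, ";".join(support)])
    return buf.getvalue()

print(format_row("4GXCB1H", "2008-03-29", "Latitude D430", "2GB", "80GB",
                 ["3Y NBD INTERNATIONAL", "UPG TO 4Y NBD INTERNATIONAL"]))
```

Letting the csv module do the quoting means a field that happens to contain a double quote or a comma still produces a valid line.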

Place a text file called “taglist.txt” in the same directory as the script, with the Dell service tags listed one per line. When the script is done, the data will be in the file called “SystemList.txt”. An example “taglist.txt”:

4GXCB1H
5GXAB1H
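
A minimal sketch of how such a tag list could be cleaned up before use, written for Python 3. The helper name parse_tags is hypothetical, and upper-casing the tags is my assumption (Dell service tags are normally written in capitals); the script below does not do this.

```python
def parse_tags(text):
    """Return the non-empty lines of `text` as a list of service tags.

    Surrounding whitespace and Windows line endings are stripped, and
    tags are upper-cased (an assumption, see the note above).
    """
    tags = []
    for line in text.splitlines():
        tag = line.strip().upper()
        if tag:  # skip blank lines
            tags.append(tag)
    return tags

# Usage with a real file would look like:
#   with open("taglist.txt") as handle:
#       tags = parse_tags(handle.read())
print(parse_tags("4GXCB1H\n\n 5gxab1h \r\n"))
```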

#!/usr/bin/python
#
# Copyright 2009 Ryan McLean <ryanm *at* ninet *.* org>
#
# dellscraper.py
#
# Dell Warranty Scraper

###Imports
from urllib2 import Request, urlopen, URLError
import os, re, sys

def processHTML(rawhtml):
  #Grab Shipdate
  res = re.search('Ship\sDate:</td><td\s.*?>(\d\d/\d\d/\d\d\d\d)</td>', rawhtml, re.IGNORECASE)
  shipdate = res.group(1)

  #Grab System Type
  res = re.search('System\sType:</td><td\s.*?>(.*?)</td></tr>', rawhtml, re.IGNORECASE)
  SysType = res.group(1)

  #Breakdown
  bd1 = re.search('<span>Part Description</span></td></tr><tr><td .*?><img .*? /></td></tr>(.*</table>)', rawhtml, re.IGNORECASE)
  table = bd1.group(1)
  table = re.sub('</table>','',table)
  table = re.sub('<tr>.*?</td><td','<td',table)
  table = re.sub('<td valign="top">','',table)
  table = re.sub('<.*?>', 'XXX', table)
  table = re.sub('X{2,}', '<x>', table)

  #Get RAM
  res = re.search('<x>MEMORY\s:\s(\d+[MG]B).*?<x>',table)
  Ram = res.group(1)

  #Get HDD
  res = re.search('<x>.*?(\d+GB).*?<x>',table)
  HDD = res.group(1)

  #Get Support Details
  Support = []
  res = re.search('<x>(\dY[R]?\s.*?)<x>',table)
  if (res):
    Support.append(res.group(1))

  res = ""

  res = re.search('<x>(UPG\sTO\s\dY[R]?\s.*?)<x>',table)
  if (res):
    Support.append(res.group(1))

  res = ""

  res = re.search('<x>(\dY[R]?\sPRO.*?)<x>',table)
  if (res):
    Support.append(res.group(1))

  return shipdate, SysType, Support, Ram, HDD

#fileI/O
def readfile(inFile):
  if not os.path.isfile(inFile):
    print 'Tag list not found: ' + inFile
    sys.exit(1)

  thisfile = open(inFile, 'r')

  tags = []
  for line in thisfile:
    tag = line.strip() # drop the line break and any surrounding whitespace
    if tag: # skip empty lines
      tags.append(tag)
  thisfile.close()
  return tags

def writefile(outfile, outline):
  thisfile = open(outfile, 'a')
  thisfile.write(outline + "\n")
  thisfile.close()

#format data and write
def fdata(tag, shipdate, SysType, Support, Ram, HDD):
  d,m,y = shipdate.split('/')
  shipdate = y + '-' + m + '-' + d

  # Join the support entries into a single ';'-separated field
  Support = ';'.join(Support)

  if re.match('\d\d\d\dMB',Ram):
    res = re.search('(\d)\d\d\dMB',Ram)
    Ram = res.group(1) + "GB"

  outstr = '"' + tag + '","' + shipdate + '","' + SysType + '","' + Ram + '","' + HDD + '","' + Support + '"'

  return outstr

###  MAIN ###

url = 'http://support.euro.dell.com/support/topics/topic.aspx/emea/shared/support/my_systems_info/en/details?c=uk&cs=RC1050265&l=en&s=bsd&~ck=anavml&~tab=2&~wsf=tabs&servicetag='

filename = 'taglist.txt'
tags = readfile(filename)

for tag in tags:
  try:
    req = Request(url + tag)
    response = urlopen(req)
    rawhtml = response.read()
  except URLError, e:
    if hasattr(e, 'reason'):
      print 'We failed to reach a server.'
      print 'Reason: ', e.reason
    elif hasattr(e, 'code'):
      print 'The server couldn\'t fulfill the request.'
      print 'Error code: ', e.code
  else:
    # Runs only when the request succeeded
    try:
      shipdate, SysType, Support, Ram, HDD = processHTML(rawhtml)
    except AttributeError:
      # A regex in processHTML did not match, so the page layout is not what we expect
      print 'Could not parse the page for tag ' + tag
      continue
    outstr = fdata(tag, shipdate, SysType, Support, Ram, HDD)
    writefile("./SystemList.txt", outstr)
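
For reference, the two conversions done in fdata (ship date to ISO format, and RAM in MB to GB) could also be sketched as below. This is Python 3, not the Python 2 of the script, and the helper names to_iso_date and normalise_ram are hypothetical. One deliberate difference: the script keeps only the first digit of a four-digit MB value, so 1536MB becomes 1GB, whereas this sketch only converts exact multiples of 1024 and leaves everything else unchanged.

```python
import re
from datetime import datetime

def to_iso_date(shipdate):
    """Convert 'dd/mm/yyyy' (as shown on the UK site) to 'yyyy-mm-dd'.

    Raises ValueError if the text is not a valid date in that format.
    """
    return datetime.strptime(shipdate, "%d/%m/%Y").strftime("%Y-%m-%d")

def normalise_ram(ram):
    """Turn e.g. '2048MB' into '2GB'; leave other values unchanged."""
    match = re.match(r"^(\d+)MB$", ram)
    if match:
        megabytes = int(match.group(1))
        # Only convert whole numbers of GB, so 512MB and 1536MB stay as they are
        if megabytes >= 1024 and megabytes % 1024 == 0:
            return "%dGB" % (megabytes // 1024)
    return ram

print(to_iso_date("29/03/2008"))
print(normalise_ram("2048MB"))
```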

3 thoughts on “Dell Scraper”

Douglas Reading says:

2011-07-12 at 09:12

Hi, I really like this script and I want to be able to use it but nothing outputs the SystemList.txt file when it completes. Can you point me in the right direction?

Ryan says:

2011-07-12 at 10:20

Douglas,

Dell changed the layout of their website and broke the script.

I’ll try to update it in the next few days, or at least get a PowerShell version done.

Regards,

Ryan

Ryan says:

2011-07-13 at 12:24

Douglas,

Actually 3 questions:
1) What country are you in? I developed this against UK systems and just noticed that “India” for example had a different layout.

2) If you are not in the UK, can you provide me (via the contact page) with the service tag of one of your systems?

3) Are you just after the Warranty Info or are you after the System details (HDD, RAM) as well?

Regards,

Ryan
