Day 1: Parsing DMIDecode output

In our first day we will write a dmidecode parser in nim

What to expect ?

let sample1 = """
# dmidecode 3.1
Getting SMBIOS data from sysfs.
SMBIOS 2.6 present.

Handle 0x0001, DMI type 1, 27 bytes
System Information
        Manufacturer: LENOVO
        Product Name: 20042
        Version: Lenovo G560
        Serial Number: 2677240001087
        UUID: CB3E6A50-A77B-E011-88E9-B870F4165734
        Wake-up Type: Power Switch
        SKU Number: Calpella_CRB
        Family: Intel_Mobile
"""

import dmidecode, tables

var obj : Table[string, dmidecode.Section]
obj = parseDMI(sample)
for secname, sec in obj:
    echo secname & " with " & $len(sec.props)
    for k, p in sec.props:
        echo "k : " & k & " => " & p.val 
        if len(p.items) > 0:
            for i in p.items:
                echo "\t\t I: ", i

Implementation

a while ago at work (https://github.com/zero-os/0-core) we needed to parse some dmidecode output, and it sounds like an good problem with enough concepts to get my feet wet in nim.

nimble ready!

mkdir dmidecode
cd dmidecode
nimble init

So how does dmidecode output look like?

# dmidecode 3.1
Getting SMBIOS data from sysfs.
SMBIOS 2.6 present.

Handle 0x0001, DMI type 1, 27 bytes
System Information
        Manufacturer: LENOVO
        Product Name: 20042
        Version: Lenovo G560
        Serial Number: 2677240001087
        UUID: CB3E6A50-A77B-E011-88E9-B870F4165734
        Wake-up Type: Power Switch
        SKU Number: Calpella_CRB
        Family: Intel_Mobile

or

Getting SMBIOS data from sysfs.
SMBIOS 2.6 present.

Handle 0x0000, DMI type 0, 24 bytes
BIOS Information
        Vendor: LENOVO
        Version: 29CN40WW(V2.17)
        Release Date: 04/13/2011
        ROM Size: 2048 kB
        Characteristics:
                PCI is supported
                BIOS is upgradeable
                BIOS shadowing is allowed
                Boot from CD is supported
                Selectable boot is supported
                EDD is supported
                Japanese floppy for NEC 9800 1.2 MB is supported (int 13h)
                Japanese floppy for Toshiba 1.2 MB is supported (int 13h)
                5.25"/360 kB floppy services are supported (int 13h)
                5.25"/1.2 MB floppy services are supported (int 13h)
                3.5"/720 kB floppy services are supported (int 13h)
                3.5"/2.88 MB floppy services are supported (int 13h)
                8042 keyboard services are supported (int 9h)
                CGA/mono video services are supported (int 10h)
                ACPI is supported
                USB legacy is supported
                BIOS boot specification is supported
                Targeted content distribution is supported
        BIOS Revision: 1.40
  • DMIDecode output is some meta like comments, versions and one or more sections
  • Section: consists of a
    • handle line
    • title line
    • one or more indented properties
  • Property: consists of
    • key
    • optional value
    • optional list of indented items

Mapping DMI to nim structures

So ourplan is to have an api like

dmifile = parseDMI(source)
dmifile["section1"]["property1"].value

Let's describe the document structure we have

import  sequtils, tables, strutils

type 
    Property* = ref object
        val*: string
        items*: seq[string]
type
    Section* = ref object
        handleLine*, title*: string
        props* : Table[string, Property]

method addItem(this: Property, item: string) =
    this.items.add(item)

As our parsing will depend on the indentation level we can use this handy function to get the indentation level of a line (number of spaces before the first asciiLetter)

proc getIndentLevel(line: string) : int = 
    for i, c in pairs(line):
        if not c.isSpaceAscii():
            return i
    return 0

It'd have been nicer to use takewhile, but it's not available in nim stdlib

    getindentlevel = lambda l:  len(list(takewhile(lambda c: c.isspace(), l)))

Parsing DMI source into nim structures

There're many ways to parse the DMI (e.g using regex which would be fairly simple "feel free to implement it" and kindly send me a PR to update this tutorial)

proc parseDMI* (source: string) : Table[string, Section]=

In plain english for output like this

Getting SMBIOS data from sysfs.
SMBIOS 2.6 present.

Handle 0x0000, DMI type 0, 24 bytes
BIOS Information
        Vendor: LENOVO
        Version: 29CN40WW(V2.17)
        Release Date: 04/13/2011
        ROM Size: 2048 kB
        Characteristics:
                PCI is supported
                BIOS is upgradeable
                BIOS shadowing is allowed
                Boot from CD is supported
                Selectable boot is supported
                EDD is supported
                Japanese floppy for NEC 9800 1.2 MB is supported (int 13h)
                Japanese floppy for Toshiba 1.2 MB is supported (int 13h)
                5.25"/360 kB floppy services are supported (int 13h)
                5.25"/1.2 MB floppy services are supported (int 13h)
                3.5"/720 kB floppy services are supported (int 13h)
                3.5"/2.88 MB floppy services are supported (int 13h)
                8042 keyboard services are supported (int 9h)
                CGA/mono video services are supported (int 10h)
                ACPI is supported
                USB legacy is supported
                BIOS boot specification is supported
                Targeted content distribution is supported
        BIOS Revision: 1.40

we have couple of states

type 
    ParserState = enum
        noOp, sectionName, readKeyValue, readList
  • noOp: no action yet
  • sectionName: read sectionName
  • readKeyValue: read a line has colon : in it into a key value pair
  • readList: when the next line has greater indentation level than the property line

so our state is noOp until we reach line Handle 0x0000, DMI type 0, 24 bytes then moves to sectionName

for line BIOS Information then state changes to reading properties

        Vendor: LENOVO
        Version: 29CN40WW(V2.17)
        Release Date: 04/13/2011
        ROM Size: 2048 kB
        Characteristics:

then we notice the indentation on the next line is greater than the one on the current line

                PCI is supported
        Characteristics:

so state moves into readList to read the items related to property Characterstics

                PCI is supported
                BIOS is upgradeable
                BIOS shadowing is allowed
                Boot from CD is supported
                Selectable boot is supported
                EDD is supported
                Japanese floppy for NEC 9800 1.2 MB is supported (int 13h)
                Japanese floppy for Toshiba 1.2 MB is supported (int 13h)
                5.25"/360 kB floppy services are supported (int 13h)
                5.25"/1.2 MB floppy services are supported (int 13h)
                3.5"/720 kB floppy services are supported (int 13h)
                3.5"/2.88 MB floppy services are supported (int 13h)
                8042 keyboard services are supported (int 9h)
                CGA/mono video services are supported (int 10h)
                ACPI is supported
                USB legacy is supported
                BIOS boot specification is supported
                Targeted content distribution is supported

and again it notices the indentation is of the next line is less than the current line

        BIOS Revision: 1.40
                Targeted content distribution is supported

so state switches again into readKeyValue

  • if we encounter an empty line:
    • if not in parsing state then it's a noOp we ignore meta and empty lines
    • if in parsing state current Section isn't nil we finish parsing the section object

proc parseDMI* (source: string) : Table[string, Section]=
    
    var
        state : ParserState = noOp
        lines = strutils.splitLines(source)
        sects = initTable[string, Section]()
        
        p: Property = nil
        s: Section = nil 
        k, v: string

Here we define the current state, code lines, initialize a table sects from sectionName to Section Object and variables p current property, s current section, k, v current property key, value

    for i, l in pairs(lines):

Start looping on index, line using pairs

pairs is kinda like enumerate in python

        if l.startsWith("Handle"):
            s = new Section
            s.props = initTable[string, Property]()
            s.handleline = l
            state = sectionName
            continue 

If we encounter the string Handle

  • create new section object and initialize it's props table
  • keep track of the handle line
  • switch state to reading sectionName
  • continue the loop to move to the title line
        if l == "": # can be just new line before reading any sections. 
            if s != nil:
                sects[s.title] = s
            continue

if line is empty and we have a section object not nil we finish the section and continue

        if state == sectionName:  # current line is the title line
            s.title = l
            state = readKeyValue  # change state into reading key value pairs

If state is sectionName:

  • this line is a title line
  • change state for the upcoming to readKeyValue
        elif state == readKeyValue:
            let pair = l.split({':'})
            k = pair[0].strip()
            if len(pair) == 2:
                v = pair[1].strip()
            else:                 # value can be empty
                v = ""
            p = Property(val: v)
            p.items = newSeq[string]()
            p.val = v

If state is readKeyValue

  • split the line on colon : to get key, value pair and set v to "" if not present
  • make current Property p and initialize its related fields items, val
            # current line indentation is <  nextline indentation => change state to readList
            if i < len(lines) and (getIndentlevel(l) < getIndentlevel(lines[i+1])) :
                state = readList

If the next line indentation is greater this means we're should be reading list of items regarding the current property p

            else:
                # add key/value pair directly
                s.props[k] = p

If not finish the property

        elif state == readList:
            # keep adding the current line to current property items and if dedented => change state to readKeyValue
            p.add_item(l.strip())
            if getindentlevel(l) > getindentlevel(lines[i+1]):
                state = readKeyValue 
                s.props[k] = p

if state is readList

  • keep adding items to current property p
  • if the indentation level decreased change state to readKeyValue and finish property
    return sects