Nim Days

The Nim Days book is about my journey using Nim and creating useful/practical things with it, including:

  • ini parser
  • bencode parser
  • links checker
  • tictactoe (commandline and gui)
  • testing framework
  • build system
  • tcp router
  • redis parser
  • redis client
  • assets bundler
  • terminal table
  • dotfiles manager
  • urlshortening application

This book is influenced by the great books Practical Common Lisp and Real World Haskell, and I'm planning to follow the same model of having the book available for free online.

Reporting issues

You can report issues or create pull requests on the book repository

Day 1: Parsing DMIDecode output

On our first day we will write a dmidecode parser in Nim.

What to expect?

let sample1 = """
# dmidecode 3.1
Getting SMBIOS data from sysfs.
SMBIOS 2.6 present.

Handle 0x0001, DMI type 1, 27 bytes
System Information
        Manufacturer: LENOVO
        Product Name: 20042
        Version: Lenovo G560
        Serial Number: 2677240001087
        UUID: CB3E6A50-A77B-E011-88E9-B870F4165734
        Wake-up Type: Power Switch
        SKU Number: Calpella_CRB
        Family: Intel_Mobile
"""

import dmidecode, tables

var obj : Table[string, dmidecode.Section]
obj = parseDMI(sample1)
for secname, sec in obj:
    echo secname & " with " & $len(sec.props)
    for k, p in sec.props:
        echo "k : " & k & " => " & p.val 
        if len(p.items) > 0:
            for i in p.items:
                echo "\t\t I: ", i

Implementation

A while ago at work (https://github.com/zero-os/0-core) we needed to parse some dmidecode output, and it sounded like a good problem with enough concepts to get my feet wet in Nim.

nimble ready!

mkdir dmidecode
cd dmidecode
nimble init

So what does dmidecode output look like?

# dmidecode 3.1
Getting SMBIOS data from sysfs.
SMBIOS 2.6 present.

Handle 0x0001, DMI type 1, 27 bytes
System Information
        Manufacturer: LENOVO
        Product Name: 20042
        Version: Lenovo G560
        Serial Number: 2677240001087
        UUID: CB3E6A50-A77B-E011-88E9-B870F4165734
        Wake-up Type: Power Switch
        SKU Number: Calpella_CRB
        Family: Intel_Mobile

or

Getting SMBIOS data from sysfs.
SMBIOS 2.6 present.

Handle 0x0000, DMI type 0, 24 bytes
BIOS Information
        Vendor: LENOVO
        Version: 29CN40WW(V2.17)
        Release Date: 04/13/2011
        ROM Size: 2048 kB
        Characteristics:
                PCI is supported
                BIOS is upgradeable
                BIOS shadowing is allowed
                Boot from CD is supported
                Selectable boot is supported
                EDD is supported
                Japanese floppy for NEC 9800 1.2 MB is supported (int 13h)
                Japanese floppy for Toshiba 1.2 MB is supported (int 13h)
                5.25"/360 kB floppy services are supported (int 13h)
                5.25"/1.2 MB floppy services are supported (int 13h)
                3.5"/720 kB floppy services are supported (int 13h)
                3.5"/2.88 MB floppy services are supported (int 13h)
                8042 keyboard services are supported (int 9h)
                CGA/mono video services are supported (int 10h)
                ACPI is supported
                USB legacy is supported
                BIOS boot specification is supported
                Targeted content distribution is supported
        BIOS Revision: 1.40
  • dmidecode output consists of some metadata (comments, versions) and one or more sections
  • Section: consists of
    • handle line
    • title line
    • one or more indented properties
  • Property: consists of
    • key
    • optional value
    • optional list of indented items

Mapping DMI to nim structures

So our plan is to have an API like

dmifile = parseDMI(source)
dmifile["section1"]["property1"].value

Let's describe the document structure we have

import  sequtils, tables, strutils

type 
    Property* = ref object
        val*: string
        items*: seq[string]
type
    Section* = ref object
        handleLine*, title*: string
        props* : Table[string, Property]

method addItem(this: Property, item: string) =
    this.items.add(item)

As our parsing will depend on the indentation level, we can use this handy proc to get the indentation level of a line (the number of leading whitespace characters before the first non-space character):

proc getIndentLevel(line: string) : int = 
    for i, c in pairs(line):
        if not c.isSpaceAscii():
            return i
    return 0
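For example, on the sample lines above (a quick sanity check, assuming the property lines are indented with 8 spaces):

doAssert getIndentLevel("        Vendor: LENOVO") == 8
doAssert getIndentLevel("Handle 0x0001, DMI type 1, 27 bytes") == 0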

It would have been nicer to use something like Python's takewhile, but it's not available in Nim's stdlib:

    getindentlevel = lambda l:  len(list(takewhile(lambda c: c.isspace(), l)))

Parsing DMI source into nim structures

There are many ways to parse the DMI output (e.g. using regex, which would be fairly simple; feel free to implement it and kindly send me a PR to update this tutorial).

proc parseDMI* (source: string) : Table[string, Section]=

In plain English, for output like this

Getting SMBIOS data from sysfs.
SMBIOS 2.6 present.

Handle 0x0000, DMI type 0, 24 bytes
BIOS Information
        Vendor: LENOVO
        Version: 29CN40WW(V2.17)
        Release Date: 04/13/2011
        ROM Size: 2048 kB
        Characteristics:
                PCI is supported
                BIOS is upgradeable
                BIOS shadowing is allowed
                Boot from CD is supported
                Selectable boot is supported
                EDD is supported
                Japanese floppy for NEC 9800 1.2 MB is supported (int 13h)
                Japanese floppy for Toshiba 1.2 MB is supported (int 13h)
                5.25"/360 kB floppy services are supported (int 13h)
                5.25"/1.2 MB floppy services are supported (int 13h)
                3.5"/720 kB floppy services are supported (int 13h)
                3.5"/2.88 MB floppy services are supported (int 13h)
                8042 keyboard services are supported (int 9h)
                CGA/mono video services are supported (int 10h)
                ACPI is supported
                USB legacy is supported
                BIOS boot specification is supported
                Targeted content distribution is supported
        BIOS Revision: 1.40

we have a couple of states

type 
    ParserState = enum
        noOp, sectionName, readKeyValue, readList
  • noOp: no action yet
  • sectionName: read sectionName
  • readKeyValue: read a line that has a colon : in it into a key value pair
  • readList: when the next line has greater indentation level than the property line

So our state is noOp until we reach the line Handle 0x0000, DMI type 0, 24 bytes, then it moves to sectionName

For the line BIOS Information the state changes to reading properties

        Vendor: LENOVO
        Version: 29CN40WW(V2.17)
        Release Date: 04/13/2011
        ROM Size: 2048 kB
        Characteristics:

then we notice the indentation on the next line is greater than the one on the current line

                PCI is supported
        Characteristics:

so the state moves into readList to read the items related to the property Characteristics

                PCI is supported
                BIOS is upgradeable
                BIOS shadowing is allowed
                Boot from CD is supported
                Selectable boot is supported
                EDD is supported
                Japanese floppy for NEC 9800 1.2 MB is supported (int 13h)
                Japanese floppy for Toshiba 1.2 MB is supported (int 13h)
                5.25"/360 kB floppy services are supported (int 13h)
                5.25"/1.2 MB floppy services are supported (int 13h)
                3.5"/720 kB floppy services are supported (int 13h)
                3.5"/2.88 MB floppy services are supported (int 13h)
                8042 keyboard services are supported (int 9h)
                CGA/mono video services are supported (int 10h)
                ACPI is supported
                USB legacy is supported
                BIOS boot specification is supported
                Targeted content distribution is supported

and again it notices that the indentation of the next line is less than that of the current line

        BIOS Revision: 1.40
                Targeted content distribution is supported

so state switches again into readKeyValue

  • if we encounter an empty line:
    • if not in a parsing state it's a noOp; we ignore metadata and empty lines
    • if in a parsing state and the current Section isn't nil, we finish parsing the section object

proc parseDMI* (source: string) : Table[string, Section]=
    
    var
        state : ParserState = noOp
        lines = strutils.splitLines(source)
        sects = initTable[string, Section]()
        
        p: Property = nil
        s: Section = nil 
        k, v: string

Here we define the current state and the source lines, initialize a table sects mapping sectionName to Section objects, and declare the variables p (current property), s (current section), and k, v (current property key and value).

    for i, l in pairs(lines):

Start looping on index, line using pairs

pairs is kinda like enumerate in Python
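A tiny illustration, outside the parser:

for i, line in pairs(@["first", "second"]):
    echo i, ": ", line
# prints:
# 0: first
# 1: second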

        if l.startsWith("Handle"):
            s = new Section
            s.props = initTable[string, Property]()
            s.handleline = l
            state = sectionName
            continue 

If we encounter the string Handle

  • create a new section object and initialize its props table
  • keep track of the handle line
  • switch state to reading sectionName
  • continue the loop to move to the title line
        if l == "": # can be just new line before reading any sections. 
            if s != nil:
                sects[s.title] = s
            continue

If the line is empty and we have a section object that isn't nil, we finish the section and continue.

        if state == sectionName:  # current line is the title line
            s.title = l
            state = readKeyValue  # change state into reading key value pairs

If state is sectionName:

  • this line is a title line
  • change state to readKeyValue for the upcoming lines
        elif state == readKeyValue:
            let pair = l.split({':'})
            k = pair[0].strip()
            if len(pair) == 2:
                v = pair[1].strip()
            else:                 # value can be empty
                v = ""
            p = Property(val: v)
            p.items = newSeq[string]()
            p.val = v

If state is readKeyValue

  • split the line on colon : to get key, value pair and set v to "" if not present
  • create the current Property p and initialize its related fields items, val
            # current line indentation is <  nextline indentation => change state to readList
            if i+1 < len(lines) and (getIndentLevel(l) < getIndentLevel(lines[i+1])):
                state = readList

If the next line's indentation is greater, this means we should be reading a list of items belonging to the current property p

            else:
                # add key/value pair directly
                s.props[k] = p

If not, finish the property

        elif state == readList:
            # keep adding the current line to current property items and if dedented => change state to readKeyValue
            p.addItem(l.strip())
            if i+1 < len(lines) and getIndentLevel(l) > getIndentLevel(lines[i+1]):
                state = readKeyValue 
                s.props[k] = p

if state is readList

  • keep adding items to current property p
  • if the indentation level decreased change state to readKeyValue and finish property
    return sects

Day 2: Parsing Bencode

nim-bencode is a library to encode/decode Bencode, the encoding used by torrent files.

What to expect?

import bencode, tables, strformat

let encoder = newEncoder()
let decoder = newDecoder()

let btListSample1 = @[BencodeType(kind:btInt, i:1), BencodeType(kind:btString, s:"hi") ]
var btDictSample1 = initOrderedTable[BencodeType, BencodeType]()
btDictSample1[BencodeType(kind:btString, s:"name")] = BencodeType(kind:btString, s:"dmdm")
btDictSample1[BencodeType(kind:btString, s:"lang")] = BencodeType(kind:btString, s:"nim")
btDictSample1[BencodeType(kind:btString, s:"age")] = BencodeType(kind:btInt, i:50)
btDictSample1[BencodeType(kind:btString, s:"alist")] = BencodeType(kind:btList, l:btListSample1)

var testObjects = initOrderedTable[BencodeType, string]()
testObjects[BencodeType(kind: btString, s:"hello")] = "5:hello"
testObjects[BencodeType(kind: btString, s:"yes")] = "3:yes"
testObjects[BencodeType(kind: btInt, i:55)] = "i55e"

testObjects[BencodeType(kind: btInt, i:12345)] = "i12345e"
testObjects[BencodeType(kind: btList, l:btListSample1)] = "li1e2:hie"
testObjects[BencodeType(kind:btDict, d:btDictSample1)] = "d4:name4:dmdm4:lang3:nim3:agei50e5:alistli1e2:hiee"


for k, v in testObjects.pairs():
    echo $k & " => " & $v
    doAssert(encoder.encodeObject(k) == v)
    doAssert(decoder.decodeObject(v) == k)

Implementation

So according to the Bencode spec we have the following data types:

  • strings and those are encoded with the string length followed by a colon and the string itself length:string, e.g yes will be encoded into 3:yes
  • ints those are encoded between i, e letters, e.g 59 will be encoded into i59e
  • lists can contain any of the Bencode types and are encoded between l, e; e.g. a list of the numbers 1, 2 is encoded into li1ei2ee, or with spaces for verbosity l i1e i2e e
  • dicts are mapping from strings to any type and encoded between letters d, e, e.g name => hi and num => 3 is encoded into d4:name2:hi3:numi3ee or with spaces for verbosity d 4:name 2:hi 3:num i3e e

Imports

import strformat, tables, json, strutils, hashes

As we will be dealing a lot with strings and tables.

Types

type 
    BencodeKind* = enum
        btString, btInt, btList, btDict

So, as mentioned above about the Bencode data types, we can define an enum to represent the kinds

    BencodeType* = ref object
        case kind*: BencodeKind 
        of BencodeKind.btString: s* : string 
        of BencodeKind.btInt: i*    : int
        of BencodeKind.btList: l*   : seq[BencodeType]
        of BencodeKind.btDict: d*  : OrderedTable[BencodeType, BencodeType]

    Encoder* = ref object
    Decoder* = ref object 
  • Encoder a simple class to represent encoding operations
  • Decoder a simple class to represent decoding operations
  • For BencodeType we make use of variant objects (case classes in other languages). It's worth noticing that variant objects are the same technique used in the json module.

So we can use it like this

BencodeType(kind: btString, s:"hello")
BencodeType(kind: btInt, i:55)
let btListSample1 = @[BencodeType(kind:btInt, i:1), BencodeType(kind:btString, s:"hi") ]
BencodeType(kind: btList, l:btListSample1)

So the general rule for case objects is that you have a kind defined in an enum and a constructor value you create the object with.

If you're coming from Haskell or a similar language

data BValue = BInt Integer
            | BStr B.ByteString
            | BList [BValue]
            | BDict (M.Map BValue BValue)
            deriving (Show, Eq, Ord)

Please note: if you define your own variant you should define hash and == procs to be able to hash or compare the values.

proc hash*(obj: BencodeType): Hash = 
    case obj.kind
    of btString : !$(hash(obj.s))
    of btInt : !$(hash(obj.i))
    of btList: !$(hash(obj.l))
    of btDict: 
        var h = 0
        for k, v in obj.d.pairs:
            h = h !& hash(k) !& hash(v)
        !$(h)
  • hash proc returns Hash and depending on the kind we return the hash of the underlying stored objects, strings, ints, lists or calculate a new hash if needed
  • !& consider it like merging the two hashes together
  • !$ is used to finalize the Hash object
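A small standalone illustration of combining and finalizing hashes with !& and !$ (a sketch, separate from the BencodeType hash proc above):

import hashes

var h: Hash = 0
h = h !& hash("name") !& hash(50)
echo !$h  # the finalized hash value
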
proc `==`* (a, b: BencodeType): bool =
    ## Check two nodes for equality
    if a.isNil:
        if b.isNil: return true
        return false
    elif b.isNil or a.kind != b.kind:
        return false
    else:
        case a.kind
        of btString:
            result = a.s == b.s
        of btInt:
            result = a.i == b.i
        of btList:
            result = a.l == b.l
        of btDict:
            if a.d.len != b.d.len: return false
            for key, val in a.d:
                if not b.d.hasKey(key): return false
                if b.d[key] != val: return false
            result = true

Define the equality operator on BencodeTypes to determine when they're equal, by defining a proc for the == operator.

proc `$`* (a: BencodeType): string = 
    case a.kind
    of btString:  fmt("<Bencode {a.s}>")
    of btInt: fmt("<Bencode {a.i}>")
    of btList: fmt("<Bencode {a.l}>")
    of btDict: fmt("<Bencode {a.d}>")

Define a simple toString proc using the $ operator.

Encoding

proc encode(this: Encoder,  obj: BencodeType) : string

We add a forward declaration for the encode proc because, to encode a list, we might need to encode other values (strings, or even lists), so we will recursively call encode when needed; feel free to skip to the next part.

proc encode_s(this: Encoder, s: string) : string=
    # TODO: check len
    return $s.len & ":" & s

To encode a string, as we said, we emit its length + : + the string itself

proc encode_i(this: Encoder, i: int) : string=
    # TODO: check len
    return fmt("i{i}e") 

To encode an int we put it between i, e chars
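A couple of quick sanity checks for the two procs above (a sketch; encode_s and encode_i are not exported, so this lives inside the module):

let e = Encoder()
doAssert e.encode_s("yes") == "3:yes"
doAssert e.encode_i(59) == "i59e"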

proc encode_l(this: Encoder, l: seq[BencodeType]): string =
    var encoded = "l"
    for el in l:
        encoded &= this.encode(el)
    encoded &= "e"
    return encoded
  • To encode a list of elements of type BencodeType we put their encoded values between l, e chars
  • Notice the call to this.encode that's why we needed the forward declaration.
proc encode_d(this: Encoder, d: OrderedTable[BencodeType, BencodeType]): string =
    var encoded = "d"
    for k, v in d.pairs():
        assert k.kind == BencodeKind.btString
        encoded &= this.encode(k) & this.encode(v)

    encoded &= "e"
    return encoded
  • To encode a dict we enclose the encoded value of the pairs between d, e
  • Notice the recursive call to this.encode to the keys and values
  • Notice the assertion the kind of the keys must be a btString according to Bencode specs.
proc encode(this: Encoder,  obj: BencodeType) :  string =
    case obj.kind
    of BencodeKind.btString:  result =this.encode_s(obj.s)
    of BencodeKind.btInt :  result = this.encode_i(obj.i)
    of BencodeKind.btList : result = this.encode_l(obj.l)
    of BencodeKind.btDict : result = this.encode_d(obj.d)

Simple proxy to encode obj of BencodeType

Decoding

proc decode(this: Decoder,  source: string) : (BencodeType, int)

Forward declaration for decode, same as we did for encode

proc decode_s(this: Decoder, s: string) : (BencodeType, int) =
    let lengthpart = s.split(":")[0]
    let sizelength = lengthpart.len
    let strlen = parseInt(lengthpart)
    return (BencodeType(kind:btString, s: s[sizelength+1..sizelength+strlen]), sizelength+1+strlen)

Decoding a string is basically the reverse of encode_s: read the length part before the colon, then read that many characters after the colon and construct a BencodeType of kind btString out of it.
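For example (a sketch; again inside the module since decode_s isn't exported):

let d = Decoder()
let (obj, consumed) = d.decode_s("5:hello")
doAssert obj == BencodeType(kind: btString, s: "hello")
doAssert consumed == 7  # "5:" plus the 5 characters of "hello"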

proc decode_i(this: Decoder, s: string) : (BencodeType, int) =
    let epos = s.find('e')
    let i = parseInt(s[1..<epos])
    return (BencodeType(kind:btInt, i:i), epos+1)

Extract the number between the i and e chars and construct a BencodeType of kind btInt out of it
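Similarly, for example:

doAssert Decoder().decode_i("i59e") == (BencodeType(kind: btInt, i: 59), 4)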

proc decode_l(this: Decoder, s: string): (BencodeType, int) =
    # l ... e
    var els = newSeq[BencodeType]()
    var curchar = s[1]
    var idx = 1
    while idx < s.len:
        curchar = s[idx]
        if curchar == 'e':
            idx += 1
            break
    
        let pair = this.decode(s[idx..<s.len])
        let obj = pair[0]
        let nextobjpos = pair[1] 
        els.add(obj)
        idx += nextobjpos
    return (BencodeType(kind:btList, l:els), idx)

Decoding the list can be a bit tricky

  • Its elements are between l, e chars
  • So we start trying to decode objects starting from the first letter after the l until we reach the final e e.g
li1ei2ee

will be parsed like the following

li120ei492ee
 $   $
  • will consume the object i120e and set the cursor to the beginning of the second object i492e
  • after all the objects are consumed we consume the end character e and we are done
  • That's why all decode procs return an int value, to let us know how many characters to skip
proc decode_d(this: Decoder, s: string): (BencodeType, int) =
    var d = initOrderedTable[BencodeType, BencodeType]()
    var curchar = s[1]
    var idx = 1
    var readingKey = true
    var curKey: BencodeType
    while idx < s.len:
        curchar = s[idx]
        if curchar == 'e':
            break
        let pair = this.decode(s[idx..<s.len])
        let obj = pair[0]
        let nextobjpos = pair[1]
        if readingKey == true:
            curKey = obj
            readingKey = false
        else:
            d[curKey] = obj
            readingKey = true
        idx += nextobjpos
    return (BencodeType(kind:btDict, d: d), idx)
  • Same technique as above
  • Basically we read one object; if we don't have a current key then we set it as the current key
  • If we do have a current key then the object we read is the value, so we map curKey to that value in the dict and switch back to readingKey mode.
proc decode(this: Decoder,  source: string) : (BencodeType, int) =
    var curchar = source[0]
    var idx = 0
    while idx < source.len:
        curchar = source[idx]
        case curchar
        of 'i':
            let pair = this.decode_i(source[idx..<source.len])
            let obj = pair[0]
            let nextobjpos = pair[1] 
            idx += nextobjpos
            return (obj, idx)
        of 'l':
            let pair = this.decode_l(source[idx..<source.len])
            let obj = pair[0]
            let nextobjpos = pair[1] 
            idx += nextobjpos
            return (obj, idx)
        of 'd':
            let pair = this.decode_d(source[idx..<source.len])
            let obj = pair[0]
            let nextobjpos = pair[1] 
            idx += nextobjpos
            return (obj, idx)
        else: 
            let pair = this.decode_s(source[idx..<source.len])
            let obj = pair[0]
            let nextobjpos = pair[1] 
            idx += nextobjpos
            return (obj, idx)

Starts decoding based on the first character of the encoded object: i for ints, l for lists, d for dicts; otherwise it tries to parse a string

proc newEncoder*(): Encoder =
    new Encoder

proc newDecoder*(): Decoder = 
    new Decoder

Simple constructor procs for newEncoder, newDecoder

proc encodeObject*(this: Encoder, obj: BencodeType) : string =
    return this.encode(obj)

encodeObject dispatches the call to the encode proc.

proc decodeObject*(this: Decoder, source:string) : BencodeType =
    let p = this.decode(source)
    return p[0]

decodeObject provides a friendlier API that returns just the BencodeType from decode, instead of the (BencodeType, int) tuple.

Day 3: Talking to C (FFI and libmagic)

Libmagic is a magic number recognition library; remember every time you called the file utility on a file to know its type?

➜  file /usr/bin/rm
/usr/bin/rm: ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, for GNU/Linux 3.2.0, BuildID[sha1]=cbae26b2a032b1ce3129d56aee2bcf70dd8deeb0, stripped
➜  nim-magic file /
/: directory
➜  file /usr/include/stdio.h
/usr/include/stdio.h: C source, ASCII text

What to expect?

import magic

echo magic.guessFile("/usr/bin/rm")

The output should be something like

ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, for GNU/Linux 3.2.0, BuildID[sha1]=cbae26b2a032b1ce3129d56aee2bcf70dd8deeb0, stripped

Implementation

FFI Chapter of Nim in Action is freely available.

Step 0: Imports

from os import fileExists, expandFilename

Step 1: Get the library info

Well, libmagic has a shared library libmagic.so in your library path (/usr/lib/libmagic.so) and a header file magic.h in /usr/include/magic.h. Create a constant for the libmagic library name.

const libName* = "libmagic.so"

Step 2: Extract constants

We should extract the constants from the header

#define MAGIC_NONE              0x0000000 /* No flags */
#define MAGIC_DEBUG             0x0000001 /* Turn on debugging */
#define MAGIC_SYMLINK           0x0000002 /* Follow symlinks */
#define MAGIC_COMPRESS          0x0000004 /* Check inside compressed files */
#define MAGIC_DEVICES           0x0000008 /* Look at the contents of devices */
#define MAGIC_MIME_TYPE         0x0000010 /* Return the MIME type */
#define MAGIC_CONTINUE          0x0000020 /* Return all matches */
#define MAGIC_CHECK             0x0000040 /* Print warnings to stderr */
....

So in Nim it'd be something like this

const  MAGIC_NONE*  = 0x000000                 # No flags 
const  MAGIC_DEBUG* = 0x000001                 # Turn on debugging 
const  MAGIC_SYMLINK* = 0x000002                 # Follow symlinks 
const  MAGIC_COMPRESS* = 0x000004                # Check inside compressed files 
const  MAGIC_DEVICES* = 0x000008                 # Look at the contents of devices 
const  MAGIC_MIME_TYPE* = 0x000010            # Return only the MIME type 
const  MAGIC_CONTINUE* = 0x000020             # Return all matches 
const  MAGIC_CHECK* = 0x000040                 # Print warnings to stderr 
const  MAGIC_PRESERVE_ATIME* = 0x000080        # Restore access time on exit 
const  MAGIC_RAW* = 0x000100                    # Don't translate unprint chars 
const  MAGIC_ERROR* = 0x000200                 # Handle ENOENT etc as real errors 
const  MAGIC_MIME_ENCODING* = 0x000400         # Return only the MIME encoding 
const  MAGIC_NO_CHECK_COMPRESS* = 0x001000     # Don't check for compressed files 
const  MAGIC_NO_CHECK_TAR* = 0x002000         # Don't check for tar files 
const  MAGIC_NO_CHECK_SOFT* = 0x004000         # Don't check magic entries 
const  MAGIC_NO_CHECK_APPTYPE* = 0x008000        # Don't check application type 
const  MAGIC_NO_CHECK_ELF* = 0x010000            # Don't check for elf details 
const  MAGIC_NO_CHECK_ASCII* = 0x020000         # Don't check for ascii files 
const  MAGIC_NO_CHECK_TOKENS* = 0x100000         # Don't check ascii/tokens 

Step 3: Extract the types

typedef struct magic_set *magic_t; so the only type we have is a pointer to some struct (object)

type Magic = object
type MagicPtr* = ptr Magic 

Step 4: Extract procedures

magic_t magic_open(int);
void magic_close(magic_t);

const char *magic_getpath(const char *, int);
const char *magic_file(magic_t, const char *);
const char *magic_descriptor(magic_t, int);
const char *magic_buffer(magic_t, const void *, size_t);

const char *magic_error(magic_t);
int magic_getflags(magic_t);
int magic_setflags(magic_t, int);

int magic_version(void);
int magic_load(magic_t, const char *);
int magic_load_buffers(magic_t, void **, size_t *, size_t);

int magic_compile(magic_t, const char *);
int magic_check(magic_t, const char *);
int magic_list(magic_t, const char *);
int magic_errno(magic_t);

we only care about magic_open, magic_load, magic_close, magic_file, magic_error

# magic_t magic_open(int);
proc magic_open(i:cint) : MagicPtr {.importc, dynlib:libName.}

magic_open is a proc declared in the dynamic lib libmagic.so that takes a cint ("compatible C int") i and returns a MagicPtr.

From the manpage

The function magic_open() creates a magic cookie pointer and returns it. It returns NULL if there was an error allocating the magic cookie. The flags argument specifies how the other magic functions should behave

# void magic_close(magic_t);
proc magic_close(p:MagicPtr): void {.importc,  dynlib:libName.}

magic_close is a proc declared in dynlib libmagic.so that takes an argument p of type MagicPtr and returns void

From the manpage

The magic_close() function closes the magic(5) database and deallocates any resources used.

#int magic_load(magic_t, const char *);
proc magic_load(p:MagicPtr, s:cstring) : cint {.importc, dynlib: libName.}

magic_load is a proc declared in dynlib libmagic.so that takes an argument p of type MagicPtr and a cstring ("compatible C string") s, and returns a cint

From manpage:

The magic_load() function must be used to load the colon separated list of database files passed in as filename, or NULL for the default database file before any magic queries can performed.

#const char *magic_error(magic_t);
proc magic_error(p: MagicPtr) : cstring  {.importc, dynlib:libName.}

magic_error is a proc declared in dynlib libmagic.so that takes an argument p of type MagicPtr and returns a cstring

From manpage

The magic_error() function returns a textual explanation of the last error, or NULL if there was no error.

#const char *magic_file(magic_t, const char *);
proc magic_file(p:MagicPtr, filepath: cstring): cstring {.importc, dynlib: libName.} 

magic_file is a proc declared in dynlib libmagic.so that takes an argument p of type MagicPtr and a filepath of type cstring, and returns a cstring

From manpage:

The magic_file() function returns a textual description of the contents of the filename argument, or NULL if an error occurred. If the filename is NULL, then stdin is used.

Step 5: Friendly API

It'd be annoying for people to write C-style code and take care of pointers and such in a higher-level language like Nim.

So let's expose a proc guessFile that takes a filepath and flags and internally uses the functions we exposed through the FFI in the previous step.

proc guessFile*(filepath: string, flags: cint = MAGIC_NONE): string =
    var mt : MagicPtr
    mt = magic_open(flags)
    discard magic_load(mt, nil)

    if fileExists(expandFilename(filepath)):
        result = $magic_file(mt, cstring(filepath))
    magic_close(mt)

Only one note here: to convert from cstring to string we use the toString operator $

        result = $magic_file(mt, cstring(filepath))
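For example, passing the MIME type flag defined earlier (a small usage sketch; the exact output depends on your libmagic database):

echo guessFile("/usr/include/stdio.h", MAGIC_MIME_TYPE)
# something like: text/x-c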

Day 4: LinksChecker

What to expect?

We will be writing a simple links checker in both sequential and asynchronous styles in Nim.

Implementation

Step 0: Imports

import  os, httpclient
import strutils
import times
import asyncdispatch

Step 1: Data types

type
    LinkCheckResult = ref object 
        link: string
        state: bool

LinkCheckResult is a simple representation for a link and its state

Step 2: GO Sequential!

proc checkLink(link: string) : LinkCheckResult  =
    var client = newHttpClient()
    try:
        return LinkCheckResult(link:link, state:client.get(link).code == Http200)
    except:
        return LinkCheckResult(link:link, state:false)

Here, we have a proc checkLink that takes a link and returns a LinkCheckResult

  • newHttpClient() to create a new client
  • client.get to send a get request to a link and it returns a response
  • response.code gives us the HTTP status code, and we consider a link valid if its status == 200
  • client.get raises an error for badly structured links; that's why we wrapped it in a try/except block
proc sequentialLinksChecker(links: seq[string]): void = 
    for index, link in links:
        if link.strip() != "":
            let result = checkLink(link)
            echo result.link, " is ", result.state

Here, the sequentialLinksChecker proc takes a sequence of links and executes checkLink on them sequentially

LINKS: @["https://www.google.com.eg", "https://yahoo.com", "https://reddit.com", "https://none.nonadasdet", "https://github.com", ""]
SEQUENTIAL::
https://www.google.com.eg is true
https://yahoo.com is true
https://reddit.com is true
https://none.nonadasdet is false
https://github.com is true
7.716497898101807

On my lousy internet it took 7.7 seconds to finish :(

Step 3: GO ASYNC!

We can do better than waiting on IO requests to finish

proc checkLinkAsync(link: string): Future[LinkCheckResult] {.async.} =
    var client = newAsyncHttpClient()

    let future = client.get(link)
    yield future
    if future.failed:
        return LinkCheckResult(link:link, state:false)
    else:
        let resp = future.read()
        return LinkCheckResult(link:link, state: resp.code == Http200) 

Here, we define a checkLinkAsync proc

  • to declare a proc as async we use async pragma
  • notice the client is of type newAsyncHttpClient that doesn't block on .get calls
  • client.get immediately returns a future that can either fail (and we can tell that from future.failed) or succeed
  • yield future means: okay, I'm done for now, dear event loop; you can schedule other tasks and continue my execution when the future has some updates
  • clearly, if the future failed we return the link with a false state
  • otherwise, we get the response object that's enclosed in the future by calling read

proc asyncLinksChecker(links: seq[string]) {.async.} = 
    # client.maxRedirects = 0
    var futures = newSeq[Future[LinkCheckResult]]()
    for index, link in links:
        if link.strip() != "":
            futures.add(checkLinkAsync(link))
    
    # waitFor -> call async proc from sync proc, await -> call async proc from async proc
    let done = await all(futures)
    for x in done:
        echo x.link, " is ", x.state

Here, we have another async procedure asyncLinksChecker that takes a sequence of links, creates futures for all of them, waits for them to finish, and gives us the results

  • futures is a sequence for the future results of all the LinkCheckResults for all the links passed to asyncLinksChecker proc
  • we loop on the links and get future for the execution of checkLinkAsync and add it to the futures sequence.
  • we then await all the futures, collecting all of the results into the done variable
  • then we print all the results
  • Please notice await is used only to call an async proc from another async proc, while waitFor is used to call an async proc from a sync proc (see the sketch after the sample output below)
ASYNC::
https://www.google.com.eg is true
https://yahoo.com is true
https://reddit.com is true
https://none.nonadasdet is false
https://github.com is true
 is false
3.601503849029541
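To recap the await/waitFor rule from the list above, here is a minimal standalone sketch (hypothetical procs, unrelated to the links checker):

import asyncdispatch

proc addOne(x: int): Future[int] {.async.} =
  return x + 1

proc caller(): Future[int] {.async.} =
  # async proc calling an async proc: use await
  result = await addOne(1)

# sync code calling an async proc: use waitFor
echo waitFor caller()  # 2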

Step 4: Simple CLI

proc main()=
    echo "Param count: ", paramCount()
    if paramCount() == 1:
        let linksfile = paramStr(1)
        var f = open(linksfile, fmRead)
        let links = readAll(f).splitLines()
        echo "LINKS: " & $links
        echo "SEQUENTIAL:: "
        var t = epochTime()
        sequentialLinksChecker(links)
        echo epochTime()-t
        echo "ASYNC:: "
        t = epochTime()
        waitFor asyncLinksChecker(links)
        echo epochTime()-t

    else:
        echo "Please provide linksfile"
main()

The only interesting part is waitFor asyncLinksChecker(links): as we said, to call an async proc from a sync proc like this main proc you need to use waitFor.

Extra: Threading

import threadpool
proc checkLinkParallel(link: string) : LinkCheckResult {.thread.} =
    var client = newHttpClient()
    try:
        return LinkCheckResult(link:link, state:client.get(link).code == Http200)
    except:
        return LinkCheckResult(link:link, state:false)

Same as before, only the thread pragma is used to mark that the proc will be executed within a thread

proc threadsLinksChecker(links: seq[string]): void = 
    var LinkCheckResults = newSeq[FlowVar[LinkCheckResult]]()
    for index, link in links:
        LinkCheckResults.add(spawn checkLinkParallel(link))  
    
    for x in LinkCheckResults:
        let res = ^x
        echo res.link, " is ", res.state
  • Spawned tasks or threads return a value of type FlowVar[T], where T is the return type of the spawned proc
  • To get the value of a FlowVar we use ^ operator.

Note: you should use a nim.cfg with the -d:ssl flag to allow working with https
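A minimal nim.cfg for that, placed next to the source file:

# nim.cfg
-d:ssl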

Day 5: Creating INI Parser

This is a pure INI parser for Nim.

Nim already has the more advanced parsecfg module in its standard library.

What to expect?

let sample1 = """

[general]
appname = configparser
version = 0.1

[author]
name = xmonader
email = notxmonader@gmail.com


"""

var d = parseIni(sample1)

# doAssert(d.sectionsCount() == 2)
doAssert(d.getProperty("general", "appname") == "configparser")
doAssert(d.getProperty("general","version") == "0.1")
doAssert(d.getProperty("author","name") == "xmonader")
doAssert(d.getProperty("author","email") == "notxmonader@gmail.com")

d.setProperty("author", "email", "alsonotxmonader@gmail.com")
doAssert(d.getProperty("author","email") == "alsonotxmonader@gmail.com")
doAssert(d.hasSection("general") == true)
doAssert(d.hasSection("author") == true)
doAssert(d.hasProperty("author", "name") == true)
d.deleteProperty("author", "name")
doAssert(d.hasProperty("author", "name") == false)

echo d.toIniString()
let s = d.getSection("author")
echo $s

Implementation

You can certainly use regular expressions, like Python's configparser does, but we will go for a simpler approach here; also, we want to keep it pure so we don't depend on PCRE.

Ini sample


[general]
appname = configparser
version = 0.1

[author]
name = xmonader
email = notxmonader@gmail.com

An INI file consists of one or more sections, and each section consists of one or more key value pairs separated by =

Define your data types

import tables, strutils

We will use tables extensively

type Section = ref object
    properties: Table[string, string]

The Section type contains a properties table representing the key value pairs

proc setProperty*(this: Section, name: string, value: string) =
    this.properties[name] = value

To set a property in the underlying properties table

proc newSection*() : Section =
    var s = Section()
    s.properties = initTable[string, string]()
    return s

To create a new Section object

proc `$`*(this: Section) : string =
    return "<Section" & $this.properties & " >"

Simple toString proc using $ operator

type Ini = ref object
    sections: Table[string, Section]

The Ini type represents the whole document and contains a sections table mapping sectionName to Section objects.

proc newIni*() : Ini = 
    var ini = Ini()
    ini.sections = initTable[string, Section]()
    return ini

To create a new Ini object

proc `$`*(this: Ini) : string = 
    return "<Ini " & $this.sections & " >"

Define a friendly toString proc using the $ operator

Define API

proc setSection*(this: Ini, name: string, section: Section) =
    this.sections[name] = section

proc getSection*(this: Ini, name: string): Section =
    return this.sections.getOrDefault(name)

proc hasSection*(this: Ini, name: string): bool =
    return this.sections.contains(name)

proc deleteSection*(this: Ini, name:string) =
    this.sections.del(name)

proc sectionsCount*(this: Ini) : int = 
    echo $this.sections
    return len(this.sections)

Some helper procs around Ini objects for manipulating sections.


proc hasProperty*(this: Ini, sectionName: string, key: string): bool=
    return this.sections.contains(sectionName) and this.sections[sectionName].properties.contains(key)

proc setProperty*(this: Ini, sectionName: string, key: string, value:string) =
    echo $this.sections
    if this.sections.contains(sectionName):
        this.sections[sectionName].setProperty(key, value)
    else:
        raise newException(ValueError, "Ini doesn't have section " & sectionName)

proc getProperty*(this: Ini, sectionName: string, key: string) : string =
    if this.sections.contains(sectionName):
        return this.sections[sectionName].properties.getOrDefault(key)
    else:
        raise newException(ValueError, "Ini doesn't have section " & sectionName)


proc deleteProperty*(this: Ini, sectionName: string, key: string) =
    if this.sections.contains(sectionName) and this.sections[sectionName].properties.contains(key):
        this.sections[sectionName].properties.del(key)
    else:
        raise newException(ValueError, "Ini doesn't have section " & sectionName)

More helpers around properties in the section objects managed by Ini object

proc toIniString*(this: Ini, sep:char='=') : string =
    var output = ""
    for sectName, section in this.sections:
        output &= "[" & sectName & "]" & "\n"
        for k, v in section.properties:
            output &= k & sep & v & "\n" 
        output &= "\n"
    return output

A simple proc toIniString to convert the Nim structures back into an INI text string
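For the INI sample above, parsing it and calling toIniString gives back something like this (section and property order may vary because Table is unordered):

[general]
appname=configparser
version=0.1

[author]
name=xmonader
email=notxmonader@gmail.com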

Parse!

OK, here comes the cool part

Parser states

type ParserState = enum
    readSection, readKV

Here we have two states

  • readSection: when we are supposed to extract section name from the current line
  • readKV: when we are supposed to read the line in key value pair mode

ParseIni proc

proc parseIni*(s: string) : Ini = 

Here we define a proc parseIni that takes a string s and creates an Ini object

    var ini = newIni()
    var state: ParserState = readSection
    let lines = s.splitLines
    
    var currentSectionName: string = ""
    var currentSection = newSection()
  • ini is the object to be returned after parsing
  • state the current parser state (whether it's readSection or readKV)
  • lines the input string split into lines, as ours is a line-based parser
  • currentSectionName to keep track of what section we are currently in
  • currentSection to populate ini.sections with Section object using setSection proc
    for line in lines:

for each line

        if line.strip() == "" or line.startsWith(";") or line.startsWith("#"):
            continue

We skip the line if it's safe to ignore: an empty line or one starting with ; or #

        if line.startsWith("[") and line.endsWith("]"):
            state = readSection

If the line starts with [ and ends with ], we set the parser state to readSection

        if state == readSection:
            currentSectionName = line[1..<line.len-1]
            # a fresh Section object for every section header
            currentSection = newSection()
            ini.setSection(currentSectionName, currentSection)
            state = readKV
            continue

if parser state is readSection

  • extract section name between [ and ]
  • add section object to the ini under the current section name
  • change state to readKV to read key value pairs
  • continue the loop to the next line as we're done processing the section name.
        if state == readKV:
            let parts = line.split({'='})
            if len(parts) == 2:
                let key = parts[0].strip()
                let val = parts[1].strip()
                ini.setProperty(currentSectionName, key, val)

if state is readKV

  • extract key and val by splitting the line on =
  • setProperty under the currentSectionName using key and val
    return ini

Here we return the populated ini object.

Day 6: Manage your dotfiles easily with nistow

Today we will create a tool to manage our dotfiles easily.

Dotfiles layout

        i3
        `-- .config
            `-- i3
                `-- config

So here we have a directory named i3 at the very top, indicating the APP_NAME, and under it a tree of config paths. Here it means the config file is supposed to be linked to .config/i3/config relative to the destination directory.

Home directory is the default destination.

What do we expect?

➜  ~ nistow --help
    Stow 0.1.0
        -h | --help     : show help
        -v | --version  : show version
        --verbose       : verbose messages
        -s | --simulate : simulate stow operation
        -f | --force    : override old links
        -a | --app      : application path to stow
        -d | --dest     : destination to stow to
  • --simulate flag used to simulate on the filesystem without actual linking
  • --app application directory that's compatible with the dotfiles layout described above.
  • --dest destination to symlink files under, defaults to home dir.
nistow --app=/home/striky/wspace/dotfiles/localdir --dest=/tmp/tmpconf --verbose

Implementation

proc writeHelp() = 
    echo """
Stow 0.1.0 (Manage your dotfiles easily)

Allowed arguments:
    -h | --help     : show help
    -v | --version  : show version
    --verbose       : verbose messages
    -s | --simulate : simulate stow operation
    -f | --force    : override old links
    -a | --app      : application path to stow
    -d | --dest     : destination to stow to

    """

writeHelp is a simple proc to write the help string to stdout

proc writeVersion() =
    echo "Stow version 0.1.0"

To write version

proc cli*() =

Entry point for our command-line application

  var 
    simulate, verbose, force: bool = false
    app, dest: string = ""

Variables representing the various options we allow in the application.

  if paramCount() == 0:
    writeHelp()
    quit(0)

If no arguments are passed we write the help string and quit (exit, in Nim terms) with exit status 0

  for kind, key, val in getopt():
    case kind
    of cmdLongOption, cmdShortOption:
        case key
        of "help", "h": 
            writeHelp()
            quit()
        of "version", "v":
            writeVersion()
            quit()
        of "simulate", "s": simulate = true
        of "verbose": verbose = true
        of "force", "f": force = true
        of "app", "a": app = val
        of "dest", "d": dest = val 
        else:
          discard
    else:
      discard 

Here we parse the commandline string using getopt.

  for kind, key, val in getopt():
    case kind
    of cmdLongOption, cmdShortOption:

So for --app=/home/striky/dotfiles/i3 -f: the kind for --app is cmdLongOption and for -f it is cmdShortOption; the key for --app is app and for -f it is f; the val for --app is /home/striky/dotfiles/i3, while for -f we set the flag to true ourselves in our parsing, because it's basically a boolean switch: if it exists we want it set to true.

  if dest.isNilOrEmpty():
    dest = getHomeDir()

Here we set the default dest to the home directory

  if app.isNilOrEmpty():
    echo "Make sure to provide --app flags"
    quit(1)

Here we exit with error exit status 1 if app isn't set.

  try:
    stow(getLinkableFiles(appPath=app, dest=dest), simulate=simulate, verbose=verbose, force=force)
  except ValueError:
    echo "Error happened: " & getCurrentExceptionMsg()

Here we try to stow all the linkable files from the app dir to the dest dir, passing all the options we collected from the command line arguments (simulate, verbose, force), wrapped in try/except to show errors to the user

when isMainModule:
  cli()

Invoke our entry point cli if this module is the main module.

OK! back to stow and getLinkableFiles

We start with getLinkableFiles. Remember the dotfiles hierarchy?

    # appPath: application's dotfiles directory
    #     we expect dir to have the hierarchy.
    #     i3
    #     `-- .config
    #         `-- i3
    #         `-- config

We want to get all the files in there with their full paths; the link path for each one will be exactly the same, except the appPath prefix will be changed to the dest path

[/home/striky/wspace/dotfiles/i3]/.config/i3/config -> [/home/striky]/.config/i3/config
__________________appPath________                      _____dest____
type
  LinkInfo = tuple[original:string, dest:string] 

Simple type to represent the original path and where to symlink to

proc getLinkableFiles*(appPath: string, dest: string=expandTilde("~")): seq[LinkInfo] =

    # collects the linkable files in a certain app.

    # appPath: application's dotfiles directory
    #     we expect dir to have the hierarchy.
    #     i3
    #     `-- .config
    #         `-- i3
    #         `-- config

    # dest: destination of the link files : default is the home of user.

getLinkableFiles is a proc that takes appPath and dest and returns a seq of LinkInfo containing this transformation for each file.

[/home/striky/wspace/dotfiles/i3]/A_FILE_PATH -> [/home/striky]A_FILE_PATH
__________________apppath________                _____dest____
  var appPath = expandTilde(appPath)
  if not dirExists(appPath):
    raise newException(ValueError, fmt("App path {appPath} doesn't exist."))
  var linkables = newSeq[LinkInfo]()
  for filepath in walkDirRec(appPath, yieldFilter={pcFile}):
    let linkpath = filepath.replace(appPath, dest)
    var linkInfo : LinkInfo = (original:filepath, dest:linkpath)
    linkables.add(linkInfo)
  return linkables

Here, we walk over the appPath dir using walkDirRec and specify in the yieldFilter argument that we're interested in pcFile ("path component: file"), i.e. entries that are regular files.
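So for the i3 example above, the collected seq would contain a single transformation (a hypothetical run; paths assumed):

# getLinkableFiles("/home/striky/wspace/dotfiles/i3", "/home/striky")
# -> @[(original: "/home/striky/wspace/dotfiles/i3/.config/i3/config",
#       dest: "/home/striky/.config/i3/config")]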

proc stow(linkables: seq[LinkInfo], simulate: bool=true, verbose: bool=true, force: bool=false) = 
    # Creates symoblic links and related directories

    # linkables is a list of tuples (filepath, linkpath) : List[Tuple[file_path, link_path]]
    # simulate does simulation with no effect on the filesystem: bool
    # verbose shows log messages: bool

  for linkinfo in linkables:
    let (filepath, linkpath) = linkinfo
    if verbose:
      echo(fmt("Will link {filepath} -> {linkpath}"))

    if not simulate:
      createDir(parentDir(linkpath))
      if not fileExists(linkpath):
        createSymlink(filepath, linkpath)
      else:
        if force:
          removeFile(linkpath)
          createSymlink(filepath, linkpath)
        else:
          if verbose:
            echo(fmt("Skipping linking {filepath} -> {linkpath}"))

stow is a pretty simple procedure: it takes a list of LinkInfo tuples that hold all the information (original filename and destination symlink), does the symlinking as long as it's not a simulation, and prints messages if verbose is set to true

Feel free to send improvements to this tutorial or nistow :)

Complete source code available here https://github.com/xmonader/nistow

Day 7: Shorturl service

Today, we will develop a URL shortening service like bit.ly.

imports

import jester, asyncdispatch, htmlgen, json, os, strutils, strformat, db_sqlite
  • jester: a Sinatra-like web framework

  • asyncdispatch: for async/await instructions

  • htmlgen: to generate html pages

  • json: to parse json string into nim structures and dump json structures to strings

  • db_sqlite: to work with the SQLite database behind our application

Database connection

# hostname can be something configurable "http://ni.m:5000"
let hostname = "localhost:5000"
var theDb : DbConn
  • hostname is the base path used to access our site; it can be made configurable using the /etc/hosts file or even a reverse proxy like Caddy, and in a real-world case you would have a DNS record for your site.

  • theDb is the connection object to work with sqlite database.

if not fileExists("/tmp/mytest.db"):
  theDb = open("/tmp/mytest.db", nil, nil, nil)
  theDb.exec(sql("""create table urls (
      id   INTEGER PRIMARY KEY,
      url  VARCHAR(255) NOT NULL
     )"""
  ))
else:
  theDb = open("/tmp/mytest.db", nil, nil, nil)
  • If the database file /tmp/mytest.db doesn't exist, we create it with a urls table; otherwise we just open the connection and do nothing else

Jester and http endpoints

routes:
  • jester defines a DSL to work on routes
METHOD ROUTE_PATH:
    ##codeblock
  • METHOD can be get, post, or any HTTP verb

  • ROUTE_PATH is the path accessed on the server, for instance /users or /user/52; here 52 is a route parameter when the route is defined like /user/@id (see the sketch below)
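As a tiny sketch of a route parameter (a hypothetical /greet/@name endpoint, separate from the shortener):

routes:
  get "/greet/@name":
    resp "Hello " & @"name"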

HOME page

Here we handle GET requests on /home path on our server:

 get "/home":
  var htmlout = """
    <html>
      <title>NIM SHORT</title>
      <head>
        <script
      src="https://code.jquery.com/jquery-3.3.1.min.js"
      integrity="sha256-FgpCb/KJQlLNfOu91ta32o/NMZxltwRo8QtmkMRdAu8="
      crossorigin="anonymous"></script>

      <script>
        function postData(url, data) {
          // Default options are marked with *
          return fetch(url, {
            body: JSON.stringify(data), // must match 'Content-Type' header
            cache: 'no-cache', // *default, no-cache, reload, force-cache, only-if-cached
            credentials: 'same-origin', // include, same-origin, *omit
            headers: {
              'user-agent': 'Mozilla/4.0 MDN Example',
              'content-type': 'application/json'
            },
            method: 'POST', // *GET, POST, PUT, DELETE, etc.
            mode: 'cors', // no-cors, cors, *same-origin
            redirect: 'follow', // manual, *follow, error
            referrer: 'no-referrer', // *client, no-referrer
          })
          .then(resp => resp.json())
      }

      $(document).ready(function() {
        $('#btnsubmit').on('click', function(e){
          e.preventDefault();
          postData('/shorten', {url: $("#url").val()})
          .then( data => {
            let id = data["id"]
            $("#output").html(`<a href="%%hostname/${id}">Shortlink: ${id}</a>`);
           });
      });
    });
      </script>
      </head>
      <body>
          <div>
            <form>
              <label>URL</label>
              <input type="url" name="url" id="url" />
              <button id="btnsubmit" type="button">SHORT!</button>
            </form>
          </div>

          <div id="output">

          </div>
      </body>
    </html>
    """
    htmlout = htmlout.replace("%%hostname", hostname)
    resp  htmlout
  • Include jquery framework

  • Create a form in a div tag with one text input to allow the user to enter a URL

  • override form submission to do an ajax request

  • On the button's click event we send a POST request to the /shorten endpoint in the background using the fetch API; when we get a result we parse the JSON data, extract the id from it, and put the new URL in the output div

  • resp returns a response to the user, and it can include an HTTP status too

Shorten endpoint

  post "/shorten":
    let url = parseJson(request.body).getOrDefault("url").getStr()
    if not url.isNilOrEmpty():
      var id = theDb.getValue(sql"SELECT id FROM urls WHERE url=?", url)
      if id.isNilOrEmpty():
        id = $theDb.tryInsertId(sql"INSERT INTO urls (url) VALUES (?)", url)
      var jsonResp = $(%*{"id": id})
      resp Http200, jsonResp
    else:
      resp Http400, "please specify url in the posted data."

Here we handle POST requests on /shorten endpoint

  • Get the url from the parsed JSON POST data; please note that POST data is available under request.body, as explained in the previous section

  • If a url is passed we check whether it's already in our urls table; if it's there we return its id, otherwise we insert it into the table.

  • If the url isn't passed we return a bad request 400 status code.

  • parseJson loads JSON from a string; you can get a value using getOrDefault and getStr to get a string value (there's also getBool, and so on).

  • getValue gets the id from the result of the select statement; it returns the first column of the first row in the result set

  • tryInsertId executes the insert statement and returns the id of the new row

  • After successful insertion we return a JSON-serialized string to the user: $(%*{"id": id})

  • %* is a macro to convert a Nim structure into a JSON node, and to convert it to a string we wrap $ around it

Shorturls redirect

  get "/@Id":
    let url = theDb.getValue(sql"SELECT url FROM urls WHERE id=?", @"Id")
    if url.isNilOrEmpty():
      resp Http404, "Don't know that url"
    else:
      redirect url
  • Here we catch whatever path @Id the user is trying to access (except for /home and /shorten) and try to get the long URL for that path

  • If the path resolves to a url we redirect the user to it, otherwise we show an error message

  • @"Id" gets the value of @Id query parameter : notice the @ position in both situation

RUN

runForever()

Start the jester web server.

Code is available here https://gist.github.com/xmonader/d41a5c9f917eadb90d3025e7b7e748dd

Day 8: minitest

I'm a big fan of Practical Common Lisp; it has a chapter on building a unit test framework using macros, and I hadn't gotten the chance to tinker with Nim macros just yet. So today we will be building almost the same thing in Nim.

So what's up?

Imagine you want to check some expression and print a specific message denoting the expression

  doAssert(1==2, "1 == 2 failed")

Here we want to assert that 1==2 holds, or otherwise show the message 1 == 2 failed; and it goes on like that for whatever we want to check

  doAssert(1+2==3, "1+2 == 3 failed")
  doAssert(5*2==10, "5*2 == 10 failed")

We can already see the boilerplate here: repeating the expression twice, once for the check and once for the message itself.

What to expect?

We expect a DSL that removes the boilerplate we're suffering from in the previous section.

  check(3==1+2)
  check(6+5*2 == 16)

And this will print

3 == 1 + 2 .. passed
6 + 5 * 2 == 16 .. passed

And it should evolve to allow grouping of test checks

  check(3==1+2)
  check(6+5*2 == 16)
  
  suite "Arith":
    check(1+2==3)
    check(3+2==5)

  suite "Strs":
    check("HELLO".toLowerAscii() == "hello")
    check("".isNilOrEmpty() == true)

Resulting in something like this

3 == 1 + 2 .. passed
6 + 5 * 2 == 16 .. passed
==================================================
Arith
==================================================
 1 + 2 == 3 .. passed
 3 + 2 == 5 .. passed
==================================================
Strs
==================================================
 "HELLO".toLowerAscii() == "hello" .. passed
 "".isNilOrEmpty() == true .. passed

Implementation

So Nim has two ways to do macros

Templates

These are like functions that are expanded at compile time, a bit like a preprocessor

From the nim manual

template `!=` (a, b: untyped): untyped =
  # this definition exists in the System module
  not (a == b)

assert(5 != 6) # the compiler rewrites that to: assert(not (5 == 6))

so at compile time 5 != 6 will be converted into not (5 == 6) and the whole expression will be assert(not (5 == 6))

So what we're going to do is take the passed expression, convert it to a string to be printed in the terminal output, and if the expression fails append a failed message or any other custom failure message

template check*(exp:untyped, failureMsg:string="failed", indent:uint=0): void =
  let indentationStr = repeat(' ', indent) 
  let expStr: string = astToStr(exp)
  var msg: string
  if not exp:
    if msg.isNilOrEmpty():
      msg = indentationStr & expStr & " .. " & failureMsg
  else:
    msg = indentationStr & expStr & " .. passed"
      
  echo(msg)
  • untyped means the expression doesn't have to have a type yet; imagine passing a variable name that doesn't exist yet, as in defineVar(myVar, 5): here myVar needs to be untyped or the compiler will complain. Check the manual for more info https://nim-lang.org/docs/manual.html#templates

  • astToStr converts the AST exp to a string

  • indent is the amount of spaces prefixing the message (a quick usage sketch follows).
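
For instance (assuming the check template above is in scope, along with strutils for repeat and isNilOrEmpty):

  check(3 == 1 + 2)                 # prints: 3 == 1 + 2 .. passed
  check(2 * 2 == 5, "should be 4")  # prints: 2 * 2 == 5 .. should be 4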

Macros

Nim provides us with a way to access the AST at a very low level for when templates don't cut it.

What we expected is having a suite macro

  suite "Strs":
    check("HELLO".toLowerAscii() == "hello")
    check("".isNilOrEmpty() == true)

that takes a name for the suite and a bunch of statements

  • Please note there are two kinds of macros, and we're interested in the statement macros here
  • A statement macro is a macro followed by the colon : operator and a bunch of statements

dumpTree

dumpTree is amazing for debugging the AST; it prints it in a nice visual way


  dumpTree:
    suite "Strs":
      check("HELLO".toLowerAscii() == "hello")

Ident ident"suite"
    StrLit Strs
    StmtList
      Call
        Ident ident"check"
        Infix
          Ident ident"=="
          Call
            DotExpr
              StrLit HELLO
              Ident ident"toLowerAscii"
          StrLit hello

  • dumpTree says it got an identifier Ident named suite
  • suite contains a StrLit node with value Strs
  • suite contains a StmtList node
  • the first statement in StmtList is a call statement
  • a call statement consists of the procedure name (check in this case), the args list, and so on..
macro suite*(name:string, exprs: untyped) : typed = 

Here, we define a macro suite that takes a name and a bunch of statements exprs

  • The macro must return an AST; in our case it will be a list of statements of check call statements
  • We need the messages to be indented

To achieve the indentation we can either print a tab before calling check or rewrite the check call to pass the indent option; we will go with rewriting the check call ASTs

  var result = newStmtList()

We will be returning a list of statements, right?

  let equline = newCall("repeat", newStrLitNode("="), newIntLitNode(50))

a statement node that equals repeat("=", 50)

  let writeEquline = newCall("echo", equline)

a statement node that equals echo repeat("=", 50)

  add(result, writeEquline, newCall("echo", name))
  add(result, writeEquline)

this will generate

================
$name
================

Now we iterate over the statements passed to the suite macro and check their kind

  for i in 0..<exprs.len:
    var exp = exprs[i]
    let expKind = exp.kind
    case expKind
    of nnkCall:
      case exp[0].kind
      of nnkIdent:
        let identName = $exp[0].ident
        if identName == "check":
  • If we're in a check call we will convert it from check(expr) => check(expr, "", 1)
          var checkWithIndent = exp
          checkWithIndent.add(newStrLitNode(""))
          checkWithIndent.add(newIntLitNode(1))
          add(result, checkWithIndent)

otherwise we add any other statement as is, unprocessed.

      else:
        add(result, exp) 
    else:
      discard
        
  return result
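
Putting the pieces together, a minimal sketch of the whole suite macro could look like this (slightly simplified: non-check statements are passed through unchanged, and the final code in the repo may differ):

import macros

# assembled sketch of the pieces above; the call site also needs strutils for repeat
macro suite*(name: string, exprs: untyped): typed =
  result = newStmtList()

  # echo repeat("=", 50), then the suite name, then another "=" line
  let equline = newCall("repeat", newStrLitNode("="), newIntLitNode(50))
  let writeEquline = newCall("echo", equline)
  add(result, writeEquline, newCall("echo", name))
  add(result, writeEquline)

  for i in 0..<exprs.len:
    var exp = exprs[i]
    if exp.kind == nnkCall and exp[0].kind == nnkIdent and $exp[0] == "check":
      # rewrite check(expr) => check(expr, "", 1) so the output gets indented
      exp.add(newStrLitNode(""))
      exp.add(newIntLitNode(1))
    add(result, exp)

With this in place, the suite "Arith" / suite "Strs" examples from the expectations section should produce the banner output shown earlier.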

Code is available on https://github.com/xmonader/nim-minitest

Day 9: Tic tac toe

Who didn't play Tic tac toe with his friends? :)

What to expect

Today we will implement tic tac toe game in Nim, with 2 modes

  • Human vs Human
  • Human vs AI

Implementation

So, let's get to it. The winner is the first player who manages to get 3 cells of the same mark in the same row, column, or diagonal.

imports

import sequtils, tables, strutils, strformat, random, os, parseopt2

randomize()

Constraints and objects

As the game is turn-based we should have a way to keep track of the next player

let NEXT_PLAYER = {"X":"O", "O":"X"}.toTable

Here we use a table to tell us the next player

Board

type 
  Board = ref object of RootObj
    list: seq[string]

Here we define a simple class representing the board

  • list is a sequence representing the cells (maybe cells would be a better name)
  • please note list is just a sequence of elements 0 1 2 3 4 5 6 7 8, but we visualize it as
0 1 2
3 4 5
6 7 8

instead of using a 2d array for the sake of simplicity

let WINS = @[ @[0,1,2], @[3,4,5], @[6,7,8], @[0, 3, 6], @[1,4,7], @[2,5,8], @[0,4,8], @[2,4,6] ]

These are the WIN patterns: cells in the same row, the same column, or the same diagonal

proc newBoard(): Board =
  var b = Board()
  b.list = @["0", "1", "2", "3", "4", "5", "6", "7", "8"]
  return b

this is the initializer of the board; it sets each cell's value to the string representation of its index

Winning
proc done(this: Board): (bool, string) =
    for w in WINS:
        if this.list[w[0]] == this.list[w[1]] and this.list[w[1]]  == this.list[w[2]]:
          if this.list[w[0]] == "X":
            return (true, "X")
          elif this.list[w[0]] == "O":
            return (true, "O")
    if all(this.list, proc(x:string):bool = x in @["O", "X"]) == true:
        return (true, "tie")
    else:
        return (false, "going")

Here we check the state of the game: there's a winner if all the cells of one of the WIN patterns hold the same mark, and a tie if the board is full
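
For example, a quick sanity check with a hypothetical board state:

  var b = newBoard()
  b.list = @["X", "X", "X", "3", "4", "5", "6", "7", "8"]
  echo b.done()  # (true, "X")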

proc `$`(this:Board): string =
  let rows: seq[seq[string]] = @[this.list[0..2], this.list[3..5], this.list[6..8]]
  for row  in rows:
    for cell in row:
      stdout.write(cell & " | ")
    echo("\n--------------")

Here we have the string representation of the board so we can show it as 3x3 grid in a lovely way

proc emptySpots(this:Board):seq[int] =
    var emptyindices = newSeq[int]()
    for i in this.list:
      if i.isDigit():
        emptyindices.add(parseInt(i))
    return emptyindices

Here we have a simple helper function that returns the indices of the empty spots, the spots that don't have X or O in them; remember all the cells are initialized to the string representation of their indices.

Game

type
  Game = ref object of RootObj
    currentPlayer*: string
    board*: Board
    aiPlayer*: string
    difficulty*: int


proc newGame(aiPlayer:string="", difficulty:int=9): Game =
  var
    game = new Game

  game.board = newBoard()
  game.currentPlayer = "X"
  game.aiPlayer = aiPlayer
  game.difficulty = difficulty
  
  return game
        # 0 1 2
        # 3 4 5
        # 6 7 8 

Here we have another object representing the game: the players, the difficulty, whether it has an AI player or not, and who the current player is

  • difficulty only matters when there is an AI player; it controls when the AI starts calculating moves and considering scenarios: 9 is the hardest, 0 is the easiest.
proc changePlayer(this:Game) : void =
  this.currentPlayer = NEXT_PLAYER[this.currentPlayer]   

Simple procedure to switch turns between players

Start the game


proc startGame*(this:Game): void=
    while true:
        echo this.board
        if this.aiPlayer != this.currentPlayer:
          stdout.write("Enter move: ")
          let move = stdin.readLine()
          this.board.list[parseInt($move)] = this.currentPlayer
        this.change_player()
        let (done, winner) = this.board.done()

        if done == true:
          echo this.board
          if winner == "tie":
              echo("TIE")
          else:
              echo("WINNER IS :", winner )
          break           

Here, if aiPlayer is not set, it's just a game with 2 humans switching turns and checking for the winner after each move

Minmax and AI support

Minimax is an algorithm mainly used to explore the possible future moves, minimizing the losses and maximizing the chances of winning

  • https://www.youtube.com/watch?v=6ELUvkSkCts
  • https://www.youtube.com/watch?v=CwziaVrM_vc&t=1199s

type 
  Move = tuple[score:int, idx:int]

We need a Move type holding an idx and a score that tells us whether moving to that idx is a good or bad move

  • good means minimizing the human's chances to win or making the AI win => high score +10
  • bad means maximizing the human's chances to win or making the AI lose => low score -10

So let's say we are in this situation

O X X
X 4 5 
X O O

And it's the AI's turn; we have two possible moves (4 or 5)

O X X
X 4 O 
X O O

this move (to 5) is clearly wrong, because the next human move would complete the diagonal (2, 4, 6). So this is a bad move and we give it score -10. Or:

O X X
X O 5 
X O O

this move (to 4) minimizes the losses (it leads to a TIE instead of letting the human win), so we give it a higher score

proc getBestMove(this: Game, board: Board, player:string): Move =
        let (done, winner) = board.done()
        # determine the score of the move by checking where does it lead to a win or loss.
        if done == true:
            if winner ==  this.aiPlayer:
                return (score:10, idx:0)
            elif winner != "tie": #human
                return (score:(-10), idx:0)
            else:
                return (score:0, idx:0)
            
        let empty_spots = board.empty_spots()
        var moves = newSeq[Move]() 
        for idx in empty_spots:
            # we calculate more new trees depending on the current situation and see where the upcoming moves lead
            var newboard = newBoard()

            newboard.list = map(board.list, proc(x:string):string=x)
            newboard.list[idx] = player
            let score = this.getBestMove(newboard, NEXT_PLAYER[player]).score
            let idx = idx
            let move = (score:score, idx:idx)
            moves.add(move)
        
        if player == this.aiPlayer:
          return max(moves)          
          # var bestScore = -1000
          # var bestMove: Move 
          # for m in moves:
          #   if m.score > bestScore:
          #     bestMove = m
          #     bestScore = m.score
          # return bestMove
        else:
          return min(moves)          
          # var bestScore = 1000
          # var bestMove: Move 
          # for m in moves:
          #   if m.score < bestScore:
          #     bestMove = m
          #     bestScore = m.score
          # return bestMove

Here we have a highly annotated getBestMove procedure that recursively calculates the best move for us
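
One detail worth noting: Move is a tuple, and tuples in Nim compare field by field, so max and min over a seq of (score, idx) pick the move by score first. A tiny illustration (not part of the game code):

  let moves = @[(score: -10, idx: 5), (score: 0, idx: 4)]
  assert max(moves) == (score: 0, idx: 4)   # best move for the AI
  assert min(moves) == (score: -10, idx: 5) # best move for the human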

Now our startGame should look like this

proc startGame*(this:Game): void=
    while true:
        ##old code

        ## AI check
        else:
            if this.currentPlayer == this.aiPlayer:
              let emptyspots = this.board.emptySpots()
              if len(emptyspots) <= this.difficulty:
                  echo("AI MOVE..")
                  let move = this.getbestmove(this.board, this.aiPlayer)
                  this.board.list[move.idx] = this.aiPlayer
              else:
                  echo("RANDOM GUESS")
                  this.board.list[emptyspots.rand()] = this.aiPlayer
  
        ## oldcode    

Here we allow the game to use difficulty, which controls when the AI starts calculating moves and building the tree: from the beginning (9 cells left) or only when, say, 4 cells are left? You can set it however you want, and until the difficulty threshold is reached the AI will use random guesses (from the available empty spots) instead of calculating.

CLI entry

proc writeHelp() = 
  echo """
TicTacToe 0.1.0 (MinMax version)
Allowed arguments:
  -h | --help         : show help
  -a | --ai           : AI player [X or O]
  -l | --difficulty   : difficulty level [0-9]
  """

proc cli*() =
  var 
    aiplayer = ""
    difficulty = 9

  for kind, key, val in getopt():
    case kind
    of cmdLongOption, cmdShortOption:
        case key
        of "help", "h": 
            writeHelp()
            # quit()
        of "aiplayer", "a":
          echo "AIPLAYER: " & val
          aiplayer = val
        of "level", "l": difficulty = parseInt(val)
        else:
          discard
    else:
      discard 

  let g = newGame(aiPlayer=aiplayer, difficulty=difficulty)
  g.startGame()


when isMainModule:
  cli()

Code is available on https://github.com/xmonader/nim-tictactoe/blob/master/src/nim_tictactoe_cli.nim

Day 10: Tic tac toe with GUI!!

Hopefully, you're done with day 9 and enjoyed playing tic tac toe.

Expectation

It's fun to play on the command line, but it'd be very cool to have some GUI with some buttons using libui bindings in Nim

  • make sure to install it using nimble install ui

Implementation

In the previous day we reached a decent abstraction: the game logic, the command line interface, and the minimax algorithm are not tightly coupled

minimal ui application

proc gui*() = 
  var mainwin = newWindow("tictactoe", 400, 500, true)
  show(mainwin)
  mainLoop()

when isMainModule:
  # cli()
  init()
  gui()

Here we create a 400x500 window with the title tictactoe, show it, and start its mainLoop, ready to receive and dispatch events

TicTacToe GUI

We can imagine the gui to be something like this

---------------------------------------------
|  ---------------------------------------  |
+  | INFO LABEL | button to restart       | +
|  ---------------------------------------| |
+  |--------------------------------------| +
|  |  btn     |    btn  |   btn           | |
+  |--------------------------------------| +
|  |  btn     |    btn  |   btn           | |
+  |--------------------------------------| +
|  |  btn     |    btn  |   btn           | |
+  |--------------------------------------| +
---------------------------------------------
  • a window that contains a vertical box
  • the vertical box contains 4 rows
  • first row to show information about the current game and a button to reset the game
  • and the other rows represent the 3x3 tictactoe grid that will reflect game.list :)
  • and 9 buttons to be pressed to set X or O
  • we will support human vs AI so when human presses a button it gets disabled and the AI presses the button that minimizes its loss and that button gets disabled too.
proc gui*() = 
  var mainwin = newWindow("tictactoe", 400, 500, true)

  # game object to contain the state, the players, the difficulty,...
  var g = newGame(aiPlayer="O", difficulty=9)

  var currentMove = -1
  mainwin.margined = true
  mainwin.onClosing = (proc (): bool = return true)


  # set up the boxes 
  let box = newVerticalBox(true)
  let hbox0 = newHorizontalBox(true)
  let hbox1 = newHorizontalBox(true)
  let hbox2 = newHorizontalBox(true)
  let hbox3 = newHorizontalBox(true)
  # list of buttons 
  var buttons = newSeq[Button]()

  # information label
  var labelInfo = newLabel("Info: Player X turn")
  hbox0.add(labelInfo)

  # restart button
  hbox0.add(newButton("Restart", proc() = 
                            g =newGame(aiPlayer="O", difficulty=9)
                            for i, b in buttons.pairs:
                              b.text = $i
                              b.enable()))

Here we set up the layout we just described and create a Restart button that resets the game, restores the buttons' text, and enables them all

  # create the buttons
  for i in countup(0, 8):
    var handler : proc() 
    closureScope:
      let senderId = i
      handler = proc() =
        currentMove = senderId
        g.board.list[senderId] = g.currentPlayer
        g.change_player()
        labelInfo.text = "Current player: " & g.currentPlayer
        for i, v in g.board.list.pairs:
          buttons[i].text = v
        let (done, winner) = g.board.done()
        if done == true:
          echo g.board
          if winner == "tie":
              labelInfo.text = "Tie.."
          else:
            labelInfo.text = winner & " won."
        else:
          aiPlay()
        buttons[senderId].disable()

    buttons.add(newButton($i, handler))
  • Here we create the buttons; please notice we are using the closureScope feature to capture the button id so we can keep track of which button was clicked
  • after pressing, we set the text of the button to X (the human's mark)
  • we disable the button so we don't receive any more events.
  • switch turns
  • update the information label whether about the next player or the game state
  • if the game is still going we ask the AI for a move

  # code to run when the game asks the ai to play (after each move from the human..)
  proc aiPlay() = 
    if g.currentPlayer == g.aiPlayer:
      let emptySpots = g.board.emptySpots()
      if len(emptySpots) <= g.difficulty:
        let move = g.getBestMove(g.board, g.aiPlayer)
        g.board.list[move.idx] = g.aiPlayer
        buttons[move.idx].disable()
      else:
        let rndmove = emptyspots.rand()
        g.board.list[rndmove] = g.aiPlayer
    g.change_player()
    labelInfo.text = "Current player: " & g.currentPlayer

    for i, v in g.board.list.pairs:
      buttons[i].text = v
      
    let (done, winner) = g.board.done()

    if done == true:
      echo g.board
      if winner == "tie":
          labelInfo.text = "Tie.."
      else:
        labelInfo.text = winner & " won."

  • using minmax algorithm from the previous day we calculate the best move
  • change the button text to O
  • disable the button
  • update the information label

 hbox1.add(buttons[0])
 hbox1.add(buttons[1])
 hbox1.add(buttons[2])

 hbox2.add(buttons[3])
 hbox2.add(buttons[4])
 hbox2.add(buttons[5])

 hbox3.add(buttons[6])
 hbox3.add(buttons[7])
 hbox3.add(buttons[8])
 
 box.add(hbox0, true)
 box.add(hbox1, true)
 box.add(hbox2, true)
 box.add(hbox3, true)
 mainwin.setChild(box)

  • Here we add the buttons to their correct rows in the correct columns and set the main widget
  show(mainwin)
  mainLoop()

when isMainModule:
  init()
  gui()

Code is available on https://github.com/xmonader/nim-tictactoe/blob/master/src/nim_tictactoe_gui.nim

Day 11 (Bake applications)

I used to work on an application 2 years ago that was a bit like Ansible: defining recipes to create applications and manage their dependencies.

What to expect

Today we will be doing something very simple to track our dependencies and print the bash commands for each task, like a Makefile does.

HEADERS = program.h headers.h

default: program

program.o: program.c $(HEADERS)
    gcc -c program.c -o program.o

program: program.o
    gcc program.o -o program

clean:
    -rm -f program.o
    -rm -f program

Basically, a makefile consists of

  • Variables
  • Targets
  • Dependencies

Variables are things like HEADERS=..., targets are whatever precedes the : (like clean, program, program.o), and dependencies are what a target depends on; for instance the program target that generates the executable requires the program.o dependency to be executed first.

Example API usage

Normal usage

  var b = initBake()
  b.add_task("publish", @["build-release"], "print publish")
  b.add_task("build-release", @["nim-installed"], "print exec command to build release mode")
  b.add_task("nim-installed", @["curl-installed"], "print curl LINK | bash")
  b.add_task("curl-installed", @["apt-installed"], "apt-get install curl")
  b.add_task("apt-installed", @[], "code to install apt...")
  b.run_task("publish")

OUTPUT:

code to install apt...
apt-get install curl
print curl LINK | bash
print exec command to build release mode
print publish

Circular dependencies

  var b = initBake()
  b.add_task("publish", @["build-release"], "print publish")
  b.add_task("build-release", @["nim-installed"], "print exec command to build release mode")
  b.add_task("nim-installed", @["curl-installed"], "print curl LINK | bash")
  b.add_task("curl-installed", @["publish", "apt-installed"], "apt-get install curl")
  b.add_task("apt-installed", @[], "code to install apt...")
  b.run_task("publish")

Output:

Found cycle please fix:@["build-release", "nim-installed", "curl-installed", "publish", "build-release"]

Implementation

Imports

import strformat, strutils, tables, sequtils, algorithm

Graphs

Graphs are very powerful data structure and used to solve lots of problems, like getting the shortest route and detecting circular dependencies in our code today :)

So how do we represent a graph? Well, we will use an adjacency list
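
For instance, the Makefile targets above could be modeled roughly like this adjacency list (a sketch, not the library code):

  import tables

  # target -> the targets it depends on
  var graph = initTable[string, seq[string]]()
  graph["program"] = @["program.o"]
  graph["program.o"] = @["program.c", "program.h", "headers.h"]
  graph["clean"] = @[]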

Objects


type Task = object
  requires*: seq[string]
  actions*: string
  name*: string

proc `$`(this: Task): string = 
  return fmt("Task {this.name} Requirements: {this.requires} , actions {this.actions}")

The Task object represents a target in makefile language; it has a name, actions code, and a list of dependencies

type Bake = ref object
  tasksgraph* : Table[string, seq[string]]
  tasks*      : Table[string, Task]

The Bake object has a tasksgraph adjacency list representing the tasks and their dependencies, and a tasks table that maps a task name to its Task object

Adding a task


proc addTask*(this: Bake, taskname: string, deps: seq[string], actions:string) : void = 
  var t =  Task(name:taskname, requires:deps, actions:actions)
  this.tasksgraph[taskname] = deps
  this.tasks[taskname] = t
  • We update the adjacency list with (taskname and its dependencies)
  • Add task object to tasks Table with key task name

Running tasks


proc runTask*(this: Bake, taskname: string): void =
  # CODE OMITTED FOR FINDING CYCLES..

  var deps = newSeq[string]()
  var seen = newSeq[string]()

  this.runTaskHelper(taskname, deps, seen)      

  for tsk in deps:
      let t = this.tasks.getOrDefault(tsk)
      echo(t.actions)

  • Before running a task we should check if it has a cycle first.
  • Keep track of the dependencies and the tasks seen so far, so we don't run seen tasks again (for instance if we have target install-wget and target install-curl and both require target apt-get update, we want to run apt-get update only once)

for example

code to install apt...
apt-get install curl
print curl LINK | bash
print exec command to build release mode
print publish
  • Call the runTaskHelper procedure to walk through all the tasks and their dependencies and build the deps list; each call updates the deps variable as we pass it by reference
  • After getting correct dependencies tasks sorted we execute in our case we will just echo actions property

and now to runTaskHelper, which basically updates our dependencies list and puts the task execution in order


proc runTaskHelper(this: Bake, taskname: string, deps: var seq[string], seen: var seq[string]) : void = 
  if taskname in seen:
    echo "[+] Solved {taskname} before no need to repeat action"
  var tsk = this.tasks.getOrDefault(taskname)

  seen.add(taskname)
  if len(tsk.requires) > 0:
    for c in this.tasksgraph[tsk.name]:
      this.runTaskHelper(c, deps, seen)
  deps.add(taskname)

Detecting cycles

To detect a cycle we use DFS (depth first search), basically going from one node as deep as we can for each of its neighbours, combined with graph coloring. Youtube Lecture

Explanation from geeksforgeeks

    WHITE : Vertex is not processed yet.  Initially
            all vertices are WHITE.

    GRAY : Vertex is being processed (DFS for this 
        vertex has started, but not finished which means
        that all descendants (in DFS tree) of this vertex
        are not processed yet (or this vertex is in function
        call stack)

    BLACK : Vertex and all its descendants are 
            processed.

    While doing DFS, if we encounter an edge from current 
    vertex to a GRAY vertex, then this edge is back edge 
    and hence there is a cycle.

OK, back to nim

1- Defining colors

type NodeColor = enum
  ncWhite, ncGray, ncBlack

2- Graph has Cycle

proc graphHasCycle(graph: Table[string, seq[string]]): (bool, Table[string, string]) =
  var colors = initTable[string, NodeColor]()
  for node, deps in graph:
    colors[node] = ncWhite
  
  var parentMap = initTable[string, string]()
  var hasCycle = false 
  for node, deps in graph:
    parentMap[node] = "null"
    if colors[node] == ncWhite:
      hasCycleDFS(graph, node, colors, hasCycle, parentMap)
    if hasCycle:
      return (true, parentMap)
  return (false, parentMap)

3- Depth First Function

proc hasCycleDFS(graph:Table[string, seq[string]] , node: string, colors: var Table[string, NodeColor], has_cycle: var bool, parentMap: var Table[string, string]) =
  if hasCycle:
      return
  colors[node] = ncGray 

  for dep in graph[node]:
    parentMap[dep] = node
    if colors[dep] == ncGray:
      hasCycle = true   
      parentMap["__CYCLESTART__"] = dep
      return
    if colors[dep] == ncWhite:  
      hasCycleDFS(graph, dep, colors, hasCycle, parentMap)
  colors[node] = ncBlack  
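
A quick usage sketch, assuming both procs are in scope (in the actual module hasCycleDFS has to be defined or forward declared before graphHasCycle):

  import tables

  # a deliberately cyclic graph: publish -> build-release -> publish
  var g = initTable[string, seq[string]]()
  g["publish"] = @["build-release"]
  g["build-release"] = @["publish"]

  let (cyclic, parents) = graphHasCycle(g)
  echo cyclic                      # true
  echo parents["__CYCLESTART__"]   # one of the nodes on the cycle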

What's next?

  • support for variables
  • recipes maybe using yaml file
  • modules like ansible?

Day 12: Implementing Redis Protocol

Today we will implement RESP (REdis Serialization Protocol) in Nim. Hopefully you read Day 2 on bencode data format (encoding/parsing) because we will be using the same techniques.

RESP

From redis protocol page.

Redis clients communicate with the Redis server using a protocol called RESP (REdis Serialization Protocol). While the protocol was designed specifically for Redis, it can be used for other client-server software projects.

RESP is a compromise between the following things:

Simple to implement.
Fast to parse.
Human readable.
RESP can serialize different data types like integers, strings, arrays. There is also a specific type for errors. Requests are sent from the client to the Redis server as arrays of strings representing the arguments of the command to execute. Redis replies with a command-specific data type.

So, basically we have 5 types (ints, strings, bulkstrings, errors, arrays)

What do we expect?

  • able to decode strings into Reasonable structures in Nim
  echo decodeString("*3\r\n:1\r\n:2\r\n:3\r\n\r\n")
  # # @[1, 2, 3]
  echo decodeString("+Hello, World\r\n")
  # # Hello, World
  echo decodeString("-Not found\r\n")
  # # Not found
  echo decodeString(":1512\r\n")
  # # 1512
  echo $decodeString("$32\r\nHello, World THIS IS REALLY NICE\r\n")
  # Hello, World THIS IS REALLY NICE
  echo decodeString("*2\r\n+Hello World\r\n:23\r\n")
  # @[Hello World, 23]
  echo decodeString("*2\r\n*3\r\n:1\r\n:2\r\n:3\r\n\r\n*5\r\n:5\r\n:7\r\n+Hello Word\r\n-Err\r\n$6\r\nfoobar\r\n")
  # @[@[1, 2, 3], @[5, 7, Hello Word, Err, foobar]]
  echo $decodeString("*4\r\n:51231\r\n$3\r\nfoo\r\n$-1\r\n$3\r\nbar\r\n")
  # @[51231, foo, , bar]
  • able to encode Nim structures representing Redis values into RESP
  echo $encodeValue(RedisValue(kind:vkStr, s:"Hello, World"))
  # # +Hello, World
  echo $encodeValue(RedisValue(kind:vkInt, i:341))
  # # :341
  echo $encodeValue(RedisValue(kind:vkError, err:"Not found"))
  # # -Not found
  echo $encodeValue(RedisValue(kind:vkArray, l: @[RedisValue(kind:vkStr, s:"Hello World"), RedisValue(kind:vkInt, i:23)]  ))
  # #*2
  # #+Hello World
  # #:23

  echo $encodeValue(RedisValue(kind:vkBulkStr, bs:"Hello, World THIS IS REALLY NICE"))
  # #$32
  # # Hello, World THIS IS REALLY NICE  

Implementation

Imports and constants

Let's start with the main imports

import strformat, strutils, sequtils
const CRLF = "\r\n"
const REDISNIL = "\0\0"
  • CRLF is really important because lots of the protocol depends on that separator \r\n
  • REDISNIL \0\0 to represent Nil values

Data types

Again, as in the Bencode chapter, we will define a variant RedisValue that represents all redis datatypes: strings, errors, bulkstrings, ints, arrays


type
  ValueKind = enum
    vkStr, vkError, vkInt, vkBulkStr, vkArray

  RedisValue* = ref object
    case kind*: ValueKind
    of vkStr: s*: string
    of vkError : err*: string
    of vkInt: i*: int
    of vkBulkStr: bs*: string
    of vkArray: l*: seq[RedisValue]

Let's add $, hash, == procedures


import hashes

proc `$`*(obj: RedisValue): string = 
  result = case obj.kind
  of vkStr : obj.s
  of vkBulkStr: obj.bs
  of vkInt : $obj.i
  of vkArray: $obj.l
  of vkError: obj.err

proc hash*(obj: RedisValue): Hash = 
  result = case obj.kind
  of vkStr : !$(hash(obj.s))
  of vkBulkStr: !$(hash(obj.bs))
  of vkInt : !$(hash(obj.i))
  of vkArray: !$(hash(obj.l))
  of vkError: !$(hash(obj.err))

proc `==`* (a, b: RedisValue): bool =
  ## Check two nodes for equality
  if a.isNil:
      result = b.isNil
  elif b.isNil or a.kind != b.kind:
      result = false
  else:
      case a.kind
      of vkStr:
          result = a.s == b.s
      of vkBulkStr:
          result = a.bs == b.bs
      of vkInt:
          result = a.i == b.i
      of vkArray:
          result = a.l == b.l
      of vkError:
          result = a.err == b.err

Encoder

Encoding is just converting the variant RedisValue to the correct representation according to RESP

Encode simple strings

To encode simple strings, the spec says OK should be +OK\r\n


proc encodeStr(v: RedisValue) : string =
  return fmt"+{v.s}{CRLF}"

Encode Errors

To encode errors we should precede it with - and end it with \r\n. So Notfound should be encoded as -Notfound\r\n

proc encodeErr(v: RedisValue) : string =
  return fmt"-{v.err}{CRLF}"

Encode Ints

Ints are encoded :NUM\r\n so 95 is :95\r\n

proc encodeInt(v: RedisValue) : string =
  return fmt":{v.i}{CRLF}"

Encode Bulkstrings

From RESP page

Bulk Strings are used in order to represent a single binary safe string up to 512 MB in length.

Bulk Strings are encoded in the following way:

A "$" byte followed by the number of bytes composing the string (a prefixed length), terminated by CRLF.
The actual string data.
A final CRLF.
So the string "foobar" is encoded as follows:

"$6\r\nfoobar\r\n"
An empty string is just:

"$0\r\n\r\n"
RESP Bulk Strings can also be used in order to signal non-existence of a value using a special format that is used to represent a Null value. In this special format the length is -1, and there is no data, so a Null is represented as:

"$-1\r\n"
proc encodeBulkStr(v: RedisValue) : string =
  return fmt"${v.bs.len}{CRLF}{v.bs}{CRLF}"

Encode Arrays

To encode an array we do * followed by array length then \r\n then encode each element then end the array encoding with \r\n

  • As we are calling encode here, we have to forward declare it

proc encode*(v: RedisValue) : string 
proc encodeArray(v: RedisValue): string = 
  var res = "*" & $len(v.l) & CRLF
  for el in v.l:
    res &= encode(el)
  res &= CRLF
  return res

So for instance to encode encodeValue(RedisValue(kind:vkArray, l: @[RedisValue(kind:vkStr, s:"Hello World"), RedisValue(kind:vkInt, i:23)] )) The result should be

*2\r\n
+Hello World\r\n
:23\r\n
\r\n

Encode any data type

Here we switch on the passed variant and dispatch the encoding to the reasonable encoder.

proc encode*(v: RedisValue) : string =
  case v.kind 
  of vkStr: return encodeStr(v)
  of vkInt:    return encodeInt(v)
  of vkError:  return encodeErr(v)
  of vkBulkStr: return encodeBulkStr(v)
  of vkArray: return encodeArray(v)

Decoder

Decoding is converting the RESP representation into the correct Nim RedisValue structures; basically the reverse of what we did in the previous section

Please note: the basic strategy is to return the RedisValue and the number of processed characters

Decode simple string

proc decodeStr(s: string): (RedisValue, int) =
  let crlfpos = s.find(CRLF)
  return (RedisValue(kind:vkStr, s:s[1..crlfpos-1]), crlfpos+len(CRLF))

So, here we create a RedisValue of kind vkStr from the string between + and \r\n
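
For example (assuming decodeStr above is in scope):

  let (v, processed) = decodeStr("+OK\r\n")
  echo v          # OK
  echo processed  # 5 -> "+OK" plus the CRLF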

Decode errors

proc decodeError(s: string): (RedisValue, int) =
  let crlfpos = s.find(CRLF)
  return (RedisValue(kind:vkError, err:s[1..crlfpos-1]), crlfpos+len(CRLF))

Here we create a RedisValue of kind vkError from the string between - and \r\n

Decode ints

Ints, as we said, are the values between : and \r\n, so we parseInt the characters between : and \r\n and create a RedisValue of kind vkInt with the parsed int.

proc decodeInt(s: string): (RedisValue, int) =
  var i: int
  let crlfpos = s.find(CRLF)
  let sInt = s[1..crlfpos-1]
  if sInt.isDigit():
    i = parseInt(sInt)
  return (RedisValue(kind:vkInt, i:i), crlfpos+len(CRLF))

Decode bulkstrings

Bulkstrings are between $ followed by the string length and \r\n

  • string length == 0: empty string
  • string length == -1: nil
  • string length > 0: string with data

proc decodeBulkStr(s:string): (RedisValue, int) = 
  let crlfpos = s.find(CRLF)
  var bulklen = 0
  let slen = s[1..crlfpos-1]
  bulklen = parseInt(slen)
  var bulk: string
  if bulklen == -1:
      bulk = nil
      return (RedisValue(kind:vkBulkStr, bs:REDISNIL), crlfpos+len(CRLF))
  else:
    let nextcrlf = s.find(CRLF, crlfpos+len(CRLF))
    bulk = s[crlfpos+len(CRLF)..nextcrlf-1] 
    return (RedisValue(kind:vkBulkStr, bs:bulk), nextcrlf+len(CRLF))

Decode arrays

The trickiest part is decoding arrays

  • first we need to get the length between * and \r\n
  • then decode objects array length times, and add them to arr
  • As we are calling decode here, we have to forward declare it
proc decode(s: string): (RedisValue, int)
proc decodeArray(s: string): (RedisValue, int) =
  var arr = newSeq[RedisValue]()
  var arrlen = 0
  var crlfpos = s.find(CRLF)
  var arrlenStr = s[1..crlfpos-1]
  if arrlenStr.isDigit():
     arrlen = parseInt(arrlenStr)
  
  var nextobjpos = s.find(CRLF)+len(CRLF)
  var i = nextobjpos 
  
  if arrlen == -1:
    
    return (RedisValue(kind:vkArray, l:arr), i)
  
  while i < len(s) and len(arr) < arrlen:
    var pair = decode(s[i..len(s)])
    var obj = pair[0]
    arr.add(obj)
    i += pair[1]
  return (RedisValue(kind:vkArray, l:arr), i+len(CRLF))

So this RESP

*2\r\n
+Hello World\r\n
:23\r\n
\r\n

Should be decoded to RedisValue(kind:vkArray, l: @[RedisValue(kind:vkStr, s:"Hello World"), RedisValue(kind:vkInt, i:23)] )

Decode any object

Based on the first character we dispatch to the correct decoder then we skip the processed count in the string to decode the next object.

proc decode(s: string): (RedisValue, int) =
  var i = 0 
  while i < len(s):
    var curchar = $s[i]
    if curchar == "+":
      var pair = decodeStr(s[i..s.find(CRLF, i)+len(CRLF)])
      var obj =  pair[0]
      var count =  pair[1]
      i += count
      return (obj, i)
    elif curchar == "-":
      var pair = decodeError(s[i..s.find(CRLF, i)+len(CRLF)])
      var obj =  pair[0]
      var count =  pair[1]
      i += count
      return (obj, i)
    elif curchar == "$":
      var pair = decodeBulkStr(s[i..len(s)])
      var obj =  pair[0]
      var count =  pair[1]
      i += count
      return (obj, i)
    elif curchar == ":":
      var pair = decodeInt(s[i..s.find(CRLF, i)+len(CRLF)])
      var obj =  pair[0]
      var count =  pair[1]
      i += count
      return (obj, i)
    elif curchar == "*":
      var pair = decodeArray(s[i..len(s)])
      let obj = pair[0]
      let count =  pair[1]
      i += count 
      return (obj, i)
    else:
      echo fmt"Unrecognized char {curchar}"
      break

Preparing commands

In redis, commands are sent as List of RedisValues

so GET USER is converted to *2\r\n$3\r\nGET\r\n$4\r\nUSER\r\n\r\n

proc prepareCommand*(this: Redis, command: string, args:seq[string]): string =
  let cmdArgs = concat(@[command], args)
  var cmdAsRedisValues = newSeq[RedisValue]()
  for cmd in cmdArgs:
    cmdAsRedisValues.add(RedisValue(kind:vkBulkStr, bs:cmd))
  var arr = RedisValue(kind:vkArray, l: cmdAsRedisValues)

  return encode(arr)

nim-resp

This day is based on the nim-resp project, an on-going effort to create a redis client in Nim; it supports the pipelining feature and all of the previous code. Feel free to send PRs or open issues

Day 13: Implementing Redis Client

Today we will implement a redis client for Nim. It requires reading Day 12, where we created the redis parser

Redisclient

We want to create a client to communicate with redis servers

As library designers we should keep in mind how people are going to use our library, especially if it's doing IO operations: we need to decide what kind of APIs we are going to support (blocking or non-blocking ones), or whether we should duplicate the functionality for both interfaces. Lucky for us, Nim is pretty neat when it comes to providing both sync and async interfaces for your library.

What do we expect?

  • Sync APIs: blocking APIs
  let con = open("localhost", 6379.Port)
  echo $con.execCommand("PING", @[])
  echo $con.execCommand("SET", @["auser", "avalue"])
  echo $con.execCommand("GET", @["auser"])
  echo $con.execCommand("SCAN", @["0"])
  • Async APIs: Nonblocking APIs around async/await
  let con = await openAsync("localhost", 6379.Port)
  echo await con.execCommand("PING", @[])
  echo await con.execCommand("SET", @["auser", "avalue"])
  echo await con.execCommand("GET", @["auser"])
  echo await con.execCommand("SCAN", @["0"])
  echo await con.execCommand("SET", @["auser", "avalue"])
  echo await con.execCommand("GET", @["auser"])
  echo await con.execCommand("SCAN", @["0"])

  await con.enqueueCommand("PING", @[])
  await con.enqueueCommand("PING", @[])
  await con.enqueueCommand("PING", @[])
  echo await con.commitCommands()
 
  • Pipelining
  con.enqueueCommand("PING", @[])
  con.enqueueCommand("PING", @[])
  con.enqueueCommand("PING", @[])
  
  echo $con.commitCommands()

Implementation

Imports and constants

Let's start with the main imports

import redisparser, strformat, tables, json, strutils, sequtils, hashes, net, asyncdispatch, asyncnet, os, strutils, parseutils, deques, options, net

Mainly

  • redisparser because we will be manipulating redis values so let's not decouple the parsing and transport
  • asyncnet, asyncdispatch for async sockets APIs
  • net for SSL and blocking APIs

Data types

Thinking of the expected APIs we talked about earlier we have some sort of client that has exactly the same operations with different blocking policies, so we can abstract it a bit

type
  RedisBase[TSocket] = ref object of RootObj
    socket: TSocket
    connected: bool
    timeout*: int
    pipeline*: seq[RedisValue]

A base class parameterized on TSocket that has

  • socket: a socket object that can be the blocking net.Socket or the non-blocking asyncnet.AsyncSocket
  • connected: flag to indicate the connection status
  • timeout: to time out (raise TimeoutError) after a certain amount of seconds
  Redis* = ref object of RedisBase[net.Socket]

Here we say Redis is a sub type of RedisBase and the type of transport socket we are using is the blocking net.Socket

  AsyncRedis* = ref object of RedisBase[asyncnet.AsyncSocket]

Same, but here we say the socket we use is non blocking of type asyncnet.AsyncSocket

Opening Connection

proc open*(host = "localhost", port = 6379.Port, ssl=false, timeout=0): Redis =
  result = Redis(
    socket: newSocket(buffered = true),
  )
  result.pipeline = @[]
  result.timeout = timeout
  ## .. code omitted for supporting SSL
  result.socket.connect(host, port)
  result.connected = true

Here we define the open proc, the entry point to get a sync redis client Redis. We do some initialization for the endpoint and the timeout and set that on our new Redis object.

proc openAsync*(host = "localhost", port = 6379.Port, ssl=false, timeout=0): Future[AsyncRedis] {.async.} =
  ## Open an asynchronous connection to a redis server.
  result = AsyncRedis(
    socket: newAsyncSocket(buffered = true),
  )
  ## .. code omitted for supporting SSL
  result.pipeline = @[]
  result.timeout = timeout
  await result.socket.connect(host, port)
  result.connected = true

Exactly the same thing for openAsync, but instead of returning Redis we return a Future of a potential AsyncRedis object

Executing commands

Our APIs will be built around the execCommand proc, which sends a command with its arguments formatted with the redis protocol (using the redisparser library) to the server over our socket, and then reads a complete parsable RedisValue back to the user (using the readForm proc)

  • Sync version

proc execCommand*(this: Redis, command: string, args:seq[string]): RedisValue =
  let cmdArgs = concat(@[command], args)
  var cmdAsRedisValues = newSeq[RedisValue]()
  for cmd in cmdArgs:
    cmdAsRedisValues.add(RedisValue(kind:vkBulkStr, bs:cmd))
  var arr = RedisValue(kind:vkArray, l: cmdAsRedisValues)
  this.socket.send(encode(arr))
  let form = this.readForm()
  let val = decodeString(form)
  return val
  • Async version

proc execCommandAsync*(this: AsyncRedis, command: string, args:seq[string]): Future[RedisValue] {.async.} =
  let cmdArgs = concat(@[command], args)
  var cmdAsRedisValues = newSeq[RedisValue]()
  for cmd in cmdArgs:
    cmdAsRedisValues.add(RedisValue(kind:vkBulkStr, bs:cmd))
  var arr = RedisValue(kind:vkArray, l: cmdAsRedisValues)
  await this.socket.send(encode(arr))
  let form = await this.readForm()
  let val = decodeString(form)
  return val

It'd be very annoying to provide duplicate procs for every single API: get and asyncGet ... etc

Multisync FTW!

Nim provides a very neat feature, the multisync pragma, that allows us to use the async definition in sync scopes too

Here are the details from the Nim documentation

Macro which processes async procedures into both asynchronous and synchronous procedures. The generated async procedures use the async macro, whereas the generated synchronous procedures simply strip off the await calls.


proc execCommand*(this: Redis|AsyncRedis, command: string, args:seq[string]): Future[RedisValue] {.multisync.} =
  let cmdArgs = concat(@[command], args)
  var cmdAsRedisValues = newSeq[RedisValue]()
  for cmd in cmdArgs:
    cmdAsRedisValues.add(RedisValue(kind:vkBulkStr, bs:cmd))
  var arr = RedisValue(kind:vkArray, l: cmdAsRedisValues)
  await this.socket.send(encode(arr))
  let form = await this.readForm()
  let val = decodeString(form)
  return val

Readers

readForm is the other main proc in our client. readForm is responsible for reading X amount of bytes from the socket until we have a complete RedisValue object.

  • readMany: as the redis protocol encodes information about the value lengths, we can make use of that, so let's build a primitive readMany that reads X bytes from the socket

proc readMany(this:Redis|AsyncRedis, count:int=1): Future[string] {.multisync.} =
  if count == 0:
    return ""
  let data = await this.receiveManaged(count)
  return data

Here, again, to make sure our code works for both sync and async usage we use multisync. If the requested count is 0 we return an empty string without doing anything fancy with the socket, otherwise we delegate to the receiveManaged proc

  • receiveManaged: a more detailed look at how we read the data from the socket (it could be combined into the readMany proc code)
proc receiveManaged*(this:Redis|AsyncRedis, size=1): Future[string] {.multisync.} =
  result = newString(size)
  when this is Redis:
    if this.timeout == 0:
      discard this.socket.recv(result, size)
    else:
      discard this.socket.recv(result, size, this.timeout)
  else:
    discard await this.socket.recvInto(addr result[0], size)
  return result

We check the type of the this object using the when/is combo to dispatch to the correct implementation (sync or async), with or without timeouts

  • recv has multiple versions one of them takes a Timeout this.timeout if the user wants to timeout after a while
  • recvInto is the async version and doesn't support timeouts

readForm

readForm is used to retrieve a complete RedisValue from the server using the primitives we provided, like readMany or receiveManaged

Remember how we decode strings into RedisValue objects?

  echo decodeString("*3\r\n:1\r\n:2\r\n:3\r\n\r\n")
  # # @[1, 2, 3]
  echo decodeString("+Hello, World\r\n")
  # # Hello, World
  echo decodeString("-Not found\r\n")
  # # Not found
  echo decodeString(":1512\r\n")
  # # 1512
  echo $decodeString("$32\r\nHello, World THIS IS REALLY NICE\r\n")
  # Hello, World THIS IS REALLY NICE
  echo decodeString("*2\r\n+Hello World\r\n:23\r\n")
  # @[Hello World, 23]
  echo decodeString("*2\r\n*3\r\n:1\r\n:2\r\n:3\r\n\r\n*5\r\n:5\r\n:7\r\n+Hello Word\r\n-Err\r\n$6\r\nfoobar\r\n")
  # @[@[1, 2, 3], @[5, 7, Hello Word, Err, foobar]]
  echo $decodeString("*4\r\n:51231\r\n$3\r\nfoo\r\n$-1\r\n$3\r\nbar\r\n")
  # @[51231, foo, , bar]

We will be doing exactly the same, but the only tricky part is we are reading from a socket and we can't move freely forward/backward without consuming data.

The way we were decoding strings into RedisValues was by peeking on the first character to see what type we are decoding simple string, bulkstring, error, int, array


proc readForm(this:Redis|AsyncRedis): Future[string] {.multisync.} =
  var form = ""
  ## code responsible of reading a complete parsable string representing RedisValue from the socket
  return form
  • Setup the loop
  while true:
    let b = await this.receiveManaged()
    form &= b
    ## ...

as long as we aren't done reading a complete form yet, we read just 1 byte and append it to the form string we will be returning (at the beginning that byte can be one of +, -, :, $, *)

  • Simple String
    if b == "+":
      form &= await this.readStream(CRLF)
      return form

If the character we are peeking at is + we read until we consume the \r\n CRLF (from the redisparser library), because strings in the redis protocol are contained between + and CRLF

But wait, what's readStream? It's a small proc we need that consumes bytes from the socket until we reach [and consume] a certain string

proc readStream(this:Redis|AsyncRedis, breakAfter:string): Future[string] {.multisync.} =
  var data = ""
  while true:
    if data.endsWith(breakAfter):
      break
    let strRead = await this.receiveManaged()
    data &= strRead
  return data
  • Errors
    elif b == "-":
      form &= await this.readStream(CRLF)
      return form

Exactly the same as Simple strings but we check on - instead of +

  • Ints
    elif b == ":":
      form &= await this.readStream(CRLF)
      return form

Same, serialized between : and CRLF

  • Bulkstrings
    elif b == "$":
      let bulklenstr = await this.readStream(CRLF)
      let bulklenI = parseInt(bulklenstr.strip()) 
      form &= bulklenstr
      if bulklenI == -1:
        form &= CRLF

    else:
      form &= await this.readMany(bulklenI)
      form &= await this.readStream(CRLF)

    return form

From RESP page

Bulk Strings are used in order to represent a single binary safe string up to 512 MB in length.

Bulk Strings are encoded in the following way:

A "$" byte followed by the number of bytes composing the string (a prefixed length), terminated by CRLF.
The actual string data.
A final CRLF.
So the string "foobar" is encoded as follows:

"$6\r\nfoobar\r\n"
An empty string is just:

"$0\r\n\r\n"
RESP Bulk Strings can also be used in order to signal non-existence of a value using a special format that is used to represent a Null value. In this special format the length is -1, and there is no data, so a Null is represented as:

"$-1\r\n"

So in the length we can have:

  • 0 for empty strings $0\r\n\r\n: read from $ until we consume CRLF, then consume another CRLF
  • a positive number of bytes to read: read from $ until CRLF, then read N bytes, then consume the final CRLF
  • -1 for nils $-1\r\n: read from $ until we consume CRLF

  • Arrays
    elif b == "*":
        let lenstr = await this.readStream(CRLF)
        form &= lenstr
        let lenstrAsI = parseInt(lenstr.strip())
        for i in countup(1, lenstrAsI):
          form &= await this.readForm()
        return form

Arrays can be a bit tricky. To encode an array we write * followed by the array length, then \r\n, then encode each element, then end the array encoding with \r\n

As the arrays encode their length we know how many inner forms or items we need to read from the socket while reading the array

Pipelining

From redis pipelining page

A Request/Response server can be implemented so that it is able to process new requests even if the client didn't already read the old responses. This way it is possible to send multiple commands to the server without waiting for the replies at all, and finally read the replies in a single step.

This is called pipelining, and is a technique widely in use since many decades. For instance many POP3 protocol implementations already supported this feature, dramatically speeding up the process of downloading new emails from the server.
Redis supports pipelining since the very early days, so whatever version you are running, you can use pipelining with Redis. This is an example using the raw netcat utility:
$ (printf "PING\r\nPING\r\nPING\r\n"; sleep 1) | nc localhost 6379
+PONG
+PONG
+PONG

So the idea is that we maintain a sequence of commands to be executed (enqueueCommand), send them all at once (commitCommands), and reset the pipeline sequence afterwards


proc enqueueCommand*(this:Redis|AsyncRedis, command:string, args: seq[string]): Future[void] {.multisync.} = 
  let cmdArgs = concat(@[command], args)
  var cmdAsRedisValues = newSeq[RedisValue]()
  for cmd in cmdArgs:
    cmdAsRedisValues.add(RedisValue(kind:vkBulkStr, bs:cmd))
  var arr = RedisValue(kind:vkArray, l: cmdAsRedisValues)
  this.pipeline.add(arr)

proc commitCommands*(this:Redis|AsyncRedis) : Future[RedisValue] {.multisync.} =
  for cmd in this.pipeline:
    await this.socket.send(cmd.encode())
  var responses = newSeq[RedisValue]()
  for i in countup(0, len(this.pipeline)-1):
    responses.add(decodeString(await this.readForm()))
  this.pipeline = @[]
  return RedisValue(kind:vkArray, l:responses)

Higher level APIs

These are basically procs around the execCommand proc; using the multisync pragma you can have them enabled for both sync and async execution

proc del*(this: Redis | AsyncRedis, keys: seq[string]): Future[RedisValue] {.multisync.} =
  ## Delete a key or multiple keys
  return await this.execCommand("DEL", keys)


proc exists*(this: Redis | AsyncRedis, key: string): Future[bool] {.multisync.} =
  ## Determine if a key exists
  let val = await this.execCommand("EXISTS", @[key])
  result = val.i == 1
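
Other wrappers follow the same pattern; for example a hypothetical get could look like this (check the repo for the actual list of supported commands):

proc get*(this: Redis | AsyncRedis, key: string): Future[RedisValue] {.multisync.} =
  ## Get the value of a key
  return await this.execCommand("GET", @[key])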

nim-redisclient

This day is based on the nim-redisclient project, which reuses some higher level API code from Nim/redis. Feel free to send PRs or open issues

Day 14: Nim Assets (bundle your assets into single binary)

Today we will implement the nimassets project, heavily inspired by go-bindata

nimassets

Typically while developing projects we have assets like icons, images, template files, css, javascript, etc. It can be annoying to distribute them with your application, or to risk losing them, misconfiguring paths, or a messed-up packaging script, so packaging all of them into the same binary is an interesting option to have. These concerns were the reason for tools like go-bindata or the Qt resource system

What do we expect?

  • Having a single binary that embeds the actual resources in the executable.
  • Generating a nim file out of the resources we want to bundle. Maybe something like nimassets -d=templatesdir -o=assetsfile.nim
  • Easy access to these bundled resources using getAsset proc
import assetsfile

echo assetsfile.getAsset("templatesdir/index.html")

The plan

So from a very high level

[ Resource1 ]                                
[ Resource2 ]   -> converter (nimassets) ->  [Nim file Representing the resources list]
[ Resource3 ]                                

The generated file should look like


import os, tables, strformat, base64, ospaths

var assets = initTable[string, string]()

proc getAsset*(path: string): string = 
  result = assets[path].decode()

assets[RESOURCE1_PATH] = BASE64_ENCODE(RESOURCE1_CONTENT)
assets[RESOURCE2_PATH] = BASE64_ENCODE(RESOURCE2_CONTENT)
assets[RESOURCE3_PATH] = BASE64_ENCODE(RESOURCE3_CONTENT)
...
...
...
...

  • We store the resource path and its base64 encoded content in assets table
  • We will expose 1 proc getAsset that takes path and returns the content by decoding base64 content

Implementation

Let's take a top-down approach for the implementation

Command line arguments

const buildBranchName* = staticExec("git rev-parse --abbrev-ref HEAD") ## \
const buildCommit* = staticExec("git rev-parse HEAD")  ## \
# const latestTag* = staticExec("git describe --abbrev=0 --tags") ## \

const versionString* = fmt"0.1.0 ({buildBranchName}/{buildCommit})"

proc writeHelp() = 
    echo fmt"""
nimassets {versionString} (Bundle your assets into nim file)
    -h | --help         : show help
    -v | --version      : show version
    -o | --output       : output filename
    -f | --fast         : faster generation
    -d | --dir          : dir to include (recursively)
"""

proc writeVersion() =
    echo fmt"nimassets version {versionString}"

proc cli*() =
  var 
    compress, fast : bool = false
    dirs = newSeq[string]()
    output = "assets.nim"
  
  if paramCount() == 0:
    writeHelp()
    quit(0)
  
  for kind, key, val in getopt():
    case kind
    of cmdLongOption, cmdShortOption:
        case key
        of "help", "h": 
            writeHelp()
            quit()
        of "version", "v":
            writeVersion()
            quit()
        of "fast", "f": fast = true
        of "dir", "d": dirs.add(val)
        of "output", "o": output = val 
        else:
          discard
    else:
      discard 
  for d in dirs:
    if not dirExists(d):
      echo fmt"[-] Directory doesnt exist {d}"
      quit 2 # 2 means dir doesn't exist.
  # echo fmt"compress: {compress} fast: {fast} dirs:{dirs} output:{output}"
  createAssetsFile(dirs, output, fast, compress)

when isMainModule:
  cli()

Pretty simple: we accept a list of directories (using the -d or --dir flag) to bundle into a nim file defined by the output flag (assets.nim by default)

The --fast flag indicates whether we should use threading to speed things up a little; compress is meant to allow compression, but we will always pass it as false for now

For version information (branch and commit id) we use some git commands combined with staticExec to make these values available at compile time

createAssetsFile

this proc is the entry point of our application; it receives a seq of the directories we want to bundle, the output filename, and the speed optimization flag, and it will make use of the compress flag in the future

proc createAssetsFile(dirs:seq[string], outputfile="assets.nim", fast=false, compress=false) =
  var generator: proc(s:string): string
  var data = assetsFileHeader

  if fast:
    generator = generateDirAssetsSpawn
  else:
    generator = generateDirAssetsSimple

  for d in dirs:
    data &= generator(d)
  
  writeFile(outputfile, data)

Here we write (the header of the assets file and the result of generating the bundle of each directory) to the outputfile

and we bundle the files either one by one (using generateDirAssetsSimple) or in parallel (using generateDirAssetsSpawn)

generateDirAssetsSimple

proc generateDirAssetsSimple(dir:string): string =
  var key, val, valString: string

  for path in expandTilde(dir).walkDirRec():
    key = path
    val = readFile(path).encode()
    valString = " \"\"\"" & val & "\"\"\" "
    result &= fmt"""assets.add("{path}", {valString})""" & "\n\n"

We walk the directory recursively using walkDirRec and write down the part assets.add(RESOURCE_PATH, BASE64_ENCODED_CONTENT) for each file in the directory.
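
So for a hypothetical file templatesdir/index.html the generated line would look roughly like this (base64 payload abbreviated):

  # hypothetical generated entry
  assets.add("templatesdir/index.html", """PGh0bWw+IC4uLiA8L2h0bWw+""")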

generateDirAssetsSpawn

proc handleFile(path:string): string {.thread.} =
  var val, valString: string
  val = readFile(path).encode()
  valString = " \"\"\"" & val & "\"\"\" "
  result = fmt"""assets.add("{path}", {valString})""" & "\n\n"

proc generateDirAssetsSpawn(dir: string): string = 
  var results = newSeq[FlowVar[string]]()
  for path in expandTilde(dir).walkDirRec():
    results.add(spawn handleFile(path))

  # wait till all of them are done.
  for r in results:
    result &= ^r

the same as generateDirAssetsSimple, but using spawn to generate the assets table entries and then waiting for all the FlowVars to be done

And that's basically it.

nimassets

All of the code is based on nimassets project. Feel free to send a PR or report issues.

Day 15: TCP Router (Routing TCP traffic)

Today we will implement a tcp router, or rather a tcp port forwarder, since it works against only one endpoint.

What do we expect?

let opts = ForwardOptions(listenAddr:"127.0.0.1", listenPort:11000.Port, toAddr:"127.0.0.1", toPort:6379.Port)
var f = newForwarder(opts)
asyncCheck f.serve()
runForever()

and then you can do

redis-client -p 11000
> PING
PONG

The plan

  • Listen on listenPort on address listenAddr and accept connections.
  • On every new connection (incoming)
    • open a socket to toPort on toAddr (outgoing)
    • whenever data is ready on any of both ends write the data to the other one

How ready?

Linux provides APIs like select and poll to watch or monitor a set of file descriptors, and they allow you to take some action on whichever file descriptor is ready for reading or writing.

The select() function gives you a way to simultaneously check multiple sockets to see if they have data waiting to be recv()d, or if you can send() data to them without blocking, or if some exception has occurred.

Please check Beej's guide to network programming for more on that. In this chapter we will rely on Nim's asyncdispatch event loop rather than calling select/poll ourselves.

Imports

import strformat, tables, json, strutils, sequtils, hashes, net, asyncdispatch, asyncnet, os, parseutils, deques, options

Types

Options for the server specifying on which address to listen and where to forward the traffic.

type ForwardOptions = object
  listenAddr*: string
  listenPort*: Port
  toAddr*: string
  toPort*: Port
type Forwarder = object of RootObj
  options*: ForwardOptions


proc newForwarder(opts: ForwardOptions): ref Forwarder =
  result = new(Forwarder)
  result.options = opts

Forwarder represents the forwarding server,

and newForwarder creates a forwarder and sets its options.

Server setup

proc serve(this: ref Forwarder) {.async.} =
  var server = newAsyncSocket(buffered=false)
  server.setSockOpt(OptReuseAddr, true)
  server.bindAddr(this.options.listenPort, this.options.listenAddr)
  echo fmt"Started tcp server... {this.options.listenAddr}:{this.options.listenPort} "
  server.listen()
  
  while true:
    let client = await server.accept()
    echo "..Got connection "

    asyncCheck this.processClient(client)

We will utilize async/await features of nim to build our server.

  • Create a new socket with newAsyncSocket (make sure to set buffered to false so Nim doesn't try to read all requested data)

  • setSockOpt allows you to make the socket reusable

SO_REUSEADDR is used in servers mainly because it's common to restart the server while trying things out or changing configurations (some use SIGHUP to reload the configuration as a pattern), and if there were still active connections, the next start of the server would fail without it.

  • bindAddr binds the server to a certain address and port, listenAddr and listenPort
  • then we start a loop to receive connections.
  • shouldn't we call await processClient here? Why asyncCheck processClient?

await vs asyncCheck

  • await means execute that async action and block the execution until you get a result.
  • asyncCheck means execute the async action and don't block on it; a more suitable name might be discard or discardAsync

Now we can answer the question: we call asyncCheck processClient instead of await processClient because awaiting would block the event loop until processClient completely executes, which defeats the purpose of concurrency and of accepting/handling multiple clients.
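
A tiny standalone illustration of the difference (a sketch, not part of the forwarder):

import asyncdispatch

proc tick(name: string) {.async.} =
  for i in 1 .. 3:
    echo name, " ", i
    await sleepAsync(100)

proc demo() {.async.} =
  asyncCheck tick("background")  # scheduled on the dispatcher, demo keeps going
  await tick("awaited")          # demo blocks here until this one finishes

waitFor demo()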

Process a client

Establish the connection

proc processClient(this: ref Forwarder, client: AsyncSocket) {.async.} =
  let remote = newAsyncSocket(buffered=false)
  await remote.connect(this.options.toAddr, this.options.toPort)
  ...

First thing is to get a socket to the endpoint where we forward the traffic defined in the ForwardOptions toAddr and toPort

Now, we could've established a loop that reads data from the client socket and writes it to the remote socket.

The problem is we may get out of sync: sometimes the remote sends data as soon as a client connects, before reading anything from the client. Maybe the remote sends information like a server version, some metadata, or protocol instructions, so we can't be sure it's always waiting to receive data as the first step. What we can do instead is watch both file descriptors, and whichever has data, we write it to the other one.

e.g

  • remote has data: we recv it and send it to the client.
  • client has data: we recv it and send it to the remote.

The remote has data

  proc remoteHasData() {.async.} =
    while not remote.isClosed and not client.isClosed:
      echo " in remote has data loop"
      let data = await remote.recv(1024)
      echo "got data: " & data
      await client.send(data)
    client.close()
    remote.close()

The client has data

  proc clientHasData() {.async.} =
    while not client.isClosed and not remote.isClosed:
      echo "in client has data loop"
      let data = await client.recv(1024)
      echo "got data: " & data
      await remote.send(data)
    client.close()
    remote.close()

Run the data processors

Now let's register clientHasData and remoteHasData procs to the event machine and LET'S NOT BLOCK on any of them (remember if you don't want to block then you need asyncCheck)

  try:
    asyncCheck clientHasData()
    asyncCheck remoteHasData()
  except:
    echo getCurrentExceptionMsg()

So now our processClient should look like


proc processClient(this: ref Forwarder, client: AsyncSocket) {.async.} =
  let remote = newAsyncSocket(buffered=false)
  await remote.connect(this.options.toAddr, this.options.toPort)

  proc clientHasData() {.async.} =
    while not client.isClosed and not remote.isClosed:
      echo "in client has data loop"
      let data = await client.recv(1024)
      echo "got data: " & data
      await remote.send(data)
    client.close()
    remote.close()

  proc remoteHasData() {.async.} =
    while not remote.isClosed and not client.isClosed:
      echo " in remote has data loop"
      let data = await remote.recv(1024)
      echo "got data: " & data
      await client.send(data)
    client.close()
    remote.close()
  
  try:
    asyncCheck clientHasData()
    asyncCheck remoteHasData()
  except:
    echo getCurrentExceptionMsg()

Let's forward to redis


let opts = ForwardOptions(listenAddr:"127.0.0.1", listenPort:11000.Port, toAddr:"127.0.0.1", toPort:6379.Port)
var f = newForwarder(opts)
asyncCheck f.serve()
runForever()

runForever begins a never-ending global dispatcher poll loop.

our full code

# This is just an example to get you started. A typical binary package
# uses this file as the main entry point of the application.

import strformat, tables, json, strutils, sequtils, hashes, net, asyncdispatch, asyncnet, os, parseutils, deques, options

type ForwardOptions = object
  listenAddr*: string
  listenPort*: Port
  toAddr*: string
  toPort*: Port

type Forwarder = object of RootObj
  options*: ForwardOptions

proc processClient(this: ref Forwarder, client: AsyncSocket) {.async.} =
  let remote = newAsyncSocket(buffered=false)
  await remote.connect(this.options.toAddr, this.options.toPort)

  proc clientHasData() {.async.} =
    while not client.isClosed and not remote.isClosed:
      echo "in client has data loop"
      let data = await client.recv(1024)
      echo "got data: " & data
      await remote.send(data)
    client.close()
    remote.close()

  proc remoteHasData() {.async.} =
    while not remote.isClosed and not client.isClosed:
      echo " in remote has data loop"
      let data = await remote.recv(1024)
      echo "got data: " & data
      await client.send(data)
    client.close()
    remote.close()
  
  try:
    asyncCheck clientHasData()
    asyncCheck remoteHasData()
  except:
    echo getCurrentExceptionMsg()

proc serve(this: ref Forwarder) {.async.} =
  var server = newAsyncSocket(buffered=false)
  server.setSockOpt(OptReuseAddr, true)
  server.bindAddr(this.options.listenPort, this.options.listenAddr)
  echo fmt"Started tcp server... {this.options.listenAddr}:{this.options.listenPort} "
  server.listen()
  
  while true:
    let client = await server.accept()
    echo "..Got connection "

    asyncCheck this.processClient(client)

proc newForwarder(opts: ForwardOptions): ref Forwarder =
  result = new(Forwarder)
  result.options = opts

let opts = ForwardOptions(listenAddr:"127.0.0.1", listenPort:11000.Port, toAddr:"127.0.0.1", toPort:6379.Port)
var f = newForwarder(opts)
asyncCheck f.serve()
runForever()

This project is very simple, but it helped us tackle multiple concepts like how to utilize async/await and interesting use cases of asyncCheck (which @dom96 literally explained to me). Of course, it can be extended to support something like forwarding TLS traffic based on SNI, so you can serve multiple backends (with domains) using a single public IP :)

Please feel free to contribute by opening PR or issue on the repo.

Day 16: Ascii Tables

ASCII tables are everywhere: every time you issue a SQL select, use tools like docker to see your beloved containers, or view your todo list in a fancy terminal todo app.

What to expect

Being able to render tables in the terminal, control the widths and the rendering characters.

  var t = newAsciiTable()
  t.tableWidth = 80
  t.setHeaders(@["ID", "Name", "Date"])
  t.addRow(@["1", "Aaaa", "2018-10-2"])
  t.addRow(@["2", "bbvbbba", "2018-10-2"])
  t.addRow(@["399", "CCC", "1018-5-2"])
  printTable(t)

+---------------------------+---------------------------+---------------------------+
|ID                         |Name                       |Date                       |
+---------------------------+---------------------------+---------------------------+
|1                          |Aaaa                       |2018-10-2                  |
+---------------------------+---------------------------+---------------------------+
|2                          |bbvbbba                    |2018-10-2                  |
+---------------------------+---------------------------+---------------------------+
|399                        |CCC                        |1018-5-2                   |
+---------------------------+---------------------------+---------------------------+


or let nim decide the widths for you

  t.tableWidth = 0
  printTable(t)
+---+-------+---------+
|ID |Name   |Date     |
+---+-------+---------+
|1  |Aaaa   |2018-10-2|
+---+-------+---------+
|2  |bbvbbba|2018-10-2|
+---+-------+---------+
|399|CCC    |1018-5-2 |
+---+-------+---------+

or even remove the separators between the rows.

+---+-------+---------+
|ID |Name   |Date     |
+---+-------+---------+
|1  |Aaaa   |2018-10-2|
|2  |bbvbbba|2018-10-2|
|399|CCC    |1018-5-2 |
+---+-------+---------+

Why not do it manually?

Well if you want to write code like this

      var widths = @[0,0,0,0]  #id, name, ports, root
      for k, v in info:
        if len($v.id) > widths[0]:
          widths[0] = len($v.id)
        if len($v.name) > widths[1]:
          widths[1] = len($v.name)
        if len($v.ports) > widths[2]:
          widths[2] = len($v.ports)
        if len($v.root) > widths[3]:
          widths[3] = len($v.root)
      
      var sumWidths = 0
      for w in widths:
        sumWidths += w
      
      echo "-".repeat(sumWidths)

      let extraPadding = 5
      echo "| ID"  & " ".repeat(widths[0]+ extraPadding-4) & "| Name" & " ".repeat(widths[1]+extraPadding-6) & "| Ports" & " ".repeat(widths[2]+extraPadding-6 ) & "| Root" &  " ".repeat(widths[3]-6)
      echo "-".repeat(sumWidths)
  

      for k, v in info:
        let nroot = replace(v.root, "https://hub.grid.tf/", "").strip()
        echo "|" & $v.id & " ".repeat(widths[0]-len($v.id)-1 + extraPadding) & "|" & v.name & " ".repeat(widths[1]-len(v.name)-1 + extraPadding) & "|" & v.ports & " ".repeat(widths[2]-len(v.ports)+extraPadding) & "|" & nroot & " ".repeat(widths[3]-len(v.root)+ extraPadding-2) & "|"
        echo "-".repeat(sumWidths)
      result = ""

be my guest :)

imports

Not much, but we will deal with lots of strings

import strformat, strutils

Types

Let's think a bit about the entities of a Table.

Well, we have a Table, headers, rows, columns, and each row consists of cells.

Cell


type Cell* = object
  leftpad*: int
  rightpad: int
  pad*: int
  text*: string

This describes a Cell; we define properties like leftpad and rightpad to set the padding around the text in the cell. We also use the general pad property to set equal leftpad and rightpad at once.

proc newCell*(text: string, leftpad=1, rightpad=1, pad=0): ref Cell =
  result = new Cell
  result.pad = pad
  if pad != 0:
    result.leftpad = pad
    result.rightpad = pad
  else:
    result.leftpad = leftpad
    result.rightpad = rightpad
  result.text = text
proc len*(this:ref Cell): int =
  result = this.leftpad + this.text.len + this.rightpad

The Cell length is the left and right padding whitespace plus the text length.

proc `$`*(this:ref Cell): string =
  result = " ".repeat(this.leftpad) & this.text & " ".repeat(this.rightpad)

String representation of our Cell.
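
A quick check of how padding plays with length and rendering (illustrative usage of the procs above):

let c = newCell("ID", pad=2)
doAssert len(c) == 6      # 2 (left) + 2 ("ID") + 2 (right)
doAssert $c == "  ID  "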

proc newCellFromAnother(another: ref Cell): ref Cell =
  result = newCell(text=another.text, leftpad=another.leftpad, rightpad=another.rightpad)

A little helper procedure to copy properties from one cell to another.

Table

Now let's talk a bit about the table


type AsciiTable* = object 
  rows: seq[seq[string]]
  headers: seq[ref Cell]
  rowSeparator*: char
  colSeparator*: char 
  cellEdge*: char 
  widths: seq[int]
  suggestedWidths: seq[int]
  tableWidth*: int
  separateRows*: bool

AsciiTable describes a table.

  • headers: makes sense as a seq of strings @["id", "name", ...] or a list of Cells; we will describe it using a seq of Cells.
  • tableWidth: you set the total size of the table.
  • rowSeparator: character separates rows
  • colSeparator: character separates columns
  • cellEdge: character on the edge of each cell

Remember, that's how our table looks:
+---+-------+---------+
|ID |Name   |Date     |
+---+-------+---------+
|1  |Aaaa   |2018-10-2|
+---+-------+---------+
|399|CCC    |1018-5-2 |
+---+-------+---------+

We see each row is separated by the rowSeparator - line, with the cellEdge + on the edge of every cell, and the columns are separated by the colSeparator |.

  • separateRows property allows us to remove the separator between rows

without separator

+---+-------+---------+
|ID |Name   |Date     |
+---+-------+---------+
|1  |Aaaa   |2018-10-2|
|2  |bbvbbba|2018-10-2|
|399|CCC    |1018-5-2 |
+---+-------+---------+

with separator

+---+-------+---------+
|ID |Name   |Date     |
+---+-------+---------+
|1  |Aaaa   |2018-10-2|
+---+-------+---------+
|2  |bbvbbba|2018-10-2|
+---+-------+---------+
|399|CCC    |1018-5-2 |
+---+-------+---------+

proc newAsciiTable*(): ref AsciiTable =
  result = new AsciiTable
  result.rowSeparator='-'
  result.colSeparator='|'
  result.cellEdge='+'
  result.tableWidth=0
  result.separateRows=true
  result.widths = newSeq[int]()
  result.suggestedWidths = newSeq[int]()
  result.rows = newSeq[seq[string]]()
  result.headers = newSeq[ref Cell]()

Helper to initialize the table.

proc columnsCount*(this: ref AsciiTable): int =
  result = this.headers.len

helper to get the number of columns.

proc setHeaders*(this: ref AsciiTable, headers:seq[string]) =
  for s in headers:
    var cell = newCell(s)
    this.headers.add(cell)

proc setHeaders*(this: ref AsciiTable, headers: seq[ref Cell]) = 
  this.headers = headers

These allow the usage of plain strings for the headers, or customized Cells.

proc setRows*(this: ref AsciiTable, rows:seq[seq[string]]) =
  this.rows = rows

proc addRow*(this: ref AsciiTable, row:seq[string]) =
  this.rows.add(row)

Helpers to add rows to the table data structure

proc printTable*(this: ref AsciiTable) =
  echo(this.render())

This will print the rendered table, which is prepared by the render proc.

proc reset*(this:ref AsciiTable) =
  this.rowSeparator='-'
  this.colSeparator='|'
  this.cellEdge='+'
  this.tableWidth=0
  this.separateRows=true
  this.widths = newSeq[int]()
  this.suggestedWidths = newSeq[int]()
  this.rows = newSeq[seq[string]]()
  this.headers = newSeq[ref Cell]()

Resets table defaults.

Rendering the table.

Let's assume for a second that the widths property has all the information about the size of each column based on its index, e.g. widths => [5, 10, 20] means

  • column 0 can hold maximum of 5 char cell.
  • column 1 can hold maximum of 10 chars cell.
  • column 2 can hold maximum of 20 chars cell.

The cell sizes within a column can't vary, so we set the column width to the LONGEST item in the column. It's a bit tedious, so we will get back to it later.

proc oneLine(this: ref AsciiTable): string =
  result &= this.cellEdge
  for w in this.widths:
    result &= this.rowSeparator.repeat(w) & this.cellEdge
  result &= "\n"

oneLine helps in creating such a line

+---+-------+---------+

So how does it work? 1- add the cellEdge + on the left, 2- add the rowSeparator - until you consume the width of the column you are at, then add the cellEdge again, 3- add a newline \n.

Steps for each width.

+
+---+
+---+-------+
+---+-------+---------+

proc render*(this: ref AsciiTable): string =
  this.calculateWidths()

We start by calling our magic function calculateWidths

  # top border
  result &= this.oneline()

Generate the top border line of the table.

  # headers
  for colidx, h in this.headers:
    result &= this.colSeparator & $h & " ".repeat(this.widths[colidx]-len(h) )
  
  result &= this.colSeparator
  result &= "\n"
  # finish headers 

  # line after headers

Now the headers

|ID |Name   |Date     |

So for each header defined in this.headers we start with the colSeparator |, then print the content of the header (which is a cell, so we print leftpad + text + rightpad, padded with spaces up to the column width), and finally add a closing colSeparator | at the end of the line.

  result &= this.oneline()

Add another line, So our table looks like this now.

+---+-------+---------+
|ID |Name   |Date     |
+---+-------+---------+
  # start rows
  for r in this.rows:
    # start row
    for colidx, c in r:
      let cell = newCell(c, leftpad=this.headers[colidx].leftpad, rightpad=this.headers[colidx].rightpad)
      result &= this.colSeparator & $cell & " ".repeat(this.widths[colidx]-len(cell)) 
    result &= this.colSeparator
    result &= "\n"

Now exactly the same for each row: we take the row, print it the same way we printed the headers, and follow it with a new line.

Our table looks like this now

+---+-------+---------+
|ID |Name   |Date     |
+---+-------+---------+
|1  |Aaaa   |2018-10-2|
    if this.separateRows: 
        result &= this.oneLine()
    # finish row

Now we need to decide: do the rows have a line separating them or not? If they do, we finish each row by adding another oneLine

+---+-------+---------+
|ID |Name   |Date     |
+---+-------+---------+
|1  |Aaaa   |2018-10-2|
+---+-------+---------+

or, if the rows don't have separators, we want our table to look like this in the end

+---+-------+---------+
|ID |Name   |Date     |
+---+-------+---------+
|1  |Aaaa   |2018-10-2|
|2  |bbvbbba|2018-10-2|

we don't add oneLine

  # don't duplicate the finishing line if it's already printed in case of this.separateRows
  if not this.separateRows:
      result &= this.oneLine()
  return result

if we don't separateRows we add the final oneLine to the table

+---+-------+---------+
|ID |Name   |Date     |
+---+-------+---------+
|1  |Aaaa   |2018-10-2|
|2  |bbvbbba|2018-10-2|
+---+-------+---------+   <- the final oneLine

if we do separateRows we shouldn't add another oneLine or our table will be rendered like

+---+-------+---------+
|ID |Name   |Date     |
+---+-------+---------+
|1  |Aaaa   |2018-10-2|
+---+-------+---------+
|2  |bbvbbba|2018-10-2|
+---+-------+---------+
+---+-------+---------+

Now back to calculating widths

Back to the magic function. To be honest, it's not magical, it's just a bit tedious. The basic idea is:

proc calculateWidths(this: ref AsciiTable) =
  var colsWidths = newSeq[int]()

a list of column widths

  if this.suggestedWidths.len == 0:
    for h in this.headers:
      colsWidths.add(h.len) 
  else:
    colsWidths = this.suggestedWidths

The user might suggest some widths via the suggestedWidths property, so we can use them for guidance.


  for row in this.rows:
    for colpos, c in row:
      var acell = newCellFromAnother(this.headers[colpos])
      acell.text = c
      if len(acell) > colsWidths[colpos]:
        colsWidths[colpos] = len(acell)

We get the width of each column by iterating over all the rows and finding the max item (the cell with the longest length) at that column's position in every row; that max becomes the column width.

We also support setting the tableWidth of the table, which will produce equal column sizes if the user didn't suggest widths.

  let sizeForCol = (this.tablewidth/len(this.headers)).toInt()
  var lenHeaders = 0
  for w in colsWidths:
    lenHeaders += w 

Here we calculate the equal share for each column (the table width specified by the user divided by the number of column headers) and sum the current column widths into lenHeaders.

  if this.tablewidth > lenHeaders:
    if this.suggestedWidths.len == 0:
      for colpos, c in colsWidths:
        colsWidths[colpos] += sizeForCol - c

If the user didn't suggest any widths, they want the table columns to be of equal length, so each column is bumped up to the equal share sizeForCol.

  if this.suggestedWidths.len != 0:
    var sumSuggestedWidths = 0
    for s in this.suggestedWidths:
      sumSuggestedWidths += s

    if lenHeaders > sumSuggestedWidths:
      raise newException(ValueError, fmt"sum of {this.suggestedWidths} = {sumSuggestedWidths} and it's less than required length {lenHeaders}")      

If the user suggested some widths, we calculate the sum of what the user suggested and check whether it covers the calculated lenHeaders; if it doesn't, we raise an exception.

  this.widths = colsWidths

Phew! We finally set the widths property now

nim-asciitable

This day is based on my project nim-asciitables, which is superseded by nim-terminaltables, which provides more customizable styles and unicode box-drawing support.

Day 17: Nim-Sonic-Client: Nim and Rust can be friends!

sonic is a fast, lightweight and schema-less search backend. It ingests search texts and identifier tuples that can then be queried against in a microsecond's time, and it's implemented in rust. Sonic can be used as a simple alternative to super-heavy and full-featured search backends such as Elasticsearch in some use-cases. It is capable of normalizing natural language search queries, auto-completing a search query and providing the most relevant results for a query. Sonic is an identifier index, rather than a document index; when queried, it returns IDs that can then be used to refer to the matched documents in an external database. We use it heavily in all of our projects, currently through the python client, but we are here today to talk about nim. Please make sure to check the sonic website for more info on how to start the server and its configuration.

What to expect ?

Ingest

We should be able to push data over tcp from nim to sonic

    var cl = open("127.0.0.1", 1491, "dmdm", SonicChannel.Ingest)
    echo $cl.execCommand("PING")

    echo cl.ping()
    echo cl.protocol
    echo cl.bufsize
    echo cl.push("wiki", "articles", "article-1",
                  "for the love of god hell")
    echo cl.push("wiki", "articles", "article-2",
                  "for the love of satan heaven")
    echo cl.push("wiki", "articles", "article-3",
                  "for the love of lorde hello")
    echo cl.push("wiki", "articles", "article-4",
                  "for the god of loaf helmet")
PONG
true
0
0
true
2
0
true
true
true

Search

We should be able to search/complete data from nim client using sonic


    var cl = open("127.0.0.1", 1491, "dmdm", SonicChannel.Search)
    echo $cl.execCommand("PING")

    echo cl.ping()
    echo cl.query("wiki", "articles", "for")
    echo cl.query("wiki", "articles", "love")
    echo cl.suggest("wiki", "articles", "hell")
    echo cl.suggest("wiki", "articles", "lo")
PONG
true
@[]
@["article-3", "article-2"]
@[]
@["loaf", "lorde", "love"]

Sonic specification

If you go to their wire protocol page you will find some examples using telnet. I'll copy some in the following section

2️⃣ Sonic Channel (uninitialized)

  • START <mode> <password>: select mode to use for connection (either: search or ingest). The password is found in the config.cfg file at channel.auth_password.

Issuing any other command — eg. QUIT — in this mode will abort the TCP connection, effectively resulting in a QUIT with the ENDED not_recognized response.


3️⃣ Sonic Channel (Search mode)

The Sonic Channel Search mode is used for querying the search index. Once in this mode, you cannot switch to other modes or gain access to commands from other modes.

➡️ Available commands:

  • QUERY: query database (syntax: QUERY <collection> <bucket> "<terms>" [LIMIT(<count>)]? [OFFSET(<count>)]? [LANG(<locale>)]?; time complexity: O(1) if enough exact word matches or O(N) if not enough exact matches where N is the number of alternate words tried, in practice it approaches O(1))
  • SUGGEST: auto-completes word (syntax: SUGGEST <collection> <bucket> "<word>" [LIMIT(<count>)]?; time complexity: O(1))
  • PING: ping server (syntax: PING; time complexity: O(1))
  • HELP: show help (syntax: HELP [<manual>]?; time complexity: O(1))
  • QUIT: stop connection (syntax: QUIT; time complexity: O(1))

⏩ Syntax terminology:

  • <collection>: index collection (ie. what you search in, eg. messages, products, etc.);
  • <bucket>: index bucket name (ie. user-specific search classifier in the collection if you have any eg. user-1, user-2, .., otherwise use a common bucket name eg. generic, default, common, ..);
  • <terms>: text for search terms (between quotes);
  • <count>: a positive integer number; set within allowed maximum & minimum limits;
  • <locale>: an ISO 639-3 locale code eg. eng for English (if set, the locale must be a valid ISO 639-3 code; if set to none, lexing will be disabled; if not set, the locale will be guessed from text);
  • <manual>: help manual to be shown (available manuals: commands);

Notice: the bucket terminology may confuse some Sonic users. As we are well-aware Sonic may be used in an environment where end-users may each hold their own search index in a given collection, we made it possible to manage per-end-user search indexes with bucket. If you only have a single index per collection (most Sonic users will), we advise you use a static generic name for your bucket, for instance: default.

⬇️ Search flow example (via telnet):

T1: telnet sonic.local 1491
T2: Trying ::1...
T3: Connected to sonic.local.
T4: Escape character is '^]'.
T5: CONNECTED <sonic-server v1.0.0>
T6: START search SecretPassword
T7: STARTED search protocol(1) buffer(20000)
T8: QUERY messages user:0dcde3a6 "valerian saliou" LIMIT(10)
T9: PENDING Bt2m2gYa
T10: EVENT QUERY Bt2m2gYa conversation:71f3d63b conversation:6501e83a
T11: QUERY helpdesk user:0dcde3a6 "gdpr" LIMIT(50)
T12: PENDING y57KaB2d
T13: QUERY helpdesk user:0dcde3a6 "law" LIMIT(50) OFFSET(200)
T14: PENDING CjPvE5t9
T15: PING
T16: PONG
T17: EVENT QUERY CjPvE5t9
T18: EVENT QUERY y57KaB2d article:28d79959
T19: SUGGEST messages user:0dcde3a6 "val"
T20: PENDING z98uDE0f
T21: EVENT SUGGEST z98uDE0f valerian valala
T22: QUIT
T23: ENDED quit
T24: Connection closed by foreign host.

Notes on what happens:

  • T6: we enter search mode (this is required to enable search commands);
  • T8: we query collection messages, in bucket for platform user user:0dcde3a6 with search terms valerian saliou and a limit of 10 on returned results;
  • T9: Sonic received the query and stacked it for processing with marker Bt2m2gYa (the marker is used to track the asynchronous response);
  • T10: Sonic processed search query of T8 with marker Bt2m2gYa and sends 2 search results (those are conversation identifiers, that refer to a primary key in an external database);
  • T11 + T13: we query collection helpdesk twice (in the example, this one is heavy, so processing of results takes more time);
  • T17 + T18: we receive search results for search queries of T11 + T13 (this took a while!);

4️⃣ Sonic Channel (Ingest mode)

The Sonic Channel Ingest mode is used for altering the search index (push, pop and flush). Once in this mode, you cannot switch to other modes or gain access to commands from other modes.

➡️ Available commands:

  • PUSH: Push search data in the index (syntax: PUSH <collection> <bucket> <object> "<text>" [LANG(<locale>)]?; time complexity: O(1))
  • POP: Pop search data from the index (syntax: POP <collection> <bucket> <object> "<text>"; time complexity: O(1))
  • COUNT: Count indexed search data (syntax: COUNT <collection> [<bucket> [<object>]?]?; time complexity: O(1))
  • FLUSHC: Flush all indexed data from a collection (syntax: FLUSHC <collection>; time complexity: O(1))
  • FLUSHB: Flush all indexed data from a bucket in a collection (syntax: FLUSHB <collection> <bucket>; time complexity: O(N) where N is the number of bucket objects)
  • FLUSHO: Flush all indexed data from an object in a bucket in collection (syntax: FLUSHO <collection> <bucket> <object>; time complexity: O(1))
  • PING: ping server (syntax: PING; time complexity: O(1))
  • HELP: show help (syntax: HELP [<manual>]?; time complexity: O(1))
  • QUIT: stop connection (syntax: QUIT; time complexity: O(1))

⏩ Syntax terminology:

  • <collection>: index collection (ie. what you search in, eg. messages, products, etc.);
  • <bucket>: index bucket name (ie. user-specific search classifier in the collection if you have any eg. user-1, user-2, .., otherwise use a common bucket name eg. generic, default, common, ..);
  • <object>: object identifier that refers to an entity in an external database, where the searched object is stored (eg. you use Sonic to index CRM contacts by name; full CRM contact data is stored in a MySQL database; in this case the object identifier in Sonic will be the MySQL primary key for the CRM contact);
  • <text>: search text to be indexed (can be a single word, or a longer text; within maximum length safety limits; between quotes);
  • <locale>: an ISO 639-3 locale code eg. eng for English (if set, the locale must be a valid ISO 639-3 code; if set to none, lexing will be disabled; if not set, the locale will be guessed from text);
  • <manual>: help manual to be shown (available manuals: commands);

Notice: the bucket terminology may confuse some Sonic users. As we are well-aware Sonic may be used in an environment where end-users may each hold their own search index in a given collection, we made it possible to manage per-end-user search indexes with bucket. If you only have a single index per collection (most Sonic users will), we advise you use a static generic name for your bucket, for instance: default.

⬇️ Ingest flow example (via telnet):

T1: telnet sonic.local 1491
T2: Trying ::1...
T3: Connected to sonic.local.
T4: Escape character is '^]'.
T5: CONNECTED <sonic-server v1.0.0>
T6: START ingest SecretPassword
T7: STARTED ingest protocol(1) buffer(20000)
T8: PUSH messages user:0dcde3a6 conversation:71f3d63b Hey Valerian
T9: ERR invalid_format(PUSH <collection> <bucket> <object> "<text>")
T10: PUSH messages user:0dcde3a6 conversation:71f3d63b "Hello Valerian Saliou, how are you today?"
T11: OK
T12: COUNT messages user:0dcde3a6
T13: RESULT 43
T14: COUNT messages user:0dcde3a6 conversation:71f3d63b
T15: RESULT 1
T16: FLUSHO messages user:0dcde3a6 conversation:71f3d63b
T17: RESULT 1
T18: FLUSHB messages user:0dcde3a6
T19: RESULT 42
T20: PING
T21: PONG
T22: QUIT
T23: ENDED quit
T24: Connection closed by foreign host.

Notes on what happens:

  • T6: we enter ingest mode (this is required to enable ingest commands);
  • T8: we try to push text Hey Valerian to the index, in collection messages, bucket user:0dcde3a6 and object conversation:71f3d63b (the syntax that was used is invalid);
  • T9: Sonic refuses the command we issued in T8, and provides us with the correct command format (notice that <text> should be quoted);
  • T10: we attempt to push another text in the same collection, bucket and object as in T8;
  • T11: this time, our push command in T10 was valid (Sonic acknowledges the push commit to the search index);
  • T12: we count the number of indexed terms in collection messages and bucket user:0dcde3a6;
  • T13: there are 43 terms (ie. words) in index for query in T12;
  • T18: we flush all index data from collection messages and bucket user:0dcde3a6;
  • T19: 42 terms have been flushed from index for command in T18;

5️⃣ Sonic Channel (Control mode)

The Sonic Channel Control mode is used for administration purposes. Once in this mode, you cannot switch to other modes or gain access to commands from other modes.

➡️ Available commands:

  • TRIGGER: trigger an action (syntax: TRIGGER [<action>]? [<data>]?; time complexity: O(1))
  • INFO: get server information (syntax: INFO; time complexity: O(1))
  • PING: ping server (syntax: PING; time complexity: O(1))
  • HELP: show help (syntax: HELP [<manual>]?; time complexity: O(1))
  • QUIT: stop connection (syntax: QUIT; time complexity: O(1))

⏩ Syntax terminology:

  • <action>: action to be triggered (available actions: consolidate, backup, restore);
  • <data>: additional data to provide to the action (required for: backup, restore);
  • <manual>: help manual to be shown (available manuals: commands);

⬇️ Control flow example (via telnet):

T1: telnet sonic.local 1491
T2: Trying ::1...
T3: Connected to sonic.local.
T4: Escape character is '^]'.
T5: CONNECTED <sonic-server v1.0.0>
T6: START control SecretPassword
T7: STARTED control protocol(1) buffer(20000)
T8: TRIGGER consolidate
T9: OK
T10: PING
T11: PONG
T12: QUIT
T13: ENDED quit
T14: Connection closed by foreign host.

Notes on what happens:

  • T6: we enter control mode (this is required to enable control commands);
  • T8: we trigger a database consolidation (instead of waiting for the next automated consolidation tick);

Implementation

imports

these are the imports that we will use because we will be dealing with networks, some data parsing, .. etc

import strformat, tables, json, strutils, sequtils, hashes, net, asyncdispatch, asyncnet, os, parseutils, deques, options

Types

As we said earlier there're three channels

type 
  SonicChannel* {.pure.} = enum
   Ingest
   Search
   Control

Generic sonic exception

type 
  SonicServerError = object of Exception

Now for the base connection

type
  SonicBase[TSocket] = ref object of RootObj
   socket: TSocket
   host: string
   port: int
   password: string
   connected: bool
   timeout*: int
   protocol*: int
   bufSize*: int
   channel*: SonicChannel

  Sonic* = ref object of SonicBase[net.Socket]
  AsyncSonic* = ref object of SonicBase[asyncnet.AsyncSocket]

we require

  • host: the host the sonic server is running on
  • port: the port the sonic server listens on
  • password: password for the sonic server
  • connected: flag for whether we are connected or not
  • timeout: timeout in seconds
  • protocol: protocol version sent to us on connecting to the sonic server
  • bufsize: how big the data buffer you can use is
  • channel: indicates the current mode.

Helpers


proc quoteText(text:string): string =
  ## Quote text and normalize it in sonic protocol context.
  ##  - text str  text to quote/escape
  ##  Returns:
  ##    str  quoted text

  return '"' & text.replace('"', '\"').replace("\r\n", "") & '"'

quoteText is used to escape quotes and remove newlines
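
A couple of quick checks of what it produces (illustrative):

doAssert quoteText("hello world") == "\"hello world\""
doAssert quoteText("line1\r\nline2") == "\"line1line2\""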

proc isError(response: string): bool =
  ## Check if the response is Error or not in sonic context.
  ## Errors start with `ERR`
  ##  - response   response string
  ##  Returns:
  ##    bool  true if response is an error.

  response.startsWith("ERR ")

isError checks if the response represents an error

proc raiseForError(response:string): string =
  ## Raise SonicServerError in case of error response.
  ##  - response message to check if it's error or not.
  ##  Returns:
  ##    str the response message
  if isError(response):
    raise newException(SonicServerError, response)
  return response

raiseForError is a short circuit that raises an error if the response is an error, and returns the response otherwise

Making a connection

proc open*(host = "localhost", port = 1491, password="", channel:SonicChannel, ssl=false, timeout=0): Sonic =
  result = Sonic(
   socket: newSocket(buffered = true),
   host: host,
   port: port,
   password: password,
   channel: channel
  )
  result.timeout = timeout
  result.channel = channel
  when defined(ssl):
   if ssl == true:
     SSLifySonicConnectionNoVerify(result)
  result.socket.connect(host, port.Port)

  result.startSession()

proc openAsync*(host = "localhost", port = 1491, password="", channel:SonicChannel, ssl=false, timeout=0): Future[AsyncSonic] {.async.} =
  ## Open an asynchronous connection to a Sonic server.
  result = AsyncSonic(
   socket: newAsyncSocket(buffered = true),
   channel: channel
  )
  when defined(ssl):
   if ssl == true:
     SSLifySonicConnectionNoVerify(result)
  result.timeout = timeout
  await result.socket.connect(host, port.Port)
  await result.startSession()

Here we support two APIs, sync and async, for opening a connection, and as soon as the connection is established we call startSession

startSession


proc startSession*(this:Sonic|AsyncSonic): Future[void] {.multisync.} =
  let resp = await this.socket.recvLine()

  if "CONNECTED" in resp:
   this.connected = true

  var channelName = ""
  case this.channel:
   of SonicChannel.Ingest:  channelName = "ingest"
   of SonicChannel.Search:  channelName = "search"
   of SonicChannel.Control: channelName = "control"

  let msg = fmt"START {channelName} {this.password} \r\n"
  await this.socket.send(msg)  #### start
  discard await this.socket.recvLine()  #### started. FIXME extract protocol bufsize
  ## TODO: this.parseSessionMeta(line)

  • we use the multisync pragma to support both async and sync APIs (check the redisclient chapter for more info). According to the wire protocol we just send the raw string START SPACE CHANNEL_NAME SONIC_PASSWORD and terminate it with \r\n
  • when we receive data we should parse the protocol version and the bufsize and set them on our SonicClient this (a rough sketch of that parsing follows)
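
That parsing boils down to pulling the two numbers out of a line like STARTED search protocol(1) buffer(20000); here is a rough sketch of a hypothetical parseSessionMeta (not part of the client yet):

import strutils

proc parseSessionMeta(line: string): tuple[protocol: int, bufsize: int] =
  ## extract protocol and buffer size from e.g.
  ## "STARTED search protocol(1) buffer(20000)"
  for part in line.splitWhitespace():
    if part.startsWith("protocol("):
      result.protocol = parseInt(part[9 ..< part.high])
    elif part.startsWith("buffer("):
      result.bufsize = parseInt(part[7 ..< part.high])

let meta = parseSessionMeta("STARTED search protocol(1) buffer(20000)")
doAssert meta.protocol == 1 and meta.bufsize == 20000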

Sending/Receiving data

proc receiveManaged*(this:Sonic|AsyncSonic, size=1): Future[string] {.multisync.} =
  when this is Sonic:
   if this.timeout == 0:
     result = this.socket.recvLine()
   else:
     result = this.socket.recvLine(timeout=this.timeout)
  else:
   result = await this.socket.recvLine()

  result = raiseForError(result.strip())

proc execCommand*(this: Sonic|AsyncSonic, command: string, args:seq[string]): Future[string] {.multisync.} =
  let cmdArgs = concat(@[command], args)
  let cmdStr = join(cmdArgs, " ").strip()
  await this.socket.send(cmdStr & "\r\n")
  result = await this.receiveManaged()

proc execCommand*(this: Sonic|AsyncSonic, command: string): Future[string] {.multisync.} =
  result = await this.execCommand(command, @[""])

Here we have a couple of helpers: execCommand to send data on the wire and receiveManaged to receive data.

  • we only support a timeout for the sync client (there's withTimeout for async that the user can try to use)

Now we have everything we need to interact with the sonic server, but not with a user-friendly API. We can do better by converting the results to nim data structures or booleans when suitable.

User-friendly APIs

Ping

checks the server endpoint

proc ping*(this: Sonic|AsyncSonic): Future[bool] {.multisync.} =
  ## Send ping command to the server
  ## Returns:
  ## bool  True if successfully reaching the server.
  result = (await this.execCommand("PING")) == "PONG"

Quit

Ends the connection

proc quit*(this: Sonic|AsyncSonic): Future[string] {.multisync.} =
   ## Quit the channel and closes the connection.
   result = await this.execCommand("QUIT")
   this.socket.close()

Push

Pushes search data into the index

proc push*(this: Sonic|AsyncSonic, collection, bucket, objectName, text: string, lang=""): Future[bool] {.multisync.} =
   ## Push search data in the index
   ##   - collection: index collection (ie. what you search in, eg. messages, products, etc.)
   ##   - bucket: index bucket name (ie. user-specific search classifier in the collection if you have any eg. user-1, user-2, .., otherwise use a common bucket name eg. generic, default, common, ..)
   ##   - objectName: object identifier that refers to an entity in an external database, where the searched object is stored (eg. you use Sonic to index CRM contacts by name; full CRM contact data is stored in a MySQL database; in this case the object identifier in Sonic will be the MySQL primary key for the CRM contact)
   ##   - text: search text to be indexed can be a single word, or a longer text; within maximum length safety limits
   ##   - lang: ISO language code
   ##   Returns:
   ##     bool  True if search data are pushed in the index. 
   var langString = ""
   if lang != "":
     langString = fmt"LANG({lang})"
   let text = quoteText(text)
   result = (await this.execCommand("PUSH", @[collection, bucket, objectName, text, langString]))=="OK"


Pop

Pops search data from the index

proc pop*(this: Sonic|AsyncSonic, collection, bucket, objectName, text: string): Future[int] {.multisync.} =
   ## Pop search data from the index
   ##   - collection: index collection (ie. what you search in, eg. messages, products, etc.)
   ##   - bucket: index bucket name (ie. user-specific search classifier in the collection if you have any eg. user-1, user-2, .., otherwise use a common bucket name eg. generic, default, common, ..)
   ##   - objectName: object identifier that refers to an entity in an external database, where the searched object is stored (eg. you use Sonic to index CRM contacts by name; full CRM contact data is stored in a MySQL database; in this case the object identifier in Sonic will be the MySQL primary key for the CRM contact)
   ##   - text: search text to be indexed can be a single word, or a longer text; within maximum length safety limits
   ##   Returns:
   ##     int 
   let text = quoteText(text)
   let resp = await this.execCommand("POP", @[collection, bucket, objectName, text])
   result = resp.split()[^1].parseInt()

Count

Count the indexed data

proc count*(this: Sonic|AsyncSonic, collection, bucket, objectName: string): Future[int] {.multisync.} =
   ## Count indexed search data
   ##   - collection: index collection (ie. what you search in, eg. messages, products, etc.)
   ##   - bucket: index bucket name (ie. user-specific search classifier in the collection if you have any eg. user-1, user-2, .., otherwise use a common bucket name eg. generic, default, common, ..)
   ##   - objectName: object identifier that refers to an entity in an external database, where the searched object is stored (eg. you use Sonic to index CRM contacts by name; full CRM contact data is stored in a MySQL database; in this case the object identifier in Sonic will be the MySQL primary key for the CRM contact)
   ## Returns:
   ## int  count of index search data.

   var bucketString = ""
   if bucket != "":
     bucketString = bucket
   var objectNameString = ""
   if objectName != "":
     objectNameString = objectName
   result = parseInt(await this.execCommand("COUNT", @[collection, bucket, objectName]))

flush

Generic flush to be called from flushCollection, flushBucket, flushObject

proc flush*(this: Sonic|AsyncSonic, collection: string, bucket="", objectName=""): Future[int] {.multisync.} =
   ## Flush indexed data in a collection, bucket, or in an object.
   ##   - collection: index collection (ie. what you search in, eg. messages, products, etc.)
   ##   - bucket: index bucket name (ie. user-specific search classifier in the collection if you have any eg. user-1, user-2, .., otherwise use a common bucket name eg. generic, default, common, ..)
   ##   - objectName: object identifier that refers to an entity in an external database, where the searched object is stored (eg. you use Sonic to index CRM contacts by name; full CRM contact data is stored in a MySQL database; in this case the object identifier in Sonic will be the MySQL primary key for the CRM contact)
   ##   Returns:
   ##     int  number of flushed data
   if bucket == "" and objectName=="":
      result = await this.flushCollection(collection)
   elif bucket != "" and objectName == "":
      result = await this.flushBucket(collection, bucket)
   elif objectName != "" and bucket != "":
      result = await this.flushObject(collection, bucket, objectName)

flushCollection

Flushes all the indexed data from a collection

proc flushCollection*(this: Sonic|AsyncSonic, collection: string): Future[int] {.multisync.} =
   ## Flush all indexed data from a collection
   ##  - collection index collection (ie. what you search in, eg. messages, products, etc.)
   ##   Returns:
   ##     int  number of flushed data
   result = (await this.execCommand("FLUSHC", @[collection])).parseInt

flushBucket

Flushes all indexed data from a bucket in a collection

proc flushBucket*(this: Sonic|AsyncSonic, collection, bucket: string): Future[int] {.multisync.} =
   ## Flush all indexed data from a bucket in a collection
   ##   - collection: index collection (ie. what you search in, eg. messages, products, etc.)
   ##   - bucket: index bucket name (ie. user-specific search classifier in the collection if you have any eg. user-1, user-2, .., otherwise use a common bucket name eg. generic, default, common, ..)
   ##   Returns:
   ##    int  number of flushed data
   result = (await this.execCommand("FLUSHB", @[collection, bucket])).parseInt

flushObject

Flushes all indexed data from an object in a bucket in collection

proc flushObject*(this: Sonic|AsyncSonic, collection, bucket, objectName: string): Future[int] {.multisync.} =
   ## Flush all indexed data from an object in a bucket in collection
   ##   - collection: index collection (ie. what you search in, eg. messages, products, etc.)
   ##   - bucket: index bucket name (ie. user-specific search classifier in the collection if you have any eg. user-1, user-2, .., otherwise use a common bucket name eg. generic, default, common, ..)
   ##   - objectName: object identifier that refers to an entity in an external database, where the searched object is stored (eg. you use Sonic to index CRM contacts by name; full CRM contact data is stored in a MySQL database; in this case the object identifier in Sonic will be the MySQL primary key for the CRM contact)
   ##   Returns:
   ##     int  number of flushed data
   result = (await this.execCommand("FLUSHO", @[collection, bucket, objectName])).parseInt

Query

Queries sonic and returns a list of results.

proc query*(this: Sonic|AsyncSonic, collection, bucket, terms: string, limit=10, offset: int=0, lang=""): Future[seq[string]] {.multisync.} =
  ## Query the database
  ##  - collection index collection (ie. what you search in, eg. messages, products, etc.)
  ##  - bucket index bucket name (ie. user-specific search classifier in the collection if you have any eg. user-1, user-2, .., otherwise use a common bucket name eg. generic, default, common, ..)
  ##  - terms text for search terms
  ##  - limit a positive integer number; set within allowed maximum & minimum limits
  ##  - offset a positive integer number; set within allowed maximum & minimum limits
  ##  - lang an ISO 639-3 locale code eg. eng for English (if set, the locale must be a valid ISO 639-3 code; if not set, the locale will be guessed from text).
  ##  Returns:
  ##    list  list of objects ids.
  let limitString = fmt"LIMIT({limit})"
  var langString = ""
  if lang != "":
   langString = fmt"LANG({lang})"
  let offsetString = fmt"OFFSET({offset})"

  let termsString = quoteText(terms)
  discard await this.execCommand("QUERY", @[collection, bucket, termsString, limitString, offsetString, langString])
  let resp = await this.receiveManaged()
  result = resp.splitWhitespace()[3..^1]

Suggest

autocompletes a word using a collection and a bucket.

proc suggest*(this: Sonic|AsyncSonic, collection, bucket, word: string, limit=10): Future[seq[string]] {.multisync.} =
   ## auto-completes word.
   ##   - collection index collection (ie. what you search in, eg. messages, products, etc.)
   ##   - bucket index bucket name (ie. user-specific search classifier in the collection if you have any eg. user-1, user-2, .., otherwise use a common bucket name eg. generic, default, common, ..)
   ##   - word word to autocomplete
   ##   - limit a positive integer number; set within allowed maximum & minimum limits (default: 10)
   ##   Returns:
   ##     list list of suggested words.
   var limitString = fmt"LIMIT({limit})" 
   let wordString = quoteText(word)
   discard await this.execCommand("SUGGEST", @[collection, bucket, wordString, limitString])
   let resp = await this.receiveManaged()
   result = resp.splitWhitespace()[3..^1]


Test code to use

when isMainModule:

  proc testIngest() =
   var cl = open("127.0.0.1", 1491, "dmdm", SonicChannel.Ingest)
   echo $cl.execCommand("PING")

   echo cl.ping()
   echo cl.protocol
   echo cl.bufsize
   echo cl.push("wiki", "articles", "article-1",
              "for the love of god hell")
   echo cl.pop("wiki", "articles", "article-1",
              "for the love of god hell")
   echo cl.pop("wikis", "articles", "article-1",
              "for the love of god hell")
   echo cl.push("wiki", "articles", "article-2",
              "for the love of satan heaven")
   echo cl.push("wiki", "articles", "article-3",
              "for the love of lorde hello")
   echo cl.push("wiki", "articles", "article-4",
              "for the god of loaf helmet")

  proc testSearch() =

   var cl = open("127.0.0.1", 1491, "dmdm", SonicChannel.Search)
   echo $cl.execCommand("PING")

   echo cl.ping()
   echo cl.query("wiki", "articles", "for")
   echo cl.query("wiki", "articles", "love")
   echo cl.suggest("wiki", "articles", "hell")
   echo cl.suggest("wiki", "articles", "lo")

  proc testControl() =
   var cl = open("127.0.0.1", 1491, "dmdm", SonicChannel.Control)
   echo $cl.execCommand("PING")

   echo cl.ping()
   echo cl.trigger("consolidate")


  testIngest()
  testSearch()
  testControl()

Code is available on xmonader/nim-sonic-client. Feel free to send me a PR or open an issue.

Day 18: From a socket to a Webframework

Today we will be focusing on building a webframework starting from a socket :)

What to expect

proc main() =
    var router = newRouter()



    let loggingMiddleware = proc(request: var Request): (ref Response, bool) =
      let path = request.path
      let headers = request.headers
      echo "==============================="
      echo "from logger handler"
      echo "path: " & path
      echo "headers: " & $headers
      echo "==============================="
      return (newResponse(), true)

    let trimTrailingSlash = proc(request: var Request): (ref Response, bool) =
      let path = request.path
      if path.endswith("/"):
        request.path = path[0..^2]

      echo "==============================="
      echo "from slash trimmer "
      echo "path was : " & path
      echo "path: " & request.path
      echo "==============================="
      return (newResponse(), true)
      
    proc handleHello(req:var Request): ref Response =
      result = newResponse()
      result.code = Http200
      result.content = "hello world from handler /hello" & $req 
    router.addRoute("/hello", handleHello)

    let assertJwtFieldExists =  proc(request: var Request): (ref Response, bool) =
        echo $request.headers
        let jwtHeaderVals = request.headers.getOrDefault("jwt", @[""])
        let jwt = jwtHeaderVals[0]
        echo "================\n\njwt middleware"
        if jwt.len != 0:
          echo fmt"bye bye {jwt} "
        else:
          echo fmt"sure bye but i didn't get ur name"
        echo "===================\n\n"
        return (newResponse(), true)

    router.addRoute("/bye", handleHello, HttpGet, @[assertJwtFieldExists])
    
    proc handleGreet(req:var Request): ref Response =
      result = newResponse()
      result.code = Http200
      result.content = "generic greet" & $req 

        
    router.addRoute("/greet", handleGreet, HttpGet, @[])
    router.addRoute("/greet/:username", handleGreet, HttpGet, @[])
    router.addRoute("/greet/:first/:second/:lang", handleGreet, HttpGet, @[])

    let opts = ServerOptions(address:"127.0.0.1", port:9000.Port)
    var s = newServy(opts, router, @[loggingMiddleware, trimTrailingSlash])
    asyncCheck s.serve()
    echo "servy started..."
    runForever()
  
  main()

defining a handler and wiring it to a pattern or more

    proc handleHello(req:var Request): ref Response =
      result = newResponse()
      result.code = Http200
      result.content = "hello world from handler /hello" & $req 
    router.addRoute("/hello", handleHello)

    proc handleGreet(req:var Request): ref Response =
      result = newResponse()
      result.code = Http200
      result.content = "generic greet" & $req 

    router.addRoute("/greet", handleGreet, HttpGet, @[])
    router.addRoute("/greet/:username", handleGreet, HttpGet, @[])
    router.addRoute("/greet/:first/:second/:lang", handleGreet, HttpGet, @[])


defining/registering middlewares on the server globally

    let loggingMiddleware = proc(request: var Request): (ref Response, bool) =
      let path = request.path
      let headers = request.headers
      echo "==============================="
      echo "from logger handler"
      echo "path: " & path
      echo "headers: " & $headers
      echo "==============================="
      return (newResponse(), true)

    let trimTrailingSlash = proc(request: var Request): (ref Response, bool) =
      let path = request.path
      if path.endswith("/"):
        request.path = path[0..^2]

      echo "==============================="
      echo "from slash trimmer "
      echo "path was : " & path
      echo "path: " & request.path
      echo "==============================="
      return (newResponse(), true)

    var s = newServy(opts, router, @[loggingMiddleware, trimTrailingSlash])


defining middlewares (request filters on certain routes)

    router.addRoute("/bye", handleHello, HttpGet, @[assertJwtFieldExists])

Sounds like a lot. Let's get to it.

Implementation

The big picture



proc newServy(options: ServerOptions, router:ref Router, middlewares:seq[MiddlewareFunc]): ref Servy =
  result = new Servy
  result.options = options
  result.router = router
  result.middlewares = middlewares

  result.sock = newAsyncSocket()
  result.sock.setSockOpt(OptReuseAddr, true)

We have a server listening on a socket/address (which should be configurable); it has a router that knows which pattern should be handled by which handler, and a set of middlewares to be applied.
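
From the usage in the example above we can already infer the callback signatures we will be passing around; roughly something like this (MiddlewareFunc appears in the newServy signature, while HandlerFunc is a name assumed here for illustration):

type
  HandlerFunc = proc(req: var Request): ref Response
  MiddlewareFunc = proc(req: var Request): (ref Response, bool)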

proc serve(s: ref Servy) {.async.} =
  s.sock.bindAddr(s.options.port)
  s.sock.listen()
  while true:
    let client = await s.sock.accept()
    asyncCheck s.handleClient(client)

  runForever()

We accept a connection and pass it to the handleClient proc

proc handleClient(s: ref Servy, client: AsyncSocket) {.async.} =
  ## code to read request from the user
  var req = await s.parseRequestFromConnection(client)
  
  ...
  echo "received request from client: " & $req

  ## code to get the route handler
  let (routeHandler, params) = s.router.getByPath(req.path)
  req.urlParams = params
  let handler = routeHandler.handlerFunc

  ..
  ## call the handler and return response in valid http protocol format
  let resp = handler(req)
  echo "reached the handler safely.. and executing now."
  await client.send(resp.format())
  echo $req.formData

handleClient reads the data from the wire as HTTP, finds the handler for the requested path through the router, then formats a valid HTTP response and writes it on the wire. Cool? Awesome!

Example HTTP requests and responses

When you execute curl httpbin.org/get -v, the following HTTP-formatted request is sent to the httpbin.org webserver

GET /get HTTP/1.1
Host: httpbin.org
User-Agent: curl/7.62.0-DEV

That is called a Request. It has a request line METHOD PATH HTTPVERSION, e.g. GET /get HTTP/1.1, followed by a list of header lines, each with a colon separating key and value, e.g.

  • Host: httpbin.org a header is a line of Key: value
  • User-Agent: curl/7.62.0-DEV a header indicating the client type
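
Splitting that request line apart is straightforward; a rough sketch (not the framework's actual parser):

import strutils

let requestLine = "GET /get HTTP/1.1"
let parts = requestLine.splitWhitespace()
doAssert parts.len == 3
let (httpMethod, path, version) = (parts[0], parts[1], parts[2])
doAssert httpMethod == "GET" and path == "/get" and version == "HTTP/1.1"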

As soon as the server receives that request, it handles it as it was told to and replies with something like:

HTTP/1.1 200 OK
Content-Type: application/json
Date: Mon, 21 Oct 2019 18:28:13 GMT
Server: nginx
Content-Length: 206

{
  "args": {}, 
  "headers": {
    "Accept": "*/*", 
    "Host": "httpbin.org", 
    "User-Agent": "curl/7.62.0-DEV"
  }, 
  "origin": "197.52.178.58, 197.52.178.58", 
  "url": "https://httpbin.org/get"
}

This is called a Response. A response consists of (see the sketch after this list):

  • status line: HTTPVER STATUS_CODE STATUS_MESSAGE e.g HTTP/1.1 200 OK
  • list of headers
    • Content-Type: application/json type of content
    • Date: Mon, 21 Oct 2019 18:28:13 GMT date of the response
    • Server: nginx server name
    • Content-Length: 206 length of the upcoming body

Now let's go over the abstractions needed

Http Version

There're multiple http specifications 0.9, 1.0, 1.1, ..

so let's start with that. A simple enum should be enough

type
  HttpVersion* = enum
    HttpVer11,
    HttpVer10


proc `$`(ver:HttpVersion): string = 
      case ver
      of HttpVer10: result="HTTP/1.0"
      of HttpVer11: result="HTTP/1.1"

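A quick sanity check of the stringifier (just a sketch, within the same module):

doAssert $HttpVer11 == "HTTP/1.1"
doAssert $HttpVer10 == "HTTP/1.0"
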

HttpMethods

We all know the GET, POST, HEAD, .. methods; again they can be represented by a simple enum

type
  HttpMethod* = enum  ## the requested HttpMethod
    HttpHead,         ## Asks for the response identical to the one that would
                      ## correspond to a GET request, but without the response
                      ## body.
    HttpGet,          ## Retrieves the specified resource.
    HttpPost,         ## Submits data to be processed to the identified
                      ## resource. The data is included in the body of the
                      ## request.
    HttpPut,          ## Uploads a representation of the specified resource.
    HttpDelete,       ## Deletes the specified resource.
    HttpTrace,        ## Echoes back the received request, so that a client
                      ## can see what intermediate servers are adding or
                      ## changing in the request.
    HttpOptions,      ## Returns the HTTP methods that the server supports
                      ## for specified address.
    HttpConnect,      ## Converts the request connection to a transparent
                      ## TCP/IP tunnel, usually used for proxies.
    HttpPatch         ## Applies partial modifications to a resource.



proc httpMethodFromString(txt: string):  Option[HttpMethod] = 
    let s2m = {"GET": HttpGet, "POST": HttpPost, "PUT":HttpPut, "PATCH": HttpPatch, "DELETE": HttpDelete, "HEAD":HttpHead}.toTable
    if txt.toUpper in s2m:
        result = some(s2m[txt.toUpper])
    else:
        result = none(HttpMethod)

We also add httpMethodFromString, which takes a string and returns an Option[HttpMethod] value.
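
For example (a small sketch, assuming options, tables and strutils are imported in the module):

doAssert httpMethodFromString("GET") == some(HttpGet)
doAssert httpMethodFromString("BANANA") == none(HttpMethod)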

Http Code

The HTTP specification defines status codes to indicate the state of the request

  • 20X -> it's fine
  • 30X -> redirections
  • 40X -> client messed up
  • 50X -> server messed up

type
  HttpCode* = distinct range[0 .. 599]

const
  Http200* = HttpCode(200)
  Http201* = HttpCode(201)
  Http202* = HttpCode(202)
  Http203* = HttpCode(203)
  ...
  Http300* = HttpCode(300)
  Http301* = HttpCode(301)
  Http302* = HttpCode(302)
  Http303* = HttpCode(303)
  ..
  Http400* = HttpCode(400)
  Http401* = HttpCode(401)
  Http403* = HttpCode(403)
  Http404* = HttpCode(404)
  Http405* = HttpCode(405)
  Http406* = HttpCode(406)
  ...
  Http451* = HttpCode(451)
  Http500* = HttpCode(500)
  ...


proc `$`*(code: HttpCode): string =
    ## Converts the specified ``HttpCode`` into a HTTP status.
    ##
    ## For example:
    ##
    ##   .. code-block:: nim
    ##       doAssert($Http404 == "404 Not Found")
    case code.int
    ..
    of 200: "200 OK"
    of 201: "201 Created"
    of 202: "202 Accepted"
    of 204: "204 No Content"
    of 205: "205 Reset Content"
    ...
    of 301: "301 Moved Permanently"
    of 302: "302 Found"
    of 303: "303 See Other"
    ..
    of 400: "400 Bad Request"
    of 401: "401 Unauthorized"
    of 403: "403 Forbidden"
    of 404: "404 Not Found"
    of 405: "405 Method Not Allowed"
    of 406: "406 Not Acceptable"
    of 408: "408 Request Timeout"
    of 409: "409 Conflict"
    of 410: "410 Gone"
    of 411: "411 Length Required"
    of 413: "413 Request Entity Too Large"
    of 414: "414 Request-URI Too Long"
    of 415: "415 Unsupported Media Type"
    of 416: "416 Requested Range Not Satisfiable"
    of 429: "429 Too Many Requests"
    ...
    of 500: "500 Internal Server Error"
    of 501: "501 Not Implemented"
    of 502: "502 Bad Gateway"
    of 503: "503 Service Unavailable"
    of 504: "504 Gateway Timeout"
    ...
    else: $(int(code))

The code above is taken from pure/httpcore in the Nim stdlib.

Headers

Another abstraction we need is the headers list. Headers in HTTP aren't just key=value but key=[values], so a key can have a list of values.

type HttpHeaders* = ref object
      table*: TableRef[string, seq[string]]

type HttpHeaderValues* =  seq[string]

proc newHttpHeaders*(): HttpHeaders =
  new result
  result.table = newTable[string, seq[string]]()

proc newHttpHeaders*(keyValuePairs:
    seq[tuple[key: string, val: string]]): HttpHeaders =
  var pairs: seq[tuple[key: string, val: seq[string]]] = @[]
  for pair in keyValuePairs:
    pairs.add((pair.key.toLowerAscii(), @[pair.val]))
  new result
  result.table = newTable[string, seq[string]](pairs)

proc `$`*(headers: HttpHeaders): string =
  return $headers.table

proc clear*(headers: HttpHeaders) =
  headers.table.clear()

proc `[]`*(headers: HttpHeaders, key: string): HttpHeaderValues =
  ## Returns the values associated with the given ``key``. If the returned
  ## values are passed to a procedure expecting a ``string``, the first
  ## value is automatically picked. If there are
  ## no values associated with the key, an exception is raised.
  ##
  ## To access multiple values of a key, use the overloaded ``[]`` below or
  ## to get all of them access the ``table`` field directly.
  return headers.table[key.toLowerAscii].HttpHeaderValues

# converter toString*(values: HttpHeaderValues): string =
#   return seq[string](values)[0]

proc `[]`*(headers: HttpHeaders, key: string, i: int): string =
  ## Returns the ``i``'th value associated with the given key. If there are
  ## no values associated with the key or the ``i``'th value doesn't exist,
  ## an exception is raised.
  return headers.table[key.toLowerAscii][i]

proc `[]=`*(headers: HttpHeaders, key, value: string) =
  ## Sets the header entries associated with ``key`` to the specified value.
  ## Replaces any existing values.
  headers.table[key.toLowerAscii] = @[value]

proc `[]=`*(headers: HttpHeaders, key: string, value: seq[string]) =
  ## Sets the header entries associated with ``key`` to the specified list of
  ## values.
  ## Replaces any existing values.
  headers.table[key.toLowerAscii] = value

proc add*(headers: HttpHeaders, key, value: string) =
  ## Adds the specified value to the specified key. Appends to any existing
  ## values associated with the key.
  if not headers.table.hasKey(key.toLowerAscii):
    headers.table[key.toLowerAscii] = @[value]
  else:
    headers.table[key.toLowerAscii].add(value)

proc del*(headers: HttpHeaders, key: string) =
  ## Delete the header entries associated with ``key``
  headers.table.del(key.toLowerAscii)

iterator pairs*(headers: HttpHeaders): tuple[key, value: string] =
  ## Yields each key, value pair.
  for k, v in headers.table:
    for value in v:
      yield (k, value)

proc contains*(values: HttpHeaderValues, value: string): bool =
  ## Determines if ``value`` is one of the values inside ``values``. Comparison
  ## is performed without case sensitivity.
  for val in seq[string](values):
    if val.toLowerAscii == value.toLowerAscii: return true

proc hasKey*(headers: HttpHeaders, key: string): bool =
  return headers.table.hasKey(key.toLowerAscii())

proc getOrDefault*(headers: HttpHeaders, key: string,
    default = @[""].HttpHeaderValues): HttpHeaderValues =
  ## Returns the values associated with the given ``key``. If there are no
  ## values associated with the key, then ``default`` is returned.
  if headers.hasKey(key):
    return headers[key]
  else:
    return default

proc len*(headers: HttpHeaders): int = return headers.table.len

proc parseList(line: string, list: var seq[string], start: int): int =
  var i = 0
  var current = ""
  while start+i < line.len and line[start + i] notin {'\c', '\l'}:
    i += line.skipWhitespace(start + i)
    i += line.parseUntil(current, {'\c', '\l', ','}, start + i)
    list.add(current)
    if start+i < line.len and line[start + i] == ',':
      i.inc # Skip ,
    current.setLen(0)

proc parseHeader*(line: string): tuple[key: string, value: seq[string]] =
  ## Parses a single raw header HTTP line into key value pairs.
  ##
  ## Used by ``asynchttpserver`` and ``httpclient`` internally and should not
  ## be used by you.
  result.value = @[]
  var i = 0
  i = line.parseUntil(result.key, ':')
  inc(i) # skip :
  if i < len(line):
    i += parseList(line, result.value, i)
  elif result.key.len > 0:
    result.value = @[""]
  else:
    result.value = @[]

So now we have the abstraction over the headers. Very nice.
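
A quick usage sketch of these header helpers (the behavior follows directly from the procs above):

var h = newHttpHeaders()
h.add("Accept", "text/html")
h.add("Accept", "application/json")
doAssert h["accept", 1] == "application/json"
doAssert "TEXT/HTML" in h["accept"]  # contains on header values is case insensitive
doAssert parseHeader("Host: httpbin.org") == (key: "Host", value: @["httpbin.org"])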

Request

type Request = object 
  httpMethod*: HTTPMethod
  httpVersion*: HttpVersion
  headers*: HTTPHeaders
  path*: string
  body*: string
  queryParams*: TableRef[string, string]
  formData*: TableRef[string, string]
  urlParams*: TableRef[string, string]

Request is a type that keeps track of:

  • http version: from the client request
  • request method: get, post, .. etc
  • requested path: if the url is localhost:9000/users/myfile the requested path would be /users/myfile
  • headers: request headers
  • body: body
  • formData: submitted form data
  • queryParams: if the url is /users/search?name=xmon&age=50 the queryParams will be the Table {"name": "xmon", "age": "50"}
  • urlParams: the variables captured by the router. If we have a route handling /users/:username/:language and we receive a request with path /users/xmon/ar, it binds username to xmon and language to ar and makes them available on the request object for the handler to use later (see the sketch below).
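
A small hypothetical handler sketch that reads those fields (the handler name and route are made up for illustration):

# assuming a route like /users/:username is registered for this handler
proc handleUserPage(req: var Request): ref Response =
  result = newResponse()
  result.code = Http200
  let username = req.urlParams.getOrDefault("username", "anonymous")
  let lang = req.queryParams.getOrDefault("lang", "en")
  result.content = "hello " & username & " in " & lang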

Building the request

Remember the handleClient we mentioned in the big picture section?


proc handleClient(s: ref Servy, client: AsyncSocket) {.async.} =
  var req = await s.parseRequestFromConnection(client)
  ...

So let's implement parseRequestFromConnection



proc parseRequestFromConnection(s: ref Servy, conn:AsyncSocket): Future[Request] {.async.} = 

    result.queryParams = newTable[string, string]()
    result.formData = newTable[string, string]()
    result.urlParams = newTable[string, string]()

    let requestline = $await conn.recvLine(maxLength=maxLine)
    var  meth, path, httpver: string
    var parts = requestLine.splitWhitespace()
    meth = parts[0]
    path = parts[1]
    httpver = parts[2]
    var contentLength = 0
    echo meth, path, httpver
    let m = httpMethodFromString(meth)
    if m.isSome:
        result.httpMethod = m.get()
    else:
        echo meth
        raise newException(OSError, "invalid httpmethod")
    if "1.1" in httpver:
        result.httpVersion = HttpVer11
    elif "1.0" in httpver:
        result.httpVersion = HttpVer10
  
    result.path = path

    if "?" in path:
      # has query params
      result.queryParams = parseQueryParams(path) 
    

First we parse the request line METHOD PATH HTTPVER, e.g. GET /users HTTP/1.1; if we split it on whitespace we get the method, the path, and the HTTP version.

Also, if there's a ? in the request path, like in /users?username=xmon, we should parse the query parameters.


proc parseQueryParams(content: string): TableRef[string, string] =
  result = newTable[string, string]()
  var consumed = 0
  if "?" notin content and "=" notin content:
    return
  if "?" in content:
    consumed += content.skipUntil({'?'}, consumed)

  inc consumed # skip ? now.

  while consumed < content.len:
    if "=" notin content[consumed..^1]:
      break

    var key = ""
    var val = ""
    consumed += content.parseUntil(key, "=", consumed)
    inc consumed # =
    consumed += content.parseUntil(val, "&", consumed)
    inc consumed
    # result[decodeUrl(key)] = result[decodeUrl(val)]
    result.add(decodeUrl(key), decodeUrl(val))
    echo "consumed:" & $consumed
    echo "contentlen:" & $content.len

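As a quick sanity check, a hypothetical path would be parsed like this:

let qp = parseQueryParams("/users?username=xmon&lang=ar")
doAssert qp["username"] == "xmon"
doAssert qp["lang"] == "ar"
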

Next should be the headers

    result.headers = newHttpHeaders()


    # parse headers
    var line = ""
    line = $(await conn.recvLine(maxLength=maxLine))
    echo fmt"line: >{line}< "
    while line != "\r\n":
      # a header line
      let kv = parseHeader(line)
      result.headers[kv.key] = kv.value
      if kv.key.toLowerAscii == "content-length":
        contentLength = parseInt(kv.value[0])
      line = $(await conn.recvLine(maxLength=maxLine))
      # echo fmt"line: >{line}< "

We receive the headers and figure out the body length from the content-length header, so we know how much to consume from the socket after we're done with the headers.

    if contentLength > 0:
      result.body = await conn.recv(contentLength)

    discard result.parseFormData()

Now that we know how much to consume (contentLength) from the socket we can capture the request's body. Notice that parseFormData handles any form submitted in the request; let's take a look at that next.

Submitting data

In HTTP there are different Content-Type(s) to submit (post) data: application/x-www-form-urlencoded and multipart/form-data.

Quoting a Stack Overflow answer:

The purpose of both of those types of requests is to send a list of name/value pairs to the server. Depending on the type and amount of data being transmitted, one of the methods will be more efficient than the other. To understand why, you have to look at what each is doing under the covers.

For application/x-www-form-urlencoded, the body of the HTTP message sent to the server is essentially one giant query string -- name/value pairs are separated by the ampersand (&), and names are separated from values by the equals symbol (=). An example of this would be: 

MyVariableOne=ValueOne&MyVariableTwo=ValueTwo


That means that for each non-alphanumeric byte that exists in one of our values, it's going to take three bytes to represent it. For large binary files, tripling the payload is going to be highly inefficient.

That's where multipart/form-data comes in. With this method of transmitting name/value pairs, each pair is represented as a "part" in a MIME message (as described by other answers). Parts are separated by a particular string boundary (chosen specifically so that this boundary string does not occur in any of the "value" payloads). Each part has its own set of MIME headers like Content-Type, and particularly Content-Disposition, which can give each part its "name." The value piece of each name/value pair is the payload of each part of the MIME message. The MIME spec gives us more options when representing the value payload -- we can choose a more efficient encoding of binary data to save bandwidth (e.g. base 64 or even raw binary).

e.g:

If you want to send the following data to the web server:

name = John
age = 12

using application/x-www-form-urlencoded would be like this:

name=John&age=12

As you can see, the server knows that parameters are separated by an ampersand &. If & is required for a parameter value then it must be encoded.

So how does the server know where a parameter value starts and ends when it receives an HTTP request using multipart/form-data?

Using the boundary, similar to &.

For example:

--XXX
Content-Disposition: form-data; name="name"

John
--XXX
Content-Disposition: form-data; name="age"

12
--XXX--

reference of the above explanation


type FormPart = object
      name*: string
      headers*: HttpHeaders
      body*: string

proc newFormPart(): ref FormPart = 
  new result
  result.headers = newHttpHeaders()

proc `$`(this:ref FormPart): string = 
  result = fmt"partname: {this.name} partheaders: {this.headers} partbody: {this.body}" 

type FormMultiPart = object
  parts*: TableRef[string, ref FormPart]

proc newFormMultiPart(): ref FormMultiPart = 
  new result
  result.parts = newTable[string, ref FormPart]()

proc `$`(this: ref FormMultiPart): string = 
  return fmt"parts: {this.parts}"

So that's our abstraction for multipart form.

proc parseFormData(r: Request): ref FormMultiPart =


  discard """
received request from client: (httpMethod: HttpPost, requestURI: "", httpVersion: HTTP/1.1, headers: {"accept": @["*/*"], "content-length": @["241"], "content-type": @["multipart/form-data; boundary=------------------------95909933ebe184f2"], "host": @["127.0.0.1:9000"], "user-agent": @["curl/7.62.0-DEV"]}, path: "/post", body: "--------------------------95909933ebe184f2\c\nContent-Disposition: form-data; name=\"who\"\c\n\c\nhamada\c\n--------------------------95909933ebe184f2\c\nContent-Disposition: form-data; name=\"next\"\c\n\c\nhome\c\n--------------------------95909933ebe184f2--\c\n", raw_body: "", queryParams: {:})
  """

  result = newFormMultiPart()
  
  let contenttype = r.headers.getOrDefault("content-type")[0]
  let body = r.body
  
  if "form-urlencoded" in contenttype.toLowerAscii():
    # query params are the post body
    let postBodyAsParams = parseQueryParams(body)
    for k, v in postBodyAsParams.pairs:
      r.queryParams.add(k, v)     

If the content-type contains form-urlencoded we parse the body as if it were query params.


  elif contenttype.startsWith("multipart/") and "boundary" in contenttype:
    var boundaryName = contenttype[contenttype.find("boundary=")+"boundary=".len..^1]
    echo "boundayName: " & boundaryName
    for partString in body.split(boundaryName & "\c\L"):
      var part = newFormPart()
      var partName = ""

      var totalParsedLines = 1
      let bodyLines = body.split("\c\L")[1..^1] # at the boundary line
      for line in bodyLines:
        if line.strip().len != 0:
          let splitted = line.split(": ")
          if len(splitted) == 2:
            part.headers.add(splitted[0], splitted[1])
          elif len(splitted) == 1:
            part.headers.add(splitted[0], "")
          
          if "content-disposition" in line.toLowerAscii and "name" in line.toLowerAscii:
            # Content-Disposition: form-data; name="next"
            var consumed = line.find("name=")+"name=".len
            discard line.skip("\"", consumed) 
            inc consumed
            consumed += line.parseUntil(partName, "\"", consumed)

        else:
          break # done with headers now for the body.

        inc totalParsedLines
      
      let content = join(bodyLines[totalParsedLines..^1], "\c\L")
      part.body = content
      part.name = partName
      result.parts.add(partName, part)
      echo $result.parts

If it's not form-urlencoded then it's multipart, so we need to figure out the boundary and split the body on that boundary text.

Response

Now that we can parse the client request we need to be able to build a correctly formatted response. Response keeps track of

  • http version
  • response status code
  • response content
  • response headers

type Response = object
  headers: HttpHeaders
  httpver: HttpVersion
  code: HttpCode
  content: string

Formatting response


proc formatStatusLine(code: HttpCode, httpver: HttpVersion) : string =
  return fmt"{httpver} {code}" & "\r\n"

Here we build the status line, which is HTTPVERSION STATUS_CODE STATUS_MSG\r\n, e.g. HTTP/1.1 200 OK.
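
For example (a sketch using the enums and codes defined earlier):

doAssert formatStatusLine(Http200, HttpVer11) == "HTTP/1.1 200 OK\r\n"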

proc formatResponse(code:HttpCode, httpver:HttpVersion, content:string, headers:HttpHeaders): string = 
  result &= formatStatusLine(code, httpver)
  if headers.len > 0:
    for k,v in headers.pairs:
      result &= fmt"{k}: {v}" & "\r\n"
  result &= fmt"Content-Length: {content.len}" & "\r\n\r\n"
  result &= content
  echo "will send"
  echo result
  

proc format(resp: ref Response) : string = 
  result = formatResponse(resp.code, resp.httpver, resp.content, resp.headers)


To format a complete response we need to (a small sketch follows this list):

  • build the status line
  • convert the headers to strings
  • set Content-Length to the length of the body
  • append the body itself
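
A rough sketch of what formatResponse produces (hypothetical header values):

var respHeaders = newHttpHeaders()
respHeaders["Content-Type"] = "text/plain"
let wire = formatResponse(Http200, HttpVer11, "hello", respHeaders)
# wire now looks like:
#   HTTP/1.1 200 OK\r\n
#   content-type: text/plain\r\n
#   Content-Length: 5\r\n
#   \r\n
#   hello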

Handling client request

so every handler function should take a Request object and return a Response to be sent on the wire. Right?



proc handleClient(s: ref Servy, client: AsyncSocket) {.async.} =
  var req = await s.parseRequestFromConnection(client)
  ...
  let (routeHandler, params) = s.router.getByPath(req.path)
  req.urlParams = params
  let handler = routeHandler.handlerFunc
  ...
  let resp = handler(req)
  await client.send(resp.format())

Very cool. The router will magically return a suitable route handler, or the 404 handler if none is found, using its getByPath proc.

  • We get the handler
  • apply it to the request to get a valid http response
  • send the response to the client on the wire.

Let's look at the handler function example definition again:


    proc handleHello(req:var Request): ref Response =
      result = newResponse()
      result.code = Http200
      result.content = "hello world from handler /hello" & $req 

so it takes a request and returns a response, how about we create an alias for that?

type HandlerFunc = proc(req: var Request):ref Response {.nimcall.}

Middlewares

It's typical in many frameworks to apply a certain set of checks or functions to the incoming request before sending it to any handler, like logging the request, trimming trailing slashes, or checking for a certain header.

How can we implement that? Remember our handleClient? Middlewares need to be applied before the request reaches the handler, so they should run before handler(req).


proc handleClient(s: ref Servy, client: AsyncSocket) {.async.} =
  var req = await s.parseRequestFromConnection(client)
  ### HERE SHOULD BE MIDDLEWARE Code
  ###
  ###


  let (routeHandler, params) = s.router.getByPath(req.path)
  req.urlParams = params
  let handler = routeHandler.handlerFunc
  ...
  let resp = handler(req)
  await client.send(resp.format())

So let's get to the implementation


proc handleClient(s: ref Servy, client: AsyncSocket) {.async.} =
  var req = await s.parseRequestFromConnection(client)
  
  for  m in s.middlewares:
    let (resp, usenextmiddleware) = m(req)
    if not usenextmiddleware:
      echo "early return from middleware..."
      await client.send(resp.format())
      return
  ...
  let handler = routeHandler.handlerFunc
  ...
  let resp = handler(req)
  await client.send(resp.format())

Here we loop over all registered middlewares:

  • a middleware should return a response to be sent if it needs to terminate the handling immediately
  • it should tell us whether to continue applying middlewares or terminate immediately

That's why the definition of a middleware looks like this:


    let loggingMiddleware = proc(request: var Request): (ref Response, bool) =
      let path = request.path
      let headers = request.headers
      echo "==============================="
      echo "from logger handler"
      echo "path: " & path
      echo "headers: " & $headers
      echo "==============================="
      return (newResponse(), true)

Let's create an alias for the middleware function type so we can use it easily in the rest of our code:

type MiddlewareFunc = proc(req: var Request): (ref Response, bool) {.nimcall.}

Route specific middlewares

Above we talked about global application middlewares, but maybe we want to apply some middleware or filter to a specific route only.


proc handleClient(s: ref Servy, client: AsyncSocket) {.async.} =
  var req = await s.parseRequestFromConnection(client)
  
  
  for  m in s.middlewares:
    let (resp, usenextmiddleware) = m(req)
    if not usenextmiddleware:
      echo "early return from middleware..."
      await client.send(resp.format())
      return

  echo "received request from client: " & $req

  let (routeHandler, params) = s.router.getByPath(req.path)
  req.urlParams = params
  let handler = routeHandler.handlerFunc
  let middlewares = routeHandler.middlewares
  
  

  for  m in middlewares:
    let (resp, usenextmiddleware) = m(req)
    if not usenextmiddleware:
      echo "early return from route middleware..."
      await client.send(resp.format())
      return
    
  let resp = handler(req)
  echo "reached the handler safely.. and executing now."
  await client.send(resp.format())
  echo $req.formData


Notice that we now also have route-specific middlewares to apply before calling handler(req), for example to check for a header before allowing access to that route.
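
For example, a hypothetical route-specific middleware that rejects requests missing an Authorization header (the names requireAuthHeader and /secret are made up; registration uses addRoute as in the earlier example):

let requireAuthHeader = proc(request: var Request): (ref Response, bool) =
  if not request.headers.hasKey("Authorization"):
    var resp = newResponse()
    resp.code = Http401
    resp.content = "missing Authorization header"
    return (resp, false)   # stop here and send this response
  return (newResponse(), true) # continue to the handler

router.addRoute("/secret", handleHello, HttpGet, @[requireAuthHeader])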

Router

The router is one of the essential components in our code. It's responsible for keeping track of the registered patterns and their handlers, so we can actually do something with an incoming request, and of the filter middlewares to apply to the request.


type RouterValue = object
  handlerFunc: HandlerFunc
  middlewares:seq[MiddlewareFunc]

type Router = object
  table: TableRef[string, RouterValue]

This is the basic definition of the router: a map from a URL pattern to a RouterValue, which holds a reference to the handler proc and a sequence of middlewares/filters.

proc newRouter(): ref Router =
  result = new Router
  result.table = newTable[string, RouterValue]()

Initializing the router

proc handle404(req: var Request): ref Response  = 
  var resp = newResponse()
  resp.code = Http404
  resp.content = fmt"nothing at {req.path}"
  return resp

A simple 404 handler in case we don't find a handler for the requested path.

proc getByPath(r: ref Router, path: string, notFoundHandler:HandlerFunc=handle404) : (RouterValue, TableRef[string, string]) =
  var found = false
  if path in r.table: # exact match
    return (r.table[path], newTable[string, string]())

  for handlerPath, routerValue in r.table.pairs:
    echo fmt"checking handler:  {handlerPath} if it matches {path}" 
    let pathParts = path.split({'/'})
    let handlerPathParts = handlerPath.split({'/'})
    echo fmt"pathParts {pathParts} and handlerPathParts {handlerPathParts}"

    if len(pathParts) != len(handlerPathParts):
      echo "length isn't ok"
      continue
    else:
      var idx = 0
      var capturedParams = newTable[string, string]()

      while idx<len(pathParts):
        let pathPart = pathParts[idx]
        let handlerPathPart = handlerPathParts[idx]
        echo fmt"current pathPart {pathPart} current handlerPathPart: {handlerPathPart}"

        if handlerPathPart.startsWith(":") or handlerPathPart.startsWith("@"):
          echo fmt"found var in path {handlerPathPart} matches {pathPart}"
          capturedParams[handlerPathPart[1..^1]] = pathPart
          inc idx
        else:
          if pathPart == handlerPathPart:
            inc idx
          else:
            break

        if idx == len(pathParts):
          found = true
          return (routerValue, capturedParams)

  if not found:
    return (RouterValue(handlerFunc:notFoundHandler, middlewares: @[]), newTable[string, string]())

Here we search the patterns registered in the router for an exact match, or, if a pattern has variables, we capture their values. E.g. the pattern /users/:name/:lang matches the request /users/xmon/ar and creates an env Table {"name": "xmon", "lang": "ar"}

  • /mywebsite/homepage pattern matches /mywebsite/homepage
  • /blogs/:username pattern matches the paths /blogs/xmon and /blogs/ahmed, so it captures the env with the variable name username and the value xmon or ahmed, and returns it
  • when we find the suitable handler and its env, we set the env on the request's urlParams field and call the handler on the updated request. Remember our handleClient proc?

proc handleClient(s: ref Servy, client: AsyncSocket) {.async.} =
  var req = await s.parseRequestFromConnection(client)
  
  ## Global middlewares
  ## ..
  ## ..

  let (routeHandler, params) = s.router.getByPath(req.path)
  req.urlParams = params
  let handler = routeHandler.handlerFunc

  ## Route middlewares.
  ## ..
  ## ..
  let resp = handler(req)
  await client.send(resp.format())

proc addHandler(router: ref Router, route: string, handler: HandlerFunc, httpMethod:HttpMethod=HttpGet, middlewares:seq[MiddlewareFunc]= @[]) = 
  router.table.add(route, RouterValue(handlerFunc:handler, middlewares:middlewares))

We provide a simple proc to add a handler for a route on a Router object, setting the method type and the middlewares as well.
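
Putting registration and lookup together, a quick sketch (hypothetical routes):

var r = newRouter()
r.addHandler("/users/:name/:lang", handleHello)
let (rv, params) = r.getByPath("/users/xmon/ar")
# params is now {"name": "xmon", "lang": "ar"} and rv.handlerFunc is handleHello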

What's next?

We didn't talk about templates, cookies, sessions, dates, or sending files, and this is certainly not a complete HTTP implementation by any means. Jester is a great option to check out. Thank you for going through this day, and please feel free to send a PR or open an issue on the nim-servy repository

Day 19: Wit.AI client

A Nim client for wit.ai, which lets you easily create text or voice based bots that humans can chat with on their preferred messaging platform. It helps reduce expressions into an entity/trait.

E.g. in your wit.ai project you define an entity like vm (virtual machine) and traits like create and stop; when you send an expression like new virtual machine or fresh vm, wit.ai reduces it to the entity vm with the trait create.

What to expect

  let tok = getEnv("WIT_ACCESS_TOKEN", "")
  if tok == "":
    echo "Make sure to set WIT_ACCESS_TOKEN variable"
    quit 1
  var inp = ""
  var w = newWit(tok)

  while true:
    echo "Enter your query or q to quit > "
    inp = stdin.readLine()
    if inp == "q":
      quit 0
    else:
      echo w.message(inp)
Enter your query or q to quit >
new vm
{"_text":"new vm","entities":{"vm":[{"confidence":0.97072907352305,"value":"create"}]},"msg_id":"1N6CURN7qaJaSKXSK"}

Enter your query or q to quit >
new machine
{"_text":"new machine","entities":{"vm":[{"confidence":0.90071815565634,"value":"create"}]},"msg_id":"1t8dOpkPbAP6SgW49"}

Enter your query or q to quit >
new docker
{"_text":"new docker","entities":{"container":[{"confidence":0.98475238333984,"value":"create"}]},"msg_id":"1l7ocY7MVWBfUijsm"}
Enter your query or q to quit >

stop machine
{"_text":"stop machine","entities":{"vm":[{"confidence":0.66323929848545,"value":"stop"}]},"msg_id":"1ygXLjnQbEt4lVMyS"}
Enter your query or q to quit >

show my coins
{"_text":"show my coins","entities":{"wallet":[{"confidence":0.75480999601329,"value":"show"}]},"msg_id":"1SdYOY60xXdMvUG7b"}
Enter your query or q to quit >

view coins
{"_text":"view coins","entities":{"wallet":[{"confidence":0.5975926583378,"value":"show"}]},"msg_id":"1HZ3YlfLlr31JlbKZ"}
Enter your query or q to quit >

Speech

  echo w.speech("/home/striky/startnewvm.wav", {"Content-Type": "audio/wav"}.toTable)
{
  "_text" : "start new the m is",
  "entities" : {
    "vm" : [ {
      "confidence" : 0.54805678200202,
      "value" : "create"
    } ]
  },
  "msg_id" : "1jHMTJGHEAFh8LHFS"
}

Implementation

imports

import strformat, tables, json, strutils, sequtils, hashes, net, asyncdispatch,
    asyncnet, os, parseutils, deques, options
import logging
import httpclient
import uri

var L = newConsoleLogger()
addHandler(L)

Here we import the utilities we are going to use, like string formatters, tables, json, the HTTP client, etc., and prepare a default logger.

Crafting wit.ai API

let WIT_API_HOST = getEnv("WIT_URL", "https://api.wit.ai")
let WIT_API_VERSION = getEnv("WIT_API_VERSION", "20160516")
let DEFAULT_MAX_STEPS = 5

To work with wit.ai API you will need to generate an API token.

  • WIT_API_HOST: base URL for the wit.ai API. Notice it's https, so we will need the -d:ssl flag when compiling.
  • WIT_API_VERSION: API version in wit.ai

We will be interested in /message and /speech endpoints in wit.ai API

Adding authorization to HTTP Headers

proc getWitAIRequestHeaders*(accessToken: string): HttpHeaders =
  result = newHttpHeaders({
    "authorization": "Bearer " & accessToken,
    "accept": "application/vnd.wit." & WIT_API_VERSION & "+json"
  })

To authorize our requests against wit.ai we need to add an authorization header.

Encoding params helper

proc encodeQueryStringTable(qsTable: Table[string, string]): string =
  result = ""

  if qsTable.len == 0:
    return result

  result = "?"
  var first = true
  for k, v in qsTable.pairs:
    if not first:
      result &= "&"
    result &= fmt"{k}={encodeUrl(v)}"
    first = false
  echo $result
  return result

A helper to encode key/value pairs into a query string like ?key1=val1&key2=val2.
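
For example (a small sketch; encodeUrl comes from the uri module imported above):

doAssert encodeQueryStringTable({"q": "new vm"}.toTable) == "?q=new+vm"
doAssert encodeQueryStringTable(initTable[string, string]()) == ""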

Let's get to the client

Here we define the interesting parts to interact with wit.ai

type WitException* = object of Exception

Generic Exception to use

type Wit* = ref object of RootObj
  accessToken*: string
  client*: HttpClient

proc newWit(accessToken: string): Wit =
  var w = new Wit
  w.accessToken = accessToken
  w.client = newHttpClient()
  result = w

The entry point for our Wit.AI client. The Wit client keeps track of:

  • accessToken: to access the API
  • client: http client to use underneath

proc newRequest(this: Wit, meth = HttpGet, path: string, params: Table[string,
    string], body = "", headers: Table[string, string]): string =
  let fullUrl = WIT_API_HOST & path & encodeQueryStringTable(params)
  this.client.headers = getWitAIRequestHeaders(this.accessToken)
  if headers.len > 0:
    for k, v in headers:
      this.client.headers[k] = v

  var resp: Response
  if body == "":
    resp = this.client.request(fullUrl, httpMethod = meth)
  else:
    resp = this.client.request(fullUrl, httpMethod = meth, body = body)
  if resp.code != 200.HttpCode:
    raise newException(WitException, (fmt"[-] {resp.code}: {resp.body} "))

  result = resp.body

Generic helper to format/build wit.ai requests. It does the following

  • Prepares the headers with authorization using getWitAIRequestHeaders
  • Prepares the full URL using the WIT_API_HOST and the query params sent
  • Based on the method HttpGet or HttpPost it'll issue the request and raises if response's status code is not 200
  • Returns the response body

/message endpoint

According to the wit.ai docs, only the q param is required.

Definition
  GET https://api.wit.ai/message
Example request with single outcome

  $ curl -XGET 'https://api.wit.ai/message?v=20170307&q=how%20many%20people%20between%20Tuesday%20and%20Friday' \
      -H 'Authorization: Bearer $TOKEN'

Example response
  {
    "msg_id": "387b8515-0c1d-42a9-aa80-e68b66b66c27",
    "_text": "how many people between Tuesday and Friday",
    "entities": {
      "metric": [ {
        "metadata": "{'code': 324}",
        "value": "metric_visitor",
        "confidence": 0.9231
      } ],
      "datetime": [
        {
          "confidence": 0.954105,
          "values": [
            {
              "to": {
                "value": "2018-12-22T00:00:00.000-08:00",
                "grain": "day"
              },
              "from": {
                "value": "2018-12-18T00:00:00.000-08:00",
                "grain": "day"
              },
              "type": "interval"
            },
            {
              "to": {
                "value": "2018-12-29T00:00:00.000-08:00",
                "grain": "day"
              },
              "from": {
                "value": "2018-12-25T00:00:00.000-08:00",
                "grain": "day"
              },
              "type": "interval"
            },
            {
              "to": {
                "value": "2019-01-05T00:00:00.000-08:00",
                "grain": "day"
              },
              "from": {
                "value": "2019-01-01T00:00:00.000-08:00",
                "grain": "day"
              },
              "type": "interval"
            }
          ],
          "to": {
            "value": "2018-12-22T00:00:00.000-08:00",
            "grain": "day"
          },
          "from": {
            "value": "2018-12-18T00:00:00.000-08:00",
            "grain": "day"
          },
          "type": "interval"
        }
      ]
    }
  }

proc message*(this: Wit, msg: string, context: ref Table[string, string] = nil,
    n = "", verbose = ""): string =
  var params = initTable[string, string]()
  if n != "":
    params["n"] = n
  if verbose != "":
    params["verbose"] = verbose
  if msg != "":
    params["q"] = msg

  if not context.isNil and context.len > 0:
    var ctxNode = %* {}
    for k, v in context.pairs:
      ctxNode[k] = %*v

    params["context"] = ( %* ctxNode).pretty()

  return this.newRequest(HttpGet, path = "/message", params, "", initTable[
      string, string]())

Here msg is the expression we want wit.ai to analyze, and we add some extra params for a closer mapping to the official API: context, verbose, n.

  • msg: User’s query. Length must be > 0 and < 280
  • verbose: A flag to get auxiliary information about entities, like the location within the sentence.
  • n: The maximum number of n-best trait entities you want to get back. The default is 1, and the maximum is 8
  • context: Context is key in natural language. For instance, at the same absolute instant, “today” will be resolved to a different value depending on the timezone of the user. (can contain locale, timezone, coords for coordinates)

/speech endpoint

proc speech*(this: Wit, audioFilePath: string, headers: Table[string, string],
    context: ref Table[string, string] = nil, n = "", verbose = ""): string =
  var params = initTable[string, string]()
  if n != "":
    params["n"] = n
  if verbose != "":
    params["verbose"] = verbose

  if not context.isNil and context.len > 0:
    var ctxNode = %* {}
    for k, v in context.pairs:
      ctxNode[k] = %*v

    params["context"] = ( %* ctxNode).pretty()
  let body = readFile(audioFilePath)

  return this.newRequest(HttpPost, path = "/speech", params, body, headers)

Almost the same as the /message endpoint, except we send the audio file's content in the body.

Thanks

The complete sources can be found at nim-witai. Please feel free to contribute by opening PR or issue on the repo.

Day 20: CacheTable

Today we will implement an expiry feature for keys on top of Nim tables.

What to expect

  var c = newCacheTable[string, string](initDuration(seconds = 2))
  c.setKey("name", "ahmed", initDuration(seconds = 10))
  c.setKey("color", "blue", initDuration(seconds = 5))
  c.setKey("akey", "a value", DefaultExpiration)
  c.setKey("akey2", "a value2", DefaultExpiration)

  c.setKey("lang", "nim", NeverExpires)

  • Here we will create a new cache table from string to string
  • We are allowed to set the default expiration globally on the table to 2 seconds using a Duration object: newCacheTable[string, string](initDuration(seconds = 2))
  • We are allowed to override the default expiration in setKey by passing a Duration object
  • We are allowed to set a key to NeverExpires

Here's a small example to see the internals of execution


  for i in countup(0, 20):
    echo "has key name? " & $c.hasKey("name")
    echo $c.getCache
    echo $c.get("name")
    echo $c.get("color")
    echo $c.get("lang")
    echo $c.get("akey")
    echo $c.get("akey2")
    os.sleep(1*1000)

Implementation

Imports

import tables, times, os, options, locks
type Expiration* = enum NeverExpires, DefaultExpiration

We have two types of Expiration:

  • NeverExpires basically the key stays there forever.
  • DefaultExpiration to use whatever global expiration value is defined on the table

type Entry*[V] = object
  value*: V
  ttl*: int64

type CacheTable*[K, V] = ref object
  cache: Table[K, Entry[V]]
  lock*: locks.Lock
  defaultExpiration*: Duration

proc newCacheTable*[K, V](defaultExpiration = initDuration(
    seconds = 5)): CacheTable[K, V] =
  ## Create new CacheTable
  result = CacheTable[K, V]()
  result.cache = initTable[K, Entry[V]]()
  result.defaultExpiration = defaultExpiration

The only difference between our CacheTable and Nim's Table is that the entries keep track of their Time To Live (TTL).

  • Entry is a Generic entry we store in the CacheTable that has a value of a type V and keeps track of its ttl
  • CacheTable is a Table from keys of type K to values of type Entry[V] and keeps track of a default expiration
  • newCacheTable is a helper to create a new CacheTable

proc getCache*[K, V](t: CacheTable[K, V]): Table[K, Entry[V]] =
  result = t.cache

a helper to get the underlying Table

proc setKey*[K, V](t: CacheTable[K, V], key: K, value: V, d: Duration) =
  ## Set ``Key`` of type ``K`` (needs to be hashable) to ``value`` of type ``V`` with duration ``d``
  let rightnow = times.getTime()
  let rightNowDur = times.initDuration(seconds = rightnow.toUnix(),
      nanoseconds = rightnow.nanosecond)

  let ttl = d.inNanoseconds + rightNowDur.inNanoseconds
  let entry = Entry[V](value: value, ttl: ttl)
  t.cache.add(key, entry)

a helper to set a new key in the CacheTable with a specific Duration

proc setKey*[K, V](t: CacheTable[K, V], key: K, value: V,
    expiration: Expiration = NeverExpires) =
  ## Sets key with `Expiration` strategy
  var entry: Entry[V]
  case expiration:
  of NeverExpires:
    entry = Entry[V](value: value, ttl: 0)
    t.cache.add(key, entry)
  of DefaultExpiration:
    t.setKey(key, value, d = t.defaultExpiration)

a helper to set key based on an Expiration strategy

  • if NeverExpires: ttl should be 0
  • if DefaultExpiration: ttl is computed from the CacheTable's defaultExpiration duration

proc setKeyWithDefaultTtl*[K, V](t: CacheTable[K, V], key: K, value: V) =
  ## Sets a key with default Ttl duration.
  t.setKey(key, value, DefaultExpiration)

sets a key to value with default expiration

proc hasKey*[K, V](t: CacheTable[K, V], key: K): bool =
  ## Checks if `key` exists in cache
  result = t.cache.hasKey(key)

Check if the cache underneath has a specific key

proc isExpired(ttl: int64): bool =
  if ttl == 0:
    # echo "duration 0 never expires."
    result = false
  else:
    let rightnow = times.getTime()
    let rightNowDur = times.initDuration(seconds = rightnow.toUnix(),
        nanoseconds = rightnow.nanosecond)
    # echo "Now is : " & $rightnow
    result = rightnowDur.inNanoseconds > ttl

Helper to check if a ttl expired relative to the time right now.
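
A quick sketch of how that plays out (written inside the module, since isExpired is not exported):

let rightNow = times.getTime()
let rightNowDur = initDuration(seconds = rightNow.toUnix(), nanoseconds = rightNow.nanosecond)
doAssert not isExpired(rightNowDur.inNanoseconds + initDuration(seconds = 10).inNanoseconds)
doAssert isExpired(1)      # 1 ns after the epoch is long gone
doAssert not isExpired(0)  # 0 means NeverExpires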

proc get*[K, V](t: CacheTable[K, V], key: K): Option[V] =
  ## Get value of `key` from cache
  var entry: Entry[V]
  try:
      entry = t.cache[key]
  except:
    return none(V)

  # echo "getting entry for key: " & key  & $entry
  if not isExpired(entry.ttl):
    # echo "k: " & key & " didn't expire"
    return some(entry.value)
  else:
    # echo "k: " & key & " expired"
    del(t.cache, key)
    return none(V)

Getting a key from the cache returns an Option[V] of the value of type V stored in the Entry[V].
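
Unwrapping the returned Option looks like this (a small sketch using the c table from the example above):

let name = c.get("name")
if name.isSome:
  echo "name is still cached: " & name.get()
else:
  echo "name expired or was never set"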

Thank you for reading! Please feel free to open an issue or a PR to improve the content of Nim Days :)

Parser combinators

Today, we will learn about parser combinators and Nim. A parser is something (a function) that accepts some text and creates a decent structure out of it (that's not a formal definition by any means). The first time I learned about parser combinators was when I was (still am, for sure) learning Haskell, and I was amazed by the expressiveness and composability. Lots of languages have libraries based on parser combinators, e.g. Python's pyparsing.

from pyparsing import Word, alphas
greet = Word(alphas) + "," + Word(alphas) + "!"
hello = "Hello, World!"
print(hello, "->", greet.parseString(hello))


The program outputs the following:

Hello, World! -> ['Hello', ',', 'World', '!']

Here in this program we literally said we want to create a greet parser that's the combination of a Word of alphas, followed by a literal comma, then another Word of alphas, then a literal exclamation point !. That greet parser is only capable of parsing text that can be broken down into the small chunks (parsable parts) we mentioned.

Imagine that in Python you can express the JSON grammar using pyparsing in around 25 lines:

import pyparsing as pp
from pyparsing import pyparsing_common as ppc


def make_keyword(kwd_str, kwd_value):
    return pp.Keyword(kwd_str).setParseAction(pp.replaceWith(kwd_value))


TRUE = make_keyword("true", True)
FALSE = make_keyword("false", False)
NULL = make_keyword("null", None)

LBRACK, RBRACK, LBRACE, RBRACE, COLON = map(pp.Suppress, "[]{}:")

jsonString = pp.dblQuotedString().setParseAction(pp.removeQuotes)
jsonNumber = ppc.number()

jsonObject = pp.Forward()
jsonValue = pp.Forward()
jsonElements = pp.delimitedList(jsonValue)
jsonArray = pp.Group(LBRACK + pp.Optional(jsonElements, []) + RBRACK)
jsonValue << (
    jsonString | jsonNumber | pp.Group(jsonObject) | jsonArray | TRUE | FALSE | NULL
)
memberDef = pp.Group(jsonString + COLON + jsonValue)
jsonMembers = pp.delimitedList(memberDef)
jsonObject << pp.Dict(LBRACE + pp.Optional(jsonMembers) + RBRACE)

jsonComment = pp.cppStyleComment
jsonObject.ignore(jsonComment)

A more formal definition, according to Wikipedia: in computer programming, a parser combinator is a higher-order function that accepts several parsers as input and returns a new parser as its output. In this context, a parser is a function accepting strings as input and returning some structure as output, typically a parse tree or a set of indices representing locations in the string where parsing stopped successfully. Parser combinators enable a recursive descent parsing strategy that facilitates modular piecewise construction and testing. This parsing technique is called combinatory parsing.

So today, we will try to create a small parser combinator (parsec) library in Nim with the following expectations.

What to expect

parsing just one letter


  let aParser = charp('a')
  let bParser = charp('b')
  echo $aParser.parse("abc")
  # <Right parsed: @["a"], remaining: bc >
  echo $bParser.parse("bca")
  # <Right parsed: @["b"], remaining: ca >

parsing a letter followed by another letter

  let abParser = charp('a') >> charp('b')
  echo $abParser.parse("abc")
  # <Right parsed: @["a", "b"], remaining: c >

parsing one or the other

  let aorbParser = charp('a') | charp('b')
  echo $aorbParser.parse("acd")
  # <Right parsed: @["a"], remaining: cd >

  echo $aorbParser.parse("bcd")
  # <Right parsed: @["b"], remaining: cd >

parsing abc

  let abcParser = parseString("abc")
  echo $abcParser.parse("abcdef")
  # <Right parsed: @["abc"], remaining: def >

parsing many a's

  let manyA = many(charp('a'))
  echo $manyA.parse("aaab")
  # <Right parsed: @["a", "a", "a"], remaining: b >

  echo $manyA.parse("bbb")
  # <Right parsed: @[], remaining: bbb >

parsing at least 1 a

  let manyA1 = many1(charp('a'))
  echo $manyA1.parse("aaab")
  # <Right parsed: @["a", "a", "a"], remaining: b >
  echo $manyA1.parse("bbb")
  # <Left Expecting '$a' and found 'b'>

parsing many digits

  let manyDigits = many1(digit)
  echo $manyDigits.parse("1234")
  # <Right parsed: @["1", "2", "3", "4"], remaining:  >

parsing digits separated by comma

  let commaseparatednums = sep_by(charp(',').suppress(), digit)
  echo $commaseparatednums.parse("1,2,4")
  # <Right parsed: @["1", "2", "4"], remaining:  >

Creating the greet parser from pyparsing

  let greetparser = word >> charp(',').suppress() >> many(ws).suppress() >> word
  echo $greetparser.parse("Hello,   World")
  # <Right parsed: @["Hello", "World"], remaining:  >

Multiply parser

  echo $(letter*3).parse("abc")
  # <Right parsed: @["a", "b", "c"], remaining:  >

parsing UUIDs

  let uuidsample = "db9674c4-72a9-4ab9-9ddd-1d641a37cde4"
  let uuidparser =(hexstr*8).map(smashtransformer) >> charp('-') >> (hexstr*4).map(smashtransformer) >> charp('-') >>  (hexstr*4).map(smashtransformer) >> charp('-') >> (hexstr*4).map(smashtransformer) >> charp('-') >> (hexstr*12).map(smashtransformer)
  echo $uuidparser.parse(uuidsample)
  # <Right parsed: @["db9674c4", "-", "72a9", "-", "4ab9", "-", "9ddd", "-", "1d641a37cde4"], remaining:  >

parsing recursive nested structures (ints or list of [ints or lists])

  var listp: Parser
  var valref = (proc():Parser =digits|listp)
  listp = charp('[') >> sep_by(charp(',').suppress(), many(valref)) >> charp(']')
  var valp = valref()

  echo $valp.parse("1")
  # <Right parsed: @["1"], remaining:  >
  echo $valp.parse("[1,2]")
  # <Right parsed: @["[", "1", "2", "]"], remaining:  >
  echo $valp.parse("[1,[1,2]]")
  #<Right parsed: @["[", "1", "[", "1", "2", "]", "]"], remaining:  >

Implementation

The idea of a parser is something that accepts text and returns either a success (with info about what got consumed from the text and what is still remaining) or a failure with some error message:

                                        -> success( parsed, remaining)
stream of characters ->  [  parser  ]
                                        -> failure (what went wrong message)

and

  • if it was a failure we abort the parsing operation
  • if it was a success we try to continue with the next parser

that's the basic idea

imports

import strformat, strutils, sequtils

well, we will be dealing with lots of strings and lists, so probably we need strformat, strutils, and sequtils

Either and its friends

Either is one of my favorite types, a bit more advanced than a Maybe or Option, because it allows returning a specific error message instead of just none, which gives us no idea what went wrong.

data Either a b = Left a | Right b

Either a success Right with data of type b or failure Left with data of type a

We can try to describe it in Nim as a variant object as follows:

type 
  EitherKind = enum
    ekLeft, ekRight
  Either = ref object
    case kind*: EitherKind 
    of ekLeft: msg*: string
    of ekRight: val*: tuple[parsed: seq[string], remaining:string]

Here we defined the kind EitherKind, which can be ekLeft or ekRight. On the variant Either we define msg for the error message in case the kind is ekLeft, and val, the "parsed and remaining" parts of the input string, in case of ekRight.

proc map*(this: Either, f: proc(l:seq[string]):seq[string]): Either =
  case this.kind
  of ekLeft: return this
  of ekRight: 
    return Either(kind:ekRight, val:(parsed:f(this.val.parsed), remaining:this.val.remaining))

Here we define the map function for the Either type: what happens when we apply a function to an Either. It should unwrap the data in a Right, pass it to the function, and return a new (transformed) Either; in case of a Left we return the same Either.

proc `$`*(this:Either): string =
  case this.kind
  of ekLeft: return fmt"<Left {this.msg}>"
  of ekRight: return fmt("<Right parsed: {this.val.parsed}, remaining: {this.val.remaining} >")

Converting the Either to a string by defining the $ proc.

proc `==`*(this: Either, other: Either): bool =
  return this.kind == other.kind

Here we define a simple comparison for Either objects (basically checking if both are ekRight or both are ekLeft).
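
A quick sketch of the helpers defined so far:

let ok = Either(kind: ekRight, val: (parsed: @["a", "b"], remaining: "c"))
let bad = Either(kind: ekLeft, msg: "Expecting 'x' and found 'y'")
echo ok.map(proc(l: seq[string]): seq[string] = mapIt(l, it.toUpperAscii))
# <Right parsed: @["A", "B"], remaining: c >
echo bad.map(proc(l: seq[string]): seq[string] = mapIt(l, it.toUpperAscii))
# <Left Expecting 'x' and found 'y'>
doAssert ok == ok          # same kind
doAssert not (ok == bad)   # different kinds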

Now to the parsers

We could exploit objects to hold some more instructions for the parser. Typically parser combinators are about composing higher-order functions together to parse text; we can emulate that with objects and take a shortcut.

type
  Parser = ref object
    f* : proc(s:string):Either
    suppressed*: bool

Here we define a Parser type that

  • holds a function f (real parser that consumes the input string and returns an Either)
  • suppressed a flag to indicate we want to ignore the parsed text

suppressed can be very useful in ignoring/discarding dashes in a string (e.g uuid text) or commas in a CSV row.

proc newParser(f: proc(s:string):Either, suppressed:bool=false): Parser =
  var p = Parser()
  p.suppressed = suppressed
  p.f = f 
  return p

A helper to create a new parser from a real parsing function proc(s:string):Either and a suppressed flag.

proc `$`*(this:Parser): string =
  return fmt("<Parser:>")

allowing our parser to convert to string by defining $

proc parse*(this: Parser, s:string): Either =
  return this.f(s)


  • parse is a function that receives a string and executes the underlying parser f on that input string, returning an Either.
proc map*(this:Parser, transformer:proc(l:seq[string]):seq[string]):Parser =
  proc inner(s:string):Either = 
    return this.f(s).map(transformer)
  return newParser(f=inner)

Here we define a map function to transform the underlying parser's result once executed. The idea is that we return a new parser wrapping an inner function that carries all the transformation knowledge (if it's a bit tricky, move on to the next part).

proc suppress*(this: Parser): Parser = 
    this.suppressed = true 
    return this

Here we set the suppressed flag to true; it should be used as in the examples mentioned in the what to expect section:

  let commaseparatednums = sep_by(charp(',').suppress(), digit)
  echo $commaseparatednums.parse("1,2,4")

Here we will be interested in the digits 1 and 2 and 4 and want to ignore the commas in the input string, so that's what suppress helps us with.

Parsing a single character

Now we would like to be able to parse a single character and get the parsed value and the remaining characters:

  let aParser = charp('a')
  echo $aParser.parse("abc")
  # (parsed a, remaining bc)

proc charp*(c: char): Parser =
  proc curried(s:string):Either =
      if s == "":
          let msg = "S is empty"
          return Either(kind:ekLeft, msg:msg)
      else:
          if s[0] == c:
            let rem = s[1..<s.len]
            let parsed_string = @[$c]
            return Either(kind:ekRight, val:(parsed:parsed_string, remaining:rem))
          else:
              return Either(kind:ekLeft, msg:fmt"Expecting '${c}' and found '{s[0]}'")
  return newParser(curried)

here we defined a charp function that takes a character to parse and returns a Parser only capable of parsing that character

  • if the string is empty, we return a Left Either (ekLeft kind)
  • if the string starts with the character we want to parse, we return a Right Either with that character and the rest of the string; otherwise we return a Left
  • all of the parsing logic is defined in a function curried that we pass to newParser

Sequential parsers

Now we would like to parse a then b sequentially. That's possible if we create a parser for a and a parser for b and try to (parse a andThen parse b); that statement can be converted to a proc andThen(parserForA, parserForB). Let's define that function:


  let abParser = charp('a') >> charp('b')
  echo $abParser.parse("abc")
  # parse: [a, b] and remaining c

proc andThen*(p1: Parser, p2: Parser): Parser =
    proc curried(s: string) : Either= 
        let res1 = p1.parse(s)
        case res1.kind
        of ekLeft:
          return res1
        of ekRight:
            let res2 = p2.parse(res1.val.remaining) # parse remaining chars.
            case res2.kind
            of ekLeft:
              return res2
            of ekRight:
                let v1 = res1.val.parsed
                let v2 = res2.val.parsed
                var vs: seq[string] = @[]
                if not p1.suppressed: #and _isokval(v1):
                    vs.add(v1) 
                if not p2.suppressed: #and _isokval(v2):
                    vs.add(v2)
                return Either(kind:ekRight, val:(parsed:vs, remaining:res2.val.remaining)) 
            return res2

    return newParser(f=curried)


proc `>>`*(this: Parser, rparser:Parser): Parser =
  return andThen(this, rparser)

Straightforward:

  • if parsing with p1 fails, we fail with Left
  • if parsing with p1 succeed, we try to parse with p2
    • if parsing p2 works the whole thing returns Right
    • if it doesn't we return Left
  • we create the >> function for a more pleasant API

alternate parsing

Now we want to try parsing with one parser or the other, and fail only if both can't parse.

  let aorbParser = charp('a') | charp('b')
  echo $aorbParser.parse("acd")
  echo $aorbParser.parse("bcd")

Here we want to be able to parse a or b


proc orElse*(p1, p2: Parser): Parser =
    proc curried(s: string):Either=
        let res = p1.parse(s)
        case res.kind
        of ekRight:
          return res
        of ekLeft:
          let res = p2.parse(s)
          case res.kind
          of ekLeft:
            return Either(kind:ekLeft, msg:"Failed at both")
          of ekRight:
            return res

    return newParser(curried)

proc `|`*(this: Parser, rparser: Parser): Parser =
  return orElse(this, rparser)


  • if we are able to parse with p1 we return with Right

  • if we can't parse with p1 we try to parse with p2

    • if we succeed we return a Right
    • if we can't we return failure with Left
  • we define more pleasing syntax |

Parsing n times

We want to apply a parser n times, so instead of doing this

threetimesp1 = p1 >> p1 >> p1

we want to write

threetimesp1 = p1*3

proc n*(parser:Parser, count:int): Parser = 
    proc curried(s: string): Either =
        var mys = s
        var fullparsed: seq[string] = @[]
        for i in countup(1, count):
            let res = parser.parse(mys)
            case res.kind
            of ekLeft:
                return res
            of ekRight:
                let parsed = res.val.parsed
                mys = res.val.remaining
                fullparsed.add(parsed) 

        return Either(kind:ekRight, val:(parsed:fullparsed, remaining:mys))
    return newParser(f=curried)
    

proc `*`*(this:Parser, times:int):Parser =
       return n(this, times) 

  • here we try to apply the parser count times
  • we create the * function for a more pleasant API

parsing letters, upper, lower, digits

Now we want to be able to parse any alphabetic letter or digit with something like

let letter = anyOf(strutils.Letters)
let lletter = anyOf({'a'..'z'})
let uletter = anyOf({'A'..'Z'})
let digit = anyOf(strutils.Digits)

for digit we can do

digit = charp("1") | charp("2") | charp("3") | charp("4") ...

but it definitely looks much nicer with the anyOf syntax. The idea is that we create parsers for the elements in the set and orElse between them.

Here we define choice


proc choice*(parsers: seq[Parser]): Parser = 
    return foldl(parsers, a | b)

proc anyOf*(chars: set[char]): Parser =
    return choice(mapIt(chars, charp(it)))

  • choice is a generic function over any seq of Parsers that tries them in order
  • anyOf takes a set of characters, converts each one to a parser using mapIt and the charp parser generator (from a character to a Parser), and hands them to choice (see the example below)
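
For example, using the digit parser defined above:

  let digit = anyOf(strutils.Digits)
  echo $digit.parse("1x")
  # should print something like: <Right parsed: @["1"], remaining: x>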

Parsing a complete string

Now we would like to parse the complete string "abc" out of "abcdef". Instead of writing

abcParser = charp('a') >> charp('b') >> charp('c')

we want an easier syntax that gets expanded to that:

abcParser = parseString("abc")

The parseString parser:

proc parseString*(s:string): Parser =
  var parsers: seq[Parser] = newSeq[Parser]()
  for c in s:
    parsers.add(charp(c))
  var p = foldl(parsers, a >> b)
  return p.map(proc(l:seq[string]):seq[string] = @[join(l, "")])
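
For example:

  let abcParser = parseString("abc")
  echo $abcParser.parse("abcdef")
  # should print something like: <Right parsed: @["abc"], remaining: def>

The map at the end joins the individually parsed characters into a single string.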

Optionally

What if we want to mark a parser as optional? For example, when parsing a greeting it's valid not to have a "!": both "Hello, World" and "Hello, World!" should be parsable with the same greet parser.

We probably want to define it like this

  let greetparser = word >> charp(',').suppress() >> many(ws).suppress() >> word >> optionally(charp('!'))
  echo $greetparser.parse("Hello,   World")
  #<Right parsed: @["Hello", "World", ""], remaining:  >
  echo $greetparser.parse("Hello,   World!")
  # <Right parsed: @["Hello", "World", "!"], remaining:  >

Notice optionally(charp('!')): it marks the parser as optional.


proc optionally*(parser: Parser): Parser =
    let myparsed = @[""]
    # succeed without consuming anything: keep the input as the remaining string
    let nonproc = proc(s:string):Either = Either(kind:ekRight, val:(parsed:myparsed, remaining:s))
    let noneparser = newParser(f=nonproc)
    return parser | noneparser

What we basically do is build a parser (noneparser) that always succeeds without consuming anything; we first try the passed parser, and if it fails we fall back to noneparser.
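
For example (with noneparser keeping the input as its remaining string, as above):

  let optBang = optionally(charp('!'))
  echo $optBang.parse("!rest")
  # should print something like: <Right parsed: @["!"], remaining: rest>
  echo $optBang.parse("rest")
  # should print something like: <Right parsed: @[""], remaining: rest>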

many: zero or more

Here we try to apply a specific parser as many times as we can, e.g. parse as many a's as possible from a string.

proc parseZeroOrMore(parser: Parser, inp:string): Either = #zero or more
    let res = parser.parse(inp)
    case res.kind
    of ekLeft:
      let myparsed: seq[string] = @[]
      return Either(kind:ekRight, val:(parsed:myparsed, remaining:inp))
    of ekRight:
      let firstval = res.val.parsed
      let restinpafterfirst = res.val.remaining
      # echo "REST INP AFTER FIRST " & restinpafterfirst
      let res = parseZeroOrMore(parser, restinpafterfirst)
      case res.kind
      of ekRight:
        let subseqvals = res.val.parsed
        let remaining = res.val.remaining
        var values:seq[string] = newSeq[string]()
        # echo "FIRST VAL: " & firstval
        # echo "SUBSEQ: " & $subseqvals
        values.add(firstval)
        values.add(subseqvals)
        return Either(kind:ekRight, val:(parsed:values, remaining:remaining))
      of ekLeft:
        let myparsed: seq[string] = @[]
        return Either(kind:ekRight, val:(parsed:myparsed, remaining:inp))

proc many*(parser:Parser):Parser =
    # many never fails: zero matches still produce a Right
    proc curried(s: string): Either =
        return parseZeroOrMore(parser, s)
    return newParser(f=curried)
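
For example:

  let manyA = many(charp('a'))
  echo $manyA.parse("aaab")
  # should print something like: <Right parsed: @["a", "a", "a"], remaining: b>
  echo $manyA.parse("bbb")
  # should print something like: <Right parsed: @[], remaining: bbb>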

many1: one or more

proc many1*(parser:Parser): Parser =
    proc curried(s: string): Either =
        let res = parser.parse(s)
        case res.kind
        of ekLeft:
          return res
        of ekRight:
          return many(parser).parse(s)
    return newParser(f=curried)

  • Here we try to parse once manually (see the example below)
    • if that parse succeeds we invoke the many parser on the original input
    • if it fails we return a Left
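
For example:

  let oneOrMoreA = many1(charp('a'))
  echo $oneOrMoreA.parse("aaab")
  # should print something like: <Right parsed: @["a", "a", "a"], remaining: b>
  echo $oneOrMoreA.parse("bbb")
  # should fail with a Left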

Separated by parser

Most of the time the data we parse is separated by something: a comma, a space, a dash, etc., and we would like a simple way to parse it without hassling with the separators. To make something like that possible:

  let commaseparatednums = sep_by(charp(',').suppress(), digit)
  echo $commaseparatednums.parse("1,2,4")

proc sep_by1*(sep: Parser, parser:Parser): Parser =
    let sep_then_parser = sep >> parser
    return (parser >> many(sep_then_parser))

proc sep_by*(sep: Parser, parser:Parser): Parser =
  let myparsed = @[""]
  # like optionally: succeed without consuming anything if sep_by1 fails
  let nonproc = proc(s:string):Either = Either(kind:ekRight, val:(parsed:myparsed, remaining:s))
  return (sep_by1(sep, parser) | newParser(f=nonproc))


How does that work? Let's take the example a,b,c: we want to describe it as sepBy commaParser letterParser. So how do we mentally reason about the parts? Well, we start by parsing a letter, then a comma, then a letter, then a comma, then a letter.

So it's a letter followed by (separator >> letter) many times, which is exactly this line in sep_by1:

    return (parser >> many(sep_then_parser))
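
Running the commaseparatednums example above (with the comma parser suppressed) should print something like:

  <Right parsed: @["1", "2", "4"], remaining: >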

Surrounded By

If we want to make sure something is surrounded by something else, e.g. single quotes or |, we can use the surroundedBy helper

  let sur3pipe = surroundedBy(charp('|'), charp('3'))
  echo $sur3pipe.parse("|3|")
  #<Right parsed: @["|", "3", "|"], remaining:  >

Implementation should be as easy as

let surroundedBy = proc(surparser, contentparser: Parser): Parser =
    return surparser >> contentparser >> surparser

Between

between is more generic than surroundedBy because the opening and closing parsers can be different, e.g. (3)

  let paren3 = between(charp('('), charp('3'), charp(')') )
  echo paren3.parse("(3)")
  # <Right parsed: @["(", "3", ")"], remaining:  >

Implementation should be as easy as

let between = proc(p1, p2, p3: Parser): Parser =
    return p1 >> p2 >> p3

Parsing recursive nested structures

Next, we have a very simple language where you can have

  • chars
  • list of chars or list

It's going to be very easy to express

  var listp: Parser
  var valref = (proc():Parser =letters|listp)

  listp = charp('[') >> sep_by(charp(',').suppress(), many(valref)) >> charp(']')
  var valp = valref()

Here's probably the tricky part. Let's think about it for a second; we want to say:

lang = list | letter and list = list of lang. We need to delay one of them to be able to reference it, and delaying usually means "convert it to a function", or at least have its info declared already. That's what we do with var listp: Parser: we just give Nim the info that there will be a listp at some point. For the lang parser we create a function that returns list | letter (that's the reason you will find some of our parsec parsers accept a proc in some of their overloads instead of just a Parser). Once we are done with the declaration of listp, we can invoke the valref function to get an actual, usable parser.
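
To give an idea of how such an overload can work, here is a hypothetical sketch (not necessarily how nim-parsec spells it) of a many overload that accepts a proc and only invokes it at parse time:

proc many*(parserFactory: proc(): Parser): Parser =
    proc curried(s: string): Either =
        # the factory is only invoked when we actually parse,
        # so listp can be assigned after this parser is built
        return many(parserFactory()).parse(s)
    return newParser(f=curried)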


  var inps = @["a", "[a,b]", "[a,[b,c]]"]
  for inp in inps:
      echo &"inp : {inp}"
      let parsed = valp.parse(inp)
      if parsed.kind == ekRight:
          let data = parsed.val.parsed
          echo inp, " => ", $parseToNimData(data)

We only need a function parseToNimData to do the conversion; typically we should be able to enhance the usage of map to convert the data to the desired type at parse time.

Before defining parseToNimData, let's define the language elements first

  # recursive lang ints and list of ints or lists
  type 
    LangElemKind = enum
        leChr, leList
    LangElem = ref object
        case kind*: LangElemKind 
        of leChr: c*: char
        of leList: l*: seq[LangElem]
  

  proc `$`*(this:LangElem): string =
    case this.kind
    of leChr: return fmt"<Char {this.c}>"
    of leList: return fmt("<List: {this.l}>")

  proc `==`*(this: LangElem, other: LangElem): bool =
    # note: this only compares the element kind, not the contents
    return this.kind == other.kind

We state that our language elements can be one of two LangElemKind values:

  • leChr: for characters
  • leList: for lists of any language element (see the example below)
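
For example, the list [a,b] could be represented as:

  let example = LangElem(kind: leList,
                         l: @[LangElem(kind: leChr, c: 'a'),
                              LangElem(kind: leChr, c: 'b')])
  echo $example
  # <List: @[<Char a>, <Char b>]>

Now we can define parseToNimData: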

  proc parseToNimData(data: seq[string]) : LangElem =
    result = LangElem(kind:leList, l: @[])
    let dataIsList = data[0][0] == '['
    for el in data:
      var firstchr = el[0]
      if firstchr.isAlphaAscii():
        var elem = LangElem(kind:leChr, c:firstchr)
        if dataIsList == false:
            return elem
        else:
             result.l[result.l.len-1].l.add(LangElem(kind:leChr, c:firstchr))

      elif firstchr == '[':
          result.l.add(LangElem(kind:leList, l: @[]))

parseToNimData is a simple transformer that builds a tree from the successfully parsed strings, converting them into LangElems. This is how the final result looks:

inp : a
@["parsed data: ", "a"]
a => <Char a>
inp : [a,b]
@["parsed data: ", "[", "a", "b", "]"]
[a,b] => <List: @[<List: @[<Char a>, <Char b>]>]>
inp : [a,[b,c]]
@["parsed data: ", "[", "a", "[", "b", "c", "]", "]"]
[a,[b,c]] => <List: @[<List: @[<Char a>]>, <List: @[<Char b>, <Char c>]>]>

That's it!

More resources on the topic

Thank you for reading! Please feel free to open an issue or a PR to improve the content of Nim Days or the very young nim-parsec :)