Day 12: Implementing Redis Protocol

Today we will implement RESP (REdis Serialization Protocol) in Nim. Hopefully you read Day 2 on bencode data format (encoding/parsing) because we will be using the same techniques.

RESP

From redis protocol page.

Redis clients communicate with the Redis server using a protocol called RESP (REdis Serialization Protocol). While the protocol was designed specifically for Redis, it can be used for other client-server software projects.

RESP is a compromise between the following things:

Simple to implement.
Fast to parse.
Human readable.
RESP can serialize different data types like integers, strings, arrays. There is also a specific type for errors. Requests are sent from the client to the Redis server as arrays of strings representing the arguments of the command to execute. Redis replies with a command-specific data type.

So, basically we have 5 types (ints, strings, bulkstrings, errors, arrays)

What do we expect?

  • able to decode strings into Reasonable structures in Nim
  echo decodeString("*3\r\n:1\r\n:2\r\n:3\r\n\r\n")
  # # @[1, 2, 3]
  echo decodeString("+Hello, World\r\n")
  # # Hello, World
  echo decodeString("-Not found\r\n")
  # # Not found
  echo decodeString(":1512\r\n")
  # # 1512
  echo $decodeString("$32\r\nHello, World THIS IS REALLY NICE\r\n")
  # Hello, World THIS IS REALLY NICE
  echo decodeString("*2\r\n+Hello World\r\n:23\r\n")
  # @[Hello World, 23]
  echo decodeString("*2\r\n*3\r\n:1\r\n:2\r\n:3\r\n\r\n*5\r\n:5\r\n:7\r\n+Hello Word\r\n-Err\r\n$6\r\nfoobar\r\n")
  # @[@[1, 2, 3], @[5, 7, Hello Word, Err, foobar]]
  echo $decodeString("*4\r\n:51231\r\n$3\r\nfoo\r\n$-1\r\n$3\r\nbar\r\n")
  # @[51231, foo, , bar]
  • able to encode Nim structures representing Redis values into RESP
  echo $encodeValue(RedisValue(kind:vkStr, s:"Hello, World"))
  # # +Hello, World
  echo $encodeValue(RedisValue(kind:vkInt, i:341))
  # # :341
  echo $encodeValue(RedisValue(kind:vkError, err:"Not found"))
  # # -Not found
  echo $encodeValue(RedisValue(kind:vkArray, l: @[RedisValue(kind:vkStr, s:"Hello World"), RedisValue(kind:vkInt, i:23)]  ))
  # #*2
  # #+Hello World
  # #:23

  echo $encodeValue(RedisValue(kind:vkBulkStr, bs:"Hello, World THIS IS REALLY NICE"))
  # #$32
  # # Hello, World THIS IS REALLY NICE  

Implementation

Imports and constants

Let's starts with main imports

import strformat, strutils, sequtils,
const CRLF = "\r\n"
const REDISNIL = "\0\0"
  • CRLF is really important because lots of the protocol depends on that separator \r\n
  • REDISNIL \0\0 to represent Nil values

Data types

Again, as in Bencode chapter we will define a variant RedisValue that represents All redis datatypes strings, errors, bulkstrings, ints, arrays


  ValueKind = enum
    vkStr, vkError, vkInt, vkBulkStr, vkArray

  RedisValue* = ref object
    case kind*: ValueKind
    of vkStr: s*: string
    of vkError : err*: string
    of vkInt: i*: int
    of vkBulkStr: bs*: string
    of vkArray: l*: seq[RedisValue]

Let's add $, hash, == procedures


import hashes

proc `$`*(obj: RedisValue): string = 
  result = case obj.kind
  of vkStr : obj.s
  of vkBulkStr: obj.bs
  of vkInt : $obj.i
  of vkArray: $obj.l
  of vkError: obj.err

proc hash*(obj: RedisValue): Hash = 
  result = case obj.kind
  of vkStr : !$(hash(obj.s))
  of vkBulkStr: !$(hash(obj.bs))
  of vkInt : !$(hash(obj.i))
  of vkArray: !$(hash(obj.l))
  of vkError: !$(hash(obj.err))

proc `==`* (a, b: RedisValue): bool =
  ## Check two nodes for equality
  if a.isNil:
      result = b.isNil
  elif b.isNil or a.kind != b.kind:
      result = false
  else:
      case a.kind
      of vkStr:
          result = a.s == b.s
      of vkBulkStr:
          result = a.s == b.s
      of vkInt:
          result = a.i == b.i
      of vkArray:
          result = a.l == b.l
      of vkError:
          result = a.err == b.err

Encoder

Encoding is just converting the variant RedisValue to the correct representation according to RESP

Encode simple strings

To encode simple strings specs says OK should be +OK\r\n


proc encodeStr(v: RedisValue) : string =
  return fmt"+{v.s}{CRLF}"

Encode Errors

To encode errors we should precede it with - and end it with \r\n. So Notfound should be encoded as -Notfound\r\n

proc encodeErr(v: RedisValue) : string =
  return fmt"-{v.err}{CRLF}"

Encode Ints

Ints are encoded :NUM\r\n so 95 is :95\r\n

proc encodeInt(v: RedisValue) : string =
  return fmt":{v.i}{CRLF}"

Encode Bulkstrings

From RESP page

Bulk Strings are used in order to represent a single binary safe string up to 512 MB in length.

Bulk Strings are encoded in the following way:

A "$" byte followed by the number of bytes composing the string (a prefixed length), terminated by CRLF.
The actual string data.
A final CRLF.
So the string "foobar" is encoded as follows:

"$6\r\nfoobar\r\n"
When an empty string is just:

"$0\r\n\r\n"
RESP Bulk Strings can also be used in order to signal non-existence of a value using a special format that is used to represent a Null value. In this special format the length is -1, and there is no data, so a Null is represented as:

"$-1\r\n"
proc encodeBulkStr(v: RedisValue) : string =
  return fmt"${v.bs.len}{CRLF}{v.bs}{CRLF}"

Encode Arrays

To encode an array we do * followed by array length then \r\n then encode each element then end the array encoding with \r\n

  • As we are calling encode we should forward declared it

proc encode*(v: RedisValue) : string 
proc encodeArray(v: RedisValue): string = 
  var res = "*" & $len(v.l) & CRLF
  for el in v.l:
    res &= encode(el)
  res &= CRLF
  return res

So for instance to encode encodeValue(RedisValue(kind:vkArray, l: @[RedisValue(kind:vkStr, s:"Hello World"), RedisValue(kind:vkInt, i:23)] )) The result should be

*2\r\n
+Hello World\r\n
:23\r\n
\r\n

Encode any data type

Here we switch on the passed variant and dispatch the encoding to the reasonable encoder.

proc encode*(v: RedisValue) : string =
  case v.kind 
  of vkStr: return encodeStr(v)
  of vkInt:    return encodeInt(v)
  of vkError:  return encodeErr(v)
  of vkBulkStr: return encodeBulkStr(v)
  of vkArray: return encodeArray(v)

Decoder

Decoding is converting RESP representation into the correct Nim structures RedisValue, Basically the reverse of what we did in the previous chapter

Please note: Basic strategy is Returning the RedisValue and the length of processed characters

Decode simple string

proc decodeStr(s: string): (RedisValue, int) =
  let crlfpos = s.find(CRLF)
  return (RedisValue(kind:vkStr, s:s[1..crlfpos-1]), crlfpos+len(CRLF))

So, Here we are creating RedisValue of kind vkStr of the string between + and \r\n

Decode errors

proc decodeError(s: string): (RedisValue, int) =
  let crlfpos = s.find(CRLF)
  return (RedisValue(kind:vkError, err:s[1..crlfpos-1]), crlfpos+len(CRLF))

Here we are creating RedisValue of kind vkError of the string between - and \r\n

Decode ints

Nums as we said are the values between : and \r\n so we parseInt of the characters between : and \r\n and create RedisValue of kind vkInt with that parsed int.

proc decodeInt(s: string): (RedisValue, int) =
  var i: int
  let crlfpos = s.find(CRLF)
  let sInt = s[1..crlfpos-1]
  if sInt.isDigit():
    i = parseInt(sInt)
  return (RedisValue(kind:vkInt, i:i), crlfpos+len(CRLF))

Decode bulkstrings

Bulkstrings are between $ followed by the string length and \r\n

  • string length == 0: empty string
  • string length == -1: nil
  • string length > 0: string with data

proc decodeBulkStr(s:string): (RedisValue, int) = 
  let crlfpos = s.find(CRLF)
  var bulklen = 0
  let slen = s[1..crlfpos-1]
  bulklen = parseInt(slen)
  var bulk: string
  if bulklen == -1:
      bulk = nil
      return (RedisValue(kind:vkBulkStr, bs:REDISNIL), crlfpos+len(CRLF))
  else:
    let nextcrlf = s.find(CRLF, crlfpos+len(CRLF))
    bulk = s[crlfpos+len(CRLF)..nextcrlf-1] 
    return (RedisValue(kind:vkBulkStr, bs:bulk), nextcrlf+len(CRLF))

Decode arrays

This is the trickiest part is to decode array

  • first we need to get the length between * and \r\n
  • then decode objects array length times, and add them to arr
  • As we are calling decode we should forward declared it
proc decode(s: string): (RedisValue, int)
proc decodeArray(s: string): (RedisValue, int) =
  var arr = newSeq[RedisValue]()
  var arrlen = 0
  var crlfpos = s.find(CRLF)
  var arrlenStr = s[1..crlfpos-1]
  if arrlenStr.isDigit():
     arrlen = parseInt(arrlenStr)
  
  var nextobjpos = s.find(CRLF)+len(CRLF)
  var i = nextobjpos 
  
  if arrlen == -1:
    
    return (RedisValue(kind:vkArray, l:arr), i)
  
  while i < len(s) and len(arr) < arrlen:
    var pair = decode(s[i..len(s)])
    var obj = pair[0]
    arr.add(obj)
    i += pair[1]
  return (RedisValue(kind:vkArray, l:arr), i+len(CRLF))

So this RESP

*2\r\n
+Hello World\r\n
:23\r\n
\r\n

Should be decoded to RedisValue(kind:vkArray, l: @[RedisValue(kind:vkStr, s:"Hello World"), RedisValue(kind:vkInt, i:23)] )

Decode any object

Based on the first character we dispatch to the correct decoder then we skip the processed count in the string to decode the next object.

proc decode(s: string): (RedisValue, int) =
  var i = 0 
  while i < len(s):
    var curchar = $s[i]
    if curchar == "+":
      var pair = decodeStr(s[i..s.find(CRLF, i)+len(CRLF)])
      var obj =  pair[0]
      var count =  pair[1]
      i += count
      return (obj, i)
    elif curchar == "-":
      var pair = decodeError(s[i..s.find(CRLF, i)+len(CRLF)])
      var obj =  pair[0]
      var count =  pair[1]
      i += count
      return (obj, i)
    elif curchar == "$":
      var pair = decodeBulkStr(s[i..len(s)])
      var obj =  pair[0]
      var count =  pair[1]
      i += count
      return (obj, i)
    elif curchar == ":":
      var pair = decodeInt(s[i..s.find(CRLF, i)+len(CRLF)])
      var obj =  pair[0]
      var count =  pair[1]
      i += count
      return (obj, i)
    elif curchar == "*":
      var pair = decodeArray(s[i..len(s)])
      let obj = pair[0]
      let count =  pair[1]
      i += count 
      return (obj, i)
    else:
      echo fmt"Unrecognized char {curchar}"
      break

Preparing commands

In redis, commands are sent as List of RedisValues

so GET USER is converted to *2\r\n$3\r\nGET\r\n$4\r\nUSER\r\n\r\n

proc prepareCommand*(this: Redis, command: string, args:seq[string]): string =
  let cmdArgs = concat(@[command], args)
  var cmdAsRedisValues = newSeq[RedisValue]()
  for cmd in cmdArgs:
    cmdAsRedisValues.add(RedisValue(kind:vkBulkStr, bs:cmd))
  var arr = RedisValue(kind:vkArray, l: cmdAsRedisValues)

  return encode(arr)

nim-resp

That day is based on nim-resp project, and on-going effort to create a redis client in Nim, it supports pipelining feature and all of the previous code. Feel free to send PRs or open issues