Learning from the Masters

// read integer from argument str
static Int argToInt(String *sp)
{
    Int num = 0;
    while (isascii((*sp)[1]) && isdigit((*sp)[1])) {
        num = 10*num + (*(++*sp) - '0');
    }
    return num;
}
A C function that converts a run of ASCII digits to an integer.
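Int and String are Gofer's own type names, so the function does not compile by itself. Here is a minimal sketch for trying it out, assuming Int is int and String is char * (typedefs I am supplying, not copied from Gofer's headers). parse_after_flag is a hypothetical helper, and the isascii fallback macro is my own addition for compilers that hide the POSIX name.

```c
#include <ctype.h>   /* isdigit; isascii where the platform exposes it */

/* isascii comes from POSIX, not ISO C. Fall back to a local macro
 * if <ctype.h> did not provide one (an assumption of this sketch). */
#ifndef isascii
#define isascii(c) (((c) & ~0x7f) == 0)
#endif

/* Assumed stand-ins for Gofer's type names. */
typedef int   Int;
typedef char *String;

/* Same body as above. *sp must point one character BEFORE the first digit. */
static Int argToInt(String *sp)
{
    Int num = 0;
    while (isascii((*sp)[1]) && isdigit((*sp)[1])) {
        num = 10*num + (*(++*sp) - '0');
    }
    return num;
}

/* Hypothetical helper: parse the digits that follow a one-letter flag,
 * e.g. "h4096k" -> 4096, and report through *rest where parsing stopped.
 * arg must be a non-empty string. */
Int parse_after_flag(char *arg, char **rest)
{
    String p = arg;          /* p points at the flag letter */
    Int n = argToInt(&p);    /* leaves p on the last digit consumed */
    *rest = p + 1;           /* first character after the digits */
    return n;
}
```

Calling parse_after_flag on the buffer "h4096k" returns 4096 and leaves rest pointing at the 'k'. With no digits after the flag it returns 0 and rest points at the character right after the flag letter.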

Some interesting facts about ASCII to integer conversions:

  1. argToInt is a C function that converts a run of ASCII digits to the integer they spell. I pulled it from commonui.c, a source file of the Gofer language. The original author is Mark P. Jones.

  2. ASCII digits do not translate directly to the numbers that they represent, but their encodings are adjacent. Characters '0' through '9' map to integers 48 through 57, so subtracting '0' from any digit character gives that digit's value: '9' - '0' is 57 - 48, which is 9. To convert the sequence "123", start an accumulator at 0 and, for each character, multiply the accumulator by the base and add the character's difference from '0'. That computes (10*((10*1)+2))+3, which is the integer 123.

  3. String is not a built-in C type. In this code it is an alias for char *, a pointer to the first character of an array of characters. The string literal "123" is the array {'1', '2', '3', '\0'}, where '\0' is a special character denoting the end of a string. Without '\0' many string-related functions in the C standard library would not know where to stop. This is frightening considering how C has no bounds checking for arrays. A function might just keep reading past the end of the array into whatever memory lies beyond it.

Side Note: C characters are actually one-byte integers. Whether plain char is signed or unsigned depends on the compiler and the target platform.
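The digit arithmetic and the role of the terminator can be checked directly. This is a minimal sketch, assuming an ASCII-compatible character set. digits_to_int and count_chars are hypothetical names for illustration, not functions from Gofer or the C library.

```c
#include <stddef.h>  /* size_t */

/* Fold a run of leading decimal digits into an int:
 * "123" -> ((0*10 + 1)*10 + 2)*10 + 3. No overflow check; a sketch only. */
int digits_to_int(const char *s)
{
    int num = 0;
    while (*s >= '0' && *s <= '9') {
        num = 10 * num + (*s - '0');  /* in ASCII, '7' - '0' is 55 - 48 */
        s++;
    }
    return num;
}

/* A hand-rolled strlen: the loop stops only because of the '\0' terminator. */
size_t count_chars(const char *s)
{
    size_t n = 0;
    while (s[n] != '\0') {
        n++;
    }
    return n;
}
```

With these, digits_to_int("123") yields 123 and count_chars("123") yields 3, while sizeof "123" is 4 because the literal carries its terminator. Handing count_chars an array that lacks the terminator is undefined behavior, which is exactly the hazard described above.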

  1. C arrays and pointers are closely related, though not identical. In most expressions an array's name converts to a pointer to its first element, so a C string is usually handled as a pointer to the head of an array of characters. Increment the pointer by 1 and you get a pointer to the next character in the array. Indexing is defined in terms of the same arithmetic: a[i] means *(a + i), which scales the index by the size of the element type and adds it to the head of the array. Walking a pointer through an array instead of indexing it is an old-school optimization that could be faster with older compilers. Modern optimizing compilers typically generate the same code for both.

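  2. The function argToInt takes a pointer to a string, which makes it a pointer to a pointer to characters. The absurd expression *(++*sp) dereferences sp to reach the pointer into the underlying string, increments that pointer by 1 in place, then dereferences the result to grab the character it now points at. Because the increment is written through sp, the caller's pointer ends up on the last digit consumed. The loop peeks at (*sp)[1] before it advances, so *sp has to start one character before the first digit. The character is then processed by the conversion algorithm I described earlier: multiply the accumulator by 10 and add the character's difference from '0'.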

  3. The expression (*sp)[1] is a pointer dereference followed by an indexing operation. (*sp) is wrapped in parentheses because pointer dereferencing, for some weird reason, has lower precedence than indexing. Without the parentheses, *sp[1] would parse as *(sp[1]): index sp first, then dereference.

  4. I am forever grateful to the programmers who came before me. I've learned a lot by studying their code. The C code I just described inspired me to create the Scheme function fold-digits-by. I am grateful, but there are many parts of C that make me want to crawl out of my skin.
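The pointer facts in the list above can be checked in a few lines of C. This is a minimal sketch. same_element, next_char and peek_char are hypothetical names I made up for illustration, not Gofer functions.

```c
#include <stddef.h>  /* size_t */

/* Indexing is defined in terms of pointer arithmetic: a[i] means *(a + i).
 * Returns 1 when both spellings name the same element. */
int same_element(const char *a, size_t i)
{
    return &a[i] == a + i && a[i] == *(a + i);
}

/* *(++*sp) spelled out one step at a time. */
char next_char(char **sp)
{
    char *p = *sp;   /* dereference sp to get the pointer into the string */
    p = p + 1;       /* advance it by one character */
    *sp = p;         /* store it back, so the caller sees the advance */
    return *p;       /* dereference again to read the character */
}

/* (*sp)[1] peeks at the character after the current one without moving *sp.
 * Writing *sp[1] instead would parse as *(sp[1]). */
char peek_char(char **sp)
{
    return (*sp)[1];
}
```

Given a buffer holding "x12" and a pointer p to its start, peek_char(&p) reads '1' and leaves p alone, while next_char(&p) also reads '1' but moves p forward by one, just as *(++*sp) does.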

;; Given a radix, returns a function that folds a sequence
;; of character digits into their numerical equivalent.
;; Radix determines base.
(define fold-digits-by
  (lambda (radix)
    (lambda (xs)
      (fold-left (lambda (sum x)
                   (+ (* radix sum) (digit->integer x)))
                 0
                 xs))))
A Scheme function that folds a sequence of character digits into its numerical equivalent.
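For comparison, the same fold can be carried back to C with the radix as an ordinary parameter, since C has no closures to return. This is a minimal sketch under a few assumptions: digits above 9 are the letters a to z in either case, the character set is ASCII-compatible, and digit_value merely stands in for the Scheme helper digit->integer, whose definition is not shown here. Both C names are hypothetical.

```c
/* Assumed digit mapping: '0'-'9' -> 0-9, 'a'-'z' or 'A'-'Z' -> 10-35.
 * Returns -1 for anything else. Relies on ASCII letter ordering. */
int digit_value(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'z') return c - 'a' + 10;
    if (c >= 'A' && c <= 'Z') return c - 'A' + 10;
    return -1;
}

/* Fold the digits of s in the given radix (2 to 36). Stops at the first
 * character that is not a valid digit in that radix. No overflow check. */
long fold_digits_by(int radix, const char *s)
{
    long sum = 0;
    for (; *s != '\0'; s++) {
        int d = digit_value(*s);
        if (d < 0 || d >= radix) {
            break;
        }
        sum = radix * sum + d;   /* same step as (+ (* radix sum) digit) */
    }
    return sum;
}
```

Unlike the Scheme version, this one stops quietly at the first character that is not a digit in the chosen radix. That is a design choice of the sketch, not something the Scheme code specifies.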