Thursday, 8 September 2011

Linked lists in JSON

It's time for another Javascript post! This time it will be about efficiency, which is not only a subject dear to my heart, but abnormally important for web development in Javascript. Javascript is a slow, interpreted language that runs on the browsers on people's phones, so we need tightly focused code to avoid slowing down things even more. More specifically, efficient data structures. Even more specifically, efficient lists.

A sea of lists

A list is conceptually simple: A finite ordered collection of things. There. That's it. So, why does there exist nine gadzillion different variations of it? Efficient programming boils down to choosing the right data structures for the task at hand. Even the simple list has many different implementations, each suited to tackling a different set of problems.

The most common list implementations are:
  1. Arrays. Just data stacked end-to-end in a long line. Efficient for random access reading, but costly to manipulate. To remove an element in the middle, you will have to traverse over the full list and shift all successive elements 1 index back.
  2. Generators. Aka "lazy lists". These are just code that emit one data element at a time. Very efficient memory usage, as elements are conjured only when needed. Horribly inefficient for lookup.
  3. Linked lists. A loose collection of nodes, where each node contains one data element, and has a reference to the next node in the list. Inefficient for lookup, but very efficient for manipulation and insertion.
In Javascript, and consequently JSON, only Arrays are natively supported. But that's fine, because Generators and Linked lists are easy to implement yourself. And that's what we're going to do today!

We're going to create a function for transforming Javascript arrays into JSON linked lists.

The Array to JSON Linked list translator

Array.prototype.toJSONll = function (idx) {
  idx = idx || 0;
  if (idx >= this.length) return "null";
  idxS = idx.toString();
  valueS = JSON.stringify(this[idx]);
  nextS = this.toJSONll(idx + 1);
  return ("{idx: " + idxS +
          ", value: " + valueS +
          ", next: " + nextS + "}");
  var idxS = null;
  var valueS = null;
  var nextS = null;
};
For convenience, it is implemented as an extension method to the standard Javascript Array class. This means you can include this snippet in your code, and voila! All your existing arrays now have a .toJSONll() method!

I've embedded the code into this actual webpage, so if you want to test it out, type a valid Javascript array into the text field below and push the button:
You can't test this, because you don't have Javascript enabled.  SYNTAX ERROR
That's all for this time! I might make my next post about encoding lazy lists in JSON. What do you guys think?

Tuesday, 2 August 2011

Variable closure in Javascript

First off; why should we care about Javascript? Silverlight rocks hard and is vastly superior to Javascript both in expressiveness and in standard library toolbox brevity.

A few years into the future, the above question will hopefully be answered with "what the hell is Javascript"?

Unfortunately, Silverlight adoption has not spread to all devices and platforms. For normal development, that is not a problem. Your customers are very likely using Silverlight ready platforms anyway.

The real world, however, is a cruel mistress. Business decisions - or local law - may force you to make your application available to as wide an audience as possible. In those cases, you need to use Javascript. And, when using Javascript, you need to be very careful what you do. It is a terribly undisciplined language with a weak standard library, inefficient arrays, variable types, no proper object orientation, and really weird variable scope.

This short post will focus on the latter problem: keeping your variables properly scoped.

Take the following example code, which is basically an extension method to the String type. It is a handy utility to count the number of characters in a string. If you implement it in your project, the expression "this".count() will evaluate to 4:

String.prototype.count = function (idxParam) {
  idx = idxParam || 0;
  returnValue = 0;
  if (this[idx])
    returnValue = 1 + this.count(idx + 1);
  return returnValue;
};

While this code works perfectly (try it!), there is one small problem with it. Can you spot it? Look closely. Closer. No? Unless you are a seasoned Javascript developer, the problem is impossible to spot and will likely remain invisible for a long time.

Remember I said Javascript has weird variable scope? In the above code, the variables have not been properly closed off and will leak out of the function and into the global namespace. After executing the above function, the variables "idx" and "returnValue" will be available to all functions everywhere. This may not sound like a real problem, but if some other functions also use variables with those names, they will re-use the above variables instead of receiving their own. Needless to say, this is likely to cause problems down the road. (Especially if you introduce multithreading!)

To check this, try to run the following code:
"Hello world!".count();

if (typeof(idx) !== "undefined")
  alert("Oopsie!");

So, how do we fix this? You need to "close" the variables off, so to speak. By nulling them out, you signal to Javascript that they are no longer needed when you exit the function:

String.prototype.count = function (idxParam) {
  idx = idxParam || 0;
  returnValue = 0;
  if (this[idx])
    returnValue = 1 + this.count(idx + 1);
  return returnValue;
  // close variables
  var idx = null;
  var returnValue = null;
};

// Do a quick check that "idx" is now scoped.
idx = undefined;
"Hello world!".count();
if (typeof(idx) === "undefined")
  alert("It works!");

Simple as that. Note that the var keyword is vital here. Without it, the last two lines would be normal program statements, and consequently would not be executed as they occur after the return statement. The magic var statements, on the other hand, are always executed.

The above hack works everywhere - I've tested in IE, Opera, Mozilla, Chrome, in Windows and on mobile devices. In the world of Javascript, this means it is practically defined behaviour. I recommend you adopt this as a best practice and remember it whenever you happen to step in some Javascript.

Saturday, 25 September 2010

A Better Way of Logging

The software for this post is freely available for download: Just the logger assembly, or logger source with test app.

Is logging necessary?

During development, logging is normally seen as an unnecessary burden, because you are working in a debugger. When the program crashes, your editor immediately pops up, highlighting the cause of the crash. When the program is behaving strangely, you can pause the program and inspect all relevant values.

Not so for software that's been sent to production, where there is no debugger. When the program crashes, all the user sees is a standard windows crash dialog and a sudden lack of application. When the program exhibits strange behaviour, the user can do nothing to find out why. Short of releasing all programs as source and requiring that all users run your software through a debugger, you need to know the state of the program prior to the crash.

This is the role of the logger - like ship captains of yore, it writes down every tedious little happening in a sequential list - a log. You can read the log afterwards and deduce a snapshot of the application state at any particular moment during its execution. When you get a report that the program has crashed or is doing things wrongly, you no longer have to desperately try to reproduce the problem yourself. Just ask for the log file, print it out, and sit down to deduce the error over a cup of tea like a ninja.

In any sufficiently interesting software, logging is crucial.

Is logging hard?

In principle, no. In practicle, yes. There are many pitfalls that can render a log file less than useful when disaster finally strikes.

The problems that most commonly haunt conventional log files are:
  1. Buffering. Usually just a big performance gain when writing to disk IO, buffering potentially means that the last hundred log entries have not yet been written to the log file when your program crashes. A log containing just some application startup info is about as useful as a paper cup in a storm.
  2. Multithreading. Multiprocessing your application is a good way to speed it up by several orders of magnitude. For logging, however, it means that two threads can try to log at the same time. A usual result of this is interleaved log entries, i.e. thread A logs "socket created" while thread B logs "shutting down database" at the same time, possibly resulting in the misleading log entries "shutting down socket" and "created database".
  3. File access rights. Access rights is serious business in Windows Vista and above. For logging, this means that if a user has decided to launch your application with "Run as administrator" the first time, the log file will be created with "Administrator" as owner. When running the application later as a normal user, he is now unable to append to the log file, and the program either crashes or doesn't log anything.
  4. Growing log file. After a while, your log file will grow to an improper size. It is foremost bad form to fill up your users' disks without their consent, but it is also difficult to read huge log files. If they are particularly large, they might even crash your Notepad when you open them.
There are many ways to at least partially solve these problems, but all of them are cumbersome and result in an overly large code base for a simple logger. Look at the sheer size of the code base of any freely available logging framework if you don't believe me.

But let us step back for a moment and look at the bigger puzzle. Do these problems have a common cause? Can we avoid these problems altogether?

Yes, the common cause is that you are operating with files. And yes, we can avoid these problems altogether with...

A Better way of Logging

To download the logger assembly, click here.

The solution is so obvious that I was half surprised no-one had thought of - or at least implemented - it before; log to the Windows Registry.

Let us look at how it affects our earlier problems:
  1. Buffering. Windows registry storage, and consequently disk IO buffering, is done on the OS level instead of application level. This means that no buffers are lost when your program crashes.
  2. Multithreading. The Windows registry access functions are already thread-safe, so you don't have to do anything.
  3. File access rights. No file means no file rights headaches! Nuff said.
  4. Growing log file. Since we no longer operate on a monolithic blob of bytes, but rather on discrete registry key values, we can remove entries willy-nilly. This means we can simply delete the oldest entries in O(1) time to keep the total log size within an approved threshold.
However, a credo of mine is that a brilliant idea is worthless without execution. Before even writing the opening paragraph of this blog post, I created a fully working, fully documented registry logger. It is of course open sourced and freely available for anyone to use.

The entire logger source code looks like this:

// Logger.cs
using System.Collections.Generic;
using Microsoft.Win32;

namespace RegistryLogger {
    public class Logger {
        private RegistryKey logRoot;
        private int headIdx;
        private int tailIdx;
        private int maxLogLength;

        /// <summary>
        /// Create logger object with Max Log Length of 10.
        /// </summary>
        /// <param name="appName">Name of your App.</param>
        public Logger(string appName) : this(appName, 10) { }

        /// <summary>
        /// Creates key HKCU\SOFTWARE\LOGS\[appName].
        /// Logger is now ready for use.
        /// </summary>
        /// <param name="appName">Name of your App.</param>
        /// <param name="maxLogLength">Limit total log storage to this many
        /// entries.</param>
        public Logger(string appName, int maxLogLength) {
            this.maxLogLength = maxLogLength;
            string lrp = "SOFTWARE\\LOGS\\" + appName;
            RegistryKeyPermissionCheck pc =
                RegistryKeyPermissionCheck.ReadWriteSubTree;
            logRoot = Registry.CurrentUser.OpenSubKey(lrp, pc);
            if (logRoot == null) {
                logRoot = Registry.CurrentUser.CreateSubKey(lrp, pc);
                headIdx = tailIdx = 0;
                logRoot.SetValue("headIdx", headIdx);
                logRoot.SetValue("tailIdx", tailIdx);
            } else {
                headIdx = (int)logRoot.GetValue("headIdx");
                tailIdx = (int)logRoot.GetValue("tailIdx");
            }
        }

        /// <summary>
        /// All currently stored log entries.
        /// </summary>
        public string[] Entries {
            get {
                List<string> area = new List<string>();
                for (int i = tailIdx; i < headIdx; i++)
                    area.Add((string)logRoot.GetValue(i.ToString()));
                return area.ToArray();
            }
        }

        /// <summary>
        /// Prepend a timestamp to the entry and log it as string value
        /// HKCU\SOFTWARE\LOGS\[appName]\[idx]
        /// Each new log entry gets an unique idx.
        /// </summary>
        /// <param name="entry">Entry to log.</param>
        public void Log(string entry) {
            string toLog = System.DateTime.Now.ToString() + ": " + entry;
            logRoot.SetValue(headIdx++.ToString(), toLog);
            logRoot.SetValue("headIdx", headIdx);
            DeleteOldEntries();
        }

        private void DeleteOldEntries() {
            while (headIdx - tailIdx > maxLogLength)
                logRoot.DeleteValue(tailIdx++.ToString());
            logRoot.SetValue("tailIdx", tailIdx);
        }
    }
}

Remember how I said the buffering, multithreading, access rights workarounds for log files were cumbersome? Look how elegant we can keep the code when those concerns are noncerns.

A simple test run

To download the logger source and the test application, click here.

To test this, I created a GUI interface in WPF to let you read and write log entries.

Let us enter a few sample entries and submit them to the logger:


Clicking the "Log Above Lines" button causes each line to be logged as a separate log entry. To see what we have logged, we can click "View Log":


Ok, looks good. But I bet you a can of tomatoes that I can guess what you are thinking right now. "How does this look in the registry?"

Let's have a look:


The entries are all there! We can also see the current logger state: The headIdx value says that the next entry to be logged will get idx 7, while the tailIdx value says that the oldest stored entry has idx 0. Since we can deduce every stored entry idx by starting from tailIdx and iterating up to headIdx, this is literally equivalent to a linked list. As anyone who has at least passed close by a computer science department knows, linked lists make for easy FIFO queues, and FIFO queues makes for easy constant-size buffers: If the size is too large, move the tail pointer and delete the previous tail element. Rinse, lather, repeat until size is within threshold.

For demonstration purposes, I've limited the logger to 10 entries. We now have 7 log entries, so let us see how it behaves when we add another 4:


Click "Log", which should take us to exactly 11 total entries, 1 above the threshold. Click "View":


As we can see, the first log entry "Application starting." is now deleted! Just to be sure, let us check the registry:


The entry with idx 0 is now gone, and the tailIdx is 1. Success!

In closing

Hopefully, you now know enough to implement proper logging in your applications. Whether you decide to use my logger code (free open source, as always) or roll your own registry logger, I will be very interested in hearing how it works out for you.

If you have any questions, don't hesitate to use the comments section.

Friday, 2 July 2010

How I landed a high-paying new job

I am happy to announce that my business card now reads "Senior Web Security Engineer"!

It's been about a month since my interview now, and my new employer (kept anonymous for oblivious reasons) have kindly given me permission to blog about it given that I anonymized it all. Since a few of you are probably out hunting for new jobs at the moment - and who wouldn't in this economy - keep reading for some pointers on how to pass a job interview.

The resumé and application

These are straightforward, so I won't go into detail. Just remember to put the prospective employer's name all over the application, proofread twice, and use glossy paper for the cover letter. Bonus points for delivering in person. If applicable, flirt heavily with whomever is receiving the application - getting someone on your good side can only help you.

I was lucky here, as the HR lady was obviously impressed by my shark tooth ear studs. She even called me up the same afternoon! After answering a few simple math questions to show I wasn't completely useless (this is called a "phone screening", but I think "phony screening" would be a funnier name), I was invited to...

The interview

I had a couple of days to prepare for this, so I had time to hit the sun parlor a few times and even found the time to get my teeth bleached. I also memorized the current "hot words" in cryptology in the evenings, and reimplemented the "rot thirteen" cipher a million times until I could write the C# code blindfolded, doing my best to stay sharp.

The first part of the interview was supposed to be with me, my prospective manager Bob (not his real name of course - it's just an alibi), and the HR lead Cyril (likewise not his real name). However, since Bob got unexpectedly ill that day and couldn't come, it was just me and Cyril. Cyril had little to no programming knowledge, but Bob had been thoughtful enough to fax him with a list of techincal questions. The list was quite long, and I spent about half an hour talking about the different cryptology disciples; cryptography, ciphering, stenography, cracking, then a bit about security mechanisms like web code obfuscation, reverse engineering, litigation and firewalls. To top it off, I went on quite a bit about my experience as a team leader and my thoughts on keeping a team efficient. All in all I did pretty good, and Cyril was very impressed by my broad knowledge. He gave me the green flag, and after a pleasant free lunch I was on to the last part...

The programming test

Oh my World this was unexpectedly harder than I thought it would be. Here I was expecting to be given a variant of the fizz test, but instead I was tasked to implement a complete cipher program from scratch in only two hours! Fortunately for you, my company has replaced that specific test with a new one now, so I'm allowed to post the entire thing here!

You have TWO HOURS for this excercise. After completion, you must print out your
solution, sign your name in the top right corner of every page, get a stamp of
authenticity from HR, and deliver it to your manager.

You will work with a well-known encryption procedure. This is called a substitution
cipher, in which each letter of the original message (the "plaintext") is replaced
by a different letter to generate the coded message (the "ciphertext").

To simplify matters further, we will restrict our plaintext to use only the upper
case letters A to Z, together with the space and newline characters; furthermore,
space and newline characters will be left unchanged by the encryption.

In each case, the replacement letter will be a fixed number of letters later in the
alphabet compared to the original letter (and "wrapping around" back to A again
after Z, as necessary). The number of letters to count forward will be called the
encryption key.

Thus, with a key of 3, we would replace A by D, B by E and so on, with, finally, W
being replaced by Z, X by A, Y by B and Z by C.

This simple kind of cipher is sometimes called a "Caesar Cipher" after Julius
Caesar, who is said to have used it for secure battlefield messages.

You are required to develop a program which will input a series of lines of
characters, and encode them with a Caesar cipher; as each line of plaintext is
input, the corresponding line of ciphertext should be output. It is up to you to
devise a mechanism whereby the user can indicate that all lines have been input,
and to implement the program in such a way that it will then terminate.

The key for the encoding should be "hard coded" into your program (as opposed to be
being read in at run time). Thus, changing the key will involve modifying the
source program and recompiling.

You may find it handy to be able to run your program on an input file (and
producing an output file) rather than working only with the keyboard and screen.
Use command line redirection to achieve this.

A command line gui? What is this, the 1970-ies? (A side note: This was the main reason I later persuaded them to replace the test with a new one.)

And the algorithm was pretty muddily explained. Replace X, Y corresponding with A, B and whatnot? That couldn't possibly be any less unclear, couldn't it not? Not giving up hope, I read the text carefully over and over again until finally it hit me: This is just a less general version of my Rot Thirteen cipher! I was in luck. After quickly typing in the memorized code, I had about an hour to figure out how to get the blooming thing to work in a terminal and add some polish. I got time to add some XML in there and even add a few design patterns. I was, and still am, pretty pleased with the end result:

using System;

namespace ConsoleApplication1 {
    // Factory pattern
    class CesarFactory {
        private int key;

        public CesarFactory(int key) {
            this.key = key;
        }

        public ICipher Instantiate() {
            return new CesarAdapter();
        }
    }

    // Interface pattern
    interface ICipher {
        string Encipher(string plaintext);
    }

    // Adapter pattern
    class CesarAdapter : ICipher {
        string ICipher.Encipher(string plaintext) {
            return Cesar.Encode(plaintext);
        }
    }

    // Cesar cipher
    class Cesar {
        public static string Encode(string input) {
            char[] chars = input.ToCharArray();
            for (int i = 0; i < input.Length; i++) {
                int lowerRotation = Rotate(chars[i] - 'a', 3, 26);
                int upperRotation = Rotate(chars[i] - 'A', 3, 26);
                if (chars[i] > 'a') chars[i] =
                    (char)('a' + lowerRotation);
                else if (chars[i] > 'A') chars[i] =
                    (char)('A' + upperRotation);
            }
            return new string(chars);
        }

        private static int Rotate(int c, int delta, int max) {
            c += delta;
            while (c > max) c -= max;
            return c;
        }
    }

    // Entry point.
    class Program {
        // Command line gui.
        static void Main(string[] args) {
            // Create Cesar factory
            CesarFactory factory = new CesarFactory(3);
            // Xml header
            Console.Out.Write("<XML version=\"1.0\"><Cesar>");
            // Read as long as there is something in the input buffer.
            while (Console.BufferHeight > 0) {
                // Read from Console.
                int x = Console.Read();
                // Encode input as string.
                string plaintext = new string(new char[]{(char)x});
                // Get cipher.
                ICipher cesar = factory.Instantiate();
                // Encipher plaintext.
                string ciphertext = cesar.Encipher(plaintext);
                // Output ciphertext to Console.
                Console.Write(ciphertext);
            }
            // Xml footer
            Console.Out.Write("</XML></Cesar>");
        }
    }
}

At the 1 hour 56 minute mark I sent my submission to the printer, got it stamped by HR, signed it, and had it faxed home to Bob for review. Exhausted but hopeful, I was now free to go and relax while the review took its course.

A few days later, the phone call I had been waiting so anxiously for finally came. I was accepted as Senior Web Security Engineer for a big bank, earning over £60k a year!

All it took was dedication, commitment, and a sense of purpose. And let's not be modest anymore (job interview tip here!) - a heavy dose of talent. How awesome is that?

Thursday, 3 June 2010

Introduction to Ciphers

As some of my readers may already know, I am hard at work refreshing my cryptology skills for a job interview (tomorrow! Wish me luck!) at one of the big banks in my area. While reading, I noticed that one of the most basic building blocks of cryptology, namely ciphers, are always explained in an overly complex manner. The textbooks, evidently written by skilled mathematicians, focus entirely on the math-y stuff like bit shifting and modular arithmetic, instead of explaining what ciphers are. Thus, I've decided to write a short primer to help other aspiring cryptologers:

Definition

Ciphers are the science of writing hidden messages in such a way that no one, apart from the sender and intended recipient, suspects the existence of the message.

In practice, this is usually done by encrypting a key onto a message, yielding a ciphertext. Since the ciphertext never looks anything like the original message, nobody can suspect its existence unless you tell them and give them the key.

Let's give an example to illuminate things.

The RotThirteen cipher

The RotThirteen algorithm is the simplest there is; increase each letter thirteen times, and roll over to 'a' if you pass 'z'. Since the alphabet contains twice thirteen characters, this means that 'XYZ' is encrypted to 'KLM', and 'KLM' encrypted again becomes 'XYZ'! This property means we don't need a separate decryption step, saving time and effort. In fact, the entire algorithm can be wrapped in just one single function (and a helper function).

class Ciphers {
    public static string RotThirteen(string input) {
        char[] chars = input.ToCharArray();
        for (int i = 0; i < input.Length; i++) {
            int lowerRotation = Rotate(chars[i] - 'a', 13, 26);
            int upperRotation = Rotate(chars[i] - 'A', 13, 26);
            if (chars[i] > 'a') chars[i] = (char)('a' + lowerRotation);
            else if (chars[i] > 'A') chars[i] = (char)('A' + upperRotation);
        }
        return new string(chars);
    }

    private static int Rotate(int c, int delta, int max) {
        c += delta;
        while (c > max) c -= max;
        return c;
    }
}

Quickly creating a simple host application and testing this:



Click "RotThirteen":



Click "RotThirteen" again:



Working as expected!

Usages

In contrast to most ciphers, RotThirteen doesn't use secret keys. Instead, all messages are encoded with the public key 13. While RotThirteen is not very secure, its simplicity and elegance makes it popular where only simple encryption is needed. For instance:
  • As an anti-spam measure.
  • A convenient way to write both a puzzle and vgf nafjre in the same place, without giving away the answer prematurely.

Tuesday, 18 May 2010

Staying Efficient

The software in this post is open source and free for download:
Staying Efficient

As leader of a small team of developers, I am constantly working to keep inefficiency out of our workflow. In this post, I will write about the things I have learned about efficiency during the past years and what you can do to help your team Stay Efficient.

In all teams, small inefficiencies will trickle in over time and gradually slow your down. If these inefficiencies are left unchecked for long enough, your team will be reduced to a dead weight; everybody will put in their due 8 hours per day, but little or nothing comes of it. While these kinds of teams can thrive for years in big corporations (and indefinitely in government), it is a situation professional programmers are desperate to avoid.

But how can we avoid it? Since this "disease" is so widespread, it must be hard to combat, right? That was a rhetorical question - it isn't hard at all. Let's divide team efficiency into three categories and have a detailed look at them:
  1. Rock Stars - A rare and super efficient species of team, these guys seem to get everything done in less time than you thought possible. Need a new button in the application? Done!
  2. Code Monkeys - Steady but slow. Sometimes an hour can go by with no new code written. They will estimate 1 week to put an extra button in the application, and usually spend at least 2 weeks.
  3. Dead Weights - They will spend 2 months putting the wrong button in the wrong place. You need the sheltering wings of a large corporation to keep these teams alive; they sure as hell aren't going to make anything you can sell for a profit.
Like I hinted at above, teams tend to start out as 1 but sink gradually down to 2, or even 3 given time. It will be useful to discuss how this happens to understand how to avoid it.

From type 1 to type 2: This is always the result of complacency. (You're #1, so why try harder?) Each new inefficiency is small and unnoticeable: Build times rise from 30 seconds to 50 seconds, lunches extend from 30 minutes to 35 minutes, etc, until build times are at 40+ minutes and lunches are an hour long.

From type 2 to type 3 takes a lot of neglect, and hence a lot of time for the neglect to happen in. Most teams will tend to stay at type 2 for years because they do inefficiency cleanup in fits and bursts; they always seem to have one or two developers that get fed up with the mounting inefficiencies and take initiative to remove the worst ones. Eventually, though, these developers will get burnt out and leave. After that, you'll reach type 3 within the year.

By now it should be obvious to anyone how we can avoid this sad spiral:

Continuous vigilance

Just spending a small amount of work each week - even when (especially when!) you know your team is already super effective - to detecting and combating inefficiencies is enough to keep your team lean and mean indefinitely.

I speak from experience here. About two years ago I was given lead of a type 2 team, and have been keeping a constant vigil for big and small inefficiencies to squash. Since then, we've not only kept from slowing down, we've even increased efficiency by such a huge margin that we are sporadically acting as a type 1 team!

Go for the big things first

But! Fight your temptation to rush ahead and spend valuable time fixing the first inefficiency you can think of.

You should start out removing the most glaring inefficiencies first, then go for the small ones later. This should really go without saying, but you'd be surprised at how often managers mull around fixing small inefficiencies before facing up to the real problems. They buy faster hardware, replace old office chairs, move the team closer together, upgrade development tools, etc. without even taking a look at what the team members actually do at work. The fastest development hardware in the world doesn't help much if your developers spend half their workday not working.

While "not working" sounds like a serious neglect of duty, it is usually not due to a conscious choice by the developers. They may stop to think about a problem, and just zone out and not notice that their mind has been wandering all over the place for the past half hour before getting back to work. I call this "Attention Drift" and it is the single biggest cause of inefficiency in development teams around the globe. Therefore, it is the thing you should combat first.

Combating Attention Drift

The reason so few teams are willing to acknowledge - and much less deal with - Attention Drift is because developers are normally conflict averse. Developers will not realize their own problem; they have convinced themselves they are spending the wasted time thinking about problems and code design, and will argue forcefully and angrily that spending ten minutes staring into space instead of coding is not only acceptable, but useful. (There seem to be a taboo for insinuating that developers are just plain goofing off.) For this reason, most companies unofficially allow for extended periods of inattention as fringe benefits.

Ignoring Attention Drift like that is completely acceptable and might work out for your company, but it will not allow your team to rise to type 1 efficiency. In my opinion, every company needs a few type 1 teams to be truly successful. How can you grab new customers if you can't cater to their needs fast enough? How can you enter and dominate a new market if your competitors develop a product quicker than you?

Also, doesn't it just appear more honest to pay people when and only when they're actually working to create value for you?

Now that it's obvious that you should do something about it, the next question is: What can you actually do? Constantly walking around and poking at people who are not coding works, but it is tiresome and frightfully boring. You don't want to burn yourself out just by getting other people to do the work they're paid for.

For me, the words "repetitive" and "boring" ring a bell labeled "automation". "Poking at people who are not coding" is a task that is surprisingly easy to automate!

The "Productivity Tool" application

Simply put, I needed a program that could remind me and my team whenever our work slowed down for too long. Measuring work is simple: for a developer, working is coding, and coding is typing. So in more concrete terms, I needed a program to measure my average typing speed for the past, say, 60 seconds.

Thanks to fellow programming blogger Stephen Toub's excellent posts on hardware-level keyboard hooks and likewise mouse gesture hooks, this was a breeze to program. In just a few days, everyone on my team had this little app running as a minimized service:


(Icons courtesy of  MySiteMyWay.)

Since it is usually minimized, it wouldn't be very useful unless I integrated it with the Windows 7 desktop bar:


The service icon on the left there will always display the current activity level, so you can always keep an eye on it to see how you're doing. If it tends to stay in the low end, you are going at a very slow pace; get another coffee or beverage to pick you up.

And, to really combat the dreaded attention drift, we need to wake someone up when they appear to have zoned out completely:


This should be enough to keep honest people busy, but to be really sure nobody feel tempted to ignore the application, I decided to add a web service interface to it: Anyone connecting to port 1337 on the computer running the app will get to see your current efficiency rating in percent (0-100%).

Want to try it out? Download the installer here.

The Employee Overview tool

Constantly accessing the 6 different web services is, of course, too tiresome to work for long. We need a monitor tool:


Really simple, this program allows you to enter in the names and computer names of your team members, and it will read their current performance from their web service and display it as a simple ascii face: ":-(" means they're going slow, ":-)" means they're doing well, "???" means their app is not running, etc.

Sounds useful? You can download the installer here.

Does it work? Can it work?

Yes! It works wonders. With attention drift practically eliminated, our code base has doubled in just the few months we've been running with this setup!

Despite the fact that it is obviously working, some people keep saying "But it can't work! Your developers can cheat this system!" And, yes, that is true. It can also be detected and handled. As an actual example, my team member Name Withheld had been complaining a lot on the new tool, so I naturally kept a close eye on him. By observing closely from afar, I noticed Name Withheld hitting buttons at random while he was reading gossip on Hacker News (it's like Reddit, but uglier). I gave him a public reprimand via our mailing list, followed by a stern warning to everyone that such behavior could lead to termination of employment, and complaints and resistance would lead to a negative mark on the next employee evaluation. There have been no further incidents, and the general attitude towards the new tools has steadily improved. Several team members have even approached me and told me they think it's a great idea!

The code

To go in detail about all of it is a bit too much for one blog post, and this one is getting big enough already, so I'll leave you to download the code from here and try it out for yourself.

In further posts, I will talk about how the code works, specifically:
  • How to create and access a web service.
  • How to initiate hardware keyboard hooks. 
  • Integrating with the Windows 7 desktop toolbar.
  • Making a balloon popup warning. 
  • Utilizing Linear Memory on the overview tool for safe memory handling.
  • And probably more!
That's it for this time. Thanks for reading, and stay tuned for more!

Sunday, 25 April 2010

Safe memory usage in C# part 2/2

This is the second and last part of my article on safe memory usage in C# (you can find the first post here). While the previous post discussed the "why", this post will cover the "how".

As promised, this is where I will show you how to create, use and retrieve complex real-world structures whose data live entirely in the LinearMemory.  If you are of the impatient type, you can download the code and play around with it a bit before reading further. The code is open source, which means you can do whatever you want with it.

Representation of data structures

Given the nature of the LinearMemory, it is obvious that each data structure will have to be something very similar to C structs, i.e. a queue of fixed-size data like:

struct Apple {
    int Color; // always 2 bytes
    float Weight; // always 4 bytes
} // 6 bytes total

This is, however, not very object-oriented. What we really want to do is to use classes, but not actual objects. Objects are allocated on the heap, while we want some sort of object-like behaviour where the data are entirely contained in the LinearMemory.

If you're thinking "static class", well done!  A static class in which every method takes a LinearMemory base pointer (address of the start of the data structure) in addition to their "normal" parameters will fit our needs nicely. Each of these static classes will be tailored towards one specific data structure, and will contain the necessary logic to create, modify and access the values of the data structure. There is nothing innovative about this syntax, it is actually how normal objects methods are accessed in IronPython and similar languages already!

Avoiding serialization

Given that we already have string-to-byte and byte-to-string functions, using XML as object representation springs to mind. After all, C# has excellent support for serializing and deserializing objects to and from XML via System.Xml.Serialization.XmlSerializer. Using that, we now have everything we need to store any data in LinearMemory. The problem is, of course, that we cannot use the data without deserializing it. Since we need to allocate heaped memory to store the deserialized object before using it, we have lost all the safety we're working so hard to get!

Serialization is a no-go; we need to write functions to read and manipulate live data inside the LinearMemory. For the new structures I intend to introduce, we only need to extend the LinearMemoryTools a little so it can interpret integers in addition to strings. Each number is simply encoded in 5 big-endian bytes:

//PARTIAL LinearMemoryTools.cs
/// <summary>
/// Access the bytes of a int.
/// </summary>
/// <param name="i">The int.</param>
/// <returns>The bytes.</returns>
public static byte[] IntBytes(int i) {
    if (i > int.MaxValue) throw new OverflowException("int too big!");
    bool positive = false;
    if (i > 0) positive = true;
    i = Math.Abs(i);
    byte[] bytes = new byte[5];
    bytes[0] = (byte)(i / 16777216);
    i = i % 16777216;
    bytes[1] = (byte)(i / 65536);
    i = i % 65536;
    bytes[2] = (byte)(i / 256);
    i = i % 256;
    bytes[3] = (byte)(i);
    bytes[4] = positive ? (byte)'P' : (byte)'N';
    return bytes;
}

/// <summary>
/// Retrieve a int from memory.
/// </summary>
/// <param name="ptr">Ptr to int.</param>
/// <param name="len">Must be 5.</param>
/// <returns>The int.</returns>
public static int BytesInt(int ptr, int len) {
    int i = 0;
    i += 16777216 * LinearMemory.mem[ptr + 0];
    i += 65536 * LinearMemory.mem[ptr + 1];
    i += 256 * LinearMemory.mem[ptr + 2];
    i += LinearMemory.mem[ptr + 3];
    if (LinearMemory.mem[ptr + 4] == (byte)'N')
        i = -i;
    return i;
}

Note that the second parameter to BytesInt could strictly be removed. I decided to leave it in for two reasons: 1) I plan to extend the method to support integers of arbitrary length in the future, and 2) it forces the user to keep the length of the representation in mind.

Enter the structures

To keep motivated when coding this, I decided to create a database that we all dream of administrating sometime, a company! A company needs employees:

//Employee.cs
public static class Employee {
    private const int OFF_NAME____ =  0;
    private const int OFF_NAME_LEN = OFF_NAME____ + 20;
    private const int OFF_SALARY__ = OFF_NAME_LEN + 5;
    private const int OFF_PHONE___ = OFF_SALARY__ + 5;

    public static int Size = OFF_PHONE___ + 5;
    
    public static int Create(string name, int salary, int phone) {
        if (name.Length > 20)
            throw new OverflowException("name too long! max 20 chars");
        int p = LinearMemory.malloc(Size);
        LinearMemoryTools.Insert(p + OFF_NAME____,
                                 LinearMemoryTools.StringBytes(name));
        LinearMemoryTools.Insert(p + OFF_NAME_LEN,
                                 LinearMemoryTools.IntBytes(name.Length));
        LinearMemoryTools.Insert(p + OFF_SALARY__,
                                 LinearMemoryTools.IntBytes(salary));
        LinearMemoryTools.Insert(p + OFF_PHONE___,
                                 LinearMemoryTools.IntBytes(phone));
        return p;
    }
}

Easier than you feared, eh? The object doesn't do much of anything yet, but we'll get to that later. Now for the structure that ties our database together:

//Company.cs
class Company {
    private const int OFF_NAME____ = 0;
    private const int OFF_NAME_LEN = OFF_NAME____ + 20;
    private const int OFF_NUM_EMPL = OFF_NAME_LEN + 5;
    private const int OFF_EMP_PTRS = OFF_NUM_EMPL + 5;

    private static int Size = OFF_NUM_EMPL + 5 + 500;

    public static int Create(string name) {
        if (name.Length > 20)
            throw new OverflowException("name too long! max 20 chars");
        int p = LinearMemory.malloc(Size);
        LinearMemoryTools.Insert(p + OFF_NAME____,
                                 LinearMemoryTools.StringBytes(name));
        LinearMemoryTools.Insert(p + OFF_NAME_LEN,
                                 LinearMemoryTools.IntBytes(name.Length));
        // Start out with 0 employees.
        LinearMemoryTools.Insert(p + OFF_NUM_EMPL,
                                 LinearMemoryTools.IntBytes(0));
        return p;
    }

A bit trickier; you can see we're keeping track of a list of pointers to employees. If pointers confuse you, think of this as a wonderful opportunity to learn how to use them. As programming guru Joel says, real programmers need to understand pointers. To elaborate, what we store in the company structure in offset 30 onwards is just a queue of integers that each represent an address of (=a pointer to!) an employee structure. When looking at the LinearMemory at each of these addresses, we will find a complete working employee data structure.  See? No magic at all.

Accessors, and modifiers, and reports! Oh my!

Note that we keep the structure's offset values private. This is deliberate, since the code that uses the data structure should never need to know the details of the representation. Instead of writing "LinearMemoryTools.BytesInt(employeePtr, Employee.OFFSET_SALARY, 5)" all over the place, we can write "Employee.GetSalary(employeePtr)", which is much more readable. Conversely, we want to modify the object data with "Employee.SetSalary(employeePtr)". These setters and getters give us a tremendous advantage; we can now refer to objects via a single value - the address to the start of the object. Since we can refer to each object by a unique value, we are now fully object-oriented!

The GetXXX methods are called accessors and are coded as follows:

//Employee.cs
public static string GetName(int ptr) {
    i = LinearMemoryTools.BytesInt(ptr + OFF_NAME_LEN, 5);
    return LinearMemoryTools.BytesString(ptr + OFF_NAME____, i);
}

public static int GetPhone(int ptr) {
    return LinearMemoryTools.BytesInt(ptr + OFF_PHONE___, i);
}

public static int GetSalary(int ptr) {
    return LinearMemoryTools.BytesInt(ptr + OFF_SALARY__, i);
}

//Company.cs
public static int GetEmployee(int ptr, int idx) {
    return LinearMemoryTools.BytesInt(ptr + OFF_EMP_PTRS + 5 * idx, 5);
}

public static string GetName(int ptr) {
    i = LinearMemoryTools.BytesInt(ptr + OFF_NAME_LEN, 5);
    return LinearMemoryTools.BytesString(ptr + OFF_NAME____, i);
}

public static int GetNumEmployees(int ptr) {
    return LinearMemoryTools.BytesInt(ptr + OFF_NUM_EMPL, 5);
}

Not too hard.

Now for the modifiers. Modifiers are, as you'd guess, methods that modify the object data. They don't know what's stored already, they store what you tell them to store. We call them SetXXX:

//Employee.cs
public static void SetName(int ptr, string name) {
    if (name.Length > 20)
        throw new OverflowException("name too long! max 20 chars");
    LinearMemoryTools.Insert(ptr + OFF_NAME____,
                             LinearMemoryTools.StringBytes(name));
    LinearMemoryTools.Insert(ptr + OFF_NAME_LEN,
                             LinearMemoryTools.IntBytes(name.Length));
}

public static void SetPhone(int ptr, int phone) {
    LinearMemoryTools.Insert(ptr + OFF_PHONE___,
                             LinearMemoryTools.IntBytes(phone));
}

public static void SetSalary(int ptr, int amount) {
    LinearMemoryTools.Insert(ptr + OFF_SALARY__,
                             LinearMemoryTools.IntBytes(amount));
}

//Company.cs
public static void AddEmployee(int ptr1, int ptr2) {
    // Get number of employees.
    i = LinearMemoryTools.BytesInt(ptr1 + OFF_NUM_EMPL, 5);
    // Add 1 and store new number.
    i += 1;
    LinearMemoryTools.Insert(ptr1 + OFF_NUM_EMPL,
                             LinearMemoryTools.IntBytes(i));
    // Ptr to new employee ptr.
    j = OFF_EMP_PTRS + 5 * (i - 1);
    // Be sure not to write out of object bounds!
    if (j > Size) throw new OverflowException("too many employees!");
    LinearMemoryTools.Insert(ptr1 + j, LinearMemoryTools.IntBytes(ptr2));
}

public static void SetName(int ptr, string name) {
    LinearMemoryTools.Insert(ptr + OFF_NAME____,
                             LinearMemoryTools.StringBytes(name));
    LinearMemoryTools.Insert(ptr + OFF_NAME_LEN,
                             LinearMemoryTools.IntBytes(name.Length));
}

And now we basically have all we need!

To make our life easier during debugging, though, we might want to add reports. These are simply wrappers around the accessors that present the entire data structure in a human-readable format.  They basically serve the same needs as ToString() methods on native data structures:

//Employee.cs
public static void Print(int ptr, TextBox stdout) {
    stdout.Text += "\r\n--Employee: " + GetName(ptr);
    stdout.Text += "\r\n  Salary..: " + GetSalary(ptr);
    stdout.Text += "\r\n  Phone...: " + GetPhone(ptr);
}

//Company.cs
public static void Print(int ptr, TextBox stdout) {
    stdout.Text += "\r\n======= COMPANY =======";
    stdout.Text += "\r\nName: " + GetName(ptr);
    j = LinearMemoryTools.BytesInt(ptr + OFF_NUM_EMPL, 5);
    stdout.Text += "\r\n" + j + " employees:";
    for (i = 0; i < j; i++) {
        k = GetEmployee(ptr, i);
        Employee.Print(k, stdout);
    }
}

Oh my! So, what else do we need for our database? Nothing! You can now create, modify and read every piece of data. Everything on top of that is just a matter of user interface, the nuances of which will have to be a topic for another post.

Making it work

To save you the work of typing in all the code and creating a user interface, I have made a fully working Visual Studio 2008 downloadable solution for you to play with. When you start it up, you will see the mainframe:

From the mainframe, you have access to all implemented subprocesses via the button rack on the top. The large text area accumulates reports from each subprocess you run. The most interesting subprocess button, "Run Company", launches the Company Database interface. With that interface, you can edit the company name, add employees, and edit the Company database's Employee data structures:

You can also launch the visual debugger (available from the "debug" button on the mainframe), where you can view the current state of the LinearMemory. To aid in debugging memory "leaks", the allocated memory is colored differently from the unallocated memory:


In the screenshot above, I have allocated an entire company data structure with lots of empty space saved for pointers to employees, and two employee data structures.

And that's it! I hope the code is self-explanatory. If not, just toss me a comment below, and I'll do my best to enlighten you. :)