Mavnn's blog

Stuff from my brain

Exploring Reactive Extensions

The Reactive Extensions project is "a library for composing asynchronous and event-based programs using observable sequences and LINQ-style query operators". That doesn't immediately give most people an intuitive grasp of exactly what it is - but it's a useful addition to the toolset so we put together a practical for people to experiment with.

At its simplest, RX (as it's called… the NuGet package you're looking for is Rx-Main, obviously!) allows you to create an IObservable object which you can then… erm… observe.
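To make "create something observable, then observe it" a bit more concrete, here is a minimal sketch. It deliberately does not use the Rx package: it only relies on the `Event` type and `Observable` module that ship with FSharp.Core, which work against the same `IObservable` interface. The names `ticks`, `evensDoubled` and `seen` are made up for illustration.

```fsharp
open System

// An F# event is also an IObservable<'T>, so it makes a handy source.
let ticks = Event<int>()

// The observable side: keep the even values and double them.
let evensDoubled : IObservable<int> =
    ticks.Publish
    |> Observable.filter (fun n -> n % 2 = 0)
    |> Observable.map (fun n -> n * 2)

// Observe it: record everything the subscription is handed.
let seen = ResizeArray<int>()
let subscription = evensDoubled |> Observable.subscribe seen.Add

// Push some values through the source.
[ 1 .. 6 ] |> List.iter ticks.Trigger

// Disposing the subscription stops the observation; later values are ignored.
subscription.Dispose()
ticks.Trigger 8

printfn "%A" (List.ofSeq seen)   // [4; 8; 12]
```

Rx itself layers a much larger set of LINQ-style operators on top of the same interface, so treat this as a taste of the shape rather than of the library.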

Persistent Data Structures

In last week's Developer Education session at 15below we had a look at immutable and persistent data structures, and why you'd want to use them.

TL;DR version: are you writing performance critical, real time code? Do you have less memory available than a low end smart phone? No?

Use immutable data types everywhere you can.

The session was inspired by Scott Wlaschin's excellent "Is your programming language unreasonable?" post. If you haven't read it yet, go and do so - it's much better than the rest of this post, and you can always come back here later if you remember.

One of the points that Scott raises is that code written with mutable data structures (ones that you can change after they've been created) is very hard to reason about. In the very literal sense of working out the reason why things happen.
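A small sketch of the difference, using F#'s built-in `Map` (immutable and persistent) next to the .NET `Dictionary` (mutable). The variable names are invented for the example; the point is only that each "change" to the `Map` produces a new version and leaves the old one alone, while the `Dictionary` can be changed behind your back through any other reference to it.

```fsharp
open System.Collections.Generic

// Persistent: every update returns a new map; earlier versions stay valid.
let v1 = Map.ofList [ "apple", 10; "pear", 12 ]
let v2 = v1 |> Map.add "plum" 30       // v1 is untouched
let v3 = v2 |> Map.remove "apple"      // v1 and v2 are untouched

printfn "v1 has %d items, v2 has %d, v3 has %d" v1.Count v2.Count v3.Count

// Mutable: two names, one underlying object.
let shared = Dictionary<string, int>()
shared.["apple"] <- 10
let alias = shared
alias.["apple"] <- 99                  // also changes what 'shared' sees

printfn "shared apple = %d" shared.["apple"]
```

With the `Map`, knowing what `v1` contains only requires reading the line that created it. With the `Dictionary`, you need to know about every piece of code that has touched it since.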

Difficult vs Impossible

Although programming is young and we often don't know much about the "best" way to do things, we're not totally shooting in the dark.

Every so often, you come up against problems that people have investigated in detail, and given programming's mathematical roots this even leads on occasion to a proof about a certain type of system. It would be impossible to keep up with all of the research; but there are a few places where it's very helpful to know about general results.

I'm going to claim here that it makes a big difference to how you handle feature requests both as a developer, and as a business, when you're asked to produce systems which are actually impossible.

Let's take the one that comes up most often in my experience… Consistency in distributed systems.

So - this comes up the moment that somebody (customer, internal stakeholder, whatever) declares that having just a single service running on a single server is just not reliable enough. At some point, something will go wrong - and when it does, the service is a SPOF (single point of failure) and your processes which use it stop.

Unacceptable!

Every product owner, ever

"We must have a cluster!" the service developer is told. "Load balancing!"

"Hmmmm." says the developer to themselves. "Distributed computing. That can get a little tricky. Let's see if I can nail down the actual requirements a bit more."

The Q&A session goes something a bit like this:

  • Dev: So… Let's start easy and assume two nodes for now. How important is it that every write is replicated to both nodes before it's readable?
  • PO: Critical!
  • Dev: And… How important is it that the system stays available when one node is down?
  • PO: Critical!
  • Dev (pausing): Ah… We won't be able to replicate writes at that point - the second node is down.
  • PO: Oh. Right, makes sense - read availability is critical though.
  • Dev: It would make life easier if writes can only be made to one of the two nodes - let's call it master. That OK?
  • PO (thinks for a while): OK. It's not ideal, but we've got a deadline. Go for it.
  • Dev: How about consistency - if Bob writes to the first node, and then immediately reads from the second, is it okay if he gets slightly out of date data?
  • PO: Absolutely not.
  • Dev: OK. Give me a moment.
  • PO: Just a moment - one last thing! This has to be super user friendly to use. So make sure it's completely transparent to the client consumer that they're talking to a cluster.
  • Dev: …right.

Little known to our PO, their requirements are at this point strictly impossible. The impossibility here is a particular edge case; what happens if the "master" node receives a write, sends it to the "slave" to replicate, but then never gets a response. What does it do? Return an error to the client? Well - no. If the slave comes back up, and the replication had been successful before the slave became unavailable, then we'd have an inconsistent history between slave and master.

Does it return a success? Well - no. In that case, we're violating our restriction that every write is replicated before it's considered available to read.

So it has to return something else - a "pending", "this write will probably be replicated some day" response. But that violates the restriction that it shouldn't add any complexity to the consumer. We now have a corner case that the server can't handle, so it has to be passed back to the client.
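One way to see why the third answer is forced on us is to write the cases down as types. This is only a sketch of the reasoning above, not of any real replication protocol, and every name in it (`ReplicaReply`, `WriteResult`, `respond`) is invented for the illustration.

```fsharp
// What the master hears back from the second node (hypothetical names).
type ReplicaReply =
    | Acknowledged   // the second node confirmed it stored the write
    | Refused        // the second node confirmed it did NOT store the write
    | NoReply        // timeout: the write may or may not have been stored

// What the master can honestly tell the client.
type WriteResult =
    | Success
    | Failure
    | Pending        // the extra case the client now has to understand

let respond reply =
    match reply with
    | Acknowledged -> Success
    | Refused -> Failure
    // Success would break "replicated before readable";
    // Failure risks an inconsistent history if the write did land.
    | NoReply -> Pending

[ Acknowledged; Refused; NoReply ]
|> List.iter (fun reply -> printfn "%A -> %A" reply (respond reply))
```

The moment `Pending` appears in the result type, the cluster is no longer transparent to whoever is calling it.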

After this first write, we do have a little bit more flexibility - we can stop accepting new writes until we've heard from the slave that it's back up and available and just throw an error. But we're still left with that first, awkward write to deal with. (Perceptive readers will also realise that this set up actually leaves us less reliable for writes than a single node solution - proofs left as an exercise to the reader).
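For the less patient reader, here is a rough back-of-the-envelope version of that exercise. It assumes each node is up 98% of the time (a figure picked purely for illustration), that the two nodes fail independently, and that a write is only accepted when both nodes are up. None of those assumptions come from a real system.

```fsharp
// Assumed per-node uptime, purely for illustration.
let nodeUptime = 0.98

// A single node accepts writes whenever it is up.
let singleNodeWrites = nodeUptime

// The two node setup only accepts writes when BOTH nodes are up.
let clusteredWrites = nodeUptime * nodeUptime

// Reads only need EITHER node to be up.
let clusteredReads = 1.0 - (1.0 - nodeUptime) * (1.0 - nodeUptime)

printfn "single node writes: %.4f" singleNodeWrites
printfn "two node writes:    %.4f" clusteredWrites
printfn "two node reads:     %.4f" clusteredReads
```

Under those assumptions write availability drops from 98% to roughly 96%, while read availability climbs to nearly 99.96%: the cluster buys reliability for reads by spending it on writes.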

In reality, this impossibility is a subset of the more widely known CAP theorem: a distributed system cannot be always "Consistent" and always "Available" and still behave predictably under network "Partitions". The three terms in CAP have pretty specific meanings - check out a nice introduction at "You Can't Sacrifice Partition Tolerance".

This is the point where reality diverges, Sliding Doors style, depending on what the developer does next. The branches are numerous, but let's have a look at some of the most common. As an aside, I've fallen into pretty much all of these categories at different points.

Option 1: The developer doesn't know this is impossible either

At this point, we end up with a response that goes something along the lines of: "Well - I can do you a temporary solution where we return a pending result in situation x. Bit of a pain; put it on the technical debt register, and we'll sort it out when we have a bit more time."

Or: "I can't think of a completely fool proof solution right now; how about in situation x we return a failure for now. It'll be a bit confusing when a user gets told the write failed, and then it shows up later - but we'll get it sorted before the final release."

Neither of these solutions is wrong, as such: but the building of impossible expectations will inevitably sour the relationship between product owner and developer, and can cause serious business issues if an external customer has been promised impossible results. There may even be direct financial penalty clauses involved.

Option 2: The developer knows it's impossible, and thinks the product owner does too

Here the developer says "Well, I can return a pending result…" and the PO adds mentally "…which is an OK stopgap measure, I'll schedule some time to clean it up later."

This leads to pretty much the same outcomes as "Option 1", except the developer gets an unhealthy injection of smug self-righteousness for knowing that he never promised the impossible. In general, this is not helpful.

Option 3: The developer knows it's impossible, tries to explain… And fails

This is very similar in outcome to Options 1 & 2. Just more frustrating to the developer, especially if the product owner then claims the developer is "negative" or "incompetent".

Option 4: The developer knows it's impossible and explains to the product owner how and why

This is hard on two levels. On the first: the proof of why something can't be done might be genuinely difficult to understand. On the second: it can be hard to work out if you've avoided Option 3, or if people are just nodding and smiling.

We nearly hit one of these scenarios this week; fortunately our QA department spotted the mismatch in expectations (yay QA!). Where things got a bit strange is that it was raised as Option 3: "hey! Can we put a bit more effort in, and make this nicer to use?" At the QAT phase this is much easier to deal with though - you don't have angry customers, commercial agreements and these other bits hanging over your heads (well - not if you're writing an internal service anyway).

What can we take away from all of this?

A few things.

Developers

  1. As a developer, you must know the basics of the domain you're working in. Keep on learning, folks.
  2. You must be able to communicate as a developer. A lot of developers are introverts, myself included. This is not an excuse. Introvert means that you can't recharge around other people, not that you can't talk to them.
  3. You cannot remove your developers from your customer communications, or completely separate commercial proposals and technical evaluation. You must have technical input into your business process, because sometimes it isn't a question of how much time you spend, how well you design or how skilled a developer you assign to the problem: it might just be impossible.

"Product Owners" (whatever your actual job title is)

  1. Listen to your developers, and pay attention to the wording. If they say something is impossible (not hard, not delayed) check you understand why.
  2. Be careful how you define the business problem to your developers. If you layer on too many technical restrictions you may end up specifying a problem that is unsolvable - while your developer may be able to suggest something that meets the business criteria without falling foul of technical (or more importantly mathematical) limitations to what is possible.
  3. If you place a technical requirement ("it must be clustered - no single points of failure!") make sure you understand the technical trade offs that you are imposing. This may take a long time. Alternatively, and preferably, rephrase your requirement to be your actual business requirement ("We promised 98% uptime - what's your design to make sure it happens?").
  4. You must be able and willing to say "no" to a customer when they ask for something impossible. You can offer alternatives and workarounds - but don't promise the impossible. It will come back, and it will hurt you.

Keeping Up With the Latest Hammer

Making Sure Your Developers Keep Developing

Software development is a strange profession, mostly because it's so young and because the tools are changing so fast. I've not been an established enough craftsman in any other trade to know whether other professions are moving as quickly these days, but at least in my imagination once a carpenter learns to use a hammer, it doesn't get discontinued after 2 years and the hammer taken off the market. Or a new hammer released that hammers nails ten times as fast, but has a different shaped handle and you have to hammer sideways instead of down.

We don't even seem to be able to decide whether it's a craft or a science. You can earn Computer Science degrees - but then well known software professionals choose titles like Software Craftsman.

Despite all of our claims of best practice and shared knowledge, it largely boils down to: developers don't know what they're doing yet. We're a new profession, and we're still learning - not just as individuals, but as a profession.

This means that both as individual developers, and as software houses - if we stop learning, we sink. The competitive advantage of keeping up with what's happening in the industry so outweighs the cost of doing the research that it would be foolish not to. Because while we might not yet know the right way to do things, we're still definitely finding better ways to do things.

So: how do we keep up to date, as individuals and as companies?

Modelling Inheritance With Inheritance

This post is part of the F# Advent Calendar 2014, which is stuffed full of other interesting posts. Go have a read!

Note: This post is epic in length. If you just want to see the final resulting script of much silliness, skip straight to the conclusion!

Note 2: If you just want to see an example of a sane generated type provider, the code from my FPDays tutorial is a much better bet.

Note 3: There is a lot of code below. If you're viewing this on a desktop, I suggest collapsing the sidebar to the right otherwise you'll have a lot of horizontal scroll bars. If you're on a mobile device, you might want to bookmark for later.

So… I've been playing with generated (not erased) type providers for a bit, and meaning to write something up about them. Most of the documentation out there is for erased type providers, and to be honest they have a lot of advantages in terms of performance.

But they also have two fundamental limitations:

  • You can't use erased F# types in any other .net language
  • You can't use reflection on erased types (even in F#)

So let's see if we can have a play with generated types, and then - given this is Christmas, and all - let's see if we can build Jesus' family tree in the .net type system. After all, if you're going to use inheritance to model something, how about modelling inheritance?

Cutting Quotations Down to Size

This is part 2 in my quotations series, following on from Tap, Tap, Tapping on the Door.

As promised in the first part of this series, here we're going to take a look at manipulating quotations. I mean, we've got this AST - now what are we going to do with it?

Let's start with something fairly straightforward; boolean algebra.

First, let's get a look at how some boolean expressions are represented in quotations.

Firing up F# Interactive, we'll feed a few in and see what happens:

<@@ true @@>;;
(* val it : Expr =
      Value (true)
        {CustomAttributes = [NewTuple (Value ("DebugRange"),
              NewTuple (Value ("stdin"), Value (4), Value (4), Value (4), Value (8)))];
         Type = System.Boolean;} *)

Hmm. That's… not as nice as we might want. The custom attributes are being added by F# interactive for debugging purposes, but hopefully the general shape is clear: our expression consists of a single value of true.

I'll cut out the custom attributes from now on to make reading things a bit easier.
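If you'd rather check that shape in code than by squinting at the printed output, the active patterns in FSharp.Core's `Quotations.Patterns` module let you match on it. A minimal sketch; `describe` is just a helper name made up for this example.

```fsharp
open Microsoft.FSharp.Quotations
open Microsoft.FSharp.Quotations.Patterns

// Report whether a quotation is a single constant value or something bigger.
let describe (expr : Expr) =
    match expr with
    | Value (value, valueType) ->
        sprintf "Value %A of type %s" value valueType.Name
    | _ -> "something more complicated"

printfn "%s" (describe <@@ true @@>)       // Value true of type Boolean
printfn "%s" (describe <@@ not true @@>)   // falls through to the second case
```

The custom attributes don't get in the way here: the `Value` pattern only cares about the node itself.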

Next!

If I Ruled the World… Remote Working

Welcome to a new category of posts for my blog. It's the "Rule the World" category, where I spout opinions on a subject with gay abandon, even if I can't actually offer that much beyond anecdotal evidence to back the opinion up.

On this occasion I even feel justified as someone has actively asked my opinion about something.

Well, that was foolish now, wasn't it?

Property Checking Start Challenge

Almost a year ago now, I wrote up a blog post on using FsCheck. I still rate it as an excellent tool, but unfortunately we don't manage to use it that much. The reasons for this basically boil down to the fact that a) we tend to forget it exists and b) a good deal of our code is written in C# or VB.net, and the original API is not very friendly from those languages.

So as part of the 15below developer education sessions we're going to try an exercise to see if we can bring a bit more property based testing into our code base!

Tap, Tap, Tapping on the Door

In my investigations into type providers, I started digging into a feature of F# called quotations. These blur the boundary between code and data; a representation of an expression tree that you can then evaluate or manipulate.

Why is this useful? Well; it's used in a number of places in various F# libraries. As mentioned above, type providers use them as a mechanism for providing the invocation code for the types that are being provided. The compiler can then take that expression tree and turn it into CLR code.

They can also be useful as a way of defining code within your F# that can then be translated into other programming languages. The linq to sql implementation does this (turning your linq into SQL, fairly obviously!) while the FunScript project compiles your F# quotations into JavaScript.

So; linked features, often used in concert: quotations allow you to generate expressions at run time, manipulate them at run time and evaluate them at run time - where evaluation covers everything from running the code on the CLR to outputting it as a different language.
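Here is a small sketch of all three steps in one place, using only what ships in FSharp.Core. The rewrite itself (doubling every integer constant) is deliberately silly, and `doubleInts` and `eval` are names invented for the example. Evaluation goes through `LeafExpressionConverter.EvaluateQuotation`, which copes with simple expressions like this one but is not a general purpose quotation compiler.

```fsharp
open Microsoft.FSharp.Quotations
open Microsoft.FSharp.Quotations.Patterns
open Microsoft.FSharp.Quotations.ExprShape
open Microsoft.FSharp.Linq.RuntimeHelpers

// Manipulate: walk the tree and double every integer constant found.
let rec doubleInts (expr : Expr) : Expr =
    match expr with
    | Value (:? int as n, _) -> Expr.Value (n * 2)
    | ShapeVar v -> Expr.Var v
    | ShapeLambda (v, body) -> Expr.Lambda (v, doubleInts body)
    | ShapeCombination (shape, args) ->
        RebuildShapeCombination (shape, List.map doubleInts args)

// Evaluate: run a (simple) quotation and get the result back as obj.
let eval (expr : Expr) : obj =
    LeafExpressionConverter.EvaluateQuotation expr

// Generate: the quotation is data, not a computed result.
let original = <@@ 1 + 2 @@>
let rewritten = doubleInts original

printfn "original evaluates to  %A" (eval original)    // 3
printfn "rewritten evaluates to %A" (eval rewritten)   // 6
```

Swap the `eval` step for something that prints JavaScript or SQL and you have the basic shape of the translation projects mentioned above.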

Functionally SOLID 2

This post follows on directly from Going Functionally SOLID

In our first session looking at SOLID and functional programming, we tried to apply some SOLID principles to an example piece of code.

We ended up with a set of interfaces like those below, and robot classes could then implement the interfaces to define their capabilities and state. I mentioned the example code was for a giant robot game, yes?