This post follows on directly from my previous post Type Providers from the Ground Up. I highly recommend that you read that first, and check out the relevant example code from GitHub.
It's also a bit epic… grab yourself a coffee before you start.
So we have a working type provider now. Unfortunately, we're missing out on at least two major features that your new type provider will almost certainly want to make use of.
The first is that in our example, we're reading the metadata that defines our types from a fixed file location. In almost every real-life case, you will want to parametrize your provider to specify where each instance gets its metadata from.
The second is that in many cases getting the metadata will be slow, and the number of types available to generate may be very large. In these situations, you really want to generate only the types that are required, as they are requested, especially because this will reduce the size of the final compiled output. This is particularly important for type providers that read from large network-based data sources, like the Freebase provider.
We'll take the second first, because it's easy - and we like easy…
Generating types on demand
This is in many ways one of the features that makes type providers uniquely powerful compared to code generation. Because the types are being requested by the compiler as needed, type providers can give meaningful access to literally infinite type hierarchies.
So, does all this power come with great cost and complexity? Not really, no.
Let's take our node creation function, with some bits snipped out:
To make the ports deferred, we simply change the AddMembers call at the end to AddMembersDelayed and wrap the creation of the array in a function that takes unit. It ends up looking like this:
Now the input and output ports of a node will only be generated the first time that the compiler needs them available. If you don't use a particular node in your program, then the compiler will never generate its ports, and they will not be included in your final build output.
Of course, in this case we're pre-loading all of our metadata anyway, but hopefully this gives you an idea.
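As a rough illustration of the idea, here is a minimal, self-contained sketch that does not use the ProvidedTypes API at all. DelayedMembers is a hypothetical stand-in invented for this example; it only mimics the shape that AddMembersDelayed relies on, namely a function from unit to a list of members that runs the first time somebody asks for them.

```fsharp
/// Hypothetical stand-in for a provided type's member list (not a real API).
/// The builder has the same shape AddMembersDelayed expects: unit -> 'T list.
type DelayedMembers<'T>(build: unit -> 'T list) =
    // `lazy` runs the builder at most once and caches the result.
    let members = lazy (build ())

    /// True once something has actually asked for the members.
    member this.IsGenerated = members.IsValueCreated

    /// Forces generation on first access; later calls reuse the cached list.
    member this.Members = members.Value

// Count how many times the (pretend) expensive builder runs.
let buildCount = ref 0

let ports =
    DelayedMembers(fun () ->
        buildCount.Value <- buildCount.Value + 1
        [ "Input"; "Output" ])

printfn "Generated before first use? %b" ports.IsGenerated
printfn "Members: %A" ports.Members
printfn "Builder ran %d time(s)" buildCount.Value
```

Nothing is built until Members is read, which is the same trade the compiler makes with delayed members: a node you never touch never pays for its ports.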
Parametrizing the Data Source
Currently, we're reading the JSON that generates our types like this:
Now, you've probably noticed from playing with other type providers that they allow you to do funky things like passing a file path or connection string as a static parameter when you declare the provided type.
This is actually one of the things that held me up the longest when writing my first type provider, and I have to admit I'm still not fully certain why it's done this way.
At the moment, if we strip out all of the type creation logic, our type provider looks like this:
As you can see, we add the types to the namespace during the initialization of the MavnnProvider type.
This is no good if we want to add parameters: after all, we don't know what they are yet, and the same provider might be used several times with different parameters. Also, when we create our provided type (let nodeType = ...), we're putting it into a fixed location in the assembly's namespace. Again, this is no good if we want to be able to use more than one instance of our provider with different parameters.
To get around these issues, we create a "parent" provided type within the type provider which will host an isolated namespace for each parametrized provider instance:
Then we define some 'static parameters' and call the DefineStaticParameters method on the parent provided type, still within the construction of the type provider:
… and then we amend the base TypeProvider type so that the only type it adds to the namespace is the parent provided type.
At this point, we're creating an independent environment for each instance of the type provider. Unfortunately we need to make several changes to the type creation logic to make this work.
Firstly, we loaded quite a few things globally in the original version. Things like loading the node list now need to happen within the context of DefineStaticParameters. You'll also notice that DefineStaticParameters gets given a typeName as one of the parameters on the callback. This is a compiler-generated type name for this instance of the provider, which is passed in when a parameterised provider is defined, and the callback method needs to return a provided type with that name.
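The shape of that callback can be sketched without the ProvidedTypes API. Everything here is hypothetical: ProviderInstance and instantiate are names invented for this example, and a real callback would return a provided type rather than a record. The point is only the signature: a type name chosen by the compiler, plus the static parameter values boxed into an obj array.

```fsharp
/// Hypothetical description of what one provider instance was asked to build.
type ProviderInstance =
    { TypeName : string
      GraphPath : string }

/// Same argument shape as the static parameter callback:
/// the requested type name, then the boxed static parameter values.
let instantiate (typeName: string) (parameterValues: obj[]) : ProviderInstance =
    match parameterValues with
    // Assumes a single static parameter: the path to the JSON file.
    | [| :? string as path |] -> { TypeName = typeName; GraphPath = path }
    | _ -> failwithf "Unexpected static parameters for %s" typeName

let instance = instantiate "Script.thisOne" [| box @"c:\Temp\Graph.json" |]

printfn "%s reads its metadata from %s" instance.TypeName instance.GraphPath
```

Because each call gets its own name and its own parameter values, two instances pointed at different files never share state.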
So, for example, a type alias named thisOne declared in a script, with c:\Temp\Graph.json as its static parameter, will pass "Script.thisOne" [| box "c:\Temp\Graph.json" |] to the callback method, which is expected to return a provided type. So the first thing we'll do in the callback is create the new type, to which we will then add all of our nodes.
Keeping all of the amendments separate in your head just gets harder and harder at this point, so let's just view the final annotated method and get an overview of the final result. It's long, but hopefully worth it!
Let's just check this all still works…
Ah. No. No, it doesn't.
This is where type provider development can get a bit more frustrating. The compiler allows the code above to compile - it's completely valid F# that looks like it should do the right thing. But now, our quotations are doing something different; and evaluating them at runtime fails.
Let's take a look at the constructor that's throwing the error:
GetNode was referring to a public method in the type provider assembly. But if you look above now, it's actually a private method within the type provider class that we are closing over. Our generated type is in the assembly that's being created, not in the type provider assembly, so it can't access this method. Even if it were in the same assembly, this method is private to the class, so we'd still be stuck. Bearing that in mind, let's try a rewrite to see if we can get all of our quotations into better shape.
What are our options? Well, we can either capture all private state in types that the quotation evaluator knows about (string, mostly!), or we can make sure that any methods called in the quotations are public.
The first gives us a cleaner interface for the outside world (the GetNode method should never really have been public in the first place), so let's give it a try.
In our first version of the type provider, we were using the GetNode method to avoid having to embed the Node in the constructor directly. But how would we go about putting the node in directly? We need something that creates an Expr<Node>. Node isn't a completely trivial type: its members (Ports) are made of more complex types themselves. Let's start with a simpler challenge, and see if we can make an Expr<Id>.
We already know that quoting an Id value directly isn't going to work. The expression evaluator won't know what to do with the Id type. But Id's constructor is a public method, as is the Guid constructor. Let's try it:
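A minimal sketch of this kind of lifting, under stated assumptions: the Id class below is a hypothetical stand-in (a public class with a public constructor taking a Guid and a name), not necessarily the type used in the project. The trick is that the only values the quotation captures are strings; everything else is a call to a public constructor.

```fsharp
open System
open Microsoft.FSharp.Quotations

/// Hypothetical stand-in for the Id type: public class, public constructor.
type Id(uniqueId: Guid, name: string) =
    member this.UniqueId = uniqueId
    member this.Name = name

/// Lifts an Id into a quotation that rebuilds an equivalent Id.
/// Only strings are captured; the Guid is re-created from its text form.
let embeddedId (id: Id) : Expr<Id> =
    let guidText = id.UniqueId.ToString()
    let name = id.Name
    <@ Id(Guid(guidText), name) @>

let original = Id(Guid.NewGuid(), "Input")
let lifted = embeddedId original

// Prints the quotation tree: a constructor call wrapping another constructor call.
printfn "%A" lifted
```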
Cool. It works, and even has the right signature. Looks like we might be getting somewhere. The Port type is nearly as straightforward:
We're using our embeddedId method to 'lift' the port's Id into an expression, and then splicing that expression into a call to create a new port.
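The lift-then-splice step described above can be sketched as follows. Both Id and Port are hypothetical stand-ins here (the real types may carry different fields); what matters is the % splice, which drops the already-built Expr<Id> into the quotation that constructs the port.

```fsharp
open System
open Microsoft.FSharp.Quotations

/// Hypothetical stand-ins for the Id and Port types.
type Id(uniqueId: Guid, name: string) =
    member this.UniqueId = uniqueId
    member this.Name = name

type Port(id: Id, typeName: string) =
    member this.Id = id
    member this.TypeName = typeName

/// Lifts an Id into a quotation, capturing only strings.
let embeddedId (id: Id) : Expr<Id> =
    let guidText = id.UniqueId.ToString()
    let name = id.Name
    <@ Id(Guid(guidText), name) @>

/// Lifts a Port by splicing the quoted Id into a quoted constructor call.
let embeddedPort (port: Port) : Expr<Port> =
    let idExpr = embeddedId port.Id
    let typeName = port.TypeName
    <@ Port(%idExpr, typeName) @>

let port = Port(Id(Guid.NewGuid(), "Input"), "int")
let portExpr = embeddedPort port

// Prints the nested constructor calls that make up the quotation.
printfn "%A" portExpr
```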
We're on a roll! Just need to do the same for the Node type itself, with its…
There's probably a more elegant way of doing this, but given this is a functional-first language, let's grab the first tool that springs to mind.
So, from the top down. portsExpr creates a quotation that takes an adder quotation (Expr<List<Port>> -> unit) and returns an Expr<List<Port>>. This is what we're going to use in our Node construction quotation, but first we need the adder: some kind of magic method that takes a List and adds each of the ports from the node that's being passed into embeddedNode. I've built it as a recursive function; the 'zero' state that's passed in looks like this:
This is what will happen if the port list on the input node is empty. If it's not empty, we repeatedly build up nested calls to:
h is the next port from the list. By the end of the process we have a chain of anonymous functions, each in turn closing over the quotation of a port from the input. Finally, we can splice that into the expression that actually creates our node.
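One way the chain of nested functions could be put together is sketched below. This is a simplified reconstruction under assumptions, not the project's code: Port and Node are hypothetical stand-ins with a single name field, addPorts is an invented helper name, and the adder is represented as a quoted function (Expr<ResizeArray<Port> -> unit>). ResizeArray is F#'s alias for System.Collections.Generic.List.

```fsharp
open Microsoft.FSharp.Quotations

/// Hypothetical, simplified stand-ins for the Port and Node types.
type Port(name: string) =
    member this.Name = name

type Node(name: string, ports: ResizeArray<Port>) =
    member this.Name = name
    member this.Ports = ports

let embeddedPort (port: Port) : Expr<Port> =
    let name = port.Name
    <@ Port(name) @>

/// Wraps the adder built so far in a new function that also adds port h.
let rec addPorts (ports: Port list) (acc: Expr<ResizeArray<Port> -> unit>) =
    match ports with
    | [] -> acc
    | h :: t ->
        let portExpr = embeddedPort h
        addPorts t <@ fun (l: ResizeArray<Port>) -> (%acc) l; l.Add(%portExpr) @>

let embeddedNode (node: Node) : Expr<Node> =
    // The 'zero' adder: what runs if the node has no ports at all.
    let zero = <@ fun (l: ResizeArray<Port>) -> () @>
    let adder = addPorts (List.ofSeq node.Ports) zero
    let name = node.Name
    // Create the list, let the adder chain fill it, then build the node.
    <@ let ports = ResizeArray<Port>() in (%adder) ports; Node(name, ports) @>

let sample = Node("Start", ResizeArray<Port>([ Port("Input"); Port("Output") ]))
let nodeExpr = embeddedNode sample

// Prints the quotation tree, including one Port constructor call per port.
printfn "%A" nodeExpr
```

Each recursion step calls the previous adder first and then adds its own port, so the ports end up in the list in their original order.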
Now we can use our new embeddedX expressions in our provided constructors and methods; for example, the constructor above becomes:
Can you see the difference? Now, rather than closing over the GetNode method, we're closing over the quotation of the node that it returns.
With a sense of deja vu, let's just check this all works…
And somewhat surprisingly - it does.
If you want to see and play with the code, the version for this post can be found in the FirstFloor branch of the project on GitHub.
As with the first post in the series, let me know your questions and comments.