Category Archive for: ‘Tech / Work’

A Personal Stack Overflow Milestone

In the grand scheme of things, ’tis but a minor achievement, but I was quite chuffed with myself this evening when my Stack Overflow reputation finally reached the 10,000 mark:

My girls made me a special “10K” cake to celebrate :-)

Kudos to Jeff, Joel and the team for creating a site that I have found engaging, entertaining and very useful for the last 3 years and 4 months.

Schoolboy Error Of The Day

This dumb mistake just cost me an hour spelunking around in the debugger:

var status = source.Substring(source.LastIndexOf("/" + 1));

(where source is e.g. “http://foo.com/status/all-is-good”)

Fortunately the ramifications were picked up in the acceptance tests, but the root cause wasn’t at all obvious from such a high level.

Lesson for the day – code is never too trivial to warrant unit testing.
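For anyone who didn't spot it: the `+ 1` sits inside the call to LastIndexOf, so C# concatenates it onto the string and searches for "/1" instead of "/". When that isn't found, LastIndexOf returns -1, and passing -1 to Substring throws an ArgumentOutOfRangeException. A minimal sketch of the intended code next to the slip (the StatusParser class and LastSegment method are illustrative names, not from the original application):

```csharp
using System;

public static class StatusParser
{
    // Returns everything after the final '/' in the source URL.
    public static string LastSegment(string source)
    {
        // The + 1 belongs outside LastIndexOf: it skips past the slash itself.
        return source.Substring(source.LastIndexOf('/') + 1);
    }

    public static void Main()
    {
        const string source = "http://foo.com/status/all-is-good";

        // The slip: "/" + 1 is string concatenation, so this searches for "/1".
        int buggyIndex = source.LastIndexOf("/" + 1);
        Console.WriteLine(buggyIndex); // -1, because "/1" does not occur in source

        Console.WriteLine(LastSegment(source)); // all-is-good
    }
}
```

A unit test asserting on LastSegment for a single sample URL would have caught the misplaced parenthesis straight away.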

Entity Framework Week Part 5: Concluding Thoughts

This is the fifth in a series of five posts recounting my experiences using Entity Framework Code-First to replace ADO.NET and stored procedures in a client’s existing application. The introductory post in the series is here.


I am lucky to have had the opportunity to spend a time-boxed period playing with Entity Framework Code-First in a real-world scenario, and to get paid for the privilege! I now have a clearer understanding of how it has progressed during the last few years, what its strong points are, and where it still has shortcomings compared to the much more mature NHibernate framework.

The Positives

I have to say that after a week of getting through the pain barrier and the initial denial of working with an unfamiliar ORM, I have reached a level of understanding and acceptance with Entity Framework. It really isn’t all that bad (at least the Code-First flavour), and if you don’t stray too far from its rigid way of thinking it will help you to get a solution up and running quickly and reliably. It’s certainly far preferable to mucking about with ADO.NET and stored bloody procedures, that’s for sure.

The whole process of configuration and initialization is straightforward and pain-free, with the derived DbContext providing an out-of-the-box implementation of Unit of Work, all ready to be referenced from your consuming code. Easy.

Querying the model is 99% unadulterated LINQ, with the occasional call to Include to perform some eager fetching – what could be simpler?
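As a rough illustration of what such a query looks like, here is a minimal sketch. It assumes the DbContext API as shipped in EF 4.1 and later (the lambda-based Include extension lives in System.Data.Entity there), and the Order, Line and ShopContext types are hypothetical stand-ins rather than the client's model:

```csharp
using System.Collections.Generic;
using System.Data.Entity; // EntityFramework assembly; provides DbContext and the lambda Include
using System.Linq;

// Hypothetical entities, for illustration only.
public class Line
{
    public int Id { get; set; }
    public decimal Price { get; set; }
}

public class Order
{
    public int Id { get; set; }
    public string CustomerName { get; set; }
    public virtual ICollection<Line> Lines { get; set; }
}

public class ShopContext : DbContext
{
    public DbSet<Order> Orders { get; set; }
}

public static class OrderQueries
{
    // Plain LINQ, plus one Include call to eager-fetch the child collection.
    public static List<Order> OrdersFor(ShopContext context, string customerName)
    {
        return context.Orders
            .Include(o => o.Lines)
            .Where(o => o.CustomerName == customerName)
            .OrderBy(o => o.Id)
            .ToList();
    }
}
```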

I’m also unashamedly impressed with how easily EF can be used to power ASP.NET Dynamic Data sites, and RESTful WCF Data Services. Nice.

The Negatives

I found that the real pain of working with Entity Framework only surfaces when you wish to start tuning its behaviour in any way – you find that it’s a big black box with few extensibility points. It performs cunning tricks effortlessly, but wields its power in a largely indiscriminate manner. By comparison, NHibernate can achieve even greater things, but requires you to explicitly invoke these powers.

I am reminded of a response Ayende gave when asked why NH Futures was not the default behaviour – “NHibernate tries hard not to make too much magic”. I thought it sounded glib at the time, but having lived with EF for a while, I now understand why this is preferable.

Most of the NHibernate features that are missing from Entity Framework are related to performance – such as the ability to configure query batching, write batching, bulk operations, extra-lazy properties, and second-level caching. These are the features you’ll miss the most when you’re some way into a project, perhaps not until it’s in production and scalability issues arise.

I also feel the CTP5 of EF code-first is a little way off offering true support for persistence ignorance and POCO, having experienced a number of issues that required me to change my domain model, database schema, and application code.

Additional Resources

Here are a few of the resources that I found particularly useful during my EF week:

Entity Framework Week Part 4: Features and Further Investigations

This is the fourth in a series of five posts recounting my experiences using Entity Framework Code-First to replace ADO.NET and stored procedures in a client’s existing application. The introductory post in the series is here.


I didn’t want this series of posts to descend into a point-scoring NHibernate-versus-Entity Framework comparison, but…

I now have a basic proof-of-concept up and running, with my client’s nascent application now being powered by Entity Framework Code-First CTP5 rather than a hand-rolled DAL. So, I had some time to consider future functional and non-functional requirements that the team would be asked to develop and support, and investigate how EF would meet the challenge.

Caching

I was genuinely surprised to learn that Entity Framework still doesn’t include any out-of-the-box support for integrating second-level caching, for example to cache reference data. It seems there is a body of opinion stating that caching should not be the responsibility of the data access layer. I disagree, and I think this is one of the major benefits NHibernate still has over Entity Framework, with its multiple flexible and configurable second-level cache providers.

Targeting Alternative Providers (SQLite)

When working with NHibernate, I often target SQLite for fast integration tests against an in-memory database, rather than maintaining a testing version of the MSSQL/Oracle databases that my applications usually use for their bitbucket. I was pleased to see the discussions on the System.Data.SQLite page suggesting that this approach is possible with Entity Framework too, but I didn’t spend any time attempting to get this working.

Auditing Functionality

Entity Framework does not appear to support the rich events and listeners model that is offered by NHibernate and frequently used to develop application auditing functionality. The recommended solution to achieve this scenario is to override the virtual SaveChanges method on DbContext and add validation and auditing logic there. For more details, see page 261 of Programming Entity Framework.
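A minimal sketch of that SaveChanges approach follows. It is not taken from the book; the IAuditable interface and its property names are hypothetical, and the namespaces assume EF6 (in EF 4.1 to 5 the EntityState enum lives in System.Data instead):

```csharp
using System;
using System.Data.Entity; // EF6 namespaces assumed

// Hypothetical marker interface for entities that carry audit stamps.
public interface IAuditable
{
    DateTime CreatedUtc { get; set; }
    DateTime ModifiedUtc { get; set; }
}

public class AuditingContext : DbContext
{
    public override int SaveChanges()
    {
        var now = DateTime.UtcNow;

        // Stamp every tracked IAuditable entity that is about to be written.
        foreach (var entry in ChangeTracker.Entries<IAuditable>())
        {
            if (entry.State == EntityState.Added)
            {
                entry.Entity.CreatedUtc = now;
                entry.Entity.ModifiedUtc = now;
            }
            else if (entry.State == EntityState.Modified)
            {
                entry.Entity.ModifiedUtc = now;
            }
        }

        return base.SaveChanges();
    }
}
```

This covers simple "who and when" stamping; anything richer, such as writing a separate audit trail table, would need more logic in the same override.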

Bulk Operations

I have not yet encountered any Entity Framework support for bulk update/delete operations akin to NHibernate’s Executable DML functionality. Such requirements are usually relatively rare, but it’s a shame to have to fall back to writing stored procedures for relatively simple operations which can be described in terms of the domain model.

Query Batching

There does not appear to be any way to do query batching in Entity Framework, as per NHibernate Futures. Multiple queries result in multiple network trips to the database, sadly. Similarly, there’s no support for write batching and batched collection loads.

Concurrency and Versioning

Entity Framework supports optimistic concurrency. Chapter 23 of Programming Entity Framework explains in detail how this can be configured and utilised by your application. Entity Framework also supports rowversion fields for concurrency checks.
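By way of illustration, a rowversion check might be wired up along these lines. This is a sketch assuming the EF 4.1-or-later DbContext API and SQL Server; the Order, ShopContext and OrderService names are hypothetical:

```csharp
using System.ComponentModel.DataAnnotations;
using System.Data.Entity;
using System.Data.Entity.Infrastructure;

// Hypothetical entity, for illustration only.
public class Order
{
    public int Id { get; set; }
    public string Status { get; set; }

    [Timestamp] // treated as a rowversion concurrency token
    public byte[] RowVersion { get; set; }
}

public class ShopContext : DbContext
{
    public DbSet<Order> Orders { get; set; }
}

public static class OrderService
{
    // Returns false if the order is missing, or if someone else
    // changed the row after this context loaded it.
    public static bool TryMarkShipped(ShopContext context, int orderId)
    {
        var order = context.Orders.Find(orderId);
        if (order == null)
        {
            return false;
        }

        order.Status = "Shipped";
        try
        {
            context.SaveChanges();
            return true;
        }
        catch (DbUpdateConcurrencyException)
        {
            // The RowVersion in the database no longer matches the one we read.
            return false;
        }
    }
}
```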

Extra-Lazy Properties

Unlike NHibernate, Entity Framework currently has no notion of “extra-lazy” properties. Requesting the Count of a child collection (e.g. Order.Lines.Count) will therefore trigger the loading of all entities (Lines) in the child collection. Not nice. Yes, we can work around this by making the appropriate count query at a higher level but it’s much nicer to be able to traverse the domain model relationships and let persistence ignorance work its magic.
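One form the "count query at a higher level" workaround can take is sketched below. It assumes the DbContext API of EF 4.1 or later, where a collection entry exposes a Query method, and it uses hypothetical Order, Line and ShopContext types. The order passed in must already be tracked by the context:

```csharp
using System.Collections.Generic;
using System.Data.Entity;
using System.Linq;

// Hypothetical entities, for illustration only.
public class Line
{
    public int Id { get; set; }
}

public class Order
{
    public int Id { get; set; }
    public virtual ICollection<Line> Lines { get; set; }
}

public class ShopContext : DbContext
{
    public DbSet<Order> Orders { get; set; }
}

public static class OrderCounts
{
    // Counts the lines in the database without materialising them.
    public static int CountLines(ShopContext context, Order order)
    {
        return context.Entry(order)
            .Collection(o => o.Lines)
            .Query()   // an IQueryable over just this order's lines
            .Count();  // translated to a COUNT query
    }
}
```

It works, but the calling code now has to know about the context, which is exactly the leak of persistence concerns that extra-lazy collections avoid.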

Integration with the Wider .NET Stack

To my mind, one of the key selling points of Entity Framework over NHibernate is its out-of-the-box integration with other areas of the .NET stack – notably the ability to power ASP.NET Dynamic Data sites (which are great for simple pages to maintain reference data) and WCF Data Services.

Query Techniques

NHibernate offers a world of choice when it comes to methods for querying the model: HQL, Criteria, QueryOver, LINQ, Named Queries, etc. These each offer a plethora of possible options and tweaks including query caching, batching and futures. By comparison, Entity Framework offers a comprehensive LINQ provider (with decent extensions to specify eager-loading of child entities), or Entity SQL. And that’s your lot.


By the end of my fourth day, I had a working proof-of-concept using Entity Framework Code First to power my client’s application, and I had a good idea of how suitable it was to meet future requirements lurking in the product backlog.

In the fifth and final part of this series of posts, I’ll write some concluding thoughts on my overall experiences spending a week with Entity Framework.

Entity Framework Week Part 3: Runtime Issues Encountered

This is the third in a series of five posts recounting my experiences using Entity Framework Code-First to replace ADO.NET and stored procedures in a client’s existing application. The introductory post in the series is here.


Having configured and initialized Entity Framework, and tweaked the mappings, by Day 3 I was all set to start consuming my shiny new DbContext implementation from the application code, and actually get some CRUD work done. Not unexpectedly, I hit a few issues along the way…

Proxy Generation

As a long-term NHibernate user, I habitually mark all members on my domain classes as virtual, since this is a requirement for entities to be replaced at runtime by proxies. Remember that NHibernate is a port from the Java world, where all instance methods are virtual by default.

Now, this habit led to some unexpected behaviour when I attempted to use Entity Framework to persist the same domain objects, namely exception messages such as:

The property ‘Foo’ on type ‘Bar_B6089AE40D178593955F1328A70EAA3D8F0F01DDE9F9FBD615F60A34F9178B94’ cannot be set because the collection is already set to an EntityCollection.

Clear as mud, eh? A little Googling eventually unearthed the following posts from other people experiencing this issue:

This latter post on the MSDN forums includes the following explanation from one of the guys on the Entity Framework team:

“If you make all your properties virtual then EF will generate proxy classes at runtime that derives from your POCO classed, these proxies allow EF to find out about changes in real-time rather than having to capture the original values of your object and then scan for changes when you save (this is obviously has performance and memory usage benefits but the difference will be negligible unless you have a large number of entities loaded into memory). These are known as ‘change tracking proxies’, if you make your navigation properties virtual then a proxy is still generated but it is much simpler and just includes some logic to perform lazy loading when you access a navigation property.

Because your original code was generating change tracking proxies, EF was replacing your collection property with a special collection type to help it find out about changes. Because you try and set the collection back to a simple list in the constructor you are getting the exception.

Unless you are seeing performance issues I would follow Terrence’s suggestion and just remove ‘virtual’ from your non-navigation properties.”

This feels a little bit strange, and I’m not convinced that we are really getting persistence ignorance if we experience differing behaviour depending on whether or not we have chosen to make all our members virtual. I haven’t invested much time looking into the benefits of these “Change-Tracking Proxies”, or how it is possible to utilise these without causing the “collection is already set to an EntityCollection” exception. I just did what the man said and removed the virtual keyword from most non-navigation properties.

A Runtime Exception When Lazy-Loading

At one point I experienced an exception message along the lines of:

“Entities in ‘CodeFirstContainer_Sessions’ participate in the ‘Session_Season’ relationship. 0 related ‘Session_Season_Target’ were found. 1 ‘Session_Season_Target’ is expected.”

This was caused by my navigation property (Session.Season) not having been set as virtual, so no proxy was being created.

Incidentally, it is worth highlighting that lazy-loading must occur within the scope of an open DbContext (i.e. within the Unit of Work). It is not reasonable to expect to transparently load the navigation property after the database connection has been closed (this is analogous to attempting to lazy-load in NHibernate after closing the Session).
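To make the scoping point concrete, here is a small sketch in which the navigation property is read while the context is still alive. It assumes the EF 4.1-or-later DbContext API with lazy loading left at its default; the Season class, its Name property and ScheduleContext are hypothetical:

```csharp
using System.Data.Entity;

// Hypothetical entities, for illustration only.
public class Season
{
    public int Id { get; set; }
    public string Name { get; set; }
}

public class Session
{
    public int Id { get; set; }

    // virtual, so that a lazy-loading proxy can be generated
    public virtual Season Season { get; set; }
}

public class ScheduleContext : DbContext
{
    public DbSet<Session> Sessions { get; set; }
}

public static class SessionReader
{
    public static string SeasonNameFor(int sessionId)
    {
        using (var context = new ScheduleContext())
        {
            var session = context.Sessions.Find(sessionId);
            if (session == null || session.Season == null)
            {
                return null;
            }

            // Season was lazy-loaded above, inside the using block.
            // Returning the Session itself and touching Season after
            // the context is disposed would fail instead.
            return session.Season.Name;
        }
    }
}
```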

Cascades

In NHibernate, cascading saves/updates/deletes have to be specified manually on all foreign key relationships – the default behaviour is not to cascade any changes when committing changes, which often leads to newcomers experiencing an error message “not-null property references a null or transient value”.

Entity Framework takes a more convention-based approach and assumes that all saves and updates should cascade. So, if you save a shiny new Order with an associated Address and a handful of Lines, Entity Framework will determine that it should first insert the Address row, then the Order row, and then each Line row. Sweet. Updates similarly cascade. Assuming you are happy with this behaviour (which seems sensible), then all should be well.

Deletes, on the other hand, are a bit strange. Entity Framework will not take responsibility for cascading a delete in the database – it expects that you will achieve this by setting a cascading delete on the foreign key relationship in the RDBMS.

Having said this, if you delete a parent entity in Entity Framework, it will attempt to issue delete statements for any child entities which have been loaded into the current DbContext, but it will not initialize any child entities which have not yet been loaded. This may lead to the RDBMS throwing foreign key constraint violation exceptions if a cascading delete has not been specified. For more details about how cascade delete “works” in Entity Framework, see this blog post.

Personally, I think this behaviour is pretty shoddy, but there you have it! Forewarned is forearmed.

In light of this behaviour I had to make modifications to the database schema to set cascading deletes on all the appropriate foreign key relationships. For many line-of-business applications, deletes are actually pretty rare events, and in the short term I suspect this issue is more likely to be encountered when clearing down data in integration tests than in actual application use.

Initializing Child Objects on Domain Entities

I habitually add code to the constructors of domain entities to initialise child entities to sensible defaults – I find it helps to ensure that objects are always in a valid state, and reduces the likelihood of encountering an unhandled NullReferenceException. So, for example, I would usually have something like this:

public class Order
{
    public Order()
    {
        this.Address = new Address();
    }

    public virtual Address Address { get; set; }
}

Unfortunately, when using this approach with Entity Framework, I found that when loading an existing Order which has an associated Address, the Order.Address object was always reset to its default.

Now, I realise that making calls to virtual members in a constructor is not really a good idea (heaven knows Resharper and Coderush have both nagged me about it often enough), but NHibernate never had a problem with this approach. Nevertheless, I tried to do things properly and replaced the automatic properties with backing fields….

public class Order
{
    public Order()
    {
        this._address = new Address();
    }

    private Address _address;

    public virtual Address Address
    {
        get { return _address; }
        set { _address = value; }
    }
}

But still no dice. I then tried putting initialization logic in the property’s getter….

public class Order
{
    public Order()
    {
        this._address = new Address();
    }

    private Address _address;

    public virtual Address Address
    {
        get { return _address ?? (_address = new Address()); }
        set { _address = value; }
    }
}

But nothing worked. Every time I loaded an Order, the Order.Address object was reset back to its default instead of containing the data loaded from the database.

This is quite frustrating and as yet I haven’t been able to find a workaround, other than to abandon my plans to perform object initialization in the domain model and instead handle it in the service layer, with all the resultant null-checking code that will ensue.

While trying to find a solution, I did stumble across this comment from Rowan Miller that “More flexibility in how we interact with classes is a common ask for EF and our team is looking at how we can support this at the moment.”
Are we asking for flexibility? I thought we were just asking for the persistence ignorance that had long been promised.


Despite these issues, in three short days I had gone from knowing next to nothing about Entity Framework 4 Code-First, to using it to perform almost all of the data access required by a small application. In the fourth part of this series I’ll consider some of the additional features and application requirements that I would expect an ORM to handle, and see how EF meets the challenge.

Entity Framework Week Part 2: Conventions and Fluent Mappings

This is the second in a series of five posts recounting my experiences using Entity Framework Code-First to replace ADO.NET and stored procedures in a client’s existing application. The introductory post in the series is here.


As mentioned in yesterday’s post, I was attempting to use Entity Framework Code-First CTP5 to map an existing domain model to an existing database schema. Fortunately the project was in its infancy and there was a high degree of cohesion between the two models. I therefore didn’t anticipate too many difficulties ahead – the occasional naming discrepancy to resolve, and table-per-hierarchy mappings that would need their discriminators specifying – nothing too complicated really. I hoped to make as few changes as possible to either the database schema or domain model.

Entity Framework Code-First uses a set of conventions to “discover” the mappings from domain objects to database. This is broadly analogous to James Gregory’s Fluent NHibernate AutoMapping functionality.

As with Fluent NHibernate, it is possible to add custom conventions, and to manually override mappings for specific properties which deviate from the conventions. It is also possible to remove existing conventions.

All of these modifications to the model mappings are effected by overriding the virtual OnModelCreating method in our concrete implementation of DbContext. I was initially worried about the sheer volume of code that might be included in this method, and was relieved to discover that mapping overrides related to particular entities can be separated out into the constructor of generic implementations of EntityTypeConfiguration, not unlike the generic ClassMap in Fluent NHibernate.
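The shape of that arrangement is roughly as follows. This sketch uses the API names from EF 4.1 and later (DbModelBuilder, modelBuilder.Configurations), which may differ from CTP5, and the Order entity and its mapping details are hypothetical:

```csharp
using System.Data.Entity;
using System.Data.Entity.ModelConfiguration;

// Hypothetical entity, for illustration only.
public class Order
{
    public int Id { get; set; }
    public string Reference { get; set; }
}

// All the mapping overrides for one entity live in one class.
public class OrderConfiguration : EntityTypeConfiguration<Order>
{
    public OrderConfiguration()
    {
        ToTable("Order");
        Property(o => o.Reference).HasMaxLength(50);
    }
}

public class MyAppContext : DbContext
{
    public DbSet<Order> Orders { get; set; }

    protected override void OnModelCreating(DbModelBuilder modelBuilder)
    {
        // OnModelCreating stays short: one registration per entity.
        modelBuilder.Configurations.Add(new OrderConfiguration());
    }
}
```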

Custom Conventions

In the domain model I was working with, all entities were derived from an abstract base Entity class which defined an integer Id property. By contrast, the primary keys on the database tables were all prefixed with the name of the table/entity. Neither of these situations is ideal, but nor are they all that unusual, and I sought a way of “teaching” this convention to our custom EF context.

It took me some time to discover that custom conventions are even possible in CTP5, and I had initially resigned myself to manually overriding the names of each and every primary key property. It was only through stumbling upon this post on the ADO.NET team blog that I found what I was looking for. Note that this post does include the caveat “There are a number of rough edges and the API surface is likely to change”.

My first impression is that custom Entity Framework conventions could turn out to be far more powerful than those offered by Fluent NHibernate, but they are also trickier to develop, requiring an understanding of the valid options for the two generic parameters that IConfigurationConvention can take, and what actions should be taken by the custom convention.

Still, after a little trial and error I was able to write the custom primary key convention that I required:

public class MyAppPrimaryKeyConvention : IConfigurationConvention<PropertyInfo, PrimitivePropertyConfiguration>
{
    public void Apply(PropertyInfo memberInfo, Func<PrimitivePropertyConfiguration> configuration)
    {
        if (memberInfo.Name == "Id")
        {
            // We need to avoid overriding concrete class mapped using TPH.
            if (memberInfo.ReflectedType.IsSubclassOf(typeof(ProductFeature)) ||
                memberInfo.ReflectedType.IsSubclassOf(typeof(ProductInsert)))
            {
                return;
            }
            configuration().ColumnName = string.Format("{0}Id", memberInfo.ReflectedType.Name);
        }
    }
}

I was disappointed that I had to insert a guard clause to ignore this convention for the concrete subclasses of hierarchies that are mapped using table-per-hierarchy (i.e. ProductFeature and ProductInsert). Given time I would hope to find a generic way of achieving this convention that doesn’t require hardcoded references to specific Domain objects from within the convention definition.

Compare and contrast with the equivalent code for Fluent NHibernate:

public class MyAppPrimaryKeyConvention : IIdConvention
{
    public void Apply(IIdentityInstance instance)
    {
        instance.Column(string.Format("{0}Id", instance.EntityType.Name));
    }
}

Updated:

Since I wrote the section above, the ADO.NET team have announced details of the forthcoming Entity Framework 4.1 Release Candidate, which removes this ability to add conventions:

“This was a very painful decision but we have decided to remove the ability to add custom conventions for our first RC/RTW. It has become apparent we need to do some work to improve the usability of this feature and unfortunately we couldn’t find time in the schedule to do this and get quality up the required level. You will still be able to remove our default conventions in RC/RTW.”

For what it’s worth, I think this was the right decision to make. A lack of “pluggable” conventions is slightly disappointing, but it can easily be worked around by making the appropriate overrides with the fluent mappings. Better to hold off and nail an API that’s both powerful and usable than go too soon with something that’s liable to confuse and confound.
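For the primary key naming case, the fluent workaround would look something like this for each entity. It is a sketch assuming the EF 4.1-or-later fluent API, with a hypothetical Order entity that declares Id on the concrete class; the configuration would then be registered in OnModelCreating in the usual way:

```csharp
using System.Data.Entity.ModelConfiguration;

// Hypothetical entity, for illustration only.
public class Order
{
    public int Id { get; set; }
}

public class OrderConfiguration : EntityTypeConfiguration<Order>
{
    public OrderConfiguration()
    {
        // Per-entity override standing in for the "{EntityName}Id" convention.
        Property(o => o.Id).HasColumnName("OrderId");
    }
}
```

The cost is one such line per entity rather than a single convention, which is tolerable for a small model.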

Removal of Default Conventions

Another nice feature described in the Pluggable Conventions blog post is the ability to remove some of the default conventions, which I immediately put to good use by disabling the default PluralizingTableNameConvention:

modelBuilder.Conventions.Remove<PluralizingTableNameConvention>();

(I mean, for goodness sake, who in their right mind pluralizes table names anyway? Yes, it’s very impressive that this library knows that the plural of “goose” is “geese”, but it would be more beneficial if that were an extension method on System.String in the BCL rather than being buried in the bowels of System.Data.Entity.Design. Then perhaps the ADO.NET team could be left to get on with developing something more useful, like second-level caching? Sorry, rant over…)

Fluent Mapping API

Most manual tweaks to the model are fairly straightforward to perform, and there is a good set of examples in this post on the ADO.NET team blog.

I did encounter some difficulty with the mapping of one-to-many relationships, which felt quite cumbersome to perform in comparison to the brevity of Fluent NHibernate’s API. Here’s how you’d rename a foreign key on a unidirectional one-to-many relationship in Fluent NHibernate:

References(x => x.AudioFormat).Column("AudioFormat");

Whilst in Entity Framework code-first, the equivalent is:

HasOptional(x => x.AudioFormat)
.WithMany()
.IsIndependent()
.Map(m => m.MapKey(a => a.Id, "AudioFormat"));

To be fair, I think part of this clumsiness arises because Entity Framework is allowing us to define both ends of a bidirectional relationship in a single place, whereas NHibernate requires us to define each end separately. It’s just unfortunate that in unidirectional situations like this example we end up with the WithMany().IsIndependent() noise in the middle of the syntax.

Having learned the odd syntax required to rename these one-to-many foreign keys, I then wasted an inordinate amount of time trying to make this actually work. Many of my waking hours were blighted by an InvalidOperationException (“Sequence contains more than one matching element”) originating from deep within the framework. A quick ferret around on Stack Overflow revealed that I was not the only person currently banging his or her head on this particular brick wall:

Eventually Diego Mijelshon figured out what was amiss. It seems the mapping failure was due to the Id property being defined in a base class rather than on the concrete class being mapped. Whether this is intentional behaviour or a bug in the CTP5, I’m not sure, but I worked around this issue by modifying the domain model and ditching the hierarchy of base classes altogether, leaving the Id and other common properties defined only on interfaces. Ayende would be pleased.


Two days into my adventure, and my database and domain model were happily mapped. In part three I’ll look at some of the issues I encountered at runtime, which necessitated further tweaks to the domain model and database.

Entity Framework Week Part 1: Introduction, Configuration and Initialization

In February 2011 I found myself doing some contract development work in a team that was still doing data access using raw ADO.NET and stored procedures. Being the NHibernate fanboy that I am, I naturally attempted to persuade them of the benefits of moving over to NH, even going so far as to develop (in my own time) an NH-powered version of their application.

My efforts were partially successful. The team were sold on the idea of using an ORM, but wanted me to develop a second proof of concept using Microsoft ADO.NET Entity Framework rather than NHibernate. This prompted much mirth amongst my Facebook friends.

I decided to throw myself into the task, and use this opportunity to spend some time getting to grips with Entity Framework in a real-life scenario for a couple of weeks. I figured that in the best-case scenario, I would learn to love EF even more than NH, and bolster my CV. Worst case, I’d hate it but would be moving on to a new contract shortly anyway, so wouldn’t have to live with it for too long. And of course I was getting paid for the experience either way, so what’s not to like?!

A couple of friends rightly suggested that I ought to blog about my experiences, so here we go.

This series of five blog posts details the thoughts and experiences I encountered during my week-long adventure with Entity Framework. It isn’t an EF walkthrough, nor is it a comprehensive EF-vs-NH feature comparison (the web is littered with those).

  1. Introduction, Configuration and Initialization
  2. Conventions and Fluent Mappings
  3. Runtime Issues Encountered
  4. Features and Further Investigations
  5. Concluding Thoughts

Choosing Code-First

When last I had a quick play with Entity Framework in the summer of 2008, it was very IDE and database-driven – the development process entailed dragging tables onto a designer surface which generated partial classes to represent objects, etc. Yuk! The lack of support for a domain-driven, persistence-ignorant approach was a real turn-off for me, prompting me to add my name to the list of signatories on the infamous ADO.NET Entity Framework Vote of No Confidence.

It was therefore a relief to discover that Entity Framework now supports a “Code-First” development paradigm, whereby EF can be used like a traditional ORM to fluently map an existing domain model to an existing database. At the time, this was still available only as a Community Technology Preview (CTP5), but given my remit was to introduce EF to an existing solution, it was a no-brainer to choose this option for the proof of concept, even though this would potentially leave me exposed to bugs and breaking API changes.

Configuration

Configuring Entity Framework seemed a lot easier than the equivalent procedure with NHibernate, filled as it is with esoteric options which can often be somewhat perplexing to NH newbies. With EF, the walkthroughs told me that essentially, all I needed to do was create a subclass of DbContext and Bob would be my mother’s brother. They were right. If anything, it was all too easy, and I wondered how and where I would get the option to configure details such as the second-level cache provider and ADO batch size.

I made use of a constructor overload on DbContext which takes a string parameter called nameOrConnectionString. Initially I attempted to pass into this parameter a connection string obtained from the app.config file using System.Configuration.ConfigurationManager, but this led to error messages along the lines of

“Unable to determine the provider name for connection of type ‘System.Data.SqlClient.SqlConnection’”

It seems that the connection strings used by Entity Framework are not the common or garden connection strings that we know and love, but instead special EF connection strings. Fortunately, a simple remedy to this issue was to instead just pass the connection string name into the exact same parameter. See this forum post for more details.
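In code, that remedy amounts to something like the sketch below. "MyAppDb" is a hypothetical name standing for a connectionStrings entry in app.config, and the DbContext constructor is the nameOrConnectionString overload mentioned above:

```csharp
using System.Data.Entity;

public class MyAppContext : DbContext
{
    // Pass the NAME of the connectionStrings entry in app.config,
    // not the connection string text read via ConfigurationManager.
    // "MyAppDb" is a hypothetical entry name.
    public MyAppContext()
        : base("MyAppDb")
    {
    }
}
```

As far as I know, EF 4.1 and later also accept the form "name=MyAppDb", which fails loudly if no such entry exists in the config file instead of falling back to treating the value as a database name.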

Database Initializer Strategies

Entity Profiler revealed that Entity Framework was unsuccessfully attempting to access a nonexistent database table called dbo.EdmMetadata. A little light Googling revealed that this is used by Entity Framework to store (perhaps unsurprisingly) metadata about the Entity Data Model. Why does it do this? Presumably so that it can decide at application start up whether code changes to the model require it to automagically make associated changes to the database schema. Some Summer 2010 blog posts from Scotts Guthrie and Hanselman show off the ability of Entity Framework to automatically create and update databases in response to model changes.

Now, this is all very clever and might be fine for hobbyist websites, integration tests and those quintessential “Look – no code!” TechEd presentations, but in an enterprise scenario it’s liable to cause acute apoplexy in DBAs.

Fortunately we can opt to suppress this crazy behaviour altogether by passing a null strategy into DbDatabase.SetInitializer, thus:

DbDatabase.SetInitializer<MyAppContext>(null);
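A note for anyone following along on a later release: from EF 4.1 onwards the CTP5 DbDatabase class is named Database, so the equivalent call would look like the sketch below. The Bootstrapper class is a hypothetical place to put it; the call needs to run once, before the context is first used:

```csharp
using System.Data.Entity;

public class MyAppContext : DbContext
{
}

public static class Bootstrapper
{
    // Call once at application startup, before any MyAppContext is used.
    public static void Configure()
    {
        // A null initializer tells EF not to create, check or alter the schema.
        Database.SetInitializer<MyAppContext>(null);
    }
}
```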

That was day one over – I was officially up and running in the brave new (to me) world of Entity Framework. In part two of this series I’ll take a look at the mapping tweaks I had to make to successfully map our existing domain model to our existing database.

Available For Work Again

As 2010 heads inexorably towards its festive conclusion, so too does my current contract at NHS Choices. I’ll be looking for a shiny new contract to get my teeth into from Wednesday 22nd December.

Want some evidence that the great British taxpayer has been getting VFM from yours truly during my time providing services to the NHS? Check out this comment from Development Manager Nick Porthouse:

“Ian worked on a particularly troublesome project for me for eight months at NHS Choices without complaint. His development skills and knowledge are exemplary and his NHibernate knowledge is second to none. He worked well with all the members of the team and helped me to introduce pair programming and push the test driven development agenda. I will employ Ian whenever I have the chance.”

Do you have any troublesome .NET projects (or, if you insist, even non-troublesome ones) that could benefit from my assistance? If so, give me a call on 07901 828483 or drop me a line at ian@iannelsonsystems.com .

Enterprise Integration Anti-Patterns #2 – Shared Assemblies

Having slain the beast that is Shared Database, the next dragon to appear on my Enterprise Integration horizon is Shared Assemblies. That is, the suggestion that Application A can leverage the functionality of Application B by simply adding references to B’s DLLs. After all, this potential for reuse is why we put our code in reusable assemblies in the first place, isn’t it..?!

Well, no. Actually we usually break our application apart into assemblies to create a maintainable and testable architecture. An assembly is a coarse-grained unit of encapsulated functionality in that architecture; the fact that it is the smallest deployable unit in the .NET world is not necessarily an indication that we intend or desire our assemblies to be shared with other applications.

In my experience, sharing assemblies works only when they have been developed with this intention in mind. It’s feasible to share DLLs that contain discrete chunks of easily-encapsulated functionality that have no need for external dependencies (hasn’t every dev shop got a “common” or “utilities” library hanging around somewhere?!) but it’s quite another matter to start taking a wholesale dependency on the guts of another application.

So, before you try the shared assemblies approach to enterprise integration, here are a few things to consider:

Is It Even Possible?

Firstly, this approach makes the assumption that the application to be referenced has been developed such that all the logic to be invoked exists in the assemblies. If the application wasn’t originally developed with the intention of being shared in this way, there’s every possibility that some pertinent logic has instead been located in the front-end (e.g. the “Magic Pushbutton” anti-pattern that was so prevalent in ASP.NET Web Forms development), or even client-side scripts.

Love Me, Love My Dependencies…

All applications have dependencies of one kind or another. If you’re lucky, the only dependency will be the .NET Framework, and if you’re especially lucky it will be the same version that is used by the consuming application.

More likely, the application being referenced will depend on a whole bunch of third-party DLLs, and now they’ve suddenly become your dependencies, too! If your dev shop is anything like the places I’ve worked, each application will depend on various different versions of the NHibernate and Castle stacks and their associated dependencies. Good luck managing all those references while retaining your sanity.

Applications are more than just DLLs

The average enterprise application is more than just a bunch of assemblies that can be referenced independently. It also tends to rely on configuration settings, external resources (e.g. databases), and a whole bunch of stuff triggered at application start-up time such as the configuration of an IoC container, logging, ORM, etc. The handling of authentication and authorization might also need rethinking before assembly sharing becomes a viable option.

Maintaining the Combined Suite of Applications

As I said in my previous post on this topic, maintainability is a particular bugbear of mine as a developer, and sharing assemblies across multiple applications doesn’t help matters one iota.

Unless you work with unbearably large solutions containing every project in your suite of applications, it’s difficult to know the wider impact that modifications will have, and to avoid making breaking changes. For sure, it’s possible to run automated builds of all applications when shared code is changed (I blogged about one method of achieving this with the first version of Team Foundation Server some years ago before I came to the conclusion that this is A Bad Idea) but by then the hard work has been done, the code checked in and the damage done.

Arguably, sharing assemblies puts developers in an even stickier situation than sharing databases – at least with that anti-pattern the entanglement is limited to one particular layer of the architecture which can be largely ignored much of the time. But when assemblies are shared, the developers must always be considering whether their apparently innocent modifications will have an impact on some other application. Meaningful refactoring becomes impossible, and the software eventually falls into disrepair.

In Conclusion…

When all is said and done, there are only two realistic patterns for sharing functionality between enterprise applications – remote procedure invocation and messaging. Both serve to decouple the applications, and both are easy to achieve with popular technologies on the .NET stack such as WCF, OpenRasta and MassTransit.
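To make the remote procedure invocation option a little more concrete, here is a minimal sketch of what Application B might publish instead of its DLLs. The names ICustomerLookup, CustomerSummary and CustomerLookupService are hypothetical, invented purely for illustration; only the WCF attributes are real. The service is called in-process here to keep the example short, whereas a real consumer would go through a proxy.

```csharp
using System;
using System.Runtime.Serialization;
using System.ServiceModel;

// Hypothetical data contract: the only shape Application A ever sees.
[DataContract]
public class CustomerSummary
{
    [DataMember] public int Id { get; set; }
    [DataMember] public string Name { get; set; }
}

// Hypothetical service contract published by Application B.
[ServiceContract]
public interface ICustomerLookup
{
    [OperationContract]
    CustomerSummary GetCustomer(int id);
}

// Lives inside Application B, alongside its own dependencies and start-up code.
public class CustomerLookupService : ICustomerLookup
{
    public CustomerSummary GetCustomer(int id)
    {
        // A real implementation would call into B's domain and data layers.
        return new CustomerSummary { Id = id, Name = "Customer " + id };
    }
}

public static class Program
{
    public static void Main()
    {
        // Assumption: in production Application A would obtain this reference
        // from a WCF proxy (e.g. ChannelFactory<ICustomerLookup>), not "new".
        ICustomerLookup lookup = new CustomerLookupService();
        Console.WriteLine(lookup.GetCustomer(42).Name); // prints "Customer 42"
    }
}
```

The point of the sketch is that Application A depends only on the contract, so B remains free to refactor its internals and upgrade its third-party references without breaking its consumers.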

There’s an abundance of resources out there detailing how to create effective service-oriented architectures, the obvious canonical reference tome being Hohpe and Woolf’s Enterprise Integration Patterns. I also highly recommend Michael Nygard’s Release It! for an entertaining overview of some of the problems that can be encountered when you start taking dependencies on remote services, and useful patterns for ensuring your application remains responsive at all times.

My preferred technology stack for application integration remains WCF, and in a future blog post I’ll outline some tips I’ve learned through trial and error for developing and consuming effective and maintainable WCF services.

NHibernate and Mapping Aggregates

A few days ago a friend emailed me the following question regarding NHibernate mappings for a solution he’s currently developing:

“I have an idea entity that has a collection of comment entities and I need to get the comment count for each idea. I made a massive mistake at the beginning by calling idea.Comments.Count (even worse, I did it in the view!), which due to the collection being lazy-loaded caused about 10 database calls so performance was sluggish even with second level cache.  I was therefore wondering how you would do it – would you use HQL and use Comments.size or would you do something differently?”

Now, I’ve been pretty busy recently, so before I had opportunity to respond properly, he sent this follow-up:

“After looking for a solution for getting a Comment count back for each Idea, I found using the NHibernate Formula method does the job – just wanting to make sure I was on the right track in terms of performance etc.  My mapping class is as follows:”

public class IdeaMap : ClassMap<Idea>
{
    public IdeaMap()
    {
        Id(x => x.Id)
            .Column("ID")
            .GeneratedBy.Identity();

        Map(x => x.Summary).Not.Nullable();
        Map(x => x.Description).WithMaxSize().Not.Nullable();
        Map(x => x.Created).Not.Nullable();
        Map(x => x.LastStatusChange).Not.Nullable();
        Map(x => x.Visible).Not.Nullable();
        Map(x => x.Status).Not.Nullable();

        Map(x => x.CommentCount)
            .Formula("(select count(*) from Comment c where c.IdeaId = ID)");

        HasMany(x => x.Votes).KeyColumn("IdeaId")
            .Inverse().Cascade.AllDeleteOrphan();

        HasMany(x => x.Comments).KeyColumn("IdeaId")
            .Cascade.All();

        References(x => x.Category).Column("CategoryId")
            .Not.Nullable();

        References(x => x.CreatedBy).Column("UserId")
            .Not.Nullable();

        Cache.ReadWrite();
    }
}

I considered this for a while, and sent the following suggestions:

“I’m glad to hear you resolved your SELECT N+1 problem but to be honest, I’m not a big fan of using formulas in NH mappings if at all possible, for the following reasons:

  1. I try to minimize my use of strings (and especially SQL) so as to make refactorings easier, and lessen the potential for runtime exceptions.
  2. The default NH behaviour will be to evaluate that formula every time an Idea entity is loaded, which might not be desirable and could negatively impact on performance when loading your idea entities. I’m not sure if the recently-added Lazy Properties feature of NH can be applied to these derived properties; if so then that could be used to negate this argument.
  3. I try to avoid putting logic (however simple) in the OR mapping layer, as future developers are unlikely to expect to find it there! I like to reduce the element of surprise in my solutions, and put such logic in the domain layer. I think logic in the OR layer limits options going forward – for example if you subsequently decide all comments have to be moderated, does the CommentCount formula have to be modified to exclude comments awaiting moderation..?

So, what would I do? Here are two options, depending on how often you’re using the CommentCount property:

If you’re only using the CommentCount occasionally and only along with a subset of the other properties from Idea, then I would write a specific query returning a projection of the required properties, including this CommentCount aggregate.

I’ve done this in the past where I had a requirement to populate a drop-down list with user names and the number of open work items assigned to each user, for example. I didn’t want or need to maintain an ActiveWorkItemCount property on the user object, I just wanted to do the calculation in one place (incidentally, LINQ made this a doddle).

Conversely, if the CommentCount property is something you’re going to be referencing frequently, then I would denormalise the database and add a CommentCount field to the Idea table. This presumes that you’re in a position to enforce the constraint that new Comments are only added to Ideas through your application (and as you know from my recent blog post, I am very fond of this kind of constraint!). This approach should give the best performance and flexibility, at the expense of irking normalisation fascists.

Typically this would be done by creating AddComment and RemoveComment helper methods on the Idea entity, which can maintain a bidirectional relationship between Idea and Comment in addition to incrementing or decrementing CommentCount accordingly.

This approach will give the best performance, and keeps the logic where it belongs (and where it can be easily extended and tested, as in my earlier hypothetical comment moderation example).

For a good example of code to maintain bidirectional one-to-many class relationships with NHibernate, see pages 39-43 of NHibernate 3.0 Cookbook.

Hope this helps. As ever, it’s just my opinion, but these two techniques have worked well for me.”
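For the curious, the two options from that reply can be roughly sketched as follows. This is an illustration under stated assumptions rather than my friend's actual code: the entities are cut down to the bare minimum, the sample data is made up, and the projection runs against an in-memory list standing in for an NHibernate LINQ query. Mapping the private collection field would also need an appropriate access strategy, which is omitted here.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Comment
{
    public virtual Idea Idea { get; set; }
    public virtual string Text { get; set; }
}

public class Idea
{
    private readonly IList<Comment> comments = new List<Comment>();

    public virtual string Summary { get; set; }

    // Denormalised count, assumed to be mapped to an ordinary column.
    public virtual int CommentCount { get; protected set; }

    public virtual IEnumerable<Comment> Comments
    {
        get { return comments; }
    }

    // Keeps both sides of the relationship and the counter in step.
    public virtual void AddComment(Comment comment)
    {
        if (comment == null) throw new ArgumentNullException("comment");
        if (comments.Contains(comment)) return;
        comment.Idea = this;
        comments.Add(comment);
        CommentCount++;
    }

    public virtual void RemoveComment(Comment comment)
    {
        // Only decrement if the comment really belonged to this idea.
        if (comments.Remove(comment))
        {
            comment.Idea = null;
            CommentCount--;
        }
    }
}

public static class Program
{
    public static void Main()
    {
        // Option two: the entity maintains its own count.
        var idea = new Idea { Summary = "Free biscuits" };
        var first = new Comment { Text = "Yes please" };
        idea.AddComment(first);
        idea.AddComment(new Comment { Text = "Chocolate ones?" });
        idea.RemoveComment(first);
        Console.WriteLine(idea.CommentCount); // prints 1

        // Option one: project just the properties needed, including the
        // aggregate. The list is a stand-in for a real query source.
        var ideas = new List<Idea> { idea };
        var summaries = ideas.Select(i => new
        {
            i.Summary,
            CommentCount = i.Comments.Count()
        });

        foreach (var s in summaries)
        {
            Console.WriteLine(s.Summary + ": " + s.CommentCount); // prints "Free biscuits: 1"
        }
    }
}
```

Because the counting logic lives in AddComment and RemoveComment, a later rule such as "only count moderated comments" becomes a small, unit-testable change to the entity rather than an edit to a SQL string in the mapping.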

What do you think? Are there any other approaches worth considering?
