August 0928

DataReaders vs DataTables

This post is a bit of a rant, just to get things off my chest and see what anyone else has to say about the issues.

I feel like I’m in the middle of a bit of a ‘fight’ at work at the moment.

Basically I don’t like the way things are done at work and I’m trying to change them for the better. As you’d expect I’ve had some success and also some failures – so far I’ve been successful in introducing version control in the form of Subversion, setting up a minimal continuous integration server and unit testing is starting to take off. Where I’m still facing problems is in trying to improve the way we work on projects, as well as how we approach (re)developing the Data Access Layer.

A little background on how things are currently done

A Business Analyst first evaluates any new project / project changes and produces some data flow diagrams (this person will then become the project manager). When this is done the project is passed on to a Systems Analyst who then writes specs to describe exactly how this should be coded. These specs are in written as procedural pseudo code and then passed on to the developers who actually write the code. Some processes are for internal programs only and these have always been handled by the VB (WinForms) developers, some processes also need to work with the web and if that’s the case then the web team get involved too.

Recently we had a meeting to discuss whether the spec process needed changing. All our websites have been written in ColdFusion but are in the process of being rewritten in ASP.NET and some of the peculiarities of ColdFusion might no longer be relevant. I very strongly resisted the pseudo code approach because I felt it took the fun out of developing and reduced us to mere monkeys. My feeling is that the whole development team should be involved all the way through the process from analysing the requirements, through working on the database procedures up and into writing data access and business logic classes – so everyone knows about the whole system and could take over from another developer if needed. To me this broadens the entire teams’ understanding of the system and who knows someone might have a bright idea that helps the entire project! Currently only 1 or 2 people really understand how the database hangs together but by getting everyone involved I’m hoping to spread that knowledge around. Having everyone involved and getting the web developers to write SQL and DAL code was accepted as a good idea though it might be a new learning curve for some.

I also suggested unit testing and that was somewhat reluctantly accepted – I’d been going on about it for ages and the general perception (I think) is that it’s a lot of extra work for very little benefit. I originally suggested that the specs simply be written as unit tests then the developers would be free to come up with their own solutions rather than being told to loop this and do that. I nearly got that accepted but instead the spec includes pseudo code unit tests and pseudo code for us to translate into code. But hey some unit testing is better than none and hopefully the benefits of unit testing will start to become more obvious, our tester should also be freed up from some really tedious testing and start to clear his backlog I hope.

The issue today

Now the fight I’ve been having today is about how data should be made available. Historically the specs make heavy use of DataTables, (the VB developers were vb5 coders and for what they do DataTables make a lot of sense) there are no business objects, all interaction is done through the DataTables. I would love to take advantage of an ORM like NHibernate, unfortunately that’s another battle I’m having – so for the time being its got to be ADO.NET. My experience tells me that writing classes to represent business objects populated by DataReaders is the best way to go about this but, the Analysts don’t see the point in using DataReaders, DataTables work for them so they should work for the web. I’ve tried explaining that as DataTables are in memory mini databases they represent a much bigger memory footprint and this won’t scale as efficiently as the approach I’m suggesting, that and they’re slower than DataReaders as they use DataReaders behind the scenes to populate their DataRows.

Granted there is more work in getting results from the database with a DataReader and then iterating through that to populate objects or strongly typed Lists of objects, but it’s a less memory intensive solution, isn’t using Fisher Price programming; yes it will need updating if they change the data schema – unless we code gen much of the plumbing, but you can at least unit test an object / mock it, very hard to do that with a DataTable.

At times I feel like I’m banging my head on a brick wall that almost seems to delight in refusing to listen.

Have you had any similar experiences that might give me some insight into how to deal with this?

Comments

  1. United Kingdom El Karlos (Wednesday, September 02, 2009) # El Karlos

    At least you're not fighting alone!

  2. United Kingdom SimonMartin (Friday, September 04, 2009) # SimonMartin

    Hi Alex,
    I have started to look at the MS Enterprise Library and I've been pretty impressed with the validators in there; but there's so much to it that I haven't looked at the data access side of things - so will spend a bit of time digging into that thanks Smile

    I would have liked to go with LINQ to SQL but we're using an Oracle DB so that would mean using Entity Framework. Not that I'm against that, I'll use whatever is the best solution for the problem. I suspect you may be right, something with Visual Studio integration and support where things can be visually identified will feel closer to the DataTable environment and so be less of a 'shock to the system'.

  3. United Kingdom Alex (Friday, September 04, 2009) # Alex

    hi simon,

    We are using the MS Enterprise Library Block (Data Access Application Block) and its pretty damn good. gives you great flexibility to design your domain model and business logic without rewriting all that data access code that you've written thousands of times before.

    It certainly isnt an ORM but has a quick learning curve, disposes of itself nicely and is especially good for enterprise level apps.

    You could always push for Linq to SQL with its support of POCOs and then you might have the best of both worlds, an ORM data access layer that ultimately provides you with an object or collection of objects that are strongly typed, capturing errors at compile time not runtime and, improved development turnaround (quicker that the steep learning curse for nhibernate) and something that is more like disconnected data and datatables than dealing with datareaders and connection handling (although trivial) etc. Yes Linq to SQL is potentially slower that dealing with raw data readers, but this is most probably no worse that that performance impact of disconnected datasets.

    IMO.

Comments are closed