Create Sortable Collections under 60 Seconds

Monday, 26 May 2008 18:02 by Alan Mojab

Storing a collection of reference types and value types is a common practice in software development. Since generic types were introduced in .NET Framework 2.0, working with collection elements has become much easier.

The ArrayList type used to be the common choice for storing objects, but the new kid on the block, the generic List, has taken over its popularity. Because ArrayList stores every element as an object, reading a reference type back required a cast to the actual type, which meant extra code and reduced performance. With value types, developers faced boxing when storing the data and unboxing when retrieving it from an ArrayList instance.

With the generic List type you have instant access to the elements of a List instance without casting or unboxing the data. The generic List has far more members than its predecessor ArrayList. I’m not going to delve into each member and explain what it does. I’m sure most of you are already familiar with them, and they are well documented in the MSDN Library and by other developers.

In short, the generic List’s built-in functionality (its members) works best with the .NET Framework’s simple types, such as the value types and string, and less well with your own reference types. As long as you only use a generic List to pass a collection of data around, you will never face the fundamental problem I’m about to highlight here.

Please examine the following snippet:

List<string> names = new List<string>();
if (names.Contains("Mark"))
{
    // Do Something 
}

The above code makes the generic List a glorious type to work with, because it offers the encapsulated “Contains” logic to check with ease whether a value exists in the collection.

Please examine the following snippet:

List<Person> persons = new List<Person>();
if (persons.Exists(this.MarkExists))
{
    //Do Something
} 


private bool MarkExists(Person person)
{
    return (person.FirstName == "Mark");
}

 
On paper, the capability of the generic Predicate delegate combined with other generic types such as the generic List is smashing, but at least in the case of the generic List it offers hardly any practical solutions in the real world.

The downsides of working with the generic Predicate and the generic List are as follows:

  • Possibility of duplicating logic: Even if you declare the MarkExists method with public or internal visibility, you can’t be sure that other types can access the enclosing type at all times.
  • Too many members to declare: You can’t possibly declare one target method for every name that might appear in Person’s FirstName property. And what about the other business logic that you might have for other members of the Person type?

In one of my earlier blog posts I talked about encapsulating logic, which is one of the most fundamental practices in software development. Any logic that you write against a generic List object is not necessarily encapsulated. This is the very reason why, as soon as I need to write any logic against a generic List object, I declare a new type that extends the generic Collection type and encapsulate the logic within it.

I can pass such objects from member to member, or from the declaring type to other types, knowing that I have access to all the predefined logic of my custom collection type.
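As a minimal sketch of that idea, the lookup from the earlier Predicate example can live inside a type that extends the generic Collection. The names PersonCollection and ContainsFirstName are hypothetical, and the Person type is reduced to a single property for illustration.

```csharp
using System;
using System.Collections.ObjectModel;

// Hypothetical Person type, reduced to the one property used here.
public class Person
{
    private string m_FirstName;

    public Person(string firstName)
    {
        this.m_FirstName = firstName;
    }

    public string FirstName
    {
        get { return this.m_FirstName; }
    }
}

// Extends the generic Collection type so the lookup logic travels with the collection.
public class PersonCollection : Collection<Person>
{
    // One parameterised method instead of one predicate target method per name.
    public bool ContainsFirstName(string firstName)
    {
        foreach (Person person in this.Items)
        {
            if (person != null &&
                string.Equals(person.FirstName, firstName, StringComparison.Ordinal))
            {
                return true;
            }
        }
        return false;
    }
}
```

Any code that receives a PersonCollection can then call persons.ContainsFirstName("Mark") without needing access to the type that originally declared the predicate method.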

The generic Collection type has a very simple object model, which is actually an advantage because it makes the type extremely easy to extend. It can be found in the System.Collections.ObjectModel namespace.

If you only develop database applications, then chances are high that you are using more complex collection types, such as the powerful collection types that come with O/RM tools or the custom collections you built for your own needs. This post is less relevant to entities and entity collections.

Smarties 2008 is not a database application, but for almost every type in it you would find an extended generic Collection.

Often you need to introduce sorting capability to the extended Collection types, or add a simple AddRange() method to make inserting objects into the collection easier. The generic Collection does not offer an AddRange() method by default, although you can pass any object that implements the generic IList interface to its constructor. We all know that it is not always possible to pass the data at initialisation time.
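A hedged sketch of what such an extension could look like follows. The SortableCollection name and both method bodies are assumptions made for illustration; this is not the code that Smarties 2008 generates.

```csharp
using System;
using System.Collections.Generic;
using System.Collections.ObjectModel;

// Hypothetical base type that adds AddRange() and Sort() to the generic Collection.
public class SortableCollection<T> : Collection<T>
{
    // Adds every item through Add(), so InsertItem overrides in derived types still run.
    public void AddRange(IEnumerable<T> items)
    {
        if (items == null)
        {
            throw new ArgumentNullException("items");
        }
        foreach (T item in items)
        {
            this.Add(item);
        }
    }

    // Sorts a copy with List<T>.Sort and writes the result back in order.
    // A null comparer makes List<T>.Sort fall back to Comparer<T>.Default.
    public void Sort(IComparer<T> comparer)
    {
        List<T> sorted = new List<T>(this.Items);
        sorted.Sort(comparer);
        for (int i = 0; i < sorted.Count; i++)
        {
            // Writes to the inner list directly, bypassing SetItem.
            this.Items[i] = sorted[i];
        }
    }
}
```

For a collection of reference types you would pass your own comparer, for example a class implementing IComparer<Person> that compares on one property. Copying before sorting keeps the sketch independent of which IList implementation backs the collection, at the cost of one temporary list.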

I get a lot of joy out of my work when the output is something that is both practical and so damn easy to use. You will find many commands in Smarties 2008 that are unique and, in terms of productivity, simply unbeatable. On the same day that I published this post I released version 1.9.0 of Smarties 2008, whose main focus was to introduce sorting to the extended Collection type and to make it easier to work with the generic IComparer, IComparable, and generic IComparable interfaces.

I have created a demo for each supported language (C# and VB.NET) that is not specific to a single command (as my demos often are). It shows off several commands in action, and you will see how easy it is to create sortable collections using Smarties 2008. Please don’t think of entity collections or try to compare what you see in the demo with what you get from the entity generators of O/RM tools. In the real world you would have far more non-entity types than entity types anyway.

Before watching the demo, I would like you to think of a class with at least three properties, an extended generic Collection type (since you would be encapsulating logic), and the comparer type that is needed for sorting the reference types in the collection. That is a lot of work for very basic functionality, even if you use copy and paste, isn’t it?

To be precise, the demo takes 51 seconds to produce the output. In fact, the only thing I did by hand for this demo was to declare the three private fields beforehand; the rest was done with just a few clicks of the mouse.

Here is the link to Demo Page.

IDisposable Interface explained!

Thursday, 17 April 2008 16:09 by Alan Mojab

To take advantage of this article you should already have a basic understanding of the IDisposable pattern. Dan Rigsby has recently published a great article on how to implement the IDisposable pattern. You will find complete code snippets and a download link to the source code on his blog.

There is a misperception that IDisposable is used only when a class holds unmanaged resources. The MSDN Library documentation is partially to blame for this, because it mentions only “unmanaged resources” rather than both managed and unmanaged resources. Here is what the MSDN Library says about the IDisposable interface:

“Defines a method to release allocated unmanaged resources.”

The definition only covers half the story.

I’ve recently written an article, Designing rich classes with Smarties 2008, in which I talked about ways to make a class rich. One way to make a class rich is to provide services to its users through the common interfaces it implements, e.g. IDisposable, ISerializable and ICloneable.

Since the release of C# 2.0 IDisposable’s responsibilities have increased. One of the cool things that you can do in C# is to declare an object within a scope, at the end of which the object is disposed.

Example:

using (Class1 obj = new Class1())
{ 


}

The C# specification requires that a type used in a “using” statement implements the IDisposable interface (or is implicitly convertible to it); a public (Public in Visual Basic) Dispose() method on its own is not enough. I think this on its own supports my theory that there is nothing wrong with implementing IDisposable for all declaring types. By doing so you ensure the user of the type is able to take advantage of the “using” keyword in C#.
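As a rough sketch of what the “using” statement does for you: the compiler wraps the body in a try/finally block and calls Dispose() through the IDisposable interface when the scope ends. The Class1 body and the UsingExpansion helper below are hypothetical stand-ins written for illustration.

```csharp
using System;

// Hypothetical stand-in for the Class1 type used in the snippet above.
public class Class1 : IDisposable
{
    private bool m_IsDisposed;

    public bool IsDisposed
    {
        get { return this.m_IsDisposed; }
    }

    public void Dispose()
    {
        this.m_IsDisposed = true;
    }
}

public static class UsingExpansion
{
    // Roughly equivalent to: using (Class1 obj = new Class1()) { ... }
    public static Class1 Run()
    {
        Class1 obj = new Class1();
        try
        {
            // The body of the using block would go here.
        }
        finally
        {
            // Runs even when the body throws.
            if (obj != null)
            {
                ((IDisposable)obj).Dispose();
            }
        }
        return obj;
    }
}
```

The cast to IDisposable in the finally block is the reason a type must implement the interface to be usable with the keyword.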

As part of my research for this article I looked for patterns in the .NET Framework. Up to Framework 2.0 it appeared that Microsoft only used IDisposable where unmanaged resources or unsafe types were involved, but I started noticing two things in Framework 3.0 and higher. Remember that the majority of Framework 2.0 types were ported from 1.1. The first thing was that IDisposable appeared to be implemented by types that have no unmanaged resources or unsafe types. The second thing was the pattern I use with collection types (types that extend the generic Collection class), which I’ll explain shortly.

This is my conceptual understanding of how GC (Garbage Collection) works in .NET...

When an application or module (a single .dll) is launched, the OS creates a process and loads the application/module into it. The process itself is a boundary that knows which memory addresses it can use. A process cannot consume the memory addresses of another process on the same machine. Creating the process is up to the OS when the application is written in a low-level programming language such as C++.

In a C++ environment each application is responsible for releasing the memory it consumes. Failing to do so results in memory leaks, and eventually the machine runs out of memory for new processes to consume. C++ developers have to write extra code to release everything they consume, and this makes developing C++ applications very difficult.

.NET developers don’t have the same issues that C++ developers have, because the .NET Framework itself creates the processes and can therefore monitor and manage all the processes that it creates for assemblies. This frees .NET developers from writing routines to release the consumed memory.

To reclaim used memory, the .NET Framework has a service called GC (Garbage Collection) that scans the memory at certain intervals and reclaims memory that is no longer in use. I really don’t know on what complicated algorithm the GC bases its decision to start scanning. One thing I do know is that the GC is designed to receive low-memory notification messages from the OS to kick-start the scanning process.

In .NET, when you assign an object to null (Nothing in Visual Basic) you actually instruct the CLR (Common Language Runtime) to make the memory address of the object inaccessible, but the object’s data is not yet destroyed in memory (the heap). I have assumed there is a table (graph) that holds such memory addresses for the GC to use. At certain intervals the GC scans the table (graph) to destroy the object’s data or to delete the pointers that held a reference to another object in memory.

What about when you don’t assign objects to null (Nothing in Visual Basic) when you are done using them? In this case the GC examines all allocated memory addresses within your application domain to see whether each object in memory is referenced by other objects. When an object is referenced by another object, there is a pointer for this reference in the memory table. If no pointers are found for an object and the object is out of scope, then the object is destroyed.

As you can see, checking the entire memory table (graph) for objects that were not assigned to null would take longer than scanning only the memory addresses already marked to be reclaimed. In terms of ticks you would never notice the difference between the two, but if your application stores large amounts of data you might notice sluggish performance from time to time when the GC is in action.

You can also think of using the IDisposable interface to help the GC do its job quicker by assigning all declared fields to null (Nothing in Visual Basic). Obviously there are a few more valid reasons, which I’ll try to cover in this article.

Now that I have covered a very basic concept of how the GC works, we can move on to the pattern that I use for types that extend the generic Collection class.

Please examine the following type;

public class EmployeesCollection 
: Collection<Employees>
{
    public EmployeesCollection()
    { 


    }
}

The class above represents a collection of Employee type.

In database-driven applications the user might run a query that fetches hundreds of Employee records from the data store. If the user repeats the same action, there is a good chance that the user consumes more memory before the GC gets a chance to clean up the memory of the previous fetch. The process that the application is loaded into can then run out of memory addresses to use, and the application finally crashes.

These days newly built machines have over 1GB of memory. A single process can allocate up to 2GB of address space on a 32-bit system. That is a lot of room for data, but the EmployeesCollection object is not going to be the only object that resides in the application’s process at any given time.

It is a good software development principle to take the necessary measures once an object’s lifetime has been reached. I will show you another real-world example so that you can see how important this principle is, even on the .NET platform with the GC at your disposal.

Now, let’s see the IDisposable pattern in action. Please assume the Employee class already implements the IDisposable interface. The IDisposable pattern for the EmployeesCollection would look like the snippet below:

private void Dispose(bool disposing)
{
    if (!this._Disposed)
    {
        if (disposing)
        {
            foreach (Employees emp in base.Items)
            {
                emp.Dispose();
            }
            base.Clear();
        }
    }
    this._Disposed = true;
} 


public void Dispose()
{
    this.Dispose(true);
    GC.SuppressFinalize(this);
}

The important point is in the innermost if statement. Please observe how each Employees object gets disposed and then the collection’s Clear() method is called. With a single call to the Dispose() method you can now take care of everything.
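Put together as one compilable unit, the pattern might look like the sketch below. The Employee body and its IsDisposed flag are hypothetical and exist only so that the effect can be observed; the element type is written here as Employee.

```csharp
using System;
using System.Collections.ObjectModel;

// Hypothetical element type; a real Employee would release its own fields in Dispose().
public class Employee : IDisposable
{
    private bool m_IsDisposed;

    public bool IsDisposed
    {
        get { return this.m_IsDisposed; }
    }

    public void Dispose()
    {
        this.m_IsDisposed = true;
    }
}

public class EmployeesCollection : Collection<Employee>, IDisposable
{
    private bool _Disposed = false;

    protected virtual void Dispose(bool disposing)
    {
        if (!this._Disposed)
        {
            if (disposing)
            {
                // Dispose every element first, then empty the collection.
                foreach (Employee emp in this.Items)
                {
                    if (emp != null)
                    {
                        emp.Dispose();
                    }
                }
                this.Clear();
            }
            this._Disposed = true;
        }
    }

    public void Dispose()
    {
        this.Dispose(true);
        GC.SuppressFinalize(this);
    }
}
```

A caller can then write using (EmployeesCollection employees = new EmployeesCollection()) { ... } and every element is disposed when the scope ends. Calling Dispose() a second time is a no-op because of the _Disposed flag.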

To complement this article I have added support (C# only) to Smarties 2008’s Smart IDisposable command to create the foreach loop statement shown in the snippet above. The next release, 1.3.6, which I plan to release on the same day as this article, will include this feature.

Finally, I’m going to give you a real-world example of another scenario where implementing the IDisposable interface becomes a necessity.

Smarties 2008 has a command called Regionize This that organises members within designated #region directives. The Regionize This command can be executed from various locations, including the Solution menu in Solution Explorer. Once the command is executed, it traverses the projects and their project items.

Regionize This goes through a relatively complex process to complete its job and consumes a relatively large amount of data during the regionization. For a large solution such as Microsoft Enterprise Library, with over 3,000 .cs files, it takes a great deal of care not to run out of memory while the Regionize process is traversing and processing all the files. One way to do this is to handle a single file (project item) at a time and dispose virtually all the types used by the Regionize process for that file. I couldn’t simply rely on the GC to clean up all the types that the Regionize process consumed for each .cs file, because the process instantiates the next set of types as soon as it completes the current one.
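The one-file-at-a-time idea can be sketched as follows. FileWorkItem and BatchProcessor are hypothetical names invented for this illustration and are not the actual Regionize This types; the Process() body is only a placeholder.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical per-file working set.
public class FileWorkItem : IDisposable
{
    private string m_Path;
    private List<string> m_Members = new List<string>();

    public FileWorkItem(string path)
    {
        this.m_Path = path;
    }

    public void Process()
    {
        // Placeholder for parsing the file and organising its members.
        this.m_Members.Add("parsed:" + this.m_Path);
    }

    public void Dispose()
    {
        // Release the per-file data as soon as this file is finished.
        if (this.m_Members != null)
        {
            this.m_Members.Clear();
            this.m_Members = null;
        }
    }
}

public static class BatchProcessor
{
    // Processes each path in turn and returns the number of files handled.
    public static int ProcessAll(IEnumerable<string> paths)
    {
        int processed = 0;
        foreach (string path in paths)
        {
            // The working set for one file is disposed before the next one is created.
            using (FileWorkItem item = new FileWorkItem(path))
            {
                item.Process();
                processed++;
            }
        }
        return processed;
    }
}
```

Because each working set lives inside its own using block, at most one file's data is reachable at any time, whenever the GC happens to run.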

Now we have reached the real question: when is it safe to call a Dispose() method? The answer really depends on how a type is designed and how it is used within an application, so I can only offer you guidelines for spotting the hidden implications.

Please examine the following class:

public class ClassA : IDisposable
{
    private string m_Field1 = null;
    private int m_Field2 = 0;
    private bool _Disposed = false; 


    public string Field1
    {
        get
        {
            return this.m_Field1;
        }
        set
        {
            this.m_Field1 = value;
        }
    } 


    public int Field2
    {
        get
        {
            return this.m_Field2;
        }
        set
        {
            this.m_Field2 = value;
        }
    } 


    protected virtual void Dispose(bool disposing)
    {
        if (!this._Disposed)
        {
            if (disposing)
            {
                this.m_Field1 = null;
            }
        }
        this._Disposed = true;
    } 


    public void Dispose()
    {
        this.Dispose(true);
        GC.SuppressFinalize(this);
    } 


    ~ClassA()
    {
        this.Dispose(false);
    }
}

ClassA does not declare any reference-type fields that implement the IDisposable pattern, so its overloaded Dispose(bool) method has no implications. Please examine ClassB now:

public class ClassB : IDisposable
{
    private ClassA m_Data = null;
    [System.NonSerialized]
    private bool _Disposed = false; 


    public ClassB()
    {
        this.m_Data = new ClassA();
    } 


    public ClassA Data
    {
        get
        {
            return this.m_Data;
        }
    } 


    public void Dispose()
    {
        this.Dispose(true);
        GC.SuppressFinalize(this);
    } 


    ~ClassB()
    {
        this.Dispose(false);
    } 


    protected virtual void Dispose(bool disposing)
    {
        if (!this._Disposed)
        {
            if (disposing)
            {
                if (this.m_Data != null)
                {
                    ((IDisposable)this.m_Data).Dispose();
                }
                this.m_Data = null;
            }
        }
        this._Disposed = true;
    }
}

ClassB declares a reference-type field (m_Data) whose type implements the IDisposable interface. You need to pay close attention to how ClassB is designed. For one, the Data property is read-only, so there is no chance that it is assigned from outside with another instantiated ClassA object. The second thing to notice is that ClassB has no constructor that takes a ClassA parameter, and the m_Data field is instantiated in the default constructor. Once again, the Dispose() method of ClassB can be called without any implications.

Please examine ClassD now. I do know that ‘C’ comes after ‘B’ in the English alphabet, but ClassC is so ugly to type :)

public class ClassD : IDisposable
{
    private ClassA m_Data1 = null;
    private ClassB m_Data2 = null;
    [System.NonSerialized]
    private bool _Disposed = false; 


    public ClassD()
    {
        this.m_Data1 = new ClassA();
    } 


    public ClassD(ClassB data2)
        : this()
    {
        this.m_Data2 = data2;
    } 


    public ClassA Data1
    {
        get
        {
            return this.m_Data1;
        }
    } 


    public ClassB Data2
    {
        get
        {
            return this.m_Data2;
        }
        set
        {
            this.m_Data2 = value;
        }
    } 


    protected virtual void Dispose(bool disposing)
    {
        if (!this._Disposed)
        {
            if (disposing)
            {
                if (this.m_Data1 != null)
                {
                    ((IDisposable)this.m_Data1).Dispose();
                }
                this.m_Data1 = null;
                if (this.m_Data2 != null)
                {
                    ((IDisposable)this.m_Data2).Dispose();
                }
                this.m_Data2 = null;
            }
        }
        this._Disposed = true;
    } 


    public void Dispose()
    {
        this.Dispose(true);
        GC.SuppressFinalize(this);
    } 


    ~ClassD()
    {
        this.Dispose(false);
    }
}

ClassD’s design is similar to ClassB’s, with the exception of the m_Data2 member, which can be assigned via the constructor or the setter of the Data2 property. The overloaded Dispose(bool) method calls the Dispose() method of m_Data2 if it is not null. This is the scenario that you need to watch, because you can easily cause a runtime exception in your application.

When a type’s member can be assigned in this manner, it most likely (not always) means that you will pass ClassD a reference to an object that is also used elsewhere. Please examine the code snippet below to see what I mean:
 

ClassB objB = new ClassB();
ClassD objD = new ClassD(objB);
objD.Dispose();
// objD.Dispose() also disposed objB, so objB.Data now returns null and
// any later use of objA throws a NullReferenceException.
ClassA objA = objB.Data;

Here is the correct version of the overloaded Dispose(bool) method:

protected virtual void Dispose(bool disposing)
{
    if (!this._Disposed)
    {
        if (disposing)
        {
            if (this.m_Data1 != null)
            {
                ((IDisposable)this.m_Data1).Dispose();
            }
            this.m_Data1 = null;
            this.m_Data2 = null;
        }
    }
    this._Disposed = true;
}

We simply assign null to the m_Data2 field. In the real world, the object that was instantiated first can then be safely disposed by its creator, as shown in the following snippet:

ClassB objB = new ClassB();
ClassD objD = new ClassD(objB);
objD.Dispose();
ClassA objA = objB.Data;
objA.Dispose();
// All references to objB are also Disposed
objB.Dispose();

As you have seen, how a type is constructed determines how safe or unsafe it is to call the Dispose() method of reference-type fields that implement the IDisposable pattern. Perhaps unlike what you assumed previously, the key to implementing a trouble-free IDisposable pattern is actually in the overloaded Dispose(bool) method.

You need to be aware of how reference-type fields are assigned or instantiated in the types that implement IDisposable, and then decide what measures to take. The Smart Interface command of Smarties 2008 can never correctly determine a type’s design, so it prompts the developer to select the fields that implement the IDisposable pattern, since the developer knows for which fields it is safe to call the Dispose() method.

The only other thing you need to watch is not to call a type’s Dispose() method too early, while copies of the same reference are still being used.
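The rule from the ClassD discussion can be condensed into a small sketch: dispose what the type created itself, and only let go of what was handed in from outside. Resource and Holder are hypothetical names chosen for this illustration.

```csharp
using System;

// Hypothetical disposable type with a flag so the effect can be observed.
public class Resource : IDisposable
{
    private bool m_IsDisposed;

    public bool IsDisposed
    {
        get { return this.m_IsDisposed; }
    }

    public void Dispose()
    {
        this.m_IsDisposed = true;
    }
}

public class Holder : IDisposable
{
    private Resource m_Owned;     // instantiated here, so safe to dispose here
    private Resource m_Borrowed;  // supplied by the caller, so only released
    private bool _Disposed = false;

    public Holder(Resource borrowed)
    {
        this.m_Owned = new Resource();
        this.m_Borrowed = borrowed;
    }

    public Resource Owned
    {
        get { return this.m_Owned; }
    }

    public void Dispose()
    {
        if (!this._Disposed)
        {
            if (this.m_Owned != null)
            {
                this.m_Owned.Dispose();
            }
            this.m_Owned = null;
            // The caller may still be using this object, so only drop the reference.
            this.m_Borrowed = null;
            this._Disposed = true;
        }
    }
}
```

The object that was passed in stays usable after the holder is disposed, and its creator remains responsible for disposing it.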

Conclusion

I hope this article has cleared up some of the confusion or doubts that you had about the IDisposable interface. I leave you with some key points from this article so that you can come back and remind yourself of them from time to time:

  • Try to see IDisposable interface as a service to a type.
  • Collection types that hold Entities or Business Objects must implement IDisposable interface.
  • To take advantage of “using” in C# make sure types are Disposable.
  • Types that are used in recursive or traverse processes must implement IDisposable to release resources as soon as the type’s lifetime is reached.
  • The key to implement IDisposable safely is in Overloaded Dispose(bool) method and taking a good look at how the type is designed.
  • Ensure the Dispose() method is not called too early when copies of the same reference are still being used.
  • IDisposable can ease the job of the GC.
  • Consider assigning objects to null (Nothing in Visual Basic) more often when IDisposable is not implemented i.e. string (String in Visual Basic) type.
  • IDisposable is not just for unmanaged resources.

Added the rest on 20 April, 2008

After Greg’s discussion I felt I needed to add more details to clarify certain parts of this article.

One of the things that I touched on was how the GC is capable of cleaning up the memory. I stated that since the .NET runtime launches the processes, the GC can monitor and clean up the memory.

The question you need to ask yourself is:

If the GC is so clever, why doesn’t it clean up all the processes on your machine, even those that are not compiled for .NET?

Well, it can’t, because managed code is executed very differently. In the .NET environment assemblies are loaded into Application Domains. Here is what the MSDN Library says about Application Domains:

"Application domains provide a more secure and versatile unit of processing that the common language runtime can use to provide isolation between applications. You can run several application domains in a single process with the same level of isolation that would exist in separate processes, but without incurring the additional overhead of making cross-process calls or switching between processes. The ability to run multiple applications within a single process dramatically increases server scalability."

In essence, an Application Domain has to be created first within the process to host the assembly (Process Module) that is being loaded.

I also stated that, unlike a C++ application, when a .NET application is launched or executed the framework runtime creates the process, not the OS. I do agree that this statement can be a bit confusing to some readers. Here is what I meant by it.

To understand this, we first need to understand .NET assemblies, which are the fundamental unit of deployment. Unlike applications written in C++, .NET assemblies cannot be executed by the OS “directly”, but they contain information about how they can be loaded into a runtime (VM). Let’s call it the “loader” information that the OS can understand. I once read about this in the early days of .NET, but I can’t remember what exactly it is called now. It is true that once you pass 40 your memory starts rusting :)

Here is what MSDN says about assemblies;

"Assemblies are the building blocks of .NET Framework applications; they form the fundamental unit of deployment, version control, reuse, activation scoping, and security permissions. An assembly is a collection of types and resources that are built to work together and form a logical unit of functionality. An assembly provides the common language runtime with the information it needs to be aware of type implementations. To the runtime, a type does not exist outside the context of an assembly."

The way I understand everything up to this point is that executable assemblies (*.exe) are packed with types and resources that can be loaded, executed, and managed in an environment called the .NET Framework. They have no characteristics in common with the native applications that we have known from the past. The OS has no knowledge of how to execute (run) them, therefore no process is created “directly” from the call to (launching of) the executable assembly. The only thing the OS can do is to pass the assembly to its runtime.

The framework itself is written in C++ to create an environment for managed code. It calls Windows APIs to do many things, e.g. to create a new process or to unload a process. In that sense every process created to host an assembly, known as a Process Module, is created by the OS, but via a call from the framework (runtime).

Something extra about GC…

The GC is not as active in the background as you might think; at least, by the time an application is terminated (unloaded) the GC still has a few jobs to finish in the memory clean-up process. One way to observe this is to insert breakpoints into the Dispose() method of types that implement IDisposable and then unload the running application. I’m not too sure which part of the framework is responsible for these calls, but I doubt that they come from the GC. If you do know, then please let us all know.

Picture a web hosting server, if you will: if the GC were meant to be that active, all the processes running on the server would require extra processing power to handle the GC activity.

You have to forgive me for not being good at putting down what I really want to say. I have only recently come out of my shell after a number of years working in the software development field. I will only get better at writing my next articles for you :)

Designing rich classes with Smarties 2008

Wednesday, 9 April 2008 17:04 by Alan Mojab

A class represents a template for an object that can have attributes and/or behaviours. The object itself normally represents something in the real world, e.g. a Person, or something that does not physically exist but that one can define, e.g. a Blog Post.

In the .NET development platform, classes and structs are the core object templates. Classes and structs derive from .NET’s Object class. Then there are interfaces. An interface is a contract template that class and struct types can sign up to, to ensure they implement the members in the contract.

Quite often I use the phrase “Rich Class” to suggest that the class in question exposes common services to its users. A “Rich Class” is also a type that encapsulates all possible logic within itself. This is the most fundamental design strength of a class.

Here is an example of what I mean. You have a class called Person with FirstName and LastName properties that the user can use to retrieve values from and assign values to the designated fields. Now imagine the Person class is used by hundreds of developers. If the developers need to show the full name of the person, they need to write a concatenating statement like the code snippet below:

textBox1.Text = person.FirstName + " " + person.LastName;

If the Person class were “rich”, it would either have a read-only property called FullName or override the ToString() method of System.Object to make life easier for the users of the Person class.

Examples:

textBox1.Text = person.ToString();

or

textBox1.Text = person.FullName;

The poor design of the Person class made all those developers write extra, unnecessary code.
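A minimal sketch of the richer Person class could look like this. Only the FirstName, LastName, FullName and ToString() members come from the text above; the constructor and field names are assumptions made for illustration.

```csharp
public class Person
{
    private string m_FirstName;
    private string m_LastName;

    public Person(string firstName, string lastName)
    {
        this.m_FirstName = firstName;
        this.m_LastName = lastName;
    }

    public string FirstName
    {
        get { return this.m_FirstName; }
        set { this.m_FirstName = value; }
    }

    public string LastName
    {
        get { return this.m_LastName; }
        set { this.m_LastName = value; }
    }

    // Read-only property that encapsulates the concatenation once.
    public string FullName
    {
        get { return this.m_FirstName + " " + this.m_LastName; }
    }

    // Overriding ToString() gives the same result wherever a string is expected.
    public override string ToString()
    {
        return this.FullName;
    }
}
```

If the display format ever changes, for example to last name first, only the FullName getter needs to change and every caller picks it up.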

You can also design rich classes that expose services to their users. For instance, I look at the Dispose() method of the IDisposable pattern as a service for types. It is up to the user whether to call it or not. Incidentally, if you intend to use the “using” keyword in C#, the type is required to implement the IDisposable interface.

Similar to IDisposable, you can introduce ISerializable and ICloneable as services too. Developers can’t predict everything that the user of a type will intend to do, therefore it is always good practice to make a type as rich as possible.

I like to design rich types because I know such types will save me a great deal of time later on, when I have less of it. Creating rich classes is very difficult and time consuming, and time is something that developers never have.

Considering that developers are required to keep learning in order to use the never-ending stream of new technologies, time becomes ever more precious. I have noticed that in the past 2-3 years software quality has dropped dramatically. Obviously, apart from the lack of time and the rush to get the product out to bring in the money, there are other factors involved too: the lack of skills/experience, outsourcing, and a few more.

Now to the good part…

Smarties 2008 helps developers design better types by reducing the coding time and freeing the developer to work more on the types themselves. With Smarties 2008 many painful routines can be done with a couple of clicks.

Smarties 2008 features:

Region Commands: If you use #regions to organise code and boost productivity, then you will find Smarties 2008 very handy, and even if you dislike #regions, Smarties 2008 gives you all the commands you need to remove them from your source files.

Refactor Commands: Common refactor commands plus many unique commands that are pure time-savers.

Smart Interfaces: You can click 2-3 times and implement one of the supported Interfaces with full statements generated. The supported interfaces are IDisposable, ISerializable, and ICloneable.

Data Commands: Even though Smarties 2008 is not an O/RM or DAL generator, we have recently added data-related commands. One of the things you can do is create a flat BO/VBO from one of the nine supported database engines. I’m currently working on adding DAO support to save you even more time.

This is true to say if developers have the right tools they would be more productive therefore can produce better software. A single feature in photoshop the king of graphic applications doesn’t do much and it doesn’t make you creative either but collectively you can do so much and make eye catching graphic images. In that sense Smarties 2008 might appear not to do much at first glance but collectively you can achieve a lot.

Imagine you need to create a class that has four properties of any type and implements the IDisposable, ISerializable (since it will be transported over the wire) and ICloneable interfaces. I’m not done yet… The class also needs to produce an XML string to pass the data to a legacy system. With Smarties 2008 all you need to do is declare the fields, and the rest is done with a few clicks. With a couple more clicks you can even create a generic collection of the above type. When you have powerful features at your fingertips, you don’t need to be so economical in your design.

I have really just scratched the surface of what you can achieve with Smarties 2008. Please feel free to visit the site to download a free trial or to watch the videos to see what Smarties 2008 can do for you.

Data Access Objects

Tuesday, 8 April 2008 11:21 by Alan Mojab

I promised to blog about the lack of DAO support in Business Object Generator command.

As you might have noticed by now I have created a new menu under Smarties 2008 called “Data Commands” to list all the relevant commands there. One of the first commands that I added for “Data Commands” was the BO/VBO Generators. Basically my goal was to add the capability for the developers to create flat BO or VBO from database tables.

Early on in the work I mentioned DAO and how simple it would be. After working for a few days to see how I could do it, I gave up, because I could only see myself repeating what I had done in the past with my actual OR/M project.

I have also concluded that touching DAO would sidetrack me from the Smarties 2008 project altogether. Believe me when I say I have a weakness when it comes to DAL, because I’m so passionate about making this process easier.

There are three main approaches that I can think of for creating a DAL. Before I talk about them, let me explain, or rather sell, the idea of using Business Objects instead of the conventional DataSets.

There are millions (exaggerating, of course) of software development principles that one needs to follow to reduce the risk of failure. The more you know, the more you realise how painful software development actually is. One of these principles is to have an object template that represents an entity within a domain, e.g. Employee. My biggest problem with the DataSet class has always been that it cannot be defined as an entity.

Picture this, if you will: you have a method called ProcessEmployees(DataSet emp) that accepts a parameter of type DataSet. If you are going to expose such a method to consumers, then each method that accepts a DataSet needs to check whether the DataSet contains all the correct attributes for an Employee. This is a nightmare for sure.
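To make the contrast concrete, here is a minimal sketch. The Employee class, the "Employee" table name and the "Id" and "FirstName" columns are assumptions invented for illustration; only the ProcessEmployees name comes from the text above.

```csharp
using System;
using System.Collections.Generic;
using System.Data;

// Hypothetical entity; the property names are illustrative only.
public class Employee
{
    public int Id { get; set; }
    public string FirstName { get; set; }
}

public static class EmployeeProcessor
{
    // DataSet version: the shape of the data is unknown at compile time,
    // so every method like this one has to validate it before doing any work.
    public static int ProcessEmployees(DataSet emp)
    {
        if (emp == null || !emp.Tables.Contains("Employee"))
            throw new ArgumentException("Expected an 'Employee' table.");

        DataTable table = emp.Tables["Employee"];
        foreach (string column in new[] { "Id", "FirstName" })
        {
            if (!table.Columns.Contains(column))
                throw new ArgumentException("Missing column: " + column);
        }
        return table.Rows.Count;
    }

    // Business object version: the compiler already guarantees the shape.
    public static int ProcessEmployees(IList<Employee> employees)
    {
        if (employees == null) throw new ArgumentNullException("employees");
        return employees.Count;
    }
}
```

The second overload needs no attribute checks at all, which is the point being made about strong types.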

With strong BO types you never face such fundamental issues. It can be ten times more difficult to create a DAL with BOs than with DataSets, but it is all worth it. If you use the same approach all the time and more or less the same database engine, you can always spend some quality time designing something that you can re-use for the next projects as well. This is another software development principle. The extra time you spend always pays back ten times over later on.

DataSets can come to the rescue when the structure of the data is not known until runtime (late binding). A good example of that would be the GetSchema() methods of the connection providers.

As I mentioned before there are three main approaches for creating DAL for application systems. These are:

  • Direct Interaction with ADO.NET
  • ADO.NET > DAO Helper > BO
  • OR/M

Direct Interaction with ADO.NET

In this approach you code against ADO.NET directly, which means you end up writing the same kind of routines all the time. The hardest part is populating a BO from a DataReader and writing that boring insert routine. I believe I can add support for this to Smarties 2008. Writing this article has already given me some fresh ideas, which I will come back to at the end. I’m thinking out loud now.
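The repetitive population routine mentioned above typically looks like the following minimal sketch. The Employee class and the "Id" and "FirstName" column names are assumptions for illustration; the reader calls are the standard IDataReader members.

```csharp
using System;
using System.Collections.Generic;
using System.Data;

// Hypothetical business object.
public class Employee
{
    public int Id { get; set; }
    public string FirstName { get; set; }
}

public static class EmployeeMapper
{
    // Copies every row of the reader into a strongly typed list.
    public static List<Employee> ReadAll(IDataReader reader)
    {
        var result = new List<Employee>();

        // Look the column positions up once rather than once per row.
        int idOrdinal = reader.GetOrdinal("Id");
        int firstNameOrdinal = reader.GetOrdinal("FirstName");

        while (reader.Read())
        {
            var employee = new Employee();
            employee.Id = reader.GetInt32(idOrdinal);
            // Database NULL has to be translated by hand for every nullable column.
            employee.FirstName = reader.IsDBNull(firstNameOrdinal)
                ? null
                : reader.GetString(firstNameOrdinal);
            result.Add(employee);
        }
        return result;
    }
}
```

One such routine per entity, with one assignment per column, is exactly the kind of code that a generator can write for you.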

ADO.NET > DAO Helper > BO

In this approach you use a DAO Helper, e.g. Microsoft Enterprise Library, to avoid writing the same routines to interact with the backend database. Populating the BO on data fetch events is still a pain in this approach.

OR/M

It does all you need, from generating the BOs and BO collection types to the DAOs. An OR/M doesn’t exactly reduce the number of methods you need to write for data operations, since you still need to write the routines that instruct the underlying OR/M framework what to do, but the routines you write are far easier than with the ADO.NET > BO approach. There are disadvantages to OR/M, such as synchronisation, dependencies (i.e. mapping), and losing over 60% of its power in a web environment.

You can be assured that all of the above approaches have pros and cons, so no one can say for certain which one is best. The size of the database, the deployment environment, and what you want to achieve can make one of the approaches stand out for you, while for others it might be the worst choice.

I know of many developers that are totally arrogant towards OR/M and I know of I.T. Managers that would use DataSet in an Enterprise System for the sake of DTO.

I know one thing for sure: I never use a DataSet in place of a BO, and in the past I have refused to work on projects where the man in charge wanted to use DataSets. I’m not an advocate of Linq either. I very much doubt Linq has brought developers any true advantage apart from the fancy lambda expressions, which will prove limiting, whereas in my opinion OQL (Object Query Language) can handle more object queries. Because Linq is part of the framework itself, you have to wait until the next release to see more lambda expression support. Then, once you use the new expressions, you cannot re-use your code (your investment) on the previous version of the .Net Framework. It might not sound like a big issue to you, but in the business world it is.

What I have achieved with my OR/M project was this… You could run virtually all SQL aggregate functions and mathematical functions in memory (Client-Side) to avoid hitting the database server for the simplest operations. If I’m not mistaken Linq supports some of the SQL aggregate functions in memory too.

When you fetch a record set from the database, the data resides on the client machine. If you then want the average of one of the columns, you have to write a routine with the same where clause and joins and run it against the database to get the result back. It made perfect sense to me that the lack of client-side support causes systems to hit the database far more often. We developers are not going to be immune to the “green” campaign, and one day we will be required to write ‘greener’ software. Hitting the server for a small operation is not ‘green’ at all. You might laugh at this, but one day Microsoft will issue guidelines on how to write greener software; remember where you read about it first.
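As a rough illustration of the idea (not the OR/M project’s actual API, which is not shown here), averaging a column over rows that have already been fetched needs no second trip to the server. The OrderLine type and its Amount property are hypothetical.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical row type that has already been fetched from the database.
public class OrderLine
{
    public decimal Amount { get; set; }
}

public static class ClientSideAggregates
{
    // Computes the average in memory instead of re-running the query
    // with the same WHERE clause and joins just to get AVG(Amount).
    public static decimal Average(IList<OrderLine> rows)
    {
        if (rows == null || rows.Count == 0) return 0m;

        decimal total = 0m;
        foreach (OrderLine row in rows)
        {
            total += row.Amount;
        }
        return total / rows.Count;
    }
}
```

Returning zero for an empty set is a simplification chosen for this sketch; SQL’s AVG would return NULL in that case.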

I predicted a long time ago that Microsoft’s attempt at OR/M would go the same way as the DataSet. I had a chance to look at the vNext ADO.NET a long time ago, and it really scared me how much mapping I had to do to get it to work and how many layers were in place to handle simple operations.

I really think Microsoft should stop adding certain things to the .Net Framework because they simply can’t do a good job of them, not because of technical incompetence but because of the nature of the Framework, which has its drawbacks too. The ADO.NET 2.0 provider model is great, isn’t it? Well, 99% of it is, but it can fail just because of the 1% that involves the different data types in database engines.

The nightmare (it really is one) is the Parameter’s type. Each provider knows how to handle its own data type enumeration, but the universal DbType is not truly compatible with all the providers. To make this work when writing a DAL that is meant to support multiple platforms, you need to add a new layer in between to do the mapping. The OR/M developers write their own providers on top of ADO.NET to handle this one issue. The .Net Framework did not serve its purpose very well with data providers.

What is DAL?

A Data Access Layer is made up of different objects that each serve a purpose. In general there is the Entity (BO), which stores a single row of a database table, the Entity Collection Type, which stores a collection of table rows, and the DAO (Data Access Object).

The DAO’s responsibility is either to hold all the data operations specific to an Entity or to act as a helper class when the entities in the DAL have been given the responsibility of handling their own data operations. I recall numerous discussions a couple of years ago about the DAL models, which are Domain, Entity, and Direct Table. No model came out as the winner, simply because one solution doesn’t fit all problems.

Personally, I don’t like entities to have any knowledge of how to interact with the backend database, for two reasons: security and loose coupling. When you place all data operations in the DAO, you can introduce security. The Proxy pattern works very well with DAOs. A proxy class is like an object but is not the object itself. You can also think of a proxy class as a wrapper around an object.

The best way to understand something is to use a real-world example. You are responsible for writing a DAL within your organisation and distributing it to different departments, whose developers in turn write their own UI on top of your DAL API. Within your organisation each department has different access rights to the database tables. For instance, HR can modify an Employee record, but the Accounting Dept. can only view or make very limited updates to the Employee table.

The first thing you’d notice is that if the Entity knew how to interact with the backend database, the developers in each department could do all sorts of things with the Employee table. This is where proxy classes shine. You have two choices now. The first is to create one proxy class for each department and limit the members you expose in each proxy class. The second is to implement a security check against the user’s login credentials in the proxy class to see whether he/she may execute a method.
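The second choice can be sketched as follows. Every name here (IEmployeeDao, EmployeeDao, EmployeeDaoProxy, the canUpdate flag) is hypothetical, and a real system would derive the permission from the user’s login credentials rather than from a constructor argument.

```csharp
using System;

// Hypothetical entity with no knowledge of the database.
public class Employee
{
    public int Id { get; set; }
    public string FirstName { get; set; }
}

public interface IEmployeeDao
{
    Employee GetById(int id);
    void Update(Employee employee);
}

// Stand-in for the real DAO; it only counts updates so the sketch stays runnable.
public class EmployeeDao : IEmployeeDao
{
    public int UpdateCount;

    public Employee GetById(int id)
    {
        return new Employee { Id = id, FirstName = "Mark" };
    }

    public void Update(Employee employee)
    {
        UpdateCount++;
    }
}

// The proxy exposes the same interface but checks permissions before delegating.
public class EmployeeDaoProxy : IEmployeeDao
{
    private readonly IEmployeeDao inner;
    private readonly bool canUpdate;

    public EmployeeDaoProxy(IEmployeeDao inner, bool canUpdate)
    {
        this.inner = inner;
        this.canUpdate = canUpdate;
    }

    // Reading is allowed for every department.
    public Employee GetById(int id)
    {
        return inner.GetById(id);
    }

    // Updating is refused unless the caller has the right.
    public void Update(Employee employee)
    {
        if (!canUpdate)
            throw new UnauthorizedAccessException("This department may not update Employee records.");
        inner.Update(employee);
    }
}
```

HR would receive a proxy created with canUpdate set to true and Accounting one with false, while neither department ever touches EmployeeDao directly.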

How Smarties 2008 can help?

If an OR/M is not a solution and BOs are a must, then there aren’t many options available to ease the process. Smarties 2008 version 1.3.0 supports nine database engines for creating BOs from database tables. The supported databases are MS SQL Server, MS SQL CE, MS Access, Oracle, Firebird, SQLite, PostgreSql, MySql and VistaDB.

Smarties 2008 can help ease the first two approaches by creating rich BOs and BO collection types. While writing this article I had some fresh ideas that could ease the first two approaches even further.

I might be able to work on DAO after all, which I know would save developers a lot of time. My mistake was to think in terms of OR/M and multi-platform database support for the DAO that I originally meant to design. I can now narrow this down to the database engine that the developer is using to create the Business Objects. With this approach I can generate all the routines for populating the BO on data fetch events, as well as that boring insert routine.

A must read article by Chris Love

Sunday, 16 March 2008 18:28 by Alan Mojab

I really enjoyed reading Chris's article and I recommend it to any software developer out there. However, there are a couple of points that I would like to touch on.

Dilbert Writes…

Today the Pointy Haired Boss says he follows the measure twice and cut once philosophy. Dilbert then wisely points out that in software it is really much cheaper and easier to just cut because the nature of development is not like construction or furniture making. In those traditional industries physical resources are the limiting factor in determining production costs. Labor, or time is not as expensive as natural resources, like lumber. Besides to measure in software is to debug and log.

China is moving its manufacturing abroad both because of labor shortages and to reduce production costs. Who predicted that? We need to look at labor as a physical natural resource too. Labor is a form of energy. If laborers don’t produce baby laborers, sooner or later there won’t be enough “energy” to produce enough for all. When life becomes expensive, people tend to bring less life (energy) to earth.

Software Development is actually exactly like construction and has borrowed many elements from construction methodologies.

Both labor and time are actually more expensive than natural resources. Imagine the chair a company bought for the developer to sit on. The chair was made from the natural resources that Dilbert talked about, right? Let’s say the chair cost $500 USD and the developer’s hourly rate is $15 USD. After about 33 working hours, less than a week, the labor cost is higher than the cost of the chair.

Chris Writes…

In the world of software development time is the most valuable resource we have. So having seasoned and intelligent developers is the key to efficient projects.

Both time and skill (developers) are equally important. You also have contradicted Dilbert’s comment about time not being more expensive. In my opinion in the world of Software Development the done project is the most valuable resource any company can have.

Chris Writes…

I think the reality is we actually need to be fluent in all of the above, but so much more. We need to know enough networking, user experience concepts, PhotoShop (design tools) and other indirect technologies that it makes things very hard. I honestly do not know how a real software developer hopes to succeed in the near future without having a rich set of skills and experiences.

Can I ask who is going to pay for the time developers spend learning new technologies, which only secures them jobs that they then have to work hard at? I can’t think of any other industry with the same trend. Imagine if doctors had to practise on their families and friends to develop enough skills and experience to get a job. Now I know what Jack the Ripper was up to. The poor man was only trying to get a job at London’s Hospital :-)

I have a lot of respect for Bill Gates, but his recent interview with the BBC, in which he said developers should also develop the skills to communicate effectively with clients, made me really mad. I don’t even believe developers should talk to clients directly, let alone need the skills for it. It is not the developers’ job to talk to clients directly within an organisation; if it were, why would they be called software developers? There are well-defined roles within a team that should handle communication with the clients.

What else do we need to know: first aid, cooking, dancing, social skills, babysitting the boss's kids, and how to play musical instruments at the Christmas party?

I’m sorry, but those who have good I.T. skills put all their time into learning them and have no time for anything else. We all know we need skills to be successful, but saying something unjustified and not carefully thought through turns it into a trend, as it is today.

The follow up to the real hard drive serial number article

Friday, 14 March 2008 17:56 by Alan Mojab

I have just discovered new information that you should know about. Please make sure you have already read the previous article first otherwise nothing would make sense to you in here.

  1. The InterfaceType of some SATA hard drives is returned as SCSI rather than IDE
  2. If the Hard Drive returns a proper serial number you need to watch for the underscore character. Please see below for more details.

I have noticed some hard drives that do return proper serial numbers have some extra information. The extra information is a suffix to the serial number that starts with the underscore character. I have no idea what the extra information is.

Example:

IDE\
DISKWDC_WD800BB-00CAA1______________________17.07W17\
4457572D41434538333438343132_035_0_0_0_0

Code Example:

Take the serial number segment of the PNPDeviceID and cut off everything from the first underscore onwards before you convert the hex string to a string and do the reversing process.
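A minimal sketch of that trimming step, assuming the suffix always begins at the first underscore of the last segment. The PnpSerial class and StripSuffix method are hypothetical names.

```csharp
using System;

public static class PnpSerial
{
    // Returns the hex serial number without the "_..." suffix, if there is one.
    public static string StripSuffix(string segment)
    {
        if (segment == null) throw new ArgumentNullException("segment");

        int underscore = segment.IndexOf('_');
        return underscore >= 0 ? segment.Substring(0, underscore) : segment;
    }
}
```

For the example above, PnpSerial.StripSuffix("4457572D41434538333438343132_035_0_0_0_0") returns "4457572D41434538333438343132", which can then go through the hex conversion and reversing steps as before.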

I would like to thank Krzysztof Kosmic and Joe for sending me the data that led to these new findings. I will add more findings as I discover them.

Happy Coding!

The missing TagAttribute in DotNet Framework

Wednesday, 27 February 2008 18:17 by Alan Mojab

For some strange reason I adore the Tag property. The Tag property has been around for as long as I can remember.

I love practical things so much that I get excited about them regardless of how cheap or small they are. The Tag property is one of them. It always comes to your rescue and it never lets you down.

The Attribute class in .Net Framework plays an important role and the developers can take advantage of this class to extend it for their own use. The extended attribute class normally is used for storing some kind of meta-data for types.

Imagine what you could have done if, from day one or at least from Framework 1.1, Microsoft had introduced a TagAttribute in the spirit of the Tag property.
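What such an attribute might look like is sketched below. This is an assumption about its shape, not a type that exists in the .Net Framework; the LegacyReport class and the TagReader helper are likewise made up to show how a tool could read the tag.

```csharp
using System;

// Hypothetical attribute; not part of the .NET Framework.
[AttributeUsage(AttributeTargets.All, AllowMultiple = true, Inherited = false)]
public sealed class TagAttribute : Attribute
{
    private readonly string tag;

    public TagAttribute(string tag)
    {
        this.tag = tag;
    }

    public string Tag
    {
        get { return tag; }
    }
}

// Example usage: mark a type so that a tool can skip it.
[Tag("Ignore")]
public class LegacyReport
{
}

public static class TagReader
{
    // Returns true when the type carries a TagAttribute with the given value.
    public static bool HasTag(Type type, string tag)
    {
        foreach (TagAttribute attribute in type.GetCustomAttributes(typeof(TagAttribute), false))
        {
            if (attribute.Tag == tag) return true;
        }
        return false;
    }
}
```

With a shared attribute like this, every tool could agree on one marker instead of each shipping its own.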

Quite often I could have solved design issues if a TagAttribute had been part of the .Net Framework itself, rather than having to distribute my own attribute class with my code. As you may have noticed, almost all protection tools distribute one such attribute class so that you can mark a class to be ignored or to be processed.

The trick to get the real serial number of hard drive

Friday, 22 February 2008 12:14 by Alan Mojab

Here is a solution that will save you hours of fruitless research. I have already turned every page on Google (figuratively, but I did spend over 50 hours before I discovered my solution) without success, so you are in the right place.

There are two solutions, so you can pick the one that fits your needs best. Both solutions work under a user account on XP and Vista. Without Google I wouldn’t have known anything about the Windows Registry forensic techniques that helped me find these solutions.

Solution 1: Getting the serial number from Windows Registry

The serial numbers are stored under the following path:

HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Enum\IDE

Or

HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Enum\SCSI

With this solution it is much harder to dig out the serial number, as it requires searching child keys and parsing strings.

The child nodes listed under the above keys can also contain CD-ROM drives, or any other device that has been categorised as IDE/SCSI, but you can examine a value called ‘Class’ to see whether the device is ‘CDROM’ or ‘DiskDrive’. Expand the child nodes in the left pane of the Registry Editor until you reach the node that carries ‘Class’.

On my machine I have two hard drives… In case you never thought about this every machine that is designed to be a development machine must have two hard drives. One is used by OS and other Programs and the second one for your data/source code. You would reduce the risk of losing your source code tremendously by this setup.

Here is the key entry for one of my hard drives:

HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Enum\IDE\

DiskMaxtor_6V160E0__________________________VA111630

Model Number: DiskMaxtor_6V160E0 (The word ‘Disk’ is a prefix that is added by PnP service)

Controller Revision Number: VA111630

You should be able to see another child node below the above key. Here is mine:

HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Enum\IDE\

DiskMaxtor_6V160E0__________________________VA111630\
3356333038524743202020202020202020202020

The child node 3356333038524743202020202020202020202020 is your serial number in hexadecimal. At the time I had no idea what that number was until I used Smarties 2008’s Hex tool to examine the value. I can assure you that if it wasn’t for the Hex tool in Smarties 2008, not in a million years would I have guessed that the number I was looking at was actually my hard drive serial number.

However, there is one more thing you need to know. Once you have converted the hex value to a string, you need to swap every two characters to get the serial number in the correct order, if that is needed.
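As a worked example, the two steps can be applied to the key name above. This is a minimal sketch that assumes the value is plain ASCII and that the pair swap is needed for this drive; the SerialDecoder name is hypothetical.

```csharp
using System;
using System.Text;

public static class SerialDecoder
{
    // Converts the hex key name to text, swaps each pair of characters
    // and trims the space padding.
    public static string Decode(string hex)
    {
        if (hex == null) throw new ArgumentNullException("hex");
        if (hex.Length % 2 != 0) throw new ArgumentException("Hex string must have an even length.");

        // Step 1: every two hex digits become one character.
        var raw = new StringBuilder(hex.Length / 2);
        for (int i = 0; i < hex.Length; i += 2)
        {
            raw.Append((char)Convert.ToByte(hex.Substring(i, 2), 16));
        }

        // Step 2: swap the characters of each pair (an unpaired last character is dropped).
        var swapped = new StringBuilder(raw.Length);
        for (int i = 0; i + 1 < raw.Length; i += 2)
        {
            swapped.Append(raw[i + 1]);
            swapped.Append(raw[i]);
        }
        return swapped.ToString().Trim();
    }
}
```

For 3356333038524743202020202020202020202020 the first step gives "3V308RGC" followed by spaces, and the swap turns that into "V303R8CG".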

Notes:

If the hard drive returns no serial number, the PnP service generates one for it. You might notice ‘&0’ characters in such a serial number. Depending on where they appear they mean something, but I don’t know what, as I have decided not to use the Windows Registry solution altogether. The second solution is much easier.

If you decide to use this solution, make sure you always open a key for ‘read’; that way you never face security permission issues under user account privileges.

Solution 2: Getting the serial number from WMI

Here is how I do it, in a way that works under a user account on both XP and Vista:


// Requires a reference to System.Management.dll and: using System.Management;
ManagementScope managementScope = new ManagementScope(@"\\.\root\cimv2");
managementScope.Options.Impersonation = System.Management.ImpersonationLevel.Impersonate;
ObjectQuery query = new ObjectQuery("SELECT * FROM Win32_DiskDrive WHERE InterfaceType=\"IDE\" OR InterfaceType=\"SCSI\"");
using (ManagementObjectSearcher searcher = new ManagementObjectSearcher(managementScope, query))
{
    foreach (ManagementObject disk in searcher.Get())
    {
        if (disk["PNPDeviceID"] != null)
        {
            // The last element of the split holds the serial number in hex.
            string pnpDeviceID = disk["PNPDeviceID"].ToString();
            string[] split = pnpDeviceID.Split('\\');
        }
    }
}

The serial numbers have been hidden in PNPDeviceID property of Win32_DiskDrive class all this time and if someone knew about it he/she never gave away the secret, not even the ‘Scripting Guy’ shared the secret with us.

Here is the value of PNPDeviceID property from the same hard drive that I used in Windows Registry example:

IDE\DISKMAXTOR_6V160E0__________________________VA111630\
3356333038524743202020202020202020202020

It looks familiar, doesn’t it? Now you can split this string by the ‘\’ character to get the serial number in the last element of the split. Once again the 3356333038524743202020202020202020202020 value needs to be converted from hex to a string, and then every two characters need to be swapped to get the serial number in the right order.

Here is a quick and dirty way to reverse it:

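private string ReverseSerialNumber(string serialNumber)
{
    serialNumber = serialNumber.Trim();
    StringBuilder sb = new StringBuilder(serialNumber.Length);
    // Swap each pair of characters; stop before an unpaired last character
    // so that an odd-length value cannot cause an out-of-range index.
    for (int i = 0; i + 1 < serialNumber.Length; i += 2)
    {
        sb.Append(serialNumber[i + 1]);
        sb.Append(serialNumber[i]);
    }
    // Keep an unpaired last character as it is.
    if (serialNumber.Length % 2 == 1)
    {
        sb.Append(serialNumber[serialNumber.Length - 1]);
    }
    return sb.ToString();
}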


And here is the code to convert the hex string to bytes:


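private static byte[] GetHexStringBytes(string hex)
{
    try
    {
        // Remove any spaces between the hex digits.
        if (hex.Contains(" "))
        {
            hex = hex.Replace(" ", String.Empty);
        }
        // Pad an odd-length value with a leading zero so it splits into whole bytes.
        if (hex.Length % 2 == 1)
        {
            hex = "0" + hex;
        }
        int size = hex.Length / 2;
        byte[] bytes = new byte[size];
        for (int i = 0; i < size; i++)
        {
            // Each pair of hex digits becomes one byte.
            bytes[i] = Convert.ToByte(hex.Substring(i * 2, 2), 16);
        }
        return bytes;
    }
    catch
    {
        // Null input or non-hex characters: return an empty array.
        return new byte[] { };
    }
}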


Here is how you would use the methods after splitting the PNPDeviceID:

// GetHexStringBytes is static, so it is called without 'this'.
byte[] bytes = GetHexStringBytes(split[2]);
string serial = this.ReverseSerialNumber(Encoding.UTF8.GetString(bytes));

Notes:

If the serial number was generated by the PnP service, the split[2] element needs to be handled differently. To check whether the split[2] value was generated by the PnP service, use the following if statement:

if (split[2][1] == '&')

Please make sure you read the follow up article to this.

Happy Coding!

Selecting the right tools (Add-In)

Tuesday, 18 December 2007 05:12 by Alan Mojab

These days Add-ins are all about code refactoring. What is “code refactoring” and can they bring benefits to your work?

"A code refactoring is any change to a computer program's code which improves its readability or simplifies its structure without changing its results"

Source: Refactoring

What you need to keep in mind is that “Code Refactoring” is useless without existing code. If you are not going to bother with the existing projects that are gathering dust somewhere in one of your folders, then “Code Refactoring” is not for you, unless you have a habit of writing bad code that you need to refactor as you go along.

In software development there are techniques that can be applied to improve code structure, readability, and maintainability. I won’t be able to cover them all in one post. In my early days in software development I discovered the encapsulation technique, the old name for Code Refactoring.

The encapsulation technique is very easy to learn. All you have to do is not rush the code just to make it work and then, once it does, forget all about how you could have improved it. Software developers never go back to the code they write unless there is a bug or they have to change it to support new logic.

I can think of numerous occasions where I spent only a few minutes more on the routine that I was working on and in return it saved me hours of work later on.

With the encapsulation technique you divide the algorithm into sub-routines. Each sub-routine can be modified later to handle more logic without breaking the whole, and the same sub-routines can be used by other layers or services in your application without repeating the same logic.
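A small sketch of the idea, using an invented invoice calculation (the InvoiceCalculator class, its methods and the rounding rule are all assumptions for illustration):

```csharp
using System;
using System.Collections.Generic;

public static class InvoiceCalculator
{
    // The top-level routine only composes the sub-routines.
    public static decimal Total(IList<decimal> prices, decimal taxRate)
    {
        decimal subtotal = Subtotal(prices);
        return subtotal + Tax(subtotal, taxRate);
    }

    // Reusable on its own, e.g. by a reporting layer.
    public static decimal Subtotal(IList<decimal> prices)
    {
        decimal sum = 0m;
        foreach (decimal price in prices)
        {
            sum += price;
        }
        return sum;
    }

    // Tax rules can change here later without touching Total or Subtotal.
    public static decimal Tax(decimal amount, decimal taxRate)
    {
        return Math.Round(amount * taxRate, 2);
    }
}
```

If the tax rule later needs exemptions or different rounding, only Tax changes, and any other layer that already calls Subtotal keeps working.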

If you are writing a method/function that has a large algorithm (routines) be assured there is something wrong with your approach in solving the problem. One of the hardest things in software development is to keep the solution simple. This is something you can master only by learning from your past mistakes.

Software development is far more than learning about the language specific syntaxes, hacking the API, and how to google the algorithm you need. The bottom line is that if you are not already familiar with some of the techniques in software development then you wouldn’t know how beneficial code refactoring is.

If you are an experienced developer, you are less likely to benefit from the massive refactoring features being offered by various vendors. This is because you code from the beginning in the same manner that these refactoring features produce.

The market is currently focusing on “Code Refactoring” techniques to help software developers with their daily tasks. I personally feel professional developers are left behind, as I don’t believe all “Code Refactoring” techniques are beneficial to them; they know how to write already-refactored code from the start.

“Code Refactoring” can be divided into two main categories:

  1. Code Enhancements
  2. Pure Productivity  

A good example for “Code Enhancements” category would be the Visual Studio’s built-in “Extract Method” refactor command. A good example for Productivity would be the Visual Studio’s built-in “Extract Interface” refactor command.

What is the difference between the two you might ask?

To answer this question fairly, it is better to talk about the benefits that a command such as “Extract Interface” brings; then you can compare the two yourself. As you already know, this command extracts members from an existing type and creates a new interface from them. The obvious use of this command is when at some point you decide to derive existing types from an interface. The not so obvious benefit is how it changes the way you can work when you have such a command at your disposal. Let me elaborate on that, if you will.

When you design a class that you know in advance will derive from an interface, you would most likely design the interface first and the derived type second. The chances of getting everything right first time are very small, so any change to the members of either object template needs to be synchronised between the two.

To increase your productivity you can design the actual type first without worrying about the interface, and once you are happy with the initial testing and design you can create the interface within seconds. This way you avoid maintaining two object templates in parallel and finish in less time.
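The end state of that workflow might look like the sketch below. The MemoryMessageLog class and IMessageLog interface are hypothetical; the point is only the order in which they came into being.

```csharp
using System;
using System.Collections.Generic;

// Step 2: the interface, extracted once the class design had settled.
public interface IMessageLog
{
    void Write(string message);
    int Count { get; }
}

// Step 1: the concrete type, designed and tested first. The ": IMessageLog"
// part is added when the interface is extracted.
public class MemoryMessageLog : IMessageLog
{
    private readonly List<string> lines = new List<string>();

    public void Write(string message)
    {
        lines.Add(message);
    }

    public int Count
    {
        get { return lines.Count; }
    }
}
```

While the class was still changing, there was only one set of members to keep up to date.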

Tools are made because there is demand for them. You need to look at your own needs to see which tools best fit your requirements within the available budget.

There isn’t a perfect tool out there unless it does one thing only. Stop looking for that perfect tool, look for the tool that would have the most benefits for you.