Unions

This isn’t an article about the AFL-CIO, but about those strange data structures called unions. Unions are an efficient and powerful way to organize data, yet I don’t see many people using them. This article defines a union, describes its benefits and shows how to use one in your own programs.

You can think of a union as a type-def whose fields all occupy the same block of memory. The size of a union is the size of its largest member. If you define a union with a 4-byte element and an 8-byte element, the size of the union is 8 bytes. Unlike a type-def, though, a union can hold only one of its data items at any given time, because storing one member overwrites the others. This may not sound very efficient or useful, but it is a powerful way to organize different types of data using a single data object.

A union is efficient. Using the above definition, if you use the union to hold the 4-byte data element, it may sound like you are wasting 4 bytes, but in fact you are saving 4 bytes. Without a union, you would have to allocate both a 4-byte and an 8-byte data item, using 12 bytes in total. With a union, the 4-byte element occupies the same 8 bytes of storage as the 8-byte element, so no extra space has to be allocated for it.
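As a rough check of this arithmetic, the sketch below compares a union with a plain type-def that holds the same two fields. The names Number and Pair are made up for illustration, and Long and Double are assumed to be 4 and 8 bytes, as they are in current FreeBASIC compilers.

```freebasic
' Hypothetical union: both members share the same storage
Union Number
    small As Long      ' 4 bytes
    big As Double      ' 8 bytes
End Union

' The same two fields in an ordinary type-def, stored one after the other
Type Pair
    small As Long
    big As Double
End Type

Print "Union size: "; SizeOf(Number)   ' 8, the size of the largest member
Print "Type size:  "; SizeOf(Pair)     ' 12 or more, since the compiler may add padding
```

The exact size of Pair depends on the platform's field alignment, which is why the comment gives only a lower bound.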

While efficiency in a program is always welcome, the real power of a union derives from the fact that, at any given instant, it can hold only one data item. This may sound contradictory, but it is a way to manage complexity in a program by grouping disparate data items within a single container. A real-world example will illustrate this concept.

A Union in Action

In my program Deep Deadly Dungeons (DDD), a rogue-like game, I needed a way to contain and manipulate a player character’s inventory. The inventory items include rations, weapons, ammo, armor, potions and scrolls. Each kind of item has different properties that must be managed. At first glance, it would appear that a lot of coding would be required to manage all of these properties because the items are so different. Handling each one as a separate data structure would require separate routines for each item, increasing the complexity of the program.

It would be better to have a single data structure that would represent a general inventory object, and a single set of routines to manage that general inventory object. You can create such an object by using a union, and once you have this general-purpose object, you can create a set of routines to manipulate that single object, thereby reducing the complexity of the program. Here is how I defined my inventory object in DDD, showing only the weapon and armor type-def elements. The other elements are defined in a similar manner.

    type armortype
       id as integer
       eval As Integer
       ac As Integer
       defmod As Integer
       defmagic As Integer
       defmagicmod As Integer
       strg As Integer
       cursed As Integer
       dr As Integer
       noise As Integer
    End Type

    type weapontype
       id as integer
       eval as integer
       hands as integer
       strg As Integer
       tohitmod As Integer
       damage as integer
       dammod as integer
       cursed as integer
       dr As Integer
       skill As Integer
       noise As Integer
    End Type

    type invtype
       typeid As Integer
       desc As String * 20
       union
           light as lighttype
           supply as supplytype
           ammo As ammotype
           necklace as necklacetype
           ring as ringtype
           wand as wandtype
           potion as potiontype
           scroll as scrolltype
           weapon as weapontype
           armor as armortype
           shield as shieldtype
       end union
    end type

Notice that the union is inside a type-def. The union part of this type describes a single inventory item. Since only a single instance of an inventory item resides in a union, how do you determine what is in the union? My method was to use a typeid, which is an integer value that indicates what the union is holding. The id is defined within the type-def, and not the union, so it is available to each inventory item. The following defines describe the different inventory item classes.

    'item class ids
    #Define supplies 1
    #Define necklaces 2
    #Define potions 3
    #Define rings 4
    #Define wands 5
    #Define weapons 6
    #Define armors 7
    #Define lights 8
    #Define ammos 9
    #Define shields 10
    #Define scrolls 11

If the typeid is 6, then I know that the union holds a weapon inventory item. If the typeid is 7 then I know that the inventory item is armor. I can now write a set of routines to manipulate the inventory object. Since each routine operates on a single object, I can create a high-level set of inventory methods that will work on any type of inventory item. For example, to generate an inventory item, I use the following subroutine.

    sub SetItem(item as integer, inv as invtype)
       dim cursed as integer

       cursed = GetPercentage(10)
       with inv
           select case .typeid
               case supplies
                   .supply.id = item
                   GenSupply inv
               case necklaces
                   .necklace.id = item
                   GenNecklace inv
               case potions
                   .potion.id = item
                   GenPotion inv
               case rings
                   .ring.id = item
                   GenRing inv
               case wands
                   .wand.id = item
                   GenWand inv
               case weapons
                   .weapon.id = item
                   GenWeapon inv, cursed
               Case armors
                   .armor.id = item
                   GenArmor inv, cursed
               case lights
                   .light.id = item
                   GenLight inv
               Case ammos
                   .ammo.id = item
                   GenAmmo inv
               case shields
                   .shield.id = item
                   GenShield inv, cursed
               case scrolls
                   .scroll.id = item
                   GenScroll inv
           end select
       end with
    end sub

This subroutine sets the passed inventory object (inv as invtype) to the appropriate inventory item based on the passed item id and the typeid of the inventory object. Here I have a single method to handle any type of inventory item. If my inventory items were separate type-defs, I would have to create a SetItem for each inventory item type. By using a union I can create a single method that will handle whatever type I send it.
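The same pattern carries over to any other routine that works on the inventory object. The sketch below is a minimal, self-contained illustration under stated assumptions: the types are cut down to two fields each, only two item classes are kept, and ItemNoise is a hypothetical helper that is not part of DDD.

```freebasic
' Trimmed-down, hypothetical versions of the DDD definitions
#Define weapons 6
#Define armors 7

Type weapontype
    id As Integer
    noise As Integer
End Type

Type armortype
    id As Integer
    noise As Integer
End Type

Type invtype
    typeid As Integer          ' tells us which union member is valid
    Union
        weapon As weapontype
        armor As armortype
    End Union
End Type

' Hypothetical helper: one routine answers the question for every item class
Function ItemNoise(ByRef inv As invtype) As Integer
    Select Case inv.typeid
        Case weapons
            Return inv.weapon.noise
        Case armors
            Return inv.armor.noise
        Case Else
            Return 0           ' assume other classes make no noise
    End Select
End Function

Dim item As invtype
item.typeid = armors
item.armor.noise = 3
Print ItemNoise(item)
```

The caller never needs to know which kind of item it is holding; the typeid check lives in one place.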

A Simple Example

The preceding code shows an example of the use of a union in a real-world program. To illustrate how to create and use a union, let’s create a small, non-trivial program that implements a simple variant data type. Our variant can either be a string or an integer, and we will add methods to set the value of the variant, to add two variants together and to print a variant.

First, let’s define our union that will be our variant type.

    'define our data type classes
    #Define isnull 0
    #Define isstring 1
    #Define isinteger 2

    'define our union type
    Type vtype
       id As Integer
       Union
           'fixed-size ZString: a variable-length String cannot be a union member
           sdata As ZString * 64
           idata As Integer
       End Union
    End Type

The #defines indicate what type of value is in the variant. Notice that there are three defines: one for NULL, or no value, one for string data and one for integer data. This covers every state that our union may be in at any given instant. In our vtype, we have defined an id field that we will use to indicate what data type we are holding: isnull, isstring or isinteger. All of our routines will examine this id field to determine what actions to take. The actual data is stored in the union, and can be either a string, sdata, or an integer, idata. Notice that we don’t have a NULL entry in our union. If our variant is NULL, that is, id = isnull, then we simply ignore any values contained within the union, since NULL means “no meaningful value”.

To use our new variant type, we simply create one or more variables of type vtype.

    'create some variant data
    Dim As vtype v1, v2, v3, v4

In order to use a variant, we must initialize it with data. This means that not only do we have to load the actual data item into the union portion of the type-def, we have to indicate what type of data is being stored in the union. The easiest way to do this is to create a set of methods that will set the id flag and load the data. In FreeBasic we can use the overload keyword to make the job of setting our variant very easy.

    'define our set function using overload
    Declare Sub SetV Overload (idata As Integer, v As vtype)
    Declare Sub SetV (sdata As String, v As vtype)

    'create the integer set
    Sub SetV(idata As Integer, v As vtype)
       v.id = isinteger
       v.idata = idata
    End Sub

    'create the string set
    Sub SetV(sdata As String, v As vtype)
       v.id = isstring
       v.sdata = sdata
    End Sub

Here we declare our set method, SetV, as an overloaded subroutine that can take either an integer or a string. We then write the actual subroutines to initialize the variant for each type of data being stored. If we pass an integer to SetV, the id is set to isinteger and the data is stored in the idata field. If the passed value is a string, the id is set to isstring and the data is stored in the sdata field of the union. We simply call SetV with the appropriate arguments. Since we have overloaded the subroutine, the compiler will use the correct subroutine based on the data being passed.

    'create an integer type
    SetV 10, v1

After this call, v1.id = isinteger and v1.idata = 10.

    'create a string type
    SetV "10", v2

After this call, v2.id = isstring and v2.sdata = “10”.

But how do we set the variant to be NULL? Since a NULL means no meaningful value, simply setting the id to isnull is enough to indicate that the value of the variant is NULL.

    'create a null type
    v3.id = isnull

Next we need to display the variant. Our PrintV method prints out the value of the variant based on the id value.

    Sub PrintV(v As vtype)
       'print out data based on type
       If v.id = isstring Then
           Print v.sdata
       ElseIf v.id = isinteger Then
           Print v.idata
       Else
           'just print a null string for null
           Print
       End If
    End Sub

Here we simply examine the id and print out the appropriate data field in the union. So to print out v1 to the screen, we would use the following code snippet.

    PrintV v1

Finally, we need a way to add two variants together. This is where it gets a bit tricky, because we potentially are dealing with two different data types at the same time. That is, we may need to add a string and integer together; how do we define what the result will be? In order for this to work, we will need to come up with some rules for handling the different data types.

1. If both values are of the same type, simply add them together: if both are strings, concatenate them; if both are integers, add them.

2. If one is a string and the other is an integer then:

2.1 If the string can be converted to an integer, then convert the string to an integer and add the two together.

2.2 If the string cannot be converted to an integer, then convert the integer to a string and concatenate them together.

3. If one or both are NULL, return a NULL.

That covers all the possible outcomes using our variants. Keep in mind that these rules are completely arbitrary. This is simply how I define our variants to behave. You may want to define the behavior differently. With these rules in mind, we can code the AddV method.

    Sub AddV(v1 As vtype, v2 As vtype, vret As vtype)
       Dim vtmp As Integer

       'init our return value to null
        vret.id = isnull
        'check to see if both values are strings
        If v1.id = isstring And v2.id = isstring Then
           vret.id = isstring 'set the id type
           vret.sdata = v1.sdata + v2.sdata
        End If
        'check to see if both values are integers
        If v1.id = isinteger And v2.id = isinteger Then
           vret.id = isinteger 'set the id type
           vret.idata = v1.idata + v2.idata
        End If
        'check for string - integer combination
        If v1.id = isstring And v2.id = isinteger Then
           'check to see if string can be converted to integer
           vtmp = Val(v1.sdata)
           'successful conversion so add as integers
           If vtmp > 0 Then
               vret.id = isinteger 'set the id type
               vret.idata = vtmp + v2.idata
           Else
               'can't convert to integer so convert integer to string
               vret.id = isstring 'set the id type
               vret.sdata = v1.sdata + Str$(v2.idata)
           End If
        End If
        'check for integer - string combo
        If v1.id = isinteger And v2.id = isstring Then
           'check to see if string can be converted to integer
           vtmp = Val(v2.sdata)
           'successful conversion so add as integers
           If vtmp > 0 Then
               vret.id = isinteger 'set the id type
               vret.idata = vtmp + v1.idata
           Else
               'can't convert to integer so convert integer to string
               vret.id = isstring 'set the id type
               vret.sdata = Str$(v1.idata) + v2.sdata
           End If
        End If
        'if one or both values are null, return null.
        If v1.id = isnull Or v2.id = isnull Then
           vret.id = isnull
        End If
    End Sub

This subroutine checks the ids of the first and second parameters and then chooses the appropriate operation, returning the result in the third parameter. I coded this the “long way” to make it clear that you have to check each parameter combination; that is, string-string, integer-integer, string-integer or integer-string, and finally, if either value is NULL, we simply return a NULL. Let’s examine the string-string combo in detail.

        'check to see if both values are strings
        If v1.id = isstring And v2.id = isstring Then
           vret.id = isstring 'set the id type
           vret.sdata = v1.sdata + v2.sdata
        End If

Here we check the ids of v1 and v2. Since both are strings, we are going to do a concatenate operation, saving the string data in vret.sdata and indicating that we have a string value with vret.id = isstring. Let’s look at the string-integer combination.

        'check for string - integer combination
        If v1.id = isstring And v2.id = isinteger Then
           'check to see if string can be converted to integer
           vtmp = Val(v1.sdata)
           'successful conversion so add as integers
           If vtmp > 0 Then
               vret.id = isinteger 'set the id type
               vret.idata = vtmp + v2.idata
           Else
               'can't convert to integer so convert integer to string
               vret.id = isstring 'set the id type
               vret.sdata = v1.sdata + Str$(v2.idata)
           End If
        End If

Here we have a string and an integer. We check whether the string can be converted to a number using Val(v1.sdata). If vtmp is greater than 0, we add the value in vtmp to v2.idata and set vret to an integer type. If vtmp is 0, which is what Val() returns when it cannot convert the string to a number, we convert v2.idata to a string with Str$(v2.idata) and set vret to a string type. Note that this simple test also treats the string “0” and negative numbers as non-numeric, so those values are concatenated rather than added.
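If that limitation matters in your own program, one option is to test the text itself instead of the result of Val(). The following is only a sketch: IsIntegerText is a hypothetical helper that is not part of the program in this article, and it accepts nothing more than an optional sign followed by digits.

```freebasic
' Hypothetical helper: returns -1 (true) if s is an optionally signed
' run of decimal digits, and 0 (false) otherwise
Function IsIntegerText(ByRef s As String) As Integer
    Dim As String t = Trim(s)
    Dim As Integer i, firstDigit = 1

    If Len(t) = 0 Then Return 0
    ' allow a single leading sign
    If Left(t, 1) = "-" Or Left(t, 1) = "+" Then firstDigit = 2
    ' a lone sign is not a number
    If firstDigit > Len(t) Then Return 0

    ' every remaining character must be a digit
    For i = firstDigit To Len(t)
        Dim As String ch = Mid(t, i, 1)
        If ch < "0" Or ch > "9" Then Return 0
    Next

    Return -1
End Function

Print IsIntegerText("10")                 ' returns -1
Print IsIntegerText("0")                  ' returns -1
Print IsIntegerText("-5")                 ' returns -1
Print IsIntegerText("This is a string")   ' returns 0
```

With a check like this, AddV could call Val() only after the helper has accepted the string, so that “0” and negative values would be added as integers.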

To call AddV we simply pass the two data items and the return value.

    AddV v2, v1, v4

The two operands v2 and v1 are added together and returned in v4.

Complete Program

Here is the complete program.

    Option Explicit
    'compiled using Freebasic .14b

    'define our data type classes
    #Define isnull 0
    #Define isstring 1
    #Define isinteger 2

    'define our union type
    Type vtype
       id As Integer
       Union
           'fixed-size ZString: a variable-length String cannot be a union member
           sdata As ZString * 64
           idata As Integer
       End Union
    End Type

    'define our set function using overload
    Declare Sub SetV Overload (idata As Integer, v As vtype)
    Declare Sub SetV (sdata As String, v As vtype)

    'create the integer set
    Sub SetV(idata As Integer, v As vtype)
       v.id = isinteger
       v.idata = idata
    End Sub

    'create the string set
    Sub SetV(sdata As String, v As vtype)
       v.id = isstring
       v.sdata = sdata
    End Sub

    'Define our method to add two variants using the
    'following rules:
    '1 If both are the same type, add them and return new value.
    '2 If one is an integer and one is a string, convert string
    '  to integer if it can be represented as an integer, otherwise
    '  convert integer to string and append v2 to v1.
    '3 If one or both values are null return null.
    Sub AddV(v1 As vtype, v2 As vtype, vret As vtype)
       Dim vtmp As Integer

       'init our return value to null
        vret.id = isnull
        'check to see if both values are strings
        If v1.id = isstring And v2.id = isstring Then
           vret.id = isstring 'set the id type
           vret.sdata = v1.sdata + v2.sdata
        End If
        'check to see if both values are integers
        If v1.id = isinteger And v2.id = isinteger Then
           vret.id = isinteger 'set the id type
           vret.idata = v1.idata + v2.idata
        End If
        'check for string - integer combination
        If v1.id = isstring And v2.id = isinteger Then
           'check to see if string can be converted to integer
           vtmp = Val(v1.sdata)
           'successful conversion so add as integers
           If vtmp > 0 Then
               vret.id = isinteger 'set the id type
               vret.idata = vtmp + v2.idata
           Else
               'can't convert to integer so convert integer to string
               vret.id = isstring 'set the id type
               vret.sdata = v1.sdata + Str$(v2.idata)
           End If
        End If
        'check for integer - string combo
        If v1.id = isinteger And v2.id = isstring Then
           'check to see if string can be converted to integer
           vtmp = Val(v2.sdata)
           'successful conversion so add as integers
           If vtmp > 0 Then
               vret.id = isinteger 'set the id type
               vret.idata = vtmp + v1.idata
           Else
               'can't convert to integer so convert integer to string
               vret.id = isstring 'set the id type
               vret.sdata = Str$(v1.idata) + v2.sdata
           End If
        End If
        'if one or both values are null, return null.
        If v1.id = isnull Or v2.id = isnull Then
           vret.id = isnull
        End If
    End Sub

    'define our print routine
    Sub PrintV(v As vtype)
       'print out data based on type
       If v.id = isstring Then
           Print v.sdata
       ElseIf v.id = isinteger Then
           Print v.idata
       Else
           'just print a null string for null
           Print
       End If
    End Sub

    'create some variant data type
    Dim As vtype v1, v2, v3, v4

    'create an integer type
    SetV 10, v1
    Print "Set integer: ",
    PrintV v1
    'create a string type
    SetV "10", v2
    Print "Set string: ",
    PrintV v2
    'create a null type
    v3.id = isnull
    Print "Set null: ",
    PrintV v3
    Print

    'add two integers together
    Addv v1, v1, v4
    Print "Adding 2 integers:",
    PrintV v4

    'add two strings together
    Addv v2, v2, v4
    Print "Adding 2 strings:",
    PrintV v4

    'add string and integer
    Addv v2, v1, v4
    Print "Adding string, integer:",
    PrintV v4

    'add integer and string
    Addv v1, v2, v4
    Print "Adding integer, string:",
    PrintV v4
    Print

    'adding null to integer
    Addv v3, v1, v4
    Print "Adding null, integer:",
    PrintV v4

    'adding integer to null
    Addv v1, v3, v4
    Print "Adding integer, null:",
    PrintV v4

    'adding null to string
    Addv v3, v2, v4
    Print "Adding null, string:",
    PrintV v4

    'adding string to null
    Addv v2, v3, v4
    Print "Adding string, null:",
    PrintV v4

    'adding null to null
    Addv v3, v3, v4
    Print "Adding null, null:",
    PrintV v4
    Print

    'create a new string type
    SetV "This is a string", v2
    Print "Set string: ";
    PrintV v2
    Print

    'add string and integer
    Addv v2, v1, v4
    Print "Adding new string, integer:",
    PrintV v4

    'add integer and string
    Addv v1, v2, v4
    Print "Adding integer, new string:",
    PrintV v4

    Print
    Print
    Print "Press any key"
    Sleep

Output

And here is the output.

Set integer: 10
Set string:   10
Set null:

Adding 2 integers:           20
Adding 2 strings:           1010
Adding string, integer:      20
Adding integer, string:      20

Adding null, integer:
Adding integer, null:
Adding null, string:
Adding string, null:
Adding null, null:

Set string: This is a string

Adding new string, integer: This is a string10
Adding integer, new string: 10This is a string

Press any key

Summary

Unions are a way to reduce complexity in a program, by enabling the programmer to package different types of data within a single object, and create a single set of methods to manage the data contained within the object.