ar.pas : Array interface

Table of Contents

1 The IArray Interface

Previously, we looked at the sq unit, in which we defined a generic sequence type, whose interface looked like this:

type generic ISequence<tKey, tVal> = interface
  function Length : cardinal;
  function GetItem( i : tKey ) : tVal;
  procedure SetItem( i : tKey; var value : tVal );
  property item[ i : tKey ] : tVal
    read GetItem write SetItem; default;
end;

We further provided an abstract class, GSeq<tKey,tVal>, which provides an enumerator for classes implementing ISequence. This allows us to loop through the values of a sequence with pascal's for ... in ... do syntax.

An array is a specific type of sequence that uses a continuous range of values for its keys.

Arrays in standard pascal can have arbitrary start and end indices (that is, you can declare an array[ -5 .. 237 ] if you like, or array['a'..'z'], or even array[byte], which translates to array[0..255], the range of the byte type.

However, most of the time, it's sufficient to range from zero to some maximum value, and in fact this is how dynamic arrays work in object pascal, thus we can define an interface that works very much like a dynamic array simply by specializing ISequence and using cardinal as the key type:

type
  IArray<T> = interface (ISequence<cardinal, T>)
    procedure Resize( newlen : cardinal );
    procedure Append( value : T );
  end;

Remember, T is a type variable, indicating some particular but as-yet unspecificred type, as if we'd declared an array of T.

The two additional methods we've added indicate that we will concern ourselves here with dynamic arrays, which can grow and shrink at runtime, rather than having fixed bounds.

Object pascal offers dynamic arrays, which are basically the same as static arrays, except they're allocated on the heap. You can resize them with SetLength, but since this can copy the entire array each time you resize it, it's not always efficient to grow the array.

Since keys in an array are a contiguous sequence of values, arrays generally store their values in a single contiguous block of RAM. This allows the compiler to use simple pointer arithmetic under the hood to allow very fast access to the data.

Sometimes, though, we prefer something that looks and feels like an array, but uses some other kind of storage system underneath. In this unit, we'll explore several classes that all implement this same interface, but have very different implementations under the hood.

2 A simple test suite.

The compiler will check that each of our array classes provides the necessary methods to implement IArray, but it can't guarantee that the implementation actually does what we want.

Actually, nothing short of a formal proof can guarantee that our software is correct (and then only if we trust both the compiler and our logic), but we can at least provide evidence that it's probably doing the right thing.

To that end, we will run each class through the following test suite, which simply creates and populates an array.

type
  ICardinalArray = IArray<cardinal>;

procedure check_array( a : ICardinalArray );
  var i : byte;
  begin

    { test that we can grow the array with append: }
    chk.equal( 0, a.length );
    for i := 0 to 127 do a.append( i * i );

    chk.equal( 128, a.length );

    { now resize and make sure we can both read the
      old values and write the new slots }
    a.resize( 256 );
    for i := 255 downto 128 do
      begin
        { writeln( 'a[', i, '] := a[', 256 - i - 1, '] -> ',
          a[ 256 - i - 1 ], ' -> ', i, ' ',
          round(sqrt( a[ 256 - i - 1 ]))); }
        a[ i ] := round(sqrt( a[ 256 - i - 1 ]));
      end;

    { do a few spot tests to make sure it worked right. }
    chk.equal( a[   0 ],     0 );
    chk.equal( a[  64 ],  4096 );
    chk.equal( a[ 126 ], 15876 );
    chk.equal( a[ 127 ], 16129 );
    chk.equal( a[ 128 ],   127 );
    chk.equal( a[ 129 ],   126 );
    chk.equal( a[ 192 ],    63 );
    chk.equal( a[ 255 ],     0 );
  end;

3 Implementations

3.1 DONE GAbstractArray

GAbstractArray is a partial implementation of the interface. It doesn't actually store any data and thus can't be used on its own, but it gives us a place to put common code that we will reuse for all the full implementations.

The first batch of methods deal with routine maintenance on the _length field.

constructor GAbstractArray<T>.Create( len : cardinal );
  begin
    _length := len;
  end;

procedure GAbstractArray<T>.CheckRange( ix : cardinal );
  begin
    if ix >= _length then
      raise ERangeError.Create( 'Out of range: ' + IntToStr( ix ));
  end;

procedure GAbstractArray<T>.Resize( len : cardinal );
  begin
    _length := len;
  end;

function GAbstractArray<T>.Length : cardinal;
  begin
    result := _length;
  end;

We can now previde a generic version of Append usable by all the subclasses.

procedure GAbstractArray<T>.Append( value : t );
  begin
    self.Resize( _length + 1 );
    { -1 because length has now increased by one }
    self[ _length - 1 ] := value;
  end;

3.2 DONE GDynArray

Our first implementation simply wraps the normal dynamic arrays that Free Pascal already provides. GDynArray offers no practical benefit for users except that it conforms to our interface, and thus will work with any other code we write to that interface.

However, it does provide a good baseline sanity check for our tests and a simple reference implementation of the interface.

constructor GDynArray<T>.Create( len : cardinal );
  begin
    inherited Create( len );
    SetLength( _array, len );
  end;

procedure GDynArray<T>.Resize( len : cardinal );
  begin
    inherited Resize( len );
    SetLength( _array, len );
  end;

function GDynArray<T>.GetItem( ix : cardinal ) : T;
  begin
    CheckRange( ix );
    result := _array[ ix ];
  end;

procedure GDynArray<T>.SetItem( ix : cardinal; const val: T );
  begin
    CheckRange( ix );
    _array[ ix ] := val
  end;

destructor GDynArray<T>.Destroy;
  begin
    _array := Nil
  end;

3.3 DONE GFileArray

GFileArray operates on a disk file: items are loaded to and from file. Usually, this corresponds to a binary file on disk somewhere, but files are implemented much like classes in pascal (in that their behavior is defined by a record that contains functon pointers) so other implementations are also possible.

Note that the tight coupling to the disk makes GFileArray somewhat slower than other options. This class is designed for use with our database, and in conjunction with GCachedArray, defined next.

In most cases, if you need to save an array to disk, it's probably better to write the whole thing to disk at once, and just work with a cached copy in ram.

constructor GFileArray<T>.Create( var f : file );
  begin
    _file := f;
    inherited Create( filesize( f ) div sizeOf( T ));
  end;

procedure GFileArray<T>.Resize( len : cardinal );
  var n : cardinal; buf : T;
  begin
    if len < _length then
      begin
        seek( _file, filesize( _file ));
        truncate( _file );
      end
    else if len > _length then
      begin
        seek( _file, filesize( _file ));
        buf := default( T );
        for n := _length to len do
          BlockWrite( _file, buf, sizeOf( T ));
      end;
    // else do nothing
    inherited Resize( len );
  end;

function GFileArray<T>.GetItem( ix : cardinal ) : T;
  begin
    seek( _file, ix * sizeOf( T ));
    BlockRead( _file, result, sizeof( T ));
  end;

procedure GFileArray<T>.SetItem( ix : cardinal; const val: T );
  begin
    seek( _file, ix * sizeOf( T ));
    BlockWrite( _file, val, sizeof( T ));
  end;

destructor GFileArray<T>.Destroy;
  begin
    Close( _file );
  end;

3.4 TODO GCachedArray

3.5 TODO GBPlusArray

constructor GBPlusArray<T>.Create( len : cardinal );
  begin
    inherited Create( len )
    bp.TTree.Create;
  end;

function GBPlusArray<T>.GetItem( ix : cardinal ) : T;
  begin
     result := default( t )
  end;

procedure GBPlusArray<T>.SetItem( ix : cardinal; const val: T );
  begin
  end;

destructor GBPlusArray<T>.Destroy;
  begin
  end;

3.6 TODO GEmbeddedArray

constructor GEmbeddedArray<T>.Create( a : IArray<T>; len : cardinal );
  begin
    inherited Create( len )
  end;

function GEmbeddedArray<T>.GetItem( ix : cardinal ) : T;
  begin
     result := default(t)
  end;

procedure GEmbeddedArray<T>.SetItem( ix : cardinal; const val: T );
  begin
  end;

destructor GEmbeddedArray<T>.Destroy;
  begin
  end;

4 Appendix: Templates for Generated Files

4.1 template for UNIT ar

{$mode delphi}
unit ar; { Array interface }
interface uses sq, sysutils, bp;

  <<type:IArray>>

  type
    GAbstractArray<T> = class ( GSeq<cardinal, T>, IArray<T> )
      protected
        _length : cardinal;
        procedure CheckRange( ix : cardinal );
      public
        constructor Create( len : cardinal );
        function Length : cardinal; override;
        procedure Resize( len : cardinal ); virtual;
        procedure Append( value : t ); virtual;
      end;

    GDynArray<T> = class( GAbstractArray<T> )
      protected
        _array : array of T;
      public
        constructor Create( len : cardinal );
        function GetItem( ix : cardinal ) : T; override;
        procedure SetItem( ix : cardinal; const val: T ); override;
        procedure Resize( len : cardinal ); override;
        destructor Destroy; override;
      end;

    GFileArray<T> = class( GAbstractArray<T> )
      protected
        _file : file;
      public
        constructor Create( var f : file );
        procedure Resize( len : cardinal );
        function GetItem( ix : cardinal ) : T; override;
        procedure SetItem( ix : cardinal; const val: T ); override;
        destructor Destroy; override;
      end;

    GBPlusArray<T> = class( GAbstractArray<T> )
      protected
        _tree : bp.TTree<T>;
      public
        constructor Create( len : cardinal );
        function GetItem( ix : cardinal ) : T; override;
        procedure SetItem( ix : cardinal; const val: T ); override;
        destructor Destroy; override;
      end;

    GEmbeddedArray<T> = class( GAbstractArray<T> )
      public
        constructor Create( a : IArray<T>; len : cardinal );
        function GetItem( ix : cardinal ) : T; override;
        procedure SetItem( ix : cardinal; const val: T ); override;
        destructor Destroy; override;
      end;

implementation
  <<ar:imp>>
end.

4.2 template for test suite

{$mode delphi}
{$i test_ar.def}
implementation
uses ar, fs, sysutils;

<<check_array>>
type
  TDynArray      =  class (GDynArray<cardinal>, ICardinalArray);
  TFileArray     =  GFileArray<cardinal>;
  TBPlusArray    =  GBPlusArray<cardinal>;
  TEmbeddedArray =  GEmbeddedArray<cardinal>;

procedure test_dynarray;
  begin
    check_array( TDynArray.Create( 0 ));
  end;

procedure test_filearray;
  var f : file of cardinal;
  begin
    fs.update( f, 'test_ar.b4sd' );
    check_array( TFileArray.Create( f ));
  end;

procedure test_bplusarray;
  begin
    check_array( TBPlusArray.Create( 0 ));
  end;

procedure test_embeddedarray;
  begin
    check_array( TEmbeddedArray.Create( TDynArray.Create( 1024 ), 32 ));
  end;

begin
end.