User Defined Databases

DB Field Types

A database, at its most abstract, is a set of records, with each record having the same structure.  Each record has a set of fields that store its data.  For example, the database ‘box’ could be initialized using the following code:

database box = "Box-Data";

subroutine main()
{
    box.data.maxrecord = 1;
    box.string$.maxrecord = 3;
    box.flag.maxrecord = 6;
}

This would set up each record in the box database with 1 ‘data’ element, 3 ‘string$’ elements, and 6 ‘flag’ elements.  If you decide later on that you need more (or less) information per record, simply modify the maxrecord for each type.  Rapture will readjust and restructure every existing record to fit the new format.

There are several types that can be used in databases, each having distinct space/type tradeoffs:

data        A 32-bit signed integer value
variable    An unsigned 16-bit integer value
byte        An unsigned 8-bit integer value
flag        A single bit (boolean) value
string$     A variable length string

Using Databases

Databases are resources dedicated purely to holding data and searching it for specific values.  Database resources are declared just like globals and have the same scoping rules (they must be extern’d to be visible in other files, etc).  The format for declaring a database is as follows:

database <name>;                (* Declare an anonymous database. *)
database <name> = "<filename>"; (* Declare a file-backed database. *)

For an extern declaration, just as with globals, no space is allocated and no resource is actually created; it simply tells Rapture that the database will exist somewhere and allows it to be referenced.  If there are any extern declarations, there must be one (and only one) of the above declarations somewhere in the project, as these actually create the resource, tell Rapture to manage the database, and perform all initialization necessary to use it.

The difference between the two forms of declaration is that the first, the anonymous database, is never associated with a file and is not written to disk (it is also cleared when loading or reloading the executable).  Anonymous databases are useful for information that is only pertinent during a single run of the executable and doesn’t require permanent storage.  The second form creates a link between the internal (memory resident) database and a file on the disk.  Note: The actual database content is located inside a data archive file, which is what is written to disk when you call the backup() subroutine.
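
As a minimal sketch of the two forms side by side (the names scratch and box_log and the file name "Box-Log" are hypothetical, chosen only for illustration):

(* Anonymous: lives in memory only and is cleared on load or reload. *)
database scratch;

(* File-backed: linked to a file on disk. *)
database box_log = "Box-Log";

Another source file that needs either database would refer to it through an extern declaration (for example, extern database box_log;) rather than declaring it a second time.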

There are several internal databases already declared and set up (available in all source files and shared between them).  These were originally determined to be the basic framework required for operating a text-based game (or at least one way of doing so).  They are the persona, replica, object, node, room, and game databases.  Most of them have extra fields that are designed explicitly to operate with a number of internal routines to provide time-efficient ways of searching and manipulating them.

Once a database has been declared, it can be used through the following accessors.

<name>.maxrecord

Get or set the number of records in the database.  Any references to records higher than this number will generate a runtime error.

<name>.<fieldtype>.next

Return the next record which has a ‘zero’ value for the given field (read-only).  For numeric fields this means an actual 0; for string fields, the empty string (“”).  This is a way of determining which record is free to use without clobbering existing record data.

<name>[].<fieldtype>[]

Get or set the value of a given field.  The standard fields for databases are string$ (a string), data (4 byte signed integer), variable (2 byte unsigned integer), byte (1 byte unsigned integer) and flag (1 bit integer flag).  Note that each field is also an array, and for every record, a 1-based index must be specified (bounds checking is performed, and if found to be out of bounds, a runtime error is generated, see below).  Also, the first expression is the (1-based) record you want to deal with.  For example: replica[10].data[4] is the 4th data element of replica 10.
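
A minimal sketch of reading and writing fields, assuming the box database from the earlier example (1 data, 3 string$, and 6 flag elements per record).  The fragment is meant to sit inside a subroutine body.  The record count of 10 and the stored values are arbitrary, and the variable width is hypothetical; its declaration is omitted here.

box.maxrecord = 10;                  (* Records 1 through 10 now exist. *)
box[1].data[1] = 42;                 (* The only data element of record 1. *)
box[1].string$[3] = "A wooden box";  (* Third string element of record 1. *)
box[1].flag[6] = 1;                  (* Sixth flag of record 1. *)
width = box[1].data[1];              (* Read the value back. *)

With this layout, referring to box[11] or to box[1].data[2] would be out of bounds and generate a runtime error.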

<name>[].<fieldtype>.maxrecord

Get or set the number of elements in the given field for every record.  It is important to note that changing this value affects every record in the database.  For example, if you have a database with 100,000 records and you increase data.maxrecord from 2 to 5, you are adding about 1.2 megabytes of data to the database (3 additional elements * 4 bytes * 100,000 records = 1,200,000 bytes)!

<name>[].<fieldtype>.indexed

Get or set whether the given field is indexed.  Rapture supports indexing for arbitrary values, which greatly increases search speeds in most cases.  However, this also greatly increases the amount of system memory Rapture requires to operate.  It also increases the amount of time required to change a value (as the new value must be re-indexed), although this overhead is usually not noticeable.

Note that if a field is indexed, search results might (and very likely will) come back out of record order.  This is to keep index maintenance as fast as possible.  Also, if traversing records via a search in a loop, be warned that deleting a record during the traversal has undefined behavior (most likely, the next search will fail).  To work around this, simply fetch the ‘next’ value before deleting the current one:

next = db.data[1].search(1, 0);
while( next )do
{
    current = next;
    next = db.data[1].search(1, next);
    (* Now that we know which is next, the 'current' record
       can be safely deleted without messing up the traversal. *)
    db[current].delete();
}

<name>[].delete()

This frees all data associated with the record, resets all values to 0, and unschedules any tasks associated with it.

<name>[].<fieldtype>[].search(<value>, <start>)

This allows you to search the database for a given value, starting with the record immediately following the specified start record.  To start searching from the beginning of the database, simply use 0 as the start value.

<name>[].<fieldtype>[].nsearch(<value>, <start>)

Works the same as search(), but returns the first record after the <start> record that does not match the given value.
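
A minimal sketch of nsearch() in a traversal, modeled on the search() loop shown earlier and assuming nsearch() likewise returns 0 when no further record qualifies (the text does not state this explicitly).  It visits every record of the box database whose first flag is not 0.  The variable used is hypothetical and its declaration is omitted.

used = box.flag[1].nsearch(0, 0);
while( used )do
{
    (* Record 'used' has a non-zero flag[1]; work with it here. *)
    used = box.flag[1].nsearch(0, used);
}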

DB Field Aliasing

While the structure of a database is easily changed, it’s quite tedious (and very poor programming practice) to refer to the elements of a record by their numeric ordering.  DB aliasing provides a way to symbolically define mappings of names to numeric field indexes.  This lets you treat each record more like a ‘structure’ that’s available in other programming languages.

The aliasing feature also allows an N-to-1 mapping, so that fields of a record can be shared among different parts of code.  While potentially dangerous, this is quite powerful and permits dramatic savings in storage requirements (by allowing bits of data to have different meanings based on programmatic context).

How to set up an alias

dbalias <database>
{
    <alias identifier> => <field type> [<integer>],
    ...
}

As you can see, the alias is a proper identifier (a single word with an alphabetic character as the first character) and maps to a specific element of a given field type.  By convention, aliases begin with a capital letter and are mixed case (e.g.  “AnAliasName”), but this is not enforced.  Also by convention, string type aliases end with “$” to indicate their type, but this is also not enforced.

A dbalias example

dbalias box
{
    Type => byte[1],
    Width => data[1],
    Height => data[2],
    Weight => variable[1],
    Description$ => string$[1],
    IsOpen => flag[1],
    IsLocked => flag[2],
}

Data member sharing

To illustrate the data sharing concept, consider a simple database record in which one data member has a different ‘meaning’ depending on the value of another field.

dbalias door_or_lock
{
    IsALock => flag[1],
    IsClosed => flag[2],
    IsLocked => flag[2],
}

As you can see, the value of IsALock determines how flag[2] is interpreted.  If the record is a lock, flag[2] says whether it is currently locked, and the IsLocked alias applies.  If it is not a lock, it is assumed to be a door, and the IsClosed alias makes more sense.  Either way, only a single bit is used to store whether it is closed or locked, and another bit determines which of the two it means.  Clever use of this feature can save a lot of space when storing many records in a database.

Chaining multiple dbalias {} constructs

It’s also worth noting that dbalias declarations can be chained together, and subsequent instances simply add to the definition during compilation.  This permits a rudimentary private/public interface specification if used in a header file context.  For example:

In foo.rh

(* The database exists in some file. In this case foo.r *)
extern database foo;

(* Declare 'public' members. *)
dbalias foo
{
    MyPublicMember$ => string$[1],
    AnotherPublicMember => data[1],
}

And then, in the implementation file foo.r

(* Include the 'public' members. *)
#include "foo.rh"

(* Declare the database and tell it to persist. *)
database foo = "Foo-Data";

(* Append our 'private' members to the alias mappings *)
dbalias foo
{
    PrivateMember => data[2],
    PrivateMember2 => data[3],
}

If you try to redefine an alias name that already exists, an error will be generated.

Note: Adding alias mappings during compilation will not make a database reserve room for them or adjust the structure of a database record to accommodate them.  You need to explicitly set the number of elements of each type using a <db>.<fieldtype>.maxrecord assignment sometime during your initialization.
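
For the foo example above, a minimal sketch of that initialization could look like the following.  The aliases use string$[1] and data[1] through data[3], so each record needs at least one string$ element and three data elements.  Placing the assignments in main() is an assumption borrowed from the earlier box example.

subroutine main()
{
    foo.string$.maxrecord = 1;  (* Room for MyPublicMember$. *)
    foo.data.maxrecord = 3;     (* Room for AnotherPublicMember, PrivateMember, and PrivateMember2. *)
}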
