SIRA_PRISE rationale and features

What is it ?

SIRA_PRISE is an acronym for Straightforward Implementation of a Relational Algebra - Prototype of a Relational Information Storage Engine. That's more than just a mouthful, but it does cover what it's all about :

This piece of software is mainly inspired by three books about the relational model and relational database design :

References will occasionally be made to these books. They will be referred to as "TTM", "TDRM" and "AM4DP" respectively.

Ergo : SIRA_PRISE is, quite simply, a relational DBMS. In particular, SIRA_PRISE is aimed at providing a data storage mechanism and an implementation of a relational algebra, such that a TTM-compliant language can be trivially built on top of it. Which is why SIRA_PRISE is not (at all) a TTM implementation sensu strictu, but rather "just inspired by".

As for the words 'Straightforward' and 'Prototype' : that's how it started. They don't really apply any longer, but we regard removing the concerned characters from the name as being inappropriate for psychological reasons.

With respect to the indefinite article in 'a relational algebra', we assume it is well-known that there is no such thing as "the" relational algebra, and furthermore it won't probably come as a surprise that the relational algebra implemented here is not the "A" algebra as defined in TTM, but rather an algebra with the more familiar RESTRICT, UNION, INTERSECT, JOIN, EXTEND, PROJECT, … operators.

What is supposed to be so special about it ?

Or, put otherwise, how is SIRA_PRISE different from :

  1. existing industrial-strength technology (read : SQL systems) ?
  2. other ongoing attempts at implementing the relational model ?

The answer to (1) is rather obvious : great care has been taken to avoid any of the relational mistakes that pollute SQL so heavily : no nulls, no duplicate rows, no nameless columns, no duplicate column names in the result of a join, empty headings are allowed, empty keys are allowed, etc. This actually makes SIRA_PRISE among the first relational DBMS's in IT history, since no SQL product truly qualifies as 'relational', and non-SQL DBMS products are few and far between (well, at least if we consider only those that adhere to all those "well-known" sound principles of relational theory).

The answer to (2) requires a bit more explaining. Most of the other ongoing attempts at implementing the relational model seem to be highly focussed on the language used for data manipulation. The actual storage mechanism used with that kind of approach is typically left to some existing SQL system, or to some engine that is de facto only used to support SQL systems. The main focus in those projects seems to be to build a compiler/interpreter/runtime environment for a (programming) language that qualifies as a D. The aspect of physical data storage is "forgotten" (well, not truly forgotten of course, but left significantly unaddressed anyway).

There is a problem with that approach. Especially in the context of temporal data (which was the main motivation to start this project so many years ago), and the 6NF that goes along with that territory, SQL systems (and therefore also any system built on top of them) usually have the effect of dispersing data that will often be used together, across many distinct physical places. Data that will often need to be joined together, will reside in many distinct physical pages, and joining that data may/will require lots of physical I/O, with the predictable effect on general performance.

SIRA_PRISE aims specifically at offering a data storage solution that is different from the storage options we get from the "average" SQL system. In particular, it aims at "keeping data physically together if those data logically "belong" together, and keeping data physically apart if those data do not logically "belong" together". Being more focussed as it is on the physical aspects of data management, we do not regard SIRA_PRISE's language as any kind of D, nor do we feel it needs to qualify as such. Rather, this exercise should be regarded as an attempt to build, as someone in the TTM discussion group once put it, "a high-level database interface on which a D can be trivially built".

That is of course not to say that "physical storage" is the only problem that SIRA_PRISE deals with. In practice, the engine must of course record some data of its own (data about the files, records, indexes, etc.). Otherwise it could never work. The place to store those data is traditionally referred to as "the catalog". And ever since Codd formulated his famous 12 rules, it is a widely accepted idea that this catalog should be structured exactly like any other database that the engine manages, and should be manipulated using the very same facilities that the engine offers to its users. To cut a longer story short, SIRA_PRISE is faced with the need to also have an implementation of a relational algebra, allowing the user to manipulate the catalog data in a relational way.

More details about this algebra can be found on the "Introduction to the Relational Algebra" page. If you never heard of terms such as "Relvar" or "Tuple" and the only knowledge you have about the Relational Model was acquired through the use of SQL systems, then carefully studying this documentation is probably a sine qua non for you.

Features :