Category Archives: TTM

A Paraphrase of The Third Manifesto

The Third Manifesto by C. J. Date and Hugh Darwen is the culmination of more than two decades of work on defining a foundation for the future of managing data. The authors view SQL and the relational database management systems that use it as deeply flawed, and have set down their views on a language to address those flaws. The title reflects two prior attempts (manifestos) by other authors.

The intention here is to paraphrase, generalise and shorten the wording and to use terminology familiar to a general IT audience, while retaining the original intent, meaning and numbering. This has involved some reorganisation and some inclusion of background material from other writings to ensure that the all the terms used can be understood within one document.

Read more

2 Comments

Filed under Rationale, TTM

Alice and the Relational Algebra

This is a summary largely based on the ‘Alice’ book: http://webdam.inria.fr/Alice/.

The non-redundant set of operations for a maximally effective Relational Algebra that is restricted to safe, domain independent, non-recursive, type safe queries is:

  • Select (restrict/where)
  • Project
  • Rename
  • Join (natural)
  • Union

This set is referred to as SPRJU. This set is incomplete but ‘safe’ because it excludes negation. It can be completed by adding:

  • Minus (set difference).

This set could be referred to as SPRJUM, and from this foundation all the other operations of the Relational Algebra can be created. That includes other set operations, and other kinds of join such as Antijoin or Not Matching.

To this the authors propose the addition of

  • While (recursion)

The complete set could be referred to as SPRJUMW. It is able to handle recursive joins, similar to the SQL Recursive Common Table Expression.

This is the endpoint for domain-independent queries. There is nothing to add (or take away). This is a complete Relational Algebra, capable of the full range of safe, domain-independent queries (including recursive queries).

However, this Algebra is unable to perform a range of familiar tasks based on computation, which means that it cannot return any attribute values that are not already in the database and it cannot perform computation-like activities such as counting. It is not Turing Complete, and cannot be made so.

Computation requires abandoning domain-independence. To compute, one needs a type system. The SELECT of SQL and the EXTEND of Tutorial D require computation and a type system. These issues are not directly addressed in this book or anywhere else that I know of.

Some useful extensions that lie outside the domain-independent Relational Algebra include:

  • Introducing an attribute type system and calculated attributes (EXTEND in Tutorial D, SELECT in SQL)
  • Introducing algorithmic or calculated relations (see Hall-Hitchcock-Todd or Appendix A)
  • An extended While (see book for details).

Appendix A shows that the addition of calculated relations can provide a substitute for Select/Restrict/Where. Operators that provide recursion can substitute for While. Ordering or relaxing type safety provide further choices. At some point there are just too many options available. However, aggregation is not adequately treated in any of the above. The Alice book has a brief section of little practical value.

It is my view that this is an adequate theoretical foundation for a Relational Algebra that is complete within the limits of safe, domain-independent, type safe queries. A foundation for the various methods of introducing computation is still lacking, especially with respect to aggregation and ordering.

Leave a Comment

Filed under Relational Algebra, TTM

Why do we need a new database language?

As it stands, The Third Manifesto satisfies an academic need. It sets out in great detail a proposed foundation for database programming, along with a language and type system, with the clear intention of replacing and surpassing SQL.

But the question is: what purpose does it serve in the wider IT community? Who has a problem that it will solve? Who needs products built to TTM guidelines?

I think the ‘smoking gun’ of current business application development is the ‘object relational impedance mismatch’. The plethora of ‘object relational mappers’ or ORMs out there provide evidence for that view. The story is something like this.

  1. The ‘object relational impedance mismatch’ is a real impediment to productive application development. The ORM layer consumes large amounts of programmer time and energy to develop and maintain, it’s a major source of bugs and a major source of performance issues. The ORM is a solution to a problem that should not exist. It’s time for it to go.
  2. Attempts to resolve this on the database side (by creating ‘object databases’) have been a dismal failure, at least for general purpose application development. The ‘NoSQL’ databases have done better, at least for a certain range of applications, but there is no sign that either is capable of taking over as the next generation.
  3. Relational theory is a proven foundation for building robust databases over a wide range of size and function. Relational databases are here for the long haul.
  4. Therefore the ORM problem must be resolved on the language side (by creating ‘relational languages’). The essential business logic of any application has to be expressed in a language that has native access to relational data, without the need to translate the data into objects and back again.
  5. The first attempt at such a language was SQL, and it has failed to deliver.
    1. SQL at the 1992 level is widely used as a data sublanguage in conjunction with an ORM. It cannot be used to develop business logic.
    2. SQL since 1996 has included SQL/PSM, which can be used to develop business logic (often referred to as stored procedures). However actual implementations of SQL/PSM very widely in compliance and capabilities; there are multiple partial and incompatible implementations, and in many environments there is none (eg Sqlite on Android).
    3. SQL is a very old language (since the 1970s) and has many technical deficiencies including an awkward syntax, difficulties in fully expressing the Relational Algebra and limited capabilities for extension and defining types.
    4. SQL does not provide or is incompatible with modern language development systems: IDE, debugger, version control, build systems, etc.
  6. So the need is for a modern language that
    1. Can be used instead of, as well as or alongside SQL
    2. Can be used to code the essential business logic of applications
    3. Has native access to relational data and data types
    4. Has a sophisticated extensible type system
    5. Plays well with other languages and technologies
    6. Has no ‘object relational impedance mismatch’ (by design)
    7. Is competitive with other modern languages in its use of and compatibility with modern language development systems
    8. Is available as the same language on all possible platforms.

That’s what The Third Manifesto describes and that’s what Andl aspires to be.

Leave a Comment

Filed under Pondering, TTM

A Paraphrase of the Third Manifesto

The Third Manifesto is the highly authoritative result of over two decades of work by CJ Date and Hugh Darwen. It sets out their view on the future for database management systems, and on a language in particular. To a large extent Andl is based on that work.

But TTM (as it is known) is not an easy read. It is written in academic language for an academic audience, and not everyone will find that accessible. This TTM Paraphrase is my attempt to express the ideas of TTM in language more familiar and more accessible to the general IT reader. It is the document I use when I want to quickly remind myself of some key points and terminology. It’s my ‘cheat sheet’ for TTM.

So I’m publishing this draft in the hope that it will be useful, and to solicit feedback. It is not a substitute for, an improvement on or even a refinement of TTM. At best it may help some people to understand TTM. I know writing it has helped me.

The TTM paraphrase, initial draft release, with minor updates: TTM-para-d26.

Leave a Comment

Filed under TTM

The true calling of ‘D’

Much of TTM and related writings deals with what’s wrong with SQL and how it should be done better. SQL is an essential part of the communications between applications and databases, in conjunction with an ORM of some kind. Which kind of points to a similar role for the language D. At least that’s the way it has seemed to me.

My question is: would it be better to think of D not so much as a replacement for SQL as the language in which to code an application data model?

Using a slightly modified version of my 4 layers, and just thinking about a modest web app:

  1. UI access: coded in HTML, CSS and JS.
  2. Glue code: written in GP language: Java, C#, ruby, etc
  3. Data model: coded in GP language.
  4. DBMS access: coded in GP language and SQL.

By data model here I mean the totality of the state of the business data that models the application, both transient and persistent. The idea would be to code layer 3 entirely in a suitable D. It would draw together data from a variety of sources, and is free to use SQL and a DBMS for persisting or retrieving data.

And that leaves me pondering two questions.

  1. What specific D features are required to fully implement a data model? Scoped constructs like functions or modules are vital, I think.
  2. What should the API between application and data model look like? POCOs rather than relvars I think.

All of a sudden this sounds like a bigger project.

Leave a Comment

Filed under Rationale, TTM