# Category Archives: Language

## 1.    Introduction

These are the formal definitions of an Extended Relational Algebra in a style similar to Algebra A in Appendix A of DTATRM by D&D. The operators of this ERA are used to construct shorthands, intended to form the basis of a relational programming language. It should be noted that Appendix A takes the reduction path from language shorthand to algebra operation. Here the path is taken in the opposite direction, defining algebraic operations from which the shorthands of a language may be constructed.

These operators do not provide or require any particular type system. Any use of the algebra is dependent on an external provider of a type system and functions over those types. Shorthands such as GROUP and UNGROUP, WRAP and UNWRAP require that relations and tuples be provided by that type system.

The naming conventions are adapted from Appendix A. In what follows:

A is an attribute name

T is a type

v is a value

<A,T> is an attribute (member of a heading)

<A,T,v> is an attribute value (member of a tuple)

r is a relation

Hr is the heading of r

Br is the body of r

tr is a tuple of r

## 2.    Basics from Appendix A

These are the basic 5 definitions adapted from Appendix A.

### a)      NOT

Given a relation r then s = NOT(r) is defined as follows.

Hs = Hr

Bs = { ts : ∃ tr ∉ Br, ts = tr

The NOT operator yields the complement s of a given relation r. The heading of s is the heading of r. The body of s contains every tuple with that heading that is not in the body of r.

The NOT operator is used in shorthands that involve negation, such as ANTIJOIN, MINUS.

### b)      REMOVE

Given a relation r and a type T such that <A,T> ∈ Hr then s = REMOVE(A) is defined as follows.

Hs = Hr minus { <A,T> }

Bs = { ts : ∃ tr ∉ Br, ∃ v ∈ T, <A,T,v> ∈ tr, ts = tr minus { <A,T,v> }

The REMOVE operator yields a relation s formed by removing a given attribute A from a given relation r. The operation is equivalent to taking the projection of r over all of its attributes except A. The heading of s is the heading of r minus the ordered pair <A,T>. The body of s contains every tuple that conforms to the heading of s and is a subset of some tuple of r.

The REMOVE operator is used in shorthands that remove one or more attributes, such as PROJECT.

### c)      RENAME

Given a relation r, a type T such that <A,T> ∈ Hr and no type T such that <B,T> ∈ Hr then s = RENAME(A,B) is defined as follows.

Hs = ( Hr minus { <A,T> } ) union { <B,T> }

Bs = { ts : ∃ tr ∈ Br, ∃ v ∈ T, <A,T,v> ∈ tr,

ts = ( tr minus { <A,T,v> } ) union { <B,T,v> } }

The RENAME operator yields a relation s that differs from a given relation r only in the name of one of its attributes, which is changed from A to B. The heading of s is the heading of r except that the ordered pair <A,T> is replaced by the ordered pair <B,T>. The body of s consists of every tuple of the body of r, except that in each such tuple the triple <A,T,v> is replaced by the triple <B,T,v>.

The RENAME operator is used in shorthands that rename one or more attributes, such as RENAME.

### d)      AND

Given relations r1 and r2 such that <A,T1> ∈ Hr1 and <A,T2> ∈ Hr2 implies T1 = T2 then s = AND(r1,r2) is defined as follows.

Hs = Hr1 union Hr2

Bs = { ts : ∃ tr1 ∈ Br1, ∃ tr2 ∈ Br2, ts = tr1 union tr2 }

The AND operator is relational conjunction, which is usually known as a natural join. The heading of s is the union of the headings of r1 and r2. The body of s contains every tuple that conforms to the heading of s and is a superset of both some tuple in the body of r1 and some tuple in the body of r2.

The AND operator is used in join shorthands such as JOIN, SEMIJOIN, ANTIJOIN, COMPOSE, TIMES.

### e)      OR

Given relations r1 and r2 such that <A,T1> ∈ Hr1 and <A,T2> ∈ Hr2 implies T1 = T2 then s = OR(r1,r2) is defined as follows.

Hs = Hr1 union Hr2

Bs = { ts : ∃ tr1, ∃ tr2, ( tr1 ∈ Br1 or tr2 ∈ Br2), ts = tr1 union tr2 }

The OR operator is relational disjunction, being a generalization of what is usually known as union. In the special case where the given relations r1 and r2 have the same heading, the result s is in fact the union of those two relations in the traditional sense. The heading of s is the union of the headings of r1 and r2. The body of s contains every tuple that conforms to the heading of s and is a superset of either some tuple in the body of r1 or some tuple in the body of r2.

The OR operator is used in set shorthands such as UNION, MINUS, SYMDIFF, INSERT.

## 3.    Relational Constant

This is the formalisation of a relational constant or ‘relcon’, and relies on some previously defined function f. Separate formalisations are provided for monadic and dyadic functions, in both predicate and value-returning forms. Extending them to n-adic functions is left as an exercise.

Given attribute names X, types Tx and function f with the type signature Tx->Boolean then s = RELCON(X,f) is defined as follows.

Hs = {<X,Tx>}

Bs = { ts : ∃ vx ∈ Tx, f(vx) = true, ts = {<X,Tx,vx>} }

Given attribute names X,Y, types Tx,Ty and function f with the type signature Tx->Ty then s = RELCON(X,Y,f) is defined as follows.

Hs = {<X,Tx>, <Y,Ty>}

Bs = { ts : ∃ vx ∈ Tx, ∃ vy ∈ Ty, f(vx) = vy,

ts = {<X,Tx,vx>, <Y,Ty,vy>} }

Given attribute names X,Y, types Tx,Ty and function f with the type signature Tx->Ty->Boolean then s = RELCON(X,Y,f) is defined as follows.

Hs = {<X,Tx>, <Y,Ty>}

Bs = { ts : ∃ vx ∈ Tx, ∃ vy ∈ Ty, f(vx,vy) = true),

ts = {<X,Tx,vx>, <Y,Ty,vy} }

Given attribute names X,Y,Z, types Tx,Ty,Tz and function f with the type signature Tx->Ty->Tz then s = RELCON(X,Y,Z,f) is defined as follows.

Hs = {<X,Tx>, <Y,Ty>, <Z,Tz>}

Bs = { ts : ∃ vx ∈ Tx, ∃ vy ∈ Ty, ∃ vz ∈ Tz, f(vx,vy) = vz),

ts = {<X,Tx,vx>, <Y,Ty,vy, <Z,Tz,vz>} }

[Note: for the relcon PLUS in App-A, types Tx, Ty and Tz are all INTEGER, and the function f is the scalar operator “+”.]

The RELCON operator predicate forms are used together with AND in shorthands such as WHERE/RESTRICT.

The RELCON operator value-returning forms are used together with AND in shorthands such as EXTEND, SUMMARIZE, UPDATE and DELETE.

## 4.    Aggregation

This is the formalisation of simple aggregation, similar to SUM, MAX or MIN but allowing for a wider range of functions. It relies on some previously defined aggregating function f. An aggregating function has an argument which is a list of values (a ‘bag’ or ‘multiset’ rather than a ‘set’) and returns a single value. Extending this approach to more complex aggregations is left as an exercise.

Given a relation r, a type T such that <A,T> ∈ Hr and an aggregating function f with the type signature T[]->Ta then s = AGGREGATE(r,A,f) is defined as follows.

Hs = ( Hr minus {<A,T>} ) union {<A,Ta>}

Bs = { ts : ∃ v ∈ T, ∃ tr ∈ r, ∃ <A,T,v> ∈ tr, v ∈ v[],

ts = tr minus {<A,T,v>} union {<A,Ta,f(v[])} }

Note: the term v[] should be read as: a list of the values of v for some value of ts.

The AGGREGATION operator is used in shorthands such as SUMMARIZE.

## 5.    Transitive Closure

As transitive closure cannot be defined directly in set-builder notation, this formalisation defines it indirectly by means of a recurrence relation. This consists of a starting value (given) and a sequence of successors (defined by set-builder). The transitive closure is the fix-point union of that sequence.

The starting value (‘seed’) represents known edges in a directed graph; the end value is all the possible paths through the graph.

### a)      Simple Transitive Closure

Given a relation r with the heading {<A,T>,<B,T>} for some type T then the successor relation r’ is defined as follows.

Hr’ = Hr

Br’ = { tr’ : tr’ ∈ Br or

∃ v1 ∈ T, ∃ v2 ∈ T, ∃ v3 ∈ T, ∃ v4 ∈ T,

∃ tr1 ∈ Br, tr1 = {<A,T,v1>, <B,T,v2>},

∃ tr2 ∈ Br, tr2 = {<A,T,v2>, <B,T,v3>},

tr’ = {<A,T,v1>, <B,T,v3>} }

Then the transitive closure s = TCLOSE(r) is defined as follows.

S = r1 U r2 U r3 … r¥

This is a linear recurrence, which can be shown to reach a fix-point termination.

The TCLOSE operator is used in a shorthand of the same name.

### b)      Extended Transitive Closure

In this case each tuple is associated with a value, and this definition relies on some previously defined dyadic function f that takes values of that type as its argument and return value. The value computed represents in some sense the cost of that path.

Given a relation r with heading {<A,T>,<B,T>,<C,Tv>} for some types T and Tv, and a dyadic function f with the type signature Tv->Tv->Tv then the successor relation r’ is defined as follows.

Hr’ = Hr
Br’ = { tr’ : tr’ ∈ Br or
( ∃ v1 ∈ T, ∃ v2 ∈ T, ∃ v3 ∈ T, ∃ v4 ∈ T,
∃ w1 ∈ Tv, ∃ w2 ∈ Tv,
∃ tr1 ∈ Br, tr1 = {<A,T,v1>, <B,T,v2>, <C,Tv,w1>},
∃ tr2 ∈ Br, tr2 = {<A,T,v2>, <B,T,v3>, <C,Tv,w2>},
tr’ = {<A,T,v1>, <B,T,v3>, <C,Tv,f(w1,w2)} ) }

Then the extended transitive closure s = ETCLOSE(r,f) is defined as follows.

s = r1 U r2 U r3 … r¥

This is a linear recurrence, which can be shown to reach a fix-point termination.

The ETCLOSE operator is used in combination with AGGREGATE in a shorthand to perform Generalised Transitive Closure as defined by D&D (RM VSS 5).

Filed under Language, Relational Algebra

## An Extended Relational Algebra

The following is a comprehensive set of algebraic operations that comprise an Extended Relational Algebra. They are atomic, in the sense that they each carry out precisely one operation and can be combined as needed. There is some redundancy, by design.

Each operation is a function with one or two relation values as arguments and returns one as its result. Each operation is generic, and has to be specialised for each use. It is specialised by:

• The heading of each of its relation arguments
• One or more attribute names, where the operation involves adding, removing or substituting attributes.
• A function name and appropriate arguments, as attribute names or literal values, where the operation requires a computation.

Each operation returns a value with a heading determined entirely by the specialisation as set out above. The ERA depends on a library of pre-defined functions that are compatible with the types of the various attributes. The library and those types are specified separately. The ERA has type system, and no means to access tuple or attribute values.

The ERA also includes an assignment operation, which updates the value of a pseudo-variable, identified by a literal value.

Filed under Language, Relational Algebra

## Lambdas and functions as values

Andl now has ‘functions as values’, and lambda literals.

A lambda is the literal form of a value of type ‘function’. It corresponds to a defined function but can be treated as a first class value: it can be

• stored in a variable, component member of a user-defined type or attribute of a tuple
• passed as a parameter
• executed by the postfix operator of parentheses enclosing an argument list.

The syntactic definition is similar to a function definition omitting the name.

```// function literal
def funlit(def(a:'') => a & a)
funlit(‘ab’)
// function variable
var funval := def def(a:'') => a & a
funval(‘ab’)
// direct call on function value
(def def(a:'') => a & a)(‘ab’)```

In this example, funlit, funval and the literal lambda can be called interchangeably. The value can be stored in a variable, user type component or tuple attribute and can be passed as a parameter. There is no type declaration as such, just literals of the type. Treating the function call on values as a postfix operator produces an odd asymmetry:

```funval(‘ab’) // legal
funlit(‘ab’) // legal
(funval)(‘ab’) // legal
(funlit)(‘ab’) // illegal! [Built in and user-defined operators are still second class.]```

Please note that these are individual function values, not methods. They have access to local and global scope, but there is no object or instance scope.

The latest release provides the working system and samples. Download from github.

Filed under Language, Release

## Announcing: Andl Syntax 6

Andl syntax has been evolving over time. The original syntax was somewhat cryptic and a bit of a challenge for anyone to understand. Also some parts of the syntax could be ambiguous in some situations, or at least would limit the generality of some constructs. It has also become increasingly clear that much of the syntax has direct parallels in SQL, and that it is people who already know SQL who are most likely to be interested.

Introducing Syntax 6. The key development is that each of the parts of a query takes a form that is closely parallel to a corresponding construct in SQL. So, for example:

• The function `.where(expression)` performs the same function as an SQL WHERE clause.
• The function `.order(field, field...)` performs the same function as an SQL ORDER BY clause.
• The function `.set{ field, field:=value }` performs the same function as an SQL SELECT.

And there are lots more. These are postfix functions (with a dot) that follow a relational expression. On the other hand JOIN and UNION are infix operations, placed between two values. When you put them together you get something like this SQL query and the corresponding Andl code.

```// SQL>select distinct sname from s, sp, p where p.color='Red' and s.s#=sp.s# and p.p#=sp.p#
(S .set{ S#, SNAME } join SP .set{ S#, P#} join P .set{ P#,COLOR }) .where(COLOR='Red') .set{ SNAME }```

The order is not quite the same but all the same elements are there. If you know SQL, now with Syntax 6 you can learn Andl easily!

Latest releases are on the Downloads page. There are lots more examples of SQL and matching Andl syntax in the sample file `SPPSample1.andl`.

Filed under Code sample, Language, Release

## Announcing: Andl with Thrift is here

Thrift is a platform-independent API technology, see http://thrift.apache.org/. It’s a “software framework for scalable cross-language services development, combining a software stack with a code generation engine to build services that work efficiently and seamlessly between C++, Java, Python, PHP, Ruby, Erlang, Perl, Haskell, C#, Cocoa, JavaScript, Node.js, Smalltalk, OCaml and Delphi and other languages”. And now with Andl.

Andl provides the following Thrift-related capabilities.

1. The Andl compiler can generate a Thrift Interface Definition Language (IDL) file as an optional output.
2. The Andl runtime performs data conversion to and from the format required by Thrift classes for wire transmission.

That’s all you really need. To put together a complete application involves the following steps.

1. Write the database backend in Andl.
2. Compile with the /t switch to generate the Thrift IDL. Compiled data and code are stored in a database catalog.
3. Run the Thrift IDL compiler to generate an API of callable functions in the target language.
4. The release includes a generic server written in C# using Thrift classes, which can be modified to suit specific needs. It simply needs to load the catalog.
5. The release also includes sample client programs using both Thrift classes and the target language API functions, but containing no Andl  code or functions at all.
6. The developer should modify the client programs or create new ones that call the target language API according to the needs of the application.

And that’s it. Mostly it is just Thrift and the Thrift documentation applies. The end result is that you can write an Andl program that exposes a set of callable entry points, and then call them from any Thrift-supported language.

Filed under Language, Release

## Recursive Queries: Sudoku Solver

This Sudoku solver uses a number of recursive queries, first to generate fixed data structures and then to drive the search for a solution.

This solver simply applies the rules.

1. Every row, column and box (3×3) must have the digits 1 to 9 once and once only.
2. If only one digit can go in a cell then it must go there.
3. If a digit can go in only one place in a row, column or box then it must go there.

Here is the code.

First, the fixed data structures.

```digits := {{ sdigit:= '1', ndigit := 1}} recurse( {{ sdigit:=text(ndigit+1), ndigit:=ndigit+1 }} [?(ndigit <= 9)] )
digitsx := digits union {{ sdigit := '.', ndigit := 0 }}
units := {{ index := 0, row := 0, col := 0, box := 0 }} recurse(
{{ index := index + 1,
row := (index + 1) div 9,
col := (index + 1) mod 9,
box := (index + 1) div 3 mod 3 + (index + 1) div 27 * 3 }} [?(index <= 80)] )
poss := units [{ index }] join digits [{ ndigit }]
possu := units join digits [{ ndigit }]
```

Now some useful functions.

```showb(t:text) => do {
seq(11)[{ N, line:=
if(N mod 4 = 3,
fill('-', 9),
right(left(t, 9 + (N - N div 4) * 9), 9))}]
}
// Show a set of knowns. First fill out all index values, then convert to text
showunk(k:poss) => do {
t := (k union (units ajoin k)[{ index, ndigit := 0}]) join digitsx[{ ndigit, sdigit }]
showb(t [\$(index)][{ fold(&, sdigit) }])
}```

The original raw data

```inp := {{ sud := '53..7....6..195....98....6.8...6...34..8.3..17...2...6.6....28....419..5....8..79' }}
//inp := {{ sud := '1....7.9..3..2...8..96..5....53..9...1..8...26....4...3......1..4......7..7...3..' }}
inp
board := ((units join inp) [{ * sud, sdigit := right(left(sud, index + 1), 1) }] compose digitsx) [{ index, ndigit }]
knowns := board [?( ndigit <> 0)]
'Knowns=' & knowns.count
showunk(knowns)```

This the solver, which recurses as long as it can make progress. After this you have to guess.

```solution := knowns recurse(
do {
knownsu := knowns join units
allowedu := possu ajoin knownsu[{ index }]
ajoin knownsu[{ row, ndigit }]
ajoin knownsu[{ col, ndigit }]
ajoin knownsu[{ box, ndigit }]

// algorithm 1 - a cell with only one possible digit must be that digit
new1 := allowedu [{ index, tot:=fold(+,1) }] [?(tot=1)] join allowedu

// algorithm 2 - a digit with only one place in a unit must go there
new2a := allowedu [{ ndigit, row, tot:=fold(+,1) }] [?(tot=1)] join allowedu
new2b := allowedu [{ ndigit, col, tot:=fold(+,1) }] [?(tot=1)] join allowedu
new2c := allowedu [{ ndigit, box, tot:=fold(+,1) }] [?(tot=1)] join allowedu

new1[{ index, ndigit }] union new2a union new2b union new2c
}
)
showunk(solution)```
`And here is the grid, before and after solving.`
```N      | line
------------------
0 | 53..7....
1 | 6..195...
2 | .98....6.
3 | ---------
4 | 8...6...3
5 | 4..8.3..1
6 | 7...2...6
7 | ---------
8 | .6....28.
9 | ...419..5
10 | ....8..79

N      | line
------------------
0 | 534678912
1 | 672195348
2 | 198342567
3 | ---------
4 | 859761423
5 | 426853791
6 | 713924856
7 | ---------
8 | 961537284
9 | 287419635
10 | 345286179```

Filed under Code sample, Language

## Recursive Queries: the Mandelbrot Set

Recursive queries can be used to solve a variety of problems that require multiple passes, not just recursive data structures. Here is an implementation of the Mandelbrot set as ASCII art.

```xaxis := {{ x:=-2.0 }} recurse( {{ x:=x+0.05 }} [?(x<1.2)] )
yaxis := {{ y:=-1.0 }} recurse( {{ y:=y+0.1 }} [?(y<1.1)] )
m := ({{ iter:=0, x:=0, y:=0 }} join xaxis[{ cx:=x }] join yaxis[{ cy:=y }])
recurse( {{ iter:=iter+1, x := x*x-y*y+cx, y:=2*x*y+cy, cx, cy, }} [?(x*x+y*y<4.0 and iter<28)] )
m.count
m2 := m[{ iter := fold(max,iter), cx, cy }] [\$(cy,cx)]
m2.count
a := m2 [ { cy, t := fold(&, right(left(' .+*#', 1 + iter div 6), 1)) }]
a```

This query iterates each point up to 28 times to find those that lie on the border of the Mandelbrot set. Each point that is still within range is iterated until no more can be found. The number of iterations selects which character to display.

Here is the result.

```cy     | t
--------------------------------------------------------------
-1.0 |                                     ....#
-0.9 |                                    ..#*..
-0.8 |                                  ..+####+.
-0.7 |                             .......+####+...   +
-0.6 |                            ..##+###########*.++++
-0.5 |                           .+.##################*.
-0.4 |               .............*###################+.*
-0.3 |               ..+*.+#+....*#####################+.
-0.2 |              ...+#######++#######################.
-0.1 |           ...++*################################.
0.0 |  #############################################...
0.1 |           ...++*################################.
0.2 |              ...+#######++#######################.
0.3 |               ..+*.+#+....*#####################+.
0.4 |               .............*###################+.*
0.5 |                           .+.##################*.
0.6 |                            ..##+###########*.++++
0.7 |                             .......+####+...   +
0.8 |                                  ..+####+.
0.9 |                                    ..#*..
1.0 |                                     ....#```

Filed under Code sample, Language

## Recursive Queries

Andl now supports recursive queries. This can be seen as a direct extension of the Relational Algebra for the situation where a query cannot be satisfied in a single operation.

Imagine a company organisation chart. One query will find every employee who reports to a given boss, but it takes a succession of queries to find out who reports to that employee, and who reports to them, until at last we reach the workers and no-one reports to them.

Here is the data for a simple orgchart (data thanks to SQLite).

```name  | boss
-------------
Alice |
Bob   | Alice
Cindy | Alice
Dave  | Bob
Emma  | Bob
Fred  | Cindy
Gail  | Cindy
```

Here is the Andl code to query it.

```def orgchart:db(csv)
orgchart
ua := {{ name:= 'Alice', level := 0 }} recurse( {{ boss := name, level := level+1 }} compose orgchart)
ua
ua [{ t:=fill('.', level*3) & name }]```

And here is the result. The recurse function repeatedly evaluates the join with successive tuples starting from the seed, until no more can be found.

```name  | level
--------------
Alice |      0
Bob   |      1
Cindy |      1
Dave  |      2
Emma  |      2
Fred  |      2
Gail  |      2
```
```t
----------
Alice
...Bob
...Cindy
......Dave
......Emma
......Fred
......Gail
```

Filed under Code sample, Language

## Image relations

TTM and its companion book DTATRM talk about an IMAGE RELATION function, that joins two relations, embedding matching tuples from the right as a relation-valued attribute of the left.

Here is a piece of code that creates an RVA. The relation exam_mark is partitioned across the several courses.

```cer := course [{ CourseId, ExamResult := {{ * }} joinr exam_mark}]

CourseId   | ExamResult
----------------------------------
C1         | StudentId  | Mark
| ---------------------
| S1         |       85
| S2         |       49
| S4         |       93
C2         | StudentId  | Mark
| ---------------------
| S1         |       49
C3         | StudentId  | Mark
| ---------------------
| S3         |       66
C4         | StudentId  | Mark
| ---------------------```

Here is a piece of code that does the job of putting the tuples back together again.

```cer [{ fold(union,ExamResult) }]

StudentId  | Mark
---------------------
S1         |       85
S2         |       49
S4         |       93
S1         |       49
S3         |       66```

Note that this relies on an anonymous attribute, which is ‘lifted’ out as a single value.

Filed under Language

## 99 bottles of Andl beer!

In response to a challenge, I decided to actual solve the problem. Here it is. The lyrics are here: http://www.99-bottles-of-beer.net/lyrics.html

```ten := {{ n:=0 },{ n:=1 },{ n:=2 },{ n:=3 },{ n:=4 },{ n:=5 },{ n:=6 },{ n:=7 },{ n:=8 },{ n:=9 }}
hundred := (ten join ten[{nn:=n}]) [ {neg := -10*nn-n,nnn := 10*nn+n}] [nnn>0]
line1 := hundred [{ neg, text := nnn || " bottle" || if(nnn=1,"","s") || " of beer on the wall, " ||
nnn || " bottle" || if(nnn=1,"","s") || " of beer."}]
line2 := hundred [{ neg:=neg+0.5, text := "Take one down and pass it around, " || (nnn-1) || " bottle" ||
if(nnn=2,"","s") || " of beer on the wall."}]
line3 := "No more bottles of beer on the wall, no more bottles of beer."
line4 := "Go to the store and buy some more, 99 bottles of beer on the wall."
(line1 union line2) [\$] [{text}] union {{text:=line3},{text:=line4}}```

The first two lines just generate the sequence. Later there will be a better way to do that. The 4 lines of text are dictated by the song lyrics. Most of the program is fiddly details to precisely match the song lyrics as published, and get the lines in the right order. The end result is a relation containing the song lyrics in the correct order, which is then simply printed.

I found this quite a fascinating exercise. I had to implement IF(,,) but I think the only other control structure needed is user-defined recursive functions. The result is a programming paradigm quite unlike any that I’m familiar with, and surprisingly powerful. And weird.