Ulf Wendel

MaxDB series: …just SQL

Dear MySQL users, MaxDB users and friends,

The previous “please wait” posting was caused by a new guideline on series postings. The new guideline says that postings must be written one week before they get published. The week between the writing and the publication is used for corrections. I’m writing these lines on April, 4th but the article won’t be published before Wednesday April, 12th. We hope that in the time between writing and publishing we can catch most of the Gotchas to further improve the quality of the series. However, this series will continue to be more like a collection of blog postings than a book, a magazine article or any other “official documentation”. We try to do our best, but unfortunately we cannot apply the same amount of quality assurance and proof-reading to the MaxDB series that we do for magazine articles, official documentation or official class materials.

In this issue

In the past, we recognized that most MaxDB users do have some SQL knowledge. Hardly any of the typical MaxDB users seem the be beginners in the field of databases. Therefore SQL basics are not of interest. For example, most users who asked us SQL questions in the past did know what kind of joins exist, what the difference between a right join, left join and a full join are and how to define tables. Also, there seems to be no demand on describing the basics of views as many articles did when MySQL 5.0 introduced views to the MySQL server.

MaxDB users seem to be more interested in the limits and additional features over the SQL92 Entry Level compatible SQL dialect implemented by MaxDB. For this reason, we will often provide you only with brief overview information on some topics. If you need to details, check the SQL Tutorial chapter in the MaxDB documentation. Other online resources like the SQL Course and the SQL Course 2 may also be worth reading if you want to learn the very basics. Those who like printed books better and prefer reading on a sunny balcony, can check the book listing on the MySQL Developer Zone.


Data types for table columns

Let us start the discussion of the MaxDB SQL dialect with something very basic: data types. In general you should always select the smallest data type for a column that can hold all values you want to store in the column. Try to keep your data as compact as possbile to save costy disk-read operations and to hold the most records in as short a buffer cache as possible.

For comparisons with MySQL data types, check what mappings have been choosen for the MaxDB plugin of the MySQL Migration Toolkit. The mappings are desribed in two german language blog postings. Although the texts are written in german, no german language knowledge is required to understand the tables given in the articles on mapping strings, BLOB/LONG, numeric types and temporal types.

Datatype Value range Storage Notes
Numeric types
SMALLINT -32768 to 32767, 16-bit 5 bytes Equal to FIXED(5, 0)
INT[EGER] -2147483648 to 2147483647, 32-bit 7 bytes Equal to FIXED(7, 0)
FLOAT(p), precision p (0 < p <= 38) 9. 9999999999999999999999999999999999999E +62 to 1E-64, 0.0, +1E-64 – +9. 9999999999999999999999999999999999999E +62 (p+1) DIV 2 + 2 Floating point numbers.

For MySQL users: try SUM(float_column / 3) * 3 or summing up small values. MaxDB FLOAT is MySQL Precision Math.

FIXED (p,s), 0 <p <= 38, s <= p Depends on p and s, see FLOAT/SMALLINT/INT (p+1) DIV 2 + 2 Fixed point number.

MaxDB is using a packed decimal storage format for INTEGER, FLOAT and FIXED values. This means that if you store the same values in three columns of the three data types (with comparable precisions), you will get the same performance when doing comparisons.

For MySQL users: FIXED is for Precision Math.

String types
CHAR[ACTER](n), n < = 8000 (UNICODE: 4000)   see Manual UNICODE requires 2 bytes per character

Code attributes: ASCII, BINARY, UNICODE

Depending on n (<30, >=30) strings will be stored with a variable length

VARCHAR(n), n <= 8000 (UNICODE: 4000)   see Manual UNICODE requires 2 bytes per character

Code attributes: ASCII, BINARY, UNICODE

LONG [VARCHAR]   up to 2GB of bytes Also known as BLOB, LOB, etc.

Code attributes: ASCII, BINARY, UNICODE

Temporal types
TIME ISO: ’00:00:00′ to ’23:59:59′ 9 bytes Check the manual for database session settings and the database parameter DATE_TIME_FORMAT
DATE ISO: ’0001­-01­-01′ to ’9999­-12-­31′ 9 bytes Check the manual for database session settings and the database parameter DATE_TIME_FORMAT
TIMESTAMP ISO: ’0001-01-­01 00:00:00.000000′ to ’9999-12-31 23:59:59.999999′ 21 bytes Check the manual for database session settings and the database parameter DATE_TIME_FORMAT

Supports Microseconds

Other types
BOOLEAN TRUE | FALSE | NULL 2 bytes

Check the manual for several aliases for data types, for example DECIMAL, NUMERIC, REAL and many more.



Check Constraints

As you might have noticed there is no UNSIGNED INTEGER data type in MaxDB like in MySQL. It is not needed, because MaxDB supports check constraints. The check contraint syntax is accepted by MySQL and does not lead to any errors, but MySQL leaves it to the client to check the validity of values.

CREATE TABLE test_check_constraints (
  c_int_positive INTEGER CHECK c_int_positive >= 0,
  c_int_negative INTEGER CHECK c_int_negative  < 0,
  c_enum  CHAR(6)  CHECK c_enum IN ('male', 'female')
)
//
INSERT INTO test_check_constraints(c_int_positive) VALUES (-1)


---- Error -------------------------------
Auto Commit: On, SQL Mode: Internal, Isolation Level: Committed
 General error;300 POS(1) Integrity violation:C_INT_POSITIVE,TEST_CHECK_CONSTRAINTS,DBADMIN
INSERT INTO test_check_constraints(c_int_positive) VALUES (-1)


Domains

In cases where you are using the same check constraints for multiple tables, consider defining a domain. A domain is a value range definition which consists of a data type definition, an optional default specification and an optional constraint definition. Using domains instead of individual definitions in each table helps you to ensure that certain columns always accept the same value ranges.

CREATE DOMAIN birthday_domain DATE DEFAULT DATE
 CONSTRAINT birthday_domain> '1880-01-01' AND birthday_domain < = DATE
//
CREATE TABLE people (
  first_name VARCHAR(64),
  last_name VARCHAR(64),
  birthday birthday_domain
)
//
INSERT INTO people(first_name, last_name, birthday) VALUES ('Albert', 'Einstein', '1879-03-14')


---- Error -------------------------------
Auto Commit: On, SQL Mode: Internal, Isolation Level: Committed
 General error;300 POS(1) Integrity violation:BIRTHDAY_DOMAIN,PEOPLE,DBADMIN
INSERT INTO people(first_name, last_name, birthday) VALUES ('Albert', 'Einstein', '1879-03-14')


Synonyms

MaxDB enables you to define an alternative name (a synonym) for a table. The synonym can be made visible to other users using the keyword PUBLIC. There is not counterpart to synonyms in MySQL. However, you can emulate the feature on MySQL using a view.

 CREATE [PUBLIC] SYNONYM [<schema_name>]<synonym_name> FOR <table_name>


For example you could define an alternative name for the table people created in the last SQL example.

CREATE SYNONYM friends FOR people
//
SELECT * FROM friends


Indexes

It seems less know that MaxDB features function-based indexes. Only user-defined functions can be used with a function-based index, but this is no severe restriction. It’s simple to use a user-defined function as a wrapper for a build-in function as shown in the following example. The example gives a solution to a common pitfall for users with a MySQL background. MaxDB does always perform case sensitive string comparisons whereas old MySQL versions did case insensitive comparisons. As of version 4.1 of the MySQL server you can use a collation to control the comparison rules. However, MaxDB has nothing comparable: ‘ALbert’ = ‘Albert’ and ‘ALbert’ LIKE ‘Albert’ both evaluate to false.


++++ Execute +++++++++++++++++++++++++++++++
Auto Commit: On, SQL Mode: Internal, Isolation Level: Committed
SELECT 'Condition met, strings are the same' FROM DUAL WHERE ('ALbert' = 'Albert') OR ('ALbert' LIKE 'Albert')
Statement successfully executed. No Result
Execution Time:  17:00:43.071 - 17:00:43.080  (00.009 sec)

For a case insensitive search, you have to convert both operands to upper or lower cases, for example: SELECT * FROM PEOPLE WHERE UPPER(first_name) = UPPER(‘ALbert’). It would not be a big deal to use such a query if the query would still profit from an index on the
column first_name. But this is not the case. A normal index on the column first_name cannot be used efficiently, because it stores the original value of the first name – ‘Albert’ – and not a upper case version of value – ‘ALBERT’. All values stored in the index must be converted to upper case before the comparison against UPPER(‘ALbert’) can be made. No extra conversion of index values would be needed if the results of the function UPPER(first_name) would have been stored in the index. This is what a function-based index does.


CREATE FUNCTION my_upper(string_value VARCHAR(64)) RETURNS VARCHAR(64) DETERMINISTIC AS
  RETURN UPPER(string_value);
//
CREATE INDEX idx_my_upper_first_name ON people(my_upper(first_name))
//
SELECT COUNT(*) FROM people WHERE my_upper(first_name) = UPPER('ALbert')

Note: A bug in the current version of MaxDB 7.6 leads to a wrong cost value calculation for function-based indexes. The optimizer will not use any function-based indexes. The problem will be fixed in 7.6.00.26. You can expect to see a new MaxDB version during the next few weeks that contains the fix.


Sequences and SERIAL

Sequences are number generators. For example, number generators can be used to create primary key values or to record the insert order of the rows of a table by adding an extra column to the table that gets filled with sequential numbers from a sequence.

<create_sequence_statement> ::= CREATE SEQUENCE [<schema_name>.]<sequence_name>

[INCREMENT BY <integer>] [START WITH <integer>]
[MAXVALUE <integer> | NOMAXVALUE] [MINVALUE <integer> | NOMINVALUE]
[CYCLE | NOCYCLE]
[CACHE <unsigned_integer> | NOCACHE]
[ORDER | NOORDER]

Let us assume you have a table of registered users in the database and you want to keep track of the order in which users have registered themselves. The username serves as a natural primary key for the table. The column registration_timestamp contains a timestamp with the date of the registration. Although the timestamp includes microseconds it still could theoretically happen that two users register at exactly the same point in time. Therefore another column has been added which gets filled with sequential numbers from the sequence registered_users_seq. When doing the insert, <schema_name>.<sequence_name>.NEXTVAL gets used to retrieve the next number from the sequence. The current number – the last number returned for a <schema_name>.<sequence_name>.NEXTVAL call – can be retrieved using <schema_name>.<sequence_name>.CURRVAL. Note that you must specify the schema when calling the number generator.

CREATE SEQUENCE registered_users_seq
  START WITH 1
    INCREMENT BY 1
//
CREATE TABLE registered_users (
  username CHAR(32) NOT NULL,
  registration_timestamp TIMESTAMP NOT NULL,
  registration_order INT NOT NULL,
  PRIMARY KEY(username)
)
//
INSERT INTO registered_users(username, registration_timestamp, registration_order)
  VALUES ('nixnutz', TIMESTAMP, dbadmin.registered_users_seq.NEXTVAL)
//
SELECT dbadmin.registered_users_seq.CURRVAL FROM DUAL
//
SELECT * FROM registered_users

You cannot use <schema_name>.<sequence_name>.NEXTVAL for the default specifition of a table column. But MaxDB has something that is roughly comparable to AUTO_INCREMENT in MySQL: SERIAL. A column that has a DEFAULT SERIAL definition will be automatically assigned a sequential number if no other value is given for the column. Note that you can have only one DEFAULT SERIAL column definition per table and that you cannot update the values assigned by the database. This makes SERIAL especially useful for columns tracking the insert order of records as you cannot fake any other orders afterwards by updating the columns with DEFAULT SERIAL values.

DROP TABLE registered_users
//
CREATE TABLE registered_users (
  username CHAR(32) NOT NULL,
  registration_timestamp TIMESTAMP NOT NULL DEFAULT TIMESTAMP,
  registration_order INT NOT NULL DEFAULT SERIAL(10),
  PRIMARY KEY(username)
)
//
INSERT INTO registered_users(username) VALUES ('nixnutz')
//
SELECT* FROM registered_users

There is more to say about sequences and SERIAL. We strongly recommend that you check the manual for the details before you start using any of them intensively.


Limiting result sets

Result sets can be limited to the first n results by adding a ROWNO predicate to the WHERE clause of a SELECT statement. ROWNO is in some
ways similar to LIMIT in MySQL, but it is not as powerful as LIMIT. Using ROWNO you can only limit a result set to the first n records. You cannot use ROWNO to get a list of the 1th to the 20th records of the result set. In case you need this, use cursors instead. On the other hand you can add the ROWNO to the list of columns of the result set which has no counterpart in MySQL.

SELECT ROWNO, * FROM registered_users WHERE ROWNO < = 3

Plans exist to add full LIMIT (TOP) support to future versions of MaxDB.


Cursors and Recursion

A cursor is a named result table. Functions exists to drop the named result table and to access its records in sequential and random order. The functions to access cursors are provided by the programming interfaces, for example JDBC, DBI and ODBC. But cursors can also be accessed using SQL commands. It is a bit difficult to demonstrate the use of the SQL commands. The SQL Studio uses ODBC to communicate with MaxDB. ODBC does not support the SQL commands. Therefore, we have wrapped a simple database procedure around the SQL commands for
working with cursors. This way, the commands are run “inside” the server. Alternatively we could have used a non-ODBC client (e.g. sqlcli), but we assume that most readers are using some sort of ODBC-based GUI to access MaxDB.

When you are new to database procedures, pay special attention to all loops inside a procedure. Ensure that the loops will not run indefinetly but will terminate. Hint: $RC is set not only by FETCH, but also by INSERT!

In the example a loop gets used to fetch the 1st, 3rd, 5th and so on user name from the table registered_users to insert it into another table. Instead of calling FETCH NEXT twice to skip one record, you could calculate an offset value and use FETCH POS(offset) instead. Note that inside a database procedure you do not need to open a cursor explicitly. Please check the manual on further important syntax details.

CREATE DBPROC BASIC_CURSOR_EXAMPLE AS
  VAR username CHAR(64);

  DECLARE users_cursor CURSOR FOR
    SELECT username FROM DBADMIN.registered_users ORDER BY registration_order;

  FETCH NEXT users_cursor INTO :username;
  WHILE $RC = 0 DO BEGIN
    INSERT INTO DBADMIN.selected_users(username) VALUES (:username);
    FETCH NEXT users_cursor INTO :username;
    FETCH NEXT users_cursor INTO :username;
  END;

  CLOSE users_cursor;

The hottest aspects of MaxDB cursors is that they can be used for recursive queries. Recursion is particulary helpful to traverse tree structures. A classical example of a tree structure is the reporting structure of a company: Ulf reports to Patrik, Patrik reports to Kaj and Kaj reports to Marten. It is difficult to retrieve a list of all employees reporting to Kaj using classical SQL and you have to use a “trick” like “Nested Sets”. With MaxDB, however, you can create a recursive cursor that builds an initial result set and adds new records to it until it does not find any further hits, then returning the entire result set to you.

DECLARE <result_table_name> CURSOR FOR WITH RECURSIVE <reference_name> (<alias_name>,...) AS
(<initial_select> UNION ALL <recursive_select>) <final_select>

The syntax of a recursive cursor reflects the three steps explained below. A recursive cursor consists of an initial select, a recursive select and a final select. When the cursor gets executed and the names result table gets built, the initial SELECT is executed first. We will show step by step how it works, and explain the complete example below.

  DECLARE report_cur CURSOR FOR WITH RECURSIVE
    emp(emp_name) AS (SELECT employee FROM DBADMIN.employees WHERE boss = :boss_name
    UNION ALL SELECT employee FROM DBADMIN.employees INNER JOIN emp ON employees.boss = emp.emp_name)
    SELECT emp_name FROM emp ORDER BY  emp_name


The example shows how to find all people reporting to Kaj. In the example, the initial select is used to retrieve all employees that report directly to Marten.

Original data set (table employees) Result set of the initial select (reference name of the intermediate result set used during recursion)
Initial select: SELECT employee FROM DBADMIN.employees WHERE boss = ‘Kaj’
boss employee
NULL Marten
Marten Kaj
Kaj Patrik
Patrik Ulf
emp_name
Kaj

The initial select identifies Kaj as the person reporting directly to Marten and adds Kaj to an intermediate result set. The intermediate resultset has been given the reference name emp. This reference name can be used by the recursive query which is executed next. The recursive query joins intermediate resultset with the original data set and looks for all people reporting to Kaj. It finds Patrik. The UNION ALL part of the recursive cursor statement adds Patrik to the intermediate result. As there have been results, the execution of the recursive query is repeated. This time, the query looks for all people reporting to Patrik and adds Ulf. As there have been results, the execution of the recursive query is repeated, but there is nobody reporting to Ulf. No more records are added to the intermediate result set and MaxDB continues with the final step, the final select.

Intermediate result set (reference name: emp) Intermediate result set (reference name: emp)
1st recursion: SELECT employee FROM DBADMIN.employees INNER JOIN emp ON employees.boss = emp.emp_name
boss employee
NULL Marten
Marten Kaj
Kaj Patrik
Patrik Ulf
emp_name
Kaj
Patrik
2nd recursion: SELECT employee FROM DBADMIN.employees INNER JOIN emp ON employees.boss = emp.emp_name
boss employee
NULL Marten
Marten Kaj
Kaj Patrik
Patrik Ulf
emp_name
Kaj
Patrik
Ulf

The final select returns all records from the intermediate result set ordered by the column emp.

emp_name
Kaj
Ulf
Patrik

As mentioned above, the cursor examples are wrapped in a database procedure to allow you to run the commands even on ODBC-based GUI tools like SQL Studio.

CREATE TABLE employees (
  boss VARCHAR(64),
  employee VARCHAR(64)
)
INSERT INTO employees(boss, employee) VALUES (NULL, 'Marten')
//
INSERT INTO employees(boss, employee) VALUES ('Marten', 'Kaj')
//
INSERT INTO employees(boss, employee) VALUES ('Kaj', 'Patrik')
//
INSERT INTO employees(boss, employee) VALUES ('Patrik', 'Ulf')
//
CREATE TABLE reports_to(emp VARCHAR(64))
//
CREATE DBPROC report_structure(IN boss_name VARCHAR(64))  AS
  VAR emp_name VARCHAR(64);

  DECLARE report_cur CURSOR FOR WITH RECURSIVE
    emp(emp_name) AS (SELECT employee FROM DBADMIN.employees WHERE boss = :boss_name
    UNION ALL SELECT employee FROM DBADMIN.employees INNER JOIN emp ON employees.boss = emp.emp_name)
    SELECT emp_name FROM emp ORDER BY  emp_name;

  FETCH NEXT report_cur INTO :emp_name;
  WHILE $RC = 0 DO BEGIN
    INSERT INTO DBADMIN.reports_to(emp) values (:emp_name);
    FETCH NEXT report_cur INTO :emp_name;
  END;

  CLOSE report_cur;
//
DELETE FROM reports_to
//
CALL report_structure('Marten')
//
SELECT * FROM reports_to


In the next issue

This issue has focused on several interesting details of the MaxDB SQL dialect. It might have looked like a “random” selection and in some ways it has been such. The next issue will also be on the SQL dialect, but it has only one topic: transactions. We will repeat the basic properties of transactions, explain about isolation levels, locking, subtransactions and say some words about deadlock detection.

Meanwhile you could check the manual for system triggers. System triggers are called after the start of a database instance… Check it out!

Ulf Wendel for the MySQL MaxDB team

One Comment

  1. Regarding recursive select statements that carries references to “itself” (foreign keys pointing into the table they resides) check out:
    http://www.frostinnovation.com/Blog.aspx