Skip to content

SQL Support

Enrico Olivelli edited this page Jun 14, 2016 · 9 revisions

HerdDB is born due to the need to port existing JDBC application to a distributed, shared nothing, embeddable database, and so even if it is a key-value database at the low level the most commonly used API is the JDBC/SQL interface.

This is a brief description of SQL/JDBC supported functions If you need to support any of the SQL costructs just ask to the dev team on the mailing-list, file a GitHub issue or a JIRA issue

  • CREATE/DROP TABLE
  • BEGIN/COMMIT/ROLLBACK TRANSACTION
  • ALTER TABLE ADD COLUMN
  • ALTER TABLE DROP COLUMN
  • INSERT INTO TABLE (....) VALUES(....)
  • SELECT .... FROM TABLE WHERE PK=? (lookup using Primary Key is the best data access pattern!!)
  • UPDATE TABLE SET ... WHERE PK=? (lookup using Primary Key is the best data access pattern!!)
  • DELETE FROM TABLE ... WHERE PK=? (lookup using Primary Key is the best data access pattern!!)
  • UPDATE TABLE SET ... WHERE ......
  • SELECT .... FROM TABLE WHERE ......
  • DELETE FROM TABLE WHERE ....
  • JDBC PreparedStatements (which query plan cache, leveraging JDBC Parameters)
  • SUBQUERIES which deal on other tables on the same tablespace
  • SUM/COUNT/MIN/MAX aggregated functions
  • UPPER/LOWER scalar functions
  • AUTO_INCREMENT for integer/long columns (only on single columns PKs)
  • CURRENT_TIMESTAMP MySQL virtual column

Supported types:

  • strings (backed by java UTF encoding)
  • 32bit signed integers
  • 64bit signed integers
  • java.sql.Timestamp (datetimes with millisecond precision)
  • blobs (byte arrays)

The supported list of datatype names is just the mapping from the various names to the list of the previous list raw datattypes, for instance VARCHAR2 maps to 'string'

Other notes:

  • identifiers are always non case-insensitive
  • JDBC ResultSet are never updatable and forwards-only
  • the same Connection can run multiple statements concurrently and keeps multiple ResultSets/cursors open
  • Transaction isolation is ALWAYS "COMMITTED_READ"
  • PrimaryKey access is the best data access path
  • When a DML statement spans multiple columns the selection is performed in transaction and then the mutation on each row (READ_COMMITTED semantics, not REPEATABLE_READ or SERIALIZABLE)
  • Every column is NULLABLE, except from the columns in the PK
  • There is no constraint on string lengh (syntax is supported only for supporting existing scripts)
Clone this wiki locally