CREATE TABLE (TSQL Statement)

creates a new table.

Syntax

CREATE TABLE <table_qname>
( {     <column_definition>
    | [ <table_constraint> ]
    | [ <table_index> ]  } [ ,...n ]
)
[ ON <file_group> ]

<table_qname> ::= [ database_name. [ schema_name ] . | schema_name. ] table_name

<column_definition> ::=
    column_name datatype [ COLLATE collation_name ] [ NULL | NOT NULL ]
    [ IDENTITY [ ( seed , increment ) ]
    [ [ CONSTRAINT constraint_name ] { { PRIMARY KEY | UNIQUE } [ CLUSTERED | NONCLUSTERED ] } ]
    [ INDEX index_name [ CLUSTERED | NONCLUSTERED ] ]

<table_constraint> ::=
    [ CONSTRAINT constraint_name ]
    { PRIMARY KEY | UNIQUE }
    [ CLUSTERED | NONCLUSTERED ]
    ( column [ ASC ] [ ,...n ] )

<table_index> ::=
    INDEX index_name
    [ CLUSTERED | NONCLUSTERED ]
    (column [ ASC ] [ ,...n ] )

Arguments

<table_qname>
is the qualified or unqualified name of the table to create.
column_name
name of a column.
datatype
is a valid datatype.
  • The default precision for varbinary, varchar and char is 1 if not specified.
  • The default precision for numeric is 18 if not specified.
COLLATE collation_name
is a valid collation name. E.g. latin1_general_ci_as, en_ci_as, fr_ci_as.
  • can be specified only for columns of type char and varchar.
  • the COLLATE clause must be written just after the datatype.
    E.g. last_name VARCHAR(30) COLLATE fr_cs_as NOT NULL.
  • if the COLLATE clause is not specified, the collation defaults to the server default collation.
NULL | NOT NULL
specifies whether the column allows NULL values.
  • If not specified, the default is NULL, unless the option ANSI_NULL_DFLT_ON is set to OFF during the session.
IDENTITY
specifies that the column is an identity column. When a new record is inserted into the table, a new incremental value is automatically generated for this column.
  • IDENTITY is often used with the PRIMARY KEY constraint.
  • the IDENTITY attribute applies only to tinyint, smallint, int, bigint.
  • only one IDENTITY column can exist per table.
seed
the start value for IDENTITY column.
  • if seed is specified, increment must be specified too.
  • if seed and increment are omitted, the default is IDENTITIY(1,1).
increment
the increment for the values generated for the IDENTITY column.
CONSTRAINT constraint_name
optional name for the constraint. If omitted, RSQL will generate a name automatically.
PRIMARY KEY
creates a primary key constraint.
  • A table can have only one primary key.
  • All columns in the primary key must be defined as NOT NULL.
  • By default, a PRIMARY KEY constraint is CLUSTERED if no other clustered index or constraint is defined.
UNIQUE
creates a unique constraint.
  • A table can have many unique constraints.
CLUSTERED
forces the index to be clustered.
  • A table can have only one clustered index.
NONCLUSTERED
forces the index to be nonclustered.
  • A table can have many nonclustered indexes.
column [ ASC ] [ ,…n ]
list of columns (also called the index key) of the constraint or index.
  • the optional ASC specifier means ascending sort order and is the default. DESC is not implemented.
INDEX index_name
is the name of the index. index_name cannot be omitted.
ON <file_group>
this clause often exists in scripts generated by MS SQL Server, but it is just ignored by RSQL.

Remarks

Only sa or dbo is allowed to execute this statement.

The command SHOW T displays the list of all tables.

The command SHOW T table_name displays the structure of the table as a SQL script.

Rowid Column

Each table has a rowid column as first column, of type bigint.

This column contains a monotonically increasing value, generated automatically when records are inserted. It is not possible to modify or reset this counter.

This column is usually hidden, but can be displayed if explicitly listed, as in SELECT rowid, * from clients.

The rowid column is for internal use only. If you need a column filled with a generated value from a counter, define a column with the IDENTITY specifier.

Storage of Records

The records of a table are stored in a file, physically sorted by the clustered key (also called the native key) order.

If the table has nonclustered indexes, the subrecords of each nonclustered index are stored in their own file, physically sorted by the index key order.

For example, if a table has 3 indexes, the records will be stored in 4 files (1 file for the base table, and 3 files for the indexes).

To see the files used by the table and its indexes, use the command SHOW ID T table_name.

Clustered Index

A table has always exactly one clustered index (aka base table). If you don’t use the CLUSTERED specifier in your table definition, RSQL will define a clustered index for you.

Except in rare cases for which you know what you do, you should always define a PRIMARY KEY in your table.

In most cases, the primary key should be defined as the clustered index.

All the columns of the table will be stored in the clustered index file, sorted physically by the order of this clustered index.

Nonclustered Indexes

Physically, nonclustered indexes are implemented as real tables:

When statements like INSERT, UPDATE, etc insert, delete or modify records in a table, the clustered index file is modified and each nonclustered index file is modified too.

So, if a table has three indexes, it means that inserting one record in the table will in fact:

Each nonclustered index is a partial copy of the base table, with records being physically sorted by a different order.
It is not a full copy because only the columns making up the index key and the “native key” of the base table are copied.

To avoid performance problem, avoid defining too many indexes.

Constraints and Indexes

Constraints are a logical concept, which enforces some conditions on data. In particular, PRIMARY KEY and UNIQUE constraints enforce uniqueness.

But PRIMARY KEY and UNIQUE constraints are physically implemented by automatically creating a physical index for them. So, whenever these constraints are defined, corresponding indexes are created.

The index types are:

The indexes are used by the query optimizer to speed up data retrieval.

Examples

CREATE TABLE employees (
    id           BIGINT      NOT NULL CONSTRAINT pk PRIMARY KEY CLUSTERED,
    name         VARCHAR(30) NOT NULL,
    first_name   VARCHAR(30) NOT NULL,
    birth_date   DATE        NULL,
    badge_number INT         NULL,

    INDEX idx_name(name, first_name),          -- nonclustered index
    CONSTRAINT uq_badge UNIQUE(badge_number),  -- nonclustered index
)
GO

INSERT INTO employees VALUES (1040, 'PAVLOV',    'Igor',  '19660705', 54700),
                             (1003, 'SMITH',     'John',  '19700605', 53040),
                             (1032, 'BOULANGER', 'Jean',  '19901204', 50001),
                             (1035, 'MULLER',    'Franz', '19550823', 51340),
                             (1020, 'SANCHEZ',   'Pedro', '19401003', 50033);

SELECT * FROM employees;

The result is:

id                  |name           |first_name     |birth_date|badge_numbe|
--------------------+---------------+---------------+----------+-----------+
                1003|SMITH          |John           |1970-06-05|      53040|
                1020|SANCHEZ        |Pedro          |1940-10-03|      50033|
                1032|BOULANGER      |Jean           |1990-12-04|      50001|
                1035|MULLER         |Franz          |1955-08-23|      51340|
                1040|PAVLOV         |Igor           |1966-07-05|      54700|

(5 row(s) affected)

The records are stored in the following files, as shown by SHOW ID T employees:

TABLE                                               FILEPATH
--------------------------------------------------  --------------------
mydb.dbo.employees                                  data/d108/s0/t7000016
       idx_name                                            index/d108/s0/i7000017
       uq_badge                                            index/d108/s0/i7000018

Physical File Content:

For those who are interested, the content of the files mentioned above are shown below:

******* data/d108/s0/t7000016 (employees base table (aka clustered index)) *******

The records are physically sorted by the clustered index (here, the [id] column).
All columns are stored in this file.

+--------------------+====================+----------+----------+----------+-----------+
|               rowid|                  id|name      |first_name|birth_date|badge_numbe|
+--------------------+====================+----------+----------+----------+-----------+
|                   1|                1003|SMITH     |John      |1970-06-05|      53040|
|                   4|                1020|SANCHEZ   |Pedro     |1940-10-03|      50033|
|                   2|                1032|BOULANGER |Jean      |1990-12-04|      50001|
|                   3|                1035|MULLER    |Franz     |1955-08-23|      51340|
|                   0|                1040|PAVLOV    |Igor      |1966-07-05|      54700|
+--------------------+====================+----------+----------+----------+-----------+



******* index/d108/s0/i7000017 (idx_name index) *******

The records are physically sorted by the key order ([name], [first_name], [rowid]).
For INDEX clause, [rowid] is automatically appended to the list of key columns, but the user doesn't see it.
The [rowid] column renders the index internally unique.
[id] is the base table clustered key.

+==========+==========+====================+--------------------+
|name      |first_name|               rowid|                  id|
+==========+==========+====================+--------------------+
|BOULANGER |Jean      |                   2|                1032|
|MULLER    |Franz     |                   3|                1035|
|PAVLOV    |Igor      |                   0|                1040|
|SANCHEZ   |Pedro     |                   4|                1020|
|SMITH     |John      |                   1|                1003|
+----------+----------+====================+--------------------+



******* index/d108/s0/i7000018 (uq_badge unique index) *******

The records are physically sorted by the key order ([badge_number]).
[id] is the base table clustered key.

+===========+--------------------+
|badge_numbe|                  id|
+===========+--------------------+
|      50001|                1032|
|      50033|                1020|
|      51340|                1035|
|      53040|                1003|
|      54700|                1040|
+===========+--------------------+

You clearly see that indexes store a subset of the base table columns, physically ordered by the index key.

Each time a record is inserted, updated or deleted, the modifications must be done in the base table file (aka clustered index) as well as in the index files (aka nonclustered indexes).