Collations
See also
COLLATE
clause.
Collations define a set of rules that determine how varchar
and char
data are sorted and compared.
Collation depends on the language.
- For example, in English collation, these words are sorted in this order: Norway, Oregon, Østerbro, Portugal, Zürich.
- In Danish collation, they are sorted like this: Norway, Oregon, Portugal, Zürich, Østerbro. Because in Danish, Ø is a letter that appears after the Z.
When records are inserted into a table, they are physically written in the file at the proper position, ordered by the columns of the clustered index.
The collations specified on the varchar/char
columns of the clustered index by the CREATE TABLE
statement define the physical sort order for these columns.
Collations also affect the result of all operations that compare strings, if the arguments are varchar/char
.
This is the case for comparison operators such as =
, >
, LIKE
, etc, and functions like CHARINDEX
or REPLACE
.
Syntax
<collation> ::= {collation_name | alias}{_CI_AI|_CI_AS|_CS_AS}
Arguments
- collation_name
- is a collation name listed in the table herebelow. E.g.
en
,fr
, etc. - alias
- is an alias for a collation name, listed in the table herebelow. E.g.
english
,latin1_general
,french
, etc. - _CI_AI
- case-insensitive and accent-insensitive.
a
andà
andA
are equal.
Use this mode if you want to ignore accents. - _CI_AS
- case-insensitive and accent-sensitive.
a
andA
are equal, but not equal toà
.
This is the most common mode, and you should use it. - _CS_AS
- case-sensitive and accent-sensitive.
a
andà
andA
are not equal.
You should avoid this mode, as case-sensitive sort order is hard to grasp.
Examples of Collations
en_ci_as english dictionary sort order, case-insensitive and accent-sensitive.
en_ci_ai english dictionary sort order, case-insensitive and accent-insensitive.
fr_ci_as french dictionary sort order, case-insensitive and accent-sensitive.
french_ci_as same as fr_ci_as
Remarks
Collations can be specified in upper case, lower case or a mix. E.g. french_ci_as
is the same as French_CI_AS
.
It is better to use the suffixes _CI_AS
or _CI_AI
when specifying a collation, because case-sensitive mode _CS_AS
may be surprising.
Default Server Collation
At server installation, you must specify a default collation which will be used by CREATE TABLE
for column definitions and by string operations, unless overriden by a specific COLLATE
clause.
If all your users speak the same language, you should specify the collation for this language.
E.g. if all your users speak french, they want to see the list of client names sorted by french alphabetical order.
So, in this case, the best collations is fr_ci_as
or fr_ci_ai
.
If users speak different languages, you should choose a more neutral collation, like en_ci_ai
or latin1_general_ci_ai
for western languages, or cyrillic_general_ci_ai
for slavic languages.
If you can, choose the same default server collation for all your RSQL installations. This way, applications will sort the records and compare strings in a uniform and consistent manner for all users and applications.
Warning
YOU CANNOT CHANGE THE COLLATION OF A COLUMN AFTER THE TABLE HAS BEEN CREATED.
YOU CANNOT CHANGE THE SERVER DEFAULT COLLATION AFTER INSTALLATION.
To change the collation of a table column, you must:
- export the table to a text file
- drop and recreate the table with proper column collations
- import the text file back into the new table
To change the server default collation, you must:
- export the script of creation for all databases (logins, users, tables, etc)
- export all tables to text files
- delete the content of your RSQL instance directory
- create a new instance with the proper server default collation
- run the script of creation of the databases to recreate all objects
- import all text files back into the tables
Available Collation List
The list of supported collations is given by the command SHOW COLLATIONS
.
If a collation for your language is not found in the list, it means that it is the same as latin1_general
, and you should specify the latter.
It is the same as specifying en
.
Description Collation Aliases
----------- --------- -------
Afar aa afar
Afrikaans af afrikaans
Albanian sq albanian
Arabic ar arabic
Armenian hy armenian
Assamese as assamese
Azerbaijani az azeri
Belarusian be belarusian
Bengali bn bengali
Bosnian bs bosnian_latin
Bosnian (Cyrillic) bs_cyrl bosnian_cyrillic
Bulgarian bg bulgarian
Burmese my burmese
Canadian French fr_ca french_ca
Catalan ca catalan
Chinese zh chinese_prc
Croatian hr croatian
Czech cs czech
Danish da danish
Dari fa_af dari
Dzongkha dz dzongkha
English en english, latin1_general, latin1_general_100
Esperanto eo esperanto
Estonian et estonian
Ewe ee ewe
Faroese fo faroese
Filipino fil filipino
Finnish fi finnish
French fr french
German de german
Greek el greek
Gujarati gu gujarati
Hausa ha hausa
Hawaiian haw hawaiian
Hebrew he hebrew
Hindi hi hindi
Hungarian hu hungarian
Icelandic is icelandic
Igbo ig igbo
Japanese ja japanese
Kalaallisut kl kalaallisut
Kannada kn kannada
Kazakh kk kazakh
Khmer km khmer
Konkani kok konkani
Korean ko korean
Latvian lv latvian
Lingala ln lingala
Lithuanian lt lithuanian
Macedonian mk macedonian
Malayalam ml malayalam
Maltese mt maltese
Marathi mr marathi
Northern Sami se northern_sami
Northern Sotho nso northern_sotho
Norwegian Bokmål nb norwegian_bokmål
Norwegian Nynorsk nn norwegian_nynorsk
Oriya or oriya
Oromo om oromo
Pashto ps pashto
Persian fa persian
Polish pl polish
Punjabi pa punjabi
Romanian ro romanian
Russian ru cyrillic_general, cyrillic_general_100, russian
Saho ssy saho
Serbian sr serbian_cyrillic
Serbo-Croatian sr_latn serbian_latin
Sinhala si sinhala
Slovak sk slovak
Slovenian sl slovenian
Spanish es spanish
Swedish sv swedish
Tamil ta tamil
Telugu te telugu
Thai th thai
Tongan to tongan
Traditional Chinese zh_hant chinese_traditional
Tswana tn tswana
Turkish tr turkish
Ukrainian uk ukrainian
Urdu ur urdu
Vietnamese vi vietnamese
Walser wae walser
Welsh cy welsh
Yoruba yo yoruba