Data Types
PostgreSQL has a rich set of native data
types available to users.
Users may add new types to PostgreSQL using the
CREATE TYPE command.
The Data Types table below shows all general-purpose data types
included in the standard distribution. Most of the alternative names
listed in the
Aliases
column are the names used internally by
PostgreSQL for historical reasons. In
addition, some internally used or deprecated types are available,
but they are not listed here.
Data Types
 Type Name                               | Aliases            | Description
-----------------------------------------+--------------------+---------------------------------------------
 bigint                                  | int8               | signed eight-byte integer
 bigserial                               | serial8            | autoincrementing eight-byte integer
 bit                                     |                    | fixed-length bit string
 bit varying(n)                          | varbit(n)          | variable-length bit string
 boolean                                 | bool               | logical Boolean (true/false)
 box                                     |                    | rectangular box in 2D plane
 bytea                                   |                    | binary data
 character(n)                            | char(n)            | fixed-length character string
 character varying(n)                    | varchar(n)         | variable-length character string
 cidr                                    |                    | IP network address
 circle                                  |                    | circle in 2D plane
 date                                    |                    | calendar date (year, month, day)
 double precision                        | float8             | double precision floating-point number
 inet                                    |                    | IP host address
 integer                                 | int, int4          | signed four-byte integer
 interval [ (p) ]                        |                    | general-use time span
 line                                    |                    | infinite line in 2D plane (not implemented)
 lseg                                    |                    | line segment in 2D plane
 macaddr                                 |                    | MAC address
 money                                   |                    | currency amount
 numeric [ (p, s) ]                      | decimal [ (p, s) ] | exact numeric with selectable precision
 path                                    |                    | open and closed geometric path in 2D plane
 point                                   |                    | geometric point in 2D plane
 polygon                                 |                    | closed geometric path in 2D plane
 real                                    | float4             | single precision floating-point number
 smallint                                | int2               | signed two-byte integer
 serial                                  | serial4            | autoincrementing four-byte integer
 text                                    |                    | variable-length character string
 time [ (p) ] [ without time zone ]      |                    | time of day
 time [ (p) ] with time zone             | timetz             | time of day, including time zone
 timestamp [ (p) ] [ without time zone ] |                    | date and time
 timestamp [ (p) ] with time zone        | timestamptz        | date and time, including time zone
Compatibility
The following types (or spellings thereof) are specified by
SQL: bit, bit
varying, boolean, char,
character, character varying,
varchar, date, double
precision, integer, interval,
numeric, decimal, real,
smallint, time, timestamp
(both with and without time zone).
Each data type has an external representation determined by its input
and output functions. Many of the built-in types have
obvious external formats. However, several types are either unique
to PostgreSQL, such as open and closed
paths, or have several possibilities for formats, such as the date
and time types.
Most of the input and output functions corresponding to the
base types (e.g., integers and floating-point numbers) do some
error-checking.
Some of the input and output functions are not invertible. That is,
the result of an output function may lose precision when compared to
the original input.
Some of the operators and functions (e.g.,
addition and multiplication) do not perform run-time error-checking in the
interests of improving execution speed.
On some systems, for example, the numeric operators for some data types may
silently underflow or overflow.
Numeric Types
Numeric types consist of two-, four-, and eight-byte integers,
four- and eight-byte floating-point numbers, and fixed-precision
decimals. The Numeric Types table below lists the
available types.
Numeric Types
 Type name        | Storage size | Description                      | Range
------------------+--------------+----------------------------------+---------------------------------------------
 smallint         | 2 bytes      | small range fixed-precision      | -32768 to +32767
 integer          | 4 bytes      | usual choice for fixed-precision | -2147483648 to +2147483647
 bigint           | 8 bytes      | large range fixed-precision      | -9223372036854775808 to 9223372036854775807
 decimal          | variable     | user-specified precision, exact  | no limit
 numeric          | variable     | user-specified precision, exact  | no limit
 real             | 4 bytes      | variable-precision, inexact      | 6 decimal digits precision
 double precision | 8 bytes      | variable-precision, inexact      | 15 decimal digits precision
 serial           | 4 bytes      | autoincrementing integer         | 1 to 2147483647
 bigserial        | 8 bytes      | large autoincrementing integer   | 1 to 9223372036854775807
The syntax of constants for the numeric types is described in
. The numeric types have a
full set of corresponding arithmetic operators and
functions. Refer to for more
information. The following sections describe the types in detail.
The Integer Types
The types smallint, integer, and
bigint store whole numbers, that is, numbers without
fractional components, of various ranges. Attempts to store
values outside of the allowed range will result in an error.
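The range check described above can be sketched as follows. The table name int_demo and its columns are hypothetical, chosen only for this illustration, and the exact wording of the error message is not shown because it may vary between releases.

```sql
-- Hypothetical table used only for this illustration.
CREATE TABLE int_demo (s smallint, i integer, b bigint);

-- Each value sits exactly at the upper bound of its type, so this succeeds.
INSERT INTO int_demo VALUES (32767, 2147483647, 9223372036854775807);

-- 32768 is one past the smallint range; per the text above, the statement
-- is rejected with an error rather than stored as a wrapped value.
INSERT INTO int_demo (s) VALUES (32768);
```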
The type integer is the usual choice, as it offers
the best balance between range, storage size, and performance.
The smallint type is generally only used if disk
space is at a premium. The bigint type should only
be used if the integer range is not sufficient,
because integer is definitely faster.
The bigint type may not function correctly on all
platforms, since it relies on compiler support for eight-byte
integers. On a machine without such support, bigint
acts the same as integer (but still takes up eight
bytes of storage). However, we are not aware of any reasonable
platform where this is actually the case.
SQL only specifies the integer types
integer (or int) and
smallint. The type bigint, and the
type names int2, int4, and
int8 are extensions, which are shared with various
other SQL database systems.
If you have a column of type smallint or
bigint with an index, you may encounter problems
getting the system to use that index. For instance, a clause of
the form
... WHERE smallint_column = 42
will not use an index, because the system assigns type
integer to the constant 42, and
PostgreSQL currently
cannot use an index when two different data types are involved. A
workaround is to single-quote the constant, thus:
... WHERE smallint_column = '42'
This will cause the system to delay type resolution and will
assign the right type to the constant.
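A minimal sketch of the workaround follows. The table readings, its columns, and the index name are hypothetical. The explicit CAST variant is an assumption that goes beyond the text: it reaches the same goal by making both sides of the comparison smallint.

```sql
-- Hypothetical table and index, for illustration only.
CREATE TABLE readings (sensor_id smallint, reading integer);
CREATE INDEX readings_sensor_idx ON readings (sensor_id);

-- Workaround from the text: quote the constant so that type resolution
-- is delayed and the constant is resolved as smallint.
SELECT * FROM readings WHERE sensor_id = '42';

-- Alternative (an assumption, not stated in the text): cast the constant
-- explicitly so both operands have the same type.
SELECT * FROM readings WHERE sensor_id = CAST(42 AS smallint);

-- EXPLAIN reports the plan actually chosen. On a very small table the
-- planner may still prefer a sequential scan.
EXPLAIN SELECT * FROM readings WHERE sensor_id = '42';
```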
Arbitrary Precision Numbers
The type numeric can store numbers with up to 1,000
digits of precision and perform calculations exactly. It is
especially recommended for storing monetary amounts and other
quantities where exactness is required. However, the
numeric type is very slow compared to the
floating-point types described in the next section.
In what follows we use these terms: The
scale of a numeric is the
count of decimal digits in the fractional part, to the right of
the decimal point. The precision of a
numeric is the total count of significant digits in
the whole number, that is, the number of digits to both sides of
the decimal point. So the number 23.5141 has a precision of 6
and a scale of 4. Integers can be considered to have a scale of
zero.
Both the precision and the scale of the numeric type can be
configured. To declare a column of type numeric use
the syntax
NUMERIC(precision, scale)
The precision must be positive, the scale zero or positive.
Alternatively,
NUMERIC(precision)
selects a scale of 0. Specifying
NUMERIC
without any precision or scale creates a column in which numeric
values of any precision and scale can be stored, up to the
implementation limit on precision. A column of this kind will
not coerce input values to any particular scale, whereas
numeric columns with a declared scale will coerce
input values to that scale. (The SQL standard
requires a default scale of 0, i.e., coercion to integer
precision. We find this a bit useless. If you're concerned
about portability, always specify the precision and scale
explicitly.)
If the precision or scale of a value is greater than the declared
precision or scale of a column, the system will attempt to round
the value. If the value cannot be rounded so as to satisfy the
declared limits, an error is raised.
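The rounding rule can be illustrated with a column declared as numeric(5,2), that is, at most five significant digits with two of them after the decimal point. The table name price_demo is hypothetical, and the comments describe the behavior stated in the paragraph above.

```sql
-- Hypothetical table: precision 5, scale 2, so the largest value is 999.99.
CREATE TABLE price_demo (amount numeric(5,2));

-- The scale of the input (3) exceeds the declared scale (2),
-- so the value is rounded to 123.46.
INSERT INTO price_demo VALUES (123.456);

-- Rounding to two fractional digits would give 1000.00, which needs
-- six digits of precision; the value cannot be made to fit, so an
-- error is raised.
INSERT INTO price_demo VALUES (999.995);

-- Four digits before the decimal point cannot fit either: error.
INSERT INTO price_demo VALUES (1234.5);
```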
The types decimal and numeric are
equivalent. Both types are part of the SQL
standard.
Floating-Point Types
The data types real and double
precision are inexact, variable-precision numeric types.
In practice, these types are usually implementations of
IEEE Standard 754 for Binary Floating-Point
Arithmetic (single and double precision, respectively), to the
extent that the underlying processor, operating system, and
compiler support it.
Inexact means that some values cannot be converted exactly to the
internal format and are stored as approximations, so that storing
and printing back out a value may show slight discrepancies.
Managing these errors and how they propagate through calculations
is the subject of an entire branch of mathematics and computer
science and will not be discussed further here, except for the
following points:
If you require exact storage and calculations (such as for
monetary amounts), use the numeric type instead.
If you want to do complicated calculations with these types
for anything important, especially if you rely on certain
behavior in boundary cases (infinity, underflow), you should
evaluate the implementation carefully.
Comparing two floating-point values for equality may or may
not work as expected.
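The points above can be made concrete with a short comparison. This is a sketch that assumes the platform implements double precision as IEEE 754 binary floating point, which the text describes as the usual case.

```sql
-- 0.1, 0.2, and 0.3 have no exact binary representation, so on an
-- IEEE 754 platform the sum differs from 0.3 in the last bits and
-- the comparison yields false.
SELECT 0.1::double precision + 0.2::double precision = 0.3::double precision;

-- The same comparison with numeric is carried out exactly and yields true.
SELECT 0.1::numeric + 0.2::numeric = 0.3::numeric;
```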
Normally, the real type has a range of at least
-1E+37 to +1E+37 with a precision of at least 6 decimal digits. The
double precision type normally has a range of around
-1E+308 to +1E+308 with a precision of at least 15 digits. Values that
are too large or too small will cause an error. Rounding may
take place if the precision of an input number is too high.
Numbers too close to zero that are not representable as distinct
from zero will cause an underflow error.
The Serial Types
The serial data type is not a true type, but merely
a notational convenience for setting up identifier columns
(similar to the AUTO_INCREMENT property
supported by some other databases). In the current
implementation, specifying
CREATE TABLE tablename (
colname SERIAL
);
is equivalent to specifying:
CREATE SEQUENCE tablename_colname_seq;
CREATE TABLE tablename (
colname integer DEFAULT nextval('tablename_colname_seq') NOT NULL
);
Thus, we have created an integer column and arranged for its default
values to be assigned from a sequence generator. A NOT NULL
constraint is applied to ensure that a null value cannot be explicitly
inserted, either. In most cases you would also want to attach a
UNIQUE or PRIMARY KEY constraint to prevent
duplicate values from being inserted by accident, but this is
not automatic.
To use a serial column to insert the next value of
the sequence into the table, specify that the serial
column should be assigned the default value. This can be done
either by excluding the column from the list of columns in
the INSERT statement, or through the use of
the DEFAULT keyword.
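Both ways of requesting the default value can be sketched as follows. The table items and its columns are hypothetical names used only for this example.

```sql
-- Hypothetical table with a serial identifier column.
CREATE TABLE items (id SERIAL, label text);

-- Variant 1: leave the serial column out of the column list.
INSERT INTO items (label) VALUES ('first');

-- Variant 2: name the column and ask for its default explicitly.
INSERT INTO items (id, label) VALUES (DEFAULT, 'second');

-- In a freshly created table the two rows receive consecutive
-- values from the sequence behind the column.
SELECT id, label FROM items;
```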
The type names serial and serial4 are
equivalent: both create integer columns. The type
names bigserial and serial8 work just
the same way, except that they create a bigint
column. bigserial should be used if you anticipate
the use of more than 2^31 identifiers over the
lifetime of the table.
The sequence created by a serial type is
automatically dropped when the owning column is dropped, and
cannot be dropped otherwise. (This was not true in
PostgreSQL releases before 7.3. Note
that this automatic drop linkage will not occur for a sequence
created by reloading a dump from a pre-7.3 database; the dump
file does not contain the information needed to establish the
dependency link.) Furthermore, this dependency between sequence
and column is made only for the serial column itself; if
any other columns reference the sequence (perhaps by manually
calling the nextval() function), they may be broken
if the sequence is removed. Using serial columns in
this fashion is considered bad form.
Prior to PostgreSQL 7.3, serial
implied UNIQUE. This is no longer automatic.
If you wish a serial column to be UNIQUE or a
PRIMARY KEY it must now be specified, just as
with any other data type.
Monetary Type
Note
The money type is deprecated. Use
numeric or decimal instead, in
combination with the to_char function. The
money type may become a locale-aware layer over the
numeric type in a future release.
The money type stores a currency amount with fixed
decimal point representation; see the Monetary Types table below. The output format is
locale-specific.
Input is accepted in a variety of formats, including integer and
floating-point literals, as well as typical
currency formatting, such as '$1,000.00'.
Output is in the latter form.
Monetary Types
 Type Name | Storage | Description     | Range
-----------+---------+-----------------+------------------------------
 money     | 4 bytes | currency amount | -21474836.48 to +21474836.47
Character Types
Character Types
 Type name                        | Description
----------------------------------+----------------------------
 character(n), char(n)            | fixed-length, blank padded
 character varying(n), varchar(n) | variable-length with limit
 text                             | variable unlimited length
The Character Types table above shows the
general-purpose character types available in
PostgreSQL.
SQL defines two primary character types:
character(n) and character
varying(n), where n is a
positive integer. Both of these types can store strings up to
n characters in length. An attempt to store a
longer string into a column of these types will result in an
error, unless the excess characters are all spaces, in which case
the string will be truncated to the maximum length. (This
somewhat bizarre exception is required by the
SQL standard.) If the string to be stored is
shorter than the declared length, values of type
character will be space-padded; values of type
character varying will simply store the shorter
string.
If one explicitly casts a value to
character(n) or character
varying(n), then an overlength value will
be truncated to n characters without raising an
error. (This too is required by the SQL
standard.)
Prior to PostgreSQL 7.2, strings that were too long were
always truncated without raising an error, in either explicit or
implicit casting contexts.
The notations char(n) and
varchar(n) are aliases for
character(n) and character
varying(n),
respectively. character without length specifier is
equivalent to character(1); if character
varying is used without length specifier, the type accepts
strings of any size. The latter is a PostgreSQL extension.
In addition, PostgreSQL supports the
more general text type, which stores strings of any
length. Unlike character varying, text
does not require an explicit declared upper limit on the size of
the string. Although the type text is not in the
SQL standard, many other RDBMS packages have it
as well.
The storage requirement for data of these types is 4 bytes plus the
actual string, and in case of character plus the
padding. Long strings are compressed by the system automatically, so
the physical requirement on disk may be less. Long values are also
stored in background tables so they don't interfere with rapid
access to the shorter column values. In any case, the longest
possible character string that can be stored is about 1 GB. (The
maximum value that will be allowed for n in the data
type declaration is less than that. It wouldn't be very useful to
change this because with multibyte character encodings the number of
characters and bytes can be quite different anyway. If you desire to
store long strings with no specific upper limit, use
text or character varying without a length
specifier, rather than making up an arbitrary length limit.)
There are no performance differences between these three types,
apart from the increased storage size when using the blank-padded
type.
Refer to for information about
the syntax of string literals, and to
for information about available operators and functions.
Using the character types
CREATE TABLE test1 (a character(4));
INSERT INTO test1 VALUES ('ok');
SELECT a, char_length(a) FROM test1;
  a   | char_length
------+-------------
 ok   |           4

CREATE TABLE test2 (b varchar(5));
INSERT INTO test2 VALUES ('ok');
INSERT INTO test2 VALUES ('good ');
INSERT INTO test2 VALUES ('too long');
ERROR: value too long for type character varying(5)
INSERT INTO test2 VALUES ('too long'::varchar(5)); -- explicit truncation
SELECT b, char_length(b) FROM test2;
   b   | char_length
-------+-------------
 ok    |           2
 good  |           5
 too l |           5
The char_length function is discussed in
.
There are two other fixed-length character types in
PostgreSQL, shown in the Specialty Character Types table below.
The name type
exists only for storage of internal catalog
names and is not intended for use by the general user. Its length
is currently defined as 64 bytes (63 usable characters plus terminator)
but should be referenced using the constant
NAMEDATALEN. The length is set at compile time
(and is therefore adjustable for special uses); the default
maximum length may change in a future release. The type
"char" (note the quotes) is different from
char(1) in that it only uses one byte of storage. It
is internally used in the system catalogs as a poor-man's
enumeration type.
Specialty Character Types
 Type Name | Storage  | Description
-----------+----------+-------------------------------------
 "char"    | 1 byte   | single character internal type
 name      | 64 bytes | sixty-three character internal type
Binary Strings
The bytea data type allows storage of binary strings;
see the Binary String Types table below.
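Binary String Types
 Type Name | Storage                               | Description
-----------+---------------------------------------+----------------------------------------------------------
 bytea     | 4 bytes plus the actual binary string | Variable (not specifically limited) length binary string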
A binary string is a sequence of octets (or bytes). Binary
strings are distinguished from character strings by two
characteristics: First, binary strings specifically allow storing
octets of zero value and other non-printable
octets. Second, operations on binary strings process the actual
bytes, whereas the encoding and processing of character strings
depends on locale settings.
When entering bytea values, octets of certain values
must be escaped (but all octet values
may be escaped) when used as part of a string
literal in an SQL statement. In general, to
escape an octet, it is converted into the three-digit octal number
equivalent of its decimal octet value, and preceded by two
backslashes. Some octet values have alternate escape sequences, as
shown in the bytea Literal Escaped Octets table below.
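bytea Literal Escaped Octets
 Decimal Octet Value | Description  | Input Escaped Representation | Example                | Printed Result
---------------------+--------------+------------------------------+------------------------+----------------
 0                   | zero octet   | '\\000'                      | SELECT '\\000'::bytea; | \000
 39                  | single quote | '\'' or '\\047'              | SELECT '\''::bytea;    | '
 92                  | backslash    | '\\\\' or '\\134'            | SELECT '\\\\'::bytea;  | \\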
Note that the result in each of the examples in the bytea Literal
Escaped Octets table above was exactly one
octet in length, even though the output representations of the zero
octet and backslash are more than one character long. Bytea
output octets are also escaped. In general, each
non-printable octet is converted into
its equivalent three-digit octal value, preceded by one backslash.
Most printable octets are represented by their standard
representation in the client character set. The octet with decimal
value 92 (backslash) has a special alternate output representation.
Details are in the bytea Output Escaped Octets table below.
bytea Output Escaped Octets
 Decimal Octet Value    | Description          | Output Escaped Representation | Example                | Printed Result
------------------------+----------------------+-------------------------------+------------------------+----------------
 92                     | backslash            | \\                            | SELECT '\\134'::bytea; | \\
 0 to 31 and 127 to 255 | non-printable octets | \### (octal value)            | SELECT '\\001'::bytea; | \001
 32 to 126              | printable octets     | ASCII representation          | SELECT '\\176'::bytea; | ~
To use the bytea escaped octet notation, string
literals (input strings) must contain two backslashes because they
must pass through two parsers in the PostgreSQL
server. The first backslash is interpreted as an escape character
by the string-literal parser, and therefore is consumed, leaving
the characters that follow. The remaining backslash is recognized
by the bytea input function as the prefix of a three
digit octal value. For example, a string literal passed to the
backend as '\\001' becomes
'\001' after passing through the string-literal
parser. The '\001' is then sent to the
bytea input function, where it is converted to a
single octet with a decimal value of 1.
For a similar reason, a backslash must be input as
'\\\\' (or '\\134'). The first
and third backslashes are interpreted as escape characters by the
string-literal parser, and therefore are consumed, leaving two
backslashes in the string passed to the bytea input function,
which interprets them as representing a single backslash.
For example, a string literal passed to the
server as '\\\\' becomes '\\'
after passing through the string-literal parser. The
'\\' is then sent to the bytea input
function, where it is converted to a single octet with a decimal
value of 92.
A single quote is a bit different in that it must be input as
'\'' (or '\\047'),
not as '\\''. This is because,
while the literal parser interprets the single quote as a special
character, and will consume the single backslash, the
bytea input function does not
recognize a single quote as a special octet. Therefore a string
literal passed to the backend as '\'' becomes
''' after passing through the string-literal
parser. The ''' is then sent to the
bytea input function, where it retains its single
octet decimal value of 39.
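The two-stage parsing described above can be checked by measuring the resulting values. This sketch assumes that the octet_length function is available for bytea values and that backslashes in ordinary string literals are processed as escapes, as this section describes; a server configured to treat backslashes in plain literals differently would give other results.

```sql
-- Each literal below passes through the string-literal parser and then
-- the bytea input function. Under the escaping rules described in this
-- section, each one denotes a single octet.
SELECT octet_length('\\000'::bytea);   -- zero octet (decimal 0)
SELECT octet_length('\''::bytea);      -- single quote (decimal 39)
SELECT octet_length('\\\\'::bytea);    -- backslash (decimal 92)
```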
Depending on the front end to PostgreSQL you use,
you may have additional work to do in terms of escaping and
unescaping bytea strings. For example, you may also
have to escape line feeds and carriage returns if your interface
automatically translates these. Or you may have to double up on
backslashes if the parser for your language of choice also treats
them as an escape character.
The SQL standard defines a different binary
string type, called BLOB or BINARY LARGE
OBJECT. The input format is different compared to
bytea, but the provided functions and operators are
mostly the same.
Date/Time Types
PostgreSQL supports the full set of
SQL date and time types, shown in the Date/Time Types table below.
Date/Time Types
 Type                                    | Description        | Storage  | Earliest         | Latest          | Resolution
-----------------------------------------+--------------------+----------+------------------+-----------------+---------------------------
 timestamp [ (p) ] [ without time zone ] | both date and time | 8 bytes  | 4713 BC          | AD 1465001      | 1 microsecond / 14 digits
 timestamp [ (p) ] with time zone        | both date and time | 8 bytes  | 4713 BC          | AD 1465001      | 1 microsecond / 14 digits
 interval [ (p) ]                        | time intervals     | 12 bytes | -178000000 years | 178000000 years | 1 microsecond
 date                                    | dates only         | 4 bytes  | 4713 BC          | 32767 AD        | 1 day
 time [ (p) ] [ without time zone ]      | times of day only  | 8 bytes  | 00:00:00.00      | 23:59:59.99     | 1 microsecond
 time [ (p) ] with time zone             | times of day only  | 12 bytes | 00:00:00.00+12   | 23:59:59.99-12  | 1 microsecond
time, timestamp, and
interval accept an optional precision value
p which specifies the number of
fractional digits retained in the seconds field. By default, there
is no explicit bound on precision. The allowed range of
p is from 0 to 6 for the
timestamp and interval types, 0 to 13
for the time types.
When timestamp values are stored as double precision floating-point
numbers (currently the default), the effective limit of precision
may be less than 6, since timestamp values are stored as seconds
since 2000-01-01. Microsecond precision is achieved for dates within
a few years of 2000-01-01, but the precision degrades for dates further
away. When timestamps are stored as eight-byte integers (a compile-time
option), microsecond precision is available over the full range of
values.
Time zones, and time-zone conventions, are influenced by
political decisions, not just earth geometry. Time zones around the
world became somewhat standardized during the 1900's,
but continue to be prone to arbitrary changes.
PostgreSQL uses your operating
system's underlying features to provide output time-zone
support, and these systems usually contain information for only
the time period 1902 through 2038 (corresponding to the full
range of conventional Unix system time).
timestamp with time zone and time with time
zone will use time zone
information only within that year range, and assume that times
outside that range are in UTC.
The type time with time zone is defined by the SQL
standard, but the definition exhibits properties which lead to
questionable usefulness. In most cases, a combination of
date, time, timestamp without time
zone and timestamp with time zone should
provide a complete range of date/time functionality required by
any application.
The types abstime
and reltime are lower precision types which are used internally.
You are discouraged from using these types in new
applications and are encouraged to move any old
ones over when appropriate. Any or all of these internal types
might disappear in a future release.
Date/Time Input
Date and time input is accepted in almost any reasonable format, including
ISO 8601, SQL-compatible,
traditional PostgreSQL, and others.
For some formats, ordering of month and day in date input can be
ambiguous and there is support for specifying the expected
ordering of these fields.
The command SET DateStyle TO 'US' or SET DateStyle TO 'NonEuropean'
selects the variant with month before day; the command
SET DateStyle TO 'European' selects the variant with
day before month.
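The effect of the field ordering on an ambiguous input can be sketched as follows. Combining the output style ISO with the ordering keyword in one setting is an assumption of this example; it keeps the output unambiguous so that the difference in interpretation is visible.

```sql
-- Month before day: 1/8/1999 is read as January 8, 1999.
SET DateStyle TO 'ISO, US';
SELECT DATE '1/8/1999';

-- Day before month: the same input is read as August 1, 1999.
SET DateStyle TO 'ISO, European';
SELECT DATE '1/8/1999';
```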
PostgreSQL is more flexible in
handling date/time than the
SQL standard requires.
See
for the exact parsing rules of date/time input and for the
recognized text fields including months, days of the week, and
time zones.
Remember that any date or time literal input needs to be enclosed
in single quotes, like text strings. Refer to
for more
information.
SQL requires the following syntax
type [ (p) ] 'value'
where p in the optional precision
specification is an integer corresponding to the
number of fractional digits in the seconds field. Precision can
be specified
for time, timestamp, and
interval types.
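A few literals written in this syntax are shown below as a sketch. The values are arbitrary examples, and the comments restate the rule from the text rather than quoting exact output.

```sql
-- Plain typed literals: type name followed by a quoted value.
SELECT DATE '1999-01-08';
SELECT TIME '04:05:06';
SELECT TIMESTAMP '1999-01-08 04:05:06';

-- With a precision p, the seconds field keeps p fractional digits.
SELECT TIME(2) '04:05:06.789';                   -- two fractional digits
SELECT TIMESTAMP(0) '1999-01-08 04:05:06.789';   -- no fractional digits
```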
Dates
The Date Input table below shows some possible
inputs for the date type.
Date Input
 Example          | Description
------------------+-----------------------------------------
 January 8, 1999  | unambiguous
 1999-01-08       | ISO-8601 format, preferred
 1/8/1999         | U.S.; read as August 1 in European mode
 8/1/1999         | European; read as August 1 in U.S. mode
 1/18/1999        | U.S.; read as January 18 in any mode
 19990108         | ISO-8601 year, month, day
 990108           | ISO-8601 year, month, day
 1999.008         | year and day of year
 99008            | year and day of year
 J2451187         | Julian day
 January 8, 99 BC | year 99 before the Common Era
Times
The time type can be specified as time or
as time without time zone. The optional precision
p should be between 0 and 13, and
defaults to the precision of the input time literal.
The Time Input table below shows the valid time inputs.
Time Input
 Example      | Description
--------------+-----------------------------------------
 04:05:06.789 | ISO 8601
 04:05:06     | ISO 8601
 04:05        | ISO 8601
 040506       | ISO 8601
 04:05 AM     | same as 04:05; AM does not affect value
 04:05 PM     | same as 16:05; input hour must be <= 12
 allballs     | same as 00:00:00
The type time with time zone accepts all input also
legal for the time type, appended with a legal time
zone, as shown in the Time With Time Zone Input table below.
Time With Time Zone Input
 Example        | Description
----------------+-------------
 04:05:06.789-8 | ISO 8601
 04:05:06-08:00 | ISO 8601
 04:05-08:00    | ISO 8601
 040506-08      | ISO 8601
Refer to the Time Zone Input table for
more examples of time zones.
Time stamps
The time stamp types are timestamp [
(p) ] without time zone and
timestamp [ (p) ] with time
zone. Writing just timestamp is equivalent to
timestamp without time zone.
Prior to PostgreSQL 7.3, writing just
timestamp was equivalent to timestamp with time
zone. This was changed for SQL spec compliance.
Valid input for the time stamp types consists of a concatenation
of a date and a time, followed by an optional
AD or BC, followed by an
optional time zone. (See .) Thus
1999-01-08 04:05:06
and
1999-01-08 04:05:06 -8:00
are valid values, which follow the ISO 8601
standard. In addition, the widespread format
January 8 04:05:06 1999 PST
is supported.
The optional precision
p should be between 0 and 6, and
defaults to the precision of the input timestamp literal.
For timestamp without time zone, any explicit time
zone specified in the input is silently ignored. That is, the
resulting date/time value is derived from the explicit date/time
fields in the input value, and is not adjusted for time zone.
For timestamp with time zone, the internally stored
value is always in UTC (GMT). An input value that has an explicit
time zone specified is converted to UTC using the appropriate offset
for that time zone. If no time zone is stated in the input string,
then it is assumed to be in the time zone indicated by the system's
TimeZone parameter, and is converted to UTC using the
offset for the TimeZone zone.
When a timestamp with time
zone value is output, it is always converted from UTC to the
current TimeZone zone, and displayed as local time in that
zone. To see the time in another time zone, either change
TimeZone or use the AT TIME ZONE construct
(see ).
Conversions between timestamp without time zone and
timestamp with time zone normally assume that the
timestamp without time zone value should be taken or given
as TimeZone local time. A different zone reference can
be specified for the conversion using AT TIME ZONE.
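The input and output conversions described above can be sketched as follows. Setting TimeZone to UTC is an arbitrary choice for this example, and the zone name PST is taken from the examples in this section.

```sql
-- Display timestamp with time zone values in UTC for this session.
SET TimeZone TO 'UTC';

-- The input carries an explicit offset of -8 hours, so it is converted
-- to UTC on input; 04:05:06 at UTC-8 corresponds to 12:05:06 UTC.
SELECT TIMESTAMP WITH TIME ZONE '1999-01-08 04:05:06-08';

-- View the same instant in another zone without changing TimeZone.
SELECT TIMESTAMP WITH TIME ZONE '1999-01-08 04:05:06-08' AT TIME ZONE 'PST';

-- For timestamp without time zone the offset in the input is ignored,
-- so the date and time fields are taken as written.
SELECT TIMESTAMP WITHOUT TIME ZONE '1999-01-08 04:05:06-08';
```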
Time Zone Input
 Time Zone | Description
-----------+-------------------------
 PST       | Pacific Standard Time
 -8:00     | ISO-8601 offset for PST
 -800      | ISO-8601 offset for PST
 -8        | ISO-8601 offset for PST
Intervals
interval values can be written with the following syntax:
Quantity Unit [Quantity Unit...] [Direction]
@ Quantity Unit [Quantity Unit...] [Direction]
where: Quantity is a number (possibly signed),
Unit is second,
minute, hour, day,
week, month, year,
decade, century, millennium,
or abbreviations or plurals of these units;
Direction can be ago or
empty. The at sign (@) is optional noise. The amounts
of different units are implicitly added up with appropriate
sign accounting.
Quantities of days, hours, minutes, and seconds can be specified without
explicit unit markings. For example, '1 12:59:10' is read
the same as '1 day 12 hours 59 min 10 sec'.
The optional precision
p should be between 0 and 6, and
defaults to the precision of the input literal.
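The equivalences implied by this syntax can be checked directly by comparing interval literals. This is a sketch using arbitrary values; each comparison is expected to yield true under the rules given above.

```sql
-- Days, hours, minutes, and seconds without explicit unit markings.
SELECT INTERVAL '1 12:59:10' = INTERVAL '1 day 12 hours 59 min 10 sec';

-- The leading @ is noise, and "ago" negates the quantities.
SELECT INTERVAL '@ 2 hours 30 minutes ago' = INTERVAL '-2 hours -30 minutes';

-- Amounts of different units are added up: a week is seven days.
SELECT INTERVAL '1 week' = INTERVAL '7 days';
```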
Special values
The following SQL-compatible functions can be
used as date or time
values for the corresponding data type: CURRENT_DATE,
CURRENT_TIME,
CURRENT_TIMESTAMP. The latter two accept an
optional precision specification. (See also .)
PostgreSQL also supports several
special date/time input values for convenience, as shown in the
Special Date/Time Inputs table below. The values
infinity and -infinity
are specially represented inside the system and will be displayed
the same way; but the others are simply notational shorthands
that will be converted to ordinary date/time values when read.
Special Date/Time Inputs
 Input string      | Description
-------------------+-----------------------------------------------------------------
 epoch             | 1970-01-01 00:00:00+00 (Unix system time zero)
 infinity          | later than all other timestamps (not available for type date)
 -infinity         | earlier than all other timestamps (not available for type date)
 now               | current transaction time
 today             | midnight today
 tomorrow          | midnight tomorrow
 yesterday         | midnight yesterday
 zulu, allballs, z | 00:00:00.00 GMT
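A few of these special values in use, as a sketch. Apart from the comparisons, results depend on the current date and time, so no output is shown for those queries.

```sql
-- SQL-compatible functions; the latter two accept an optional precision.
SELECT CURRENT_DATE, CURRENT_TIME, CURRENT_TIMESTAMP(0);

-- Notational shorthands are converted to ordinary values when read.
SELECT TIMESTAMP 'epoch';
SELECT DATE 'today' - DATE 'yesterday';        -- one day apart

-- infinity sorts after every other timestamp, so this yields true.
SELECT TIMESTAMP 'infinity' > TIMESTAMP 'now';
```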
Date/Time Output
Output formats can be set to one of the four styles ISO 8601,
SQL (Ingres), traditional PostgreSQL, and
German, using the SET DateStyle command. The default
is the ISO format. (The
SQL standard requires the use of the ISO 8601
format. The name of the SQL
output format is a
historical accident.) The Date/Time Output Styles table below shows examples of each
output style. The output of the date and
time types is of course only the date or time part
in accordance with the given examples.
Date/Time Output Styles
 Style Specification | Description           | Example
---------------------+-----------------------+------------------------------
 ISO                 | ISO 8601/SQL standard | 1997-12-17 07:37:16-08
 SQL                 | traditional style     | 12/17/1997 07:37:16.00 PST
 PostgreSQL          | original style        | Wed Dec 17 07:37:16 1997 PST
 German              | regional style        | 17.12.1997 07:37:16.00 PST
The SQL style has European and non-European
(U.S.) variants, which determine whether month follows day or
vice versa. (See the Date/Time Input section
for how this setting also affects interpretation of input values.)
The Date Order Conventions table below shows an
example.
Date Order Conventions
 Style Specification | Description    | Example
---------------------+----------------+----------------------------
 European            | day/month/year | 17/12/1997 15:37:16.00 MET
 US                  | month/day/year | 12/17/1997 07:37:16.00 PST
interval output looks like the input format, except that units like
week or century are converted to years and days.
In ISO mode the output looks like
[ Quantity Units [ ... ] ] [ Days ] Hours:Minutes [ ago ]
The date/time styles can be selected by the user using the
SET DATESTYLE command, the
datestyle parameter in the
postgresql.conf configuration file, or the
PGDATESTYLE environment variable on the server or
client. The formatting function to_char
is also available as
a more flexible way to format the date/time output.
Time Zones
PostgreSQL endeavors to be compatible with
the SQL standard definitions for typical usage.
However, the SQL standard has an odd mix of date and
time types and capabilities. Two obvious problems are:
Although the date type
does not have an associated time zone, the
time type can.
Time zones in the real world can have no meaning unless
associated with a date as well as a time
since the offset may vary through the year with daylight-saving
time boundaries.
The default time zone is specified as a constant integer offset
from GMT/UTC. It is not possible to adapt to daylight-saving
time when doing date/time arithmetic across
DST boundaries.
To address these difficulties, we recommend using date/time types
that contain both date and time when using time zones. We
recommend not using the type time with
time zone (though it is supported by
PostgreSQL for legacy applications and
for compatibility with other SQL
implementations). PostgreSQL assumes
your local time zone for any type containing only date or
time. Further, time zone support is derived from the underlying
operating system time-zone capabilities, and hence can handle
daylight-saving time and other expected behavior.
PostgreSQL obtains time-zone support
from the underlying operating system for dates between 1902 and
2038 (near the typical date limits for Unix-style
systems). Outside of this range, all dates are assumed to be
specified and used in Coordinated Universal Time
(UTC).
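The 1902 to 2038 window sits just inside the range of a signed 32-bit
count of seconds since 1970, which is presumably what "typical date
limits for Unix-style systems" refers to. The following Python sketch
computes those limits; the 32-bit width and the helper name
time_t_limits are assumptions made for illustration, not something the
text above states.

```python
from datetime import datetime, timedelta, timezone

# Unix system time zero, the same instant as the special input "epoch".
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def time_t_limits(bits=32):
    """Return the earliest and latest instants a signed counter can hold."""
    earliest = EPOCH + timedelta(seconds=-(2 ** (bits - 1)))
    latest = EPOCH + timedelta(seconds=2 ** (bits - 1) - 1)
    return earliest, latest


earliest, latest = time_t_limits()
print(earliest.isoformat())  # 1901-12-13T20:45:52+00:00
print(latest.isoformat())    # 2038-01-19T03:14:07+00:00
```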
All dates and times are stored internally in
UTC, traditionally known as Greenwich Mean
Time (GMT). Times are converted to local time
on the database server before being sent to the client frontend,
hence by default are in the server time zone.
There are several ways to select the time zone used by the server:
The TZ environment variable on the server host
is used by the server as the default time zone, if no other is
specified.
The timezone configuration parameter can be
set in postgresql.conf.
The PGTZ environment variable, if set at the
client, is used by libpq
applications to send a SET TIME ZONE
command to the server upon connection.
The SQL command SET TIME ZONE
sets the time zone for the session.
If an invalid time zone is specified, the time zone becomes
UTC (on most systems anyway).
A list of available time zones is given elsewhere in this
documentation.
Internals
PostgreSQL uses Julian dates
for all date/time calculations. They have the nice property of correctly
predicting/calculating any date more recent than 4713 BC
to far into the future, using the assumption that the length of the
year is 365.2425 days.
Date conventions before the 19th century make for interesting reading,
but are not consistent enough to warrant coding into a date/time handler.
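To make the 365.2425-day year concrete, the following Python sketch
converts a Gregorian calendar date to a Julian day number with a
widely used integer formula. It is an illustration only: the function
name julian_day is hypothetical, and nothing here is taken from the
PostgreSQL source code.

```python
def julian_day(year, month, day):
    """Julian day number of a (proleptic) Gregorian calendar date."""
    # Treat January and February as months 13 and 14 of the prior year,
    # so that the leap day falls at the end of the counting year.
    a = (14 - month) // 12
    y = year + 4800 - a
    m = month + 12 * a - 3
    # 365 days per year plus the Gregorian leap-year corrections:
    # +1 every 4 years, -1 every 100 years, +1 every 400 years.
    return (day + (153 * m + 2) // 5 + 365 * y
            + y // 4 - y // 100 + y // 400 - 32045)


print(julian_day(2000, 1, 1))  # 2451545
# 400 Gregorian years always span the same number of days.
days = julian_day(2400, 1, 1) - julian_day(2000, 1, 1)
print(days / 400)  # 365.2425
```

The leap-year terms are where the average year length comes from:
365 + 1/4 - 1/100 + 1/400 = 365.2425.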
Boolean Type
PostgreSQL provides the
standard SQL type boolean.
boolean can have one of only two states:
true or false. A third state, unknown, is represented by the
SQL null value.
Valid literal values for the true state are:
TRUE
't'
'true'
'y'
'yes'
'1'
For the false state, the following values can be used:
FALSE
'f'
'false'
'n'
'no'
'0'
Using the key words TRUE and
FALSE is preferred (and
SQL-compliant).
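The two lists of quoted literals amount to a small lookup table. The
Python sketch below models only the spellings listed above, matched
exactly; the name parse_boolean is hypothetical, and any leniency the
server may apply (for example to letter case) is deliberately not
modelled.

```python
# Quoted spellings listed in the text for each state.
TRUE_LITERALS = {"t", "true", "y", "yes", "1"}
FALSE_LITERALS = {"f", "false", "n", "no", "0"}


def parse_boolean(text):
    """Map one of the listed literal strings to a Python bool."""
    if text in TRUE_LITERALS:
        return True
    if text in FALSE_LITERALS:
        return False
    raise ValueError(f"not a listed boolean literal: {text!r}")


print(parse_boolean("yes"), parse_boolean("0"))  # True False
```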
Using the boolean type
CREATE TABLE test1 (a boolean, b text);
INSERT INTO test1 VALUES (TRUE, 'sic est');
INSERT INTO test1 VALUES (FALSE, 'non est');
SELECT * FROM test1;
a | b
---+---------
t | sic est
f | non est
SELECT * FROM test1 WHERE a;
a | b
---+---------
t | sic est
The example above shows that
boolean values are output using the letters
t and f.
Values of the boolean type cannot be cast directly
to other types (e.g., CAST
(boolval AS integer) does
not work). The conversion can be written with a
CASE expression instead: CASE WHEN
boolval THEN 'value if true' ELSE
'value if false' END.
boolean uses 1 byte of storage.
Geometric Types
Geometric data types represent two-dimensional spatial
objects. The table below shows the geometric
types available in PostgreSQL. The
most fundamental type, the point, forms the basis for all of the
other types.
Geometric Types
Geometric Type | Storage      | Representation    | Description
---------------+--------------+-------------------+---------------------------------------
point          | 16 bytes     | (x,y)             | Point in space
line           | 32 bytes     | ((x1,y1),(x2,y2)) | Infinite line (not fully implemented)
lseg           | 32 bytes     | ((x1,y1),(x2,y2)) | Finite line segment
box            | 32 bytes     | ((x1,y1),(x2,y2)) | Rectangular box
path           | 16+16n bytes | ((x1,y1),...)     | Closed path (similar to polygon)
path           | 16+16n bytes | [(x1,y1),...]     | Open path
polygon        | 40+16n bytes | ((x1,y1),...)     | Polygon (similar to closed path)
circle         | 24 bytes     | <(x,y),r>         | Circle (center and radius)
A rich set of functions and operators is available to perform various geometric
operations such as scaling, translation, rotation, and determining
intersections. They are explained elsewhere in this documentation.
Point
Points are the fundamental two-dimensional building block for geometric types.
point is specified using the following syntax:
( x , y )
x , y
where the arguments are
x
the x-axis coordinate as a floating-point number
y
the y-axis coordinate as a floating-point number
Line Segment
Line segments (lseg) are represented by pairs of points.
lseg is specified using the following syntax:
( ( x1 , y1 ) , ( x2 , y2 ) )
( x1 , y1 ) , ( x2 , y2 )
x1 , y1 , x2 , y2
where the arguments are
(x1,y1)
(x2,y2)
the end points of the line segment
Box
Boxes are represented by pairs of points that are opposite
corners of the box.
box is specified using the following syntax:
( ( x1 , y1 ) , ( x2 , y2 ) )
( x1 , y1 ) , ( x2 , y2 )
x1 , y1 , x2 , y2
where the arguments are
(x1,y1)
(x2,y2)
opposite corners of the box
Boxes are output using the first syntax.
Any two opposite corners can be supplied on input. The lower
left and upper right corners are computed from them, and the box is
stored with the upper right corner first, then the lower left corner.
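The corner reordering can be sketched in a few lines of Python. This
models only the rule stated above (upper right first, then lower left);
the function name normalize_box is hypothetical and points are plain
(x, y) tuples.

```python
def normalize_box(corner_a, corner_b):
    """Return (upper_right, lower_left) for any two opposite corners."""
    (x1, y1), (x2, y2) = corner_a, corner_b
    upper_right = (max(x1, x2), max(y1, y2))
    lower_left = (min(x1, x2), min(y1, y2))
    return upper_right, lower_left


# Upper left and lower right corners given; the other pair is stored.
print(normalize_box((0, 2), (3, 0)))  # ((3, 2), (0, 0))
```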
Path
Paths are represented by connected sets of points. Paths can be
open, where
the first and last points in the set are not connected, or closed,
where the first and last points are connected. Functions
popen(p)
and
pclose(p)
are supplied to force a path to be open or closed, and functions
isopen(p)
and
isclosed(p)
are supplied to test for either type in a query.
path is specified using the following syntax:
( ( x1 , y1 ) , ... , ( xn , yn ) )
[ ( x1 , y1 ) , ... , ( xn , yn ) ]
( x1 , y1 ) , ... , ( xn , yn )
( x1 , y1 , ... , xn , yn )
x1 , y1 , ... , xn , yn
where the arguments are
(x,y)
End points of the line segments comprising the path.
A leading square bracket ([) indicates an open path, while
a leading parenthesis (() indicates a closed path.
Paths are output using the first syntax.
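The bracket rule can be illustrated with a deliberately loose reader
for path literals, written here in Python. It looks only at the first
character to decide between open and closed and then pairs up the
numbers it finds; it does not check that brackets balance. The name
parse_path and the simple number pattern are assumptions of this
sketch, not PostgreSQL's input routine.

```python
import re

# Plain decimal numbers only; exponents are not handled in this sketch.
NUMBER = re.compile(r"[-+]?\d+(?:\.\d+)?")


def parse_path(text):
    """Return (is_open, points) for a path written in any listed syntax."""
    stripped = text.strip()
    # Only a leading square bracket marks an open path.
    is_open = stripped.startswith("[")
    values = [float(token) for token in NUMBER.findall(stripped)]
    if not values or len(values) % 2:
        raise ValueError(f"expected x,y pairs in: {text!r}")
    points = list(zip(values[0::2], values[1::2]))
    return is_open, points


print(parse_path("[(0,0),(1,1),(2,0)]"))
print(parse_path("0,0,1,1,2,0"))
```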
Polygon
Polygons are represented by sets of points. A polygon is conceptually
equivalent to a closed path, but it is stored differently
and has its own set of support routines.
polygon is specified using the following syntax:
( ( x1 , y1 ) , ... , ( xn , yn ) )
( x1 , y1 ) , ... , ( xn , yn )
( x1 , y1 , ... , xn , yn )
x1 , y1 , ... , xn , yn
where the arguments are
(x,y)
End points of the line segments comprising the boundary of the
polygon
Polygons are output using the first syntax.
Circle
Circles are represented by a center point and a radius.
circle is specified using the following syntax:
< ( x , y ) , r >
( ( x , y ) , r )
( x , y ) , r
x , y , r
where the arguments are
(x,y)
center of the circle
r
radius of the circle
Circles are output using the first syntax.
Network Address Data Types
PostgreSQL offers data types to store IP and MAC
addresses, shown in the table below. It
is preferable to use these types over plain text types, because
these types offer input error checking and several specialized
operators and functions.
Network Address Data Types
Name    | Storage  | Description           | Range
--------+----------+-----------------------+------------------------------
cidr    | 12 bytes | IP networks           | valid IPv4 networks
inet    | 12 bytes | IP hosts and networks | valid IPv4 hosts or networks
macaddr | 6 bytes  | MAC addresses         | customary formats
IPv6 is not yet supported.
inet
The inet type holds an IP host address, and
optionally the identity of the subnet it is in, all in one field.
The subnet identity is represented by the number of bits in the
network part of the address (the netmask). If the netmask is 32,
then the value does not indicate a subnet, only a single host.
Note that if you want to accept networks only, you should use the
cidr type rather than inet.
The input format for this type is x.x.x.x/y where x.x.x.x is an IP address and
y is the number of
bits in the netmask. If the /y part is left off, then the
netmask is 32, and the value represents just a single host.
On display, the /y
portion is suppressed if the netmask is 32.
cidr
The cidr type holds an IP network specification.
Input and output formats follow Classless Inter-Domain Routing
conventions.
The format for specifying classless networks is x.x.x.x/y, where
x.x.x.x is the network and y is the number of bits in the netmask. If
y is omitted, it is calculated
using assumptions from the older classful numbering system, except
that it will be at least large enough to include all of the octets
written in the input.
The table below shows some examples.
cidr Type Input Examples
CIDR Input         | CIDR Displayed     | abbrev(CIDR)
-------------------+--------------------+--------------------
192.168.100.128/25 | 192.168.100.128/25 | 192.168.100.128/25
192.168/24         | 192.168.0.0/24     | 192.168.0/24
192.168/25         | 192.168.0.0/25     | 192.168.0.0/25
192.168.1          | 192.168.1.0/24     | 192.168.1/24
192.168            | 192.168.0.0/24     | 192.168.0/24
128.1              | 128.1.0.0/16       | 128.1/16
128                | 128.0.0.0/16       | 128.0/16
128.1.2            | 128.1.2.0/24       | 128.1.2/24
10.1.2             | 10.1.2.0/24        | 10.1.2/24
10.1               | 10.1.0.0/16        | 10.1/16
10                 | 10.0.0.0/8         | 10/8
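The rule for an omitted netmask can be written down directly: take the
classful default for the first octet, then widen it if more octets were
written than that default covers. The Python sketch below follows this
reading and reproduces the "CIDR Displayed" column of the examples
above. The name expand_cidr is hypothetical, first octets of 224 and
above are rejected because the text does not describe them, and the
check that no bits are set to the right of the netmask is left out.

```python
def expand_cidr(text):
    """Fill in missing octets and, if omitted, the netmask length."""
    address, _, mask = text.partition("/")
    octets = [int(part) for part in address.split(".")]
    if not 1 <= len(octets) <= 4 or any(not 0 <= o <= 255 for o in octets):
        raise ValueError(f"invalid network: {text!r}")
    if mask:
        bits = int(mask)
    else:
        first = octets[0]
        if first < 128:
            classful = 8       # old class A
        elif first < 192:
            classful = 16      # old class B
        elif first < 224:
            classful = 24      # old class C
        else:
            raise ValueError("first octet >= 224 is not modelled here")
        # At least large enough to include every octet that was written.
        bits = max(classful, 8 * len(octets))
    padded = octets + [0] * (4 - len(octets))
    return ".".join(str(o) for o in padded) + f"/{bits}"


for sample in ("192.168", "128", "10.1", "10.1.2"):
    print(sample, "->", expand_cidr(sample))
```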
inet vs cidr
The essential difference between inet and cidr
data types is that inet accepts values with nonzero bits to
the right of the netmask, whereas cidr does not.
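Python's standard ipaddress module draws a similar line, which makes
the distinction easy to try out: an interface may carry host bits, a
strictly parsed network may not. The sketch below is an analogy only.
The helper names are hypothetical, and the Python classes are not
claimed to match inet and cidr in every detail.

```python
import ipaddress


def accepted_like_inet(text):
    """Host address plus netmask; bits right of the mask may be set."""
    try:
        ipaddress.IPv4Interface(text)
        return True
    except ValueError:
        return False


def accepted_like_cidr(text):
    """Network only; bits right of the mask must all be zero."""
    try:
        ipaddress.IPv4Network(text, strict=True)
        return True
    except ValueError:
        return False


# 192.168.100.129/25 has a nonzero bit to the right of the netmask.
print(accepted_like_inet("192.168.100.129/25"))  # True
print(accepted_like_cidr("192.168.100.129/25"))  # False
print(accepted_like_cidr("192.168.100.128/25"))  # True
```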
If you do not like the output format for inet or
cidr values, try the host(),
text(), and abbrev() functions.
macaddr
The macaddr type stores MAC addresses, i.e., Ethernet
card hardware addresses (although MAC addresses are used for
other purposes as well). Input is accepted in various customary
formats, including
'08002b:010203'
'08002b-010203'
'0800.2b01.0203'
'08-00-2b-01-02-03'
'08:00:2b:01:02:03'
which would all specify the same
address. Upper and lower case are both accepted for the digits
a through f. Output is always in the
last of the shown forms.
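All five spellings reduce to the same twelve hexadecimal digits, so
normalizing them is mostly a matter of dropping separators. The Python
sketch below does that and prints the last of the shown forms. It is
more lenient than a real input routine would need to be, since it
ignores where the separators are placed; the name normalize_mac is
hypothetical.

```python
import re


def normalize_mac(text):
    """Return a MAC address as six colon-separated lower-case pairs."""
    # Drop the separators used by the customary formats.
    digits = re.sub(r"[:.\-]", "", text)
    if not re.fullmatch(r"[0-9a-fA-F]{12}", digits):
        raise ValueError(f"not a MAC address: {text!r}")
    digits = digits.lower()
    return ":".join(digits[i:i + 2] for i in range(0, 12, 2))


for form in ("08002b:010203", "08002b-010203", "0800.2b01.0203",
             "08-00-2b-01-02-03", "08:00:2B:01:02:03"):
    print(normalize_mac(form))
```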
The directory contrib/mac
in the PostgreSQL source distribution
contains tools that can be used to map MAC addresses to hardware
manufacturer names.
Bit String Types
Bit strings are strings of 1's and 0's. They can be used to store
or visualize bit masks. There are two SQL bit types:
BIT(n) and BIT
VARYING(n), where
n is a positive integer.
BIT type data must match the length
n exactly; it is an error to attempt to
store shorter or longer bit strings. BIT VARYING data is
of variable length up to the maximum length
n; longer strings will be rejected.
Writing BIT without a length is equivalent to
BIT(1), while BIT VARYING without a length
specification means unlimited length.
If one explicitly casts a bit-string value to
BIT(n), it will be truncated or
zero-padded on the right to be exactly n bits,
without raising an error. Similarly,
if one explicitly casts a bit-string value to
BIT VARYING(n), it will be truncated
on the right if it is more than n bits.
Prior to PostgreSQL 7.2, BIT data
was always silently truncated or zero-padded on the right, with
or without an explicit cast. This was changed to comply with the
SQL standard.
Bit string constants have their own syntax, such as B'101' in the
example below. Bit-logical operators and string
manipulation functions are available for bit strings.
Using the bit string types
CREATE TABLE test (a BIT(3), b BIT VARYING(5));
INSERT INTO test VALUES (B'101', B'00');
INSERT INTO test VALUES (B'10', B'101');
ERROR: Bit string length 2 does not match type BIT(3)
INSERT INTO test VALUES (B'10'::bit(3), B'101');
SELECT * FROM test;
a | b
-----+-----
101 | 00
100 | 101
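The effect of the explicit casts can be modelled on plain strings of
0 and 1 characters. The Python sketch below mirrors the two rules
described earlier (truncate or zero-pad on the right for BIT(n),
truncate only for BIT VARYING(n)); the function names are hypothetical.

```python
def cast_to_bit(bits, n):
    """Explicit cast to BIT(n): truncate or zero-pad on the right."""
    return bits[:n].ljust(n, "0")


def cast_to_bit_varying(bits, n):
    """Explicit cast to BIT VARYING(n): truncate on the right only."""
    return bits[:n]


print(cast_to_bit("10", 3))             # 100, as in the second row above
print(cast_to_bit("10110", 3))          # 101
print(cast_to_bit_varying("10110", 3))  # 101
print(cast_to_bit_varying("10", 5))     # 10
```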
Object Identifier Types
Object identifiers (OIDs) are used internally by
PostgreSQL as primary keys for various system
tables. Also, an OID system column is added to user-created tables
(unless WITHOUT OIDS is specified at table creation time).
Type oid represents an object identifier. There are also
several aliases for oid: regproc, regprocedure,
regoper, regoperator, regclass,
and regtype. The Object Identifier Types table below shows an overview.
The oid type is currently implemented as an unsigned four-byte
integer.
Therefore, it is not large enough to provide database-wide uniqueness
in large databases, or even in large individual tables. So, using a
user-created table's OID column as a primary key is discouraged.
OIDs are best used only for references to system tables.
The oid type itself has few operations beyond comparison
(which is implemented as unsigned comparison). It can be cast to
integer, however, and then manipulated using the standard integer
operators. (Beware of possible signed-versus-unsigned confusion
if you do this.)
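The warning about signed-versus-unsigned confusion can be made
concrete. The Python sketch below assumes that the cast keeps the 32
bits and reinterprets them as a two's-complement signed integer; the
text above does not say this, so treat it as an assumption. The name
oid_as_signed is hypothetical.

```python
def oid_as_signed(oid):
    """Reinterpret an unsigned 32-bit OID as a signed 32-bit integer."""
    if not 0 <= oid < 2 ** 32:
        raise ValueError("an OID is an unsigned four-byte integer")
    return oid - 2 ** 32 if oid >= 2 ** 31 else oid


small, large = 564182, 3000000000
print(small < large)                                # True as unsigned OIDs
print(oid_as_signed(small) < oid_as_signed(large))  # False once signed
print(oid_as_signed(large))                         # -1294967296
```

Under that assumption, OIDs of 2^31 and above become negative, so
ordering and arithmetic on the converted values no longer agree with
the unsigned comparison that oid itself uses.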
The oid alias types have no operations of their own except
for specialized input and output routines. These routines are able
to accept and display symbolic names for system objects, rather than
the raw numeric value that type oid would use. The alias
types allow simplified lookup of OID values for objects: for example,
one may write 'mytable'::regclass to get the OID of table
mytable, rather than SELECT oid FROM pg_class WHERE
relname = 'mytable'. (In reality, a much more complicated SELECT would
be needed to deal with selecting the right OID when there are multiple
tables named mytable in different schemas.)
Object Identifier Types
Type name    | References  | Description                  | Value example
-------------+-------------+------------------------------+--------------------------------------
oid          | any         | numeric object identifier    | 564182
regproc      | pg_proc     | function name                | sum
regprocedure | pg_proc     | function with argument types | sum(int4)
regoper      | pg_operator | operator name                | +
regoperator  | pg_operator | operator with argument types | *(integer,integer) or -(NONE,integer)
regclass     | pg_class    | relation name                | pg_type
regtype      | pg_type     | type name                    | integer
All of the OID alias types accept schema-qualified names, and will
display schema-qualified names on output if the object would not
be found in the current search path without being qualified.
The regproc and regoper alias types will only
accept input names that are unique (not overloaded), so they are
of limited use; for most uses regprocedure or
regoperator is more appropriate. For regoperator,
unary operators are identified by writing NONE for the unused
operand.
OIDs are 32-bit quantities and are assigned from a single cluster-wide
counter. In a large or long-lived database, it is possible for the
counter to wrap around. Hence, it is bad practice to assume that OIDs
are unique, unless you take steps to ensure that they are unique.
Recommended practice when using OIDs for row identification is to create
a unique constraint on the OID column of each table for which the OID will
be used. Never assume that OIDs are unique across tables; use the
combination of tableoid and row OID if you need a
database-wide identifier. (Future releases of
PostgreSQL are likely to use a separate
OID counter for each table, so that tableoid
must be included to arrive at a globally unique identifier.)
Another identifier type used by the system is xid, or transaction
(abbreviated xact) identifier. This is the data type of the system columns
xmin and xmax.
Transaction identifiers are 32-bit quantities. In a long-lived
database it is possible for transaction IDs to wrap around. This
is not a fatal problem given appropriate maintenance procedures;
see the PostgreSQL Administrator's Guide for details. However, it is
unwise to depend on uniqueness of transaction IDs over the long term
(more than one billion transactions).
A third identifier type used by the system is cid, or
command identifier. This is the data type of the system columns
cmin and cmax. Command
identifiers are also 32-bit quantities. This creates a hard limit
of 2^32 (4 billion) SQL commands
within a single transaction. In practice this limit is not a
problem; note that the limit is on the number of
SQL commands, not the number of tuples processed.
A final identifier type used by the system is tid, or tuple
identifier. This is the data type of the system column
ctid. A tuple ID is a pair
(block number, tuple index within block) that identifies the
physical location of the tuple within its table.
Pseudo-Types
The PostgreSQL type system contains a
number of special-purpose entries that are collectively called
pseudo-types. A pseudo-type cannot be used as a
column data type, but it can be used to declare a function's
argument or result type. Each of the available pseudo-types is
useful in situations where a function's behavior does not
correspond to simply taking or returning a value of a specific
SQL data type. The table below lists the existing
pseudo-types.
Pseudo-Types
Type name        | Description
-----------------+---------------------------------------------------------------------------
record           | Identifies a function returning an unspecified row type
any              | Indicates that a function accepts any input data type whatever
anyarray         | Indicates that a function accepts any array data type
void             | Indicates that a function returns no value
trigger          | A trigger function is declared to return trigger
language_handler | A procedural language call handler is declared to return language_handler
cstring          | Indicates that a function accepts or returns a null-terminated C string
internal         | Indicates that a function accepts or returns a server-internal data type
opaque           | An obsolete type name that formerly served all the above purposes
Functions coded in C (whether built-in or dynamically loaded) may be
declared to accept or return any of these pseudo data types. It is up to
the function author to ensure that the function will behave safely
when a pseudo-type is used as an argument type.
Functions coded in procedural languages may use pseudo-types only as
allowed by their implementation languages. At present the procedural
languages all forbid use of a pseudo-type as argument type, and allow
only void as a result type (plus trigger when the
function is used as a trigger).
The internal pseudo-type is used to declare functions
that are meant only to be called internally by the database
system, and not by direct invocation in a SQL
query. If a function has at least one internal-type
argument then it cannot be called from SQL. To
preserve the type safety of this restriction it is important to
follow this coding rule: do not create any function that is
declared to return internal unless it has at least one
internal argument.