Data Types

PostgreSQL has a rich set of native data types available to users. Users may add new types to PostgreSQL using the CREATE TYPE command.

The table below shows all general-purpose data types included in the standard distribution. Most of the alternative names listed in the Aliases column are the names used internally by PostgreSQL for historical reasons. In addition, some internally used or deprecated types are available, but they are not listed here.

Data Types

Type Name                               | Aliases            | Description
----------------------------------------+--------------------+--------------------------------------------
bigint                                  | int8               | signed eight-byte integer
bigserial                               | serial8            | autoincrementing eight-byte integer
bit                                     |                    | fixed-length bit string
bit varying(n)                          | varbit(n)          | variable-length bit string
boolean                                 | bool               | logical Boolean (true/false)
box                                     |                    | rectangular box in 2D plane
bytea                                   |                    | binary data
character(n)                            | char(n)            | fixed-length character string
character varying(n)                    | varchar(n)         | variable-length character string
cidr                                    |                    | IP network address
circle                                  |                    | circle in 2D plane
date                                    |                    | calendar date (year, month, day)
double precision                        | float8             | double precision floating-point number
inet                                    |                    | IP host address
integer                                 | int, int4          | signed four-byte integer
interval(p)                             |                    | general-use time span
line                                    |                    | infinite line in 2D plane (not implemented)
lseg                                    |                    | line segment in 2D plane
macaddr                                 |                    | MAC address
money                                   |                    | currency amount
numeric [ (p, s) ]                      | decimal [ (p, s) ] | exact numeric with selectable precision
path                                    |                    | open and closed geometric path in 2D plane
point                                   |                    | geometric point in 2D plane
polygon                                 |                    | closed geometric path in 2D plane
real                                    | float4             | single precision floating-point number
smallint                                | int2               | signed two-byte integer
serial                                  | serial4            | autoincrementing four-byte integer
text                                    |                    | variable-length character string
time [ (p) ] [ without time zone ]      |                    | time of day
time [ (p) ] with time zone             | timetz             | time of day, including time zone
timestamp [ (p) ] [ without time zone ] | timestamp          | date and time
timestamp [ (p) ] with time zone        | timestamptz        | date and time, including time zone

Compatibility

The following types (or spellings thereof) are specified by SQL: bit, bit varying, boolean, char, character, character varying, varchar, date, double precision, integer, interval, numeric, decimal, real, smallint, time, timestamp (both with and without time zone).

Each data type has an external representation determined by its input and output functions. Many of the built-in types have obvious external formats. However, several types are either unique to PostgreSQL, such as open and closed paths, or have several possibilities for formats, such as the date and time types. Most of the input and output functions corresponding to the base types (e.g., integers and floating-point numbers) do some error-checking. Some of the input and output functions are not invertible. That is, the result of an output function may lose precision when compared to the original input.

Some of the operators and functions (e.g., addition and multiplication) do not perform run-time error-checking in the interests of improving execution speed. On some systems, for example, the numeric operators for some data types may silently underflow or overflow.

Numeric Types

Numeric types consist of two-, four-, and eight-byte integers, four- and eight-byte floating-point numbers, and fixed-precision decimals. The following table lists the available types.
Numeric Types

Type name        | Storage size | Description                      | Range
-----------------+--------------+----------------------------------+---------------------------------------------
smallint         | 2 bytes      | small range fixed-precision      | -32768 to +32767
integer          | 4 bytes      | usual choice for fixed-precision | -2147483648 to +2147483647
bigint           | 8 bytes      | large range fixed-precision      | -9223372036854775808 to 9223372036854775807
decimal          | variable     | user-specified precision, exact  | no limit
numeric          | variable     | user-specified precision, exact  | no limit
real             | 4 bytes      | variable-precision, inexact      | 6 decimal digits precision
double precision | 8 bytes      | variable-precision, inexact      | 15 decimal digits precision
serial           | 4 bytes      | autoincrementing integer         | 1 to 2147483647
bigserial        | 8 bytes      | large autoincrementing integer   | 1 to 9223372036854775807
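
As a rough illustration of the Range and exact/inexact entries in the table above, the statements below probe the upper bound of smallint and compare a floating-point sum with the same sum computed in numeric. The table name range_demo is made up for this sketch, and the wording of the out-of-range error varies between releases, so it is not quoted here.

```sql
-- Hypothetical table used only for this illustration.
CREATE TABLE range_demo (s smallint);

INSERT INTO range_demo VALUES (32767);   -- largest smallint, accepted
INSERT INTO range_demo VALUES (32768);   -- outside the smallint range, rejected with an error

-- double precision is inexact: 0.1 and 0.2 are stored as binary
-- approximations, so on IEEE 754 platforms this comparison is false.
SELECT CAST(0.1 AS double precision) + CAST(0.2 AS double precision)
       = CAST(0.3 AS double precision) AS float_sum_equal;

-- numeric is exact: the same comparison is true.
SELECT CAST(0.1 AS numeric) + CAST(0.2 AS numeric)
       = CAST(0.3 AS numeric) AS numeric_sum_equal;

DROP TABLE range_demo;
```

The second and third queries differ only in the type of the operands, which is the practical meaning of "inexact" versus "exact" in the Description column.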

The syntax of constants for the numeric types is described elsewhere in this documentation. The numeric types have a full set of corresponding arithmetic operators and functions; refer to the documentation on functions and operators for more information. The following sections describe the types in detail.

The Integer Types

The types smallint, integer, and bigint store whole numbers, that is, numbers without fractional components, of various ranges. Attempts to store values outside of the allowed range will result in an error.

The type integer is the usual choice, as it offers the best balance between range, storage size, and performance. The smallint type is generally only used if disk space is at a premium. The bigint type should only be used if the integer range is not sufficient, because integer is faster.

The bigint type may not function correctly on all platforms, since it relies on compiler support for eight-byte integers. On a machine without such support, bigint acts the same as integer (but still takes up eight bytes of storage). However, we are not aware of any reasonable platform where this is actually the case.

SQL only specifies the integer types integer (or int) and smallint. The type bigint and the type names int2, int4, and int8 are extensions, which are shared with various other SQL database systems.

If you have a column of type smallint or bigint with an index, you may encounter problems getting the system to use that index. For instance, a clause of the form

    ... WHERE smallint_column = 42

will not use an index, because the system assigns type integer to the constant 42, and PostgreSQL currently cannot use an index when two different data types are involved. A workaround is to single-quote the constant, thus:

    ... WHERE smallint_column = '42'

This will cause the system to delay type resolution and will assign the right type to the constant.

Arbitrary Precision Numbers

The type numeric can store numbers with up to 1,000 digits of precision and perform calculations exactly.
It is especially recommended for storing monetary amounts and other quantities where exactness is required. However, the numeric type is very slow compared to the floating-point types described in the next section.

In what follows we use these terms: The scale of a numeric is the count of decimal digits in the fractional part, to the right of the decimal point. The precision of a numeric is the total count of significant digits in the whole number, that is, the number of digits to both sides of the decimal point. So the number 23.5141 has a precision of 6 and a scale of 4. Integers can be considered to have a scale of zero.

Both the precision and the scale of the numeric type can be configured. To declare a column of type numeric use the syntax

    NUMERIC(precision, scale)

The precision must be positive, the scale zero or positive. Alternatively,

    NUMERIC(precision)

selects a scale of 0. Specifying

    NUMERIC

without any precision or scale creates a column in which numeric values of any precision and scale can be stored, up to the implementation limit on precision. A column of this kind will not coerce input values to any particular scale, whereas numeric columns with a declared scale will coerce input values to that scale. (The SQL standard requires a default scale of 0, i.e., coercion to integer precision. We find this a bit useless. If you're concerned about portability, always specify the precision and scale explicitly.)

If the precision or scale of a value is greater than the declared precision or scale of a column, the system will attempt to round the value. If the value cannot be rounded so as to satisfy the declared limits, an error is raised.

The types decimal and numeric are equivalent. Both types are part of the SQL standard.

Floating-Point Types

The data types real and double precision are inexact, variable-precision numeric types.
In practice, these types are usually implementations of IEEE Standard 754 for Binary Floating-Point Arithmetic (single and double precision, respectively), to the extent that the underlying processor, operating system, and compiler support it.

Inexact means that some values cannot be converted exactly to the internal format and are stored as approximations, so that storing and printing back out a value may show slight discrepancies. Managing these errors and how they propagate through calculations is the subject of an entire branch of mathematics and computer science and will not be discussed further here, except for the following points:

  If you require exact storage and calculations (such as for monetary amounts), use the numeric type instead.

  If you want to do complicated calculations with these types for anything important, especially if you rely on certain behavior in boundary cases (infinity, underflow), you should evaluate the implementation carefully.

  Comparing two floating-point values for equality may or may not work as expected.

Normally, the real type has a range of at least -1E+37 to +1E+37 with a precision of at least 6 decimal digits. The double precision type normally has a range of around -1E+308 to +1E+308 with a precision of at least 15 digits. Values that are too large or too small will cause an error. Rounding may take place if the precision of an input number is too high. Numbers too close to zero that are not representable as distinct from zero will cause an underflow error.

The Serial Types

The serial data type is not a true type, but merely a notational convenience for setting up identifier columns (similar to the AUTO_INCREMENT property supported by some other databases).
In the current implementation, specifying

    CREATE TABLE tablename (
        colname SERIAL
    );

is equivalent to specifying:

    CREATE SEQUENCE tablename_colname_seq;
    CREATE TABLE tablename (
        colname integer DEFAULT nextval('tablename_colname_seq') NOT NULL
    );

Thus, we have created an integer column and arranged for its default values to be assigned from a sequence generator. A NOT NULL constraint is applied to ensure that a null value cannot be explicitly inserted, either. In most cases you would also want to attach a UNIQUE or PRIMARY KEY constraint to prevent duplicate values from being inserted by accident, but this is not automatic.

To use a serial column to insert the next value of the sequence into the table, specify that the serial column should be assigned the default value. This can be done either by excluding the column from the list of columns in the INSERT statement, or through the use of the DEFAULT keyword.

The type names serial and serial4 are equivalent: both create integer columns. The type names bigserial and serial8 work just the same way, except that they create a bigint column. bigserial should be used if you anticipate the use of more than 2^31 identifiers over the lifetime of the table.

The sequence created by a serial type is automatically dropped when the owning column is dropped, and cannot be dropped otherwise. (This was not true in PostgreSQL releases before 7.3. Note that this automatic drop linkage will not occur for a sequence created by reloading a dump from a pre-7.3 database; the dump file does not contain the information needed to establish the dependency link.) Furthermore, this dependency between sequence and column is made only for the serial column itself; if any other columns reference the sequence (perhaps by manually calling the nextval() function), they may be broken if the sequence is removed. Using serial columns in this fashion is considered bad form.

Prior to PostgreSQL 7.3, serial implied UNIQUE. This is no longer automatic.
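
A minimal sketch of the serial behavior described in this section, using a made-up table named serial_demo. It declares the uniqueness constraint explicitly and shows both ways of requesting the default value; on a freshly created table the two rows would be expected to receive the identifiers 1 and 2.

```sql
-- Hypothetical table; SERIAL creates an integer column fed by a sequence.
CREATE TABLE serial_demo (
    id    SERIAL PRIMARY KEY,   -- uniqueness has to be requested explicitly
    label text
);

-- Either omit the serial column from the column list ...
INSERT INTO serial_demo (label) VALUES ('first');
-- ... or name it and use the DEFAULT key word.
INSERT INTO serial_demo (id, label) VALUES (DEFAULT, 'second');

SELECT id, label FROM serial_demo ORDER BY id;

-- Dropping the table drops the owning column and, with it, the sequence.
DROP TABLE serial_demo;
```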
If you wish a serial column to be UNIQUE or a PRIMARY KEY it must now be specified, just as with any other data type.

Monetary Type

Note: The money type is deprecated. Use numeric or decimal instead, in combination with the to_char function. The money type may become a locale-aware layer over the numeric type in a future release.

The money type stores a currency amount with fixed decimal point representation; see the table below. The output format is locale-specific.

Input is accepted in a variety of formats, including integer and floating-point literals, as well as typical currency formatting, such as '$1,000.00'. Output is in the latter form.

Monetary Types

Type Name | Storage | Description     | Range
----------+---------+-----------------+------------------------------
money     | 4 bytes | currency amount | -21474836.48 to +21474836.47
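
The recommendation to use numeric together with to_char could look roughly like the following. The table name price_demo, the choice of numeric(10,2), and the format template are assumptions made for this sketch; the formatted output depends on the server's locale settings, so none is shown.

```sql
-- Hypothetical table: amounts kept in numeric with a scale of 2.
CREATE TABLE price_demo (amount numeric(10,2));

INSERT INTO price_demo VALUES (1000);        -- coerced to the declared scale
INSERT INTO price_demo VALUES (19.995);      -- rounded to two fractional digits
INSERT INTO price_demo VALUES (123456789);   -- nine integer digits do not fit
                                             -- numeric(10,2); rejected with an error

-- to_char formats the value for display: L is the locale's currency
-- symbol, G the group separator, and D the decimal point.
SELECT amount, to_char(amount, 'L999G999D99') AS formatted
FROM price_demo;

DROP TABLE price_demo;
```

Keeping the stored value in numeric and formatting only on output separates exact arithmetic from locale-specific presentation.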

Character Types

Character Types

Type name                        | Description
---------------------------------+----------------------------
character(n), char(n)            | fixed-length, blank padded
character varying(n), varchar(n) | variable-length with limit
text                             | variable unlimited length

The table above shows the general-purpose character types available in PostgreSQL.

SQL defines two primary character types: character(n) and character varying(n), where n is a positive integer. Both of these types can store strings up to n characters in length. An attempt to store a longer string into a column of these types will result in an error, unless the excess characters are all spaces, in which case the string will be truncated to the maximum length. (This somewhat bizarre exception is required by the SQL standard.) If the string to be stored is shorter than the declared length, values of type character will be space-padded; values of type character varying will simply store the shorter string.

If one explicitly casts a value to character(n) or character varying(n), then an overlength value will be truncated to n characters without raising an error. (This too is required by the SQL standard.)

Prior to PostgreSQL 7.2, strings that were too long were always truncated without raising an error, in either explicit or implicit casting contexts.

The notations char(n) and varchar(n) are aliases for character(n) and character varying(n), respectively. character without length specifier is equivalent to character(1); if character varying is used without length specifier, the type accepts strings of any size. The latter is a PostgreSQL extension.

In addition, PostgreSQL supports the more general text type, which stores strings of any length. Unlike character varying, text does not require an explicit declared upper limit on the size of the string. Although the type text is not in the SQL standard, many other RDBMS packages have it as well.

The storage requirement for data of these types is 4 bytes plus the actual string, and in case of character plus the padding. Long strings are compressed by the system automatically, so the physical requirement on disk may be less.
Long values are also stored in background tables so they don't interfere with rapid access to the shorter column values. In any case, the longest possible character string that can be stored is about 1 GB. (The maximum value that will be allowed for n in the data type declaration is less than that. It wouldn't be very useful to change this because with multibyte character encodings the number of characters and bytes can be quite different anyway. If you desire to store long strings with no specific upper limit, use text or character varying without a length specifier, rather than making up an arbitrary length limit.)

There are no performance differences between these three types, apart from the increased storage size when using the blank-padded type.

The syntax of string literals and the available operators and functions are described elsewhere in this documentation.

Using the character types

    CREATE TABLE test1 (a character(4));
    INSERT INTO test1 VALUES ('ok');
    SELECT a, char_length(a) FROM test1;
      a   | char_length
    ------+-------------
     ok   |           4

    CREATE TABLE test2 (b varchar(5));
    INSERT INTO test2 VALUES ('ok');
    INSERT INTO test2 VALUES ('good ');
    INSERT INTO test2 VALUES ('too long');
    ERROR:  value too long for type character varying(5)
    INSERT INTO test2 VALUES ('too long'::varchar(5)); -- explicit truncation
    SELECT b, char_length(b) FROM test2;
       b   | char_length
    -------+-------------
     ok    |           2
     good  |           5
     too l |           5

The char_length function used in the first query is discussed with the string functions.

There are two other fixed-length character types in PostgreSQL, shown in the Specialty Character Types table below. The name type exists only for storage of internal catalog names and is not intended for use by the general user. Its length is currently defined as 64 bytes (63 usable characters plus terminator) but should be referenced using the constant NAMEDATALEN. The length is set at compile time (and is therefore adjustable for special uses); the default maximum length may change in a future release.
The type "char" (note the quotes) is different from char(1) in that it only uses one byte of storage. It is internally used in the system catalogs as a poor-man's enumeration type.

Specialty Character Types

Type Name | Storage  | Description
----------+----------+-------------------------------------
"char"    | 1 byte   | single character internal type
name      | 64 bytes | sixty-three character internal type

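Binary Strings

The bytea data type allows storage of binary strings; see the table below.

Binary String Types

Type Name | Storage                                | Description
----------+----------------------------------------+-----------------------------------------------------------
bytea     | 4 bytes plus the actual binary string  | Variable (not specifically limited) length binary string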

A binary string is a sequence of octets (or bytes). Binary strings are distinguished from character strings by two characteristics: First, binary strings specifically allow storing octets of zero value and other non-printable octets. Second, operations on binary strings process the actual bytes, whereas the encoding and processing of character strings depends on locale settings.

When entering bytea values, octets of certain values must be escaped (but all octet values may be escaped) when used as part of a string literal in an SQL statement. In general, to escape an octet, it is converted into the three-digit octal number equivalent of its decimal octet value, and preceded by two backslashes. Some octet values have alternate escape sequences, as shown in the table below.

bytea Literal Escaped Octets

Decimal Octet Value | Description  | Input Escaped Representation | Example                | Printed Result
--------------------+--------------+------------------------------+------------------------+----------------
0                   | zero octet   | '\\000'                      | SELECT '\\000'::bytea; | \000
39                  | single quote | '\'' or '\\047'              | SELECT '\''::bytea;    | '
92                  | backslash    | '\\\\' or '\\134'            | SELECT '\\\\'::bytea;  | \\

Note that the result in each of the examples in the table above was exactly one octet in length, even though the output representation of the zero octet and backslash are more than one character.

Bytea output octets are also escaped. In general, each non-printable octet decimal value is converted into its equivalent three digit octal value, and preceded by one backslash. Most printable octets are represented by their standard representation in the client character set. The octet with decimal value 92 (backslash) has a special alternate output representation. Details are in the table below.

bytea Output Escaped Octets

Decimal Octet Value    | Description          | Output Escaped Representation | Example                | Printed Result
-----------------------+----------------------+-------------------------------+------------------------+----------------
92                     | backslash            | \\                            | SELECT '\\134'::bytea; | \\
0 to 31 and 127 to 255 | non-printable octets | \### (octal value)            | SELECT '\\001'::bytea; | \001
32 to 126              | printable octets     | ASCII representation          | SELECT '\\176'::bytea; | ~
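
The claim that each input example denotes exactly one octet can be checked with octet_length. This is only a sketch: it assumes the string-literal parser treats a backslash as an escape character, as this section describes, and a server configured to parse literals differently will report other lengths.

```sql
-- Under the assumption above, each literal denotes a single octet,
-- so each of the three columns is expected to be 1.
SELECT octet_length('\\000'::bytea) AS zero_octet,
       octet_length('\''::bytea)    AS single_quote,
       octet_length('\\\\'::bytea)  AS backslash;
```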

To use the bytea escaped octet notation, string literals (input strings) must contain two backslashes because they must pass through two parsers in the PostgreSQL server. The first backslash is interpreted as an escape character by the string-literal parser, and therefore is consumed, leaving the characters that follow. The remaining backslash is recognized by the bytea input function as the prefix of a three digit octal value. For example, a string literal passed to the backend as '\\001' becomes '\001' after passing through the string-literal parser. The '\001' is then sent to the bytea input function, where it is converted to a single octet with a decimal value of 1.

For a similar reason, a backslash must be input as '\\\\' (or '\\134'). The first and third backslashes are interpreted as escape characters by the string-literal parser, and therefore are consumed, leaving two backslashes in the string passed to the bytea input function, which interprets them as representing a single backslash. For example, a string literal passed to the server as '\\\\' becomes '\\' after passing through the string-literal parser. The '\\' is then sent to the bytea input function, where it is converted to a single octet with a decimal value of 92.

A single quote is a bit different in that it must be input as '\'' (or '\\047'), not as '\\''. This is because, while the literal parser interprets the single quote as a special character, and will consume the single backslash, the bytea input function does not recognize a single quote as a special octet. Therefore a string literal passed to the backend as '\'' becomes ''' after passing through the string-literal parser. The ''' is then sent to the bytea input function, where it retains its single octet decimal value of 39.

Depending on the front end to PostgreSQL you use, you may have additional work to do in terms of escaping and unescaping bytea strings.
For example, you may also have to escape line feeds and carriage returns if your interface automatically translates these. Or you may have to double up on backslashes if the parser for your language of choice also treats them as an escape character.

The SQL standard defines a different binary string type, called BLOB or BINARY LARGE OBJECT. The input format is different compared to bytea, but the provided functions and operators are mostly the same.

Date/Time Types

PostgreSQL supports the full set of SQL date and time types, shown in the table below.

Date/Time Types

Type                                    | Description        | Storage  | Earliest         | Latest          | Resolution
----------------------------------------+--------------------+----------+------------------+-----------------+---------------------------
timestamp [ (p) ] [ without time zone ] | both date and time | 8 bytes  | 4713 BC          | AD 1465001      | 1 microsecond / 14 digits
timestamp [ (p) ] with time zone        | both date and time | 8 bytes  | 4713 BC          | AD 1465001      | 1 microsecond / 14 digits
interval [ (p) ]                        | time intervals     | 12 bytes | -178000000 years | 178000000 years | 1 microsecond
date                                    | dates only         | 4 bytes  | 4713 BC          | 32767 AD        | 1 day
time [ (p) ] [ without time zone ]      | times of day only  | 8 bytes  | 00:00:00.00      | 23:59:59.99     | 1 microsecond
time [ (p) ] with time zone             | times of day only  | 12 bytes | 00:00:00.00+12   | 23:59:59.99-12  | 1 microsecond

time, timestamp, and interval accept an optional precision value p which specifies the number of fractional digits retained in the seconds field. By default, there is no explicit bound on precision. The allowed range of p is from 0 to 6 for the timestamp and interval types, 0 to 13 for the time types.

When timestamp values are stored as double precision floating-point numbers (currently the default), the effective limit of precision may be less than 6, since timestamp values are stored as seconds since 2000-01-01. Microsecond precision is achieved for dates within a few years of 2000-01-01, but the precision degrades for dates further away. When timestamps are stored as eight-byte integers (a compile-time option), microsecond precision is available over the full range of values.

Time zones, and time-zone conventions, are influenced by political decisions, not just earth geometry. Time zones around the world became somewhat standardized during the 1900s, but continue to be prone to arbitrary changes. PostgreSQL uses your operating system's underlying features to provide output time-zone support, and these systems usually contain information for only the time period 1902 through 2038 (corresponding to the full range of conventional Unix system time). timestamp with time zone and time with time zone will use time zone information only within that year range, and assume that times outside that range are in UTC.

The type time with time zone is defined by the SQL standard, but the definition exhibits properties which lead to questionable usefulness. In most cases, a combination of date, time, timestamp without time zone and timestamp with time zone should provide a complete range of date/time functionality required by any application.

The types abstime and reltime are lower precision types which are used internally. You are discouraged from using these types in new applications and are encouraged to move any old ones over when appropriate.
Any or all of these internal types might disappear in a future release.

Date/Time Input

Date and time input is accepted in almost any reasonable format, including ISO 8601, SQL-compatible, traditional PostgreSQL, and others. For some formats, ordering of month and day in date input can be ambiguous and there is support for specifying the expected ordering of these fields. The command SET DateStyle TO 'US' or SET DateStyle TO 'NonEuropean' specifies the variant month before day, the command SET DateStyle TO 'European' sets the variant day before month.

PostgreSQL is more flexible in handling date/time than the SQL standard requires. The exact parsing rules of date/time input and the recognized text fields, including months, days of the week, and time zones, are documented elsewhere in this manual.

Remember that any date or time literal input needs to be enclosed in single quotes, like text strings. SQL requires the following syntax

    type [ (p) ] 'value'

where p in the optional precision specification is an integer corresponding to the number of fractional digits in the seconds field. Precision can be specified for time, timestamp, and interval types.

Dates

The table below shows some possible inputs for the date type.

Date Input

Example          | Description
-----------------+------------------------------------------
January 8, 1999  | unambiguous
1999-01-08       | ISO-8601 format, preferred
1/8/1999         | U.S.; read as August 1 in European mode
8/1/1999         | European; read as August 1 in U.S. mode
1/18/1999        | U.S.; read as January 18 in any mode
19990108         | ISO-8601 year, month, day
990108           | ISO-8601 year, month, day
1999.008         | year and day of year
99008            | year and day of year
J2451187         | Julian day
January 8, 99 BC | year 99 before the Common Era
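
A short sketch of how the DateStyle setting changes the reading of an ambiguous date, using the SET commands and the first rows of the table above. The column alias same_day is made up for the example.

```sql
SET DateStyle TO 'US';
SELECT date '1/8/1999';     -- month before day: January 8, 1999

SET DateStyle TO 'European';
SELECT date '1/8/1999';     -- day before month: August 1, 1999

-- Unambiguous inputs are read the same way under either setting,
-- so this comparison is true.
SELECT date '1999-01-08' = date 'January 8, 1999' AS same_day;
```

Because the same literal yields two different dates depending on a session setting, the ISO 8601 form is the safer choice in scripts that may run under either convention.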

Times

The time type can be specified as time or as time without time zone. The optional precision p should be between 0 and 13, and defaults to the precision of the input time literal.

The table below shows the valid time inputs.

Time Input

Example      | Description
-------------+-----------------------------------------
04:05:06.789 | ISO 8601
04:05:06     | ISO 8601
04:05        | ISO 8601
040506       | ISO 8601
04:05 AM     | same as 04:05; AM does not affect value
04:05 PM     | same as 16:05; input hour must be <= 12
allballs     | same as 00:00:00

The type time with time zone accepts all input also legal for the time type, appended with a legal time zone, as shown in the table below.

Time With Time Zone Input

Example        | Description
---------------+-------------
04:05:06.789-8 | ISO 8601
04:05:06-08:00 | ISO 8601
04:05-08:00    | ISO 8601
040506-08      | ISO 8601

Refer to the Time Zone Input table below for more examples of time zones.

Time stamps

The time stamp types are timestamp [ (p) ] without time zone and timestamp [ (p) ] with time zone. Writing just timestamp is equivalent to timestamp without time zone.

Prior to PostgreSQL 7.3, writing just timestamp was equivalent to timestamp with time zone. This was changed for SQL spec compliance.

Valid input for the time stamp types consists of a concatenation of a date and a time, followed by an optional AD or BC, followed by an optional time zone. (See the Time Zone Input table below.) Thus

    1999-01-08 04:05:06

and

    1999-01-08 04:05:06 -8:00

are valid values, which follow the ISO 8601 standard. In addition, the wide-spread format

    January 8 04:05:06 1999 PST

is supported.

The optional precision p should be between 0 and 6, and defaults to the precision of the input timestamp literal.

For timestamp without time zone, any explicit time zone specified in the input is silently ignored. That is, the resulting date/time value is derived from the explicit date/time fields in the input value, and is not adjusted for time zone.

For timestamp with time zone, the internally stored value is always in UTC (GMT). An input value that has an explicit time zone specified is converted to UTC using the appropriate offset for that time zone. If no time zone is stated in the input string, then it is assumed to be in the time zone indicated by the system's TimeZone parameter, and is converted to UTC using the offset for the TimeZone zone.

When a timestamp with time zone value is output, it is always converted from UTC to the current TimeZone zone, and displayed as local time in that zone. To see the time in another time zone, either change TimeZone or use the AT TIME ZONE construct.
Conversions between timestamp without time zone and timestamp with time zone normally assume that the timestamp without time zone value should be taken or given as TimeZone local time. A different zone reference can be specified for the conversion using AT TIME ZONE.

Time Zone Input

Time Zone | Description
----------+-------------------------
PST       | Pacific Standard Time
-8:00     | ISO-8601 offset for PST
-800      | ISO-8601 offset for PST
-8        | ISO-8601 offset for PST
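
The input and output rules for the two time stamp types can be tried with the literals used above. This is a sketch, not a complete treatment: it sets the session time zone to UTC so that the conversion is easy to follow, and it assumes the zone names UTC and PST are known to the server.

```sql
SET TIME ZONE 'UTC';

-- The explicit offset is used to convert the input to UTC; the value is
-- then displayed in the session's TimeZone (here 12:05:06 on the same day).
SELECT TIMESTAMP WITH TIME ZONE '1999-01-08 04:05:06 -8:00';

-- For timestamp without time zone the offset is silently ignored,
-- so the time fields stay at 04:05:06.
SELECT TIMESTAMP '1999-01-08 04:05:06 -8:00';

-- AT TIME ZONE shows the stored instant as local time in another zone
-- without changing the session setting.
SELECT TIMESTAMP WITH TIME ZONE '1999-01-08 04:05:06 -8:00' AT TIME ZONE 'PST';
```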

Intervals

interval values can be written with the following syntax:

    Quantity Unit [Quantity Unit...] [Direction]
    @ Quantity Unit [Quantity Unit...] [Direction]

where: Quantity is a number (possibly signed), Unit is second, minute, hour, day, week, month, year, decade, century, millennium, or abbreviations or plurals of these units; Direction can be ago or empty. The at sign (@) is optional noise. The amounts of different units are implicitly added up with appropriate sign accounting.

Quantities of days, hours, minutes, and seconds can be specified without explicit unit markings. For example, '1 12:59:10' is read the same as '1 day 12 hours 59 min 10 sec'.

The optional precision p should be between 0 and 6, and defaults to the precision of the input literal.

Special values

The following SQL-compatible functions can be used as date or time values for the corresponding data type: CURRENT_DATE, CURRENT_TIME, CURRENT_TIMESTAMP. The latter two accept an optional precision specification.

PostgreSQL also supports several special date/time input values for convenience, as shown in the table below. The values infinity and -infinity are specially represented inside the system and will be displayed the same way; but the others are simply notational shorthands that will be converted to ordinary date/time values when read.

Special Date/Time Inputs

Input string      | Description
------------------+----------------------------------------------------------------
epoch             | 1970-01-01 00:00:00+00 (Unix system time zero)
infinity          | later than all other timestamps (not available for type date)
-infinity         | earlier than all other timestamps (not available for type date)
now               | current transaction time
today             | midnight today
tomorrow          | midnight tomorrow
yesterday         | midnight yesterday
zulu, allballs, z | 00:00:00.00 GMT
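
The interval syntax and the special input strings can be exercised with a few comparisons. The column aliases are made up for this sketch; each query is written so that its result does not depend on the output style in effect.

```sql
-- Both spellings describe the same interval, so this is true.
SELECT interval '1 12:59:10'
       = interval '1 day 12 hours 59 min 10 sec' AS same_interval;

-- The leading @ is noise and "ago" reverses the direction, so this is true.
SELECT interval '@ 3 days ago' = interval '-3 days' AS same_direction;

-- today and tomorrow are shorthands converted to ordinary dates when read;
-- they are one day apart.
SELECT date 'tomorrow' - date 'today' AS days_apart;

-- infinity sorts later than every other timestamp, so this is true.
SELECT timestamp 'infinity' > timestamp 'now' AS later;
```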

Date/Time Output

Output formats can be set to one of the four styles ISO 8601, SQL (Ingres), traditional PostgreSQL, and German, using the SET DateStyle command. The default is the ISO format. (The SQL standard requires the use of the ISO 8601 format. The name of the SQL output format is a historical accident.) The table below shows examples of each output style. The output of the date and time types is of course only the date or time part in accordance with the given examples.

Date/Time Output Styles

Style Specification | Description           | Example
--------------------+-----------------------+-----------------------------
ISO                 | ISO 8601/SQL standard | 1997-12-17 07:37:16-08
SQL                 | traditional style     | 12/17/1997 07:37:16.00 PST
PostgreSQL          | original style        | Wed Dec 17 07:37:16 1997 PST
German              | regional style        | 17.12.1997 07:37:16.00 PST

The SQL style has European and non-European (U.S.) variants, which determine whether month follows day or vice versa. (See the Date/Time Input section for how this setting also affects interpretation of input values.) The table below shows an example.

Date Order Conventions

Style Specification | Description    | Example
--------------------+----------------+----------------------------
European            | day/month/year | 17/12/1997 15:37:16.00 MET
US                  | month/day/year | 12/17/1997 07:37:16.00 PST
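
To compare the output styles on one value, the style can be switched between queries as sketched below. The exact text printed depends on the session's time zone and on the release, so no output is reproduced here; the tables above show the general shape to expect.

```sql
SET DateStyle TO 'ISO';
SELECT timestamp with time zone '1997-12-17 07:37:16-08';

SET DateStyle TO 'SQL';        -- switches the output style
SET DateStyle TO 'European';   -- day before month within that style
SELECT timestamp with time zone '1997-12-17 07:37:16-08';

SET DateStyle TO 'German';
SELECT timestamp with time zone '1997-12-17 07:37:16-08';
```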

interval output looks like the input format, except that units like week or century are converted to years and days. In ISO mode the output looks like

    [ Quantity Units [ ... ] ] [ Days ] Hours:Minutes [ ago ]

The date/time styles can be selected by the user using the SET DATESTYLE command, the datestyle parameter in the postgresql.conf configuration file, and the PGDATESTYLE environment variable on the server or client. The formatting function to_char is also available as a more flexible way to format the date/time output.

Time Zones

PostgreSQL endeavors to be compatible with the SQL standard definitions for typical usage. However, the SQL standard has an odd mix of date and time types and capabilities. Two obvious problems are:

  Although the date type does not have an associated time zone, the time type can. Time zones in the real world can have no meaning unless associated with a date as well as a time since the offset may vary through the year with daylight-saving time boundaries.

  The default time zone is specified as a constant integer offset from GMT/UTC. It is not possible to adapt to daylight-saving time when doing date/time arithmetic across DST boundaries.

To address these difficulties, we recommend using date/time types that contain both date and time when using time zones. We recommend not using the type time with time zone (though it is supported by PostgreSQL for legacy applications and for compatibility with other SQL implementations). PostgreSQL assumes your local time zone for any type containing only date or time. Further, time zone support is derived from the underlying operating system time-zone capabilities, and hence can handle daylight-saving time and other expected behavior.

PostgreSQL obtains time-zone support from the underlying operating system for dates between 1902 and 2038 (near the typical date limits for Unix-style systems).
Outside of this range, all dates are assumed to be specified and used in Coordinated Universal Time (UTC).

All dates and times are stored internally in UTC, traditionally known as Greenwich Mean Time (GMT). Times are converted to local time on the database server before being sent to the client frontend, and hence by default are in the server time zone.

There are several ways to select the time zone used by the server:

  * The TZ environment variable on the server host is used by the server as the default time zone, if no other is specified.

  * The timezone configuration parameter can be set in postgresql.conf.

  * The PGTZ environment variable, if set at the client, is used by libpq applications to send a SET TIME ZONE command to the server upon connection.

  * The SQL command SET TIME ZONE sets the time zone for the session.

If an invalid time zone is specified, the time zone becomes UTC (on most systems anyway). Refer to for a list of available time zones.

Internals

PostgreSQL uses Julian dates for all date/time calculations. They have the nice property of correctly predicting/calculating any date more recent than 4713 BC to far into the future, using the assumption that the length of the year is 365.2425 days. Date conventions before the 19th century make for interesting reading, but are not consistent enough to warrant coding into a date/time handler.

Boolean Type

PostgreSQL provides the standard SQL type boolean. boolean can have one of only two states: true or false. A third state, unknown, is represented by the SQL null value.

Valid literal values for the true state are:

    TRUE  't'  'true'  'y'  'yes'  '1'

For the false state, the following values can be used:

    FALSE  'f'  'false'  'n'  'no'  '0'

Using the key words TRUE and FALSE is preferred (and SQL-compliant).
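The literal spellings listed above can be tried out with explicit casts. This is only a small sketch; the column aliases are arbitrary names chosen for the illustration.

```sql
-- Each quoted literal is converted by the boolean input routine.
SELECT 't'::boolean   AS lit_t,
       'yes'::boolean AS lit_yes,
       '1'::boolean   AS lit_one,
       'f'::boolean   AS lit_f,
       'no'::boolean  AS lit_no,
       '0'::boolean   AS lit_zero;

-- The preferred, SQL-compliant key words need no quotes or cast.
SELECT TRUE AS kw_true, FALSE AS kw_false;
```

The first three columns of the first query should display as t and the last three as f.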
Using the boolean type

    CREATE TABLE test1 (a boolean, b text);
    INSERT INTO test1 VALUES (TRUE, 'sic est');
    INSERT INTO test1 VALUES (FALSE, 'non est');
    SELECT * FROM test1;
     a |    b
    ---+---------
     t | sic est
     f | non est

    SELECT * FROM test1 WHERE a;
     a |    b
    ---+---------
     t | sic est

The example above shows that boolean values are output using the letters t and f.

Values of the boolean type cannot be cast directly to other types (e.g., CAST (boolval AS integer) does not work). Such a conversion can be accomplished using the CASE expression: CASE WHEN boolval THEN 'value if true' ELSE 'value if false' END. See also .

boolean uses 1 byte of storage.

Geometric Types

Geometric data types represent two-dimensional spatial objects. The table below shows the geometric types available in PostgreSQL. The most fundamental type, the point, forms the basis for all of the other types.

Geometric Types

Geometric Type | Storage      | Representation    | Description
---------------+--------------+-------------------+---------------------------------------
point          | 16 bytes     | (x,y)             | Point in space
line           | 32 bytes     | ((x1,y1),(x2,y2)) | Infinite line (not fully implemented)
lseg           | 32 bytes     | ((x1,y1),(x2,y2)) | Finite line segment
box            | 32 bytes     | ((x1,y1),(x2,y2)) | Rectangular box
path           | 16+16n bytes | ((x1,y1),...)     | Closed path (similar to polygon)
path           | 16+16n bytes | [(x1,y1),...]     | Open path
polygon        | 40+16n bytes | ((x1,y1),...)     | Polygon (similar to closed path)
circle         | 24 bytes     | <(x,y),r>         | Circle (center and radius)
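As a quick sketch of the representations in the table, each of the implemented geometric types can be written as a quoted literal and cast to the type. The coordinates and column aliases are arbitrary values chosen for the illustration; line is left out because the table marks it as not fully implemented.

```sql
SELECT '(1,2)'::point                 AS a_point,
       '((0,0),(2,2))'::lseg          AS a_segment,
       '((0,0),(2,2))'::box           AS a_box,
       '((0,0),(2,0),(2,2))'::path    AS closed_path,  -- parentheses: closed
       '[(0,0),(2,0),(2,2)]'::path    AS open_path,    -- brackets: open
       '((0,0),(2,0),(2,2))'::polygon AS a_polygon,
       '<(0,0),2>'::circle            AS a_circle;
```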

A rich set of functions and operators is available to perform various geometric operations such as scaling, translation, rotation, and determining intersections. They are explained in .

Point

Points are the fundamental two-dimensional building block for geometric types. point is specified using the following syntax:

    ( x , y )
      x , y

where the arguments are

    x    the x-axis coordinate as a floating-point number
    y    the y-axis coordinate as a floating-point number

Line Segment

Line segments (lseg) are represented by pairs of points. lseg is specified using the following syntax:

    ( ( x1 , y1 ) , ( x2 , y2 ) )
      ( x1 , y1 ) , ( x2 , y2 )
        x1 , y1   ,   x2 , y2

where the arguments are

    (x1,y1), (x2,y2)    the end points of the line segment

Box

Boxes are represented by pairs of points that are opposite corners of the box. box is specified using the following syntax:

    ( ( x1 , y1 ) , ( x2 , y2 ) )
      ( x1 , y1 ) , ( x2 , y2 )
        x1 , y1   ,   x2 , y2

where the arguments are

    (x1,y1), (x2,y2)    opposite corners of the box

Boxes are output using the first syntax. The corners are reordered on input to store the upper right corner, then the lower left corner. Any pair of opposite corners can be entered; the lower left and upper right corners are determined from the input and stored.

Path

Paths are represented by connected sets of points. Paths can be open, where the first and last points in the set are not connected, or closed, where the first and last points are connected. Functions popen(p) and pclose(p) are supplied to force a path to be open or closed, and functions isopen(p) and isclosed(p) are supplied to test for either type in a query.

path is specified using the following syntax:

    ( ( x1 , y1 ) , ... , ( xn , yn ) )
    [ ( x1 , y1 ) , ... , ( xn , yn ) ]
      ( x1 , y1 ) , ... , ( xn , yn )
      ( x1 , y1   , ... ,   xn , yn )
        x1 , y1   , ... ,   xn , yn

where the arguments are

    (x,y)    End points of the line segments comprising the path.
A leading square bracket ([) indicates an open path, while a leading parenthesis (() indicates a closed path.

Paths are output using the first syntax.

Polygon

Polygons are represented by sets of points. Polygons are conceptually equivalent to closed paths, but are stored differently and have their own set of support routines.

polygon is specified using the following syntax:

    ( ( x1 , y1 ) , ... , ( xn , yn ) )
      ( x1 , y1 ) , ... , ( xn , yn )
      ( x1 , y1   , ... ,   xn , yn )
        x1 , y1   , ... ,   xn , yn

where the arguments are

    (x,y)    End points of the line segments comprising the boundary of the polygon

Polygons are output using the first syntax.

Circle

Circles are represented by a center point and a radius. circle is specified using the following syntax:

    < ( x , y ) , r >
    ( ( x , y ) , r )
      ( x , y ) , r
        x , y   , r

where the arguments are

    (x,y)    center of the circle
    r        radius of the circle

Circles are output using the first syntax.

Network Address Data Types

PostgreSQL offers data types to store IP and MAC addresses, shown in the table below. It is preferable to use these types over plain text types, because these types offer input error checking and several specialized operators and functions.

Network Address Data Types

Name    | Storage  | Description           | Range
--------+----------+-----------------------+------------------------------
cidr    | 12 bytes | IP networks           | valid IPv4 networks
inet    | 12 bytes | IP hosts and networks | valid IPv4 hosts or networks
macaddr | 6 bytes  | MAC addresses         | customary formats
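The benefit of input error checking can be seen with a small table that uses all three types. This is a sketch only: the table hosts, its column names, and the address values are hypothetical and chosen purely for illustration.

```sql
-- Hypothetical table using the three network address types.
CREATE TABLE hosts (
    hostname text,
    addr     inet,     -- IP host address
    net      cidr,     -- IP network
    mac      macaddr   -- hardware (MAC) address
);

INSERT INTO hosts
    VALUES ('gateway', '192.168.100.1', '192.168.100.0/24', '08:00:2b:01:02:03');

-- This row should be rejected by the inet input routine, because 300 is
-- not a valid octet; a plain text column would have accepted it silently.
-- INSERT INTO hosts
--     VALUES ('broken', '192.168.100.300', '192.168.100.0/24', '08:00:2b:01:02:03');
```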

IPv6 is not yet supported.

inet

The inet type holds an IP host address, and optionally the identity of the subnet it is in, all in one field. The subnet identity is represented by the number of bits in the network part of the address (the netmask). If the netmask is 32, then the value does not indicate a subnet, only a single host. Note that if you want to accept networks only, you should use the cidr type rather than inet.

The input format for this type is x.x.x.x/y where x.x.x.x is an IP address and y is the number of bits in the netmask. If the /y part is left off, then the netmask is 32, and the value represents just a single host. On display, the /y portion is suppressed if the netmask is 32.

cidr

The cidr type holds an IP network specification. Input and output formats follow Classless Inter-Domain Routing (CIDR) conventions. The format for specifying classless networks is x.x.x.x/y where x.x.x.x is the network and y is the number of bits in the netmask. If y is omitted, it is calculated using assumptions from the older classful numbering system, except that it will be at least large enough to include all of the octets written in the input.

The table below shows some examples.

cidr Type Input Examples

CIDR Input         | CIDR Displayed     | abbrev(CIDR)
-------------------+--------------------+--------------------
192.168.100.128/25 | 192.168.100.128/25 | 192.168.100.128/25
192.168/24         | 192.168.0.0/24     | 192.168.0/24
192.168/25         | 192.168.0.0/25     | 192.168.0.0/25
192.168.1          | 192.168.1.0/24     | 192.168.1/24
192.168            | 192.168.0.0/24     | 192.168.0/24
128.1              | 128.1.0.0/16       | 128.1/16
128                | 128.0.0.0/16       | 128.0/16
128.1.2            | 128.1.2.0/24       | 128.1.2/24
10.1.2             | 10.1.2.0/24        | 10.1.2/24
10.1               | 10.1.0.0/16        | 10.1/16
10                 | 10.0.0.0/8         | 10/8
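A few rows of the table can be reproduced directly by casting literals to cidr, as in the following sketch. The column aliases are arbitrary names chosen for the illustration.

```sql
SELECT '192.168.100.128/25'::cidr  AS explicit_mask,  -- netmask written out
       '192.168.1'::cidr           AS three_octets,   -- netmask inferred
       '128.1'::cidr               AS two_octets,     -- netmask inferred
       '10'::cidr                  AS one_octet,      -- netmask inferred
       abbrev('10.1.0.0/16'::cidr) AS abbreviated;    -- shortened display form
```

According to the table, the inferred values should display as 192.168.1.0/24, 128.1.0.0/16 and 10.0.0.0/8, and the abbreviated form as 10.1/16.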

inet vs cidr

The essential difference between inet and cidr data types is that inet accepts values with nonzero bits to the right of the netmask, whereas cidr does not.

If you do not like the output format for inet or cidr values, try the host(), text(), and abbrev() functions.

macaddr

The macaddr type stores MAC addresses, i.e., Ethernet card hardware addresses (although MAC addresses are used for other purposes as well). Input is accepted in various customary formats, including

    '08002b:010203'
    '08002b-010203'
    '0800.2b01.0203'
    '08-00-2b-01-02-03'
    '08:00:2b:01:02:03'

which would all specify the same address. Both upper and lower case are accepted for the digits a through f. Output is always in the last of the shown forms.

The directory contrib/mac in the PostgreSQL source distribution contains tools that can be used to map MAC addresses to hardware manufacturer names.

Bit String Types

Bit strings are strings of 1's and 0's. They can be used to store or visualize bit masks. There are two SQL bit types: BIT(n) and BIT VARYING(n), where n is a positive integer.

BIT type data must match the length n exactly; it is an error to attempt to store shorter or longer bit strings. BIT VARYING data is of variable length up to the maximum length n; longer strings will be rejected. Writing BIT without a length is equivalent to BIT(1), while BIT VARYING without a length specification means unlimited length.
If one explicitly casts a bit-string value to BIT(n), it will be truncated or zero-padded on the right to be exactly n bits, without raising an error. Similarly, if one explicitly casts a bit-string value to BIT VARYING(n), it will be truncated on the right if it is more than n bits.

Prior to PostgreSQL 7.2, BIT data was always silently truncated or zero-padded on the right, with or without an explicit cast. This was changed to comply with the SQL standard.

Refer to for information about the syntax of bit string constants. Bit-logical operators and string manipulation functions are available; see .

Using the bit string types

    CREATE TABLE test (a BIT(3), b BIT VARYING(5));
    INSERT INTO test VALUES (B'101', B'00');
    INSERT INTO test VALUES (B'10', B'101');
    ERROR:  Bit string length 2 does not match type BIT(3)
    INSERT INTO test VALUES (B'10'::bit(3), B'101');
    SELECT * FROM test;
      a  |  b
    -----+-----
     101 | 00
     100 | 101

Object Identifier Types

Object identifiers (OIDs) are used internally by PostgreSQL as primary keys for various system tables. Also, an OID system column is added to user-created tables (unless WITHOUT OIDS is specified at table creation time). Type oid represents an object identifier. There are also several aliases for oid: regproc, regprocedure, regoper, regoperator, regclass, and regtype. The table below shows an overview.

The oid type is currently implemented as an unsigned four-byte integer. Therefore, it is not large enough to provide database-wide uniqueness in large databases, or even in large individual tables. So, using a user-created table's OID column as a primary key is discouraged. OIDs are best used only for references to system tables.

The oid type itself has few operations beyond comparison (which is implemented as unsigned comparison). It can be cast to integer, however, and then manipulated using the standard integer operators.
(Beware of possible signed-versus-unsigned confusion if you do this.)

The oid alias types have no operations of their own except for specialized input and output routines. These routines are able to accept and display symbolic names for system objects, rather than the raw numeric value that type oid would use. The alias types allow simplified lookup of OID values for objects: for example, one may write 'mytable'::regclass to get the OID of table mytable, rather than SELECT oid FROM pg_class WHERE relname = 'mytable'. (In reality, a much more complicated SELECT would be needed to deal with selecting the right OID when there are multiple tables named mytable in different schemas.)

Object Identifier Types

Type name    | References  | Description                  | Value example
-------------+-------------+------------------------------+--------------------------------------
oid          | any         | numeric object identifier    | 564182
regproc      | pg_proc     | function name                | sum
regprocedure | pg_proc     | function with argument types | sum(int4)
regoper      | pg_operator | operator name                | +
regoperator  | pg_operator | operator with argument types | *(integer,integer) or -(NONE,integer)
regclass     | pg_class    | relation name                | pg_type
regtype      | pg_type     | type name                    | integer
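The value examples in the table can be fed to the alias types directly. The following sketch uses only the system objects named in the table; the numeric OIDs it returns depend on the installation, so no particular values are shown.

```sql
-- Symbolic name to numeric OID, via the alias type's input routine.
SELECT CAST('pg_type'::regclass AS oid)              AS relation_oid,
       CAST('sum(int4)'::regprocedure AS oid)        AS function_oid,
       CAST('*(integer,integer)'::regoperator AS oid) AS operator_oid,
       CAST('integer'::regtype AS oid)               AS type_oid;

-- Numeric OID back to a symbolic name, via the output routine.
SELECT oid, CAST(oid AS regclass) AS relation_name
FROM pg_class
WHERE relname = 'pg_type';
```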

All of the OID alias types accept schema-qualified names, and will display schema-qualified names on output if the object would not be found in the current search path without being qualified. The regproc and regoper alias types will only accept input names that are unique (not overloaded), so they are of limited use; for most uses regprocedure or regoperator is more appropriate. For regoperator, unary operators are identified by writing NONE for the unused operand.

OIDs are 32-bit quantities and are assigned from a single cluster-wide counter. In a large or long-lived database, it is possible for the counter to wrap around. Hence, it is bad practice to assume that OIDs are unique, unless you take steps to ensure that they are unique. Recommended practice when using OIDs for row identification is to create a unique constraint on the OID column of each table for which the OID will be used. Never assume that OIDs are unique across tables; use the combination of tableoid and row OID if you need a database-wide identifier. (Future releases of PostgreSQL are likely to use a separate OID counter for each table, so that tableoid must be included to arrive at a globally unique identifier.)

Another identifier type used by the system is xid, or transaction (abbreviated xact) identifier. This is the data type of the system columns xmin and xmax. Transaction identifiers are 32-bit quantities. In a long-lived database it is possible for transaction IDs to wrap around. This is not a fatal problem given appropriate maintenance procedures; see the Administrator's Guide for details. However, it is unwise to depend on uniqueness of transaction IDs over the long term (more than one billion transactions).

A third identifier type used by the system is cid, or command identifier. This is the data type of the system columns cmin and cmax. Command identifiers are also 32-bit quantities. This creates a hard limit of 2^32 (about 4 billion) SQL commands within a single transaction.
In practice this limit is not a problem: note that the limit is on the number of SQL commands, not the number of tuples processed.

A final identifier type used by the system is tid, or tuple identifier. This is the data type of the system column ctid. A tuple ID is a pair (block number, tuple index within block) that identifies the physical location of the tuple within its table.

Pseudo-Types

The PostgreSQL type system contains a number of special-purpose entries that are collectively called pseudo-types. A pseudo-type cannot be used as a column data type, but it can be used to declare a function's argument or result type. Each of the available pseudo-types is useful in situations where a function's behavior does not correspond to simply taking or returning a value of a specific SQL data type. The table below lists the existing pseudo-types.

Pseudo-Types

Type name        | Description
-----------------+---------------------------------------------------------------------------
record           | Identifies a function returning an unspecified row type
any              | Indicates that a function accepts any input data type whatever
anyarray         | Indicates that a function accepts any array data type
void             | Indicates that a function returns no value
trigger          | A trigger function is declared to return trigger
language_handler | A procedural language call handler is declared to return language_handler
cstring          | Indicates that a function accepts or returns a null-terminated C string
internal         | Indicates that a function accepts or returns a server-internal data type
opaque           | An obsolete type name that formerly served all the above purposes

Functions coded in C (whether built-in or dynamically loaded) may be declared to accept or return any of these pseudo data types. It is up to the function author to ensure that the function will behave safely when a pseudo-type is used as an argument type.

Functions coded in procedural languages may use pseudo-types only as allowed by their implementation languages. At present the procedural languages all forbid use of a pseudo-type as argument type, and allow only void as a result type (plus trigger when the function is used as a trigger).

The internal pseudo-type is used to declare functions that are meant only to be called internally by the database system, and not by direct invocation in a SQL query. If a function has at least one internal-type argument then it cannot be called from SQL. To preserve the type safety of this restriction it is important to follow this coding rule: do not create any function that is declared to return internal unless it has at least one internal argument.
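As an illustration of the trigger pseudo-type in a procedural language, the sketch below declares a PL/pgSQL function that returns trigger and attaches it to a table. It assumes that PL/pgSQL has been installed in the database; the names pass_through, mytable_pass_through and mytable are hypothetical.

```sql
-- Hypothetical trigger function: takes no arguments, is declared to
-- return the pseudo-type trigger, and passes the new row through unchanged.
CREATE FUNCTION pass_through() RETURNS trigger AS '
BEGIN
    RETURN NEW;
END;
' LANGUAGE plpgsql;

-- Attach the function to the hypothetical table mytable.
CREATE TRIGGER mytable_pass_through
    BEFORE INSERT OR UPDATE ON mytable
    FOR EACH ROW EXECUTE PROCEDURE pass_through();
```

Because its result type is trigger, such a function is meant to be invoked by the trigger mechanism rather than called directly in a query.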