1 files changed, 94 insertions, 65 deletions
diff --git a/doc/src/sgml/page.sgml b/doc/src/sgml/page.sgml
index 7551085dc94..d7096a4bbe1 100644
--- a/doc/src/sgml/page.sgml
+++ b/doc/src/sgml/page.sgml
@@ -4,13 +4,17 @@
 
 <abstract>
 <para>
-A description of the database file default page format.
+A description of the database file page format.
 </para>
 </abstract>
 
 <para>
-This section provides an overview of the page format used by <productname>PostgreSQL</productname>
-tables.  User-defined access methods need not use this page format.
+This section provides an overview of the page format used by
+<productname>PostgreSQL</productname> tables and indexes.  (Index
+access methods need not use this page format.  At present, all index
+methods do use this basic format, but the data kept on index metapages
+usually doesn't follow the item layout rules exactly.)  TOAST tables
+and sequences are formatted just like regular tables.
 </para>
 
 <para>
@@ -18,15 +22,13 @@ In the following explanation, a
 <firstterm>byte</firstterm>
 is assumed to contain 8 bits.  In addition, the term
 <firstterm>item</firstterm>
-refers to data that is stored in <productname>PostgreSQL</productname> tables.
+refers to an individual data value that is stored on a page.  In a table,
+an item is a tuple (row); in an index, an item is an index entry.
 </para>
 
 <para>
 
-<xref linkend="page-table"> shows how pages in both normal
- <productname>PostgreSQL</productname> tables and
- <productname>PostgreSQL</productname> indexes (e.g., a B-tree index)
-are structured. This structure is also used for toast tables and sequences.
+<xref linkend="page-table"> shows the basic layout of a page.
 There are five parts to each page.
 
 </para>
@@ -48,12 +50,13 @@ Item
 
 <row>
  <entry>PageHeaderData</entry>
- <entry>20 bytes long. Contains general information about the page to allow to access it.</entry>
+ <entry>20 bytes long. Contains general information about the page, including
+free space pointers.</entry>
 </row>
 
 <row>
-<entry>itemPointerData</entry>
-<entry>List of (offset,length) pairs pointing to the actual item.</entry>
+<entry>ItemPointerData</entry>
+<entry>Array of (offset,length) pairs pointing to the actual items.</entry>
 </row>
 
 <row>
@@ -62,13 +65,14 @@ Item
 </row>
 
 <row>
-<entry>items</entry>
-<entry>The actual items themselves. Different access method have different data here.</entry>
+<entry>Items</entry>
+<entry>The actual items themselves.</entry>
 </row>
 
 <row>
 <entry>Special Space</entry>
-<entry>Access method specific data. Different method store different data. Unused by normal tables.</entry>
+<entry>Index access method specific data. Different methods store different
+data. Empty in ordinary tables.</entry>
 </row>
 
 </tbody>
@@ -78,11 +82,12 @@ Item
  <para>
 
   The first 20 bytes of each page consists of a page header
-  (PageHeaderData). It's format is detailed in <xref
+  (PageHeaderData). Its format is detailed in <xref
   linkend="pageheaderdata-table">. The first two fields deal with WAL
   related stuff. This is followed by three 2-byte integer fields
-  (<firstterm>lower</firstterm>, <firstterm>upper</firstterm>, and
-  <firstterm>special</firstterm>). These represent byte offsets to the start
+  (<structfield>pd_lower</structfield>, <structfield>pd_upper</structfield>,
+  and <structfield>pd_special</structfield>). These represent byte offsets to
+  the start
   of unallocated space, to the end of unallocated space, and to the start of
   the special space. 
   
@@ -104,7 +109,7 @@ Item
   <row>
    <entry>pd_lsn</entry>
    <entry>XLogRecPtr</entry>
-   <entry>6 bytes</entry>
+   <entry>8 bytes</entry>
    <entry>LSN: next byte after last byte of xlog</entry>
   </row>
   <row>
@@ -132,38 +137,51 @@ Item
    <entry>Offset to start of special space.</entry>
   </row>
   <row>
-   <entry>pd_opaque</entry>
-   <entry>OpaqueData</entry>
+   <entry>pd_pagesize_version</entry>
+   <entry>uint16</entry>
    <entry>2 bytes</entry>
-   <entry>AM-generic information. Currently just stores the page size.</entry>
+   <entry>Page size and layout version number information.</entry>
   </row>
  </tbody>
  </tgroup>
  </table>
 
+ <para>
+  All the details may be found in <filename>src/include/storage/bufpage.h</filename>.
+ </para>
+
  <para>  
   Special space is a region at the end of the page that is allocated at page
   initialization time and contains information specific to an access method. 
-  The last 2 bytes of the page header, <firstterm>opaque</firstterm>,
-  currently only stores the page size.  Page size is stored in each page
-  because frames in the buffer pool may be subdivided into equal sized pages
-  on a frame by frame basis within a table (is this true? - mvo).
-
+  The last 2 bytes of the page header,
+  <structfield>pd_pagesize_version</structfield>, store both the page size
+  and a version indicator.  Beginning with
+  <productname>PostgreSQL</productname> 7.3 the version number is 1; prior
+  releases used version number 0.  (The basic page layout and header format
+  have not changed, but the layout of heap tuple headers has.)  The page size
+  is present only as a cross-check; there is no support for having
+  more than one page size in an installation.
  </para>
 
  <para>
 
   Following the page header are item identifiers
-  (<firstterm>ItemIdData</firstterm>).  New item identifiers are allocated
-  from the first four bytes of unallocated space.  Because an item
-  identifier is never moved until it is freed, its index may be used to
-  indicate the location of an item on a page.  In fact, every pointer to an
-  item (<firstterm>ItemPointer</firstterm>, also know as
-  <firstterm>CTID</firstterm>) created by
-  <productname>PostgreSQL</productname> consists of a frame number and an
-  index of an item identifier.  An item identifier contains a byte-offset to
+  (<type>ItemIdData</type>), each requiring four bytes.
+  An item identifier contains a byte-offset to
   the start of an item, its length in bytes, and a set of attribute bits
   which affect its interpretation.
+  New item identifiers are allocated
+  as needed from the beginning of the unallocated space.
+  The number of item identifiers present can be determined by looking at
+  <structfield>pd_lower</>, which is increased to allocate a new identifier.
+  Because an item
+  identifier is never moved until it is freed, its index may be used on a
+  long-term basis to reference an item, even when the item itself is moved
+  around on the page to compact free space.  In fact, every pointer to an
+  item (<type>ItemPointer</type>, also known as
+  <type>CTID</type>) created by
+  <productname>PostgreSQL</productname> consists of a page number and the
+  index of an item identifier.
 
  </para>
 
@@ -171,8 +189,8 @@ Item
  
   The items themselves are stored in space allocated backwards from the end
   of unallocated space.  The exact structure varies depending on what the
-  table is to contain. Sequences and tables both use a structure named
-  <firstterm>HeapTupleHeaderData</firstterm>, describe below.
+  table is to contain. Tables and sequences both use a structure named
+  <type>HeapTupleHeaderData</type>, described below.
 
  </para>
  
@@ -180,20 +198,33 @@ Item
  
   The final section is the "special section" which may contain anything the
   access method wishes to store. Ordinary tables do not use this at all
-  (indicated by setting the offset to the pagesize).
+  (indicated by setting <structfield>pd_special</> to equal the pagesize).
   
  </para>
  
  <para>
 
-  All tuples are structured the same way. A header of around 31 bytes
-  followed by an optional null bitmask and the data. The header is detailed
-  below in <xref linkend="heaptupleheaderdata-table">.  The null bitmask is
-  only present if the <firstterm>HEAP_HASNULL</firstterm> bit is set in the
-  <firstterm>t_infomask</firstterm>. If it is present it takes up the space
-  between the end of the header and the beginning of the data, as indicated
-  by the <firstterm>t_hoff</firstterm> field. In this list of bits, a 1 bit
-  indicates not-null, a 0 bit is a null.
+  All table tuples are structured the same way. There is a fixed-size
+  header (occupying 23 bytes on most machines), followed by an optional null
+  bitmap, an optional object ID field, and the user data. The header is
+  detailed
+  in <xref linkend="heaptupleheaderdata-table">.  The actual user data
+  (fields of the tuple) begins at the offset indicated by
+  <structfield>t_hoff</>, which must always be a multiple of the MAXALIGN
+  distance for the platform.
+  The null bitmap is
+  only present if the <firstterm>HEAP_HASNULL</firstterm> bit is set in
+  <structfield>t_infomask</structfield>. If it is present it begins just after
+  the fixed header and occupies enough bytes to have one bit per data column
+  (that is, <structfield>t_natts</> bits altogether). In this list of bits, a
+  1 bit indicates not-null, a 0 bit indicates null.  When the bitmap is not
+  present, all columns are assumed not-null.
+  The object ID is only present if the <firstterm>HEAP_HASOID</firstterm> bit
+  is set in <structfield>t_infomask</structfield>.  If present, it appears just
+  before the <structfield>t_hoff</> boundary.  Any padding needed to make
+  <structfield>t_hoff</> a MAXALIGN multiple will appear between the null
+  bitmap and the object ID.  (This in turn ensures that the object ID is
+  suitably aligned.)
   
  </para>
  
@@ -211,34 +242,34 @@ Item
  </thead>
  <tbody>
   <row>
-   <entry>t_oid</entry>
-   <entry>Oid</entry>
+   <entry>t_xmin</entry>
+   <entry>TransactionId</entry>
    <entry>4 bytes</entry>
-   <entry>OID of this tuple</entry>
+   <entry>insert XID stamp</entry>
   </row>
   <row>
    <entry>t_cmin</entry>
    <entry>CommandId</entry>
    <entry>4 bytes</entry>
-   <entry>insert CID stamp</entry>
+   <entry>insert CID stamp (overlays with t_xmax)</entry>
   </row>
   <row>
-   <entry>t_cmax</entry>
-   <entry>CommandId</entry>
+   <entry>t_xmax</entry>
+   <entry>TransactionId</entry>
    <entry>4 bytes</entry>
-   <entry>delete CID stamp</entry>
+   <entry>delete XID stamp</entry>
   </row>
   <row>
-   <entry>t_xmin</entry>
-   <entry>TransactionId</entry>
+   <entry>t_cmax</entry>
+   <entry>CommandId</entry>
    <entry>4 bytes</entry>
-   <entry>insert XID stamp</entry>
+   <entry>delete CID stamp (overlays with t_xvac)</entry>
   </row>
   <row>
-   <entry>t_xmax</entry>
+   <entry>t_xvac</entry>
    <entry>TransactionId</entry>
    <entry>4 bytes</entry>
-   <entry>delete XID stamp</entry>
+   <entry>XID for VACUUM operation moving tuple</entry>
   </row>
   <row>
    <entry>t_ctid</entry>
@@ -256,30 +287,28 @@ Item
    <entry>t_infomask</entry>
    <entry>uint16</entry>
    <entry>2 bytes</entry>
-   <entry>Various flags</entry>
+   <entry>various flags</entry>
   </row>
   <row>
    <entry>t_hoff</entry>
    <entry>uint8</entry>
    <entry>1 byte</entry>
-   <entry>length of tuple header. Also offset of data.</entry>
+   <entry>offset to user data</entry>
   </row>
  </tbody>
  </tgroup>
  </table>
 
  <para>
- 
-  All the details may be found in src/include/storage/bufpage.h.
-  
+  All the details may be found in <filename>src/include/access/htup.h</filename>.
  </para>
 
  <para>
  
   Interpreting the actual data can only be done with information obtained
   from other tables, mostly <firstterm>pg_attribute</firstterm>. The
-  particular fields are <firstterm>attlen</firstterm> and
-  <firstterm>attalign</firstterm>. There is no way to directly get a
+  particular fields are <structfield>attlen</structfield> and
+  <structfield>attalign</structfield>. There is no way to directly get a
   particular attribute, except when there are only fixed width fields and no
   NULLs. All this trickery is wrapped up in the functions
   <firstterm>heap_getattr</firstterm>, <firstterm>fastgetattr</firstterm>
@@ -293,7 +322,7 @@ Item
   the next. Then make sure you have the right alignment.  If the field is a
   fixed width field, then all the bytes are simply placed. If it's a
   variable length field (attlen == -1) then it's a bit more complicated,
-  using the variable length structure <firstterm>varattrib</firstterm>.
+  using the variable length structure <type>varattrib</type>.
   Depending on the flags, the data may be either inline, compressed or in
   another table (TOAST).