Diffstat (limited to 'src')
-rw-r--r-- | src/backend/utils/mmgr/README | 318 |
1 files changed, 145 insertions, 173 deletions
diff --git a/src/backend/utils/mmgr/README b/src/backend/utils/mmgr/README
index f97d7653de0..b83b29c268f 100644
--- a/src/backend/utils/mmgr/README
+++ b/src/backend/utils/mmgr/README
@@ -1,15 +1,7 @@
 src/backend/utils/mmgr/README
 
-Notes About Memory Allocation Redesign
-======================================
-
-Up through version 7.0, Postgres had serious problems with memory leakage
-during large queries that process a lot of pass-by-reference data. There
-was no provision for recycling memory until end of query. This needed to be
-fixed, even more so with the advent of TOAST which allows very large chunks
-of data to be passed around in the system. This document describes the new
-memory management system implemented in 7.1.
-
+Memory Context System Design Overview
+=====================================
 
 Background
 ----------
@@ -38,10 +30,10 @@ to or get more memory from the same context the chunk was originally
 allocated in.
 
 At all times there is a "current" context denoted by the
-CurrentMemoryContext global variable. The backend macro palloc()
-implicitly allocates space in that context. The MemoryContextSwitchTo()
-operation selects a new current context (and returns the previous context,
-so that the caller can restore the previous context before exiting).
+CurrentMemoryContext global variable. palloc() implicitly allocates space
+in that context. The MemoryContextSwitchTo() operation selects a new current
+context (and returns the previous context, so that the caller can restore the
+previous context before exiting).
 
 The main advantage of memory contexts over plain use of malloc/free is
 that the entire contents of a memory context can be freed easily, without
@@ -60,8 +52,10 @@ The behavior of palloc and friends is similar to the standard C library's
 malloc and friends, but there are some deliberate differences too. Here
 are some notes to clarify the behavior.
 
-* If out of memory, palloc and repalloc exit via elog(ERROR). They never
-return NULL, and it is not necessary or useful to test for such a result.
+* If out of memory, palloc and repalloc exit via elog(ERROR). They
+never return NULL, and it is not necessary or useful to test for such
+a result. With palloc_extended() that behavior can be overridden
+using the MCXT_ALLOC_NO_OOM flag.
 
 * palloc(0) is explicitly a valid operation. It does not return a NULL
 pointer, but a valid chunk of which no bytes may be used. However, the
@@ -71,28 +65,18 @@ error. Similarly, repalloc allows realloc'ing to zero size.
 
 * pfree and repalloc do not accept a NULL pointer. This is intentional.
 
-pfree/repalloc No Longer Depend On CurrentMemoryContext
--------------------------------------------------------
-
-Since Postgres 7.1, pfree() and repalloc() can be applied to any chunk
-whether it belongs to CurrentMemoryContext or not --- the chunk's owning
-context will be invoked to handle the operation, regardless. This is a
-change from the old requirement that CurrentMemoryContext must be set
-to the same context the memory was allocated from before one can use
-pfree() or repalloc().
-
-There was some consideration of getting rid of CurrentMemoryContext entirely,
-instead requiring the target memory context for allocation to be specified
-explicitly. But we decided that would be too much notational overhead ---
-we'd have to pass an appropriate memory context to called routines in
-many places. For example, the copyObject routines would need to be passed
-a context, as would function execution routines that return a
-pass-by-reference datatype. And what of routines that temporarily
-allocate space internally, but don't return it to their caller? We
-certainly don't want to clutter every call in the system with "here is
-a context to use for any temporary memory allocation you might want to
-do". So there'd still need to be a global variable specifying a suitable
-temporary-allocation context. That might as well be CurrentMemoryContext.
+The Current Memory Context
+--------------------------
+
+Because it would be too much notational overhead to always pass an
+appropriate memory context to called routines, there always exists the
+notion of the current memory context CurrentMemoryContext. Without it,
+for example, the copyObject routines would need to be passed a context, as
+would function execution routines that return a pass-by-reference
+datatype. Similarly for routines that temporarily allocate space
+internally, but don't return it to their caller? We certainly don't
+want to clutter every call in the system with "here is a context to
+use for any temporary memory allocation you might want to do".
 
 The upshot of that reasoning, though, is that CurrentMemoryContext should
 generally point at a short-lifespan context if at all possible. During
@@ -102,42 +86,83 @@ context having greater than transaction lifespan, since doing so risks
 permanent memory leaks.
 
 
-Additions to the Memory-Context Mechanism
------------------------------------------
-
-Before 7.1 memory contexts were all independent, but it was too hard to
-keep track of them; with lots of contexts there needs to be explicit
-mechanism for that.
-
-We solved this by creating a tree of "parent" and "child" contexts. When
-creating a memory context, the new context can be specified to be a child
-of some existing context. A context can have many children, but only one
-parent. In this way the contexts form a forest (not necessarily a single
-tree, since there could be more than one top-level context; although in
-current practice there is only one top context, TopMemoryContext).
-
-We then say that resetting or deleting any particular context resets or
-deletes all its direct and indirect children as well. This feature allows
-us to manage a lot of contexts without fear that some will be leaked; we
-only need to keep track of one top-level context that we are going to
-delete at transaction end, and make sure that any shorter-lived contexts
-we create are descendants of that context. Since the tree can have
-multiple levels, we can deal easily with nested lifetimes of storage,
-such as per-transaction, per-statement, per-scan, per-tuple. Storage
-lifetimes that only partially overlap can be handled by allocating
-from different trees of the context forest (there are some examples
-in the next section).
-
-Actually, it turns out that resetting a given context should almost
-always imply deleting, not just resetting, any child contexts it has.
-So MemoryContextReset() means that, and if you really do want a tree of
-empty contexts you need to call MemoryContextResetOnly() plus
-MemoryContextResetChildren().
+pfree/repalloc Do Not Depend On CurrentMemoryContext
+----------------------------------------------------
+
+pfree() and repalloc() can be applied to any chunk whether it belongs
+to CurrentMemoryContext or not --- the chunk's owning context will be
+invoked to handle the operation, regardless.
+
+
+"Parent" and "Child" Contexts
+-----------------------------
+
+If all contexts were independent, it'd be hard to keep track of them,
+especially in error cases. That is solved this by creating a tree of
+"parent" and "child" contexts. When creating a memory context, the
+new context can be specified to be a child of some existing context.
+A context can have many children, but only one parent. In this way
+the contexts form a forest (not necessarily a single tree, since there
+could be more than one top-level context; although in current practice
+there is only one top context, TopMemoryContext).
+
+Deleting a context deletes all its direct and indirect children as
+well. When resetting a context it's almost always more useful to
+delete child contexts, thus MemoryContextReset() means that, and if
+you really do want a tree of empty contexts you need to call
+MemoryContextResetOnly() plus MemoryContextResetChildren().
+
+These features allow us to manage a lot of contexts without fear that
+some will be leaked; we only need to keep track of one top-level
+context that we are going to delete at transaction end, and make sure
+that any shorter-lived contexts we create are descendants of that
+context. Since the tree can have multiple levels, we can deal easily
+with nested lifetimes of storage, such as per-transaction,
+per-statement, per-scan, per-tuple. Storage lifetimes that only
+partially overlap can be handled by allocating from different trees of
+the context forest (there are some examples in the next section).
 
 For convenience we also provide operations like "reset/delete all children
 of a given context, but don't reset or delete that context itself".
 
 
+Memory Context Reset/Delete Callbacks
+-------------------------------------
+
+A feature introduced in Postgres 9.5 allows memory contexts to be used
+for managing more resources than just plain palloc'd memory. This is
+done by registering a "reset callback function" for a memory context.
+Such a function will be called, once, just before the context is next
+reset or deleted. It can be used to give up resources that are in some
+sense associated with an object allocated within the context. Possible
+use-cases include
+* closing open files associated with a tuplesort object;
+* releasing reference counts on long-lived cache objects that are held
+  by some object within the context being reset;
+* freeing malloc-managed memory associated with some palloc'd object.
+That last case would just represent bad programming practice for pure
+Postgres code; better to have made all the allocations using palloc,
+in the target context or some child context. However, it could well
+come in handy for code that interfaces to non-Postgres libraries.
+
+Any number of reset callbacks can be established for a memory context;
+they are called in reverse order of registration. Also, callbacks
+attached to child contexts are called before callbacks attached to
+parent contexts, if a tree of contexts is being reset or deleted.
+
+The API for this requires the caller to provide a MemoryContextCallback
+memory chunk to hold the state for a callback. Typically this should be
+allocated in the same context it is logically attached to, so that it
+will be released automatically after use. The reason for asking the
+caller to provide this memory is that in most usage scenarios, the caller
+will be creating some larger struct within the target context, and the
+MemoryContextCallback struct can be made "for free" without a separate
+palloc() call by including it in this larger struct.
+
+
+Memory Contexts in Practice
+===========================
+
 Globally Known Contexts
 -----------------------
 
@@ -325,83 +350,64 @@ copy step.
 Mechanisms to Allow Multiple Types of Contexts
 ----------------------------------------------
 
-We may want several different types of memory contexts with different
-allocation policies but similar external behavior. To handle this,
-memory allocation functions will be accessed via function pointers,
-and we will require all context types to obey the conventions given here.
-(As of 2015, there's actually still just one context type; but interest in
-creating other types has never gone away entirely, so we retain this API.)
-
-A memory context is represented by an object like
-
-typedef struct MemoryContextData
-{
-    NodeTag type; /* identifies exact kind of context */
-    MemoryContextMethods methods;
-    MemoryContextData *parent; /* NULL if no parent (toplevel context) */
-    MemoryContextData *firstchild; /* head of linked list of children */
-    MemoryContextData *nextchild; /* next child of same parent */
-    char *name; /* context name (just for debugging) */
-} MemoryContextData, *MemoryContext;
-
-This is essentially an abstract superclass, and the "methods" pointer is
-its virtual function table. Specific memory context types will use
+To efficiently allow for different allocation patterns, and for
+experimentation, we allow for different types of memory contexts with
+different allocation policies but similar external behavior. To
+handle this, memory allocation functions are accessed via function
+pointers, and we require all context types to obey the conventions
+given here.
+
+A memory context is represented by struct MemoryContextData (see
+memnodes.h). This struct identifies the exact type of the context, and
+contains information common between the different types of
+MemoryContext like the parent and child contexts, and the name of the
+context.
+
+This is essentially an abstract superclass, and the behavior is
+determined by the "methods" pointer is its virtual function table
+(struct MemoryContextMethods). Specific memory context types will use
 derived structs having these fields as their first fields. All the
-contexts of a specific type will have methods pointers that point to the
-same static table of function pointers, which look like
-
-typedef struct MemoryContextMethodsData
-{
-    Pointer (*alloc) (MemoryContext c, Size size);
-    void (*free_p) (Pointer chunk);
-    Pointer (*realloc) (Pointer chunk, Size newsize);
-    void (*reset) (MemoryContext c);
-    void (*delete) (MemoryContext c);
-} MemoryContextMethodsData, *MemoryContextMethods;
-
-Alloc, reset, and delete requests will take a MemoryContext pointer
-as parameter, so they'll have no trouble finding the method pointer
-to call. Free and realloc are trickier. To make those work, we
-require all memory context types to produce allocated chunks that
-are immediately preceded by a standard chunk header, which has the
-layout
-
-typedef struct StandardChunkHeader
-{
-    MemoryContext mycontext; /* Link to owning context object */
-    Size size; /* Allocated size of chunk */
-};
-
-It turns out that the pre-existing aset.c memory context type did this
-already, and probably any other kind of context would need to have the
-same data available to support realloc, so this is not really creating
-any additional overhead. (Note that if a context type needs more per-
-allocated-chunk information than this, it can make an additional
-nonstandard header that precedes the standard header. So we're not
-constraining context-type designers very much.)
-
-Given this, the pfree routine looks something like
-
-    StandardChunkHeader * header =
-        (StandardChunkHeader *) ((char *) p - sizeof(StandardChunkHeader));
-
-    (*header->mycontext->methods->free_p) (p);
+contexts of a specific type will have methods pointers that point to
+the same static table of function pointers.
+
+While operations like allocating from and resetting a context take the
+relevant MemoryContext as a parameter, operations like free and
+realloc are trickier. To make those work, we require all memory
+context types to produce allocated chunks that are immediately,
+without any padding, preceded by a pointer to the corresponding
+MemoryContext.
+
+If a type of allocator needs additional information about its chunks,
+like e.g. the size of the allocation, that information can in turn
+precede the MemoryContext. This means the only overhead implied by
+the memory context mechanism is a pointer to its context, so we're not
+constraining context-type designers very much.
+
+Given this, routines like pfree their corresponding context with an
+operation like (although that is usually encapsulated in
+GetMemoryChunkContext())
+
+    MemoryContext context = *(MemoryContext*) (((char *) pointer) - sizeof(void *));
+
+and then invoke the corresponding method for the context
+
+    (*context->methods->free_p) (p);
 
 
 More Control Over aset.c Behavior
 ---------------------------------
 
-Previously, aset.c always allocated an 8K block upon the first allocation
-in a context, and doubled that size for each successive block request.
-That's good behavior for a context that might hold *lots* of data, and
-the overhead wasn't bad when we had only a few contexts in existence.
-With dozens if not hundreds of smaller contexts in the system, we need
-to be able to fine-tune things a little better.
+By default aset.c always allocates an 8K block upon the first
+allocation in a context, and doubles that size for each successive
+block request. That's good behavior for a context that might hold
+*lots* of data. But if there are dozens if not hundreds of smaller
+contexts in the system, we need to be able to fine-tune things a
+little better.
 
-The creator of a context is now able to specify an initial block size
-and a maximum block size. Selecting smaller values can prevent wastage
-of space in contexts that aren't expected to hold very much (an example is
-the relcache's per-relation contexts).
+The creator of a context is able to specify an initial block size and
+a maximum block size. Selecting smaller values can prevent wastage of
+space in contexts that aren't expected to hold very much (an example
+is the relcache's per-relation contexts).
 
 Also, it is possible to specify a minimum context size. If this
 value is greater than zero then a block of that size will be grabbed
@@ -414,37 +420,3 @@ will not allocate very much space per tuple cycle. To make this usage
 pattern cheap, the first block allocated in a context is not given
 back to malloc() during reset, but just cleared. This avoids malloc
 thrashing.
-
-
-Memory Context Reset/Delete Callbacks
--------------------------------------
-
-A feature introduced in Postgres 9.5 allows memory contexts to be used
-for managing more resources than just plain palloc'd memory. This is
-done by registering a "reset callback function" for a memory context.
-Such a function will be called, once, just before the context is next
-reset or deleted. It can be used to give up resources that are in some
-sense associated with an object allocated within the context. Possible
-use-cases include
-* closing open files associated with a tuplesort object;
-* releasing reference counts on long-lived cache objects that are held
-  by some object within the context being reset;
-* freeing malloc-managed memory associated with some palloc'd object.
-That last case would just represent bad programming practice for pure
-Postgres code; better to have made all the allocations using palloc,
-in the target context or some child context. However, it could well
-come in handy for code that interfaces to non-Postgres libraries.
-
-Any number of reset callbacks can be established for a memory context;
-they are called in reverse order of registration. Also, callbacks
-attached to child contexts are called before callbacks attached to
-parent contexts, if a tree of contexts is being reset or deleted.
-
-The API for this requires the caller to provide a MemoryContextCallback
-memory chunk to hold the state for a callback. Typically this should be
-allocated in the same context it is logically attached to, so that it
-will be released automatically after use. The reason for asking the
-caller to provide this memory is that in most usage scenarios, the caller
-will be creating some larger struct within the target context, and the
-MemoryContextCallback struct can be made "for free" without a separate
-palloc() call by including it in this larger struct.
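
Illustrative note (not part of the patch): two mechanisms described in the
README text are easier to follow with a small model. The first is finding a
chunk's owning context from a pointer stored immediately before the chunk.
The second is running reset callbacks in reverse order of registration. What
follows is a minimal sketch under stated assumptions, not PostgreSQL code.
Every Toy*/toy_* name is hypothetical, malloc() stands in for a real
allocator, alignment of the returned chunk is ignored, and toy_reset() only
runs callbacks because the toy does not track chunks per context.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Toy model only: none of these names exist in PostgreSQL. */
typedef struct ToyContext ToyContext;

typedef struct ToyMethods
{
    void (*free_p) (ToyContext *cxt, void *chunk);
} ToyMethods;

typedef struct ToyCallback
{
    void (*func) (void *arg);
    void *arg;
    struct ToyCallback *next;
} ToyCallback;

struct ToyContext
{
    const ToyMethods *methods;  /* plays the role of the method table */
    const char *name;           /* just for debugging */
    int nfreed;                 /* demo bookkeeping */
    ToyCallback *reset_cbs;     /* most recently registered first */
};

static char toy_log[16];        /* records callback order for the demo */

/* The chunk header is nothing but a pointer to the owning context. */
static void *
toy_alloc(ToyContext *cxt, size_t size)
{
    char *raw = malloc(sizeof(ToyContext *) + size);

    if (raw == NULL)
        abort();
    memcpy(raw, &cxt, sizeof(ToyContext *));
    return raw + sizeof(ToyContext *);
}

/* Read back the context pointer stored just before the chunk. */
static ToyContext *
toy_chunk_context(void *chunk)
{
    ToyContext *cxt;

    assert(chunk != NULL);      /* like pfree, NULL is not accepted */
    memcpy(&cxt, (char *) chunk - sizeof(ToyContext *), sizeof(ToyContext *));
    return cxt;
}

/* Free without consulting any "current context": dispatch via the owner. */
static void
toy_free(void *chunk)
{
    ToyContext *cxt = toy_chunk_context(chunk);

    cxt->methods->free_p(cxt, chunk);
}

static void
toy_malloc_free(ToyContext *cxt, void *chunk)
{
    cxt->nfreed++;
    free((char *) chunk - sizeof(ToyContext *));
}

static const ToyMethods toy_malloc_methods = {toy_malloc_free};

/* Caller supplies the callback struct; it is pushed on the list head. */
static void
toy_register_reset_callback(ToyContext *cxt, ToyCallback *cb)
{
    cb->next = cxt->reset_cbs;
    cxt->reset_cbs = cb;
}

/* Run each callback once, newest first, unlinking it before the call. */
static void
toy_reset(ToyContext *cxt)
{
    while (cxt->reset_cbs != NULL)
    {
        ToyCallback *cb = cxt->reset_cbs;

        cxt->reset_cbs = cb->next;
        cb->func(cb->arg);
    }
}

static void
toy_note(void *arg)
{
    strcat(toy_log, (const char *) arg);
}

int toy_demo(void);

/* Returns 1 if dispatch and callback ordering behave as described. */
int
toy_demo(void)
{
    ToyContext a = {&toy_malloc_methods, "a", 0, NULL};
    ToyContext b = {&toy_malloc_methods, "b", 0, NULL};
    ToyCallback first = {toy_note, (void *) "1", NULL};
    ToyCallback second = {toy_note, (void *) "2", NULL};
    void *p = toy_alloc(&a, 32);
    void *q = toy_alloc(&b, 8);
    int ok = (toy_chunk_context(p) == &a && toy_chunk_context(q) == &b);

    toy_log[0] = '\0';
    toy_free(q);
    toy_free(p);
    ok = ok && a.nfreed == 1 && b.nfreed == 1;
    toy_register_reset_callback(&a, &first);
    toy_register_reset_callback(&a, &second);
    toy_reset(&a);
    return ok && strcmp(toy_log, "21") == 0;
}
```

Calling toy_demo() from a main() of your own exercises both paths. Having the
caller supply the ToyCallback storage mirrors the README's rationale for
MemoryContextCallback: the struct can be embedded in a larger object that
already lives in the target context, so no separate allocation is needed.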