aboutsummaryrefslogtreecommitdiff
path: root/src/backend/access/gist/gist.c
diff options
context:
space:
mode:
authorTom Lane <tgl@sss.pgh.pa.us>2010-12-09 13:03:11 -0500
committerTom Lane <tgl@sss.pgh.pa.us>2010-12-09 13:03:11 -0500
commit663fc32e26e8df41434d751e2203c1aa410d1916 (patch)
tree628eabcaacb72a0868df89e809bde93c9406ef33 /src/backend/access/gist/gist.c
parent9975c683b102d06ed5d5ab799eaba0d00a9ff38c (diff)
downloadpostgresql-663fc32e26e8df41434d751e2203c1aa410d1916.tar.gz
postgresql-663fc32e26e8df41434d751e2203c1aa410d1916.zip
Eliminate O(N^2) behavior in parallel restore with many blobs.
With hundreds of thousands of TOC entries, the repeated searches in reduce_dependencies() become the dominant cost. Get rid of that searching by constructing reverse-dependency lists, which we can do in O(N) time during the fix_dependencies() preprocessing. I chose to store the reverse dependencies as DumpId arrays for consistency with the forward-dependency representation, and keep the previously-transient tocsByDumpId[] array around to locate actual TOC entry structs quickly from dump IDs. While this fixes the slow case reported by Vlad Arkhipov, there is still a potential for O(N^2) behavior with sufficiently many tables: fix_dependencies itself, as well as mark_create_done and inhibit_data_for_failed_table, are doing repeated searches to deal with table-to-table-data dependencies. Possibly this work could be extended to deal with that, although the latter two functions are also used in non-parallel restore where we currently don't run fix_dependencies. Another TODO is that we fail to parallelize restore of multiple blobs at all. This appears to require changes in the archive format to fix. Back-patch to 9.0 where the problem was reported. 8.4 has potential issues as well; but since it doesn't create a separate TOC entry for each blob, it's at much less risk of having enough TOC entries to cause real problems.
Diffstat (limited to 'src/backend/access/gist/gist.c')
0 files changed, 0 insertions, 0 deletions