Описание тега transitive-closure-table

A transitive closure table is a method for storing hierarchical information in a flat database. It supports referential integrity like adjacency list, but also supports querying hierarchies of arbitrary depth like nested sets.

A transitive closure table is a method for storing a hierarchical graph in a flat database. It supports referential integrity like adjacency lists, but also supports querying hierarchies of arbitrary depth like nested sets (also known as MPTT), smoothing out some of the difficulties inherent in each in exchange for storage space.

Example hierarchy

Let's start out with a tree like this (IDs are in bold):

  • 1 Pet shop
    • 2 Fish
      • 3 Zebrafish
      • 4 Tilapia
    • 5 Rodents
      • 6 Gerbil
      • 7 Squirrels
        • 8 Flying squirrel
        • 9 Red squirrel
    • 10 Birds
      • 11 Cardinal

Closure table

You need a separate table to store your hierarchical information, unlike either adjacency lists or nested sets. For each node, you store the entire path as a set of individual records. For example, 'Squirrels' (node 7) has the following records:

| ancestor | descendant |
-------------------------
| 1        | 7          |
| 5        | 7          |
| 7        | 7          |
| 7        | 8          |
| 7        | 9          |

You'll note a few things:

  • All the ancestors of node 7 have their own record, so you can find out the path of a node just by going SELECT * FROM closure WHERE descendant = 7 (if you're storing your records in SQL).
  • All the direct and indirect descendants can be retrieved as well, by going SELECT * FROM closure WHERE ancestor = 7.
  • Each node is an ancestor and descendant of itself.

Pros

  • Just as with nested sets, an entire path or tree can be retrieved quite simply with one query.
  • Just as with adjacency lists, referential integrity is preserved.
  • Write queries are quite fast and straightforward -- much simpler than nested sets.

Cons

  • Still more complex to understand than adjacency lists.
  • Sacrifices storage space for efficiency.
  • Requires a separate table (although a database that supports triggers can minimize the maintenance cost of this issue).

References