1

I have inherited a legacy application which makes use of a self-referencing table to facilitate a hierarchical structure. This results in recursive method calls which are creating a "bad smell".

The parent_id column references the primary key of the same table, and there are roughly 25 million records here:

+-------------+---------+------+-----+---------+----------------+
| Field       | Type    | Null | Key | Default | Extra          |
+-------------+---------+------+-----+---------+----------------+
| phase_id    | int(10) | NO   | PRI | NULL    | auto_increment |
| plat_id     | int(10) | YES  | MUL | NULL    |                |
| name        | text    | YES  |     | NULL    |                |
| parent_id   | int(10) | YES  | MUL | NULL    |                |
| plan_id     | int(10) | YES  | MUL | NULL    |                |
+-------------+---------+------+-----+---------+----------------+

mysql> show table status like 'ref'\G
*************************** 1. row ***************************
Name: phase
Engine: MyISAM
Version: 10
Row_format: Dynamic
Rows: 25223658
Avg_row_length: 20
Data_length: 509450960
Max_data_length: 281474976710655
Index_length: 1026267136
Data_free: 0
Auto_increment: 25238013

I have a few questions about this kind of structure:

  • Is it generally bad practice to implement a self-referencing table? The main negative I can think of is that it is difficult/impossible to get the maximum depth of the hierarchy in a single query as there may be an X number children.
  • Is it worth redesigning this? Having so much data makes it more difficult to move it around.
  • What are my options? I have heard a little bit about table partitioning, but don't know if it is suitable in my scenario.

Any pointers would be really appreciated ns

nonshatter
  • 3,347
  • 6
  • 23
  • 27
  • possible duplicate of [What are the Options for Storing Hierarchical Data in a Relational Database?](http://stackoverflow.com/questions/4048151/what-are-the-options-for-storing-hierarchical-data-in-a-relational-database) – Sebas Jul 12 '12 at 14:10
  • Thanks for the link Sebas, I did not find that in my search. The 'Flat Table Model' described in this article sounds very good: http://www.ferdychristant.com/blog//articles/DOMM-7QJPM7 – nonshatter Jul 12 '12 at 14:33
  • 1
    Yes this is a really interesting and challenging subject. – Sebas Jul 12 '12 at 14:35

1 Answers1

0

We ended up scrapping the existing self-referencing table. And created a new table to house a more simple model.

Thanks to Sebas for the Link above. There is a lot of goodness in there!

Community
  • 1
  • 1
nonshatter
  • 3,347
  • 6
  • 23
  • 27