0

Possible duplicate of Can I avoid a relation loop in my database design?, but I'd like to get a broader answer than for that specific design.

The goal in this case is to store automated testing data as it’s generated. A portion of the relationship diagram is shown below.

enter image description here

  • A variable number of tests may be run on each build, hence the direct one-to-many relationship between Builds and Sessions.
  • Each build is made of several hundred parts, and each part number may be used on several hundred builds, hence the many-to-many relationship between Builds and DT_Parts, associated through LT_HeaderParts.
  • If an assembly error is found during testing, a part or parts may be switched out and the unit retested. Instead of duplicating hundreds of part records on each retest, I implement PartsChangeLog to document any changes made after a given session.
  • PartsChangeLog uses DT_Parts as a dictionary to save memory by storing integers instead of the varchar(20) part_number.

LT_HeaderParts and PartsChangeLog both have appear to have valid, non-redundant reasons for using DT_Parts, yet this setup creates a reference loop and poses the danger of creating a false many-to-many bridge from build_id to session_id that would yield incorrect relationships.

Is this an okay structure? Why or why not?

Community
  • 1
  • 1
Alecg_O
  • 892
  • 1
  • 9
  • 19
  • 1
    http://stackoverflow.com/questions/8115748/why-should-i-avoid-loops-when-designing-relationships-for-a-database might be a good start. – JJ32 Aug 18 '16 at 18:58

2 Answers2

1

What you have is two sets of parts associated with a session: build parts (session -> build ->> part) and changed parts (session ->> partschangelog -> part). As the answer to the question linked by JJ32 explains, consistency is the main concern in these situations. In this case, I suspect the set of changed parts should be a subset of the build parts, but your schema doesn't enforce this.

One way of enforcing it is via controlled redundancy. If you include build_id in PartsChangeLog as a non-prime attribute (and modifying the foreign key reference to Sessions accordingly), you can create two composite foreign key constraints referencing LT_HeaderParts (for build_id, part_added and build_id, part_removed).

This eliminates the possibility of associating inconsistent session_id and build_id via the many-to-many bridge; though if no parts were changed, there won't be such a bridge. That's understandable, our goal is not to replace the direct mapping between session_id and build_id, only to ensure consistency. The rest is up to the query developer.

reaanb
  • 9,806
  • 2
  • 23
  • 37
1

Trying to answer the actual title question "When is it okay to have a relationship loop in my database?".

One part of the answer is that it depends on the intended usage of the schema/diagram per se. Is it intended as a conceptual model, with the purpose of illustrating business concepts ? Then basically you can highlight just any relationship you like. By which I mean you can highlight anything in the form of a relationship if you think that relationship is of interest to the intended business audience. Or is it intended as a logical db schema ?

In that case it mostly depends on the precise "semantics" of the relationships. If two relationships are saying things that are semantically distinct, then you can bet your ... that both will be relevant to the business being modeled and that you should be keeping both.

The simplest example of such a loop is a bill-of-materials structure. Such structures have a single "parts" entity, with a many-to-many relationship of "containment". This "containment" relationship gets instantiated as a "containment" entity with two relationships to the "parts" entity. Each of these two relationships has different semantics (one saying "the containing part must be a known part" and the other saying "the contained part must be a known part") and so they should definitely be kept both.

Erwin Smout
  • 18,113
  • 4
  • 33
  • 52