Database Design Using Key-Value Tables - Drawback no. 3: Problematic use of AND operator when relating to more than one attribute in the query criteria
(Page 5 of 5 )
This is the most problematic issue that needs to be taken into consideration. Since it is complex, let’s start with an example. Consider the following report (report 1):
Fetch all employee names and their start working date, that had started working in 2004. The result of course should look like the following table:
NAME | START_WORK_YEAR |
Dany | 2004 |
Brit | 2004 |
In scenario 1, the following query will provide the required results:
SELECT NAME, START_WORK_YEAR
FROM EMPLOYEES
WHERE START_WORK_YEAR = 2004;
In scenario 2, the following query will provide the required results:
SELECT E.NAME, EA.VALUE1 AS START_WORK_YEAR
FROM EMPLOYEES E INNER JOIN EMPLOYEE_ATTRIBUTES EA
ON (E.ID = EA.EMPLOYEE_ID)
WHERE KEY1 = 'START_WORK_YEAR' AND VALUE1 = '2004';
But in case we want to check more than one attribute in the criteria, and the relation between the two attributes would need to be defined by an AND operator, we need to be careful not to fall into the following trap. Consider the following report (report 2):
Fetch all employee names, and their start working date that are males AND had started working in 2004. The result of course should look like the following table:
In scenario 1, the following query will provide the required results:
SELECT NAME, START_WORK_YEAR
FROM EMPLOYEES
WHERE START_WORK_YEAR = 2004 AND SEX = 'MALE';
NAME | START_WORK_YEAR |
Dany | 2004 |
In scenario 2, many developers fall into the trap and introduce the following query as the report implementation:
SELECT E.NAME, EA.VALUE1 AS START_WORK_YEAR
FROM EMPLOYEES E INNER JOIN EMPLOYEE_ATTRIBUTES EA
ON (E.ID = EA.EMPLOYEE_ID)
WHERE (EA.KEY1 = 'START_WORK_YEAR' AND EA.VALUE1 = '2004')
AND (EA.KEY1 = 'SEX' AND EA.VALUE1 = 'MALE');
The above query will provide the following results:
We get an empty result-set!
The reason for it is that no record in the Cartesian product of EMPLOYEES and EMPLOYEE_ATTRIBUTES fulfill both conditions of
(EA.KEY1 = ‘START_WORK_YEAR’ AND EA.VALUE1 = ‘2004’)
AND (EA.KEY1 = ‘SEX’ AND EA.VALUE1 = ‘MALE’)
In order to better understand this issue, let's look at what the database engine is doing:
First the engine creates the following Cartesian product of EMPLOYEES and EMPLOYEE_ATTRIBUTES:
E.ID | E.NAME | E.SALARY | EA.EMPL- OYEE_ID | EA.KEY | EA.VALUE |
1 | Dany | 4000 | 1 | HAS_COMPANY_VEHICLE | 1 |
1 | Dany | 4000 | 1 | START_WORK_YEAR | 2004 |
1 | Dany | 4000 | 1 | SEX | MALE |
2 | Brit | 5000 | 2 | HAS_COMPANY_VEHICLE | 0 |
2 | Brit | 5000 | 2 | START_WORK_YEAR | 2004 |
2 | Brit | 5000 | 2 | SEX | FEMALE |
3 | Sharon | 1000 | 3 | HAS_COMPANY_VEHICLE | 1 |
3 | Sharon | 1000 | 3 | START_WORK_YEAR | 2005 |
3 | Sharon | 1000 | 3 | SEX | FEMALE |
Then the database engine removes records which do not comply with the where criteria. As we can see, there is no record which complies with both:
(EA.KEY1 = ‘START_WORK_YEAR’ AND EA.VALUE1 = ‘2004’)
AND (EA.KEY1 = ‘SEX’ AND EA.VALUE1 = ‘MALE’)
The above pitfall can be workaround using the following query:
SELECT E.NAME, EA1.VALUE1 AS START_WORK_YEAR
FROM EMPLOYEES E INNER JOIN EMPLOYEE_ATTRIBUTES EA1
ON (E.ID = EA1.EMPLOYEE_ID)
INNER JOIN EMPLOYEE_ATTRIBUTES EA2
ON (EA1.EMPLOYEE_ID = EA2.EMPLOYEE_ID)
WHERE (EA1.KEY1 = 'START_WORK_YEAR' AND EA1.VALUE1 = '2004')
AND (EA2.KEY1 = 'SEX' AND EA2.VALUE1 = 'MALE');
In this case the database engine will create the following Cartesian product and select the marked record using the criteria clause:
E. ID | E. NAME | E. SALARY | EA1/2. EMPL- OYEE_ID | EA1. KEY | EA1. VALUE | EA2. KEY | EA2. VALUE |
1 | Dany | 4000 | 1 | SEX | MALE | SEX | MALE |
1 | Dany | 4000 | 1 | SEX | MALE | START_WORK_YEAR | 2004 |
1 | Dany | 4000 | 1 | START_WORK_YEAR | 2004 | SEX | MALE |
1 | Dany | 4000 | 1 | START_WORK_YEAR | 2004 | START_WORK_YEAR | 2004 |
2 | Brit | 5000 | 2 | SEX | FEMALE | SEX | FEMALE |
2 | Brit | 5000 | 2 | SEX | FEMALE | START_WORK_YEAR | 2004 |
2 | Brit | 5000 | 2 | START_WORK_YEAR | 2004 | SEX | FEMALE |
2 | Brit | 5000 | 2 | START_WORK_YEAR | 2004 | START_WORK_YEAR | 2004 |
3 | Sharon | 1000 | 3 | SEX | FEMALE | SEX | FEMALE |
3 | Sharon | 1000 | 3 | SEX | FEMALE | START_WORK_YEAR | 2005 |
3 | Sharon | 1000 | 3 | START_WORK_YEAR | 2005 | SEX | FEMALE |
3 | Sharon | 1000 | 3 | START_WORK_YEAR | 2005 | START_WORK_YEAR | 2005 |
Conclusion
We can use key-value tables in a database design, but we should keep in mind that the same flexibility which has been enabling us to add attributes in a quick and easy way can also cause data integrity errors.
Furthermore, using key-value tables is recommended mostly with attributes which are required for long descriptions, rather than attributes which are used for data filtering.
| DISCLAIMER: The content provided in this article is not warranted or guaranteed by Developer Shed, Inc. The content provided is intended for entertainment and/or educational purposes in order to introduce to the reader key ideas, concepts, and/or product reviews. As such it is incumbent upon the reader to employ real-world tactics for security and implementation of best practices. We are not liable for any negative consequences that may result from implementing any information covered in our articles or tutorials. If this is a hardware review, it is not recommended to open and/or modify your hardware. |