Ik weet wel zeker dat de performance uiteindelijk ruk is. Je queries worden namelijk onmogelijk ingewikkeld met ladingen (onnodige) joins.
Do not use Generic Data Models
Frequently I see applications built on a generic data model for "maximum
flexibility" or applications built in ways that prohibit performance. Many times
- these are one in the same thing! For example, it is well known you can
represent any object in a database using just four tables:
Create table objects ( oid int primary key, name varchar2(255) );
Create table attributes
( attrId int primary key, attrName varchar2(255),
datatype varchar2(25) );
Create table object_Attributes
( oid int, attrId int, value varchar2(4000),
primary key(oid,attrId) );
Create table Links ( oid1 int, oid2 int,
primary key (oid1, oid2) );
That's it - no more CREATE TABLE for me! I can fill the attributes table up with
rows like this:
insert into attributes values ( 1, 'DATE_OF_BIRTH', 'DATE' );
insert into attributes values ( 2, 'FIRST_NAME', 'STRING' );
insert into attributes values ( 3, 'LAST_NAME', 'STRING' );
commit;
And now I'm ready to create a PERSON record:
insert into objects values ( 1, 'PERSON' );
insert into object_Attributes values( 1, 1, '15-mar-1965' );
insert into object_Attributes values( 1, 2, 'Thomas' );
insert into object_Attributes values( 1, 3, 'Kyte' );
commit;
insert into objects values ( 2, 'PERSON' );
insert into object_Attributes values( 2, 1, '21-oct-1968' );
insert into object_Attributes values( 2, 2, 'John' );
insert into object_Attributes values( 2, 3, 'Smith' );
commit;
And since I'm good at SQL, I can even query this record up to get the FIRST_NAME
and LAST_NAME of all PERSON records:
ops$tkyte@ORA920> select
max( decode(attrName, 'FIRST_NAME', value, null )) first_name,
2 max( decode( attrName, 'LAST_NAME', value, null ) ) last_name
3 from objects, object_attributes, attributes
4 where attributes.attrName in ( 'FIRST_NAME', 'LAST_NAME' )
5 and object_attributes.attrId = attributes.attrId
6 and object_attributes.oid = objects.oid
7 and objects.name = 'PERSON'
8 group by objects.oid
9 /
FIRST_NAME LAST_NAME
-------------------- --------------------
Thomas Kyte
John Smith
Looks great, right? I mean, the developers don't have to create tables anymore,
we can add columns at the drop of a hat (just requires an insert into the
ATTRIBUTES table). The developers can do whatever they want and the DBA can't
stop them. This is ultimate "flexibility". I've seen people try to build entire
systems on this model.
But, how does it perform? Miserably, terribly, horribly. A simple "select
first_name, last_name from person" query is transformed into a 3-table join with
aggregates and all. Further, if the attributes are "NULLABLE" - that is, there
might not be a row in OBJECT_ATTRIBUTES for some attributes, you may have to
outer join instead of just joining which in some cases can remove more optimal
query plans from consideration.
Writing queries might look pretty straightforward, but it's impossible to do in
a performant fashion. For example, if we wanted to get everyone that was born in
MARCH or has a LAST_NAME = 'SMITH', we could simply take the query from above
and just wrap an inline view around that:
ops$tkyte@ORA920> select *
2 from (
3 select
max(decode(attrName, 'FIRST_NAME', value, null)) first_name,
4 max(decode(attrName, 'LAST_NAME', value, null)) last_name,
5 max(decode(attrName, 'DATE_OF_BIRTH', value, null))
date_of_birth
6 from objects, object_attributes, attributes
7 where attributes.attrName in ( 'FIRST_NAME',
'LAST_NAME', 'DATE_OF_BIRTH' )
8 and object_attributes.attrId = attributes.attrId
9 and object_attributes.oid = objects.oid
10 and objects.name = 'PERSON'
11 group by objects.oid
12 )
13 where last_name = 'Smith'
14 or date_of_birth like '%-mar-%'
15 /
FIRST_NAME LAST_NAME DATE_OF_BIRTH
-------------------- -------------------- --------------------
Thomas Kyte 15-mar-1965
John Smith 21-oct-1968
So, it looks "easy" to query, but think about the performance! If you had a
couple thousand OBJECT records, and a couple tens of thousands of
OBJECT_ATTRIBUTES - Oracle would have to process the entire inner group by query
first and then apply the WHERE clause.
This is not a made up data model, one that I crafted just to make a point. This
is an actual data model that I've seen people try to use. Their goal is ultimate
flexibility. They don't know what OBJECTS they need, they don't know what
ATTRIBUTES they will have. Well - that is what the database was written for in
the first place: Oracle implemented this thing called SQL to define OBJECTS and
ATTRIBUTES and lets you use SQL to query them. You are trying to put a generic
layer on top of a generic layer - and it fails each and every time except for
the most trivial of applications.
[
Voor 5% gewijzigd door
JaQ op 11-04-2006 00:18
]