This is an SqlPattern
Construct complex tablic result by using a series of queries adding columns to a growing core table.
You need to extract complex tablic result generated by several groupings and joins from a relational database.
- Sql does not allow you to express what you need in one query
- Mass operation are more efficient in the database than in the language that will be used to process the result
Retrieve result from a stored procedure that creates a temporary table and fills it using several queries, each filling one or more columns with data.
The temporary table must be created with create table
statement defining the entire structure, allowing NULLs in columns filled with the queries after the first. The first query is insert into .. select ..
type statement, establishing 'the core' result: the columns uniquely identifying the resulting records with all auxiliary info that can be gathered in one statement. Subsequent queries (again insert into .. select ..
) do not add new rows to the table, but are joins of the temporary table with others, using existing key columns for join condition.
The stored procedure returns the result built from several queries.
You manage to execute much more complex operations in the database, avoiding:
- Slow mass-processing on the client side
- Building the result with inefficient nonrelational operations as cursors
- the entire result is built before the client can process the first row.
- if the result set is very large, you may want to create index(es) on the temporary table to speed up joins.
- care must be taken, if multiple parallel accesses to the procedure are possible.
- each change of the logic requires a change of the stored procedure.
Implementing a complex composition of relational operations can be difficult with the limited expressive power of SQL. By building the result from vertical slices you manage the complexity of the result, building it on several steps.
This is a CodeSmell that often indicates a dysfunctional database schema. On occasions it is necessary, since we don't all have the luxury of changing the schema. Bad things to look for are use of temporary tables for selecting max() elements (as in time-series extractions of rows with latest date on some subkey), and grouped keys.
Well, the transaction processing part of your program would never need this if it is an OO program using relational database for persistance. If you have to make some analytical reports but full-scale OLAP is overkill, it becomes normal that SQL is too limited to get everything you need at once, even though the database is normalized and fine. -- NikolaToshev
This sounds like trying to fit a StarSchema in one table? Is the point of this to create one-temp-table-per-query instead of one-temp-table-per-dimension? Please compare and contrast with StarSchema.
is about creating a schema suitable for analytical querying (no temporary tables here). SliceResultVertically
is about constructing a query when you have the schema given (suitable to your needs to various degree).
Let me try to show that you may need to SliceResultVertically
with a good database design - even design suitable for analysis. Suppose you work with the schema from the StarSchema
description (see http://c2.com/ppr/stars.html#5
Now, you are asked to produce a report showing monthly sales of Gum Balls as a percentage from the yearly sales:
Customer_Name Month Year Monthly_Sales Percentage_From_Yearly_Sales
You have to write a stored procedure that returns a single result set (it will be fed in the reports engine). Meaning of the fields should be obvious, Percentage_From_Yearly_Sales being Monthly_Sales/Yearly_Sales for that customer. The company has 1000 customers and operational data for 20 years. No additional filtering just to simplify things.
There are different solutions, can you see them? (I'll post one applying the pattern in a couple of days - I am not sure if it is the best in this case). -- NikolaToshev
It is probably the simplest and thus understandable approach. Nevertheless, temporary tables can unreasonably load the database since reports are often huge. So in this case I probably would generate the smallest table that does the job which would be yearly sales per customer per product. 20000 short rows. As opposed to the naive approach which would generate a much larger temporary table of 240000 longer rows. The resulting query requires a join to the synthesized table so it should be indexed appropriately. Since it is smaller it may remain in cache making the join almost free. I suppose my point is that this technique can be expensive, and can make it difficult to run OLAP with DSS in the same database. -- RichardHenderson.
Funny, there's no pages on OLAP (OnLineAnalyticalProcessing
) or DSS (DecisionSupportSystems?
I'm experiencing a similar context right now. I've got an online report which lists players in a trivia game, along with their scores for each of the rounds of trivia completed so far. Since each round is generated weekly, the number of columns for each player keeps expanding. I'm currently generating the contest scores by running a query once for each player to retrieve all the player's scores, then turning all those rows into columns to fit into the final report (one player per row). But since I run the score-generating query once for each player, as the number of players grows, I'm quickly reaching ASP's default 90 second time limit. I could always up the time limit, but I'd like to find a better solution that would not take too long to implement. I'm also limited in technology since the client's website only supports Access97 and ASP. I don't believe it's possible to dynamically generate temp tables in Access. Perhaps the best solution with these limitations is to run the query once for all players (rather than once for each player) and 'manually' join the columns in code? Any suggestions?