MySQL reporting question

Joseph Sinclair plug-discussion at stcaz.net
Thu Jun 24 09:02:56 MST 2010


der.hans wrote:
> Am 22. Jun, 2010 schwätzte Joseph Sinclair so:
> 
>> As I understand your structure and requirements (it's a bit fuzzy,
>> especially the t2.varchar2 == t2.varchar2)
>> Create two *materialized* views and refresh both views (1 then 2)
>> every reporting period (daily?).
> 
> When a batch is run or when the CEO asks :). Essentially he just needs to
> know statuses from our batches, so the former might be sufficient.
> 
>>  View1 : select status, varchar1, varchar2, max(timestamp) from t2
>> where varchar1 != NULL AND max(timestamp) GROUP BY status, varchar1,
>> varchar2;
> 
> I think I would first need a view that gets varchar2 and status where
> max(timestamp), group by varchar2. varchar1 can be NULL in t2.

All of the NULL columns disappear when joined to t1, so it's more efficient to take them out up front and reduce the size of the materialized view.

I also made an error in the View1 define; the "AND max(timestamp)" clause should be removed (the group-by handles getting max(timestamp)).

> 
> We've solved much of the problem by deciding to add columns to t1. I think
> the program that builds t1 can also get varchar2. Even if it can't, I
> can add varchar2 from my logs that have varchar1 in them. It'll mean t1
> doesn't get some updates, but business analytics group doesn't care about
> any of those updates anyway.  All the updates they care about are from
> after varchar1 shows up in my logs.

This is probably not ideal.  You're repeating data across the tables, and that tends to make performance worse, not better.

> 
> I will then do update t1 adding each new status based on varchar2.
> Theoretically my logs should never get an intermediate status after a
> final status, so last one in will be the winner.
> 
> With varchar2 in both tables, it will be much easier to do sanity checks
> for monitoring to make sure data is staying in sync and the last status in
> is truly a terminal status.
> 
> Rather than doing the t1 update, would it be more efficient to use a
> trigger for the updates?

A bulk update (particularly if you can disable transactions for the update) is almost always faster, in aggregate, than a trigger.

> 
> ciao,
> 
> der.hans

==Joseph++


-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 197 bytes
Desc: OpenPGP digital signature
URL: <http://lists.PLUG.phoenix.az.us/pipermail/plug-discuss/attachments/20100624/7187411f/attachment.pgp>


More information about the PLUG-discuss mailing list