8 Replies Latest reply on Apr 5, 2012 7:06 PM by markaddleman

    Another question about GROUP BY & constant inline view optimization

    markaddleman

      More related to TEIID-1990 & TEIID-1991:

       

      I have the following query:

      SELECT u.datamart_name FROM (SELECT 'abc' AS DATAMART_NAME, a.evttypecode FROM SECURITY_DM_DATACOM_DEV_R140_XE59_CMGRD1.ADMACCOUNT AS a) AS u GROUP BY u.datamart_name

       

      The final line of the debug plan output is:

      OPTIMIZATION COMPLETE:

      PROCESSOR PLAN:

      ProjectNode(1) output=['abc' AS DATAMART_NAME] ['abc' AS DATAMART_NAME]

        LimitNode(2) output=[] limit 1

          AccessNode(3) output=[] SELECT 1 FROM SECURITY_DM_DATACOM_DEV_R140_XE59_CMGRD1.ADMACCOUNT AS g_0

       

      According to the XML plan, the data source receives query

      SELECT 1 FROM SECURITY_DM_DATACOM_DEV_R140_XE59_CMGRD1.ADMACCOUNT AS g_0

       

      However, it appears that the LIMIT 1 is not pushed down to the data source.  This is Teiid 7.7.  The execution factory directly subclasses JDBCExecutionFactory and supports everything except row offset.

       

      I've tried a couple of forms of the query and can't get the LIMIT 1 pushed down.

       

      Any hints on what could be going wrong?  Debug plan is attached.

        • 1. Re: Another question about GROUP BY & constant inline view optimization
          shawkins

          Mark,

           

          RulePushLimit is running, but does not push the LimitNode below the access node.  You should check your support for ExecutionFactory.supportsRowLimit, which is not supported by default.

           

          Steve

          • 2. Re: Another question about GROUP BY & constant inline view optimization
            markaddleman

            You are right.  After adding supportsRowLimit and rewriting the query to avoid the cross-join, the limit is pushed.  The rewritten query is:

            select datamart_name from (

                select datamart_name, evttypecode, evtversion , evtcatcode , evtsysid , evtsysplex , evtutctod , evtsystod , evtesmcode , evtpuuid ,

                evtuserid , evtusername, evtjobname , evtsource , admtarget , vio1code    , vio2code , usrgroup , usrfacility,

                usrseclabel, evttypedesc, evtcatdesc

                FROM (

                        select * from (

                            select 'abc' as DATAMART_NAME , evttypecode, evtversion , evtcatcode , evtsysid , evtsysplex , evtutctod , evtsystod , evtesmcode , evtpuuid ,

                            evtuserid , evtusername, evtjobname , evtsource , admtarget , vio1code    , vio2code , usrgroup , usrfacility,

                            usrseclabel, evttypedesc, evtcatdesc

                            from SECURITY_DM_DATACOM_DEV_R140_DE29_CMGRD1.VIEWSYSACCESS ) t

                        WHERE DATAMART_NAME='abc'

                    ) AS t

            ) t GROUP BY datamart_name

             

            I believe the original query should have also worked since the cross join was against a constant but it's easy for us to rewrite the query this way.

            • 3. Re: Another question about GROUP BY & constant inline view optimization
              markaddleman

              Checking against our more complex case, inline view of unioned selects, I see the limit still isn't pushed and, my reading of the debug plan, indicates that the planner doesn't realize that it could use a limit.

               

              The query is:

              select datamart_name from (

                  select datamart_name, "__objecttype__", evttypecode, evtversion , evtcatcode , evtsysid , evtsysplex , evtutctod , evtsystod ,

                  evtesmcode , evtpuuid , evtuserid , evtusername, evtjobname , evtsource , admtarget , Commandstr, oprcode  ,

                  vio1code , vio2code , evttypedesc , evtcatdesc

                  FROM (

                      select u.datamart_name, u."__objecttype__", u.evttypecode, u.evtversion , u.evtcatcode , u.evtsysid , u.evtsysplex , u.evtutctod , u.evtsystod ,

                      u.evtesmcode , u.evtpuuid , u.evtuserid , u.evtusername, u.evtjobname , u.evtsource , u.admtarget , u.Commandstr, u.oprcode  ,

                      u.vio1code , u.vio2code , u.evttypedesc , u.evtcatdesc

                      FROM (

                          select * from (

                              select 'abc' as DATAMART_NAME, 'SECCVIEWADMACCOUNT' as "__objecttype__", u.evttypecode , evtversion , u.evtcatcode , evtsysid ,

                              evtsysplex , evtutctod , evtsystod , evtesmcode , evtpuuid , evtuserid , evtusername, evtjobname ,

                              evtsource , admtarget , admacctstr as Commandstr, oprcode , vio1code , vio2code  , evttypedesc , evtcatdesc

                              from SECURITY_DM_DATACOM_DEV_R140_DE29_CMGRD1.ADMACCOUNT u

                              left join SECURITY_DM_DATACOM_DEV_R140_DE29_CMGRD1.CONEVTTYPE tp on

                              (tp.evttypecode=u.evttypecode)

                              left join SECURITY_DM_DATACOM_DEV_R140_DE29_CMGRD1.CONEVTCATEGORY ct on

                              (ct.evtcatcode=u.evtcatcode)

                          ) t

                          WHERE DATAMART_NAME='abc'

                          UNION ALL

                          select * from (

                              select 'abc' as DATAMART_NAME, 'SECCVIEWADMPOLICY' as "__objecttype__", u.evttypecode , evtversion , u.evtcatcode , evtsysid ,

                              evtsysplex , evtutctod , evtsystod , evtesmcode , evtpuuid , evtuserid , evtusername, evtjobname ,

                              evtsource , admtarget , admpolicystr as Commandstr, oprcode  , vio1code , vio2code , evttypedesc , evtcatdesc

                              from SECURITY_DM_DATACOM_DEV_R140_DE29_CMGRD1.ADMPOLICY u

                              left join SECURITY_DM_DATACOM_DEV_R140_DE29_CMGRD1.CONEVTTYPE tp on

                              (tp.evttypecode=u.evttypecode)

                              left join SECURITY_DM_DATACOM_DEV_R140_DE29_CMGRD1.CONEVTCATEGORY ct on

                              (ct.evtcatcode=u.evtcatcode)

                          ) t

                          WHERE DATAMART_NAME='abc'

               

                      ) as u

                  ) AS t

              ) t GROUP BY datamart_name

               

              Plan is attached

              • 4. Re: Another question about GROUP BY & constant inline view optimization
                markaddleman

                Actually, now that I read the plan a little more closely, I realize that that it could push the GROUP BY all the way down as a SELECT DISTINCT but isn't.  Since the execution factory  subs from JDBCExecutionFactory, it supports select distinct and aggregates distinct.  Is there some other trick for pushing distinct?  Would this be due to not supporting functionsInGroups?

                • 5. Re: Another question about GROUP BY & constant inline view optimization
                  markaddleman

                  Checking the same query against the DB2ExecutionFactory, I see that Teiid generating

                  SELECT v_0.c_0 FROM (SELECT 'abc' AS c_0 FROM SECURITY_DM_DB2_DA0GPTIB_CMGRQ1.ADMACCOUNT AS g_1 WHERE 1 = 1 UNION ALL SELECT 'abc' AS c_0 FROM SECURITY_DM_DB2_DA0GPTIB_CMGRQ1.ADMPOLICY AS g_0 WHERE 1 = 1) AS v_0 GROUP BY v_0.c_0

                   

                  So, it is properly pushing down the GROUP BY by using an inline view.  Unfortunately, our datasource does not support inline views.  It does support DISTINCT.  In this case, we're grouping on the same columns that are being selected so the DISTINCT should be equivalent, no?

                  • 6. Re: Another question about GROUP BY & constant inline view optimization
                    shawkins

                    Yes, selecting exactly what is in a group by list is logically the same as distinct.  We do not have general logic that checks for alternative pushdown based upon that equivalence though.  It's slightly more complicated though with the interviening view, but yes you could log an enhancement for this case.

                     

                    Do you expect your datamart names to be the same across union branches?  If so was that the reason for using a union instead of a union all initially?

                     

                    Steve

                    • 7. Re: Another question about GROUP BY & constant inline view optimization
                      markaddleman

                      A deeper explanation of the data structure would be helpful:  We have multiple datamarts, all identical to each other in terms of schema that are distinguished from one another through DATAMART_NAME.  Within each datamart is a set of event tables that have a common subset of columns.  For each datamart, we construct a select statement from the common columns and union those event tables.  Finally, we construct a query that is the union together all of the common event tables together across all datamarts.  For all UI queries, we select against the union of unions query as an inline view.  Whew...

                       

                      The query in comment @ Apr 5, 2012 9:05 AM is an example of a UI query against the union of unions inline view where only one datamart is defined.  In a more complex case, you would see a UNION ALL in the FROM clause of the first inline view.

                       

                      Originally, we were using UNIONs to strip out duplicates rows between the event tables but that turns out to be unnecessary.

                       

                      It occurs to me now that we could factor out the DATAMART_NAME from the inner most inline views.  That might give the planner a shot at producing a very efficient plan...

                      • 8. Re: Another question about GROUP BY & constant inline view optimization
                        markaddleman

                        Issue logged as TEIID-1993