2 Replies Latest reply on Jun 14, 2015 3:47 PM by markaddleman

    Feature request - Lazy, Incremental Materialized Views

    markaddleman

      Use case is a data source with a large enough number of rows that the existing internal/external materialization system doesn't make sense but the source is expensive to access.  I'd like a leverage point within Teiid to build a cache with the following logic:

      1. Examine the query and determine if any portion of it can be satisfied through the cache.  If so, return the cached results
      2. Rewrite the query to obtain the portion from the source which cannot be satisfied through the cache.
      3. Store the results from #2 in the cache
      4. Return the results from #2

      I'd like the solution to have much of the same qualities of materialized views:  I'd like to use either internal or external materialization, TTL, invalidation, updatable, etc.

       

      To spitball one one possible solution:  an API to manipulate the plan against the incremental materialized view.  The planner would execute an API as it creates the node against the view.  The API would return return the view definition (a SQL string? an AST?) that the planner would then use to prepare the remainder of the plan. 

       

      In the lazy, incremental materialized views use case, the user defined code would examine the pushed qualifiers to view (perhaps this only makes sense in the context of a dependent join?) and bifurcate it as a UNION between a query against the cache and a query against the source.  Somehow, the results of the source must be inserted into the cache.  Would it make sense to leverage anonymous blocks?  Perhaps expand the syntax to support something like the following:

      SELECT blah FROM cache WHERE ...

      UNION

      BEGIN

      INSERT INTO cache SELECT blah FROM source WHERE ...;

      SELECT blah FROM cache WHERE ... WITH RETURN;

      END

        • 1. Re: Feature request - Lazy, Incremental Materialized Views
          shawkins

          Generally determining when predicates are satisfyible with existing cache can be difficult.  Is it possible that a partitioning strategy would make the most sense here, at least as a simplification?

           

          That is rather than fully materializing a view, you could materialize it in partitions with a top level view accessing all of those partitions with a select or where clause partitioning that can then be used by the optimizer?  This could be also be more easily automated based upon just knowing the partitioning strategy / values.

          • 2. Re: Feature request - Lazy, Incremental Materialized Views
            markaddleman

            That's clever but the problem I see is that the partitioning strategy is data dependent.  In order to apply this strategy, I would have to regularly create new materialized views and alter the top level view.  That would eliminate internal materialized views but a clever use of external materialized views might work.  I'll play around with this.