sql >> Base de Datos >  >> RDS >> PostgreSQL

Optimizar agregados lentos en unión LATERAL

Esta debería ser una variante más rápida con LATERAL subconsultas Sin probar.

SELECT s.record_id, s.security_id, s.date
     , s.price / l.pmax   AS price_to_peak_earnings
     , s.price / l.pmin   AS price_to_minimum_earnings
  -- , ...
     , s.price / l.cape1  AS cape1
     , s.price / l.cape2  AS cape2
  -- , ...
     , s.price / l.cape10 AS cape10
     , s.price / l.capb1  AS capb1
     , s.price / l.capb2  AS capb2
  -- , ...
     , s.price / l.capb10 AS capb10
  -- , ...
FROM  (
   SELECT *
        , (date - interval  '1 y')::date AS date1
        , (date - interval  '2 y')::date AS date2
        -- ...
        , (date - interval '10 y')::date AS date10
   FROM  (
      SELECT *, min(date) OVER (PARTITION BY security_id) AS min_date
      FROM   security_data
      ) s1
   ) s
LEFT   JOIN LATERAL (
   SELECT CASE WHEN s.date10 >= s.min_date THEN NULLIF(max(earnings)                               , 0) END AS pmax
        , CASE WHEN s.date10 >= s.min_date THEN NULLIF(min(earnings)                               , 0) END AS pmin
        -- ...
        ,                                       NULLIF(avg(earnings) FILTER (WHERE date >= s.date1), 0)     AS cape1   -- no case
        , CASE WHEN s.date2  >= s.min_date THEN NULLIF(avg(earnings) FILTER (WHERE date >= s.date2), 0) END AS cape2
        -- ...
        , CASE WHEN s.date10 >= s.min_date THEN NULLIF(avg(earnings)                               , 0) END AS cape10  -- no filter

        ,                                       NULLIF(avg(book)     FILTER (WHERE date >= s.date1), 0)     AS capb1
        , CASE WHEN s.date2  >= s.min_date THEN NULLIF(avg(book)     FILTER (WHERE date >= s.date2), 0) END AS capb2
        -- ...
        , CASE WHEN s.date10 >= s.min_date THEN NULLIF(avg(book)                                   , 0) END AS capb10
        -- ...
   FROM   security_data 
   WHERE  security_id = s.security_id
   AND    date >= s.date10
   AND    date <  s.date
   ) l ON s.date1 >= s.min_date  -- no computations if < 1 year of trailing data
ORDER  BY s.security_id, s.date;

Todavía no va a ser increíblemente rápido, ya que cada fila necesita múltiples agregaciones separadas. El cuello de botella aquí será la CPU.

También vea el seguimiento con un enfoque alternativo (ÚNETE al calendario generado + funciones de ventana):