hiveql - In hive, how to do a calculation among 2 rows? -
i have table.
+------------------------------------------------------------+ | ks | time | val1 | val2 | +-------------+---------------+---------------+--------------+ | | 1 | 1 | 1 | | b | 1 | 3 | 5 | | | 2 | 6 | 7 | | b | 2 | 10 | 12 | | | 4 | 6 | 7 | | b | 4 | 20 | 26 | +------------------------------------------------------------+
what want each row,
ks | time | val1 | val1 of next ts of same ks |
to clear, result of above example should be,
+------------------------------------------------------------+ | ks | time | val1 | next.val1 | +-------------+---------------+---------------+--------------+ | | 1 | 1 | 6 | | b | 1 | 3 | 10 | | | 2 | 6 | 6 | | b | 2 | 10 | 20 | | | 4 | 6 | null | | b | 4 | 20 | null | +------------------------------------------------------------+
(i need same next value2 well)
i tried lot come hive query this, still no luck. able write query in sql mentioned here (quassnoi's answer), couldn't create equivalent in hive because hive doesn't support subqueries in select.
can please me achieve this?
thanks in advance.
edit:
query tried was,
select ks, time, val1, next[0] next.val1 (select ks, time, val1 coalesce( ( select val1, time mytable mi mi.val1 > m.val1 , mi.ks = m.ks order time limit 1 ), cast(0 bigint)) next mytable m order time) t2;
your query seems quite similar "year ago" reporting ubiquitous in financial reporting. think left outer join
looking for.
we join table mytable
itself, naming 2 instances of same table m
, n
. every entry in first table m
attempt find matching record in n
same ks
value incremented value of time
. if record not exist, column values n
null
.
select m.ks, m.time, m.val1, n.val1 next_val1, m.val2, n.val2 next_val2 mytable m left outer join mytable n on ( m.ks = n.ks , m.time + 1 = n.time );
returns following.
ks time val1 next_val1 val2 next_val2 1 1 6 1 7 2 6 6 7 7 3 6 null 7 null b 1 3 10 5 12 b 2 10 20 12 26 b 3 20 null 26 null
hope helps.
Comments
Post a Comment