mysql - Detecting near duplicates above a threshold -
i want able query table records suspect may duplicates.
i've racked brains can't think begin one, i've simplified problem as possible, , came ask here!
here's simplified table:
create table sales ( `id1` int auto_increment primary key, `amount` decimal(6,2), `date` datetime ); here's test values:
insert sales (`amount`, `date`) values (10, '2013-05-15t11:11:00'), (11, '2013-05-15t11:11:11'), (20, '2013-05-15t11:22:00'), (3, '2013-05-15t12:12:00'), (4, '2013-05-15t12:12:12'), (45, '2013-05-15t12:22:00'), (4, '2013-05-15t12:24:00'), (8, '2013-05-15t13:00:00'), (9, '2013-05-15t13:01:00'), (10, '2013-05-15t14:00:00'); the problem
i want return sales above amount y, have neighbour sales above y recorded within x minutes of each other.
ie, data:
amt, date (10, '2013-05-15t11:11:00'), (11, '2013-05-15t11:11:11'), (20, '2013-05-15t11:22:00'), (3, '2013-05-15t12:12:00'), (4, '2013-05-15t12:12:12'), (45, '2013-05-15t12:22:00'), (4, '2013-05-15t12:24:00'), (8, '2013-05-15t13:00:00'), (9, '2013-05-15t13:01:00'), (10, '2013-05-15t14:00:00'); where @yval = 5 , @xmins = 10
expected result be:
(10, '2013-05-15t11:11:00'), (11, '2013-05-15t11:11:11'), (20, '2013-05-15t11:22:00'), (8, '2013-05-15t13:00:00'), (9, '2013-05-15t13:01:00'), i've put above fiddle: http://sqlfiddle.com/#!2/cf8fe
any appreciated!
try somthing this:
select distinct s1.* sales s1 left join sales s2 on (s1.id1 != s2.id1 , s1.amount >= s2.amount - @xval , s1.amount <= s2.amount + @xval , s1.date >= date_sub(s2.date, interval @xmins minute) , s1.date <= date_add(s2.date, interval @xmins minute) ) s2.id1 not null extends
fix errors
result data looks like:
+-----+--------+---------------------+ | id1 | amount | date | +-----+--------+---------------------+ | 1 | 10.00 | 2013-05-15 11:11:00 | | 2 | 11.00 | 2013-05-15 11:11:11 | | 4 | 3.00 | 2013-05-15 12:12:00 | | 5 | 4.00 | 2013-05-15 12:12:12 | | 8 | 8.00 | 2013-05-15 13:00:00 | | 9 | 9.00 | 2013-05-15 13:01:00 | +-----+--------+---------------------+ extends 2
select distinct s1.* sales s1 left join sales s2 on (s1.id1 != s2.id1 , s2.amount >= @xval , s1.date >= date_sub(s2.date, interval @xmins minute) , s1.date <= date_add(s2.date, interval @xmins minute) ) s2.id1 not null , s1.amount >= @xval
Comments
Post a Comment