Content of the count_uniq.awk script:
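BEGIN {
    FS = ","    # input fields are comma-separated
}
{
    # Count a row toward its customer ($2) only the first time the whole row is seen.
    if (!($0 in seen)) arr[$2]++
    seen[$0] = 1    # remember the row so later duplicates are skipped
}
END {
    # Print each customer with its count; the order of "for (a in arr)" is unspecified.
    for (a in arr) print a, arr[a]
}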
What does it do?
It counts how many accounts each customer has, ignoring duplicate rows. Details are shown in the image below:
Customer "a" has 3 different accounts: [1, 2, 3]
Customer "b" has 1 account: [2]
Customer "c" has 2 accounts: [1, 2]
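
The counts listed above can be reproduced with a small, self-contained sketch. The sample rows and the column layout (account in column 1, customer in column 2) are assumptions made for illustration, since the original input is only shown in the image. The inline program uses the common `!seen[$0]++` idiom, which should behave the same as the explicit `in seen` check in count_uniq.awk, and the output is piped through `sort` because awk does not guarantee the order of a `for (key in array)` loop.

```shell
# Hypothetical sample input (assumed layout: account,customer), with two duplicated rows.
input='1,a
2,a
3,a
1,a
2,b
1,c
2,c
2,c'

# !seen[$0]++ is true only the first time a full row appears,
# so each distinct row is counted once toward its customer ($2).
result=$(printf '%s\n' "$input" |
  awk -F, '!seen[$0]++ { count[$2]++ } END { for (c in count) print c, count[c] }' |
  sort)

printf '%s\n' "$result"
```

With a real file, the script itself would be run as `awk -f count_uniq.awk accounts.csv`, where `accounts.csv` is a placeholder name for the input file.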