MonadLUG man page of the month presentation: uniq
8 February 2007
uniq removes adjacent duplicate lines, so it is normally run on a sorted file.
sample file: in.txt
a, one
a, one
b, two
B, Two
c, three
b, one
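To follow along, the sample file can be recreated with a one-liner. This is only a sketch; it assumes you are happy to write in.txt into the current directory.

```shell
# Recreate the sample file; printf '%s\n' writes each argument on its own line.
printf '%s\n' 'a, one' 'a, one' 'b, two' 'B, Two' 'c, three' 'b, one' > in.txt

# Show what was written.
cat in.txt
```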
Simplest example: uniq <in.txt
a, one
b, two
B, Two
c, three
b, one
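uniq compares each line only with the line before it, which is why the input is expected to be sorted. A small sketch with made-up input (the letters a and b are illustrative, not taken from in.txt) shows the difference:

```shell
# uniq alone: the two "a" lines are not adjacent, so both survive.
unsorted=$(printf '%s\n' a b a | uniq)

# sort first: the duplicates become adjacent and collapse into one.
sorted=$(printf '%s\n' a b a | sort | uniq)

printf 'uniq only:\n%s\n' "$unsorted"
printf 'sort | uniq:\n%s\n' "$sorted"
```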
But there's much more!
Show how many times each line occurs: uniq -c <in.txt
2 a, one
1 b, two
1 B, Two
1 c, three
1 b, one
(uniq -c right-aligns each count, so the real output is indented with leading blanks)
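If the leading blanks before the counts get in the way, for example when another program reads them, one option is to strip them with sed. A minimal sketch; the sample function is a hypothetical stand-in for in.txt so the snippet runs on its own:

```shell
# Hypothetical helper: prints the same six lines as in.txt.
sample() { printf '%s\n' 'a, one' 'a, one' 'b, two' 'B, Two' 'c, three' 'b, one'; }

# Count adjacent duplicates, then remove the blanks in front of each count.
counts=$(sample | uniq -c | sed 's/^ *//')

printf '%s\n' "$counts"
```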
Print only the duplicate lines: uniq -d <in.txt
a, one
Print only the duplicate lines, with counts: uniq -d -c <in.txt
2 a, one
Print unique lines -- suppress duplicates: uniq -u <in.txt
b, two
B, Two
c, three
b, one
And with case-insensitivity: uniq -u -i <in.txt
c, three
b, one
Skip first field, check uniq on the second: uniq -f 1 <in.txt
a, one
b, two
B, Two
c, three
b, one
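Now sort (case-sensitively) on the second field first: sort -k 2 <in.txt | uniq -f 1
Note that sort and uniq count fields differently: sort -k 2 starts the key at field 2, while uniq -f 1 skips one field. Both end up comparing from the second field on.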
B, Two
a, one
c, three
b, two
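The ordering above, with B, Two first, is what a byte-order sort produces; under other locales sort may order the lines differently. The sketch below pins the locale with LC_ALL=C, which is an assumption about how the output was generated, and uses a hypothetical sample function in place of in.txt:

```shell
# Hypothetical helper: prints the same six lines as in.txt.
sample() { printf '%s\n' 'a, one' 'a, one' 'b, two' 'B, Two' 'c, three' 'b, one'; }

# sort -k 2 : the sort key starts at field 2 (sort counts fields from 1).
# uniq -f 1 : skip 1 field, so the comparison also starts at field 2.
# LC_ALL=C  : byte-order collation, so uppercase sorts before lowercase.
by_field=$(sample | LC_ALL=C sort -k 2 | uniq -f 1)

printf '%s\n' "$by_field"
```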
Of course, we could have also used plain sort: sort -k 2 -u <in.txt
B, Two
a, one
c, three
b, two
Plus case insensitivity and count: cat in.txt | sort -k 2 -f | uniq -c -f 1 -i
3 a, one
1 c, three
2 B, Two
And, for our finale, sort numerically, ignoring leading blanks, to get
the unique lines listed by the number of times they appear.
cat in.txt | sort -k 2 -f | uniq -c -f 1 -i | sort -b -n
1 c, three
2 B, Two
3 a, one
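To put the most frequent value first instead, the final sort can be reversed with -r. This is a sketch rather than part of the original talk; it assumes byte-order collation (LC_ALL=C) for the first sort and uses a hypothetical sample function in place of in.txt:

```shell
# Hypothetical helper: prints the same six lines as in.txt.
sample() { printf '%s\n' 'a, one' 'a, one' 'b, two' 'B, Two' 'c, three' 'b, one'; }

# 1. Sort on field 2, folding case, so equal values become adjacent.
# 2. Count runs, skipping field 1 and ignoring case.
# 3. Sort by count, ignoring the leading blanks, highest count first.
ranked=$(sample | LC_ALL=C sort -k 2 -f | uniq -c -f 1 -i | sort -b -n -r)

printf '%s\n' "$ranked"
```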
Refer to your friendly man page or Google for
linux uniq tutorial
for lots of information.
Presented by Ray Côté
Appropriate Solutions, Inc.
"We Build Software"
http://www.AppropriateSolutions.com
--
TedRoche - 08 Mar 2007