Monday, December 1, 2008

Classifying data into class intervals using AWK

Recently, I needed to do a 'Simulation And Modeling' assignment which required me to classify discrete data into class intervals.
Thanks to my counting skill (or probably it was my concentration), I ended up finding different frequencies for the class intervals each time I counted.
I knew I would never complete the assignment if I continued counting by hand. I thought I had better write a script to do the counting for me. And hence, I wrote a small one for myself using awk.
I copied the discrete data into a plain text file called dataset. The values were separated by blank spaces and looked like this:

07 05 96 14 10 90 ..............

It took only a few lines of code to do the trick. Each class interval had to include its upper class limit and exclude its lower class limit. I wrote the following code in a file called class.sh and executed it.


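#!/bin/bash
# Count how many values fall in each class interval k < r <= k+10.
awk '
{
    # Bin every value on the line: 1..10 -> 0, 11..20 -> 1, and so on.
    for (j = 1; j <= NF; j++) {
        p[int(($j - 1) / 10)]++;
    }
}
END {
    # Print the table once, after all the input has been read.
    print "Class Interval", "\t", "Frequency";
    for (k = 0; k < 100; k += 10) {
        printf("%2d < r <= %3d", k, (k + 10));
        printf("\t\t%2d\n", p[k / 10]);
    }
}' dataset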
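
To see why subtracting 1 before dividing puts the upper class limit inside the interval, here is a minimal sketch that prints the interval chosen for a few boundary values. The function name show_bins and the sample values are made up for illustration; the binning expression is the same one used in the script.

```shell
#!/bin/bash
# show_bins is a hypothetical helper: it reads values from stdin and
# prints the class interval each one is assigned to.
show_bins() {
    awk '
    {
        for (j = 1; j <= NF; j++) {
            b = int(($j - 1) / 10);      # bin index, same formula as class.sh
            printf("%3d -> %2d < r <= %3d\n", $j, b * 10, b * 10 + 10);
        }
    }'
}

# Made-up boundary values: 10 stays in the first class, 11 starts the second.
echo "1 10 11 20 96 100" | show_bins
```

With this sample, 10 is reported under 0 < r <= 10 and 11 under 10 < r <= 20, which is the include-the-upper-limit behaviour the assignment asked for.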

The output of the script would look like this:

Class Interval    Frequency
 0 < r <=  10      8
10 < r <=  20     13
20 < r <=  30      8
30 < r <=  40     12
40 < r <=  50      7
50 < r <=  60      8
60 < r <=  70      8
70 < r <=  80      8
80 < r <=  90     12
90 < r <= 100     11
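
Since the whole point was to stop miscounting, a quick cross-check may be worth having: the frequencies should add up to the number of values read. The sketch below assumes the same binning; check_total is a hypothetical helper name, and the sample line is made up (it is not the assignment data).

```shell
#!/bin/bash
# check_total is a hypothetical helper: it reads values from stdin and
# compares the number of values read with the sum of the ten class counts.
check_total() {
    awk '
    {
        for (j = 1; j <= NF; j++) {
            n++;                         # values read
            p[int(($j - 1) / 10)]++;     # same binning as class.sh
        }
    }
    END {
        for (k = 0; k < 10; k++) sum += p[k];
        printf("values=%d binned=%d\n", n, sum);
    }'
}

# Made-up sample: 105 lies outside 0 < r <= 100, so it is read but not binned.
echo "07 05 96 14 10 90 105" | check_total
```

Running it against the real file would be check_total < dataset. If the two numbers differ, some value lies outside the ten classes.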



Easy, isn't it?

3 comments:

Jwalanta Shrestha said...

mighty awk.. neat!

Anonymous said...

Nothing to beat the UNIX tools !

Jitendra Harlalka said...

Thanks for the comments, guys. But sadly the post is badly rendered by Ubuntu Planet.