Techwithfun-Learning Technology with fun: Classifying data to class interval using AWK

Monday, December 1, 2008

Classifying data to class interval using AWK

Recently, I had a 'Simulation and Modeling' assignment that required me to classify discrete data into class intervals.
Thanks to my counting skills (or probably my concentration), I ended up with different frequencies for the class intervals each time I counted.
I knew I would never complete the assignment if I kept counting by hand, so I decided to let a script do the counting for me and wrote a small one using awk.
I copied the discrete data into a plain text file called dataset. The values were separated by blank spaces and looked like this:

07 05 96 14 10 90 ..............

It took only a few lines of code to do the trick. Each class interval had to include its upper class limit and exclude its lower class limit. I wrote the following code in a file called class.sh and executed it.



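#!/bin/bash
awk '
{
  # Bin each value: 1-10 goes to index 0, 11-20 to index 1, and so on.
  for (j = 1; j <= NF; j++) {
    p[int(($j - 1) / 10)]++;
  }
}
END {
  # Print the table once, after all input has been read.
  print "Class Interval", "\t", "Frequency";
  for (k = 0; k < 100; k += 10) {
    printf("%2d < r <= %3d", k, (k + 10));
    printf("\t\t%2d\n", p[k / 10]);
  }
}' dataset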
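
To see why the index expression int(($j-1)/10) includes the upper class limit and excludes the lower one, it helps to try it on a few boundary values. The snippet below is only a small sketch for checking that; the helper name bin_of is made up for illustration and is not part of class.sh. Index n stands for the interval 10*n < r <= 10*(n+1), and the formula assumes whole-number values from 1 to 100.

```shell
#!/bin/bash
# Hypothetical helper: print the bin index that int((x-1)/10) assigns to one value.
bin_of() {
  echo "$1" | awk '{ print int(($1 - 1) / 10) }'
}

# Boundary values: 10 and 20 should stay in the lower interval, 11 moves up.
for x in 1 10 11 20 100; do
  echo "$x -> bin $(bin_of "$x")"
done
# Prints:
# 1 -> bin 0
# 10 -> bin 0
# 11 -> bin 1
# 20 -> bin 1
# 100 -> bin 9
```

Subtracting 1 before dividing is what pushes 10, 20, 30 and so on down into the interval they close.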

The output of class.sh looks like this:


Class Interval     Frequency
 0 < r <=   10             8
10 < r <=   20            13
20 < r <=   30             8
30 < r <=   40            12
40 < r <=   50             7
50 < r <=   60             8
60 < r <=   70             8
70 < r <=   80             8
80 < r <=   90            12
90 < r <=  100            11
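
Since the whole point was to stop miscounting, one quick sanity check is that the frequencies add up to the number of values in the file. The frequencies in the table above sum to 95, so counting the values in the same dataset should also give 95, assuming every value lies between 1 and 100. Here is a minimal sketch of such a count; the helper name count_values and the sample numbers are made up for illustration.

```shell
#!/bin/bash
# Hypothetical helper: count the whitespace-separated values in a file.
count_values() {
  awk '{ n += NF } END { print n + 0 }' "$1"
}

# Demo on a small throwaway sample, not the original data.
sample=$(mktemp)
printf '07 05 96 14 10 90\n33 47\n' > "$sample"
count_values "$sample"   # prints 8 (6 values on the first line, 2 on the second)
rm -f "$sample"
```

Running count_values dataset next to class.sh gives the total to compare against the Frequency column.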

Easy, huh?