
count of duplicates and % of total

Posted: Sat Jul 31, 2010 12:40 pm
by kamal
Hi
I have a file as shown below:

A
A
B
B
C
C
C
D
D


Using the above file as input, I want to create an output file like this:
A 0002 0009 22.22
B 0002 0009 22.22
C 0003 0009 33.33
D 0002 0009 22.22

I could create only the counts, using SUM FIELDS.
Can anyone suggest how to add the total of all records and the percentage to the same record?
LRECL is not an issue. Please assume FB 80 for input; for output, any LRECL or format is OK.

Thanks.

Re: count of duplicates and % of total

Posted: Mon Aug 02, 2010 8:58 pm
by kamal
I think I have not explained it properly.

The input file has 9 records in total:
A - 2 occurrences
B - 2 occurrences
C - 3 occurrences
D - 2 occurrences
Total - 9
So I am trying to create an output file that contains, for each key, its count, the total record count, and the percentage:
A - 2 occurrences - out of 9 - 2/9 = 22.22%
i.e.

A 0002 0009 22.22

Please advise whether this can be done with SORT.
I could create the following output by appending 001 to each record of the input file and using SUM FIELDS on the first column:

A 002
B 002
C 003
D 002
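
For readers who want to check the logic away from the mainframe, the "append 001 and sum on the first column" approach described above can be modelled in a few lines of Python. This is only a sketch of the idea, not DFSORT itself; the function name `sum_by_key` and the in-memory list standing in for the input file are assumptions made for illustration.

```python
from itertools import groupby


def sum_by_key(records):
    """Model of: append a counter of 1, sort on column 1, sum equal keys."""
    # Pair the 1-byte key with a counter of 1 (the appended 001).
    tagged = [(rec[0], 1) for rec in records]
    # Sort on the key so that equal keys become adjacent.
    tagged.sort(key=lambda pair: pair[0])
    # Collapse each run of equal keys into one line holding the summed counter.
    return [
        f"{key} {sum(n for _, n in group):03d}"
        for key, group in groupby(tagged, key=lambda pair: pair[0])
    ]


# Hypothetical stand-in for the 9-record input file from the post.
sample = ["A", "A", "B", "B", "C", "C", "C", "D", "D"]
for line in sum_by_key(sample):
    print(line)
```

The model makes the gap visible: each output line only knows its own count, so the grand total of 9 has to come from somewhere else, such as a separate pass over the input.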

Re: count of duplicates and % of total

Posted: Mon Aug 02, 2010 9:36 pm
by skolusu
kamal,

The following DFSORT JCL will give you the desired results:

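//* STEP0100: count the input records and write the total as a
//* DFSORT symbol (REC_COUNT,+nnnnnnnn) to a temporary dataset.
//STEP0100 EXEC PGM=SORT
//SYSOUT   DD SYSOUT=*
//SORTIN   DD DSN=Your input FB 80 byte dataset,DISP=SHR
//SORTOUT  DD DSN=&&S,DISP=(,PASS),SPACE=(TRK,(1,0),RLSE)
//SYSIN    DD *
  SORT FIELDS=COPY
  OUTFIL REMOVECC,NODETAIL,BUILD=(80X),
  TRAILER1=('REC_COUNT,+',COUNT=(M11,LENGTH=8))
//*
//* STEP0200: read the symbol through SYMNAMES, count the records
//* per key, and compute count * 10000 / REC_COUNT as a percentage.
//STEP0200 EXEC PGM=SORT
//SYSOUT   DD SYSOUT=*
//SYMNAMES DD DSN=&&S,DISP=SHR
//SORTIN   DD DSN=Your input FB 80 byte dataset,DISP=SHR
//SORTOUT  DD SYSOUT=*
//SYSIN    DD *
* Put an 8-byte counter with the value 1 at position 10
  INREC OVERLAY=(10:7C'0',C'1')
* Sort on the 1-byte key and sum the counters of equal keys
  SORT FIELDS=(1,1,CH,A)
  SUM FIELDS=(10,8,ZD)
* Build key, count, total, and percentage with two decimals
  OUTREC BUILD=(1,1,X,10,8,X,REC_COUNT,M11,LENGTH=8,X,
               (10,8,ZD,MUL,+10000),DIV,REC_COUNT,EDIT=(IIT.TT))
//*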



The output from this job is:
A 00000002 00000009  22.22
B 00000002 00000009  22.22
C 00000003 00000009  33.33
D 00000002 00000009  22.22
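
The percentage column comes from scaling before dividing: the count is multiplied by 10000, divided by the total, and the result is shown with two decimal places. As a rough cross-check, here is a small Python model of that arithmetic. It assumes the division drops the remainder and that EDIT=(IIT.TT) blanks up to two leading zeros; the names `edit_iit_tt` and `report_line` are hypothetical and exist only for this sketch.

```python
def edit_iit_tt(value):
    """Format a non-negative count of hundredths in the style of IIT.TT."""
    whole, frac = divmod(value, 100)
    # Up to two leading digits are blanked; the units digit is always shown.
    return f"{whole:3d}.{frac:02d}"


def report_line(key, count, total):
    # Scale first, then integer-divide, so two decimals survive truncation.
    pct_hundredths = (count * 10000) // total
    return f"{key} {count:08d} {total:08d} {edit_iit_tt(pct_hundredths)}"


# Counts and total taken from the 9-record sample in this thread.
counts = {"A": 2, "B": 2, "C": 3, "D": 2}
total = sum(counts.values())
for key, count in counts.items():
    print(report_line(key, count, total))
```

Because the remainder is dropped rather than rounded, the four percentages in the sample add up to 99.99 instead of 100.00.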

Re: count of duplicates and % of total

Posted: Wed Aug 04, 2010 10:54 am
by kamal
Wow! It worked.
Thank you so much.