Page 1 of 4

Sample code: Match/merge two files

PostPosted: Thu Jul 31, 2008 12:18 am
by dick scherrer
Hello,

Attached to this "sticky" is a small sample program that matches/merges 2 sequential files that have been previously put "in sequence".

Depending on your exact requirement, changes may need to be made, but the overall process works for most cases. If your files have a 1-to-1 or a 1-to-many relationship, the model should work for you. One case where additional code may be needed is when both files might have duplicate key values. The issue may be how to determine how the duplicates should "sync up".

Something to keep in mind is that to keep things more manageable you do not want to code compares for multiple keys. As you read records (before any comparing) combine the "key" fields into a ws field (one for each file) so the compare does not become other than simple.

      IDENTIFICATION DIVISION.
       PROGRAM-ID.  SAMPMTCH.
       DATE-COMPILED.
      *---------------------------------------------------------------*
      *                                                               *
      * THIS PROGRAM COMPARES THE CAP INFO AGAINS THE ELIG INFO AND   *
      *  CREATES A FILE OF MATCHED PROVIDER, PERSON, RATE INFO.       *
      *  THE NEW-RATE FILE IS TAB-DELIMITED FOR DOWNLOAD AND USE      *
      *  WITH A WIN-BASED SYSTEM.                                     *
      *                                                               *
      * THIS CODE IS A STRIPPED DOWN VERSION OF AN ACTUAL PRODUCTION  *
      *  PROGRAM - NEARLY ALL OF THE BUSINESS RULES LOGIC HAS BEEN    *
      *  REMOVED TO MAKE THE EXAMPLE EASIER TO READ.                  *
      *                                                               *
      * IF YOU CHOOSE TO USE THIS MODEL, PLEASE MAKE SURE YOU         *
      *  THOROUGHLY TEST YOUR VERSION BEFORE USING FOR SOME BUSINESS  *
      *  REQUIREMENT.                                                 *
      *                                                               *
      *---------------------------------------------------------------*
       ENVIRONMENT DIVISION.
       CONFIGURATION SECTION.
       INPUT-OUTPUT SECTION.
       FILE-CONTROL.
      *
           SELECT COMP-CAP      ASSIGN TO UT-S-CAP.
           SELECT COMP-ELG      ASSIGN TO UT-S-ELIG.
           SELECT NEW-RATE      ASSIGN TO UT-S-NEWRATE.
      *
       DATA DIVISION.
       FILE SECTION.
      *
       FD  COMP-CAP
           LABEL RECORDS ARE STANDARD
           RECORDING MODE IS F
           BLOCK CONTAINS 0 RECORDS.

       01  COMP-CAP-REC.
           05 CCR-PROV          PIC X(9).
           05 CCR-FILL          PIC X.
           05 CCR-RATE          PIC X(5).
           05 FILLER            PIC X(65).
      *
       FD  COMP-ELG
           RECORDING MODE IS F
           BLOCK CONTAINS 0 RECORDS
           LABEL RECORDS ARE STANDARD.

       01  COMP-ELG-REC.
           05 CER-PRSN          PIC X(12).
           05 FILLER            PIC X.
           05 CER-PROV          PIC X(9).
           05 FILLER            PIC X(58).

       FD  NEW-RATE
           RECORDING MODE IS F
           BLOCK CONTAINS 0 RECORDS
           LABEL RECORDS ARE STANDARD.

       01  NEW-RATE-REC         PIC X(80).

       WORKING-STORAGE SECTION.
       77  WKS-MESSAGE         PIC X(23) VALUE
                                   'WORKING-STORAGE SECTION'.

       77  CAP-READ               PIC 9(7) COMP-3 VALUE 0.
       77  ELG-READ               PIC 9(7) COMP-3 VALUE 0.
       77  NEW-RATE-RECS          PIC 9(7) COMP-3 VALUE 0.
       77  RATE-ZEROED            PIC 9(7) COMP-3 VALUE 0.
       77  TOT-MONEY              PIC 9(7)V99 COMP-3 VALUE 0.
      *
       01  MATCH-FILES.
           05 NEED-CAP              PIC X VALUE 'Y'.
           05 NEED-ELG              PIC X VALUE 'Y'.
           05 EOF-CAP               PIC X VALUE 'N'.
           05 EOF-ELG               PIC X VALUE 'N'.
           05 COMP-CAPP             PIC 9(9) VALUE ZEROS.
           05 COMP-ELIG             PIC 9(9) VALUE ZEROS.
      *
       01  NEW-RATE-REC-WORK.
           05 NRR-PROV          PIC X(9).
           05 FILLER            PIC X VALUE X'05'.
           05 NRR-PRSN          PIC X(12).
           05 FILLER            PIC X VALUE X'05'.
           05 NRR-RATE          PIC X(5).
           05 FILLER REDEFINES NRR-RATE.
              10 FILLER         PIC X.
              10 NRR-DLR        PIC 9.
              10 FILLER         PIC X.
              10 NRR-CENTS      PIC 99.
           05 FILLER            PIC X(52).
      *
       01  WORK-MONEY           PIC 9V99.
       01  WORK-MONEY-R REDEFINES WORK-MONEY.
           05 WM-DLR            PIC 9.
           05 WM-CENTS          PIC 99.
      *
       PROCEDURE DIVISION.
       010-OPEN-FILES.
           OPEN INPUT  COMP-CAP
                       COMP-ELG
                OUTPUT NEW-RATE.
      *
       020-READ-CAP-RECORDS.
           IF EOF-CAP = 'Y' OR
              NEED-CAP = 'N'
              GO TO 030-READ-ELG.
           READ COMP-CAP AT END
                MOVE 'Y' TO EOF-CAP
                MOVE 'N' TO NEED-CAP
                MOVE 999999999 TO COMP-CAPP
                MOVE ALL 'Z' TO COMP-CAP-REC
                GO TO 030-READ-ELG.
           IF CCR-PROV NOT NUMERIC
              DISPLAY 'CAP PROVIDER NOT NUMERIC - SKIPPED = '
                      COMP-CAP-REC
              GO TO 020-READ-CAP-RECORDS.
           MOVE CCR-PROV TO COMP-CAPP.
           ADD 1 TO CAP-READ.
           MOVE 'N' TO NEED-CAP.
      *
       030-READ-ELG.
           IF EOF-ELG = 'Y' OR
              NEED-ELG = 'N'
              GO TO 040-MATCH-FILES.
           READ COMP-ELG AT END
                MOVE 'Y' TO EOF-ELG
                MOVE 'N' TO NEED-ELG
                MOVE 999999999 TO COMP-ELIG
                MOVE ALL 'Z' TO COMP-ELG-REC
                GO TO 040-MATCH-FILES.
           IF CER-PROV NOT NUMERIC
              DISPLAY 'ELG PROVIDER NOT NUMERIc - SKIPPED'
              GO TO 030-READ-ELG.
           MOVE CER-PROV TO COMP-ELIG.
           ADD 1 TO ELG-READ.
           MOVE 'N' TO NEED-ELG.


       040-MATCH-FILES.
           IF EOF-CAP = 'Y' AND
              EOF-ELG = 'Y'
              GO TO 990-PUBLISH-STATS.
      *
           IF COMP-CAPP = COMP-ELIG GO TO 100-CAP-ELIG-MATCH.
      * these compares/comments change dependng on requirements.
           IF COMP-CAPP < COMP-ELIG GO TO 120-CAP-NOT-USED.
      *    IF COMP-CAPP < COMP-ELIG
      *       MOVE 'Y' TO NEED-CAP
      *       GO TO 020-READ-CAP-RECORDS.
           IF COMP-CAPP > COMP-ELIG GO TO 140-GET-RATE.
      *    IF COMP-CAPP > COMP-ELIG
      *       MOVE 'Y' TO NEED-ELG
      *       DISPLAY 'MISSING CLAIM DATA '
      *       GO TO 020-READ-CAP-RECORDS.

      *  WE SHOULD NOT BE ABLE TO GET HERE. . . .
           DISPLAY ' 040-MATCH-FILES FATAL ERROR'.
           DISPLAY ' CAP=' CCR-PROV ' ELIG=' CER-PROV.
           DISPLAY ' RUN TERMINATED.'.
           GOBACK.
      *
       100-CAP-ELIG-MATCH.
      * these may change depending on how duplicates are handled.
      *    MOVE 'Y' TO NEED-CAP, NEED-ELG.
           MOVE 'Y' TO NEED-ELG.
      *
           MOVE CER-PROV TO NRR-PROV.
           MOVE CER-PRSN TO NRR-PRSN.
           MOVE CCR-RATE TO NRR-RATE.
           MOVE NRR-DLR TO WM-DLR.
           MOVE NRR-CENTS TO WM-CENTS.
           COMPUTE TOT-MONEY = TOT-MONEY + WORK-MONEY.
           WRITE NEW-RATE-REC FROM NEW-RATE-REC-WORK.
           COMPUTE NEW-RATE-RECS = NEW-RATE-RECS + 1.
      *
           GO TO 020-READ-CAP-RECORDS.
      *
       120-CAP-NOT-USED.
           MOVE 'Y' TO NEED-CAP.
      *
      *    DISPLAY 'CAP RECORD NOT USED = ' COMP-CAP-REC.
      *
      *    MOVE CER-PROV TO NRR-PROV.
      *    MOVE CER-PRSN TO NRR-PRSN.
      *    MOVE CCR-RATE TO NRR-RATE.
      *    WRITE NEW-RATE-REC FROM NEW-RATE-REC-WORK.
      *    COMPUTE NEW-RATE-RECS = NEW-RATE-RECS + 1.
      *
           GO TO 020-READ-CAP-RECORDS.
      *
       140-GET-RATE.
           MOVE 'Y' TO NEED-ELG.
      *
           MOVE CER-PROV TO NRR-PROV.
           MOVE CER-PRSN TO NRR-PRSN.
           MOVE ' 0.00'  TO NRR-RATE.
           WRITE NEW-RATE-REC FROM NEW-RATE-REC-WORK.
           COMPUTE NEW-RATE-RECS = NEW-RATE-RECS + 1.
           COMPUTE RATE-ZEROED   = RATE-ZEROED   + 1.
      *
           GO TO 020-READ-CAP-RECORDS.
      *
       990-PUBLISH-STATS.
           DISPLAY 'CAP RECS READ = ' CAP-READ.
           DISPLAY 'ELG RECS READ = ' ELG-READ.
           DISPLAY 'NEW RECS      = ' NEW-RATE-RECS.
           DISPLAY 'ZEROED RATES  = ' RATE-ZEROED.
           DISPLAY 'TOTAL MONEY   = ' TOT-MONEY.
      *
       9999-STOP.
           CLOSE COMP-CAP COMP-ELG NEW-RATE.
           GOBACK.
 


If you find a typo or a "real" error, please let me know via PM. If you prefer a different way, fine, but this is an approach that works many, many places.

I cannot emphasize enough that it is critical that you thoroughly test your version of the code before production implementation.

Re: 2-file match/merge sample code

PostPosted: Thu Jul 31, 2008 12:42 am
by ranga_subham
Thanks d.

Re: 2-file match/merge sample code

PostPosted: Thu Jul 31, 2008 12:48 am
by dick scherrer
You're welcome - good luck :)

d

Re: 2-file match/merge sample code

PostPosted: Mon Sep 07, 2009 3:53 pm
by panda
i test it.
thanks.

Re: 2-file match/merge sample code

PostPosted: Mon Sep 07, 2009 5:13 pm
by panda
HELP ME!

When i compiler this program.It complier successful,but it cannot run.
It shows that "There was an unsuccessful OPEN or CLOSE of file CAP in program SAMPMTCH
Neither FILE STATUS nor an ERROR declarative were specified. The status
From compile unit SAMPMTCH at entry point SAMPMTCH at compile unit offs
at address 19F017E6.

What is that? I donnot understand. I am a beginner. "

Re: 2-file match/merge sample code

PostPosted: Tue Sep 08, 2009 2:07 am
by dick scherrer
Hello,

The sample code must be modified to reflect the same dcb info as the files being used. . .

Some amount of diagnostic info was generated by the abended run (some message id and error text and/or some system abend code). If you post the error, someone should be able to help.

Re: 2-file match/merge sample code

PostPosted: Wed Sep 09, 2009 5:03 pm
by GuyC
This kind of code is fine and has proven it's worth in the past.

BUT I wouldn't hire ayone who still delivers this kind of code.

1. GOTO's : any go to ( which doesn't go to exit-paragraph) is reason for immediate dismissal.
2. conditions without a matching END- : IF without END-IF, READ...AT END without END-READ
3. program flows continues over paragraphs. (030-READ-ELG gets executed , just because program flow reaches the end of 020-READ-ELG)

I'll try to find some alternative source where more modern programming techniques are demonstrated.

Re: 2-file match/merge sample code

PostPosted: Thu Sep 10, 2009 12:32 am
by dick scherrer
Hello,

1. GOTO's : any go to ( which doesn't go to exit-paragraph) is reason for immediate dismissal.
Only an opinion and has fallen from favor consistently over the last 20 years. . .

In the 70's and much of the 80's this was true (indeed it was gospel and people were failed in formal classes for using GOTO in their class exercises), but the nonsense about GOTO-less programming has rapidly fallen from favor (at least in the more than 100 data centers i've supported in the last 25 years. . .).

It has been demonstrated that really good code may be delivered with some goto's and really awful code can be delivered with none.

Re: 2-file match/merge sample code

PostPosted: Thu Sep 10, 2009 6:17 pm
by Anuj Dhawan
I believe recent discussion, actually, falls under "what standards your shop follows".

I'm from such a shop where GOTOs are big NO in production though other client in out sourcing unit allows GOTOs well in production. Well opinion differs, probably, shop to shop.

Re: 2-file match/merge sample code

PostPosted: Thu Apr 01, 2010 4:37 am
by chaat
A larger issue for me is that the program does NOT appear to do SEQUENCE CHECKING on the two input files.

I always insist that any "matching" program that I review for production, must include sequence checking. The reason for this is that if somehow the data is NOT in sequence, we want the process to abend. Without sequence checking the output data would not be correct and no one would be alerted to that fact.

I've seen where in a recovery / restart, a programmer concatenated another input file in the match program rather than combining them in a sort step prior to the match program. The recovery from this simple mistake was a weeks worth of work from many people as it fed into a purge process which ended up deleting much more data than it was supposed to.

Chuck H.