Get index of different characters while comparing two strings

Posted: Tue Sep 17, 2013 6:26 pm
by vivek88
Let's say there are two strings, PR-ACT-SOURCE-DETAIL-1 and PR-ACT-SOURCE-DETAIL-2. I want to compare these two strings and find each position where they differ.

I tried to handle it like this:

 PERFORM VARYING N FROM 1 BY 1 UNTIL N > 5000
    IF PR-ACT-SOURCE-DETAIL-1 OF TRANSACTION-RECORD-1(N:1)
        IS NOT EQUAL TO
       PR-ACT-SOURCE-DETAIL-2 OF TRANSACTION-RECORD-2(N:1)
        MOVE 'Y' TO WS-DIFF-FOUND
        DISPLAY 'DIFFERENCE FOUND AT POSITION ' N
    END-IF
 END-PERFORM

The problem with the above code is that the loop runs 5000 times for each string, and I need to compare 10,000 such strings, so the execution time becomes too high.

Is there any other way to do the same thing that requires less execution time?

Re: Get index of different characters while comparing two st

Posted: Tue Sep 17, 2013 7:11 pm
by Robert Sample
1. You're talking about 50,000,000 comparisons -- so it will take some time, no matter what.
2. You search all 5000 characters regardless -- why not change your code to stop the comparison loop when you find a difference, unless you need to know all such differences?
3. You might get better results by comparing the two variables and only if they are not equal perform the loop.

Re: Get index of different characters while comparing two st

Posted: Tue Sep 17, 2013 7:18 pm
by vivek88
Actually, I cannot stop the comparison loop when a difference is found, because I have to report every position in the string (which is a lump of data) where there is a difference.

Re: Get index of different characters while comparing two st

Posted: Tue Sep 17, 2013 7:39 pm
by Robert Sample
In that case, there is not much you're going to be able to do to speed up the process -- you have to do 50,000,000 comparisons. If a lot of the variables match, coding a check for them to be equal before starting the loop would take some of the comparisons out of the process; otherwise, plan on it taking a bit of time.
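
The equality check before the loop could look something like the sketch below. It is only an illustration, not tested on the original poster's system: the program name DIFFPOS, the standalone fields WS-STRING-1 and WS-STRING-2, and the test data are made up for the example and stand in for the two PR-ACT-SOURCE-DETAIL fields. The single group-level comparison lets matching strings skip the 5000 character comparisons entirely, while mismatched strings still report every differing position.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. DIFFPOS.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * Hypothetical stand-ins for the two 5000-byte fields.
       01  WS-STRING-1        PIC X(5000).
       01  WS-STRING-2        PIC X(5000).
       01  N                  PIC S9(8) COMP.
       01  WS-DIFF-FOUND      PIC X VALUE 'N'.
       PROCEDURE DIVISION.
      * Made-up test data: identical except for one position.
           MOVE ALL 'A' TO WS-STRING-1 WS-STRING-2
           MOVE 'B' TO WS-STRING-2(42:1)
      * One comparison of the whole fields first; the
      * character loop runs only when they differ.
           IF WS-STRING-1 NOT = WS-STRING-2
               MOVE 'Y' TO WS-DIFF-FOUND
               PERFORM VARYING N FROM 1 BY 1 UNTIL N > 5000
                   IF WS-STRING-1(N:1) NOT = WS-STRING-2(N:1)
                       DISPLAY 'DIFFERENCE FOUND AT POSITION ' N
                   END-IF
               END-PERFORM
           END-IF
           GOBACK.
```

How much this saves depends entirely on how many of the 10,000 string pairs match; if nearly all of them differ, the extra comparison buys nothing.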

Re: Get index of different characters while comparing two st

Posted: Tue Sep 17, 2013 8:26 pm
by dick scherrer
Hello and welcome to the forum,

"3. You might get better results by comparing the two variables and only if they are not equal perform the loop."
If you do this, you may avoid much of the overhead, unless nearly every record has one or more mismatches.

If you only do the loop for 1 or 2% of the records (the ones with a difference), the overhead will be drastically less than doing the whole loop for every record. If a high percentage of the data has mismatches, the sheer volume of output will keep people from making use of the difference info.

Re: Get index of different characters while comparing two st

Posted: Tue Sep 17, 2013 8:41 pm
by BillyBoyo
Please don't post in multiple places at the same time. This one is on Stack Overflow as well.

Re: Get index of different characters while comparing two st

Posted: Wed Sep 18, 2013 2:33 am
by chaat
Is this on Enterprise COBOL? If so, is your subscript defined as PIC S9(8) COMP?

That can make a pretty large difference if you had it as PIC 9(4): USAGE DISPLAY subscripts probably perform about 5 times worse than binary subscripts.

The compiler option TRUNC(OPT) would help as well.
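
For illustration, a binary subscript in a small standalone program might look like the sketch below. The program name SUBDEMO, the field names, and the test data are invented for the example. The commented-out CBL line shows one way TRUNC(OPT) can be requested on Enterprise COBOL; it is left as a comment because it is compiler-specific, and your site may set the option in the compile JCL instead.

```cobol
      * Enterprise COBOL only (assumption: option set in source):
      * CBL TRUNC(OPT)
       IDENTIFICATION DIVISION.
       PROGRAM-ID. SUBDEMO.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * Binary subscript, as suggested above.
       01  WS-SUB             PIC S9(8) COMP.
      * The slower alternative would be a USAGE DISPLAY item:
      * 01  WS-SUB            PIC 9(4).
       01  WS-DIFF-COUNT      PIC S9(8) COMP VALUE 0.
       01  WS-TEXT-1          PIC X(5000) VALUE SPACES.
       01  WS-TEXT-2          PIC X(5000) VALUE SPACES.
       PROCEDURE DIVISION.
      * Made-up test data: two positions differ.
           MOVE 'X' TO WS-TEXT-2(10:1)
           MOVE 'Y' TO WS-TEXT-2(5000:1)
           PERFORM VARYING WS-SUB FROM 1 BY 1
                   UNTIL WS-SUB > 5000
               IF WS-TEXT-1(WS-SUB:1) NOT = WS-TEXT-2(WS-SUB:1)
                   ADD 1 TO WS-DIFF-COUNT
               END-IF
           END-PERFORM
           DISPLAY 'DIFFERENCES: ' WS-DIFF-COUNT
           GOBACK.
```

The only change that matters for speed here is the definition of WS-SUB; the loop itself is the same shape as in the original post. Whether the gain is anywhere near the factor mentioned above would have to be measured on the actual system.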