I have data like the following example:
1 ab u
2 cd 1 ….
3 cd 2 ….
4 cd 3 ….
5 cd 4 ….
6 ab v
7 cd 1 ….
8 cd 2 ….
9 cd 3 ….
10 cd 4 ….
11 ab w
12 cd 1 ….
13 cd 2 ….
14 cd 3 ….
15 cd 4 ….
16 cd 2 ….
17 cd 2 ….
18 cd 3 ….
19 cd 4 ….
20 ab x
21 cd 1 ….
22 cd 2 ….
23 cd 3 ….
24 cd 4 ….
Please note: the numbers in front were added by me for reference. The dots are data and the rest is the key.
I need to remove the duplicate records within the group of records from record number 11 to record number 19.
To achieve this, I am:
1. Grouping the records from record number 11 to record number 20 by adding a flag "1" to the end of all these records using a WHEN=GROUP statement.
2. Removing the flag from record number 20 using an INCLUDE condition.
3. Creating two separate files: (a) records without the flag and (b) records with the flag in the last field (in 2 SORT steps).
4. Removing the duplicates from the (b) file.
5. Merging the (a) file and the file from the previous step.
So it currently takes me 6 SORT steps to complete the task, and the records from record number 11 to record number 19 also get pushed to the end of the file.
It would be great if any of you could tell me whether there is a simpler way to do this.
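To make the intended result concrete, here is a minimal Python sketch of the logic as a single pass that keeps every record in its original position. It is only an illustration, not a SORT solution. The function name dedupe_group, the assumption that a group header is any record whose first field is "ab", the assumption that the key is the first two fields, and the choice to keep the first occurrence of each key are all mine and are not confirmed by the real file layout.

```python
def dedupe_group(records, target_header, header_prefix="ab"):
    """Drop repeated keys inside one group, leaving all other records alone.

    Assumptions (hypothetical, for illustration only):
      - a group starts at any record whose first field equals header_prefix
      - the key of a detail record is its first two fields
      - the first record seen for a key is the one to keep
    """
    out = []
    in_target = False
    seen = set()
    for rec in records:
        fields = rec.split()
        if fields and fields[0] == header_prefix:
            # A new group starts: check whether it is the one to de-duplicate.
            in_target = rec.strip() == target_header
            seen = set()
            out.append(rec)
            continue
        if in_target:
            key = tuple(fields[:2])
            if key in seen:
                continue  # duplicate key inside the target group: skip it
            seen.add(key)
        out.append(rec)
    return out


sample = [
    "ab u", "cd 1 a", "cd 1 b",
    "ab w", "cd 1 c", "cd 2 d", "cd 2 e", "cd 1 f",
    "ab x", "cd 1 g",
]
result = dedupe_group(sample, "ab w")
```

In this sample only the "ab w" group loses its repeated keys. The "ab u" group keeps both of its "cd 1" records, and nothing is moved to the end of the output.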
Regards,
Nirmal.