
Searching number with various formats

Posted: Fri Jul 26, 2013 1:44 pm
by nikesh_rai
Hi guys,

I need a suggestion from you. I have a requirement where I will have two lists of dial-able numbers. In the first list, the dial-able numbers will be present in a specified format, e.g.:

2260018000
3322619948
3323513216

Now the second list will also consist of dial-able numbers, but not in a fixed format. The list may look like this:

+912260018000
00913322623148
23513216
3322619948

Now the thing is, I have to search for the numbers from list 2 in list 1. I tried stripping the numbers in lists 1 and 2 down to 8 digits and then searching, but it is taking too much CPU.

Can anyone please suggest a better approach for this?

Re: Searching number with various formats

Posted: Fri Jul 26, 2013 2:00 pm
by prino
And no doubt you are using a linear search on two unsorted lists?

Re: Searching number with various formats

Posted: Fri Jul 26, 2013 2:15 pm
by nikesh_rai
Yes, I am using a linear search, but only the first list is sorted. The second list is not sorted.

Re: Searching number with various formats

Posted: Fri Jul 26, 2013 2:54 pm
by NicC
So, do you search the sorted list for a number from the unsorted list, or do you search the unsorted list for a number from the sorted list?

Re: Searching number with various formats

Posted: Fri Jul 26, 2013 4:03 pm
by nikesh_rai
I am searching for numbers from the unsorted list (list 2) in the sorted list (list 1).
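
Since list 1 is sorted, each lookup does not have to be a linear scan. A minimal sketch of a binary search over the sample data from the first post, assuming the numbers are held as strings in ascending order (the helper name contains_sorted is made up for illustration):

```python
from bisect import bisect_left

def contains_sorted(sorted_numbers, target):
    """Return True if target occurs in sorted_numbers (ascending order required)."""
    i = bisect_left(sorted_numbers, target)
    return i < len(sorted_numbers) and sorted_numbers[i] == target

# Sample data from the first post; list 1 is already in ascending order.
list1 = ["2260018000", "3322619948", "3323513216"]
list2 = ["+912260018000", "00913322623148", "23513216", "3322619948"]

# Each lookup costs O(log n) comparisons instead of a scan of the whole list.
found = [n for n in list2 if contains_sorted(list1, n)]
print(found)
```

This only cuts the cost of each lookup. Only the entry that is already in list 1's format is found, so the differing formats in list 2 still have to be dealt with separately.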

Re: Searching number with various formats

Posted: Fri Jul 26, 2013 4:40 pm
by BillyBoyo
Before doing any coding, you need to design, design, design.

Spend an hour concentrating on File 1, and describing everything you'd need to do, as a human, to process each record so that it could be matched against File 2 if it were presented as the simple subscriber's telephone number (no country code, no local code, just their personal number).

Spend two hours doing the same thing with File 2: everything you'd need to do to match that to the ideal File 1, doing it as a human.

Start getting to know your data. Know it well.

Don't think about programming yet. Think about how you'd need to manipulate the data. Make some good samples for both files. Make some expected output. Work out how you'd get from one to the other.

You may get to the stage that you don't think it can be done. Then post your findings here.

If you get to the stage of a solution, post your solution here.

The reason for not just giving you an answer is that the process of the design is a really good exercise for you. Going through it will pay you back many times over in your career, if you learn how to do it.

There is an answer. See if you can find it. Once you have the answer, the programming is very easy. That's the point of doing the design.

Re: Searching number with various formats

Posted: Fri Jul 26, 2013 7:54 pm
by richiewu
nikesh_rai wrote:Hi guys,

I need a suggestion from you. I have a requirement where I will have two lists of dial-able numbers. In the first list, the dial-able numbers will be present in a specified format, e.g.:

2260018000
3322619948
3323513216

Now the second list will also consist of dial-able numbers, but not in a fixed format. The list may look like this:

+912260018000
00913322623148
23513216
3322619948

Now the thing is, I have to search for the numbers from list 2 in list 1. I tried stripping the numbers in lists 1 and 2 down to 8 digits and then searching, but it is taking too much CPU.

Can anyone please suggest a better approach for this?


Based on list 2, generate a new list 3 that contains only fixed-format numbers, then search for those numbers in list 1.
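
A rough sketch of that idea, under two assumptions that the thread does not confirm: every prefixed number uses country code 91, and list 1 holds 10-digit national numbers. The names COUNTRY_CODE, NATIONAL_LENGTH and to_fixed_format are hypothetical. Anything that cannot be brought into the fixed format is set aside rather than guessed at.

```python
import re

COUNTRY_CODE = "91"   # assumption: the only country code that appears
NATIONAL_LENGTH = 10  # assumption: list 1 holds 10-digit national numbers

def to_fixed_format(raw):
    """Return the 10-digit national number, or None if it cannot be derived."""
    had_plus = raw.strip().startswith("+")
    digits = re.sub(r"\D", "", raw)      # keep digits only
    if had_plus:
        international = digits           # "+" marks an international number
    elif digits.startswith("00"):
        international = digits[2:]       # "00" used as the dial-out prefix
    else:
        international = None             # no international prefix present
    if international is not None:
        if not international.startswith(COUNTRY_CODE):
            return None                  # some other country: not handled here
        digits = international[len(COUNTRY_CODE):]
    return digits if len(digits) == NATIONAL_LENGTH else None

list1 = ["2260018000", "3322619948", "3323513216"]
list2 = ["+912260018000", "00913322623148", "23513216", "3322619948"]

list3 = [to_fixed_format(n) for n in list2]   # the generated "list 3"
list1_set = set(list1)                        # hash lookup instead of a scan
matches = [n for n in list3 if n in list1_set]
unresolved = [raw for raw, n in zip(list2, list3) if n is None]
print(matches, unresolved)
```

With the sample data, the 8-digit entry 23513216 ends up in unresolved because it carries no area code, so it cannot be turned into a 10-digit number without further rules.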

Re: Searching number with various formats

Posted: Fri Jul 26, 2013 8:38 pm
by BillyBoyo
OK, but look at the four examples for list 2. One has "+" for the "dial-out-of-country" code, another has 00, and it would be possible (at least in the past) to have 99.

Then, one has a country-code, another no country-code. Country-codes are 1 to 4 digits (perhaps longer, I haven't checked recently). Then the final one has no country-code, but does it have an area-code?

Are all the subscriber numbers for all countries the same length? No. (so list 1 is not so fixed).

Do all countries have an area-code? No.

OK, so how to "normalise" lists 1 and 2 so they can be matched?
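
One possible direction, shown only as a sketch: reduce both lists to a common key and index list 1 by that key, so each list 2 entry is a single hash lookup. The key used here, the last 8 digits, is an assumption taken from the original poster's attempt. As the questions above point out, it will not hold where subscriber numbers have other lengths. SUFFIX_LENGTH, suffix_key and build_index are made-up names.

```python
from collections import defaultdict

SUFFIX_LENGTH = 8  # assumption: the last 8 digits identify the subscriber

def suffix_key(raw):
    """Return the last SUFFIX_LENGTH digits of raw, or None if it is too short."""
    digits = "".join(ch for ch in raw if ch.isdigit())
    return digits[-SUFFIX_LENGTH:] if len(digits) >= SUFFIX_LENGTH else None

def build_index(numbers):
    """Map each suffix key to every number that shares it."""
    index = defaultdict(list)
    for number in numbers:
        key = suffix_key(number)
        if key is not None:
            index[key].append(number)
    return index

list1 = ["2260018000", "3322619948", "3323513216"]
list2 = ["+912260018000", "00913322623148", "23513216", "3322619948"]

index = build_index(list1)  # built once, reused for every list 2 entry
candidates = {raw: index.get(suffix_key(raw), []) for raw in list2}
for raw, hits in candidates.items():
    print(raw, hits)
```

Keeping a list per key means ambiguous matches stay visible: two list 1 numbers with different area codes but the same last 8 digits would both come back as candidates instead of one silently winning.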