MS Excel tool to compare and identify like and dislike mailing addresses

Completed Posted Feb 5, 2012 Paid on delivery
Completed Paid on delivery

This project consist of comparing two worksheets within a single MS Excel file. Each worksheet contains mailing addresses from different sources.

The tool is extract those addresses that match 100% between the two sheets and copy them into a new and third worksheet. All address that have no match with 100% certainty, are to be copied into a new and fourth worksheet. A final and fifth worksheet should have the data from both raw data worksheets where there is suspect of probable address match that require human intervention. The probable match worksheet should contain no more than 5% of the total number of mailing addresses.

## Deliverables

**Software**: MS Excel - 2010 (Windows), MS Excel 2011 (Mac)

**OS**: Windows 7, Mac OS X

Architecture: all code to be done in MS Excel's built-in Visual Basic. It is to be fully modularised complete with comments and clear use of sub and functions.

**GUI**: NONE - this project will not need any graphical user interface. I only need to start a macro to run the code. A future project may be opened to create a new and full GUI.

**Tool Objectives**: compared mailing addresses between two sheets and identify;

1. mailing addresses that match 100% (safe for some exceptions below)

2. mailing addresses that are unique between the two sheets

3. mailing addresses that are probable match, but require human intervention to confirm.

**Tool Requirements**:

1. The script must take no more than approximately two minutes to run

2. The code must be free of any traps

3. The code must be fully commented

4. The code must have a robust system of error recovery

5. All must run within a single MS Excel file

6. The accuracy of the MATCH and UNIQUE records must be 100%

7. The accuracy of the INPUT REQUIRED records must be less than 95%

**Input**:

1. There could be up to 12,000 mailing addresses in each of the two raw data worksheets

2. Column A = store name, Column B = street address1, Column C = street address2, Column D = city, Column E = province, Column F = postal code, Column G = telephone number, Column H = fax number

3. The mailing addresses (street number, city and postal code) are the three critical input for the comparison. The store name is secondary.

4. The phone numbers and fax numbers can be used as a last comparator but CANNOT be the determining factor as telephone and fax numbers are often changed.

**Output**: the script should produce the following results

1. New worksheet called "MATCH"; contains the mailing addresses that match 100% plus;

1. if the store name matches 100%, paste store name in column A along with the remaining information as per the source sheets

2. if there is a store name mismatch greater than 50%, insert the store name that is longest (most number of characters) in column A along with the remaining information as per the source sheets

3. if the store names matches less than 50%, place the store name from the first sheet in column A, the second sheet in column B along with the remaining information as per the source sheets

2. New worksheet called "UNIQUE"; contains the mailing addresses that are unique between each of the two source files

3. New worksheet called "INPUT REQUIRED"; contains the remaining records where there are possible matches but human intervention is required to confirm the match or mismatch.

1. The records from each sheet must be in two separate rows but one under the other (i.e., record from source A is in row #55, record from source B is in row #56)

2. Matching characters for all field (e.g., name, street number, city, etc.) must be highlighted in **bold red characters** to make identification between the two rows

3. Column A must be reserved for user input. The user will type in any character beside one of the two rows he/she deems to be the most reliable address. Once the user is satisfied all the addresses have been analyzed, he/she will run a script (created by this project) to collect the identified row and paste and add them in the "MATCH" worksheet

>

Project ID: #2708571

About the project

2 proposals Remote project Active Feb 9, 2012

Awarded to:

gribfritz

See private message.

$637.5 USD in 165 days
(486 Reviews)
7.6

2 freelancers are bidding on average $409 for this job

sbhask2k

See private message.

$180.2 USD in 165 days
(35 Reviews)
4.8