Cleanup names - repost 2

$30-250 USD

Cancelled
Posted about 10 years ago

Paid on delivery
Round 2 pending; Round 1 is already complete.

I have a list of 50,000 names, with an ID associated with each name. This is the output of a computer program that tried to assign IDs to names. The program has grouped similar names together, but there are substantial errors as well. I have sorted the file in Excel by last name, then by first name, and then by ID. I need someone to go over this sorted file once and do the following:

1. Assign a status of 1 to name spellings that are dissimilar but have the same ID. This goes in a new column, "Status".

Example before processing:

ExistingID  Lastname  FirstName
32          WAY       JAMES CREIGHTON
32          WESEMAN   JAMES C
32          WILSON    JAMES C
32          WRAY      J C

After processing, the names should look like this:

ExistingID  Lastname  FirstName        ManualID  Status
32          WAY       JAMES CREIGHTON
32          WESEMAN   JAMES C                    1
32          WILSON    JAMES C                    1
32          WRAY      J C

The above indicates that lines 1 and 4 can continue with the same ID (32), but lines 2 and 3 need to be assigned fresh IDs.

2. Assign a status of 2 to name spellings that can easily be seen to belong to another ID whose name is similar to the current row. Also note the ID to which the row should be changed.

Example before processing (these are two consecutive entries in the file; I expect the person working on this to remember at least the last 50 lines and match against them):

ExistingID  LastName     FirstName
647         AAGAARD      ERIC J
4154        AAGAARD ESQ  ERIC J

As can be seen, the two names are practically the same (recognizing this requires some knowledge of American names), so I expect the larger ID to be reassigned to the smaller ID. The entries after processing should look like this:

ExistingID  LastName     FirstName  ManualID  Status
647         AAGAARD      ERIC J
4154        AAGAARD ESQ  ERIC J     647       2

The above indicates that a similar name already exists (status = 2) and that 4154 should be reassigned to 647.

Another example (this time, different IDs due to a misspelling) before processing:

3685        ACKERMAN     JEOL G
3052        ACKERMAN     JOEL C

The vast majority of errors are expected to be of the second type.
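The second rule (status 2) can be pre-screened by a script before manual review. The following is a minimal sketch in Python, under stated assumptions rather than the method used by the client's program: it compares each row against the previous 50 rows with the standard library's `difflib.SequenceMatcher`, and when two rows with different IDs have similar names it suggests reassigning the larger ID to the smaller one. The function names, the `IGNORED_TOKENS` set, and the 0.8 similarity threshold are hypothetical choices for illustration; only the 50-line window and the larger-to-smaller rule come from the posting.

```python
from difflib import SequenceMatcher

# Assumption: tokens to ignore when comparing names. Only ESQ appears in the
# posting's examples; extend this set with care.
IGNORED_TOKENS = {"ESQ"}


def normalize(lastname, firstname):
    """Upper-case the full name and drop ignored tokens such as ESQ."""
    tokens = f"{lastname} {firstname}".upper().split()
    return " ".join(t for t in tokens if t not in IGNORED_TOKENS)


def flag_status_2(rows, window=50, threshold=0.8):
    """Suggest a (ManualID, Status) pair for each row under rule 2.

    rows: list of (existing_id, lastname, firstname) in sorted file order.
    Rows with no suggestion get (None, None).
    """
    names = [normalize(last, first) for _, last, first in rows]
    suggestions = [(None, None)] * len(rows)
    for i in range(len(rows)):
        # Look back over at most `window` earlier rows.
        for j in range(max(0, i - window), i):
            id_i, id_j = rows[i][0], rows[j][0]
            if id_i == id_j:
                continue  # same ID already; rule 2 does not apply
            if SequenceMatcher(None, names[i], names[j]).ratio() < threshold:
                continue  # names too different to suggest a merge
            # The row holding the larger ID is reassigned to the smaller ID.
            target = i if id_i > id_j else j
            smaller_id = min(id_i, id_j)
            current = suggestions[target][0]
            if current is None or smaller_id < current:
                suggestions[target] = (smaller_id, 2)
    return suggestions


if __name__ == "__main__":
    sample = [
        (647, "AAGAARD", "ERIC J"),
        (4154, "AAGAARD ESQ", "ERIC J"),
        (3685, "ACKERMAN", "JEOL G"),
        (3052, "ACKERMAN", "JOEL C"),
    ]
    for row, (manual_id, status) in zip(sample, flag_status_2(sample)):
        print(row, manual_id, status)
```

On the two examples from the posting, this sketch suggests 647 for the 4154 row, matching the expected output shown above. For the ACKERMAN pair it suggests 3052 for the 3685 row; that result follows from applying the larger-to-smaller rule and is not shown in the posting itself. The suggestions are only candidates: string similarity cannot replace the knowledge of American names the task calls for, and rule 1 (status 1) is not covered.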
I expect the person taking up the work to submit, as a test of their skill, the output of processing the 1000 lines in the attached file. After the work is complete, I will also check a random sample of another 1000 lines to make sure the error count is acceptable (fewer than 50 errors in the sample of 1000) before releasing payment. The work needs to be done within the next 2 weeks. If interested, please respond with the completed sample work.
Project ID: 5346339

About the project

Remote project
Active 10 yrs ago

About the client

India
0.0
0
Member since Jan 22, 2014

Freelancer ® is a registered Trademark of Freelancer Technology Pty Limited (ACN 142 189 759)
Copyright © 2024 Freelancer Technology Pty Limited (ACN 142 189 759)