Data extraction from Text Area in an HTML Page / Text Parsing

Completed Posted Sep 20, 2008 Paid on delivery

I need to extract information from a web page using JavaScript. You can write small JS functions that get the information described below and save it in a variable that will be assigned to a text field in the form. The text field may already have a value, which can be overwritten (and can be restored using form reset). All other form requirements are already handled in ASP.NET; all I need are the JavaScript parsing/extraction functions.

## Deliverables

Need to extract information from a web page using JavaScript:

* Please note that all information is optional and may or may not be in the body of the HTML page.

* I need a set of JS functions that I will include in an ASP.NET page.

* The head/main function will be called by clicking a button.

* The value displayed on the page, which is the body of an email, will be passed to that main function. It could be a short text of about 100 characters or a long text of a few thousand characters.

* The main function will invoke further sub functions to extract information from that text.

* The information to be picked up has no set format, because it is collected from an inbox and different people write addresses, phone numbers, fax numbers, etc. in different ways. Only a few items, such as email and website addresses, are easy to pick out.

* The value returned from each sub function will be placed in a form input box already showing on the HTML page and will be saved in a database at the click of a button. All of that functionality is already handled by ASP.NET. All you need to write is the extraction functions.

* A manual data entry person will click the button to invoke these procedures and then verify the gathered results before saving them in the database.

* Please review information below to get the idea about each entry.

* **Email:** pick up the email address from a _text_ field. There is a list of emails that are to be ignored if found in the text field. For example, if the text has my own email address in it, I don't want it to be picked up, so please provide an option in the code to list emails to avoid. There could also be more than one email address in the text field; if so, add them all to the field with a comma as separator.

* **Website**, if a website address is found in the _text_, pick it up.

* **Address**, only if the address cannot be separated; then a possible match can be inserted for review. This option is only used when the address cannot be split as described below.

* **Street,** Pick up the street address, such as *2734 Styles Ave.* Possible phrases to look for are St. Str. Street, Ave. Av. Avenue, Cr. Circle, Wy. Way, Dr. Drive, Pl. Place, Blvd. Boulevard

* **City,** Can be tricky to detect. It should be close to the street address, state name or ZIP code; it depends on how you can isolate it.

* **State,** I am also attaching a list of state abbreviations and full names.

* **ZIP,** ZIP follows couple of patterns: for USA, 5 digits or 5+4 digits (example: 94012-5582), for Canada 6 characters (example: M1Z 4T1) and the rest of the countries

* **Country,** USA, Canada, China, Malaysia, etc. may appear in the _text_ field.

* **Phone,** phone also follows certain patterns. The easy case is when people write “phone, call, ph, office” etc. next to it. The USA and Canada patterns are 999 999-9999, (999) 999-9999 and 999-999-9999, and sometimes they have Ext. (or x) for an extension as well. Please provide an exclusion option similar to the one for email, to avoid certain phone numbers.

* **Fax,** similar to phone

* **Cell,** same as above

* **MSN,** Some people write Messenger ID or MSN, Yahoo, AOL, AIM etc with it.

* **IM,** same as above
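The more regular fields above (email, website, ZIP, phone) lend themselves to regular expressions. What follows is only a rough sketch under stated assumptions: the function names (`collectMatches`, `extractEmails`, `extractWebsite`, `extractZip`, `extractPhones`) are hypothetical, the patterns are deliberately loose and would need tuning against real inbox text, and telling phone, fax and cell apart by their labels is not attempted. It is written in older-style JavaScript on the assumption that this helps with IE compatibility.

```javascript
// Hypothetical helpers; none of these names come from the project description.
var NOT_FOUND = "not found";

// Collect every match of a global regex, skipping duplicates and excluded values.
// `normalize` maps a value to the form used for comparison.
function collectMatches(text, pattern, exclude, normalize) {
  var found = [], seen = {}, i, m, key;
  exclude = exclude || [];
  for (i = 0; i < exclude.length; i++) {
    seen["#" + normalize(exclude[i])] = true;   // treat excluded values as already seen
  }
  pattern.lastIndex = 0;                        // pattern must carry the g flag
  while ((m = pattern.exec(text)) !== null) {
    key = "#" + normalize(m[0]);
    if (!seen[key]) {
      seen[key] = true;
      found.push(m[0]);
    }
  }
  return found;
}

// All email addresses, comma separated, minus the exclusion list (case-insensitive).
function extractEmails(text, exclude) {
  var hits = collectMatches(text,
    /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g, exclude,
    function (s) { return s.toLowerCase(); });
  return hits.length ? hits.join(",") : NOT_FOUND;
}

// First thing that looks like a URL (starts with http://, https:// or www.).
function extractWebsite(text) {
  var m = /\b(?:https?:\/\/|www\.)[^\s<>"']+/i.exec(text);
  return m ? m[0].replace(/[.,;:)]+$/, "") : NOT_FOUND;  // drop trailing punctuation
}

// USA 5 or 5+4 digits first, then the Canadian letter-digit pattern.
// Assumption: a 5-digit street number would be mistaken for a ZIP here.
function extractZip(text) {
  var m = /\b\d{5}(?:-\d{4})?\b/.exec(text) ||
          /\b[A-Za-z]\d[A-Za-z][ -]?\d[A-Za-z]\d\b/.exec(text);
  return m ? m[0] : NOT_FOUND;
}

// USA/Canada numbers such as 999 999-9999, (999) 999-9999, 999-999-9999,
// with an optional "Ext." or "x" extension. Exclusions compare the first 10 digits.
function extractPhones(text, exclude) {
  var hits = collectMatches(text,
    /\(?\b\d{3}\)?[ .-]?\d{3}[ .-]\d{4}\b(?:\s*(?:ext\.?|x)\s*\d+)?/gi, exclude,
    function (s) { return s.replace(/\D/g, "").slice(0, 10); });
  return hits.length ? hits.join(",") : NOT_FOUND;
}
```

The exclusion lists are plain arrays, so they can be edited in one place, for example `extractEmails(bodyText, ["me@mysite.com"])`. Street, city and state are left out here because they depend on surrounding context rather than a single pattern.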

_Important Notes:_

* All fields are optional; if not found, insert “not found” in the blank field.

* For email and phone numbers, please provide an exclusion list that I can edit to avoid certain email addresses or numbers.

* Room for exceptions is a must.

* The message may or may not have all the information; it could be missing one or all of the above fields, which is why we have a manual verification step.
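One way the notes above could fit together is a main function that runs a set of per-field extractors and substitutes “not found” wherever a field comes back empty. The sketch below is an illustration only: `extractAll`, `fillForm`, `demoExtractors` and the idea that form input ids match the field names are all assumptions, not part of the existing ASP.NET page.

```javascript
var NOT_FOUND = "not found";

// Run each extractor on the text. An extractor that returns nothing, or throws,
// yields "not found" so one bad field never blocks the others.
// `exclusions` maps a field name to its exclusion list (e.g. emails to ignore).
function extractAll(text, extractors, exclusions) {
  var result = {}, name, value;
  exclusions = exclusions || {};
  for (name in extractors) {
    if (!Object.prototype.hasOwnProperty.call(extractors, name)) { continue; }
    try {
      value = extractors[name](text, exclusions[name] || []);
    } catch (e) {
      value = "";
    }
    result[name] = (value === null || value === undefined || value === "")
      ? NOT_FOUND : String(value);
  }
  return result;
}

// Browser step: copy each result into the input whose id equals the field name.
// Assumption: such ids exist; inputs that are missing are silently skipped.
function fillForm(result, doc) {
  var name, input;
  for (name in result) {
    if (!Object.prototype.hasOwnProperty.call(result, name)) { continue; }
    input = doc.getElementById(name);
    if (input) { input.value = result[name]; }
  }
}

// Tiny demonstration extractors (hypothetical, far simpler than a real set).
var demoExtractors = {
  email: function (text, exclude) {
    var all = text.match(/[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g) || [];
    var kept = [], i, j, skip;
    for (i = 0; i < all.length; i++) {
      skip = false;
      for (j = 0; j < exclude.length; j++) {
        if (exclude[j].toLowerCase() === all[i].toLowerCase()) { skip = true; }
      }
      if (!skip) { kept.push(all[i]); }
    }
    return kept.join(",");
  },
  fax: function () { return ""; }   // nothing found
};
```

On the page, the button's click handler might call `fillForm(extractAll(bodyText, demoExtractors, { email: ["me@mysite.com"] }), document)`, after which the data entry person reviews the fields before saving.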

## Platform

Web, JavaScript, IE, FF, Chrome Compatible

Apple Safari, Data Entry, Engineering, Google Chrome, MySQL, PHP, Software Architecture, Software Testing

Project ID: #3242431

About the project

8 proposals Remote project Active Oct 5, 2008

Awarded to:

* nextGenSol2002: $38.25 USD in 3 days (137 Reviews), 6.7

8 freelancers are bidding on average $69 for this job

* flrenzi: $102 USD in 3 days (6 Reviews), 4.0
* wookietim: $85 USD in 3 days (3 Reviews), 2.6
* rh2008vw: $85 USD in 3 days (2 Reviews), 1.3
* mineryeasin: $85 USD in 3 days (0 Reviews), 0.0
* nickashley1980: $42.5 USD in 3 days (0 Reviews), 0.0
* awais9981: $25.5 USD in 3 days (1 Review), 0.0
* hafizsafdir2007: $85 USD in 3 days (0 Reviews), 0.0