HTML parse to CSV text

$30-110 USD

Completed
Posted over 20 years ago

Paid on delivery
I need a Perl script that parses an HTML result page of business listings from [login to view URL] into comma-delimited text for easy database insertion. This program DOES NOT NEED TO ACCESS THE WEBPAGE. It only needs to work at the command line: it accepts a locally saved copy of an HTML result page from [login to view URL] and writes the results to a text file.

Example command-line execution:

[login to view URL] < [login to view URL] > [login to view URL]

## Deliverables

The program should accept a search result page from [login to view URL] that has been saved to the computer, and parse it locally. It should be run like this:

[login to view URL] < [login to view URL] > [login to view URL]

I have attached a complete example input file to test the parsing. Here is a snippet of the HTML which contains one row's worth of information:

[Barina Jerome F][1] | <nobr>(262) 637-1555 </nobr> | | 201 6th, Racine, WI 53403

This row contains a name, a phone number, and an address. In the output, these fields should be separated by commas, with a semicolon terminating the row. Example:

"Barina Jerome F","(262) 637-1555","201 6th, Racine, WI 53403";

I would also like the category and subcategory included in each row. This information can be found in the attached HTML source file. For the attached file they are:

Category: Attorneys
Subcategory: Attorneys

So the final output should look like this:

"name","phone","address","category","subcategory";
"Barina Jerome F","(262) 637-1555","201 6th, Racine, WI 53403","Attorneys","Attorneys";

There is a script on O'Reilly's website which does almost EXACTLY what I want, except that it is built for Google's phonebook output. Here is the script if it helps:

    #!/usr/bin/perl
    use strict;
    my $file = '[login to view URL]';
    chomp $file;
    die('no filename passed to me!') unless ($file);
    open(FILE, $file) or die("Couldn't open the file!" . $!);
    print qq{"name","phone number","address"\n};
    my @listings = split / * * * /, join '', <FILE>;
    foreach (@listings[1..($#listings-1)]) {
        s!\n!!g;      # drop spurious newlines
        s!<.+?>!!g;   # drop all HTML tags
        s!"!""!g;     # double escape " marks
        print '"' . join('","', (split /\s+-\s+/)[0..2]) . "\"\n";
    }
    close FILE;

Thanks for looking. If you have any additional questions or concerns, don't hesitate to contact me.

Additional Requirements:

1) Complete and fully-functional working program(s) in executable form as well as complete source code of all work done.
2) Complete ownership and distribution copyrights to all work purchased.

## Platform

Perl 5+
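
The output format requested in the description (double-quoted fields, separated by commas, each row terminated by a semicolon) could be sketched in Perl roughly as follows. This is only an illustration, not the delivered solution: the helper names `clean_cell` and `csv_row` are hypothetical, and the fields are assumed to have been pulled out of the HTML already, because the real page structure is only visible in the attached input file.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: tidy one extracted table cell.
sub clean_cell {
    my ($cell) = @_;
    $cell =~ s/<[^>]*>//g;    # drop HTML tags such as <nobr>
    $cell =~ s/\s+/ /g;       # collapse runs of whitespace (including newlines)
    $cell =~ s/^ | $//g;      # trim a leading or trailing space
    return $cell;
}

# Hypothetical helper: build one output row.
# Embedded quote marks are doubled, every field is wrapped in quotes,
# fields are joined with commas, and the row ends with a semicolon.
sub csv_row {
    my @fields = @_;
    my @quoted = map { (my $f = $_) =~ s/"/""/g; qq{"$f"} } @fields;
    return join(',', @quoted) . ';';
}

# ASSUMPTION: these values stand in for cells already extracted from the page.
my @cells = ('Barina Jerome F', '<nobr>(262) 637-1555 </nobr>',
             '201 6th, Racine, WI 53403');
my ($category, $subcategory) = ('Attorneys', 'Attorneys');

print csv_row(qw(name phone address category subcategory)), "\n";
print csv_row((map { clean_cell($_) } @cells), $category, $subcategory), "\n";
```

Keeping the cleaning and the row formatting in separate subroutines means the HTML extraction step, whatever form it takes for the actual page, only has to hand over a list of raw cells.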
Project ID: 2950796

About the project

10 proposals
Remote project
Active 21 yrs ago

Awarded to:
See private message.
$27.20 USD in 14 days
4.9 (209 reviews)
6.3

10 freelancers are bidding on average $54 USD for this job

See private message.
$85 USD in 14 days
5.0 (156 reviews)
7.5

See private message.
$68 USD in 14 days
5.0 (41 reviews)
5.4

See private message.
$34 USD in 14 days
5.0 (48 reviews)
4.5

See private message.
$51 USD in 14 days
4.0 (32 reviews)
5.2

See private message.
$85 USD in 14 days
5.0 (3 reviews)
1.4

See private message.
$41.65 USD in 14 days
0.0 (0 reviews)
0.0

See private message.
$50 USD in 14 days
0.0 (0 reviews)
0.0

See private message.
$51 USD in 14 days
0.0 (0 reviews)
0.0

See private message.
$51 USD in 14 days
0.0 (0 reviews)
0.0

About the client

Bend, United States
5.0
3
Member since Jul 7, 2003

Freelancer ® is a registered Trademark of Freelancer Technology Pty Limited (ACN 142 189 759)
Copyright © 2024 Freelancer Technology Pty Limited (ACN 142 189 759)