Crawling links and extracting data

Completed Posted Jul 6, 2008 Paid on delivery

I need a flat-file PHP script (if possible) that searches the top ... (max 10) pages in Google for a given keyword, visits those links, and collects all the text on each page (as if you visited the page and used Select All). This way all the text is collected, including link text.
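To illustrate the "select all" text collection step, here is a minimal sketch. It is written in Python for brevity rather than the PHP requested, and it only covers turning already-downloaded HTML into visible text; fetching the Google results and the pages themselves is left out. The names `VisibleTextParser` and `page_text` are hypothetical, and skipping `<script>` and `<style>` contents is an assumption about what counts as page text.

```python
from html.parser import HTMLParser


class VisibleTextParser(HTMLParser):
    """Collects text nodes, skipping the contents of <script> and <style>."""

    SKIPPED = {"script", "style"}  # assumption: these are not "page text"

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep non-empty text that is outside skipped elements.
        if self._skip_depth == 0 and data.strip():
            self.chunks.append(data.strip())


def page_text(html):
    """Return the text of an HTML page, link text included."""
    parser = VisibleTextParser()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)
```

Link text is kept because it is ordinary text inside an `<a>` element; only the markup around it is dropped.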

It then combines the text from all of the top... pages into one file and lists the words by the number of times each is used, as shown below:

One word list:

jobs (346)

dubai (334)

job (241)

hotel (204)

united (167)

emirates (164)

arab (135)

bahrain (107)

uae (102)

hotels (80)

location (80)

description (79)

manager (78)

hospitality (73)

travel (67)

recruitment (67)

experience (67)

middle (61)

jul (57)

dhabi (56)

etc.
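A one-word list like the one above could be produced roughly as follows. This is a sketch in Python rather than the requested PHP; the function names are hypothetical, and treating a "word" as a lowercase run of letters and digits is an assumption.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into runs of letters and digits."""
    return re.findall(r"[a-z0-9]+", text.lower())


def word_counts(text):
    """Return (word, count) pairs, most frequent first."""
    return Counter(tokenize(text)).most_common()


def format_counts(pairs):
    """Render pairs in the 'word (count)' style used in the list above."""
    return "\n".join(f"{word} ({count})" for word, count in pairs)
```

For example, `format_counts(word_counts(combined_text))` would give one `word (count)` entry per line, where `combined_text` is the text of all collected pages joined together.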

Two word phrases list:

arab emirates

united arab

job description

abu dhabi

dubai united

middle east

emirates job

location dubai

saudi arabia

our client

3rd jul

real estate

united states

dubai jobs

job vacancies

hospitality travel

job category

travel location

category hospitality

description our

years dubai

manama bahrain

location united

years job

bachelors degree

dubai uae

degree experience

emirates education

etc.
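The two-word phrase list can be sketched the same way by counting adjacent word pairs. Again this is Python rather than PHP, `two_word_phrases` is a hypothetical name, and pairing words across punctuation and sentence boundaries is an assumption (entries such as "description our" above suggest it, but the posting does not say so).

```python
import re
from collections import Counter


def two_word_phrases(text):
    """Return (phrase, count) pairs for adjacent word pairs, most frequent first."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    # zip(words, words[1:]) yields each word together with its successor.
    pairs = zip(words, words[1:])
    return Counter(" ".join(pair) for pair in pairs).most_common()
```

If ignored words should be removed before pairing, the word list would be filtered first and the pairs built from what remains.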

The displayed keywords should be able to open in a new window without the numbers (keywords only), so you can copy them into another program or paste them into a text file, for example.

Some other options that should be available:

- Exclude words with fewer than 3...(fill in) letters.

- A list of words to ignore. I want a prefilled default list with words like:

the

of

to

and

a

in

is

it

you

that

he

was

for

on

are

with

as

i

his

they

be
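The two options above amount to a simple filter applied to the word list before counting. A minimal sketch, in Python rather than the requested PHP: `filter_words` and its parameter names are hypothetical, and the default ignore set is copied from the list above.

```python
# Default ignore list, taken from the words listed in the posting.
DEFAULT_IGNORE = frozenset({
    "the", "of", "to", "and", "a", "in", "is", "it", "you", "that", "he",
    "was", "for", "on", "are", "with", "as", "i", "his", "they", "be",
})


def filter_words(words, min_length=3, ignore=DEFAULT_IGNORE):
    """Drop words shorter than min_length and words in the ignore list."""
    return [
        word for word in words
        if len(word) >= min_length and word.lower() not in ignore
    ]
```

Both settings are parameters, so the minimum length and the ignore list can be changed per run without editing the function.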

Engineering MySQL PHP Software Architecture Software Testing Web Hosting Website Management Website Testing

Project ID: #3034323

About the project

4 proposals Remote project Active Jul 7, 2008

Awarded to:

jalsvw

See private message.

$21.25 USD in 7 days
(8 Reviews)
3.0

4 freelancers are bidding on average $37 for this job

MuktoSoftware

See private message.

$42.5 USD in 7 days
(440 Reviews)
7.5

webhorizonvw

See private message.

$42.5 USD in 7 days
(3 Reviews)
2.3

GOCBD

See private message.

$42.5 USD in 7 days
(0 Reviews)
0.0