Crawl links and extract data
$30-50 USD
Paid on delivery
I need a flat-file PHP script (if possible) that searches the top ... (max 10) pages in Google for a given keyword, visits those links, and collects all the text on each page (as if you visited the page and used Select All). This way all text is collected, including link text.
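To illustrate the "select all" text collection described above, here is a minimal sketch in Python (for illustration only; the deliverable itself should be PHP). It assumes the page HTML has already been downloaded, and the names `VisibleTextParser` and `extract_text` are hypothetical. Skipping script and style blocks is an assumption about what counts as page text.

```python
from html.parser import HTMLParser


class VisibleTextParser(HTMLParser):
    """Collects text from a page, skipping script and style blocks."""

    SKIPPED = {"script", "style", "noscript"}

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        # Entering a block whose content is not shown as page text.
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep non-empty text, including the text inside links.
        if self._skip_depth == 0 and data.strip():
            self.parts.append(data.strip())


def extract_text(html):
    """Return the collected text of one HTML page as a single string."""
    parser = VisibleTextParser()
    parser.feed(html)
    parser.close()
    return " ".join(parser.parts)
```

Link text is kept because it arrives as ordinary text data inside the `a` element.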
It then combines the text from all of the top ... pages into one file and lists the words by the number of times each is used, as in the example below:
One-word list:
jobs (346)
dubai (334)
job (241)
hotel (204)
united (167)
emirates (164)
arab (135)
bahrain (107)
uae (102)
hotels (80)
location (80)
description (79)
manager (78)
hospitality (73)
travel (67)
recruitment (67)
experience (67)
middle (61)
jul (57)
dhabi (56)
etc.
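The one-word list above could be produced roughly as follows. This is a sketch in Python for illustration only; the function names `word_counts` and `format_counts` are hypothetical, and treating a word as a run of letters and digits (so that "3rd" counts as a word) is an assumption.

```python
import re
from collections import Counter


def word_counts(text):
    """Count how often each lowercase word occurs in the combined text."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return Counter(words)


def format_counts(counts, limit=20):
    """Return lines such as 'jobs (346)', most frequent word first."""
    return [f"{word} ({n})" for word, n in counts.most_common(limit)]
```

For the version without numbers, the same sorted words can be printed on their own, one per line.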
Two-word phrase list:
arab emirates
united arab
job description
abu dhabi
dubai united
middle east
emirates job
location dubai
saudi arabia
our client
3rd jul
real estate
united states
dubai jobs
job vacancies
hospitality travel
job category
travel location
category hospitality
description our
years dubai
manama bahrain
location united
years job
bachelors degree
dubai uae
degree experience
emirates education
etc.
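The two-word phrases above could be counted by pairing each word with the one that follows it. This is a sketch in Python for illustration only; `phrase_counts` is a hypothetical name, and letting pairs run across sentence boundaries is an assumption suggested by sample entries such as "description our".

```python
import re
from collections import Counter


def phrase_counts(text):
    """Count every pair of adjacent words in the combined text."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    # zip pairs word i with word i + 1, so n words give n - 1 phrases.
    pairs = zip(words, words[1:])
    return Counter(" ".join(pair) for pair in pairs)
```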
The displayed keywords should be able to open in a new window without the numbers (keywords only), so that they can be copied into another program or pasted into a text file, for example.
Some other options that should be available:
- Exclude words with fewer than 3... (fill in) letters.
- A list of words to ignore. I want a prefilled default list with words like:
the
of
to
and
a
in
is
it
you
that
he
was
for
on
are
with
as
i
his
they
be
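The two options above could be applied as a filter before counting. This is a sketch in Python for illustration only; `filter_words` and `DEFAULT_IGNORE` are hypothetical names, and the default minimum length of 3 is an assumption standing in for the value the user fills in.

```python
# Default ignore list, taken from the words listed in this posting.
DEFAULT_IGNORE = {
    "the", "of", "to", "and", "a", "in", "is", "it", "you", "that", "he",
    "was", "for", "on", "are", "with", "as", "i", "his", "they", "be",
}


def filter_words(words, min_length=3, ignore=DEFAULT_IGNORE):
    """Drop words shorter than min_length and words on the ignore list."""
    return [
        word for word in words
        if len(word) >= min_length and word.lower() not in ignore
    ]
```

Both settings are parameters, so the user can change the minimum length or supply a different ignore list.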
Project ID: #3034323