357667 Scraper Project (PHP)

N/A

In Progress
Posted over 14 years ago

N/A

Paid on delivery
Real Estate Scraper Project - Beta 1 (PHP)

The scraper will download selected property information from a remote website and integrate this data into a WordPress blog database. This is the continuation of a project started by someone else (Alpha 2), and that code will be provided. This will be considered Beta 1 of the project, so please make notes of ideas and suggestions for Beta 2.

Basically, we need to pull every property listing off the specified site and then format and save the data. The scraper will run once a day to add new property listings, tag removed property listings (will list as "Removed"), update sold listings (will list as "Sold"), and update the "status" of any other existing listings. After 'X' number of days the scraper will drop removed and sold listings.

Note: It would be nice to have a way to record status changes in a separate field.

How the Scraper will work:

1) The scraper will start at the URL of the last known listing (Example: [login to view URL]) and will check each page counting up (29129605, 29129606, 29129607, 29129608, 29129609, 29129610, etc.) until it finds 20 pages in a row that show up as "The requested listings was not Found". The scraper will then move to step 2.
2) The scraper will then move backwards from the same starting point and check every existing listing for changes until it finds 20 pages in a row that show up as "The requested listings was not Found". The scraper will then stop.
3) Once a week the scraper will do a modified version of step 2 where it keeps going backwards to a predefined stopping point (just so we don't miss any updates to really old listings).

Features Needed To Add in this project:

- Ability to search (through WordPress) by City, Zip Code, County, school district, etc.
- Ability to scrape all pictures and thumbnail links. We currently only show the first 4, but we want all photos shown just as they are on the scraped site.
(Note: we were initially going to save all the photos and host them on our server, but we have now been told that the MLS requires linking to their photos.)
- Assign "class" names to all entries for CSS formatting and searching ability (i.e.: <div name="bedrooms" class="bedrooms">Bedrooms: [data]</div>).
- Ability of WordPress to dynamically load separate pages based upon any class name (i.e.: city, county, zip, neighborhood, etc.).
- Scraper needs to disguise itself as a search engine.
- Scraper needs the ability to pause itself if the remote site starts rejecting it (we will then make adjustments, like changing the IP address, and resume the scrape).

Important notes:

- We are only scraping the property data and not any of the other website content (see list at bottom).
- The scraper will operate on a different server than our production website (where the WordPress site and the MySQL server are), so it will need to transfer the data remotely. The scraper should therefore run slowly enough not to cause data transmission errors. Alternatively, it can save the data locally and then upload it all later via a script.
- We own the servers that this will run on, so we can update PHP/MySQL settings as needed.
- PHP code must be written with server security in mind.
- Server Information: CentOS 4 (Plesk:8.6.0-cos4.build86080722.02), PHP 5 ([login to view URL]), MySQL 5 ([login to view URL]), WordPress (latest stable).
- View Alpha 2 Test Site (not updated in a while): [login to view URL]
- Page design (HTML/CSS) is NOT part of this project.
- In the next phase (Beta 2) we will add an image overlay over the images and thumbnails for "Sold" and "Deleted" listings.

Data Example: Look at the HTML code for [login to view URL] starting at about <div id="idx-detail-primary">.
(Some properties have different entries, so we will need the ability to add them as we find them.)

MLS #: (example: 29129607)
Price:
Street:
City:
Zip:
(Scrape all picture and thumbnail links)
Description: (this is the largest piece of information and is usually a paragraph or two)

Property Information (Header/No Data/not needed)
Bedrooms:
Bathrooms:
Square Ft:
Lot Size:
Year Built:
Prop. Type:
County:
Subdivision:
Area:
School District:
Status:

Schools Information (Header/No Data/not needed)
School District:

Additional Information (Header/No Data/not needed)
Garage:
Fireplaces:
Taxes:
Appliances:
Building Information:
Energy Source (heat):
Exterior:
Interior:
Floors:
Parking Type:
Heating/Cooling:
Lot Details:
Roof:
Site Features:
Sewer:
Terms:

Note: I will prepay this project with Scriptlance before any work begins.
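
The three scan steps described under "How the Scraper will work" could be sketched roughly as follows. This is only an illustration of the stopping rule, written in Python for brevity (the project itself calls for PHP). The names `scan`, `daily_run`, `fetch` and `stop_id` are hypothetical, and `fetch` is assumed to be a caller-supplied function that returns the page HTML for a given listing number.

```python
from typing import Callable, Iterator, List, Optional, Tuple

# Message quoted in the posting for a missing listing page.
NOT_FOUND_MARKER = "The requested listings was not Found"
# Number of consecutive missing pages that ends a daily pass.
MAX_MISSES = 20


def scan(start_id: int, step: int, fetch: Callable[[int], str],
         stop_id: Optional[int] = None) -> Iterator[Tuple[int, str]]:
    """Yield (listing_id, html) for every page that exists.

    Walks from start_id in increments of step (+1 up, -1 down).
    Without stop_id, the walk ends after MAX_MISSES missing pages in a row.
    With stop_id (the weekly pass), the walk ignores misses and ends only
    once stop_id itself has been checked.
    """
    misses = 0
    listing_id = start_id
    while True:
        if stop_id is not None:
            past_stop = listing_id > stop_id if step > 0 else listing_id < stop_id
            if past_stop:
                return
        elif misses >= MAX_MISSES:
            return
        html = fetch(listing_id)
        if NOT_FOUND_MARKER in html:
            misses += 1
        else:
            misses = 0  # any found page resets the run of misses
            yield listing_id, html
        listing_id += step


def daily_run(last_known_id: int,
              fetch: Callable[[int], str]) -> Tuple[List[int], List[int]]:
    """Step 1 (count up from the last known listing), then step 2 (count down)."""
    upward = [i for i, _ in scan(last_known_id, +1, fetch)]
    downward = [i for i, _ in scan(last_known_id - 1, -1, fetch)]
    return upward, downward
```

Passing `fetch` in as a parameter keeps the stopping logic separate from the HTTP code, so the pause-and-resume and throttling requirements can be handled inside whatever `fetch` implementation is used.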
Project ID: 2103500

About the project

Remote project
Active 12 yrs ago

About the client

5.0
16
Member since Oct 10, 2009
