Data Scraping

Completed Posted Aug 1, 2010 Paid on delivery
Completed Paid on delivery

I need a python programmer to write a simple web scraping program to extract option prices from Yahoo finance and MSN. I expect your python library to be able to take a symbol and go ahead to download the option pages from Yahoo finance and MSN for that given symbol, and extract the option chains on different dates.

On the delivery, I expect you to use regular expression or any other robust method to parse the data in a robust way and please raise an exception if anything goes wrong (for example, your code fails to download the page or your regular expression fails to parse the data). Finally, please document your code about input, output, exceptions raised in your code etc., and make your parsing code easier to accommodate future changes.

I'll define the details of return data we expect so this should only take one or two hours by any good python programmer. I'd spend only around $40 for this scrape code.

Please include your code a unit test for the following cases:

- test your code on three symbols with options: yhoo, msft, goog

- one symbol which has no options: grvy, in this case, you should return an empty OrderedDict, no None please

- any other important functions need at least one test to show it works

You could check link to have an idea of what data you are downloading:

Yahoo: [login to view URL]

MSN: [login to view URL]

This link only by default downloads the option chains for the latest expiration date. So you'd need to parse the link for all the other dates. This should be easy since to get option chains on different months, you only need to find the months with options and the link is easily constructed by adding month and year to it:

for MSN: [login to view URL]

for yahoo: [login to view URL]

I'd expect three functions I could call to get the option chains but you should add other helper functions to make your code more modular.

def get_yahoo_options(symbol):

pass

def get_msn_options(symbol):

pass

# return the number of days from today to the expiration date.

# For option, the expiration date for any given month and year will be the 3rd Friday of that month.

def days_to_expiration(exp_date='2010-12'):

pass

Your return for both functions should contain all the option chains for the symbol on different expiration dates. Here is the requirement of the data structure to store the option chains:

Please use OrderedDict in python 3.

In the OrderedDict,

the keys are expiration dates in the form of "2010-08" for Aug 2010, "2010-12" for Dec 2010 etc.

value is a list of two lists [calls, puts], calls keeps all the call option chains for that expiration date, and puts keeps all the put option chains for that expiration date

Details for calls and puts lists in the value of the ordered dict:

calls: [call options at different strikes], a list of Option (a class defined below) objects at different strike prices. the list is sorted by strike price from low to high.

put: [put options at different strikes], a list of Option (a class defined below) objects at different strike prices. the list is sorted by strike price from low to high.

Option object mentioned in calls and puts:

class Option:

type: 'call' or 'put' is used to define whether it is a put or call option

strike_price: strike price of the option

last: last trading price of the option

chg: change of the option price

bid: bid price of the option

ask: ask price of the option

vol: volume for the option

open_int: open int or open interest for the option

all quantitative fields in the Option object should be float numbers. for any data on yahoo finance or msn, if it is shown as "N/A" or "NA", please replace it with -1 and we do not want strings on quantitative fields.

Python Web Scraping

Project ID: #755114

About the project

7 proposals Remote project Active Aug 10, 2010

Awarded to:

easydoneus

Kindergarden stuff.

$50 USD in 1 day
(4 Reviews)
3.8

7 freelancers are bidding on average $193 for this job

SigmaVisual

We can help in your project, please check PMB and our ratings/reviews to get idea of our experience.

$200 USD in 7 days
(52 Reviews)
6.8
trezorg

Hello. I can write the script.

$50 USD in 2 days
(6 Reviews)
4.2
Ventura123

check pmb,...

$299 USD in 1 day
(1 Review)
2.0
mike80439

Program that receives a real-time price feed and forwards prices to other processes.

$500 USD in 7 days
(0 Reviews)
0.0
sanjaychauhan1

Hi, I have experience of more than 5 years in crawling and extraction I have a very efficient ETL tool for doing this job.I can deliver the product in record time by maintaining high standards. By this tool what I More

$200 USD in 3 days
(0 Reviews)
0.0
ermanish

Well, I am not a python programmer, but I can do the similar code with Regular Expressions in .Net Technologies

$50 USD in 2 days
(0 Reviews)
0.0