Download Millions of Individual Websites as JSON

Completed Posted 5 years ago Paid on delivery

We need someone to download an entire Russian-language website consisting of millions of individual pages, preferably in JSON format:

[login to view URL]

This is not a straightforward job and will require advanced knowledge of HTML, JavaScript, and AJAX. Simple website downloaders (such as SiteSucker) will not work. The website uses unique IDs to identify individual webpages, but the search function will not show the complete archive at once.

Example: [login to view URL]

Russian speaker preferable: the site is in Russian, and we require the JSONs to contain the original Russian text (not an English translation).
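Since pages are addressed by unique IDs and the search cannot list the whole archive, one possible approach is to walk the ID space directly. The following is only a minimal sketch under stated assumptions: the IDs are assumed to be integers, `fetch` is a hypothetical callable standing in for whatever request the site actually needs, and `fake_fetch` is made-up sample data so the sketch runs without network access. It also shows how `ensure_ascii=False` keeps the Russian text readable in the output.

```python
import json
from typing import Callable, Iterable, Iterator, Optional


def crawl_ids(
    ids: Iterable[int],
    fetch: Callable[[int], Optional[dict]],
    max_attempts: int = 3,
) -> Iterator[str]:
    """Yield one JSON line per document ID that could be fetched."""
    for doc_id in ids:
        record = None
        for _ in range(max_attempts):
            try:
                record = fetch(doc_id)
                break
            except OSError:
                continue  # transient I/O error: try this ID again
        if record is None:
            continue  # ID not found or kept failing; a real run should log it
        # ensure_ascii=False writes Cyrillic as text instead of \uXXXX escapes
        yield json.dumps({"id": doc_id, "data": record}, ensure_ascii=False)


# Hypothetical stand-in for the real HTTP request (no network access).
def fake_fetch(doc_id: int) -> Optional[dict]:
    pages = {1: {"title": "Пример документа"}, 3: {"title": "Другой документ"}}
    return pages.get(doc_id)


lines = list(crawl_ids(range(1, 4), fake_fetch))
for line in lines:
    print(line)
```

Passing the fetcher in as a parameter keeps the ID loop independent of how the site is actually queried, which is not specified here.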

Deliverables:

1. JSONs for all files from [login to view URL]

a. We need the JSONs for all document IDs on the website. We expect upwards of 100 million individual JSON entries.

b. Ideally, the delivery would include some form of validation to confirm that we are getting the entire site and not an incomplete portion.

2. Code used to make requests to the API

a. This can be delivered at the end of the project.

b. We may require some explanation of the code so we can use it later on to fill in missing entries or redownload corrupted files.
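The validation and "fill in missing entries" requirements above could be approached by auditing the delivered files against the set of IDs that is expected to exist. This is a sketch only, assuming one JSON object per line with an integer `id` field (a layout chosen for illustration, not something the posting specifies) and assuming the expected ID set is known or can be estimated.

```python
import json
from typing import Iterable, Set, Tuple


def audit(lines: Iterable[str], expected_ids: Iterable[int]) -> Tuple[Set[int], int]:
    """Return (IDs with no valid record, count of unparseable lines)."""
    seen: Set[int] = set()
    corrupted = 0
    for line in lines:
        try:
            seen.add(int(json.loads(line)["id"]))
        except (ValueError, KeyError, TypeError):
            # Truncated JSON, a missing "id" field, or a non-object line
            corrupted += 1
    return set(expected_ids) - seen, corrupted


# Made-up sample: ID 2 was never downloaded, ID 4 was cut off mid-write.
sample = ['{"id": 1, "data": {}}', '{"id": 3, "data": {}}', '{"id": 4, "da']
missing, corrupted = audit(sample, range(1, 5))
print(sorted(missing), corrupted)
```

The missing IDs can then be fed back into the download script. Holding every ID in a Python set is fine for a sketch, but at around 100 million entries a more compact structure or a database would probably be needed.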

Method of Delivery:

- Shared cloud storage to be agreed upon. JSON files should be split into batches of between 500MB and 1GB for easier processing.
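One way to produce batches within a size limit is to group records by their encoded size before writing each file. The sketch below assumes the records are already serialized as one JSON string per line; the function name and the tiny `max_bytes` value are illustrative only. Sizes are measured on the UTF-8 bytes, which matters here because each Cyrillic character takes two bytes.

```python
from typing import Iterable, Iterator, List


def split_into_batches(lines: Iterable[str], max_bytes: int) -> Iterator[List[str]]:
    """Group lines so that each batch stays at or under max_bytes when written."""
    batch: List[str] = []
    size = 0
    for line in lines:
        line_bytes = len(line.encode("utf-8")) + 1  # +1 for the newline
        if batch and size + line_bytes > max_bytes:
            yield batch
            batch, size = [], 0
        # A single line larger than max_bytes still gets its own batch
        batch.append(line)
        size += line_bytes
    if batch:
        yield batch


records = ['{"id": 1}', '{"id": 2}', '{"id": 3}']
batches = list(split_into_batches(records, max_bytes=25))
print(batches)
```

For the sizes requested above, `max_bytes` would be set to something like `1024 ** 3`, with each yielded batch written to its own file.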

Skills: AJAX, Data Scraping, HTML, JavaScript, Python

Project ID: #17792622

About the project

13 proposals Remote project Active 5 years ago

Awarded to:

chirgeo

OK, as we agreed. I will write the script and upload the data to a file-sharing service such as Dropbox. As soon as I have a chunk of files ready, I will upload it. Thanks, let's do great work!

$1200 USD in 1000 days
(114 Reviews)
7.3

13 freelancers are bidding on average $1073 for this job

drgold03

Hello, I am good at PHP/HTML/JS/CSS etc. and have a lot of experience. I am an effective web developer and can deliver exactly on deadline. I will do my best to satisfy you. Thank you for your time and consideration. I l…

$1250 USD in 20 days
(172 Reviews)
7.5
crocodile305

Hi, how are you? I read your description carefully. If you want to see my skills, please go to this link: https://www.freelancer.com/u/crocodile305 When you have enough time to discuss your project wi…

$750 USD in 3 days
(42 Reviews)
6.8
Gaosong2017

Honorable Seniors, how are you? I have ample experience in web scraping and data mining using Python, Java, and C# with Selenium. I will do my best to deliver a great result to your satisfaction. Hope to meet you soon…

$1000 USD in 20 days
(79 Reviews)
6.8
dkarataev

Hi dss3js, I find myself eligible for this job since I have 7 years of extensive experience. I am good at AJAX, Data Scraping, HTML, JavaScript, and Python. dss3js, please send a message so that we can discuss more abo…

$1250 USD in 11 days
(21 Reviews)
6.9
polarjin2017

Hi, how are you? I read your project description carefully. Owing to my rich experience in Python, Scrapy, and data entry, I can say I can do this perfectly. I have many top skills like Python, Scrapy, CSS, HTML…

$1250 USD in 20 days
(95 Reviews)
6.6
i8solutions

I can download using Python. Regards, vasudha

$1250 USD in 20 days
(20 Reviews)
6.2
KGeorgy

Hello, thank you for the job posting. It’s a pleasure to meet you. I’d really like to work with you on this one if possible! I do have a couple of questions, but first I’d like to make you an offer and some backgrou…

$1250 USD in 20 days
(48 Reviews)
6.1
skfaroo123

Hi there. This is a web scraping job, and I have done this kind of work before. Please contact me to discuss more details in chat. Kind regards

$1250 USD in 20 days
(32 Reviews)
6.1