Urgent: Scrapy sitemap parsing gig

Closed Posted 4 years ago Paid on delivery

I have a huge list of domains, and we need to parse each one to extract all of its sitemap data.

I’ll provide a CSV of all the domains. You might need to normalize them: check whether each one answers on http or https, and with or without the www prefix.
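A minimal sketch of the normalization step, assuming the CSV entries can be bare hosts or full URLs. The function name `candidate_sitemap_urls` and the default `/sitemap.xml` path are illustrative assumptions, not part of the posting; a real run would probably also check each site's robots.txt for a `Sitemap:` line.

```python
from urllib.parse import urlsplit


def candidate_sitemap_urls(raw_domain, path="/sitemap.xml"):
    """Return the sitemap URLs to try for one CSV entry, https first.

    `path` is an assumption: the posting does not say where sitemaps live.
    """
    text = raw_domain.strip().lower()
    # urlsplit only fills in netloc when a scheme separator is present.
    if "://" not in text:
        text = "//" + text
    # Drop any credentials, port, and trailing dot from the host part.
    host = urlsplit(text).netloc.split("@")[-1].split(":")[0].rstrip(".")
    if not host:
        return []
    bare = host[4:] if host.startswith("www.") else host
    hosts = [bare, "www." + bare]
    # Four combinations: https/http crossed with bare/www host.
    return [f"{scheme}://{h}{path}" for scheme in ("https", "http") for h in hosts]
```

The crawler would request these in order and keep the first one that returns a valid sitemap as the "proper" URL for that domain.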

We need two outputs.

The first is a summary CSV with the following columns:

Proper URL of the sitemap | total pages in the sitemap | one column for each date in the last year, holding the count of pages updated on that date

So the CSV will have 367 columns (2 fixed columns plus 365 daily columns).
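The 367-column layout can be sketched as below. This is only an illustration: the column names, the helper names, and the reading of "the last year" as the 365 days ending on a chosen day are assumptions.

```python
from collections import Counter
from datetime import date, timedelta

DAYS = 365  # assumption: "the last year" means 365 daily columns


def date_columns(end_day):
    """The 365 dates ending at end_day, oldest first."""
    return [end_day - timedelta(days=offset) for offset in range(DAYS - 1, -1, -1)]


def summary_header(end_day):
    """Header row: 2 fixed columns plus one ISO date per day."""
    return ["sitemap_url", "total_pages"] + [d.isoformat() for d in date_columns(end_day)]


def summary_row(sitemap_url, modified_dates, end_day):
    """Build one summary row.

    modified_dates holds one datetime.date (or None when the sitemap
    gives no modification date) per page listed in the sitemap.
    """
    counts = Counter(d for d in modified_dates if d is not None)
    # Counter returns 0 for dates with no updates; dates outside the window are ignored.
    return [sitemap_url, len(modified_dates)] + [counts[d] for d in date_columns(end_day)]
```

Pages whose modification date is missing or older than the window still count toward the total but add nothing to the daily columns.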

The second output is one CSV file per domain. Hit the sitemap for each site and dump its entries to that domain's file, with these columns:

URL / modified
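As a rough sketch of the per-domain dump, the standard library is enough to pull `loc` and `lastmod` out of a sitemap that has already been downloaded. The function names and the `url`/`modified` header are assumptions; fetching, gzip handling, and following sitemap index files are left out.

```python
import csv
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip the '{namespace}' prefix ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def parse_sitemap(xml_text):
    """Return (kind, entries) for one sitemap document.

    kind is the root tag: 'urlset' for a page list, 'sitemapindex' for a
    list of child sitemaps. entries are (loc, lastmod) tuples, with an
    empty string when lastmod is absent.
    """
    root = ET.fromstring(xml_text)
    entries = []
    for node in root:
        fields = {local_name(child.tag): (child.text or "").strip() for child in node}
        if fields.get("loc"):
            entries.append((fields["loc"], fields.get("lastmod", "")))
    return local_name(root.tag), entries


def write_domain_csv(entries, out):
    """Write the url/modified rows for one domain to an open text file."""
    writer = csv.writer(out)
    writer.writerow(["url", "modified"])
    writer.writerows(entries)
```

When `kind` comes back as `sitemapindex`, each `loc` points at another sitemap that would need to be fetched and parsed the same way before writing the domain's file.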

I have about 160k domains to process.

I’ll provide an Ubuntu AWS machine to run your solution on. I'm thinking Scrapy or similar, running for a few days.

To apply for this job, your proposal must include the following:

1- questions

2- what framework will your solution use?

3- ballpark how much time to get the solution running?

4- how many domains per 10 sec do you think we can process?
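As a back-of-the-envelope check on question 4, the 160k domains and the "few days" runtime mentioned above imply a minimum sustained rate. The day counts tried here are assumptions, and the figure ignores retries, protocol and www probing, and sitemap index files, all of which add requests per domain.

```python
DOMAINS = 160_000  # "about 160k domains" from the posting


def domains_per_10_sec(days):
    """Average domains that must finish every 10 seconds to be done in `days`."""
    seconds = days * 24 * 60 * 60
    return DOMAINS / seconds * 10


# Assumed readings of "a few days".
for days in (2, 3, 5):
    print(f"{days} days -> {domains_per_10_sec(days):.1f} domains per 10 sec")
```

For a three-day run this works out to roughly six domains every 10 seconds, so even a modest crawl rate would fit the stated window.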

Skills: Python, Web Scraping, Software Architecture, Linux, Scrapy

Project ID: #22451100

About the project

22 proposals Remote project Active 4 years ago

22 freelancers are bidding on average $35/hour for this job

dreamci

Hello there. We are top-quality full-stack developers and we are ready to work on this project. We use version control systems, staging servers, a team Slack channel, and a task management tool. Can you send me a message? T…

$40 USD / hour
(102 Reviews)
8.5
schoudhary1553

Hello, I have gone through your job posting and am very interested in working with you. I am an expert in this field. I have already completed several projects like this. For evidence you can see my profile. Pl…

$25 USD / hour
(98 Reviews)
7.0
adampohp79

I can start work right now and I can show you a perfect result in a short time. Please feel free to contact me. I look forward to hearing from you.

$38 USD / hour
(72 Reviews)
6.2
datakillers

Hi, I am well suited to your project; I also have more than 10 years of working experience. To confirm, please visit my profile and check my customers' satisfaction level. I will complete your project within your…

$25 USD / hour
(70 Reviews)
6.3
smsaurabhv

Hi, I have gone through your requirement to scrape lots of websites. I am an EXPERT in building scraping tools/scripts. Hence, I can SURELY work on your project. I have 4 YEARS of EXPERIENCE in developing PHP-PYTHON…

$33 USD / hour
(104 Reviews)
5.9
writingapp

Hi. I have written a similar app, but for Windows. I am ready to write your project. 1- questions? Can you run it on Windows? 2- what framework will your solution use? .NET 3- ballpark how much time to get the solution r…

$50 USD / hour
(51 Reviews)
5.4
RummanC

Hi, I would love the opportunity to work on this project with you. I have vast programming experience, recently specialising in MetaQuotes Language, but I have previously deployed several Python programs commercially.…

$45 USD / hour
(20 Reviews)
5.4
phpdevindia

Hi there! I am interested in doing this project for you. 1- questions Ans: Please send me at least 5 different URLs so I can check. 2- what framework will your solution use? Ans: Scrapy will be best for this. 3- ballpark h…

$25 USD / hour
(22 Reviews)
6.4
kostyakislicin

Hi. I've checked your project description and I'm interested in your job. I fully understand your requirement. I'm very skilled with: JS frameworks & libraries like Angular, React, Vue; PHP frameworks such as Laravel…

$30 USD / hour
(4 Reviews)
4.1
mackthehobbit

Hello! I have worked on a few web scraping projects in the past, some with VB and some with Scrapy. To answer your questions in order: - I would use Python 3 with Scrapy, and possibly modules like urllib to sanitise t…

$25 USD / hour
(14 Reviews)
4.1
dineshreddykdp

Hello. I have just reviewed your job description carefully. ALL the SKILLS you need have never been a problem for me. I can solve any problem there, as I have many years of experience in web development. I'll be great…

$38 USD / hour
(9 Reviews)
3.4
AnastasiaMalko

Hi. I read your job description in detail and feel I can help with your project. I have full experience and skills in Python. I have done many projects similar to yours with Flask, Django and M…

$25 USD / hour
(4 Reviews)
2.8
KonstanBer

Greetings. I am an expert in software architecture. I have rich experience in machine learning, AI, image processing, OpenCV, and Google APIs and extensions. I have a lot of experience in programming languages such as c…

$38 USD / hour
(2 Reviews)
2.7
prizon2008

Hello. 1- questions: please give me at least one site link and the CSV file for all domains. 2- what framework will your solution use?: Core PHP / DOMDocument parser, Python Scrapy framework. 3- ballpark how much time…

$40 USD / hour
(1 Review)
2.6