
Log analyzer in php

$250-750 USD

In Progress
Posted almost 9 years ago


Paid on delivery
I will only consider people with Freelancer experience (ratings). Bids should be around $250, up to a maximum of $350 (I will consider the high end only if you have a lot of references and completed projects).

I have a simple PHP shell script that parses the "access-log" visitor log produced by nginx. I want you to improve it so that the database contains only the data I really need for user-friendly statistics. This task covers only the script and the db layout / SQL queries. My script simply opens the access log, parses it line by line, and dumps each visit into the database via mysqli. An example of the array that is available is attached.

I want to be able to run this script every day, maybe more often, and have the data I don't need deleted after each run, or before the insert is done, WITHOUT hammering the database with a lot of unnecessary SQL queries (it must be optimized). I want you to create tables and code that keep just the important stuff: summarize historic data and free up the database when the job runs, or before the data is stored. For instance, I have no use for knowing what visitor X did on page X at exactly that year, day, minute and second. But I do need to be able to ask the database, in an easy way, how many people visited page "[login to view URL]" at hour level three years back, without waiting ten years for MySQL to finish an advanced query.

I want to be able to run a query like this:

select UniqueVisitors from PageViews where timestamp-period BETWEEN 102443744 AND 24327487 order by NumVisits

and get a list of the pages that were visited and the number of both unique and non-unique IP addresses, drilled down to the hour. Even three years back! The old data that was used to fill up PageViews could then be removed, or at least truncated heavily. Roughly:

select * from visits { update PageViewHistory set NumVisits=NumVisits+1 where timestamp BETWEEN 43874681 AND 6857438 }

(where the timestamp range corresponds to something human-readable such as year=2015, month=04, day=23, hour=23, so that I can show how many visits a certain page got hour by hour, even years back).

I do NOT want to see something like this:

select count(group(column)) from Visits, PageViews where Visits.VisitID=[login to view URL] JOIN blah blah

I want queries to run fast even with 100,000 rows; asking a big blob table against other tables using nested joins or subqueries takes a lot of resources. This setup will make it simple to query pageviews, and the Visit table can be removed afterwards. I'm not even sure it is needed at all. You must also figure out a way to separate unique from non-unique visitors.

In your bid, please explain what you intend to do; this shows me that you understand the project. You DO have to think a lot yourself and use your imagination on how to optimize this. I don't want you to ask me about every "problem" you see, and I'm sure there are also some stats I have not thought of.

I also want a more live view of fresh data. For instance, it is useful to see PageViews reports down to the minute for new data (the current month or so). So you have two tasks:

1) Optimize and design a data structure for history, down to the hour, for old data.
2) Optimize for a more live view, also down to the hour, including the latest 100 pageviews, visits, browsers, IPs and traffic.

Data we don't need should be gone/deleted, and the script should avoid importing data that has already been processed by the script (avoid duplicates).

History and "Live" view reports:
1. Browsers (based on log data; you must parse the user-agent string).
2. OS (including iPhone etc., which you can gather from the user-agent string).
3. PageViews AND Hits (unique and non-unique).
4. Traffic (bytes transferred).
5. Referrer sites, referring sites and referrer search words.
6. 404/not found and other status-code visits.
7. Location.

You need to show me SIMPLE SQL queries that break down a report, for both the history and the live view, for all the reports above. These queries should be based on a PHP-generated timestamp, time(). PHP 5.5.23 (cli), MariaDB, CentOS 7 and nginx are the tools you can use.
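A minimal sketch of one way to read the hour-level history requirement, written for PHP 5.5 with mysqli: the raw log is rolled up into an hourly counter table while it is parsed, using INSERT ... ON DUPLICATE KEY UPDATE, so reports never have to join against the raw visit rows. The table and column names (pageviews_hourly, hour_ts, page, hits), the log path and the credentials are assumptions for illustration, not the poster's schema; the log is assumed to be in nginx's default "combined" format, and unique-IP counting is left out.

<?php
// Sketch only: roll an nginx "combined" access log up into hourly counters.
// Assumed table (not the poster's schema):
//   CREATE TABLE pageviews_hourly (
//     hour_ts INT UNSIGNED NOT NULL,   -- unix timestamp truncated to the hour
//     page    VARCHAR(255) NOT NULL,
//     hits    INT UNSIGNED NOT NULL DEFAULT 0,
//     PRIMARY KEY (hour_ts, page)
//   ) ENGINE=InnoDB;

$db = new mysqli('localhost', 'stats_user', 'stats_pass', 'stats');
if ($db->connect_errno) {
    die("connect failed: " . $db->connect_error . "\n");
}

$upsert = $db->prepare(
    "INSERT INTO pageviews_hourly (hour_ts, page, hits) VALUES (?, ?, 1)
     ON DUPLICATE KEY UPDATE hits = hits + 1"
);

// Default "combined" log line:
// 1.2.3.4 - - [23/Apr/2015:23:59:58 +0200] "GET /page HTTP/1.1" 200 1234 "ref" "agent"
$pattern = '/^\S+ \S+ \S+ \[([^\]]+)\] "(?:GET|POST|HEAD) (\S+)[^"]*" \d{3} (?:\d+|-)/';

$fh = fopen('/var/log/nginx/access.log', 'r');
if ($fh === false) {
    die("could not open access log\n");
}
while (($line = fgets($fh)) !== false) {
    if (!preg_match($pattern, $line, $m)) {
        continue; // skip malformed lines
    }
    $dt = DateTime::createFromFormat('d/M/Y:H:i:s O', $m[1]);
    if ($dt === false) {
        continue;
    }
    // Truncate to the hour so every request in the same hour collapses into one row.
    $hourTs = $dt->getTimestamp() - ($dt->getTimestamp() % 3600);
    $page   = $m[2];

    $upsert->bind_param('is', $hourTs, $page);
    $upsert->execute();
}
fclose($fh);

With a table shaped like this, the hour-level history report becomes a single range scan on the primary key, for example select page, sum(hits) from pageviews_hourly where hour_ts BETWEEN ? AND ? group by page order by sum(hits) desc, and the raw rows it was built from can be deleted after each run. Counting unique IPs per hour would need either a second keyed table or an approximation, which this sketch does not attempt.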
Project ID: 7502671

About the project

3 proposals
Remote project
Active 9 yrs ago

3 freelancers are bidding on average $530 USD for this job
I would make my script move the log files to a 'todo' directory, so the web server can create a new one, thus avoiding skipped/duplicate entries. The script would then dump all the data into a waiting-list table on the MySQL server, to be processed by the SQL server itself (using transactions/triggers); this drastically increases the speed of all the calculations. There would be two 'output' tables: one containing a summary for every page for every hour (unique visits/hits/etc.) and one for the live data. Queries would look something like:

select sum(unique) from summary where time < "2014/03/02 11:00" and time > "2014/03/01 09:00" group by page order by sum(unique)

With regards, John

PS: you can contact me if you have any questions.
$500 USD in 7 days
0.0 (0 reviews)
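A minimal sketch of the log hand-over described in the bid above, with assumed paths and names (the todo/ directory and file naming are illustrative, not taken from the bid): the live access log is renamed into a holding directory and nginx is asked to reopen its logs, so the importer only ever sees finished files and cannot skip or duplicate lines.

<?php
// Sketch of the hand-over described in the bid above (paths and names assumed).
$live = '/var/log/nginx/access.log';
$todo = '/var/log/nginx/todo/access-' . date('Ymd-His') . '.log';

// 1. Move the live log aside; nginx keeps writing to the already-open file handle.
if (!rename($live, $todo)) {
    die("could not move $live\n");
}

// 2. Tell nginx to reopen its logs, which creates a fresh, empty access.log.
exec('nginx -s reopen');

// 3. The importer then processes only files under todo/, removing each file once
//    it has been loaded into the waiting-list table, so re-runs cannot create
//    duplicate entries.

The bid's example report, written out against an assumed hourly summary table (columns page, hour_ts, uniques), would read roughly: select page, sum(uniques) from summary where hour_ts >= '2014-03-01 09:00' and hour_ts < '2014-03-02 11:00' group by page order by sum(uniques) desc.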

About the client

OSLO, Norway
5.0
93
Payment method verified
Member since Jun 14, 2002
