Skills you must have:
Solid experience in image identification, CNNs, machine learning, algorithm tuning, and Python.
This project is NOT for beginners.
1) training data:
- total products: 1000+
- each product has 1 or 2 images (image quality, size, resolution, angle, and lighting conditions vary)
- Identification is based on the structure of the product, not its color.
- Convert all color images to greyscale (if this would simplify the work for you)
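The greyscale conversion mentioned above could look roughly like the sketch below. It is a minimal illustration, not the required method: the function names `to_grey` and `stretch_contrast` are hypothetical, the input layout (rows of `(R, G, B)` tuples with values 0-255) is an assumption, and the luminance weights are the common ITU-R BT.601 ones. The contrast stretch is one simple way to reduce the effect of varying light conditions.

```python
# Hypothetical helpers; pixel layout (rows of (R, G, B) tuples) is an assumption.

def to_grey(rgb_rows):
    """Convert rows of (R, G, B) pixels (0-255) into rows of grey levels (0-255)."""
    grey_rows = []
    for row in rgb_rows:
        # Weighted sum of the channels (ITU-R BT.601 luminance weights).
        grey_rows.append([
            int(round(0.299 * r + 0.587 * g + 0.114 * b))
            for (r, g, b) in row
        ])
    return grey_rows


def stretch_contrast(grey_rows):
    """Rescale grey levels so the darkest pixel is 0 and the brightest is 255."""
    flat = [v for row in grey_rows for v in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:
        # A completely flat image carries no structure; map it to all zeros.
        return [[0 for _ in row] for row in grey_rows]
    scale = 255.0 / (hi - lo)
    return [[int(round((v - lo) * scale)) for v in row] for row in grey_rows]
```

In a real pipeline an image library would normally do the decoding and conversion; the pure-Python version only shows the arithmetic.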
- design and develop code to create a data model and extract features based on training data
- on the testing data, achieve identification accuracy of 85% or above
- code in Python with sufficient comments (any private modules/libs used shall also be included)
- data models trained or features extracted
- necessary documents
- with trained models and features extracted, each identification shall take no more than 1 sec
- identification accuracy > 85%
- able to handle most image problems, including varying light conditions, angles, and size differences
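One way to check the per-identification time limit during testing is sketched below. The name `time_identifications` and the `identify` callable are hypothetical placeholders for whatever the identification system exposes; the default budget of 1 second is taken from the limit stated above.

```python
import time


def time_identifications(identify, images, budget_sec=1.0):
    """Run identify(image) on every image and record how long each call takes.

    Returns (timings, slow_indices), where slow_indices lists the positions
    of the images whose identification exceeded budget_sec.
    """
    timings = []
    for image in images:
        start = time.time()
        identify(image)  # hypothetical identification call
        timings.append(time.time() - start)
    slow_indices = [i for i, t in enumerate(timings) if t > budget_sec]
    return timings, slow_indices
```

Timings should be taken on the target CPU-only machine with the trained models already loaded, since model loading time would otherwise dominate the first call.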
1) The training part needs to be scalable since we keep adding new products/images to the system
2) The identification part, running as part of an online service, shall take less than 10 sec to finish
3) All processing can be executed on regular CPU env (GPU not required)
4) Language: Python 2.7 on Linux
- if the system creates any binary data, please include a format explanation in the documentation
- the process for re-training the system with new data shall be documented
- parameter tuning shall be documented
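As an illustration of what a documented binary format might look like, here is a minimal sketch using the standard `struct` module. Everything about the layout is an assumption made for this example: a 4-byte signature `PFV1`, then a little-endian uint32 record count and uint32 vector length, then one record per image consisting of a uint32 product id followed by that many float32 values. The function names are hypothetical.

```python
import struct

MAGIC = b"PFV1"  # hypothetical 4-byte file signature


def pack_features(records):
    """Serialize a list of (product_id, vector) pairs into bytes.

    product_id is a non-negative integer; all vectors must have equal length.
    """
    dim = len(records[0][1]) if records else 0
    chunks = [MAGIC, struct.pack("<II", len(records), dim)]
    for product_id, vector in records:
        if len(vector) != dim:
            raise ValueError("all feature vectors must have the same length")
        chunks.append(struct.pack("<I%df" % dim, product_id, *vector))
    return b"".join(chunks)


def unpack_features(blob):
    """Parse bytes produced by pack_features back into (product_id, vector) pairs."""
    if blob[:4] != MAGIC:
        raise ValueError("not a feature file")
    count, dim = struct.unpack_from("<II", blob, 4)
    record_format = "<I%df" % dim
    record_size = struct.calcsize(record_format)
    offset = 12  # 4 bytes of signature + 8 bytes of header
    records = []
    for _ in range(count):
        fields = struct.unpack_from(record_format, blob, offset)
        records.append((fields[0], list(fields[1:])))
        offset += record_size
    return records
```

A fixed, explicit byte order and a signature at the start make the file easy to describe in the document and easy to validate when it is loaded.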
Training data: https://s3.amazonaws.com/product-image-id-sample-data/backview_samples.gz
* the training model has to be scalable to re-train when new products are added to the system.
You deliver 2 systems: a training system and an identification system
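The split into a training system and an identification system, with re-training when products are added, can be sketched as a simple nearest-neighbour index. This is only one possible design under stated assumptions, not the required one: the class name `ProductIndex`, its methods, and the use of squared Euclidean distance on already-extracted feature vectors are all hypothetical.

```python
class ProductIndex(object):
    """Toy index: 'training' stores feature vectors, 'identification' ranks them."""

    def __init__(self):
        self.entries = []  # list of (product_id, feature_vector)

    def add_image(self, product_id, features):
        """Training step: register one more image for a new or existing product."""
        self.entries.append((product_id, list(features)))

    def identify(self, features, top_k=5):
        """Return up to top_k product ids, closest first."""
        best = {}  # product_id -> smallest distance over that product's images
        for product_id, stored in self.entries:
            dist = sum((a - b) ** 2 for a, b in zip(features, stored))
            if product_id not in best or dist < best[product_id]:
                best[product_id] = dist
        ranked = sorted(best.items(), key=lambda item: (item[1], item[0]))
        return [product_id for product_id, _ in ranked[:top_k]]
```

With this kind of design, adding a product is just another `add_image` call, so no code changes are needed when the catalogue grows. A linear scan is used here for clarity only.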
The images may have all kinds of quality issues, so please download them and assess the work before you propose.
Sorry, the link above has a problem. Here is the good one I've verified:
----- Below are agreements between me and Zhou H. -----
1) 08/10 --
a) a binary of the training system will be delivered with an output format definition/explanation.
b) basic usage instructions shall be delivered
c) employer will follow the instructions and use the binary to verify results
d) bugs shall be fixed before 08/12 to secure the quality of the training system.
2) 08/17 --
a) a binary of the identification system will be delivered with an output format definition/explanation.
identification results shall have an accuracy of 99% within the top 5 results (for 100 images tested, 99 shall have the correct match detected within the top 5 results)
where the top 1 result has an accuracy of 85% or above
b) usage instructions for both the training system (if anything changed) and the identification system shall be delivered
c) employer will do the following types of tests:
- adding new products and/or new images of existing products to the system and verifying that an image of the product can be identified
- using different images of the same products in the training set to validate results
d) bugs shall be fixed by 08/22. There may be multiple rounds of binary releases and feedback on test results
3) 08/22 ($800 release)
a) when all tests of the binary have passed and it meets the requirements, the source code will be released to the employer
b) employer will release $800 upon receipt of the complete source code
c) employer will review the code and repeat the above testing with the source code
d) bugs are expected to be fixed by 08/24 to secure the completion of the whole project
4) 08/27 ($200 release)
a) 2nd release of source code with sufficient comments, design documents, test cases, major algorithms listed
[MODIFICATION] In milestone item 2) above, the expected coverage for the top 5 results is "90%".
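The top 1 and top 5 figures in milestone item 2) could be measured along the lines of the sketch below. The function name `top_k_accuracy` and the input layout (one ranked list of product ids per test image, plus the list of correct ids) are assumptions for illustration.

```python
def top_k_accuracy(ranked_predictions, true_labels, k):
    """Fraction of test images whose correct product appears in the first k results.

    ranked_predictions: one list of product ids per test image, best match first.
    true_labels: the correct product id for each test image, in the same order.
    """
    if len(ranked_predictions) != len(true_labels):
        raise ValueError("need exactly one ranked list per test image")
    if not true_labels:
        return 0.0
    hits = sum(
        1 for ranked, truth in zip(ranked_predictions, true_labels)
        if truth in ranked[:k]
    )
    return float(hits) / len(true_labels)
```

Under the agreement, `top_k_accuracy(results, truth, 1)` would need to reach 0.85 or above, and `top_k_accuracy(results, truth, 5)` the agreed top 5 coverage.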
Best Effort and Damage Control:
- the employer fully trusts that the bidder is a professional who will do his best to finish this project
- both sides will cooperate to demonstrate high professionalism and secure the success of this project
- Should anything prevent this project from being finished, the employer is willing to cover the $300 fee charged by Freelancer for this project.
1) the system will contain 2 sub-systems: training system and identification system
2) both systems will be developed using Python 2.7 and open source modules. Any in-house developed modules that are imported shall be included as deliverables.
3) both systems will run on a regular Linux or Mac OS X machine (regular CPU, no GPU)
4) the training system shall be able to train 10,000+ products within 10 minutes. Adding new products, or new images to an existing product, shall only require re-training the data model; no coding shall be required.
5) identification system shall be able to identify an image among 10,000 products within 30 sec to 1 minute
6) Color differences shall be ignored in images used for training and identification.
7) Both systems shall have high tolerance of
- images on different backgrounds (white, grey, black, partially colored, etc.)
- product rotated to a different angle than in the training picture (upside down, horizontal vs vertical, etc.)
- images taken at a 45-degree angle to the product rather than from directly above it
- images handled as greyscale; color differences (red, blue, green) shall be ignored
- certain level of image distortion
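One simple way to approach the rotation tolerance listed above is to generate rotated copies of each training image, so that an upside-down or sideways photo still has a close match. The sketch below is an assumption-laden illustration: the function names are hypothetical, images are treated as plain lists of rows of grey levels, and only multiples of 90 degrees are covered. It does not address the 45-degree viewpoint or other distortions.

```python
def rotate90(grid):
    """Rotate a 2-D grid (list of rows) 90 degrees clockwise."""
    return [list(row) for row in zip(*grid[::-1])]


def rotated_variants(grid):
    """Return the image at 0, 90, 180 and 270 degrees (clockwise)."""
    variants = [[list(row) for row in grid]]
    for _ in range(3):
        variants.append(rotate90(variants[-1]))
    return variants
```

Each variant could then be passed through the same feature extraction as the original image during training, at the cost of roughly four times the training data.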