You are given a data set of approximately 20,000 news documents collected from a set of newsgroups (mailing lists). The documents (email messages) are partitioned almost evenly across 20 different topics. The documents of each newsgroup are stored in one directory, and each document is stored as a text file in a semi-structured format.
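As a rough illustration of the layout described above, the following Python sketch walks one directory per newsgroup and reads each message file. It assumes the "semi-structured format" is the usual email layout (header lines, a blank line, then the body), which the posting does not confirm. The names `load_newsgroups`, `count_per_topic`, and `root_dir` are hypothetical and chosen only for this example.

```python
import os
from collections import Counter
from email import message_from_string


def load_newsgroups(root_dir):
    """Yield (topic, subject, body) for every message file under root_dir.

    Assumption: each subdirectory of root_dir is one newsgroup (topic) and
    each file inside it is one message with email-style headers.
    """
    for topic in sorted(os.listdir(root_dir)):
        topic_dir = os.path.join(root_dir, topic)
        if not os.path.isdir(topic_dir):
            continue
        for name in sorted(os.listdir(topic_dir)):
            path = os.path.join(topic_dir, name)
            if not os.path.isfile(path):
                continue
            # latin-1 can decode any byte sequence, so reading never fails
            # on messages with unusual characters.
            with open(path, encoding="latin-1") as handle:
                message = message_from_string(handle.read())
            body = message.get_payload()
            if not isinstance(body, str):
                # Multipart messages return a list of parts; skip their text here.
                body = ""
            yield topic, message.get("Subject", ""), body


def count_per_topic(root_dir):
    """Count how many documents each topic directory contains."""
    return Counter(topic for topic, _, _ in load_newsgroups(root_dir))
```

Calling `count_per_topic` on the data set's root directory would be one quick way to check the "almost evenly across 20 topics" claim before doing any further processing.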
13 freelancers are bidding on average $65 for this job
Message me for details and discussion. Relevant Skills and Experience: Java / Python; text processing; 20,000 text files are NOT big data. Proposed Milestones: $60 USD - dummy milestone
Still looking for my first job here; contact me. Relevant Skills and Experience: I have been working in Java for 2 years, developing software. Proposed Milestones: $25 USD - end