Hi there!
For this project, you'll be using a publicly available data source of schools in CSV format. You'll download the data file and write two Python programs to manipulate the data.
We expect this to take around 2 hours to complete. If it seems like it will take significantly longer than that to complete, please ask us first to make sure you aren't over-complicating the project.
Please fork this repository, check in your code, and send us a link to your repo.
Part 0: Get the data
Go to this page and download the dataset described as:
Year 2005-2006 (v.1b), States A-I, ZIP (769 KB) CSV File. This is the only file you will be using. You can ignore the other files on this page.
You may also find it helpful to consult the documentation for this dataset.
Unzip the file, rename it to [login to view URL], and put it in the directory where you'll write your Python program.
Now you have the data you need to get started.
Part 1: Load data from CSV and compute stats.
Write a method (or a method per question if you'd prefer) that loads the data out of the CSV file you downloaded and computes answers to the following questions:
How many total schools are in this data set?
How many schools are in each state?
How many schools are in each Metro-centric locale?
Which city has the most schools? How many schools does it have?
How many unique cities have at least one school in them?
Guidelines
Create a file called [login to view URL], and write a method called print_counts.
Please implement all of these features using pure python.
You may use the following libraries: csv, time, itertools
No other libraries may be used.
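As a rough starting point, the counting questions above could be answered along these lines, using only the csv module and plain dictionaries. The column names (LSTATE05, LCITY05, MLOCALE), the latin-1 encoding, and the helpers load_rows and count_by are assumptions made for illustration; check them against the header row and the dataset documentation before relying on them.

```python
import csv


def load_rows(csv_path, encoding="latin-1"):
    """Read the whole CSV into a list of dicts keyed by header name."""
    # The encoding is an assumption; adjust it if the file fails to decode.
    with open(csv_path, newline="", encoding=encoding) as handle:
        return list(csv.DictReader(handle))


def count_by(rows, column):
    """Return a dict mapping each distinct value in `column` to its row count."""
    counts = {}
    for row in rows:
        key = row[column].strip()
        counts[key] = counts.get(key, 0) + 1
    return counts


def print_counts(csv_path, state_col="LSTATE05", city_col="LCITY05",
                 locale_col="MLOCALE"):
    """Print the Part 1 statistics. Column names are assumed, not confirmed."""
    rows = load_rows(csv_path)
    print("Total schools:", len(rows))

    print("Schools by state:")
    for state, total in sorted(count_by(rows, state_col).items()):
        print(f"  {state}: {total}")

    print("Schools by Metro-centric locale:")
    for locale, total in sorted(count_by(rows, locale_col).items()):
        print(f"  {locale}: {total}")

    # Keyed by city name only, so same-named cities in different states merge.
    by_city = count_by(rows, city_col)
    if by_city:
        top_city = max(by_city, key=by_city.get)
        print(f"City with most schools: {top_city} ({by_city[top_city]} schools)")
    print("Unique cities with at least one school:", len(by_city))
```

Whether two cities with the same name in different states count as one city is a judgment call; keying the count on a (city, state) pair instead would keep them separate.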
Part 2: Search over school data
We'd like teachers to be able to easily find the school they teach at. In order to do this, we'd like to offer a search feature that lets them search for their school using plain text.
This feature should search over school name, city name, and state name.
The top 3 matching results should be returned (see below for examples).
Guidelines
When a query doesn't match exactly, you'll need to come up with a set of rules to rank results. In particular, make sure more precise matches show up at the top of the list, and if there isn't an exact match but there is a close match, make sure some results are still returned. There is no perfect set of rules, but you should come up with a set that improves the end user's search experience as much as possible.
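One possible set of ranking rules is sketched below: exact word matches in the school name score highest, prefix matches count as close matches, and city and state matches add smaller amounts. The weights, the dictionary keys name, city, and state, and the functions tokenize, score_school, and rank are all hypothetical choices for illustration, not a prescribed solution.

```python
def tokenize(text):
    """Lowercase the text and split it into words on whitespace."""
    return text.lower().split()


def score_school(query, school):
    """Score one school against a query; a higher score is a better match.

    `school` is assumed to be a dict with "name", "city", and "state" keys.
    """
    query_tokens = tokenize(query)
    name_tokens = tokenize(school["name"])
    city_tokens = tokenize(school["city"])
    state_tokens = tokenize(school["state"])

    score = 0.0
    for token in query_tokens:
        if token in name_tokens:
            score += 3.0  # exact word match in the school name
        elif any(word.startswith(token) for word in name_tokens):
            score += 1.5  # close match: query word is a prefix of a name word
        if token in city_tokens:
            score += 2.0
        if token in state_tokens:
            score += 1.0
    if query_tokens == name_tokens:
        score += 5.0  # the query is exactly the school name
    return score


def rank(query, schools, limit=3):
    """Return up to `limit` schools with a positive score, best first."""
    scored = [(score_school(query, school), school) for school in schools]
    scored = [pair for pair in scored if pair[0] > 0]
    # list.sort is stable, so ties keep their original file order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [school for _, school in scored[:limit]]
```

Scoring every school on every query is simple to read but does linear work per search; whether that fits the time budget depends on the size of the file.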
Searches should run in real-time, meaning that they should return results to the user in less than 200ms. It's ok to perform data loading and processing up front that takes longer than this if you'd like.
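To keep each search fast, the expensive work can be done once at load time. The sketch below builds a word-to-schools index up front and measures a single lookup with time.perf_counter. The names build_index, candidates, and timed, and the name, city, and state keys, are assumptions for illustration.

```python
import time


def build_index(schools):
    """Map each lowercase word to the set of list positions of schools using it."""
    index = {}
    for position, school in enumerate(schools):
        text = " ".join((school["name"], school["city"], school["state"]))
        for token in text.lower().split():
            index.setdefault(token, set()).add(position)
    return index


def candidates(query, index):
    """Return positions of schools that share at least one word with the query."""
    found = set()
    for token in query.lower().split():
        found |= index.get(token, set())
    return found


def timed(func, *args):
    """Run func(*args) and return (result, elapsed time in milliseconds)."""
    start = time.perf_counter()
    result = func(*args)
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    return result, elapsed_ms
```

With an index like this, only the candidate schools need to be scored and sorted per query, and a call such as timed(candidates, "belmont", index) reports how long the lookup took.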
Create a file called [login to view URL], and write a method called search_schools.
Please implement all of these features using pure python.
You may use the following libraries: csv, time, itertools
No other libraries may be used.
Evaluation
Accuracy
We'll evaluate the accuracy of your search using sample queries. We have included a few below in Test Cases that you can test with (we'll also test with others).
The following queries should return the results shown below as the top hits. If multiple results are shown below, it doesn't matter if one appears before the other, but they must be the first results returned.
If you see [Next Best Hit], that means that you should include a reasonable hit for the search query, but that there isn't a specific hit that we'll be looking for.
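A small checker can encode these acceptance rules so sample queries are verified the same way each time. This is a sketch under two assumptions: results and expectations are compared as plain strings, and any [Next Best Hit] placeholders come after the specific hits. The function name top_hits_match is hypothetical.

```python
NEXT_BEST = "[Next Best Hit]"


def top_hits_match(results, expected):
    """Check ranked `results` against an `expected` list for one query.

    The specific expected hits must occupy the first positions in any order.
    Each NEXT_BEST placeholder only requires that some result fills the slot.
    """
    required = [item for item in expected if item != NEXT_BEST]
    head = results[:len(required)]
    if sorted(head) != sorted(required):
        return False
    return len(results) >= len(expected)
```

For example, top_hits_match(["B", "A", "C"], ["A", "B", NEXT_BEST]) passes because the two named hits lead the list and a third result is present.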
Performance
All results should return within 200ms.
EXTRA: All results are returned within 10ms.
Code Quality
We're looking for clear, easy-to-understand code. Write code that makes your thinking and your algorithm as obvious as possible.
Prioritize readability over cleverness. We care less that you can code a solution to this problem in one line, and more that we are able to easily follow your code.
Don't worry about documentation for this project. Do your best to make the code readable without requiring reading lots of documentation.
This is sairahul. I started my career as a big data developer and built a data lake platform with AWS services like Glue, Spark, Lambda, S3, and Athena, in Python and Scala. I also have experience with web technologies like Django and Flask, with mobile app development, and with DevOps tools like Jenkins, Docker, and Kubernetes, and I am a good problem solver.
PYTHON EXPERT MASTERS IN COMPUTER SCIENCE
Hello, there!
Thank you for sharing your project requirements; I read the project description thoroughly and would like to participate in your project. I am a pleasant person to work with, as well as a determined and self-motivated individual. To deliver the finest quality and client happiness, I work according to your specifications. I hope you will find my services useful. I guarantee that I will meet or exceed your expectations.
Thanks
I have multiple years of experience as a Data Engineer and Data Analyst working with Python and SQL, and I'd be happy to take on this assignment. Happy to discuss more details if you want.
Hello, I'm a data scientist who graduated from one of the most important schools in Morocco (INPT). I have accumulated a lot of experience in machine learning, deep learning, computer vision, and ETL during my career. I currently work at one of the biggest companies in Morocco.
I can guarantee you that the project will be done in the fastest and most precise way possible with all the details and comments mentioned.
I hope you choose my profile so that we can work together.