Data Transformation Pipeline in Jupyter Notebook

$30-250 USD

Completed
Posted over 6 years ago

Paid on delivery
Develop a Python script (it cannot be command-line driven) that enables me to quickly cleanse and transform datasets of varying sizes for use in other analytics systems. Using a Jupyter Notebook, I want to import complex datasets and wrangle them for use in virtually any target system.

Key capabilities include:
- Import from flat file
- Locate and remove or modify missing or mismatched data
- Unnest complex data structures
- Identify statistical outliers in the data for review and management
- Perform lookups from one dataset into another reference dataset
- Aggregate columnar data using a variety of aggregation functions
- Merge datasets with joins
- Append one dataset to another through union operations

This is not intended to be a web app of any kind. There is really no front-end to speak of. I simply want to be able to interact with the Jupyter Notebook to pull all this off.

In general, the flow is as follows:
1. Import data: Integrate data from a variety of sources.
2. Profile our data: Before, during, and after we transform our data, we can use the visual profiling tools to quickly analyze and make decisions about our data.
3. Build transform recipes: Use the various views in the Transformers to build our transform recipes and preview the results on sampled data.
4. Generate results: Launch a task to run our recipe on the full dataset. Review results and iterate as needed.
5. Export results: Export the generated results data for use outside of the script running in Jupyter Notebook.

Walking through the above, you will have noticed that we imported, cleansed, transformed, and possibly enhanced our data for use in the next step of our analytics pipeline.

Here are the greater details of what we are expecting as part of this solution: We expect that most of the functions contained within Pandas will suffice for what we need. However, each column within an imported Pandas dataframe needs to have all the below available to be applied to it should a user decide to select it:

^^^Please See Uploaded Document for More Details^^^
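To give bidders a rough idea of the scale involved, the key capabilities listed above could be sketched as a handful of small Pandas helpers that are called from notebook cells. This is only an illustration and not the specification in the uploaded document: the function names, the sample columns (`order_id`, `customer_id`, `amount`, `tags`, `region`), the `|` delimiter, and the 1.5 x IQR outlier rule are all assumptions made for the example.

```python
import io

import pandas as pd

# Hypothetical sample data standing in for flat files on disk.
ORDERS_CSV = """order_id,customer_id,amount,tags
1,C1,10.0,a|b
2,C2,,c
3,C1,12.0,
4,C3,11.0,a
5,C2,500.0,b|c
"""
CUSTOMERS_CSV = """customer_id,region
C1,East
C2,West
"""


def import_flat_file(source):
    """Import a delimited flat file (a path or a file-like object)."""
    return pd.read_csv(source)


def fill_missing(df, column, value):
    """Replace missing values in one column with a fixed value."""
    out = df.copy()
    out[column] = out[column].fillna(value)
    return out


def unnest(df, column, sep="|"):
    """Split a delimited string column so each item gets its own row."""
    out = df.copy()
    out[column] = out[column].str.split(sep)
    return out.explode(column).reset_index(drop=True)


def flag_outliers(df, column, k=1.5):
    """Flag values outside the k * IQR fences (one common rule of thumb).

    Missing values are also flagged, so fill or drop them first.
    """
    q1, q3 = df[column].quantile([0.25, 0.75])
    iqr = q3 - q1
    out = df.copy()
    out[column + "_outlier"] = ~out[column].between(q1 - k * iqr, q3 + k * iqr)
    return out


def lookup(df, reference, key):
    """Left join: every row of df is kept, whether or not it matches."""
    return df.merge(reference, on=key, how="left")


def aggregate(df, by, column, funcs=("sum", "mean", "count")):
    """Aggregate one column per group with several functions at once."""
    return df.groupby(by)[column].agg(list(funcs)).reset_index()


def union(first, second):
    """Append one dataset to another, renumbering the rows."""
    return pd.concat([first, second], ignore_index=True)


# The flow from the posting: import, cleanse, transform, export.
orders = import_flat_file(io.StringIO(ORDERS_CSV))
customers = import_flat_file(io.StringIO(CUSTOMERS_CSV))

cleaned = fill_missing(orders, "amount", orders["amount"].median())
flagged = flag_outliers(cleaned, "amount")
enriched = lookup(flagged, customers, "customer_id")
summary = aggregate(enriched, "region", "amount")
unnested = unnest(orders, "tags")
combined = union(orders, orders)

# Export the result as CSV text; pass a path to write a file instead.
exported = summary.to_csv(index=False)
```

In a notebook, each helper would typically be run in its own cell so that the intermediate dataframe can be inspected (for example with `flagged.head()` or `summary.describe()`) before moving on to the next step.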
Project ID: 15348575

About the project

5 proposals
Remote project
Active 7 yrs ago

Awarded to:
A proposal has not yet been provided
$166 USD in 7 days
0.0 (0 reviews)
0.0
0.0
5 freelancers are bidding on average $185 USD for this job
Hello! Any manipulations done in Jupyter Notebooks are part of my day job as a bioinformatics analyst.
Relevant Skills and Experience: Python, Jupyter, Data processing
Proposed Milestones: $294 USD - All
$294 USD in 5 days
5.0 (11 reviews)
4.5
4.5
I have good experience working with advanced R and Python. I have a good knowledge of deep learning and ML algorithms, and have also developed dashboards and Shiny web applications.
Relevant Skills and Experience: I understand the project requirements and will deliver the desired product within the time specified.
Proposed Milestones: $155 USD - milestone
$155 USD in 3 days
4.5 (10 reviews)
4.1
4.1
A proposal has not yet been provided
$177 USD in 7 days
5.0 (3 reviews)
2.7
2.7
I am a Python expert with data analytics experience.
Relevant Skills and Experience: Python, Jupyter Notebook, Excel, Data Processing
Proposed Milestones: $133 USD - full task
$133 USD in 2 days
0.0 (0 reviews)
0.0
0.0

About the client

Franklin, United States
5.0
9
Payment method verified
Member since Apr 17, 2010

Client Verification

Freelancer ® is a registered Trademark of Freelancer Technology Pty Limited (ACN 142 189 759)
Copyright © 2024 Freelancer Technology Pty Limited (ACN 142 189 759)