
Write a program using the Python language

$10-30 USD

Closed
Posted over 6 years ago

Paid on delivery
Download the genome in FASTA format. Open this file in Python and read it in, constructing a single DNA string 580076 characters long. Be sure to remove the header information in the file. An easy way to do this is to read in every line, strip the white space, and concatenate the resulting strings. Print out the strand and its length to make sure this works.

Part A: This part of the project is the analysis of the pattern of start codons and stop codons in the above sequence. Note that stop codons always stop translation, while start codons do not always start it. How many start and stop codons are there? Are there more starts than stops, or vice versa? Write a script that counts them. Remember there are three different stops. How many times is the distance from one stop to the next stop greater than 600? Are there multiple starts along these long stretches with no stops? What thoughts do you have about this?

Have your program print out the six frames (three forward and three complementary) as follows. Large stop blocks are stretches of 600 characters without any intervening stops. For each frame print a line like this:

Frame # : startct= # , stopct= # , large stop blocks= #

As we discussed in class, build a dict() that associates each stop with all the start codons that come just before it: the stop is the key and the list of starts is the value. We will restrict our analysis to these large stop-to-stop blocks.

Part B: But before we do this, let's look at the GenBank file for this little guy. It contains the actual genes that the original researchers annotated. Normally, when the sequencing is first performed, this information is not known. Researchers have to look at every ORF and either convert it to a protein sequence and check whether that protein is known, or at least look at the sequence statistically and see whether its pattern resembles a known protein. In this file you will notice CDS entries.
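The Part A steps described above (reading the FASTA file, then counting starts, stops, and large stop blocks in each of the six frames) could be sketched roughly as follows. This is only an illustration under several assumptions that the brief does not settle: ATG is taken as the only start codon, the three stops are TAA, TAG and TGA, a block counts as "large" when at least `min_block` characters separate one stop from the next, and frames 1 to 3 are the forward strand while 4 to 6 are the reverse complement. All function names are hypothetical.

```python
START = "ATG"                    # assumption: only ATG is counted as a start
STOPS = {"TAA", "TAG", "TGA"}    # assumption: the three standard stop codons
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def parse_fasta_lines(lines):
    """Join the sequence lines of a FASTA record, dropping '>' header lines."""
    return "".join(
        line.strip() for line in lines
        if line.strip() and not line.startswith(">")
    ).upper()


def read_fasta(path):
    """Read a FASTA file into one DNA string (header removed)."""
    with open(path) as handle:
        return parse_fasta_lines(handle)


def reverse_complement(dna):
    """Return the complementary strand, read in its own 5'-to-3' direction."""
    return dna.translate(COMPLEMENT)[::-1]


def scan_frame(strand, offset, min_block=600):
    """Scan one reading frame codon by codon.

    Returns (start_count, stop_count, large_block_count, starts_by_stop),
    where starts_by_stop maps the index of each stop that closes a large
    block to the indices of the starts seen since the previous stop.
    """
    start_ct = stop_ct = large = 0
    starts_by_stop = {}
    pending = []              # starts seen since the previous stop
    block_begin = offset      # assumption: the first block begins at the frame start
    for i in range(offset, len(strand) - 2, 3):
        codon = strand[i:i + 3]
        if codon == START:
            start_ct += 1
            pending.append(i)
        elif codon in STOPS:
            stop_ct += 1
            if i - block_begin >= min_block:
                large += 1
                starts_by_stop[i] = pending
            pending = []
            block_begin = i + 3
    return start_ct, stop_ct, large, starts_by_stop


def six_frame_report(dna, min_block=600):
    """Build the six report lines in the format requested by the brief."""
    lines = []
    frame = 0
    for strand in (dna, reverse_complement(dna)):
        for offset in range(3):
            frame += 1
            starts, stops, large, _ = scan_frame(strand, offset, min_block)
            lines.append(
                f"Frame {frame} : startct= {starts} , stopct= {stops} , "
                f"large stop blocks= {large}"
            )
    return lines
```

With the genome saved locally (the file name `genome.fasta` is hypothetical), `dna = read_fasta("genome.fasta")` followed by `print(len(dna))` and `print("\n".join(six_frame_report(dna)))` would check the length and print the six frame lines.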
The Coding Sequence (CDS) is the actual region of DNA that is supposedly translated to form proteins, tRNA etc. Some are hypothetical in the sense that the protein was not observed at the time of the annotation. While the ORF may contain introns (in eukaryotes), the ORF and the CDS are the same in prokaryotes. Since this is a long file, write a program that extracts the gene information using regular expressions. For each gene, put the gene in a list (or some other data structure, e.g. a dict()) with the start location as its first value and the stop as its second. Print out the smallest gene, the largest gene by length, and the number of genes. We can use this list to check whether any of the ORFs we find in the FASTA file are among the annotated genes. I will discuss regular expressions on Monday.

Part C: The final stage of this program is to determine which of the large ORFs that you find in the FASTA file are actual genes in the gb file. Just go through either the FASTA or the GenBank data and see whether each gene or ORF is in the other. Print out the number that you find and the five largest genes. For each, just print its start and stop values and whether or not it is on the complementary strand. Also print out the number of large ORFs that are not found in the gb file.
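Parts B and C might be approached along the lines below. This is a sketch rather than a definitive solution, and it rests on assumptions: CDS locations are written in the usual GenBank feature-table style (`686..1828` or `complement(9156..9920)`), compound `join(...)` locations are simply skipped, and an ORF is considered to match a gene when both end at the same stop position on the same strand, since an ORF may contain several candidate starts. The function names, and the convention that ORF indices are 0-based positions on the strand being scanned, are hypothetical.

```python
import re

# Matches simple CDS feature lines; join(...) locations are not handled.
CDS_RE = re.compile(
    r"^ +CDS +(complement\()?<?(\d+)\.\.>?(\d+)\)?[ \t]*$", re.MULTILINE
)


def parse_cds(genbank_text):
    """Return a list of (start, stop, on_complement) tuples, 1-based inclusive."""
    return [
        (int(m.group(2)), int(m.group(3)), m.group(1) is not None)
        for m in CDS_RE.finditer(genbank_text)
    ]


def gene_length(gene):
    start, stop, _ = gene
    return stop - start + 1


def summarize(genes):
    """Return (smallest gene, largest gene, number of genes)."""
    return min(genes, key=gene_length), max(genes, key=gene_length), len(genes)


def orf_to_genbank(start_idx, stop_idx, genome_len, on_complement):
    """Convert 0-based start/stop codon indices on the scanned strand
    to GenBank-style 1-based coordinates on the forward strand."""
    if on_complement:
        return genome_len - stop_idx - 2, genome_len - start_idx, True
    return start_idx + 1, stop_idx + 3, False


def stop_key(gene):
    """Position of the last base of the stop codon, plus the strand."""
    low, high, on_complement = gene
    return (low if on_complement else high, on_complement)


def compare(orfs, genes):
    """Split ORFs into those that end where an annotated gene ends and the rest."""
    gene_keys = {stop_key(g) for g in genes}
    found = [o for o in orfs if stop_key(o) in gene_keys]
    missing = [o for o in orfs if stop_key(o) not in gene_keys]
    return found, missing


def largest(genes, n=5):
    """Return the n longest entries, longest first."""
    return sorted(genes, key=gene_length, reverse=True)[:n]
```

Matching on the stop position alone is a design choice, not something the brief prescribes; requiring the start to agree as well would be stricter and would need a rule for choosing among multiple starts in one block.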
Project ID: 15315350

About the project

13 proposals
Remote project
Active 6 yrs ago

13 freelancers are bidding on average $31 USD for this job
Please ignore the bid amount; we will discuss the price later in the chat after we have discussed the project thoroughly. Relevant Skills and Experience: . Proposed Milestones: 30 - . Hi rexxjohnson! Please inbox me to discuss the project.
$30 USD in 2 days
5.0 (5 reviews)
4.6
I have very good experience in Python, as well as skills in data structures and algorithms. Relevant Skills and Experience: Python, OOP, data structures. Proposed Milestones: $25 USD - whole project
$25 USD in 2 days
5.0 (3 reviews)
4.3
A proposal has not yet been provided
$15 USD in 0 day
4.2 (2 reviews)
3.1
$25 for 1 day Relevant Skills and Experience Python Proposed Milestones $25 USD - complete work
$25 USD in 1 day
4.9 (3 reviews)
2.3
I am an undergraduate with years of experience working with Python. Also I am good with finding algorithmic solutions for any problem and I am certain that I can do this task. Relevant Skills and Experience Python Proposed Milestones $5 USD - Part A $5 USD - Part B $10 USD - Part C and completed project
$20 USD in 2 days
4.0 (1 review)
2.3
Hi, This looks like an interesting project and I think I can help. My strengths lie in analysing using scripts like the one you propose. I'd like to get some more information from you if you would like to discuss in chat? James
$35 USD in 2 days
3.0 (1 review)
0.6
I have good experience with the Python language. I am an engineer and I would like to help you. Relevant Skills and Experience: My final-year project used the Python language and was implemented on a Raspberry Pi 3.
$18 USD in 1 day
0.0 (0 reviews)
0.0
5 years experience in bioinformatics Relevant Skills and Experience strong genomic and genetic background Proposed Milestones $42 USD - result and scripts
$42 USD in 3 days
0.0 (0 reviews)
0.0
A proposal has not yet been provided
$15 USD in 3 days
0.0 (0 reviews)
0.0
Hello! I can help You with your problem, and You can not pay me for my work. Relevant Skills and Experience Look my account. Proposed Milestones $10 USD - ...
$10 USD in 2 days
0.0 (0 reviews)
0.0

About the client

United States
0.0
0
Payment method verified
Member since Oct 3, 2017

Freelancer ® is a registered Trademark of Freelancer Technology Pty Limited (ACN 142 189 759)
Copyright © 2024 Freelancer Technology Pty Limited (ACN 142 189 759)