Bioinformatics courseware, original English slides (43)

Computing with Whole Genomes
Stuart M. Brown
Research Computing, NYU School of Medicine

The Human Genome Project

Genome Sequencing
- The ability to sequence entire genomes has created a huge demand for bioinformatics
  - Simple data management for the sequencing projects
  - Genome assembly
  - Annotation
  - Public access to the data
  - New types of whole genome analyses
- Genome sequencing factories churn out raw sequence data at an ever-increasing rate
- Fewer scientists are involved in generating data and more are involved in data analysis

Sequence Pipeline
- Laboratory Information Management: track samples, store raw data
- Assemble fragments
  - Track orientation and distance for paired reads from libraries of known-sized clones
- Find genes
  - Gene prediction algorithms
  - Map known genes and cDNAs
- Annotation and public access to data

Raw Genome Data:

Finding genes in genome sequence is not easy
- About 1% of human DNA encodes functional genes.
- Genes are interspersed among long stretches of non-coding DNA.
- Repeats, pseudo-genes, and introns confound matters

The next step is obviously to locate all of the genes and describe their functions. This will probably take another 15-20 years!

UCSC

Gene Prediction Works Poorly
- Algorithms are not accurate
  - non-consensus splice sites
  - where is the true first 5' exon?
- cDNA data is incomplete and confusing
  - truncated cDNA sequences
  - real alternative splicing
- Pseudo-genes and true gene duplication vs. mistakes in the genome assembly

Ensembl at EBI/EMBL

Integrate With Other Genetic Datasets
- Cytogenetic and molecular
  markers (STS, microsatellites, radiation hybrids)
- Known mutations
  - OMIM for humans
  - Huge collection of mouse genetic data
  - Nearly complete collection of yeast mutants
- SNPs
- Gene Expression

II. Genomics

What is Genomics?
- An operational definition: The application of high-throughput automated technologies to biology.
- A philosophical definition: A holistic or systems approach to the study of information flow within a cell.
- A technology created by the availability of the genome sequence

Genomics Technologies
- DNA microarrays
  - gene expression (measure RNA levels)
- SNP Genotyping
- Pharmacogenomics
- Proteomics

Genome Diversity
- Alleles are created by mutations in the DNA sequence of one person, which are passed on to their descendants

Human Genetic Variation
- Every human has essentially the same set of genes
- But there are different forms of each gene, known as alleles
  - blue vs. brown eyes
  - genetic diseases such as cystic fibrosis or Huntington's disease are caused by dysfunctional alleles

Some Diseases Involve Many Genes
- There are a number of classic "genetic diseases" caused by mutations of a single gene
  - Huntington's, Cystic Fibrosis, Tay-Sachs, PKU, etc.
- There are also many diseases that are the result of the interactions of
  many genes:
  - asthma, heart disease, cancer
- Each of these genes may be considered to be a risk factor for the disease.
- Groups of SNP markers may be associated with a disease without determining mechanism

Multiple Causes
- Some complex (i.e. multi-gene) diseases may actually be caused by any of a group of different genes (multiple causes), but all show the same symptoms.
  - Different diseases with similar symptoms?
- SNP linkage analysis can identify these sub-populations more efficiently than classical molecular genetic approaches.

Clinical Manifestations of Genetic Variation
(All disease has a genetic component)
- Susceptibility vs. resistance
- Variations in disease severity or symptoms
- Reaction to drugs (pharmacogenetics)
- All of these traits can be traced back to particular genes (or sets of genes)

So What's a SNP?
- A mutation that causes a single base change is known as a Single Nucleotide Polymorphism (SNP)
- SNPs are very common in the human population.
  - there are SNPs located near all genes
  - they can be used as markers
- Most of these have no visible effect
  - in regions between genes

SNP Genotyping
- It is possible to measure many thousands of SNPs simultaneously in a small blood sample from a patient
- Can compare "genotypes" for SNP markers linked to virtually any trait
- A human genome can be characterized with a few thousand common SNP markers on a single chip
  - a personal genetic profile

SNPs are Very Common
- SNPs are very common in the human population.
- Between any two people, there is an average of one SNP every 1250 bases.
- Most of these have no phenotypic effect
- Venter et al. estimate that only

gnl|dbSNP|rs1042574_allelePos=51 total len = 101 |taxid = 9606|snpClass = 1
Length = 101
Score = 149 bits (75), Expect = 3e-33
Identities = 79/81 (97%)
Strand = Plus / Plus

Query: 1489 ccctcttccctgacctcccaactctaaagccaagcactttatatttttctcttagatatt 1548
            ||||||||||||||||||||||||||||||||||||||||||||||| || |||||||||
Sbjct: 1    ccctcttccctgacctcccaactctaaagccaagcactttatattttcctyttagatatt 60

Query: 1549 cactaaggacttaaaataaaa 1569
            |||||||||||||||||||||
Sbjct: 61   cactaaggacttaaaataaaa 81

If a matching SNP is found, then it can be directly located on the Genome map

SNP markers
- SNPs can be found that are linked to
  any disease alleles.
- These mutations are likely to be neutral: they have no direct effect on phenotype
- Linked SNPs can be used as markers for the disease in diagnostic tests.
- Closely linked markers rarely separate; a pair of flanking markers almost never do.

DNA Diagnostic Testing
- hereditary diseases: potential parents, pre-natal, late-onset diseases
- genes that predispose to disease (risk factors)
- genotyping of infectious agents (bacterial & viral)
- measure the type and stage of cancer tumors
- forensics: using DNA testing to establish identity

Direct Medical Applications
- Diagnosis
  - Type of cancer
  - Aggressive or benign?
- Monitor treatment outcome
  - Is a treatment having the desired effect on the target tissue?

Pharmacogenomics
- The use of DNA sequence information to measure and predict the reaction of individuals to drugs.
  - Personalized drugs
  - Faster clinical trials
  - Fewer drug side effects

People React Differently to Drugs
- Side effects
- Effectiveness
- There are genes that control these reactions
- SNP markers can be used to identify these genes

Some Gene Products Interact with Drugs
- There are proteins that chemically activate or inactivate drugs.
- Other proteins can directly enhance or block a drug's activity.
- There are also genes that control side effects

Some Examples
- 10% of African Americans have polymorphic alleles of glucose-6-phosphate dehydrogenase that lead to haemolytic anemia when they are given the anti-malarial drug primaquine.

Succinylcholine Toxicity
- 0.04% of individuals are homozygous for alleles of pseudocholinesterase that are unable to inactivate the muscle relaxant drug succinylcholine, leading to respiratory paralysis.

Isoniazid Metabolism
- There are many polymorphic alleles of the N-acetyltransferase (NAT2) gene with reduced (or accelerated) ability to inactivate the drug isoniazid. Some individuals developed peripheral neuropathy in reaction to this drug.
- Some alleles of the NAT2 gene are also associated with susceptibility to various forms of cancer

Cytochrome P450
- 10% of the Caucasian population is homozygous for alleles of the cytochrome P450 gene CYP2D6 that do not metabolize the hypertension drug debrisoquine, which can lead to dangerous vascular hypotension.

ACE
- Patients homozygous for an allele with a deletion in intron 16 of the gene for angiotensin-converting enzyme (ACE) showed no benefit from the hypertension drug enalapril, while other patients did benefit.

These drug response phenotypes are associated with a set of specific gene alleles.

Collect Drug Response Data
- Identify populations of people who show specific responses to a drug.
- In early clinical trials, it is possible to identify people who react well and people who react poorly.

Make Genetic Profiles
- Scan these populations with a large number of SNP markers.
- Find markers linked to drug response phenotypes.
- It is interesting, but not necessary, to identify the exact genes involved.

Use the Profiles
- Genetic profiles of new patients can then be used to prescribe drugs more effectively & avoid adverse reactions.
- Can also speed clinical trials by testing on those who are likely to respond well.

Real World Applications
- Most of the major pharmaceutical companies are currently collecting pharmacogenomic data in their clinical trials.
- Data is yet to be published.
- Genetic indications for drug use are still a few years away.

Gene Expression Profiling
- Sequence bulk cDNAs from different tissues
  - NCBI CGAP website allows digital differential display
- SAGE (sequence short tags from cDNAs)
- Microarrays

Digital Differential Display

cDNA spotted microarrays

Link Gene Expression to Genome Sequence
- Identify promoter and 5' sequence for a group of co-expressed genes.
- Scan for known transcription factor binding sites.
- Predict new regulatory sites based on common sequence elements.

Whole Genome Comparisons

Comparative Genomics
- Use mouse homologs to find human genes
  - cDNAs
  - Chromosome scanning for conserved regions
- Synteny
- Use knockouts to define function
- Deep homology
- Metabolic reconstruction

Conserved Regulatory Regions
- In a syntenic stretch of DNA, the protein coding regions will be conserved, but not the intergenic regions
  - EXCEPT for important regulatory motifs
- VISTA website: ://vista/

Metabolic Reconstruction
- If we know the genome sequence, and
  we know the metabolic pathways
- Then we should be able to map genes to the pathways in every organism
- WIT2 (What is There) is an attempt to do this ://WIT2/
- How can organisms lack genes that are essential in related groups?

EMP Database
- Enzymes and Metabolic Pathways database (EMP) :/emp.m/
- 2-Oxobutanoate-Isoleucine, 2-Oxoglutarate_Anabolism (NADPH,_NADH)

Clusters of Orthologous Groups (COGs)
- COGs were delineated by comparing protein sequences encoded in 43 complete genomes, representing 30 major phylogenetic lineages.
- Each COG consists of individual proteins or groups of paralogs from at least 3 lineages and thus corresponds to an ancient conserved domain.

A simple COG with two yeast paralogs. YPL040c is the yeast mitochondrial isoleucyl-tRNA synthetase; the bacterial orthologs and that from M. jannaschii are the BeTs for this yeast protein, but the reverse is true only of the bacterial proteins. For YBL076c (yeast cytoplasmic isoleucyl-tRNA synthetase), the M. jannaschii ortholog is a symmetrical BeT, whereas the bacterial genes are asymmetrical.

Proteomics
- Identify all of the proteins in an organism
  - Potentially many more than genes due to alternative splicing and post-translational modifications
- Quantitate in different cell types and in response to metabolic/environmental factors
- Protein-protein interactions

Protein-Protein Interactions
- Metabolic and regulatory pathways
  - Transcription factors
  - Co-expression
- Biochemical data
  - crosslinking
  - yeast 2-hybrid
  - affinity tagging
- Useful feedback to genome annotation/protein function and gene expression
- BIND - The Biomolecular Interaction Network Database

Impact on Bioinformatics
- Genomics produces high-throughput, high-quality data, and bioi
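
Worked check of the dbSNP BLAST hit: the report for rs1042574 quoted in the SNP slides states 79/81 identities (97%). The sketch below recomputes that figure from the two aligned strings as they appear in the report and lists the mismatching columns. It assumes an ungapped alignment and strict character equality, so the IUPAC ambiguity code `y` in the subject counts as a mismatch. The function name `count_identities` is illustrative and not part of any BLAST tool.

```python
def count_identities(query: str, subject: str) -> tuple[int, list[int]]:
    """Return (identical columns, 1-based mismatch columns) for an ungapped alignment."""
    if len(query) != len(subject):
        raise ValueError("aligned strings must have equal length")
    mismatches = [
        col + 1
        for col, (q, s) in enumerate(zip(query, subject))
        if q != s  # strict comparison: 'y' (C or T) is treated as a mismatch
    ]
    return len(query) - len(mismatches), mismatches


# Aligned rows copied from the BLAST report (query 1489-1569, subject 1-81).
QUERY = (
    "ccctcttccctgacctcccaactctaaagccaagcactttatatttttctcttagatatt"
    "cactaaggacttaaaataaaa"
)
SUBJECT = (
    "ccctcttccctgacctcccaactctaaagccaagcactttatattttcctyttagatatt"
    "cactaaggacttaaaataaaa"
)

identical, mismatch_columns = count_identities(QUERY, SUBJECT)
percent_identity = 100 * identical / len(QUERY)

print(f"{identical}/{len(QUERY)} identical ({percent_identity:.1f}%)")
print("mismatch columns:", mismatch_columns)
```

Because the subject starts at position 1, alignment column 51 is also subject position 51, which is where the `y` sits and which agrees with `allelePos=51` in the hit header. The corresponding query coordinate is 1489 + 50 = 1539.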
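
Back-of-envelope arithmetic for two figures in the slides: one SNP every 1250 bases between any two people, and about 1% of human DNA encoding functional genes. The sketch below turns those rates into absolute numbers. The genome length of 3.2 billion bases is an assumption brought in for illustration; it is not stated in the slides, and the function names are hypothetical.

```python
def expected_snp_differences(genome_length_bp: int, bases_per_snp: int) -> int:
    """Expected number of SNP differences between two genomes at a uniform rate."""
    return genome_length_bp // bases_per_snp


def coding_bases(genome_length_bp: int, coding_percent: int) -> int:
    """Number of bases that encode functional genes, using integer arithmetic."""
    return genome_length_bp * coding_percent // 100


ASSUMED_GENOME_BP = 3_200_000_000  # assumption: approximate haploid human genome size
BASES_PER_SNP = 1250               # from the slides: one SNP every 1250 bases
CODING_PERCENT = 1                 # from the slides: about 1% encodes functional genes

snp_differences = expected_snp_differences(ASSUMED_GENOME_BP, BASES_PER_SNP)
coding_bp = coding_bases(ASSUMED_GENOME_BP, CODING_PERCENT)

print(f"expected SNP differences between two people: {snp_differences:,}")
print(f"bases encoding functional genes: {coding_bp:,}")
```

Under that assumed genome size, two people differ at roughly 2.5 million positions, which is why a chip carrying only a few thousand common markers samples the variation rather than enumerating it.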
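
The step "Scan for known transcription factor binding sites" under Link Gene Expression to Genome Sequence can be sketched as a consensus search across the promoters of a co-expressed gene group. This is a minimal illustration only: the promoter sequences are made-up toy data, the consensus `TATAWAW` is used as a stand-in TATA-box-like motif, and real scanners typically score position weight matrices rather than matching a consensus string. All names here are hypothetical.

```python
import re

# IUPAC nucleotide codes mapped to regular-expression character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def consensus_to_regex(consensus: str) -> re.Pattern:
    """Compile an IUPAC consensus into a regex that also finds overlapping sites."""
    body = "".join(IUPAC[base] for base in consensus.upper())
    # The lookahead makes each match zero-width, so overlapping sites are reported.
    return re.compile(f"(?=({body}))")


def scan_promoters(promoters: dict[str, str], consensus: str) -> dict[str, list[int]]:
    """Return 0-based start offsets of the consensus in each promoter sequence."""
    pattern = consensus_to_regex(consensus)
    return {
        name: [match.start() for match in pattern.finditer(sequence.upper())]
        for name, sequence in promoters.items()
    }


# Toy promoters for a hypothetical group of co-expressed genes (not real data).
TOY_PROMOTERS = {
    "geneA": "GGCTATAAAAGGCC",
    "geneB": "ccgtatatatgg",
    "geneC": "GGGCGCGCGGGC",
}

hits = scan_promoters(TOY_PROMOTERS, "TATAWAW")
for gene, offsets in hits.items():
    print(gene, offsets)
```

A motif found in most promoters of the group and rarely elsewhere is a candidate shared regulatory site, which is the idea behind the following bullet on predicting new regulatory sites from common sequence elements.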
