RNA-seq Data Processing and Analysis_Premium Document.pdf
- Document ID: 3210221
- Upload date: 2022-11-20
- Format: PDF
- Pages: 25
- Size: 1.27MB
RNA-seq Data Handling and Analysis
Kevin Childs
Statistical genetics/genomics journal club
China Sequencing Forum (中国测序论坛)

Overview
- Fasta/Fastq file formats
- NCBI SRA
- Data preparation
- Bowtie/Tophat/Cufflinks
- Velvet/Oases
- Trinity

Fasta File Format
#FASTA
gi|1800214|gb|U56729.1|SBU56729 Sorghum bicolor phytochrome
ACGCATCCTTCCGCGCCGGGCATGGGCACCGCGTCGGCGCGCGCCCCTACCCAGTCGTCGACTTGATGCTGCTCACTCGCACTCGTCGCAGCGCCCCACGCCCCGCTATTTATGCGTACTTGCTTGCCGGGAGAGTCGCTGGAGGTGGGCGTCCTCCTCCCGCTCCAGAGCTCGCTGCTTCGCTCCACCCACCCTTAAGCAGGAGTGATATCTGGTGGTTTTTCAAAAGAAGACAAAAATGTCTTCCTCGAGGCCTGCCCACTCTTCCAGTTCATCCAGTAGGACTCGCCAGAGCTCCCAGGCAAGGATATTAGCACAAACAACCCTTGATGCTGAACTCAATGCAGAGTATGAAGAATCTGGTGATTCCTTTGATTACTCCAAGTTGGTTGAAGCACAGCGGAGCACTCCATCTGAGCAGCAAGGGCGATCAGGAAAGGTCATAGCCTACTTGCAGCATATTCAAAGAGGAAAGCTAATCCAACCATTTGGTTGCTTGTTGGCCCTTGACGAGAAGAGCTTCAGGGTCATTGCATTCAGTGAGAATGCACCTGAAATGCTCACAACGGTCAGCCATGCTGTGCCAAACGTTGATGATCCCCCAAAGCTAGGAATTGGTACCAATGTGCGCTCCCTTTTCACTGACCCTGGTGCTACAGCACTGCAGAAGGCACTAGGATTTGCTGATGTTTCTTTGCTGAATCCTATCCTAGTTCAATGCAAGACCTCAGGCAAGCCATTCTATGCCATTGTTCATAGGGCAACTGGTTGTCTGGTGGTTGATTTTGAGCCTGTGAAGCCTACAGAATTTCCTGCCACTGCTGCTGGGGCTTTGCAGTCT

Fastq File Format
- Read Name
- Sequence
- Quality
Quality scores are in ASCII characters representing coded Phred scores.
ASCII codes start at ASCII 33 or ASCII 64.
All SRA codes converted to ASCII 33.
These scores provide a likelihood that the base was called incorrectly.
- 10: 1 in 10 chance the base call is incorrect
- 20: 1 in 100 chance the base call is incorrect
- 30: 1 in 1000 chance the base call is incorrect

High Throughput Sequencing Platforms
- Illumina HiSeq 1000 and HiSeq 2000
- Illumina Genome Analyzer IIx*
- Life Sciences/Roche 454 pyrosequencing
- ABI Solid Sequencing System*
- Pacific Biosciences*
- Ion Torrent
- Cambridge Nanopore (late 2012?)

High Throughput Sequencing: HiSeq 2000
- Highly parallel sequencing by synthesis
- Single and paired-end reads between 50 bp and 100 bp
- 187 million single-end or 374 million paired-end reads per lane
- High error rate in the 3' end

NCBI SRA
- SRA toolkit
- fastq-dump
/opt/sratoolkit/fastq-dump SRR373821.lite.sra
/opt/sratoolkit/fastq-dump --split-files SRR329070.lite.sra

Read Quality with the FASTX-Toolkit
http://hannonlab.cshl.edu/fastx_toolkit/
[Figure: quality plots, panels "Bad Sequence" and "Good Sequence"]

FASTX Toolkit
fastx_quality_stats -Q33 -i initial_fastq_file.fastq -o stats.txt
fastx_quality_boxplot_graph.sh -Q33 -i stats.txt -t Title -o quality.png
fastx_clipper -Q33 -v -a AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCT -i initial_fastq_file.fastq -o fastq_file_clipped.fastq
fastx_artifacts_filter -Q33 -v -i fastq_file_clipped.fastq -o fastq_file_artifact_filtered.fastq
fastq_quality_trimmer -Q33 -v -t 20 -l 30 -i fastq_file_artifact_filtered.fastq -o fastq_file_cleaned.fastq
-Q is an undocumented parameter to indicate that quality values use ASCII 33 encoding.

FastQC
http://www.bioinformatics.bbsrc.ac.uk/projects/fastqc/
A quality control tool for high throughput sequence data.

Samtools
- Package of programs for manipulating sam and bam files
- sam: sequence alignment map
- bam: binary alignment map, compressed form of sam file
- http://

Tuxedo Suite
- Bowtie: fast and quality-aware short read aligner for aligning DNA and RNA sequence reads
- TopHat: fast splice junction mapper for RNA-Seq reads, built on the Bowtie aligner
- Cufflinks: assembles transcripts, estimates their abundances, and tests for differential expression and regulation using the alignments from Bowtie and TopHat

Bowtie
- Aligns short reads to large genomes
- Forms the basis for TopHat, Cufflinks, Crossbow, and Myrna
- Unless you are working with short reads derived from genomic DNA, you will not use Bowtie directly
- The exception is using bowtie-build to create a genomic sequence index file

TopHat
- Built on Bowtie and uses the same genome index
- Used for alignment of RNA-Seq reads to a genome
- Optimized for paired-end, Illumina sequence reads 70 bp

Cufflinks
- Quantification of gene expression using RNA-seq reads
- Tests for differential expression
- Uses output from bowtie/tophat
- Assembles read alignments into transcripts
- Uses cufflinks-predicted transcripts or user-supplied gene models for quantification
- Estimates transcript abundance balanced across transcript isoforms

Bowtie/Tophat/Cufflinks
bowtie-build pseudomolecule.fa pseudomolecule.index
tophat -p 6 --solexa1.3-quals -i 5 -I 1000 -r 100 --no-novel-juncs --GTF pseudomolecule.gtf -o /output/directory pseudomolecule.index purified_reads.fastq
samtools sort tophat_output_pairs.bam tophat_output_pairs_sorted
samtools view -o tophat_output_pairs_sorted.sam tophat_output_pairs_sorted.bam
cufflinks -q -o /output/directory/ -p 4 -G pseudomolecule_corrected.gtf tophat_output_pairs_sorted.sam

Velvet/Oases
- Genome/transcriptome assembly package
- Velveth/velvetg work well for genomes but produce fragmented transcriptome assemblies.
- Its modules explicitly assume linearity and uniform covera
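
Worked example: decoding Fastq quality characters
The Fastq slide says quality characters are Phred scores stored with an ASCII offset of 33 or 64, and that a score of 10, 20, or 30 means a 1 in 10, 1 in 100, or 1 in 1000 chance of a wrong base call. The sketch below shows that arithmetic, assuming bash and awk are available. The function names phred_score and error_probability are hypothetical helpers written for this illustration and are not part of the SRA toolkit, the FASTX-Toolkit, or any other package named in the slides.

```shell
#!/usr/bin/env bash

# Hypothetical helper: decode one Fastq quality character into a Phred score.
# The second argument is the ASCII offset: 33 (default) or 64.
phred_score() {
    local char="$1"
    local offset="${2:-33}"
    local code
    # A leading single quote makes printf print the character's numeric code.
    code=$(printf '%d' "'$char")
    echo $(( code - offset ))
}

# Hypothetical helper: convert a Phred score Q into the error
# probability 10^(-Q/10).
error_probability() {
    LC_ALL=C awk -v q="$1" 'BEGIN { printf "%g\n", 10 ^ (-q / 10) }'
}

phred_score I        # prints 40 (ASCII 73 minus offset 33)
phred_score h 64     # prints 40 (ASCII 104 minus offset 64)
error_probability 20 # prints 0.01, i.e. a 1 in 100 chance of a wrong call
```

The same character means different scores under the two offsets, which is why the FASTX-Toolkit commands in the slides pass -Q33 explicitly.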
- Keywords:
- RNA-seq, data, processing, analysis, premium document