当前流行的商用芯片.docx
- 文档编号:2435272
- 上传时间:2022-10-29
- 格式:DOCX
- 页数:10
- 大小:521.53KB
当前流行的商用芯片.docx
《当前流行的商用芯片.docx》由会员分享,可在线阅读,更多相关《当前流行的商用芯片.docx(10页珍藏版)》请在冰豆网上搜索。
当前流行的商用芯片
1AMDAthlon
LiketheIntelPentiumchip,ofwhichtheAMDAthlonisaclonewithrespecttoIntel'sx86InstructionSetArchitecture,itisfrequentlyusedinclusters.Thereforewediscussthisprocessorherealthoughitisnotusedpresentlyinintegratedparallelsystems.\\TheAthlonprocessorhasmanyfeaturesthatarealsopresentinmodernRISCprocessors:
itsupportsout-of-orderexecution,hasmultiplefloating-pointunits,andcanissueupto9instructionssimultaneously.AblockdiagramoftheprocessorisshowninFigure7
Figure7:
BlockdiagramofAMDAthlonprocessor.
ItshowsthattheprocessorhasthreepairsofIntegerExecutionUnitsandAddressGenerationUnitsthatviaan18-entryIntegerSchedulertakescareoftheintegercomputationsandaddresscalculations.BoththeIntegerSchedulerandtheFloating-PointSchedulerarefedbythe72-entryInstructionControlUnitthatreceivesthedecodedinstructionsfromtheinstructiondecoders.AninterestingfeatureoftheAthlonisthepre-decodingofx86instructionsinfixed-lengthmacro-operationsthatcanbestoredinaPre-decodeCache.Thisenablesafasterandmoreconstantinstructionflowtotheinstructiondecoders.LikeinRISCprocessors,thereisaBranchPredictionTableassistinginbranchprediction.
Thefloating-pointunitsallowout-of-orderexecutionofinstructionsviatheFPUStackMap&Renameunit.Itreceivesthefloating-pointinstructionsfromtheInstructionControlUnitandreordersthemifnecessarybeforehandingthemovertotheFPUScheduler.TheFloating-PointRegisterFileis88elementsdeepwhichapproachesthenumberofregistersasisavailableonRISCprocessors.
Thefloating-pointpartoftheprocessorcontainsthreeunits:
aFloatingStoreunitthatstoresresultstotheLoad/StoreQueueUnitandFloatingAddandMultiplyunitsthatcanworkinsuperscalarmode,resultingintwofloating-pointresultsperclockcycle.BecauseofthecompatibilitywithIntel'sPentiumIIIprocessors,thefloating-pointunitsalsoareabletoexecuteIntelMMXinstructionsandAMD'sown3DNow!
instructions.However,thereisthegeneralproblemthatsuchinstructionsarenotaccessiblefromhigherlevellanguages,likeFortran90orC(++).Bothinstructionsetsaremeantformassiveprocessingofvisualisationdataandonlyallowfor32-bitprecisiontobeused.ThesystembuscomplementingtheAthlonprocessorisalsofasterthanwhatisstandardlyavailableforIntelPIIIprocessors:
200MHzinsteadof133MHz.AMDclaimsthebusspeedcanbescaledtoover400MHz.
Withthecurrentclockfrequencyof1-1.33GHzofthecurrentprocessorstheAthlonisaninterestingalternativeformanyoftheRISCprocessorsthatareavailableatthismoment.
2IntelPentium4
AlthoughPentiumprocessorsarenotappliedinintegratedparallelsystemsthesedays,theyplayamajorroleintheclustercommunityasmostcomputenodesinBeowulfclustersareofthistype.Thereforewebrieflydiscussalsothistypeofprocessor.
Unfortunately,Intelonlyprovidesscantinformationonthisnewprocessor,notevenenoughtoputtogetherareliableblockdiagramoftheprocessor.Still,thereanumberofdistinctivefeatureswithrespecttotheearlierPentiumgenerations.Therearetwomainwaystoincreasetheperformanceofaprocessor:
byraisingtheclockfrequencyandbyincreasingthenumberofinstructionspercycle(IPC).Thesetwoapproachesaregenerallyinconflict:
whenonewantstoincreasetheIPCthechipwillbecomemorecomplicated.Thiswillhaveanegativeimpactontheclockfrequencybecausemoreworkhastobedoneandorganisedwithinthesameclockcycle.VeryseldomlychipdesignerssucceedinraisingbothclockfrequencyandIPCsimultaneously.AlsointhePentium4thiscouldnotbedone.Intelhaschosenforahighclockspeed(initiallyabout40%morethanthatofthePentiumIIIwiththesamefabricationtechnology)whiletheIPCdecreasedby10--20%.Thisstillgivesanetperformancegainevenifotherchangeswouldhavebeenmadetotheprocessor.Tosustaintheveryhighclockratethatthepresentprocessorshave,currently1.7GHz,averydeepinstructionpipelineisrequired.Theinstructionpipelinehasnolessthan20stages,doublethenumberofstagesinthatofthePentiumIII.Althoughthisfavoursahighclockrate,thepenaltyforapipelinemiss(e.g.,abranchmis-predict)ismuchheavierandthereforeIntelhasimprovedthebranchpredictionbyaincreasingthesizeoftheBranchTargetBufferfrom0.5to4KB.Inaddition,thePentium4hasanexecutiontracecachewhichholdspartlydecodedinstructionsofformerexecutiontracesthatcanbedrawnupon,thusforegoingtheinstructiondecodephasethatmightproduceholesintheinstructionpipeline.
Theprimarycacheisquitesmallbytoday'sstandards:
8KB.Thisisagaintoaccommodatethehighclockspee
- 配套讲稿:
如PPT文件的首页显示word图标,表示该PPT已包含配套word讲稿。双击word图标可打开word文档。
- 特殊限制:
部分文档作品中含有的国旗、国徽等图片,仅作为作品整体效果示例展示,禁止商用。设计者仅对作品中独创性部分享有著作权。
- 关 键 词:
- 当前 流行 商用 芯片