Displaying and Searching PDF Content on iPhone
PDF parsing is a black art that most programmers avoid. “Madness lurks here,” they mumble quietly to themselves, choosing instead to push their PDFs through UIWebViews and commit other crimes against humanity.
It doesn’t have to be this way, however. Parsing, displaying, and searching PDFs natively and at a low level is actually surprisingly easy if you’re not afraid to get your hands a little dirty with the Core Graphics PDF functions. I’m going to show you how.
Where’s it hiding?
The first thing to know is that in order to do this, you need to use Core Graphics calls. So you need to include the Core Graphics framework in your project, and in any files where you want to use the calls, you have to include the CoreGraphics.h header. It’s probably also worthwhile to review the Core Foundation memory management rules.
Once you’ve done this, it’s very straightforward to read your PDF files and display them in a custom view. Let’s take a look at how we do that.
Initializing a PDF Document
To initialize a PDF document, you call CGPDFDocumentCreateWithURL, passing in the URL of the document you want to open. Since NSURL is toll-free bridged to CFURLRef, you can build that URL using plain old NSURL, like so:
NSString *pathToPdfDoc = [[NSBundle mainBundle] pathForResource:@"mypdf" ofType:@"pdf"];
NSURL *pdfUrl = [NSURL fileURLWithPath:pathToPdfDoc];
Then, to create the CGPDFDocumentRef, call CGPDFDocumentCreateWithURL:
CGPDFDocumentRef document = CGPDFDocumentCreateWithURL((CFURLRef)pdfUrl);
Displaying Pages
So now you have a document. To display the content of the document, you have to get the content in the form of pages. PDFs are already formatted by pages, so all you need to do is get at that data. Fortunately, Core Graphics has functions for that too.
To get the total count of the pages in the document, you use the call CGPDFDocumentGetNumberOfPages, which takes as its parameter the document you created above. So, for example:
size_t pageCount = CGPDFDocumentGetNumberOfPages(document);
Then, to get an individual page to display in your view, you use the function CGPDFDocumentGetPage, passing the document and the page number you want. Like so:
CGPDFPageRef page = CGPDFDocumentGetPage(document, currentPage);
Note that the currentPage parameter here is 1-based, not 0-based as is the usual case in programming. This means that the first page of the PDF document is, in fact, page 1, and not page 0.
Once you have the page, you can display it in your custom view. The only complicated part here is that on iPhone, the coordinate system is flipped compared to the Mac. This causes a problem because the Core Graphics PDF system uses the desktop coordinate system even on iPhone. (It’s yucky, I know.) The solution to this is to flip the page (this can be done in your drawRect: method when you go to draw the content):
CGPDFPageRef page = CGPDFDocumentGetPage(document, currentPage);
CGContextRef ctx = UIGraphicsGetCurrentContext();
CGContextSaveGState(ctx);
CGContextTranslateCTM(ctx, 0.0, [self bounds].size.height);
CGContextScaleCTM(ctx, 1.0, -1.0);
CGContextConcatCTM(ctx, CGPDFPageGetDrawingTransform(page, kCGPDFCropBox, [self bounds], 0, true));
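The key here is the call to CGContextScaleCTM. What we do is get the current drawing context and scale its coordinate system on its y axis by -1.0. This effectively flips it upside down about its horizontal (x) axis. The CGContextTranslateCTM call before it shifts the origin by the view’s height, so the flipped content lands back inside the view’s bounds.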
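To make the flip concrete, here is a minimal plain-C sketch that does not call Core Graphics at all. It assumes the a, b, c, d, tx, ty matrix layout documented for CGAffineTransform; the `Affine` and `Vec2` types and both function names are hypothetical, invented for this illustration. It models the combined effect of the translate and scale calls: a point at height y ends up at height minus y.

```c
#include <assert.h>

/* Hypothetical stand-ins for CGAffineTransform and CGPoint. */
typedef struct { double a, b, c, d, tx, ty; } Affine;
typedef struct { double x, y; } Vec2;

/* Apply the transform: x' = a*x + c*y + tx, y' = b*x + d*y + ty. */
Vec2 affine_apply(Affine t, Vec2 p) {
    Vec2 r = { t.a * p.x + t.c * p.y + t.tx,
               t.b * p.x + t.d * p.y + t.ty };
    return r;
}

/* The matrix produced by translating by (0, height) and then
   scaling by (1, -1): y is negated first, then shifted up by height. */
Affine flip_for_height(double height) {
    assert(height >= 0.0);
    Affine t = { 1.0, 0.0, 0.0, -1.0, 0.0, height };
    return t;
}
```

For a view 480 points tall, a point at y = 0 maps to y = 480 and a point at y = 480 maps to y = 0, while x is left unchanged.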
Finally, we draw the page into the context using the CGContextDrawPDFPage function:
CGContextDrawPDFPage(ctx, page);
CGContextRestoreGState(ctx);
So basically, a full-on drawRect: method for a custom view that draws content from a PDF page looks something like this:
- (void)drawRect:(CGRect)inRect
{
    if (document)
    {
        CGPDFPageRef page = CGPDFDocumentGetPage(document, currentPage);
        CGContextRef ctx = UIGraphicsGetCurrentContext();
        CGContextSaveGState(ctx);
        // Flip the coordinate system so the page is not drawn upside down.
        CGContextTranslateCTM(ctx, 0.0, [self bounds].size.height);
        CGContextScaleCTM(ctx, 1.0, -1.0);
        // Fit the page's crop box into the view's bounds.
        CGContextConcatCTM(ctx, CGPDFPageGetDrawingTransform(page, kCGPDFCropBox, [self bounds], 0, true));
        CGContextDrawPDFPage(ctx, page);
        CGContextRestoreGState(ctx);
    }
}
That's all there is to it!
Searching PDFs
One of the things that seems to be particularly scary to programmers is searching PDFs. I agree that it’s certainly not pleasant stuff to code, but it’s not hard either.
Now, I want to preface this by saying that I feel this code is a bit of a hack, but it definitely works, and seems to work quite well. Perhaps there’s a better way to do this, and if you know of one, please let me know. That said, however, here’s how I’ve done it.
The first thing to know is that the content of a PDF page is a stream of operands and operators. All text in a PDF document is stored as glyphs and shown by one of the text operators: “Tj” in the case of a single string, or “TJ” in the case of an array of strings. Knowing this, you can access the PDF data as a stream and create a scanner that calls callback functions you specify when these operators are encountered. The operands come before their operator in the stream, so inside a callback you pop the string or array off the scanner and use it to build your search corpus.
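The callback mechanism can be modeled in a few lines of plain C. This is only a toy sketch of the pattern, not the Core Graphics API: `OpTable`, `op_table_set`, `scan_pairs`, `Corpus`, and `collect_text` are all hypothetical names, and it assumes each operator has exactly one string operand, which real content streams do not guarantee.

```c
#include <assert.h>
#include <string.h>

#define MAX_OPS 8

/* A callback receives the operand that preceded its operator. */
typedef void (*OpCallback)(const char *operand, void *info);

typedef struct { const char *name; OpCallback callback; } OpEntry;
typedef struct { OpEntry entries[MAX_OPS]; size_t count; } OpTable;

/* Register a callback for one operator name, such as "Tj". */
void op_table_set(OpTable *table, const char *name, OpCallback cb) {
    assert(table != NULL && table->count < MAX_OPS);
    table->entries[table->count].name = name;
    table->entries[table->count].callback = cb;
    table->count++;
}

/* Walk {operand, operator} pairs and fire the callback registered
   for each operator; operators with no callback are skipped. */
void scan_pairs(const OpTable *table, const char *pairs[][2],
                size_t n, void *info) {
    for (size_t i = 0; i < n; i++) {
        for (size_t j = 0; j < table->count; j++) {
            if (strcmp(pairs[i][1], table->entries[j].name) == 0) {
                table->entries[j].callback(pairs[i][0], info);
            }
        }
    }
}

/* State handed to the callbacks through the info pointer. */
typedef struct { char text[256]; } Corpus;

/* Append the operand to the corpus, truncating if it would overflow. */
void collect_text(const char *operand, void *info) {
    Corpus *corpus = (Corpus *)info;
    size_t room = sizeof corpus->text - strlen(corpus->text) - 1;
    strncat(corpus->text, operand, room);
}
```

The `info` pointer plays the role that the user-info pointer plays in the real scanner: it is how a plain C function reaches the object that holds the accumulated text.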
That probably sounds intimidating, but it’s not. You start out by creating a class that will be your “page searcher.” This will hold the state for your search engine. Here’s the listing for the interface for this class:
#import <Foundation/Foundation.h>
#import <CoreGraphics/CoreGraphics.h>

@interface PDFSearcher : NSObject
{
    CGPDFOperatorTableRef table;
    NSMutableString *currentData;
}
@property (nonatomic, retain) NSMutableString *currentData;
- (id)init;
- (BOOL)page:(CGPDFPageRef)inPage containsString:(NSString *)inSearchString;
@end
Pretty straightforward stuff. We use the currentData member to store the text of the page being scanned. This is a member variable rather than a local variable because we're going to be using C functions to fill it in. Don't worry, that'll make sense in a moment.
The init method for the class actually creates the callback table:
- (id)init
{
    if ((self = [super init]))
    {
        // Register a C callback for each text-showing operator.
        table = CGPDFOperatorTableCreate();
        CGPDFOperatorTableSetCallback(table, "TJ", arrayCallback);
        CGPDFOperatorTableSetCallback(table, "Tj", stringCallback);
    }
    return self;
}
The arrayCallback and the stringCallback functions are C functions that will be called by the scanner. They’re shown here:
void arrayCallback(CGPDFScannerRef inScanner, void *userInfo)
{
    PDFSearcher *searcher = (PDFSearcher *)userInfo;
    CGPDFArrayRef array;
    bool success = CGPDFScannerPopArray(inScanner, &array);
    if (!success)
        return;
    // Entries that are not strings make CGPDFArrayGetString fail and are skipped.
    size_t count = CGPDFArrayGetCount(array);
    for (size_t n = 0; n < count; n++)
    {
        CGPDFStringRef string;
        success = CGPDFArrayGetString(array, n, &string);
        if (success)
        {
            NSString *data = (NSString *)CGPDFStringCopyTextString(string);
            [searcher.currentData appendFormat:@"%@", data];
            [data release];
        }
    }
}

void stringCallback(CGPDFScannerRef inScanner, void *userInfo)
{
    PDFSearcher *searcher = (PDFSearcher *)userInfo;
    CGPDFStringRef string;
    bool success = CGPDFScannerPopString(inScanner, &string);
    if (success)
    {
        NSString *data = (NSString *)CGPDFStringCopyTextString(string);
        [searcher.currentData appendFormat:@"%@", data];
        [data release];
    }
}
As you can see, these will be called when the operators fire. When they do, we pop the data off the scanner and add it to the searcher's corpus. The userInfo pointer is actually pointing to our searcher object (because we pass it as the third parameter to CGPDFScannerCreate in the next listing). So we can typecast it to a PDFSearcher and then access its currentData member (remember I said it would make sense later?).
The actual search method looks like this:
- (BOOL)page:(CGPDFPageRef)inPage containsString:(NSString *)inSearchString
{
    [self setCurrentData:[NSMutableString string]];
    CGPDFContentStreamRef contentStream = CGPDFContentStreamCreateWithPage(inPage);
    CGPDFScannerRef scanner = CGPDFScannerCreate(contentStream, table, self);
    bool ret = CGPDFScannerScan(scanner);
    CGPDFScannerRelease(scanner);
    CGPDFContentStreamRelease(contentStream);
    return ([[currentData uppercaseString] rangeOfString:[inSearchString uppercaseString]].location != NSNotFound);
}
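The final comparison uppercases both strings so that the match ignores case. The same idea can be sketched in plain C using only the standard library. The function name `contains_ignoring_case` is hypothetical, and unlike NSString this version only folds case for single-byte characters; it also assumes an empty search string should count as not found.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

/* Returns true if needle occurs anywhere in haystack, comparing
   characters after uppercasing both sides. */
bool contains_ignoring_case(const char *haystack, const char *needle) {
    assert(haystack != NULL && needle != NULL);
    if (*needle == '\0') {
        return false;  /* assumption: an empty search string never matches */
    }
    for (const char *start = haystack; *start != '\0'; start++) {
        const char *h = start;
        const char *n = needle;
        /* Advance while the uppercased characters agree. */
        while (*h != '\0' && *n != '\0' &&
               toupper((unsigned char)*h) == toupper((unsigned char)*n)) {
            h++;
            n++;
        }
        if (*n == '\0') {
            return true;  /* consumed the whole needle: match found */
        }
    }
    return false;
}
```

Comparing character by character avoids allocating uppercased copies of the page text, which is the main difference from the uppercaseString approach in the method above.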