Displaying and Searching PDF Content

So basically, a full-on @drawRect:@ method for a custom view that draws content from a PDF page looks something like this:

bc.. - (void)drawRect:(CGRect)inRect
{
    if (document)
    {
        // PDF page numbers are 1-based.
        CGPDFPageRef page = CGPDFDocumentGetPage(document, currentPage);

        CGContextRef ctx = UIGraphicsGetCurrentContext();
        CGContextSaveGState(ctx);

        // UIKit's origin is top-left and PDF's is bottom-left, so flip the y-axis.
        CGContextTranslateCTM(ctx, 0.0, [self bounds].size.height);
        CGContextScaleCTM(ctx, 1.0, -1.0);

        // Fit the page's crop box into the view, preserving its aspect ratio.
        CGContextConcatCTM(ctx,
            CGPDFPageGetDrawingTransform(page, kCGPDFCropBox,
                                         [self bounds], 0, true));

        CGContextDrawPDFPage(ctx, page);
        CGContextRestoreGState(ctx);
    }
}


p. That's all there is to it!
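If the translate-then-scale pair in that method looks like magic, the arithmetic is easy to check by hand. Here is a minimal sketch in plain C (which also compiles as Objective-C). The @Affine@ and @Vec2@ types and all of the function names are hypothetical stand-ins of mine, not the real @CGAffineTransform@ API; the sketch only assumes that each CTM call is applied to a point before the transform that was already in place.

```c
#include <assert.h>

/* Hypothetical stand-in for an affine transform:
   x' = a*x + c*y + tx,  y' = b*x + d*y + ty. */
typedef struct { double a, b, c, d, tx, ty; } Affine;
typedef struct { double x, y; } Vec2;

static const Affine kIdentity = { 1, 0, 0, 1, 0, 0 };

/* Returns the transform that applies `first`, then `second`. */
Affine affine_concat(Affine first, Affine second) {
    Affine r;
    r.a  = first.a  * second.a + first.b  * second.c;
    r.b  = first.a  * second.b + first.b  * second.d;
    r.c  = first.c  * second.a + first.d  * second.c;
    r.d  = first.c  * second.b + first.d  * second.d;
    r.tx = first.tx * second.a + first.ty * second.c + second.tx;
    r.ty = first.tx * second.b + first.ty * second.d + second.ty;
    return r;
}

/* Maps a point through the transform. */
Vec2 affine_apply(Affine m, Vec2 p) {
    Vec2 r;
    r.x = m.a * p.x + m.c * p.y + m.tx;
    r.y = m.b * p.x + m.d * p.y + m.ty;
    return r;
}

/* Mirrors the two CTM calls in drawRect: translate by the view height,
   then scale y by -1. Each new transform is applied before the old one. */
Affine flipped_ctm(double view_height) {
    Affine ctm = kIdentity;
    Affine translate = { 1, 0, 0, 1, 0, 0 };
    Affine scale = { 1, 0, 0, -1, 0, 0 };
    assert(view_height >= 0.0);
    translate.ty = view_height;
    ctm = affine_concat(translate, ctm);  /* like CGContextTranslateCTM */
    ctm = affine_concat(scale, ctm);      /* like CGContextScaleCTM */
    return ctm;
}
```

Feeding a point through @affine_apply(flipped_ctm(h), p)@ sends @(x, y)@ to @(x, h - y)@, so the bottom edge of the PDF page lands on the bottom edge of the view.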

Searching PDFs

One of the things that seems to be particularly scary to programmers is searching PDFs. I agree that it’s certainly not pleasant stuff to code, but it’s not hard either.

Now, I want to preface this by saying that I feel this code is a bit of a hack, but it definitely works, and seems to work quite well. Perhaps there’s a better way to do this, and if you know of one, please let me know. That said, however, here’s how I’ve done it.

The first thing to know is that the content of a PDF page is a stream of operands and operators, where each operator follows the data it acts on. So, for example, all text in a PDF document is stored as glyphs and shown by an operator of type either “Tj”, in the case of a single string, or “TJ”, in the case of an array of strings (interleaved with spacing adjustments). Knowing this, you can access the PDF data as a stream and create a scanner which will call callback functions you specify when these operators are encountered. You can then pop the operands that preceded the operator and use them to build your search corpus.
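To make the operands-then-operator layout concrete, here is a toy extractor in plain C. It is only a sketch under heavy assumptions: @extract_shown_text@ and @append_char@ are hypothetical names of mine, it understands only literal strings in parentheses, it ignores hex strings, font encodings, compression, and the quote operators, and it treats any alphabetic token as an operator. Real pages need a real scanner such as @CGPDFScanner@.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Appends one character if there is room (always leaving space for the NUL). */
static void append_char(char *buf, size_t cap, size_t *len, char c) {
    if (*len + 1 < cap) {
        buf[(*len)++] = c;
        buf[*len] = '\0';
    }
}

/* Toy extractor: copies the literal-string operands of Tj and TJ into `out`
   and returns the number of characters written. Not a real PDF parser. */
size_t extract_shown_text(const char *stream, char *out, size_t out_cap) {
    char pending[256] = "";   /* string operands waiting for their operator */
    size_t pending_len = 0, out_len = 0;
    const char *p = stream;

    assert(stream != NULL && out != NULL && out_cap > 0);
    out[0] = '\0';

    while (*p != '\0') {
        if (isspace((unsigned char)*p) || *p == '[' || *p == ']' || *p == ')') {
            p++;                                /* separators, array brackets, stray ')' */
        } else if (*p == '(') {                 /* literal string operand */
            int depth = 1;
            p++;
            while (*p != '\0' && depth > 0) {
                if (*p == '\\' && p[1] != '\0') {
                    p++;                        /* simplification: keep the escaped char as-is */
                } else if (*p == '(') {
                    depth++;
                } else if (*p == ')') {
                    depth--;
                    if (depth == 0) { p++; break; }
                }
                append_char(pending, sizeof pending, &pending_len, *p);
                p++;
            }
        } else {                                /* operator, number, or name */
            char token[32] = "";
            size_t token_len = 0;
            while (*p != '\0' && !isspace((unsigned char)*p) &&
                   strchr("()[]", *p) == NULL) {
                append_char(token, sizeof token, &token_len, *p);
                p++;
            }
            if (isalpha((unsigned char)token[0])) {   /* assumed to be an operator */
                if (strcmp(token, "Tj") == 0 || strcmp(token, "TJ") == 0) {
                    for (size_t i = 0; i < pending_len; i++)
                        append_char(out, out_cap, &out_len, pending[i]);
                }
                pending_len = 0;                /* operands are consumed either way */
                pending[0] = '\0';
            }
        }
    }
    return out_len;
}
```

Given the stream @BT /F1 12 Tf (Hello, ) Tj [(Wor) -20 (ld)] TJ ET@, the strings are collected as operands first and only copied to the output once the @Tj@ or @TJ@ that follows them is seen.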

That probably sounds intimidating, but it’s not. You start out by creating a class that will be your “page searcher.” This will hold the state for your search engine. Here’s the listing for the interface for this class:

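bc.. // UIKit brings in Foundation and CoreGraphics (the CGPDF* types).
#import <UIKit/UIKit.h>

@interface PDFSearcher : NSObject
{
    CGPDFOperatorTableRef table;     // maps PDF operators to C callbacks
    NSMutableString *currentData;    // text collected from the page being scanned
}

@property (nonatomic, retain) NSMutableString *currentData;

- (id)init;
- (BOOL)page:(CGPDFPageRef)inPage containsString:(NSString *)inSearchString;

@end
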

p. Pretty straightforward stuff. We use the currentData member to store the text of the page being scanned. This is a member variable rather than a local variable because we're going to be using C functions to fill it in. Don't worry, that'll make sense in a moment.

The @init@ method for the class actually creates the callback table:

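bc.. - (id)init
{
    if ((self = [super init]))
    {
        // Map the two text-showing operators to C callbacks.
        // arrayCallback and stringCallback must be declared above this method.
        table = CGPDFOperatorTableCreate();
        CGPDFOperatorTableSetCallback(table, "TJ", arrayCallback);
        CGPDFOperatorTableSetCallback(table, "Tj", stringCallback);
    }
    return self;
}
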

p. The arrayCallback and the stringCallback functions are C functions that will be called by the scanner. They’re shown here:

bc.. void arrayCallback(CGPDFScannerRef inScanner, void *userInfo)
{
    PDFSearcher *searcher = (PDFSearcher *)userInfo;

    // TJ takes one operand: an array of strings and spacing adjustments.
    CGPDFArrayRef array;
    if (!CGPDFScannerPopArray(inScanner, &array))
        return;

    size_t count = CGPDFArrayGetCount(array);
    for (size_t n = 0; n < count; n++)
    {
        // Fails, and is skipped, for the non-string elements.
        CGPDFStringRef string;
        if (CGPDFArrayGetString(array, n, &string))
        {
            NSString *data = (NSString *)CGPDFStringCopyTextString(string);
            if (data)
            {
                [searcher.currentData appendString:data];
                [data release];
            }
        }
    }
}

void stringCallback(CGPDFScannerRef inScanner, void *userInfo)
{
    PDFSearcher *searcher = (PDFSearcher *)userInfo;

    // Tj takes one operand: the string to show.
    CGPDFStringRef string;
    if (CGPDFScannerPopString(inScanner, &string))
    {
        NSString *data = (NSString *)CGPDFStringCopyTextString(string);
        if (data)
        {
            [searcher.currentData appendString:data];
            [data release];
        }
    }
}


p. As you can see, these will be called when the operators fire. When they do, we pop the data off the scanner, and add it to the searcher's corpus. The userInfo pointer is actually pointing to our searcher object (based on the fact that we will pass it as the third parameter to @CGPDFScannerCreate@ in the search method below). So we can typecast it to a PDFSearcher and then access that currentData member (remember I said it would make sense later?).
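The userInfo trick is a general C idiom: a callback with a fixed signature gets its state through an opaque pointer. A minimal sketch in plain C, where @TextCollector@, @collect_string@, and @scan_strings@ are hypothetical names of mine standing in for the searcher, the callbacks, and the scanner:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical state object, playing the role of PDFSearcher. */
typedef struct {
    char   text[128];
    size_t length;
} TextCollector;

/* The scanner knows nothing about TextCollector, so the state travels
   as an opaque pointer, like userInfo in the CGPDF callbacks. */
typedef void (*StringCallback)(const char *string, void *user_info);

/* Casts user_info back to the state object and appends to it. */
void collect_string(const char *string, void *user_info) {
    TextCollector *collector = (TextCollector *)user_info;
    size_t n = strlen(string);
    if (collector->length + n < sizeof collector->text) {
        memcpy(collector->text + collector->length, string, n + 1);
        collector->length += n;
    }
}

/* Stand-in for the scanner: hands each string to the callback together
   with whatever pointer the caller supplied. */
void scan_strings(const char *const *strings, size_t count,
                  StringCallback callback, void *user_info) {
    assert(callback != NULL);
    for (size_t i = 0; i < count; i++)
        callback(strings[i], user_info);
}
```

Calling @scan_strings(parts, 2, collect_string, &collector)@ with the parts @"Hello, "@ and @"World"@ leaves @"Hello, World"@ in @collector.text@, even though the scanning function never sees the struct's type.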

The actual search method looks like this:

bc.. - (BOOL)page:(CGPDFPageRef)inPage containsString:(NSString *)inSearchString
{
    // Start with an empty corpus for this page.
    [self setCurrentData:[NSMutableString string]];

    CGPDFContentStreamRef contentStream = CGPDFContentStreamCreateWithPage(inPage);

    // self is handed to the callbacks as their userInfo pointer.
    CGPDFScannerRef scanner = CGPDFScannerCreate(contentStream, table, self);

    // Scanning fires the Tj/TJ callbacks, which fill in currentData.
    CGPDFScannerScan(scanner);

    CGPDFScannerRelease(scanner);
    CGPDFContentStreamRelease(contentStream);

    // Case-insensitive match: compare upper-cased copies.
    return ([[currentData uppercaseString]
             rangeOfString:[inSearchString uppercaseString]].location != NSNotFound);
}


p. Basically, we create a stream from the page data, then use that and our callback table to create a scanner. We then scan the data. It's at this point our currentData member is being filled with the data from the PDF as strings. Finally, we just search that string for our search string.
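The final comparison is plain case folding: upper-case both sides, then look for a substring. A minimal sketch of the same idea in plain C, where @contains_ignoring_case@ is a hypothetical helper of mine. It handles ASCII only, whereas @uppercaseString@ is Unicode-aware, and it treats an empty search string as no match.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

/* Returns true if `needle` occurs in `haystack`, comparing upper-cased
   characters. ASCII only. */
bool contains_ignoring_case(const char *haystack, const char *needle) {
    assert(haystack != NULL && needle != NULL);
    if (*needle == '\0')
        return false;                 /* empty search string: report no match */
    for (const char *start = haystack; *start != '\0'; start++) {
        size_t i = 0;
        /* Stops at the first mismatch, including the end of the haystack. */
        while (needle[i] != '\0' &&
               toupper((unsigned char)start[i]) == toupper((unsigned char)needle[i]))
            i++;
        if (needle[i] == '\0')
            return true;              /* every needle character matched */
    }
    return false;
}
```

With this, @contains_ignoring_case("Displaying PDF Content", "pdf")@ is true, while a search for @"search"@ in the same text is false.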

Easy peasy.

Note: much of this code is only sight-compiled. I pulled it from some code I had, but it wasn't a straight-across copy, so if you find an error, please let me know.
