Using the NCBI eUtilities via CGI
∙ Using LWP to post large UID lists and retrieve results in batches
∙ Retrieving eLinked data in batches using "index lists"
∙ Additional information
Entrez manipulates sets of UIDs
Every database in the Entrez domain assigns unique IDs (UIDs) to major record-types in each database. These IDs are integer values unique within the database, but the same integer may be used to identify records in multiple databases. (Thus, to identify a particular record, one must specify both the database and the record's UID.)
The primary function of the Entrez programmatic interface is to help users manipulate sets of UIDs, and fetch data records identified by those UIDs.
Entrez, itself, must also format data for Web-based users, but the programmatic interface leaves data display to the client program.
The interface allows programs to:
∙ define a set of UIDs,
∙ display the contents of records identified by a set of UIDs,
∙ create a new UID set from an existing set by choosing members of the existing set whose data records satisfy specified criteria, and
∙ create a new set of UIDs representing records that are in some way related to members of the records identified by an existing set of UIDs.
These primitive capabilities can be combined into powerful sequences that can integrate data from among most of the Entrez data resources. In fact, they make it possible to (partially) mimic relational database operations such as selects and joins on data in separate data resources.
Note, however, that record content may be retrieved in a limited number of report formats, where a report type contains a fixed subset of elements taken from the raw data record. As a result, additional processing may be required to prune report data for subsequent display or use, and/or multiple requests may be required to retrieve data in multiple report formats to obtain all desired data fields.
The Entrez "
The Entrez Core can keep a record of each query it processes, including the UID set resulting from each query. The database holding these records will be referred to as "the query result database" within this presentation, although it is described as "the History" or "the History server" in some NCBI documentation.
The query result database can be used by (most of the) programs that implement the Entrez set manipulation functions listed above, and is so important for efficient use of Entrez that this presentation is almost entirely oriented around it.
"Efficient" use of the query result database allows users to download large numbers of records without violating the access rate limits that NCBI imposes upon remote queries.
Each UID set in the Core database is identified by 3 pieces of information:
∙ a query identifier, known as the "query key",
∙ the name of the database used to generate the associated UID set, and
∙ an identifier for the state of the database at the time of the query, known as the "web environment".
Query keys are integers, but are often displayed as a pound sign (#) followed by an integer. The Entrez database names are strings like "snp", "nuc", "nucest", "gene", etc. Web environment identifiers are long strings (around 60 characters).
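The three identifying values can be carried around together inside a program. The following is a minimal Perl sketch, not part of NCBI's interface: the helper names make_uid_set_ref and display_key, and the WebEnv placeholder string, are hypothetical and chosen only for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: bundle the three identifying values of a UID set
# (database name, query key, web environment) into one hash reference.
sub make_uid_set_ref {
    my ($db, $query_key, $webenv) = @_;
    die "query key must be a positive integer\n"
        unless $query_key =~ /\A[1-9]\d*\z/;
    return { db => $db, query_key => $query_key, WebEnv => $webenv };
}

# Render the query key in the "#N" display form.
sub display_key {
    my ($set) = @_;
    return '#' . $set->{query_key};
}

# The WebEnv value here is a placeholder, not a real identifier.
my $set = make_uid_set_ref('snp', 2, 'EXAMPLE_WEBENV_PLACEHOLDER');
print join("\t", $set->{db}, display_key($set), $set->{WebEnv}), "\n";
```

Keeping the three values in one structure makes it harder to pass a query key together with a web environment from a different session.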
Here is a schematic query result database entry:
    Database   Query Key   WebEnv (edited)                           UID set
    snp        2           A3zq156CDS_p1DdWz...AU6u3yb5D3B634BAF50   242,28853987
NCBI program interfaces to the Entrez Core
There exist several "technologies" for accessing remote data and computing resources programmatically.
The two most popular approaches are:
∙ the Web Common Gateway Interface (CGI), and
∙ Remote Procedure Calls (RPC) over SOAP, sometimes known as JAX-RPC or "Web Services".
NCBI supports both of these interfaces to the Entrez Core. In addition, NCBI provides an educational Perl module (NCBI_PowerScripting.pm) that defines a set of objects that call the CGI services behind the scenes.
The CGI and Web Services routines are known as the "eUtilities" or "eUtils" and may be categorized with respect to the UID manipulation functions listed above as:
    Function                                             Generic name                    CGI routine
    define a set of UIDs                                 ePost (and sometimes eSearch)   epost.fcgi, esearch.fcgi
    display the contents of records identified by UIDs   eSummary and eFetch             esummary.fcgi, efetch.fcgi
    create a UID set from a previously defined set       eSearch                         esearch.fcgi
    create a UID set by finding links from an existing set   eLink                       elink.fcgi
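The table can be mirrored in a small lookup structure inside a program. This is only a sketch: the category labels (define, display, filter, link) and the routine_urls helper are made up for illustration, while the routine names and the base URL are the ones used in this presentation.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Base location of the CGI routines, as used in this presentation's URLs.
my $BASE = 'http://eutils.ncbi.nlm.nih.gov/entrez/eutils';

# Hypothetical category labels mapped to the CGI routines from the table.
my %routines_for = (
    define  => ['epost.fcgi', 'esearch.fcgi'],
    display => ['esummary.fcgi', 'efetch.fcgi'],
    filter  => ['esearch.fcgi'],
    link    => ['elink.fcgi'],
);

# Return the full URL of every routine that serves the given function.
sub routine_urls {
    my ($function) = @_;
    my $routines = $routines_for{$function}
        or die "unknown function '$function'\n";
    return map { "$BASE/$_" } @$routines;
}

print "$_\n" for routine_urls('display');
```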
This presentation will deal only with the CGI functions, but the Web Services provide identical functionality within the JAX-RPC framework. (Note that the Web Services are not currently, circa 2007, available via Perl.)
Here is a URL that uses the epost.fcgi script to insert (or "post") 2 UIDs (242 and 28853987) into the query result database:
    http://eutils.ncbi.nlm.nih.gov/entrez/eutils/epost.fcgi?db=snp&id=242,28853987
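A URL of this shape can also be assembled from its parts. A minimal Perl sketch follows, assuming a hypothetical helper named epost_url; it performs no escaping and sends no request, it only builds the string.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: build an epost.fcgi URL from a database name
# and a list of integer UIDs.
sub epost_url {
    my ($db, @uids) = @_;
    for my $uid (@uids) {
        die "UID '$uid' is not an integer\n" unless $uid =~ /\A\d+\z/;
    }
    my $base = 'http://eutils.ncbi.nlm.nih.gov/entrez/eutils/epost.fcgi';
    return "$base?db=$db&id=" . join(',', @uids);
}

print epost_url('snp', 242, 28853987), "\n";
```

Checking that every UID is an integer before joining the list catches stray whitespace or text before a request is ever sent.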
If you enter this URL into a Web browser you will get a response like:
    <?xml version="1.0"?>
    <!DOCTYPE ePostResult PUBLIC "-//NLM//DTD ePostResult, 11 May 2002//EN"
     "http://www.ncbi.nlm.nih.gov/entrez/query/DTD/ePost_020511.dtd">
    <ePostResult>
    <QueryKey>1</QueryKey>
    <WebEnv>01yWrS_p1DdWzAUPU6eOwxX2...s@1FBE5D3B634BAF50_0012SID</WebEnv>
    </ePostResult>
and the query result database will then include a new record containing the 2 UIDs specified by using the "id" parameter:
    Database   Query Key   WebEnv (edited)                             UID set
    snp        1           01yWrS_p1DdWzAUPU6e...E5D3B634BAF50_0012SID   242,28853987
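A program has to pull the query key and the web environment out of an ePost response before it can use them. The sketch below does this with regular expressions, which is adequate for a document as small as the ePost result but is an assumption of convenience; a real XML parser would be more robust. The parse_epost helper and the sample WebEnv value are hypothetical.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: extract QueryKey and WebEnv from an ePost response.
sub parse_epost {
    my ($xml) = @_;
    my ($query_key) = $xml =~ m{<QueryKey>\s*(\d+)\s*</QueryKey>};
    my ($webenv)    = $xml =~ m{<WebEnv>\s*(\S+?)\s*</WebEnv>};
    die "no QueryKey/WebEnv found in response\n"
        unless defined $query_key && defined $webenv;
    return ($query_key, $webenv);
}

# Abridged sample response; the WebEnv string is a placeholder.
my $sample = <<'XML';
<?xml version="1.0"?>
<ePostResult>
<QueryKey>1</QueryKey>
<WebEnv>SAMPLE_WEBENV_STRING</WebEnv>
</ePostResult>
XML

my ($key, $env) = parse_epost($sample);
print "query_key=$key WebEnv=$env\n";
```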
If you then specify the "db", "query_key", and "WebEnv" parameters in a URL like:
    http://eutils.ncbi.nlm.nih.gov/entrez/eutils/esummary.fcgi?db=snp&query_key=1&\
    WebEnv=01yWrS_p1DdWzAUPU6eOwxX2...s@1FBE5D3B634BAF50_0012SID
where the "\" at the end of the line signifies that the line actually continues onto the next line (but does NOT get typed in), esummary.fcgi will return a document like this (with many lines removed):
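    <?xml version="1.0"?>
    <!DOCTYPE eSummaryResult PUBLIC "-//NLM//DTD eSummaryResult, 29 October 2004//EN"
     "http://www.ncbi.nlm.nih.gov/entrez/query/DTD/eSummary_041029.dtd">
    <eSummaryResult>
    <DocSum>
        <Id>242</Id>
        <Item Name="SNP_ID" Type="Integer">242</Item>
        <Item Name="Organism" Type="String">...</Item>
        <Item Name="GENE" Type="...">...</Item>
        <Item Name="CHR" Type="...">...</Item>
        <Item Name="TAX_ID" Type="...">9606</Item>
        <Item Name="SNP_CLASS" Type="...">in-del</Item>
        <Item Name="CHRPOS" Type="...">1:20742047</Item>
    </DocSum>
    <DocSum>
        <Id>28853987</Id>
        ...
        <Item Name="GENE" Type="...">LOC653635</Item>
        <Item Name="FXN_CLASS" Type="...">locus-region</Item>
        <Item Name="SNP_CLASS" Type="...">snp</Item>
        <Item Name="..." Type="...">800</Item>
    </DocSum>
    </eSummaryResult>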
The full result is shown in first-query-xml.html.
Note that summary records were retrieved for both of the SNP UIDs placed in the query result database PRIOR to this request for a summary. esummary.fcgi used the database name, the query key, and the web environment parameters to find the UID list, and then retrieved a record from the specified database for each UID on the list.
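To use summary records in a program, each DocSum has to be split into its named fields. The following regex-based Perl sketch shows one way to do that under simplifying assumptions: the parse_docsums helper is hypothetical, the sample document is abridged and illustrative (its Type attributes other than SNP_ID's are assumed), and nested Item lists are not handled. A real XML parser would be a safer choice for production use.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: turn an eSummary document into a hash of records,
# keyed by UID, each holding Item name => value pairs.
sub parse_docsums {
    my ($xml) = @_;
    my %record_for;
    while ($xml =~ m{<DocSum>(.*?)</DocSum>}gs) {
        my $body = $1;
        my ($id) = $body =~ m{<Id>\s*(\d+)\s*</Id>};
        next unless defined $id;
        my %item;
        while ($body =~ m{<Item\s+Name="([^"]+)"[^>]*>(.*?)</Item>}gs) {
            $item{$1} = $2;
        }
        $record_for{$id} = \%item;
    }
    return \%record_for;
}

# Abridged, illustrative sample (attribute types partly assumed).
my $summary = <<'XML';
<eSummaryResult>
<DocSum>
<Id>242</Id>
<Item Name="SNP_ID" Type="Integer">242</Item>
<Item Name="SNP_CLASS" Type="String">in-del</Item>
</DocSum>
<DocSum>
<Id>28853987</Id>
<Item Name="SNP_ID" Type="Integer">28853987</Item>
<Item Name="SNP_CLASS" Type="String">snp</Item>
</DocSum>
</eSummaryResult>
XML

my $records = parse_docsums($summary);
for my $id (sort { $a <=> $b } keys %$records) {
    print $id, ': SNP_CLASS=', $records->{$id}{SNP_CLASS}, "\n";
}
```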
The following URL shows how to use efetch.fcgi to get a full XML record for these two SNP UIDs:
    http://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=snp&query_key=2&\
    WebEnv=01yWrS_p1DdWzAUPU6eOwxX2...s@1FBE5D3B634BAF50_0012SID&\
    report=sgml&\
    mode=xml
The result may be examined in fetch-example-xml.html. Note that the "report" and "mode" options were used to specify the report contents and format. The values chosen for these options ("sgml" for the report and "xml" for the mode) may seem rather unusual.
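Because esummary.fcgi and efetch.fcgi both locate a UID set through the same three parameters, the URLs can be produced by one routine. The Perl sketch below assumes a hypothetical helper named history_url and a placeholder WebEnv value; extra options such as "report" and "mode" are appended in the order given, and no URL escaping is attempted.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: build a URL that refers to a stored UID set by
# database name, query key, and web environment, plus optional extras
# passed as name => value pairs.
sub history_url {
    my ($routine, $db, $query_key, $webenv, @extra) = @_;
    my $base  = 'http://eutils.ncbi.nlm.nih.gov/entrez/eutils';
    my @pairs = ("db=$db", "query_key=$query_key", "WebEnv=$webenv");
    while (my ($name, $value) = splice(@extra, 0, 2)) {
        push @pairs, "$name=$value";
    }
    return "$base/$routine?" . join('&', @pairs);
}

# SAMPLE_WEBENV is a placeholder, not a real web environment string.
print history_url('esummary.fcgi', 'snp', 1, 'SAMPLE_WEBENV'), "\n";
print history_url('efetch.fcgi', 'snp', 1, 'SAMPLE_WEBENV',
                  report => 'sgml', mode => 'xml'), "\n";
```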
Using ePost in a Perl program
The hand-entered queries shown above can all be sent to Entrez via programs. A Perl program to post 2 UIDs (242 and 28853987) to the query result database is shown below:
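    #!/usr/bin/perl -w
    use strict;
    use LWP::Simple;
    # URL that posts two SNP UIDs to the query result database.
    my $url = "http://eutils.ncbi.nlm.nih.gov/entrez/eutils/epost.fcgi?db=snp&id=242,28853987";
    # get() returns the whole XML response as one string, or undef on failure.
    my $result = get($url);
    die "ePost request failed\n" unless defined $result;
    print $result;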
Note that the query is identical to the one issued in the first example above, and the results will be identical, except for changes in the Web