<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<?xml-stylesheet href="/wikid/docs/xsl/mwCollections/CollectionHarvester2/display.xsl" type="text/xsl"?>

<!--
This resource container holds the product of the resolution request
-->
<resource xmlns:config="info:sid/localhost:CollectionSimpleSchemas:config" xmlns:explain="http://explain.z3950.org/dtd/2.0/" xmlns:srw="http://www.loc.gov/zing/srw/" xmlns:wiki="info:sid/localhost:CollectionSimpleSchemas:wiki" xmlns:wr="http://errol.oclc.org/oai:xmlregistry.oclc.org:errol/WikiRepository" xmlns:xlink="http://www.w3.org/TR/xlink" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<!--
This is an echo of the request information this stylesheet used to produce the resolution product
-->
<uri-context>
<srwIdentifier>info:sid/localhost:CollectionHarvester2:Harvester2Faq</srwIdentifier>
<collectionURI>info:sid/localhost:CollectionHarvester2</collectionURI>
<repository-identifier>CollectionHarvester2</repository-identifier>
<srwURL>http://alcme.oclc.org:80/wikid/search/WikiDb.localhost</srwURL>
<local-identifier>Harvester2Faq</local-identifier>
<action>display</action>
</uri-context>
<!--
This is the collection configuration record
-->
<record xmlns="info:sid/localhost:CollectionSimpleSchemas:config" xsi:schemaLocation="info:sid/localhost:CollectionSimpleSchemas:config http://alcme.oclc.org:80/wikid/raw/info:sid/localhost:CollectionSimpleSchemas:config.xsd">
<repositoryName>OAI Harvester2 Documentation</repositoryName>
<description>http://www.oclc.org/research/software/oai/harvester2.htm</description>
<localIdentifierType>userAssigned</localIdentifierType>
<adminEmail>mailto:jyoung@oclc.org</adminEmail>
<defaultXSL>no</defaultXSL>
<schemaURI recordPrefix="wiki">info:sid/localhost:CollectionSimpleSchemas:wiki</schemaURI>
<crosswalkSchemaURI recordPrefix="xhtml">info:sid/localhost:CollectionExternalSchemas:xhtml</crosswalkSchemaURI>
<defaultSchemaURI>info:sid/localhost:CollectionExternalSchemas:xhtml</defaultSchemaURI>
</record>
<!--
There is a local-identifier, so this URI must identify an item in a collection
-->
<!--
This is the searchRetrieveResponse for the item's Deposit record
-->
<content>
<searchRetrieveResponse xmlns="http://www.loc.gov/zing/srw/">
<version>1.1</version>
<numberOfRecords>1</numberOfRecords>
<resultSetId>1xwimq</resultSetId>
<resultSetIdleTime>300</resultSetIdleTime>
<records xmlns:ns1="http://www.loc.gov/zing/srw/">
<record>
<recordSchema>http://www.oclc.org/schemas/WikiRepository</recordSchema>
<recordPacking>xml</recordPacking>
<recordData>
<wr:Deposit xmlns="http://www.w3.org/TR/xhtml1/strict">
<wr:browserPath>http://alcme.oclc.org:80/wikid/docs/WikiRepository</wr:browserPath>
<wr:refID>info:sid/localhost:CollectionHarvester2:Harvester2Faq</wr:refID>
<wr:refIDPrefix/>
<wr:userName>anonymous</wr:userName>
<wr:collection>CollectionHarvester2</wr:collection>
<wr:relativePath>2006/07/10/11</wr:relativePath>
<wr:fullRefID>inf_3asid_2flocalhost_3aCollectionHarvester2_3aHarvester2Faq_5f20060710112555331</wr:fullRefID>
<wr:mimeType>text/xml</wr:mimeType>
<wr:sort>CollectionHarvester2:Harvester2Faq</wr:sort>
<wr:dateCreated>2006-07-10</wr:dateCreated>
<wr:datestamp>2006-07-10</wr:datestamp>
<wr:oldDate/>
</wr:Deposit>
</recordData>
<recordPosition>1</recordPosition>
</record>
</records>
<echoedSearchRetrieveRequest xmlns:ns2="http://www.loc.gov/zing/srw/">
<version>1.1</version>
<query>repos.hasDate = "hasdate" and oai.identifier exact "info:sid/localhost:CollectionHarvester2:Harvester2Faq"</query>
<xQuery>
<ns3:searchClause xmlns:ns3="http://www.loc.gov/zing/cql/xcql/">
<ns3:index>cql.any</ns3:index>
<ns3:relation>
<ns3:value>=</ns3:value>
</ns3:relation>
<ns3:term>huh?</ns3:term>
</ns3:searchClause>
</xQuery>
<startRecord>1</startRecord>
<maximumRecords>1</maximumRecords>
<recordPacking>xml</recordPacking>
<recordSchema>default</recordSchema>
</echoedSearchRetrieveRequest>
</searchRetrieveResponse>
<!--
This is the datestamp for the Deposit
-->
<datestamp>2006-07-10</datestamp>
<!--
This is the URL for the content
-->
<contentURL>http://alcme.oclc.org:80/wikid/docs/WikiRepository/2006/07/10/11/inf_3asid_2flocalhost_3aCollectionHarvester2_3aHarvester2Faq_5f20060710112555331</contentURL>
<!--
Here is the record content
-->
<record>
<record xmlns="info:sid/localhost:CollectionSimpleSchemas:wiki" xsi:schemaLocation="info:sid/localhost:CollectionSimpleSchemas:wiki http://alcme.oclc.org:80/wikid/raw/info:sid/localhost:CollectionSimpleSchemas:wiki.xsd">
<raw>= OAI Harvester2 FAQ =
== Does Harvester2 do ...? ==
Probably not. Harvester2 is a bare-bones OAI tool you can use to incrementally harvest data from an OAI repository and write it out to disk files. What happens to it from there is left as an exercise for the user.
== Which version of OAI-PMH does Harvester2 support? ==
* OAI-PMH v1.1
* OAI-PMH v2.0
== Does Harvester2 support XOAI-PMH? ==
I don't know much about XOAI-PMH. From what I can tell, some of the features of XOAI-PMH were integrated into OAI-PMH v2.0. Beyond that, I can't find much in the way of details.
== Can Harvester2 process records as they are being harvested? ==
The first harvester I wrote (http://www.oclc.org/research/software/oai/harvester.htm'''''') included a hook that allowed records to be processed as they were being harvested. The benefit of doing so is nil, though, so I didn't include it in Harvester2. It's better to write the data out to disk first and then run a separate process to do something with it. Since the possibilities beyond this point are infinite, there's not much I could provide in the way of utilities anyway.
== How do I use the RawWrite application? ==
* java -classpath log4j.jar:harvester2.jar:xalan.jar ORG.oclc.oai.harvester2.app.Raw''''''Write {OAI baseURL}
* Optional Parameters:
** -out (local filename)
** -from (ISO 8601 date)
** -until (ISO 8601 date)
** -metadataPrefix (from the L''''''istMetadataFormats response)
** -resumptionToken (to resume from an interrupted harvest session)
** -setSpec (from the L''''''istSets response)
== Which metadata formats does Harvester2 support? ==
Harvester2 can accept any metadata format available from an OAI repository. Each record is just a blob of XML as far as Harvester2 is concerned.
== Does Harvester2 support searching? ==
No. Searching is beyond the scope of OAI-PMH. Once you've harvested some data, doing something useful with it is left as an exercise for the user. Here are some clues to get you started with searching, though:
* http://en.wikipedia.org/wiki/OpenSearch
* http://en.wikipedia.org/wiki/SRW
Note that there are many tools that include OAI harvesting as one of their features. Check out the list at http://www.openarchives.org/tools/tools.html''''''.
== Does Harvester2 support I18N? ==
Harvester2 fully conforms to the OAI-PMH requirement for handling Unicode.
== What kind of database does Harvester2 include? ==
Harvester2 doesn't include a database. It is designed to write harvested data to disk files. Loading this data into a database where it can be used is left as an exercise for the user.
== Do I need Tomcat to run Harvester2? ==
No. Harvester2 is a Java application. It runs strictly on the client side, with an OAI repository acting as the server.</raw>
</record>
</record>
</content>
<displayContent>
<html xmlns="http://www.w3.org/1999/xhtml">
<body><h1> OAI Harvester2 FAQ </h1>
<h2> Does Harvester2 do ...? </h2>
Probably not. Harvester2 is a bare-bones OAI tool you can use to incrementally harvest data from an OAI repository and write it out to disk files. What happens to it from there is left as an exercise for the user.
<h2> Which version of OAI-PMH does Harvester2 support? </h2>
<ul>
<li> OAI-PMH v1.1</li>
<li> OAI-PMH v2.0</li>
</ul>

<h2> Does Harvester2 support XOAI-PMH? </h2>
I don't know much about XOAI-PMH. From what I can tell, some of the features of XOAI-PMH were integrated into OAI-PMH v2.0. Beyond that, I can't find much in the way of details.
<h2> Can Harvester2 process records as they are being harvested? </h2>
The first harvester I wrote (<a href="http://www.oclc.org/research/software/oai/harvester.htm">http://www.oclc.org/research/software/oai/harvester.htm</a>) included a hook that allowed records to be processed as they were being harvested. The benefit of doing so is nil, though, so I didn't include it in Harvester2. It's better to write the data out to disk first and then run a separate process to do something with it. Since the possibilities beyond this point are infinite, there's not much I could provide in the way of utilities anyway.
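As a rough illustration of such a separate process, the Python sketch below streams through a harvested file and lists the identifier of each record. It assumes the file on disk is well-formed XML containing OAI-PMH v2.0 record elements; the exact layout RawWrite produces is not described in this FAQ, and the function name is hypothetical.

```python
import sys
import xml.etree.ElementTree as ET

# Namespace of OAI-PMH v2.0 responses.
OAI_NS = "http://www.openarchives.org/OAI/2.0/"


def iter_identifiers(path):
    """Yield the header identifier of each OAI-PMH record found in the file.

    Assumption: the harvested file is well-formed XML with v2.0 record elements.
    """
    record_tag = "{%s}record" % OAI_NS
    ident_path = "{%s}header/{%s}identifier" % (OAI_NS, OAI_NS)
    # iterparse reads the file incrementally instead of loading it whole.
    for _event, elem in ET.iterparse(path, events=("end",)):
        if elem.tag == record_tag:
            ident = elem.find(ident_path)
            if ident is not None and ident.text:
                yield ident.text.strip()
            # Drop the record's children once handled to limit memory use.
            elem.clear()


if __name__ == "__main__":
    # Usage: pass the path of the harvested file as the only argument.
    if len(sys.argv) == 2:
        for identifier in iter_identifiers(sys.argv[1]):
            print(identifier)
```

Run it as a second step once the harvest has finished, passing the file name that was given to -out.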
<h2> How do I use the RawWrite application? </h2>
<ul>
<li> java -classpath log4j.jar:harvester2.jar:xalan.jar ORG.oclc.oai.harvester2.app.RawWrite {OAI baseURL}</li>
<li> Optional Parameters:
<ul>
<li> -out (local filename)</li>
<li> -from (ISO 8601 date)</li>
<li> -until (ISO 8601 date)</li>
<li> -metadataPrefix (from the ListMetadataFormats response)</li>
<li> -resumptionToken (to resume from an interrupted harvest session)</li>
<li> -setSpec (from the ListSets response)</li>
</ul>
</li>
</ul>
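To make the options above concrete, here is a minimal Python sketch that assembles a RawWrite command line as an argument list. The jar names, class name, and flags are copied from the list above; the function name, the example base URL, the output file name, the oai_dc prefix, and the assumption that options may be placed before the base URL are illustrative only.

```python
import shlex

# Copied from the FAQ entry; adjust the jar paths to your installation.
CLASSPATH = "log4j.jar:harvester2.jar:xalan.jar"
MAIN_CLASS = "ORG.oclc.oai.harvester2.app.RawWrite"


def build_rawwrite_command(base_url, out=None, from_date=None, until=None,
                           metadata_prefix=None, resumption_token=None,
                           set_spec=None):
    """Return the argv list for one RawWrite run (hypothetical helper).

    Options left as None are omitted from the command line.
    """
    argv = ["java", "-classpath", CLASSPATH, MAIN_CLASS]
    options = [
        ("-out", out),
        ("-from", from_date),
        ("-until", until),
        ("-metadataPrefix", metadata_prefix),
        ("-resumptionToken", resumption_token),
        ("-setSpec", set_spec),
    ]
    for flag, value in options:
        if value is not None:
            argv += [flag, value]
    # Assumption: the base URL may follow the options.
    argv.append(base_url)
    return argv


if __name__ == "__main__":
    cmd = build_rawwrite_command(
        "http://example.org/oai",      # placeholder repository URL
        out="harvest.xml",
        from_date="2006-07-01",
        metadata_prefix="oai_dc",
    )
    # Print the command for inspection rather than running it.
    print(shlex.join(cmd))
```

Passing the returned list to subprocess.run would start the harvest; printing it first makes it easy to check the options.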

<h2> Which metadata formats does Harvester2 support? </h2>
Harvester2 can accept any metadata format available from an OAI repository. Each record is just a blob of XML as far as Harvester2 is concerned.
<h2> Does Harvester2 support searching? </h2>
No. Searching is beyond the scope of OAI-PMH. Once you've harvested some data, doing something useful with it is left as an exercise for the user. Here are some clues to get you started with searching, though:
<ul>
<li> <a href="http://en.wikipedia.org/wiki/OpenSearch">http://en.wikipedia.org/wiki/OpenSearch</a></li>
<li> <a href="http://en.wikipedia.org/wiki/SRW">http://en.wikipedia.org/wiki/SRW</a></li>
</ul>

Note that there are many tools that include OAI harvesting as one of their features. Check out the list at <a href="http://www.openarchives.org/tools/tools.html">http://www.openarchives.org/tools/tools.html</a>.
<h2> Does Harvester2 support I18N? </h2>
Harvester2 fully conforms to the OAI-PMH requirement for handling Unicode.
<h2> What kind of database does Harvester2 include? </h2>
Harvester2 doesn't include a database. It is designed to write harvested data to disk files. Loading this data into a database where it can be used is left as an exercise for the user.
<h2> Do I need Tomcat to run Harvester2? </h2>
No. Harvester2 is a Java application. It runs strictly on the client side, with an OAI repository acting as the server.</body>
</html>
</displayContent>
</resource>
