The OAIHarvester application is available here for evaluation purposes. This demo harvests the OCLC ETD repository and writes the records to standard output. Right-click each of the following links and save the link target to a local subdirectory. NOTICE! Be sure the saved file names match those listed.
Next, obtain Xerces and Tomcat from the Apache Web site and install them. Locate xerces.jar and servlet.jar in the installed packages, then either copy them to the subdirectory holding the downloaded files or adjust the classpath in the following command to point to their installed locations.
Finally, issue the command 'java -classpath servlet.jar:xerces.jar:pears.jar:harvester.jar ORG.oclc.oai.harvester.Harvester' from that subdirectory. On Windows, separate the classpath entries with ';' instead of ':'.
The OAIHarvester properties file defaults to the file named
harvester.properties in the current directory. This
default can be overridden, however, using command line switches
when running the application. Regardless, this file must contain
the following key=value pairs:
The list of OAI repositories to be harvested and the date of the last harvest for each are maintained via the ORG.oclc.oai.harvester.OAIServerSet interface. The default implementation of this interface is ORG.oclc.oai.harvester.SimpleOAIServerSet. To initialize it, create an XML data file containing the list of servers and an optional start date for each. The serverset.xml file should take the following form:
<HarvesterAdmin>
  <OAIServer>
    <baseURL>http://purl.org/alcme/harvestcat/servlet/OAIHandler</baseURL>
    <lastHarvestDate>2000-01-01</lastHarvestDate>
  </OAIServer>
  <OAIServer>
    <baseURL>http://purl.org/alcme/etdcat/servlet/OAIHandler</baseURL>
    <lastHarvestDate></lastHarvestDate>
  </OAIServer>
  ...
</HarvesterAdmin>
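Before serializing the server list, it can help to confirm that the XML file is well formed and contains the entries you expect. The following is a minimal sketch that uses only the standard JAXP DOM parser. The class name ServerSetPreview and its methods are hypothetical and are not part of the OAIHarvester distribution, and the sketch makes no claim about how SimpleOAIServerSet itself reads the file.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper, not part of OAIHarvester: lists the servers in a serverset.xml file.
class ServerSetPreview {

    // Returns one {baseURL, lastHarvestDate} pair per <OAIServer> element.
    // A missing or empty element is reported as an empty string.
    static List<String[]> parse(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(
                    new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            NodeList servers = doc.getElementsByTagName("OAIServer");
            List<String[]> result = new ArrayList<>();
            for (int i = 0; i < servers.getLength(); i++) {
                Element server = (Element) servers.item(i);
                result.add(new String[] {
                    childText(server, "baseURL"),
                    childText(server, "lastHarvestDate")
                });
            }
            return result;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse server set XML", e);
        }
    }

    // Text of the first descendant element with the given tag name, or "" if absent.
    static String childText(Element parent, String tag) {
        NodeList nodes = parent.getElementsByTagName(tag);
        return nodes.getLength() == 0 ? "" : nodes.item(0).getTextContent().trim();
    }

    public static void main(String[] args) throws Exception {
        // Assumes the file is UTF-8 encoded; defaults to serverset.xml in the current directory.
        String path = args.length > 0 ? args[0] : "serverset.xml";
        String xml = new String(Files.readAllBytes(Paths.get(path)), StandardCharsets.UTF_8);
        for (String[] server : parse(xml)) {
            String date = server[1].isEmpty() ? "(no start date)" : server[1];
            System.out.println(server[0] + "  " + date);
        }
    }
}
```

Run it as 'java ServerSetPreview serverset.xml'. A parse error here points to a malformed file that is worth fixing before the list is serialized.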
Given this input file, you can execute the command 'java ORG.oclc.oai.harvester.SimpleOAIServerSet serverset.xml simpleoaiserverset.ser'.
The result is a serialized OAIServerSet object that can be used by the harvester
by assigning the following properties in the harvester.properties file:
OAIServerSet.className=ORG.oclc.oai.harvester.SimpleOAIServerSet
SimpleOAIServerSet.collectionFileName=simpleoaiserverset.ser
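Each property must sit on its own line for java.util.Properties to read it as a separate key. The sketch below, offered as an illustration only, loads a properties file with the standard java.util.Properties class and reports which of the two server-set keys named above are missing or empty. The class HarvesterPropertiesCheck is hypothetical and is not shipped with OAIHarvester, and the sketch does not check any other keys the harvester may require.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

// Hypothetical helper, not part of OAIHarvester: checks the server-set properties.
class HarvesterPropertiesCheck {

    // The two keys described in this section; other required keys are not covered here.
    static final String[] SERVER_SET_KEYS = {
        "OAIServerSet.className",
        "SimpleOAIServerSet.collectionFileName"
    };

    // Parses properties-file text into a Properties object.
    static Properties load(String text) {
        Properties props = new Properties();
        try (Reader reader = new StringReader(text)) {
            props.load(reader);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    // Returns the server-set keys that are absent or have a blank value.
    static List<String> missingKeys(Properties props) {
        List<String> missing = new ArrayList<>();
        for (String key : SERVER_SET_KEYS) {
            String value = props.getProperty(key);
            if (value == null || value.trim().isEmpty()) {
                missing.add(key);
            }
        }
        return missing;
    }

    public static void main(String[] args) throws IOException {
        // Defaults to harvester.properties in the current directory.
        String path = args.length > 0 ? args[0] : "harvester.properties";
        String text = new String(Files.readAllBytes(Paths.get(path)), StandardCharsets.ISO_8859_1);
        Properties props = load(text);
        List<String> missing = missingKeys(props);
        if (missing.isEmpty()) {
            for (String key : SERVER_SET_KEYS) {
                System.out.println(key + "=" + props.getProperty(key));
            }
        } else {
            System.out.println("Missing or empty keys: " + missing);
        }
    }
}
```

If the two assignments are accidentally run together on one line, the second key is swallowed into the value of the first, and this check reports SimpleOAIServerSet.collectionFileName as missing.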