Patrick Hochstenbach edited this page Jun 3, 2014 · 31 revisions

Install Catmandu OAI processing on your computer

Make sure you have cpanm (hint: $ cpan App::cpanminus) installed.

$ cpanm Catmandu::OAI

Read Dublin Core records from an OAI repository from the command line

  1. Go to: http://www.opendoar.org/
  2. Find a repository of your choice
  3. Read the base URL of the repository from the 'OAI-PMH' field
  4. In a terminal, run the catmandu convert command with the URL found in the 'OAI-PMH' field

E.g.

$ catmandu convert OAI --url https://biblio.ugent.be/oai

Read Dublin Core records from an OAI repository in your Perl code

use Catmandu;
use Data::Dumper;

Catmandu->importer('OAI',url => 'https://biblio.ugent.be/oai')->each(sub {
   my $record = shift;
   # $record is a hash reference, so dump it to see its fields
   print Dumper($record);
});

Convert Dublin Core records from an OAI repository into YAML from the command line

$ catmandu convert OAI --url https://biblio.ugent.be/oai to YAML

Convert Dublin Core records from an OAI repository into YAML in your Perl code

use Catmandu;

my $importer = Catmandu->importer('OAI',url => 'https://biblio.ugent.be/oai');
my $exporter = Catmandu->exporter('YAML');

$exporter->add_many($importer);
$exporter->commit;

Extract all identifiers from an OAI repository from the command line

$ catmandu convert OAI --url https://biblio.ugent.be/oai --fix 'retain_field("_id")'

or, if you prefer a CSV file:

$ catmandu convert OAI --url https://biblio.ugent.be/oai to CSV --fix 'retain_field("_id")'

Extract all identifiers from an OAI repository into CSV in your Perl code

use Catmandu;

my $importer = Catmandu->importer('OAI',url => 'https://biblio.ugent.be/oai');
my $fixer    = Catmandu->fixer('retain_field("_id")');
my $exporter = Catmandu->exporter('CSV');

$exporter->add_many(
     $fixer->fix($importer)
);

$exporter->commit;

Show the speed of importing records from the command line

Hint: use the -v option

$ catmandu convert -v OAI --url https://biblio.ugent.be/oai to CSV --fix 'retain_field("_id")' > /dev/null

Here we send the output to /dev/null so that only the verbose messages are shown.

Show the speed of importing records from your Perl program

use Catmandu;

my $importer = Catmandu->importer('OAI',url => 'https://biblio.ugent.be/oai');
my $fixer    = Catmandu->fixer('retain_field("_id")');
my $exporter = Catmandu->exporter('CSV');

$exporter->add_many(
     $fixer->fix($importer->benchmark)
);

$exporter->commit;

See some debug messages

Make sure you have Log::Log4perl and Log::Any::Adapter::Log4perl installed (hint: $ cpan Log::Log4perl Log::Any::Adapter::Log4perl).

In your main program do:

use Catmandu;
use Log::Any::Adapter;
use Log::Log4perl;

Log::Any::Adapter->set('Log4perl');
Log::Log4perl::init('./log4perl.conf');

# The lines above should be enough to activate logging for Catmandu.
# Include the lines below to activate logging for your main program.
my $logger = Log::Log4perl->get_logger('myprog');

$logger->info("Starting main program");

...your code...

with log4perl.conf like:

# Send a copy of all logging messages to STDOUT
log4perl.rootLogger=DEBUG,STDOUT

# Logging specific for your main program
log4perl.category.myprog=INFO,STDOUT

# Logging specific for one part of Catmandu
log4perl.category.Catmandu::Fix=DEBUG,STDOUT

# Where to send the STDOUT output
log4perl.appender.STDOUT=Log::Log4perl::Appender::Screen
log4perl.appender.STDOUT.stderr=1
log4perl.appender.STDOUT.utf8=1

log4perl.appender.STDOUT.layout=PatternLayout
log4perl.appender.STDOUT.layout.ConversionPattern=%d [%P] - %p %l time=%r : %m%n

You will now see Catmandu log messages (e.g. for Fixes).

If you want to add logging functionality to your own Perl modules, you have two options:

  1. Your package is a Catmandu::Importer, Catmandu::Fix or Catmandu::Exporter. In this case you are lucky, because you have a logger as part of your instance:

    $self->log->debug('blablabla'); # where $self is an Importer, Fix or Exporter instance

  2. You need to create the logger yourself.

    package Foo::Bar;

    use Moo;
    use Log::Any;

    # Create a logger whose category is this package's name
    our $log = Log::Any->get_logger(category => __PACKAGE__);

    sub bar {
        my $self = shift;
        $log->debug('tadaah');
    }

If you want to see the logging messages of your package only, then use a line like this in your log4perl.conf:

log4perl.category.Foo::Bar=DEBUG,STDOUT

or if you want to see all the log messages for Foo packages:

log4perl.category.Foo=DEBUG,STDOUT 
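
Note that the sample log4perl.conf shown earlier attaches the STDOUT appender to the root logger at DEBUG level. With that in place, adding a category line does not hide the other messages, and a message may be printed more than once when the same appender is attached at both levels. A minimal sketch of a configuration that shows debug output for Foo::Bar only, assuming the STDOUT appender is defined as in the sample above (the ERROR threshold for the root logger is an illustrative choice, not something Catmandu requires):

# Everything else: errors and worse only
log4perl.rootLogger=ERROR,STDOUT

# Foo::Bar: debug and up. No appender is listed here, so the
# messages reach STDOUT once, through the root logger's appender.
log4perl.category.Foo::Bar=DEBUG

This relies on Log4perl's category hierarchy: a category's own level decides whether a message is logged, and the message is then passed on to the appenders of its parent categories.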