Vitali Peil edited this page Aug 22, 2022 · 31 revisions

Install Catmandu OAI processing on your computer

Make sure you have cpanm (hint: $ cpan App::cpanminus) installed.

$ cpanm Catmandu::OAI

Read Dublin Core records from an OAI repository from the command line

  1. Go to: http://www.opendoar.org/
  2. Find a repository of choice
  3. Read the base URL of the repository from the 'OAI-PMH' field
  4. In a terminal, run the catmandu convert command with the URL found in the OAI-PMH field

E.g.

$ catmandu convert OAI --url https://biblio.ugent.be/oai

Read Dublin Core records from an OAI repository in your Perl code

use Catmandu;

Catmandu->importer('OAI',url => 'https://biblio.ugent.be/oai')->each(sub {
   my $record = shift;
   # $record is a hash reference, so print one of its fields
   print "$record->{_id}\n";
});

Convert Dublin Core records from an OAI repository into YAML from the command line

$ catmandu convert OAI --url https://biblio.ugent.be/oai to YAML

Convert Dublin Core records from an OAI repository into YAML in your Perl code

use Catmandu -all;

my $importer = importer('OAI',url => 'https://biblio.ugent.be/oai');
my $exporter = exporter('YAML');

$exporter->add_many($importer);
$exporter->commit;

Extract all identifiers from an OAI repository from the command line

$ catmandu convert OAI --url https://biblio.ugent.be/oai --fix 'retain("_id")'

or, if you prefer a CSV file

$ catmandu convert OAI --url https://biblio.ugent.be/oai to CSV --fix 'retain("_id")'

Extract all identifiers from an OAI repository into CSV in your Perl code

use Catmandu;

my $importer = Catmandu->importer('OAI',url => 'https://biblio.ugent.be/oai');
my $fixer    = Catmandu->fixer('retain("_id")');
my $exporter = Catmandu->exporter('CSV');

$exporter->add_many(
     $fixer->fix($importer)
);

$exporter->commit;

Show the speed of importing records from the command line

Hint: use the -v option

$ catmandu convert -v OAI --url https://biblio.ugent.be/oai to CSV --fix 'retain("_id")' > /dev/null

Here we send the output to /dev/null so that only the verbose messages are shown.

Show the speed of importing records from your Perl program

use Catmandu;

my $importer = Catmandu->importer('OAI',url => 'https://biblio.ugent.be/oai');
my $fixer    = Catmandu->fixer('retain("_id")');
my $exporter = Catmandu->exporter('CSV');

$exporter->add_many(
     $fixer->fix($importer->benchmark)
);

$exporter->commit;

See some debug messages

Make sure you have Log::Log4perl installed (hint: $ cpan Log::Any::Adapter::Log4perl).

In your main program do:

use Catmandu;
use Log::Any::Adapter;
use Log::Log4perl;

Log::Any::Adapter->set('Log4perl');
Log::Log4perl::init('./log4perl.conf');

# The lines above should be enough to activate logging for Catmandu.
# Include the lines below to activate logging for your main program.
my $logger = Log::Log4perl->get_logger('myprog');

$logger->info("Starting main program");

...your code...

with log4perl.conf like:

# Send a copy of all logging messages to STDERR
log4perl.rootLogger=DEBUG,STDERR

# Logging specific for your main program
log4perl.category.myprog=INFO,STDERR

# Logging specific for one part of Catmandu
log4perl.category.Catmandu::Fix=DEBUG,STDERR

# Where to send the STDERR output
log4perl.appender.STDERR=Log::Log4perl::Appender::Screen
log4perl.appender.STDERR.stderr=1
log4perl.appender.STDERR.utf8=1

log4perl.appender.STDERR.layout=PatternLayout
log4perl.appender.STDERR.layout.ConversionPattern=%d [%P] - %p %l time=%r : %m%n

You will now see Catmandu log messages (e.g. for Fixes).

If you want to add logging functionality to your own Perl modules, you have two options:

  1. Your package is a Catmandu::Importer or Catmandu::Exporter. In this case you are lucky because you have a logger as part of your instance:

    $self->log->debug('blablabla'); # where $self is an Importer,Fix or Exporter instance

  2. You need to create the logger yourself.

    package Foo::Bar;

    use Moo;

    with 'Catmandu::Logger';

    sub bar { my $self = shift; $self->log->debug('tadaah'); }

If you want to see only the logging messages of your package, use a line like this in your log4perl.conf:

log4perl.category.Foo::Bar=DEBUG,STDERR

or if you want to see all the log messages for Foo packages:

log4perl.category.Foo=DEBUG,STDERR

How to create a new Catmandu::Store

A Catmandu::Store is used to store items. A Store can have one or more compartments in which items are stored. Each such compartment is a Catmandu::Bag. You can compare a Store with a database and a Bag with a table in a database. Like tables, Bags have names. When no name is provided for a Bag, 'data' is used.
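
As a minimal sketch of the Store and Bag relationship, the script below uses Catmandu::Store::Hash, the in-memory store that ships with Catmandu. The bag names 'books' and 'authors' and the sample items are made up for illustration.

```perl
use strict;
use warnings;
use Catmandu;

# Catmandu::Store::Hash keeps everything in memory, so nothing is written to disk
my $store = Catmandu->store('Hash');

# Without a name you get the default bag, which is called 'data'
my $default = $store->bag;

# Named bags are separate compartments, comparable to tables in a database
# ('books' and 'authors' are hypothetical names)
my $books   = $store->bag('books');
my $authors = $store->bag('authors');

$books->add({ _id => 'b1', title => 'Hypothetical title' });
$authors->add({ _id => 'a1', name => 'Hypothetical author' });

# Print the name and the number of items of every bag
printf "%s: %d item(s)\n", $_->name, $_->count for $default, $books, $authors;
```

Each bag keeps its own items, so adding to 'books' does not change the default 'data' bag.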

To implement a Catmandu store you need to create at least two packages:

  1. A 'Catmandu::Store', defining the general parameters, possible connection parameters and actions for the whole store.
  2. A 'Catmandu::Bag', which is used to list, add, fetch and delete items from a Bag.

As an example, this is a skeleton for a 'Foo' Catmandu::Store which requires at least one 'foo' connection parameter:

package Catmandu::Store::Foo;
use Moo;

use Catmandu::Store::Foo::Bag;

with 'Catmandu::Store';

has 'foo' => (is => 'ro' , required => 1);

1;

For this Catmandu::Store::Foo we can define a module 'Catmandu::Store::Foo::Bag' to implement the Bag functions. Notice how in the generator the bag can access the Catmandu::Store instance:

package Catmandu::Store::Foo::Bag;
use Moo;

with 'Catmandu::Bag';

sub generator {
    my $self = shift;
    sub {
        # This subroutine is used to loop over all items
        # in a store and should return an item HASH for
        # every call
        return {
            name => $self->name,
            foo  => $self->store->foo
        };
    };
}

sub get {
    my ($self,$id) = @_;
    # return an item HASH given an $id
    return {};
}

sub add {
    my ($self,$data) = @_;
    # add/update an item HASH to the bag and return the item with an _id field set
    return $data;
}

sub delete {
    my ($self,$id) = @_;
    # delete an item from the bag given an $id
    1;
}

sub delete_all {
    my ($self) = @_;
    # delete all items
    $self->each(sub {
        $self->delete($_[0]->{_id});
    });
}

1;
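
The generator in the skeleton above returns the same item on every call, so iterating over the bag never ends. In a real Bag the closure would typically walk over the stored items and return undef once they are exhausted, which is how Catmandu iterators detect the end. The plain-Perl sketch below shows that contract without depending on Catmandu; make_generator and @rows are hypothetical names standing in for a real backend.

```perl
use strict;
use warnings;

# Hypothetical in-memory data standing in for a real backend
my @rows = (
    { _id => 1, name => 'first' },
    { _id => 2, name => 'second' },
);

# Build a generator: a closure that returns the next item HASH on each
# call and undef when nothing is left
sub make_generator {
    my @items = @_;
    my $i     = 0;
    return sub {
        return if $i >= @items;    # undef in scalar context: end of data
        return $items[ $i++ ];
    };
}

# Consume the generator the way an iterator would
my $next = make_generator(@rows);
while ( defined( my $item = $next->() ) ) {
    print "$item->{_id}: $item->{name}\n";
}
```

Inside a Bag, the body of make_generator would go into the generator method, with the items fetched from the backend instead of a fixed list.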

With this skeleton Store you have enough code to run basic tests. Save these packages in a lib directory:

lib/Catmandu/Store/Foo.pm
lib/Catmandu/Store/Foo/Bag.pm

and run a catmandu command to test your implementation:

$ catmandu -I lib export Foo --foo bar

{"foo":"bar","name":"data"}
{"foo":"bar","name":"data"}
{"foo":"bar","name":"data"}
...

Or create a test.pl script to access your new Store via Perl:

#!/usr/bin/env perl
use lib qw(./lib);
use Catmandu;

my $store = Catmandu->store('Foo', foo => 'bar');

$store->add({ test => 123});