Getting Started Amazon AWS
Make sure that the vagrant-aws plugin is installed ($ vagrant plugin install vagrant-aws).
Go to https://github.com/crowdrec/idomaar/releases/tag/v3.5.1-aws, download the package and unzip it.
Prerequisites for running Idomaar on AWS:
- obtain your access key and secret key (generate them from the IAM console, http://docs.aws.amazon.com/AWSSimpleQueueService/latest/SQSGettingStartedGuide/AWSCredentials.html)
- create a .pem key to access your machine and make it available in the environment you launch from (http://docs.aws.amazon.com/gettingstarted/latest/wah/getting-started-prereq.html)
- to run the framework on AWS, your account needs at least permission to use the Amazon EC2 services
You can either use the prebuilt images (AMI) publicly available in the Oregon region or provision the instance starting from an Ubuntu image. The default behaviour is to start up a basic Ubuntu image; to start from a prebuilt image (and speed up the process), set the following environment variable:
AMI_PREBUILT=true
The following parameters can be overridden using environment variables:
KEYPAIR_NAME: default value 'idomaar'. Specifies the key name; it must be one of the keys listed in "Key pairs" under the "Network & Security" section of the EC2 console.
REGION: default value 'us-west-2'. Specifies the region where the instance will be launched; the list of available regions is at http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/using-regions-availability-zones.html#concepts-available-regions
INSTANCE_TYPE: default value 't2.small'. Specifies the type of the running instance; the list of instance types is at https://aws.amazon.com/ec2/instance-types/
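As an illustration of overriding these parameters, the sketch below exports two of them for the current shell session and prints the values that would be in effect. The key pair name "my-keypair" is hypothetical, and the fallback expressions only mirror the defaults documented above; the actual lookup happens inside the Idomaar Vagrantfile, which is not reproduced here.

```shell
# Override two of the documented defaults for this shell session.
# "my-keypair" is a hypothetical key pair name; replace it with your own.
export KEYPAIR_NAME="my-keypair"
export INSTANCE_TYPE="t2.medium"

# Print the effective values. Any variable that is not set falls back to
# the default documented above (this mirrors the docs, not the Vagrantfile).
echo "KEYPAIR_NAME=${KEYPAIR_NAME:-idomaar}"
echo "REGION=${REGION:-us-west-2}"
echo "INSTANCE_TYPE=${INSTANCE_TYPE:-t2.small}"
```

Exported variables stay set for every later command in the same shell, so they do not need to be repeated on each vagrant command line.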
In the commands below, ROOTDIR is the directory where you unzipped the Idomaar package.
-
Start the computing environment
A simplified computing environment using the 0MQ interface can be started with:
1. cd ROOTDIR/computingenvironments/01.linux/01.centos/01.mahout
2. AMI_PREBUILT=true AWS_ACCESS_KEY_ID=<<YOUR ACCESS KEY>> AWS_SECRET_ACCESS_KEY=<<YOUR SECRET KEY>> AWS_PRIVATEKEY_PATH=~/PUBLICKEY.pem vagrant up --provider=aws
Retrieve the IP of the created machines.
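One possible way to retrieve the address from the command line, besides looking it up in the EC2 console, is to parse the output of vagrant ssh-config. The helper name extract_hostname is introduced here for illustration only, and it is an assumption that the HostName reported for the AWS provider is the address you want to pass to the orchestrator (it may be a public DNS name rather than a numeric IP).

```shell
# Read `vagrant ssh-config` output on stdin and print the first HostName value.
extract_hostname() {
  awk '$1 == "HostName" { print $2; exit }'
}

# Usage, from the directory containing the Vagrantfile and with the same
# environment variables that were used for `vagrant up`:
#   vagrant ssh-config | extract_hostname
```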
-
Start the recommendation engine on the computing environment:
1. AMI_PREBUILT=true AWS_ACCESS_KEY_ID=<<YOUR ACCESS KEY>> AWS_SECRET_ACCESS_KEY=<<YOUR SECRET KEY>> AWS_PRIVATEKEY_PATH=~/PUBLICKEY.pem vagrant ssh -c 'sudo /vagrant/algorithms/01.example/itembasedrec.sh start'
-
Execute the Orchestrator
The template evaluation, with Mahout and MovieTweetings data, can be executed with:
1. cd ROOTDIR/datastreammanager
2. AMI_PREBUILT=true AWS_ACCESS_KEY_ID=<<YOUR ACCESS KEY>> AWS_SECRET_ACCESS_KEY=<<YOUR SECRET KEY>> AWS_PRIVATEKEY_PATH=~/PUBLICKEY.pem vagrant up --provider=aws
3. cd ROOTDIR
4. AMI_PREBUILT=true AWS_ACCESS_KEY_ID=<<YOUR ACCESS KEY>> AWS_SECRET_ACCESS_KEY=<<YOUR SECRET KEY>> AWS_PRIVATEKEY_PATH=~/PUBLICKEY.pem ./idomaar.sh --new-topic --comp-env-address tcp://<<COMPUTING_ENV_IP>>:2760 --training-uri https://raw.githubusercontent.com/crowdrec/datasets/master/01.MovieTweetings/datasets/snapshots_10K/evaluation/training/data.dat --test-uri https://raw.githubusercontent.com/crowdrec/datasets/master/01.MovieTweetings/datasets/snapshots_10K/evaluation/test/data.dat
Where COMPUTING_ENV_IP is the IP retrieved after starting the computing environment.
You can also read data from S3 by passing an S3 URI, as in the following example:
--training-uri s3://idomaar-test/demo/training-data.dat
--test-uri s3://idomaar-test/demo/test-data.dat
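Put together with the other options, the S3 variant of the orchestrator command might look like the sketch below. The address 203.0.113.10 is a placeholder from the documentation range, not a real computing environment, and the command is only printed here; run it from ROOTDIR with the same AWS environment variables as in step 4.

```shell
# Placeholder IP: replace with the address of your computing environment.
COMPUTING_ENV_IP="203.0.113.10"
TRAINING_URI="s3://idomaar-test/demo/training-data.dat"
TEST_URI="s3://idomaar-test/demo/test-data.dat"

# Assemble the command line and print it for inspection before running it.
CMD="./idomaar.sh --new-topic --comp-env-address tcp://${COMPUTING_ENV_IP}:2760 --training-uri ${TRAINING_URI} --test-uri ${TEST_URI}"
echo "$CMD"
```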
-
Evaluation result
At the end of the process you should see an evaluation result like the following:
INFO [datastream] Mode: recall@N
INFO [datastream] Number of total elements in the GT: ...
...
...
INFO [datastream] Mode: precision
INFO [datastream] Precision: ...
INFO [datastream] Mode: precision@N
INFO [datastream] Precision@1: ...
...
...
The generic form of the command is:
./idomaar.sh --new-topic --comp-env-address tcp://0MQHOSTNAME:0MQPORT --training-uri TRAINING_URI --test-uri TEST_URI
--new-topic creates the topic in the Kafka orchestrator
--comp-env-address hostname and port of the 0MQ instance of the computing environment
--training-uri URI of the training data in Idomaar format
--test-uri URI of the test data in Idomaar format
AWS authorization error
To decode and display the description of the error on Linux, use the awscli command-line tool (Python is a prerequisite):
$ sudo pip install awscli
$ aws configure
$ aws sts decode-authorization-message --encoded-message `cat errorcode.txt`
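The decode step can be wrapped in a small function, sketched below. The function name decode_aws_error is hypothetical; it assumes the encoded message has been saved to a file such as errorcode.txt and that aws configure has already been run. The --query and --output options are used to print only the DecodedMessage field of the response.

```shell
# Decode an encoded AWS authorization failure message stored in a file.
# $1: path to the file holding the encoded message (e.g. errorcode.txt)
decode_aws_error() {
  aws sts decode-authorization-message \
    --encoded-message "$(cat "$1")" \
    --query DecodedMessage --output text
}

# Usage:
#   decode_aws_error errorcode.txt
```

The decoded message is a JSON document, so piping the result through a JSON pretty-printer can make it easier to read.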