This library provides tools to easily and quickly import data into Odoo or export data from Odoo using CSV files. It also provides a framework to manipulate data from CSV files.
- Odoo CSV Import Export Library
- Installation
- Importing Data
- Import Parameters
- Using the Script
- Transformations
- Basic Concepts
- Mapper Functions
- mapper.const(value)
- mapper.val(field, default='', postprocess=lambda x: x, skip=False)
- mapper.map_val(field, mapping, default='')
- mapper.num(field, default='0.0')
- mapper.bool_val(field, true_vals=[], false_vals=[])
- mapper.binary(field, path_prefix, skip=False, encoding="utf-8")
- mapper.concat(separator, *fields)
- mapper.m2o(PREFIX, field, default='', skip=False)
- mapper.m2o_map(PREFIX, mapper, default='', skip=False)
- mapper.m2m(PREFIX, *fields)
- mapper.m2m_id_list(PREFIX, *args, **kwargs)
- mapper.m2m_value_list(*args, **kwargs)
- Advanced Transformations
- A Real Life Example
- Performances Considerations
- Troubleshooting
- Tips and Tricks
- Exporting Data
- Requirements
- From GitHub
git clone git@github.com:tfrancoi/odoo_csv_import.git
- From PyPi
[sudo] pip install odoo_import_export_client
The Odoo CSV Import Export library provides the script odoo_import_thread.py to import data into Odoo. The script is designed to load one data file into one model. That means you might need to run the script several times with different data files, models and other options to complete an import.
Data are not inserted directly into the database; instead they are loaded by calling the method models.load. Doing so, the standard behaviour of each model is respected.
This script has several options. Type the command odoo_import_thread.py --help to get the usage.
usage: odoo_import_thread.py [-h] -c CONFIG --file FILENAME --model MODEL
[--worker WORKER] [--size BATCH_SIZE]
[--skip SKIP] [--fail] [-s SEPARATOR]
[--groupby SPLIT] [--ignore IGNORE] [--check]
[--context CONTEXT] [--o2m]
Import data in batch and in parallel
optional arguments:
-h, --help show this help message and exit
-c CONFIG, --config CONFIG
Configuration File that contains connection parameters
--file FILENAME File to import
--model MODEL Model to import
--worker WORKER Number of simultaneous connection
--size BATCH_SIZE Number of line to import per connection
--skip SKIP Skip until line [SKIP]
--fail Fail mode
-s SEPARATOR, --sep SEPARATOR
CSV separator
--groupby SPLIT Group data per batch with the same value for the given
column in order to avoid concurrent update error
--ignore IGNORE list of column separate by comma. Those column will be
remove from the import request
--check Check if record are imported after each batch.
--context CONTEXT context that will be passed to the load function, need
to be a valid python dict
--o2m When you want to import o2m field, don't cut the batch
until we find a new id
One of the most important features is the ability to import in parallel while controlling the transaction size. These options make it possible to import huge data files while keeping performance under control, which is not possible with the built-in Odoo import wizard.
Here is what a typical execution of the script looks like:
The parameter values are set for illustration purposes.
When running the script, a number of threads are spawned. Each of them handles a number of records per transaction. Each transaction inserts or updates records in the Odoo instance defined in the configuration file.
The configuration file CONFIG
is a text file that defines the parameters used in the import. Here is an example.
[Connection]
hostname = mydb.odoo.com
database = mydb
login = admin
password = admin
protocol = jsonrpcs
port = 443
uid = 2
The section [Connection] is mandatory. Then the following parameters must be set accordingly.
Parameter | Description |
hostname | Name of the host where the Odoo resides. |
database | Name of the Postgresql database. |
login | The login used to create or update the records. |
password | The login's password. |
protocol | Protocol used for RPC calls. It can be one of the following values: xmlrpc, xmlrpcs, jsonrpc, jsonrpcs. For a remote database, it's strongly advised to use an encrypted protocol (xmlrpcs or jsonrpcs). |
port | TCP port where Odoo can be reached. Usually 443 for encrypted remote connections, or 8069 for a local Odoo with its default configuration. |
uid | This is the database id of the res.users record identified by the parameter 'login'. Well known ids are: 1 = admin user prior to V12, 2 = admin user as of V12. |
Tips: On premise, it's advised to use a dedicated user with the minimal access rights on all the models related to the import.
By default, CONFIG is set to conf/connection.conf. Under Windows, you must always set this option because the path separator of the default value is not compliant with the OS.
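Before launching an import, the configuration file can be sanity-checked with Python's standard configparser module. The following is only a sketch: the helper read_connection and the REQUIRED_KEYS tuple are hypothetical names, and treating every parameter of the table as required is an assumption, not something the library enforces in this way.

```python
import configparser

# Keys taken from the parameter table above; treating all of them as
# required is an assumption of this sketch.
REQUIRED_KEYS = ("hostname", "database", "login", "password",
                 "protocol", "port", "uid")


def read_connection(config_text):
    """Return the [Connection] parameters of a configuration text as a dict."""
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    if not parser.has_section("Connection"):
        raise ValueError("section [Connection] is missing")
    section = parser["Connection"]
    missing = [key for key in REQUIRED_KEYS if key not in section]
    if missing:
        raise ValueError("missing parameters: " + ", ".join(missing))
    return dict(section)


EXAMPLE = """
[Connection]
hostname = mydb.odoo.com
database = mydb
login = admin
password = admin
protocol = jsonrpcs
port = 443
uid = 2
"""

connection = read_connection(EXAMPLE)
print(connection["protocol"], connection["port"])
```

All values come back as strings, so port and uid would still need an int() conversion before being used as numbers.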
Define the CSV FILENAME
to import. The CSV format is mandatory. In order to be importable in Odoo, this file must follow some rules:
- The file must be in UTF-8 encoding.
- One file must contain data of only one model.
- The first line is the column names. All columns must have the technical name of the fields.
- All lines must have an id column filled with an XML_ID that identifies the record.
- Some field formats must be respected:
  - Boolean values must be 0 or 1.
  - Binary data must be encoded in base64.
  - Datetime fields format depends on the language (often %Y-%m-%d %H:%M:%S).
  - The decimal separator of float values also depends on the language (often '.').
  - Selection fields must always contain database values.
  - Many2one fields must be suffixed with /id if their value is an XML_ID or .id if it's a database id.
  - Many2many fields must be a comma separated list of XML_IDs.
- If a field value spans multiple lines, it must be enclosed in double quotes (").
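Some of the rules above can be verified before the import is launched. The sketch below uses only the standard csv module; check_import_file is a hypothetical helper, not part of the library, and it only covers the presence of the id column, empty ids and the column count of each record.

```python
import csv
import io


def check_import_file(csv_text, separator=";"):
    """Return a list of problems found in the content of an import file."""
    rows = list(csv.reader(io.StringIO(csv_text), delimiter=separator))
    if not rows:
        return ["file is empty"]
    header, records = rows[0], rows[1:]
    if "id" not in header:
        return ["no 'id' column in the header"]
    id_index = header.index("id")
    problems = []
    # Records are numbered from 1, the header line is not counted.
    for number, record in enumerate(records, start=1):
        if len(record) != len(header):
            problems.append("record %d: expected %d columns, got %d"
                            % (number, len(header), len(record)))
        elif not record[id_index].strip():
            problems.append("record %d: empty id" % number)
    return problems


print(check_import_file("id;name\nmy_import.partner_1;John Doe\n"))
print(check_import_file("name\nJohn Doe\n"))
```

Because the csv module understands quoted values, a field containing the separator or a line break is counted as one column as long as it is enclosed in double quotes.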
The field separator can be set with the option --sep SEPARATOR.
You can skip the first lines of the file with the option --skip LINE.
The name of the CSV file can be used to set the model by default. Ex: the file name res_partner.csv sets the model to res.partner. See the --model option for more detail.
Define the column separator. Ex: --sep=, (a comma). By default, it's the semicolon (;).
If the separator is present in a field value, the value must be enclosed in double quotes (").
Skips the first LINE line(s) of the file. Ex: --skip=3 will skip the first three lines.
The first line read must contain the column names. Don't skip any line if that is already the case.
Set the MODEL to import data into. Ex: --model=res.partner.
By default the model is the name of the CSV file with the underscores (_) replaced by dots (.) and without extension. That is, if the CSV file is named res_partner.csv, the model is res.partner by default.
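The default naming rule can be written in a few lines. This is a sketch of the rule as described above, not the script's actual code; model_from_filename is a hypothetical name.

```python
import os


def model_from_filename(filename):
    """Derive the default model name from a CSV file name."""
    base = os.path.basename(filename)          # drop the folder part
    name, _extension = os.path.splitext(base)  # drop the last extension
    return name.replace("_", ".")              # underscores become dots


print(model_from_filename("res_partner.csv"))
```

A file that is already named with dots, such as res.partner.csv, gives the same model name under this rule.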
Controls the number of records (BATCH_SIZE
) imported in one transaction.
When using the standard import wizard of Odoo, an import is always handled by one worker in one single transaction for the whole file.
When dealing with big data files, this may lead to two main issues:
- the time needed to import the whole file could exceed the maximum time allowed for a transaction to run. This time is set by the Odoo parameters limit-time-cpu and limit-time-real,
- if an error occurs on one record, the whole transaction fails and all the records are rejected.
The solution is then to reduce the number of records in one transaction by setting the --size parameter to the desired number.
Here colored in blue, the transaction contains two records (--size=2
). Now, only two records instead of the whole file must be imported during the time allocated for the transaction.
This option is also helpful when importing large records over a WAN connection because a smaller transaction size leads to smaller JSON or XML payloads to send over the network. However it causes a bit more network overhead which could slow down the total run time. This run time can be drastically decreased by using the --worker
parameter.
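The effect of --size can be pictured as cutting the list of records into consecutive slices, each slice being sent in its own transaction. The function below is an illustrative sketch with a hypothetical name (split_batches); it does not reproduce how the script builds its batches internally.

```python
def split_batches(records, size):
    """Cut a list of records into consecutive batches of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    return [records[start:start + size]
            for start in range(0, len(records), size)]


records = ["partner_1", "partner_2", "partner_3", "partner_4", "partner_5"]
for batch in split_batches(records, 2):
    print(batch)
```

With five records and a size of 2, the last batch holds the single remaining record. An error in one batch only rejects the records of that batch.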
Controls the number of import threads running in parallel.
Here is what an import looks like with --worker=2.
The whole file is now handled by two workers in parallel. In the best case, the total run time is divided by two.
As a rule of thumb, you can set the number of workers up to 80% of the number of Odoo workers, so that other users can still work while the import runs.
When working with multiple workers, there is a potential drawback: the concurrent updates.
In the following example, suppose Import Thread 1 is importing my_partner_2 while Import Thread 2 is importing my_partner_3.
Both partners have the same parent_id: my_partner_1. As the parent_id is a non-readonly related field, the insert/update of my_partner_2 and my_partner_3 will both trigger an update on my_partner_1. That's a concurrent update. As a consequence the current transaction of both threads will fail. To solve such an issue, the parameter --groupby can be used.
Selects the field to group in one thread.
To avoid the concurrent update issue described previously, you can use --groupby=parent_id/id
. By doing this, we ensure all the records with the same parent_id/id
are imported by the same thread. It thus eliminates the concurrent updates caused by the parent_id.
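The idea behind --groupby can be sketched as follows: records sharing the same value in the chosen column end up in the same group, so no two threads ever touch the same parent. The helper group_by_column is hypothetical and only illustrates the principle; the script may combine or size its groups differently.

```python
from collections import OrderedDict


def group_by_column(header, records, column):
    """Group records that share the same value in `column`, keeping file order."""
    index = header.index(column)
    groups = OrderedDict()
    for record in records:
        groups.setdefault(record[index], []).append(record)
    return list(groups.values())


header = ["id", "name", "parent_id/id"]
records = [
    ["my_partner_2", "Partner 2", "my_partner_1"],
    ["my_partner_4", "Partner 4", "my_partner_9"],
    ["my_partner_3", "Partner 3", "my_partner_1"],
]
for group in group_by_column(header, records, "parent_id/id"):
    print([record[0] for record in group])
```

Here my_partner_2 and my_partner_3 land in the same group because they share the parent my_partner_1.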
Specifies the columns that do not need to be imported. Multiple columns can be set in a comma separated list. Ex: --ignore=col1,col2,col3
.
This is typically used to avoid cascade updates while importing related fields. Refer to Importing Related or Computed fields.
Define the context of the ORM while importing. Ex: --context="{'tracking_disable': True}".
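Since the context must be a valid Python dict, it can be checked before running the import. The sketch below relies on ast.literal_eval from the standard library; parse_context is a hypothetical helper and how the script itself evaluates the option is not assumed here.

```python
import ast


def parse_context(text):
    """Evaluate a context string safely and make sure it is a dict."""
    value = ast.literal_eval(text)  # accepts literals only, never runs code
    if not isinstance(value, dict):
        raise ValueError("the context must be a Python dict")
    return value


print(parse_context("{'tracking_disable': True, 'lang': 'fr_FR'}"))
```

A string value that is not quoted, as in {'lang': fr_FR}, is not a literal and makes ast.literal_eval raise a ValueError.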
Here are some useful context keys.
Key | Description |
write_metadata | When True, allow to import audit log fields (create_uid, create_date, write_uid, write_date). The import must run with the `admin` user. Requires the module import_metadata available here. |
update_many2many | Set it to True when the data file contains a many2many relationship split as one record per line instead of a comma separated list of XML_IDs in one column. Suppose we want to assign categories to products, a regular record is: product1;categ1,categ2,categ3. With 'update_many2many': True, you can import a file made of the three lines product1;categ1, product1;categ2 and product1;categ3. |
Key | Description |
tracking_disable | When True, don't create messages in the chatter |
defer_fields_computation | When True, recompute the computed fields at the end of the transaction instead of after each record. Useless if --size=1. Requires the module defer_fields_computation available here. |
defer_parent_store_computation | Defer the computation of the fields parent_left and parent_right at the end of the transaction. Valid up to Odoo 11. |
lang | Set the current language. Ex: 'lang': 'fr_FR' |
force_company | Set the current company. Use the database identifier of the company. |
Key | Description |
check_move_validity | Set it to False when you import account moves and account move lines. Refer to "Importing Account Move Lines" for more details. |
create_product_product | Set it to True when you import product templates and also the variants. Without this key, the ORM will automatically create the variants when the templates are imported. |
These are some examples. Feel free to look into Odoo code to find out all context keys.
Use this option when you import a data file with one2many relationships. The import file must follow a specific structure.
Suppose the model master has two one2many fields child1_ids and child2_ids, linking respectively the models child1 and child2.
In the line beginning a master record, you can set all the master fields, like a regular import file. In addition, you can add the fields of child records. In the next lines, you can add the data of the next children, leaving empty the columns of the master record and of the nonexistent children.
With the --o2m option, the master record will be imported with its two child1 and its three child2 at the same time.
It is worth noting that it's impossible to set XML_IDs on the child records. As a consequence:
- you cannot run the import again to update the children's data,
- the children cannot be referenced in another import file.
With this option, at the end of each transaction, the number of records in the transaction is compared to the number of imported records. If these numbers do not match, an error message is printed. Most likely, the transaction contains records with duplicate XML_IDs. Refer to When the number of records does not match for more explanations.
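Duplicate XML_IDs, the most likely cause of a mismatch, can be detected in the data file before the import. The following sketch uses only the standard library; duplicate_ids is a hypothetical helper and it assumes the file has an id column as required above.

```python
import csv
import io
from collections import Counter


def duplicate_ids(csv_text, separator=";"):
    """Return the sorted list of XML_IDs that appear more than once."""
    reader = csv.DictReader(io.StringIO(csv_text), delimiter=separator)
    counts = Counter(row["id"] for row in reader)
    return sorted(xml_id for xml_id, count in counts.items() if count > 1)


DATA = """id;name
my_import.partner_1;John Doe
my_import.partner_2;David Smith
my_import.partner_1;John Doe Jr
"""
print(duplicate_ids(DATA))
```

In this sample, three lines are sent but only two records can exist afterwards, because the first and the third line share the same XML_ID.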
Engage the fail mode.
When you run odoo_import_thread.py without the --fail option, it runs in normal mode. In this mode, any rejected record is written to a file with the same name as the parameter --file suffixed by .fail and located in the same folder.
This .fail file may contain records rejected for good reasons (e.g. a required field is missing) or bad reasons. If you run an import with multiple workers, a bad reason could be a concurrent update. And even in a single thread, when an error occurs in one record, all the records of the transaction (--size) are rejected.
This means the .fail file may contain records that could be imported if the process had been run by a single thread with one transaction per record. That's the role of the fail mode.
In fail mode, odoo_import_thread.py will try to import the records of the .fail file. Additionally, neither --worker nor --size is set. The default values of 1 are then used to ensure a single thread and a single record per transaction.
In this mode, the rejected records are placed in the .fail.bis file in the same folder. This file contains only rejections that need your attention and must be solved before importing again.
To use all the benefits of the script, most of the time imports are run with multiple workers and a user defined transaction size. In this case and because of the fail mode, you always need two command lines to import one file.
Assuming:
- your configuration file is connection.conf located in the current folder,
- your data file is my_data.csv located in the current folder,
- the target model is my.model,
- you want to run a fast multithreaded import.
Run the following two commands (as an example):
odoo_import_thread.py -c connection.conf --file=my_data.csv --model=my.model --worker=4 --size=200 [other options]
odoo_import_thread.py -c connection.conf --file=my_data.csv --model=my.model --fail [other options]
The first command runs the import in parallel and in batch. The rejected records are written to the file my_data.csv.fail. We don't care about this file, it will be handled by the next command.
In the second command, the parameter --fail replaces --worker and --size. The import runs in fail mode. It will read my_data.csv.fail (note the parameter --file is unchanged) and write the rejected records to the file my_data.csv.fail.bis. If the .fail.bis file is empty, all the records are imported (inserted or updated).
Refer to the Troubleshooting section to know how to solve some issues.
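When imports are driven from Python rather than from a shell, the two command lines can be built as argument lists. This is a sketch under assumptions: build_import_commands is a hypothetical helper, it only uses the options documented above, and it does not run anything by itself.

```python
def build_import_commands(config, filename, model, worker, size):
    """Return the normal-mode and fail-mode command lines as argument lists."""
    common = ["odoo_import_thread.py", "-c", config,
              "--file=" + filename, "--model=" + model]
    # Normal mode: parallel import with a user-defined transaction size.
    normal = common + ["--worker=%d" % worker, "--size=%d" % size]
    # Fail mode: same file and model, --fail replaces --worker and --size.
    fail = common + ["--fail"]
    return normal, fail


normal, fail = build_import_commands(
    "connection.conf", "my_data.csv", "my.model", worker=4, size=200)
print(" ".join(normal))
print(" ".join(fail))
```

Each list could then be passed to subprocess.run, the fail-mode command being started only after the normal-mode command has finished.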
When the file to import doesn't respect the format expected by odoo_import_thread.py, it's necessary to apply some transformations in order to create compliant CSV files. The Odoo CSV Import Export library helps in creating a Python script able to transform CSV files thanks to the Processor and the mapper objects.
Let's start with a simple use case to introduce the main concepts of the transformations. Once you're familiar with them, a more complete use case is provided here.
A customer wants to import some partners. He provides the following CSV file, say client_file.csv:
Firstname;Lastname;Birthdate
John;Doe;31/12/1980
David;Smith;28/02/1985
This file cannot be imported directly because:
- the fields Lastname, Firstname, Birthdate do not exist in the model res.partner,
- the date format is not compliant,
- there is no id field.
The first step is to ensure that all the fields exist in the target model. Here, the birthdate is a new piece of data. We assume it must be stored in the field birthdate, created before the import. Firstname and Lastname, on the other hand, will be used to build the name of the partner.
Now the transformation step can begin. It consists of writing a Python script that builds another CSV file compliant with the model res.partner; this is our Odoo CSV file.
In this case, the transformation steps will:
- define how we build the name field from the columns Lastname and Firstname,
- change the date format to a compliant one.
Another important point to consider is what happens when we load the data several times (it could occur if the transformations must be rewritten). Basically, when we import the Odoo CSV file the first time, the two partners will be created. But if we run it again, we don't want to create duplicates. Instead, we want to update the partner information. So the transformation phase is also necessary to:
- assign an XML_ID to each partner of the file.
The presence of an XML_ID ensures that a record is created if it doesn't exist, or updated if it already exists. This behaviour is included in the method load of each Odoo model.
Let's build the transformation script, say res_partner.py. We start with importing the needed objects from the library.
# -*- coding: utf-8 -*-
from odoo_csv_tools.lib import mapper
from odoo_csv_tools.lib.transform import Processor
# Custom import
from datetime import datetime # used to change the format of datetime fields
Then we build a Processor object from the client CSV file client_file.csv. Assuming this file resides in the current folder:
processor = Processor('client_file.csv', delimiter=';')
Now we create a mapping dictionary where the keys are the fields of the target model (res.partner) we want to import (at least the required fields without default value) and the values define how we get them from the client file.
res_partner_mapping = {
'id': mapper.m2o_map('my_import_res_partner', mapper.concat('_', 'Firstname', 'Lastname', 'Birthdate')),
'name': mapper.concat(' ','Firstname','Lastname'),
'birthdate': mapper.val('Birthdate', postprocess=lambda x: datetime.strptime(x, "%d/%m/%Y").strftime("%Y-%m-%d 00:00:00")),
}
All the fields are extracted with the methods of the mapper
object. These are described here.
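The date conversion used as postprocess can be tried on its own, outside of any mapping. The function below is a minimal sketch with a hypothetical name (to_odoo_datetime); it assumes the client dates use a four-digit year, which requires the %Y directive, and it returns empty values unchanged.

```python
from datetime import datetime


def to_odoo_datetime(value):
    """Convert a DD/MM/YYYY date into the YYYY-MM-DD 00:00:00 format."""
    if not value:
        return ""  # leave empty cells untouched
    return datetime.strptime(value, "%d/%m/%Y").strftime("%Y-%m-%d 00:00:00")


print(to_odoo_datetime("31/12/1980"))
print(to_odoo_datetime("28/02/1985"))
```

The lowercase %y directive only reads a two-digit year, so parsing 31/12/1980 with "%d/%m/%y" raises a ValueError.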
An important thing to notice is the id field. It is required by the script odoo_import_thread.py. It contains an XML_ID that we can build as we want, as long as its uniqueness is guaranteed. In this example, we assume the concatenation of three columns (Firstname, Lastname and Birthdate) is enough to identify a record. It will create XML_IDs like my_import_res_partner.John_Doe_31/12/1980 in the Odoo CSV file. You are free to choose whatever module name you want, here my_import_res_partner, but it's a good idea to include the model name and something like the project name.
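To check that the chosen columns really identify each record, the XML_IDs can be rebuilt with plain Python and tested for uniqueness. The sketch below only mimics the XML_ID format shown above (module name, a dot, then the concatenated values); build_xml_id is a hypothetical helper and not the library's implementation.

```python
def build_xml_id(module, row, columns, separator="_"):
    """Build an XML_ID of the form module.value1_value2_... from a row."""
    name = separator.join(row[column] for column in columns)
    return module + "." + name


rows = [
    {"Firstname": "John", "Lastname": "Doe", "Birthdate": "31/12/1980"},
    {"Firstname": "David", "Lastname": "Smith", "Birthdate": "28/02/1985"},
]
columns = ["Firstname", "Lastname", "Birthdate"]
xml_ids = [build_xml_id("my_import_res_partner", row, columns) for row in rows]

# The identifiers are usable only if no two rows produce the same one.
assert len(xml_ids) == len(set(xml_ids)), "duplicate XML_IDs found"
print(xml_ids)
```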
Now we can invoke the transformation itself.
processor.process(res_partner_mapping, 'res.partner.csv', {'model': 'res.partner', 'context': "{'tracking_disable': True}", 'worker': 2, 'batch_size': 20})
This step will create the import file res.partner.csv for the model res.partner. It should look like this, conforming to res_partner_mapping:
id;name;birthdate
my_import_res_partner.John_Doe_31/12/1980;John Doe;1980-12-31 00:00:00
my_import_res_partner.David_Smith_28/02/1985;David Smith;1985-02-28 00:00:00
Note: The order of the columns is not related to the client file or the keys in the transform mapping dictionary.
Notice some options are set when invoking the transformation: 'context': "{'tracking_disable': True}", 'worker': 2, 'batch_size': 20.
They don't play any role in the transformation itself. Instead they will be used by the import shell script later. Fortunately, we can automatically create the shell script by adding this line:
processor.write_to_file("res_partner.sh", python_exe='', path='')
This will create the script res_partner.sh that will load the data with odoo_import_thread.py, first in normal mode, then in fail mode. It looks like this:
odoo_import_thread.py -c conf/connection.conf --file=res.partner.csv --model=res.partner --worker=2 --size=20 --groupby= --ignore= --sep=";" --context="{'tracking_disable': True}"
odoo_import_thread.py -c conf/connection.conf --fail --file=res.partner.csv --model=res.partner --ignore= --sep=";" --context="{'tracking_disable': True}"
The complete python script:
# -*- coding: utf-8 -*-
from odoo_csv_tools.lib import mapper
from odoo_csv_tools.lib.transform import Processor
# Custom import
from datetime import datetime # used to change the format of datetime fields
processor = Processor('client_file.csv', delimiter=';')
res_partner_mapping = {
'id': mapper.m2o_map('my_import_res_partner', mapper.concat('_', 'Firstname', 'Lastname', 'Birthdate')),
'name': mapper.concat(' ','Firstname','Lastname'),
'birthdate': mapper.val('Birthdate', postprocess=lambda x: datetime.strptime(x, "%d/%m/%Y").strftime("%Y-%m-%d 00:00:00")),
}
processor.process(res_partner_mapping, 'res.partner.csv', {'model': 'res.partner', 'context': "{'tracking_disable': True}", 'worker': 2, 'batch_size': 20})
processor.write_to_file("res_partner.sh", python_exe='', path='')
Run the transformations
python res_partner.py
You should have created:
- the import file res.partner.csv in the same folder as the client file client_file.csv,
- the shell script res_partner.sh in your current folder.
Coming back to our simple example, let's suppose the client adds the partner's company in his data. Here, we are not in a multi-company environment; the company is just the partner's parent. The file could look like this:
Company;Firstname;Lastname;Birthdate
The World Company;John;Doe;31/12/1980
The Famous Company;David;Smith;28/02/1985
In this case we must import four partners (the two companies and the two persons) and set the field parent_id of the two persons to their respective company. In a relational database we link records thanks to their internal identifiers (id). But at this step, these ids are unknown because the records are not imported yet. We will then use the XML_IDs to link the records.
It means when we transform a company, we assign an XML_ID to it, then we use this XML_ID as the parent_id of the person who is a member of this company. As a consequence the companies must be imported before the persons. More precisely, the XML_IDs set in the parent_id must exist before being used as a relationship value.
Let's create the transformation script. As usual, we start with the needed imports and the creation of a Processor on the client file.
# -*- coding: utf-8 -*-
from odoo_csv_tools.lib import mapper
from odoo_csv_tools.lib.transform import Processor
from datetime import datetime # used to change the format of datetime fields
processor = Processor('client_file.csv', delimiter=';')
Now we can define the mapping to extract the companies. These are records in the model res.partner with the boolean field is_company set. We also assume the company name is unique so that we can use it as an identifier in the XML_ID.
res_partner_company_mapping = {
'id': mapper.m2o_map('my_import_res_partner', mapper.val('Company')),
'name': mapper.val('Company'),
'is_company': mapper.const('1'),
}
processor.process(res_partner_company_mapping, 'res.partner.company.csv', {}, 'set')
It is worth noting the option 'set' of processor.process when invoking the companies transformation. This option removes duplicates from the Odoo CSV file, which would otherwise occur when several partners belong to the same company.
And here is the mapping to extract the persons. It's exactly the same as before except we've added the field parent_id.
res_partner_mapping = {
'id': mapper.m2o_map('my_import_res_partner', mapper.concat('_', 'Firstname', 'Lastname', 'Birthdate')),
'name': mapper.concat(' ','Firstname','Lastname'),
'birthdate': mapper.val('Birthdate', postprocess=lambda x: datetime.strptime(x, "%d/%m/%Y").strftime("%Y-%m-%d 00:00:00")),
'parent_id/id': mapper.m2o_map('my_import_res_partner', mapper.val('Company')),
}
processor.process(res_partner_mapping, 'res.partner.csv', {})
The important thing to notice here is that we use exactly the same transformation method for the partner ids in both mappings in order to generate the same XML_ID (res_partner_mapping['parent_id/id'] = res_partner_company_mapping['id']). Remember also the suffix /id when using XML_IDs in relation fields.
The results will be two different Odoo CSV files:
- The partners who are companies: res.partner.company.csv.
id;name;is_company
my_import_res_partner.The World Company;The World Company;1
my_import_res_partner.The Famous Company;The Famous Company;1
- The persons: res.partner.csv, where the column parent_id/id refers to an existing id in res.partner.company.csv.
id;parent_id/id;name;birthdate
my_import_res_partner.John_Doe_31/12/1980;my_import_res_partner.The World Company;John Doe;1980-12-31 00:00:00
my_import_res_partner.David_Smith_28/02/1985;my_import_res_partner.The Famous Company;David Smith;1985-02-28 00:00:00
Finally we generate the shell script that will load the files by adding this line in the transformation script.
processor.write_to_file("res_partner.sh", python_exe='', path='')
This will create the shell script res_partner.sh that will load the data. It looks like this:
odoo_import_thread.py -c conf/connection.conf --file=res.partner.company.csv --model=res.partner --groupby= --ignore= --sep=";" --context="{}"
odoo_import_thread.py -c conf/connection.conf --fail --file=res.partner.company.csv --model=res.partner --ignore= --sep=";" --context="{}"
odoo_import_thread.py -c conf/connection.conf --file=res.partner.csv --model=res.partner --groupby= --ignore= --sep=";" --context="{}"
odoo_import_thread.py -c conf/connection.conf --fail --file=res.partner.csv --model=res.partner --ignore= --sep=";" --context="{}"
The script contains all the commands to load both Odoo CSV files. They are written in the same order as in the transformation script. So the import sequence is respected.
The complete python script:
# -*- coding: utf-8 -*-
from odoo_csv_tools.lib import mapper
from odoo_csv_tools.lib.transform import Processor
from datetime import datetime # used to change the format of datetime fields
processor = Processor('client_file.csv', delimiter=';')
res_partner_company_mapping = {
'id': mapper.m2o_map('my_import_res_partner', mapper.val('Company')),
'name': mapper.val('Company'),
'is_company': mapper.const('1'),
}
processor.process(res_partner_company_mapping, 'res.partner.company.csv', {}, 'set')
res_partner_mapping = {
'id': mapper.m2o_map('my_import_res_partner', mapper.concat('_', 'Firstname', 'Lastname', 'Birthdate')),
'name': mapper.concat(' ','Firstname','Lastname'),
'birthdate': mapper.val('Birthdate', postprocess=lambda x: datetime.strptime(x, "%d/%m/%Y").strftime("%Y-%m-%d 00:00:00")),
'parent_id/id': mapper.m2o_map('my_import_res_partner', mapper.val('Company')),
}
processor.process(res_partner_mapping, 'res.partner.csv', {})
processor.write_to_file("res_partner.sh", python_exe='', path='')
Usually we don't import One2many fields. Instead, we import the inverse Many2one relation in the linked model.
Let's suppose our customer adds some categories to the partners. The client file could look like this:
Company;Firstname;Lastname;Birthdate;Category
The World Company;John;Doe;31/12/1980;Premium
The Famous Company;David;Smith;28/02/1985;Normal,Bad Payer
The categories are set in one column in the same line of the partner record and separated with a comma.
By looking into Odoo, we see that the model res.partner contains a field category_id which is a Many2many to the model res.partner.category. If you remember the rule that Many2many fields must be a comma separated list of XML_IDs, and that an XML_ID must be created before being used in a relationship, you get an idea of the procedure to apply.
1- Create all the categories by extracting them from the client file and assign them an XML_ID.
2- Build a comma separated list of XML_IDs of categories for each partner.
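Both steps can be sketched in plain Python to make the expected result concrete. The helpers split_categories, unique_categories and category_ids are hypothetical names; they illustrate the procedure and the XML_ID format used in this example, not the behaviour of the library's mapper functions.

```python
def split_categories(value, separator=","):
    """Split a Category cell into a list of non-empty category names."""
    return [name.strip() for name in value.split(separator) if name.strip()]


def unique_categories(cells):
    """Step 1: collect every distinct category, in order of appearance."""
    result = []
    for cell in cells:
        for name in split_categories(cell):
            if name not in result:
                result.append(name)
    return result


def category_ids(prefix, cell):
    """Step 2: build the comma separated list of XML_IDs for one partner."""
    return ",".join(prefix + "." + name for name in split_categories(cell))


cells = ["Premium", "Normal,Bad Payer"]
print(unique_categories(cells))
print(category_ids("res_partner_category", "Normal,Bad Payer"))
```

The same prefix must be used in both steps, otherwise the XML_IDs listed for a partner would not match any imported category.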
Let's start the transformation script. As usual, we start with the needed imports and the creation of a Processor on the client file.
# -*- coding: utf-8 -*-
from odoo_csv_tools.lib import mapper
from odoo_csv_tools.lib.transform import Processor
from datetime import datetime # used to change the format of datetime fields
processor = Processor('client_file.csv', delimiter=';')
The first transformation extracts the categories. We assume the name is unique to create the XML_IDs.
partner_category_mapping = {
'id': mapper.m2m_id_list('res_partner_category', 'Category'),
'name': mapper.m2m_value_list('Category'),
}
processor.process(partner_category_mapping, 'res.partner.category.csv', {}, m2m=True)
Notice we use two mapper functions to deal with Many2many relationships, m2m_id_list and m2m_value_list, and the option m2m=True in the processor.process command.
This will create the file res.partner.category.csv with all the unique categories as follows:
id;name
res_partner_category.Premium;Premium
res_partner_category.Normal;Normal
res_partner_category.Bad Payer;Bad Payer
Now we can complete the person mapping. It's exactly the same as before except we have added the field category_id.
res_partner_mapping = {
'id': mapper.m2o_map('my_import_res_partner', mapper.concat('_', 'Firstname', 'Lastname', 'Birthdate')),
'name': mapper.concat(' ','Firstname','Lastname'),
'birthdate': mapper.val('Birthdate', postprocess=lambda x: datetime.strptime(x, "%d/%m/%Y").strftime("%Y-%m-%d 00:00:00")),
'parent_id/id': mapper.m2o_map('my_import_res_partner', mapper.val('Company')),
'category_id/id': mapper.m2m('res_partner_category', 'Category'),
}
processor.process(res_partner_mapping, 'res.partner.csv', {})
The important thing here is to use the same XML_ID pattern ('res_partner_category' + the Category value) for both res_partner_mapping['category_id/id'] = mapper.m2m and partner_category_mapping['id'] = mapper.m2m_id_list.
The mapping res_partner_mapping will create a new file res.partner.csv like this:
id;parent_id/id;name;birthdate;category_id/id
my_import_res_partner.John_Doe_31/12/1980;my_import_res_partner.The World Company;John Doe;1980-12-31 00:00:00;res_partner_category.Premium
my_import_res_partner.David_Smith_28/02/1985;my_import_res_partner.The Famous Company;David Smith;1985-02-28 00:00:00;res_partner_category.Normal,res_partner_category.Bad Payer
Notice the column category_id/id that contains a comma separated list of XML_IDs of partner categories.
Finally we create the load script by adding this line:
processor.write_to_file("res_partner.sh", python_exe='', path='')
It creates the script res_partner.sh looking like this:
odoo_import_thread.py -c conf/connection.conf --file=res.partner.category.csv --model=res.partner.category --groupby= --ignore= --sep=";" --context="{}"
odoo_import_thread.py -c conf/connection.conf --fail --file=res.partner.category.csv --model=res.partner.category --ignore= --sep=";" --context="{}"
odoo_import_thread.py -c conf/connection.conf --file=res.partner.csv --model=res.partner --groupby= --ignore= --sep=";" --context="{}"
odoo_import_thread.py -c conf/connection.conf --fail --file=res.partner.csv --model=res.partner --ignore= --sep=";" --context="{}"
The complete python script:
# -*- coding: utf-8 -*-
from odoo_csv_tools.lib import mapper
from odoo_csv_tools.lib.transform import Processor
from datetime import datetime # used to change the format of datetime fields
processor = Processor('client_file.csv', delimiter=';')
partner_category_mapping = {
'id': mapper.m2m_id_list('res_partner_category', 'Category'),
'name': mapper.m2m_value_list('Category'),
}
processor.process(partner_category_mapping, 'res.partner.category.csv', {}, m2m=True)
res_partner_mapping = {
'id': mapper.m2o_map('my_import_res_partner', mapper.concat('_', 'Firstname', 'Lastname', 'Birthdate')),
'name': mapper.concat(' ','Firstname','Lastname'),
'birthdate': mapper.val('Birthdate', postprocess=lambda x: datetime.strptime(x, "%d/%m/%Y").strftime("%Y-%m-%d 00:00:00")),
'parent_id/id': mapper.m2o_map('my_import_res_partner', mapper.val('Company')),
'category_id/id': mapper.m2m('res_partner_category', 'Category'),
}
processor.process(res_partner_mapping, 'res.partner.csv', {})
processor.write_to_file("res_partner.sh", python_exe='', path='')
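The `postprocess` function used for `birthdate` can be tried on its own before running the whole transformation. The following is a minimal sketch, assuming the client file stores dates as `DD/MM/YYYY` with a four-digit year; the helper name `to_odoo_datetime` is hypothetical and not part of the library.

```python
from datetime import datetime


def to_odoo_datetime(value):
    """Convert a 'DD/MM/YYYY' string into 'YYYY-MM-DD 00:00:00'."""
    # %Y expects a four-digit year; %y would only accept two digits.
    parsed = datetime.strptime(value, "%d/%m/%Y")
    return parsed.strftime("%Y-%m-%d 00:00:00")


print(to_odoo_datetime("31/12/1980"))
```

Testing the conversion this way makes a format mismatch visible immediately, instead of at load time.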
Note: it's possible to import many2many relationships with another file structure. Refer to the context key update_many2many to learn how.
The load order in the shell script depends on the order of the processor.process
instructions in the transformation script.
This example script:
processor = Processor('client_file.csv', delimiter=';')
res_partner_company_mapping = {
}
res_partner_mapping = {
}
processor.process(res_partner_company_mapping, 'res.partner.company.csv', {}, 'set')
processor.process(res_partner_mapping, 'res.partner.csv', {})
processor.write_to_file("res_partner.sh", python_exe='', path='')
will create the load script res_partner.sh with:
1- the load of res.partner.company.csv, because it's the first invoked transformation,
2- the load of res.partner.csv.
If you want to import the persons first (which is a bad idea here), just swap the two processor.process commands.
processor = Processor('client_file.csv', delimiter=';')
res_partner_company_mapping = {
}
res_partner_mapping = {
}
processor.process(res_partner_mapping, 'res.partner.csv', {})
processor.process(res_partner_company_mapping, 'res.partner.company.csv', {}, 'set')
processor.write_to_file("res_partner.sh", python_exe='', path='')
Now if you want to create two separate shell scripts, one for the companies and another for the persons, you need to create a new Processor.
#For the 1st load script
processor = Processor('client_file.csv', delimiter=';')
res_partner_company_mapping = {
}
processor.process(res_partner_company_mapping, 'res.partner.company.csv', {}, 'set')
processor.write_to_file("res_partner_company.sh", python_exe='', path='')
#For the 2nd load script
processor = Processor('client_file.csv', delimiter=';')
res_partner_mapping = {
}
processor.process(res_partner_mapping, 'res.partner.csv', {})
processor.write_to_file("res_partner.sh", python_exe='', path='')
This will create the script res_partner_company.sh that loads only res.partner.company.csv, and the script res_partner.sh that loads only res.partner.csv. It can be useful if you need to do something between importing the companies and the persons (e.g. run another load or make RPC calls). The drawback is that the client file is read twice.
You can get the value of columns in the client file with several methods defined in the mapper object. Take a look at lib/mapper.py to get an up-to-date list of methods. Here are the most commonly used.
Use mapper.const to always assign the same value to a field.
Client File | Mapper | Import File |
my_column my_value1 my_value2 |
{ 'my_field': mapper.const('forced_value'), } |
my_field forced_value forced_value |
Example: setting a company missing in the client file:
fields_mapping = {
...
'company_id/id': mapper.const('base.main_company'),
...
}
mapper.val takes the value of the column. Use it for a Char or Text field.
Client File | Mapper | Import File |
my_column my_value1 my_value2 |
{ 'my_field': mapper.val('my_column'), } |
my_field my_value1 my_value2 |
mapper.map_val takes the value from a dictionary where the key is the value of the column.
Client File | Mapper | Import File |
my_column key1 key2 |
mapping = { 'key1': 'value1', 'key2': 'value2' } { 'my_field': mapper.map_val('my_column', mapping), } |
my_field value1 value2 |
Example: setting a country.
Country_column;
BE;
FR;
country_map = {
'BE': 'base.be',
'FR': 'base.fr',
}
fields_mapping = {
...
'country_id/id': mapper.map_val('Country_column', country_map),
...
}
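The lookup described above can be pictured with a plain dictionary. This is only a sketch of the idea, not the library's code; the function name `map_value` is hypothetical, and the fallback to `default` for unknown keys is an assumption based on the `default=''` parameter in the signature.

```python
def map_value(value, mapping, default=''):
    """Return mapping[value], or default when the key is unknown."""
    return mapping.get(value, default)


country_map = {'BE': 'base.be', 'FR': 'base.fr'}

for code in ['BE', 'FR', 'XX']:
    # 'XX' is not in the mapping, so the default is used.
    print(repr(code), '->', repr(map_value(code, country_map)))
```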
mapper.num takes the numeric value of the column and replaces the comma decimal separator with a dot. Use it for Integer or Float fields.
Client File | Mapper | Import File |
my_column 01; 2,3; |
{ 'my_field': mapper.num('my_column'), } |
my_field 01 2.3 |
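The conversion shown in the table can be sketched in a few lines of plain Python. The function `to_number_string` is a hypothetical stand-in written for illustration; returning `default` for an empty cell is an assumption based on the `default='0.0'` parameter.

```python
def to_number_string(value, default='0.0'):
    """Replace the comma decimal separator with a dot."""
    value = value.strip()
    if not value:
        # Assumption: an empty cell falls back to the default.
        return default
    return value.replace(',', '.')


for raw in ['01', '2,3', '']:
    print(repr(raw), '->', repr(to_number_string(raw)))
```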
A boolean field in Odoo is always imported as 1 or 0. With mapper.bool_val, true_vals and false_vals are used to map the original values to 1 and 0. If the value in the client file is in neither true_vals nor false_vals, it is considered TRUE if a value is present and FALSE if the column is empty.
Client File | Mapper | Import File |
my_column Yes No (empty) something else |
{ 'my_field': mapper.bool_val('my_column', ['Yes'], ['No']), } |
my_field 1 0 0 1 |
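The decision rule described for boolean values can be written as a small function. This is a sketch of the rule as stated in the text, under the assumption that it is applied in this order; `to_bool_flag` is a hypothetical name, not a library function.

```python
def to_bool_flag(value, true_vals=(), false_vals=()):
    """Map a raw cell to '1' or '0' following the rule described above."""
    if value in true_vals:
        return '1'
    if value in false_vals:
        return '0'
    # Neither list matches: any non-empty value counts as true.
    return '1' if value else '0'


cells = ['Yes', 'No', '', 'something else']
print([to_bool_flag(cell, ['Yes'], ['No']) for cell in cells])
```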
Use mapper.binary to convert a binary file to base64 and put the result in a binary field. It is typically used to import images and attachments.
Assuming the images pict_1.png and pict_2.png are located in /home/Pictures:
Client File | Mapper | Import File |
my_column pict_1.png pict_2.png |
{ 'my_field': mapper.binary('my_column', '/home/Pictures/'), } |
my_field kllkxqlxsqnxqxhHJVJSFSVSJDYVDV...... KKjdsndb77573çinjhffxxcdkllkxq...... |
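To see what ends up in the binary field, the encoding step can be reproduced with the standard library. This sketch assumes the file name is simply appended to the path prefix; `file_to_base64` is a hypothetical helper, and a throwaway file stands in for a real picture.

```python
import base64
import os
import tempfile


def file_to_base64(filename, path_prefix):
    """Read the file under path_prefix and return its content as base64 text."""
    with open(os.path.join(path_prefix, filename), 'rb') as handle:
        return base64.b64encode(handle.read()).decode('ascii')


# Demonstration with a temporary file instead of a real image.
with tempfile.TemporaryDirectory() as folder:
    with open(os.path.join(folder, 'pict_1.png'), 'wb') as handle:
        handle.write(b'fake image bytes')
    encoded = file_to_base64('pict_1.png', folder)

print(encoded)
```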
mapper.concat concatenates the values of one or several columns, separated with separator.
Client File | Mapper | Import File |
my_column1;my_column2 val1;val2 val3;val4 |
{ 'my_field': mapper.concat('_','my_column1','my_column2'), } |
my_field val1_val2; val3_val4; |
Use mapper.m2o to create an XML_ID where the module is PREFIX and the name is the value of the column.
Client File | Mapper | Import File |
my_column my_value1 my_value2 |
{ 'my_field/id': mapper.m2o('my_import_my_model','my_column'), } |
my_field/id my_import_my_model.my_value1 my_import_my_model.my_value2 |
Notice the field name suffixed with /id in the mapping dictionary.
Use mapper.m2o_map to create an XML_ID where the module is PREFIX and the name is the result of mapper. It is often used with mapper.concat to create XML_IDs as the concatenation of several columns.
Client File | Mapper | Import File |
my_column1;my_column2 val1;val2 val3;val4 |
{ 'my_field/id': mapper.m2o_map('my_import_my_model',mapper.concat('_','my_column1','my_column2')), } |
my_field/id my_import_my_model.val1_val2; my_import_my_model.val3_val4; |
Notice the field name suffixed with /id in the mapping dictionary.
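The way an XML_ID is assembled from a prefix and concatenated columns can be sketched without the library. Both helpers below (`concat_columns`, `to_xml_id`) are hypothetical names used for illustration only.

```python
def concat_columns(line, separator, *fields):
    """Join the values of the given columns with separator."""
    return separator.join(line[field] for field in fields)


def to_xml_id(prefix, name):
    """Build an XML_ID of the form PREFIX.name."""
    return prefix + '.' + name


line = {'my_column1': 'val1', 'my_column2': 'val2'}
name = concat_columns(line, '_', 'my_column1', 'my_column2')
print(to_xml_id('my_import_my_model', name))  # my_import_my_model.val1_val2
```

Building the identifier from several columns is useful when no single column identifies a record uniquely.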
mapper.m2m returns a comma separated list of the values of one or several columns, each value being prefixed by PREFIX. Use it to build a list of XML_IDs for a Many2many field.
Client File | Mapper | Import File |
my_column1;my_column2 val1,val2;val3 |
{ 'my_field/id': mapper.m2m('my_import_my_model', 'my_column1', 'my_column2'), } |
my_field/id; my_import_my_model.val1,my_import_my_model.val2,my_import_my_model.val3 |
Notice the field name suffixed with /id in the mapping dictionary.
Notice val1, val2 of my_column1 are handled the same way as val3 in my_column2.
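The result in the table can be reproduced with a short sketch. It assumes that each cell may itself contain several comma separated values, as in the example; `to_xml_id_list` is a hypothetical helper, not the library's implementation.

```python
def to_xml_id_list(line, prefix, *fields):
    """Return a comma separated list of PREFIX.value for all given columns."""
    names = []
    for field in fields:
        # A single cell may hold several comma separated values.
        names.extend(v.strip() for v in line[field].split(',') if v.strip())
    return ','.join(prefix + '.' + name for name in names)


line = {'my_column1': 'val1,val2', 'my_column2': 'val3'}
print(to_xml_id_list(line, 'my_import_my_model', 'my_column1', 'my_column2'))
```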
mapper.m2m_id_list builds one record (line) per distinct value of a comma separated list of data inside a column, and prefixes the value with PREFIX.
Client File | Mapper | Import File (1) |
my_column1 val1,val2 val1,val3 val4 |
{ 'my_field/id': mapper.m2m_id_list('my_import_my_model', 'my_column1'), } |
my_field/id my_import_my_model.val1 my_import_my_model.val2 my_import_my_model.val3 my_import_my_model.val4 |
(1) To use in conjunction with the option m2m
while invoking the transformation (see Many2many Relationships).
processor.process(mapping, 'output.file.csv', {}, m2m=True)
Without this option, the import file would look like this:
my_field/id
[my_import_my_model.val1, my_import_my_model.val2]
[my_import_my_model.val1, my_import_my_model.val3]
[my_import_my_model.val4]
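The effect of combining this mapper with m2m=True, one output line per distinct value, can be illustrated as follows. This is a plain-Python sketch of the expected result, not the library's code; `distinct_xml_ids` is a hypothetical name.

```python
def distinct_xml_ids(cells, prefix):
    """Return one PREFIX.value per distinct value, in order of first appearance."""
    seen = []
    for cell in cells:
        for value in cell.split(','):
            value = value.strip()
            xml_id = prefix + '.' + value
            if value and xml_id not in seen:
                seen.append(xml_id)
    return seen


column = ['val1,val2', 'val1,val3', 'val4']
for xml_id in distinct_xml_ids(column, 'my_import_my_model'):
    print(xml_id)
```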
mapper.m2m_value_list builds one record (line) per distinct value of a comma separated list of data inside a column.
Client File | Mapper | Import File (1) |
my_column1 val1,val2 val1,val3 val4 |
{ 'my_field': mapper.m2m_value_list('my_column1'), } |
my_field val1 val2 val3 val4 |
(1) To use in conjunction with the option m2m
while invoking the transformation (see Many2many Relationships).
processor.process(mapping, 'output.file.csv', {}, m2m=True)
Without this option, the import file would look like this:
my_field
[val1, val2]
[val1, val3]
[val4]
Sometimes the built-in mappers do not meet your needs, even with a postprocess function. In this case, you can map a field to a tailor-made function.
This function takes one argument: an entire line of the client file, as a dictionary whose keys are the columns.
def my_field_mapper(line):
    if line['column1'] == 'a_value':
        return 'something'
    return 'something_else'
fields_mapping = {
...
'my_field': my_field_mapper,
...
}
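Because a custom mapper is an ordinary function, it can be exercised with sample lines before it is plugged into a mapping. In the sketch below, the columns `Mobile` and `Phone` and the function `phone_mapper` are hypothetical, chosen only to illustrate the pattern.

```python
def phone_mapper(line):
    """Return the mobile number when present, else the landline."""
    return line['Mobile'].strip() or line['Phone'].strip()


sample_lines = [
    {'Mobile': '0470 11 22 33', 'Phone': '02 123 45 67'},
    {'Mobile': '', 'Phone': '02 765 43 21'},
]

for line in sample_lines:
    print(phone_mapper(line))
```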
Sometimes it's useful to change the client file according to some needs (e.g. removing useless lines, adding columns with data, ...). You can do that with a preprocessor function when building the Processor. See this example that does... nothing.
def myPreprocessor(header, data):
    return header, data
processor = Processor('client_file.csv', delimiter=';', preprocess=myPreprocessor)
The preprocessor function takes two arguments: header is a list of all the columns, and data is a list of lines, each line being a list of values in the same order as header. Let's look at two simple examples of preprocessing.
Here we add the column NEW_COLUMN
filled with the value NEW_VALUE
for all the records.
def myPreprocessor(header, data):
    header.append('NEW_COLUMN')
    for line in data:
        # Each line is a list of values; add the new value at the end.
        line.append('NEW_VALUE')
    return header, data
Say we want to remove all the lines having the column Firstname
with the value John
.
def myPreprocessor(header, data):
    data_new = []
    for values in data:
        line = dict(zip(header, values))
        if line['Firstname'] != 'John':
            data_new.append(values)
    return header, data_new
Note: The client file is not physically changed. Only the buffer used by the Processor is changed in memory. Anyway the new columns are usable in the fields mapping dictionary, and the removed lines are not processed.
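A preprocessor can be checked on a small in-memory sample before it is attached to a Processor. This sketch assumes that each line in data is a list of values aligned with header; the function `remove_johns` and the sample values are hypothetical.

```python
def remove_johns(header, data):
    """Keep only the lines whose Firstname is not 'John'."""
    position = header.index('Firstname')
    return header, [line for line in data if line[position] != 'John']


header = ['Firstname', 'Lastname']
data = [['John', 'Doe'], ['David', 'Smith']]

new_header, new_data = remove_johns(header, data)
print(new_header, new_data)
```

The original data list is left untouched here, since the function returns a new list rather than modifying its input.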
It is possible to update records by their database ID instead of their XML_ID. The field .id designates a database ID. However, the script odoo_import_thread.py also requires an id field. So the trick is to build an empty id field and to map .id from the client file.
my_mapping = {
'id': mapper.const(''),
'.id': mapper.val('id_column'),
...
}
A complete import project (transformation and load) is available in the repo odoo_import_example. It demonstrates use cases such as:
- importing partners with multiple categories
- importing products and variants with their suppliers
- importing messages
Note: The project was done in Odoo 11. Some models may differ in other versions.
Importing related fields (non readonly) causes cascade updates that drastically increase the import run time. Consider the following example.
Importing my_partner_2000 triggers an update of my_partner_1, because parent_id is a related, non-readonly field. Updating my_partner_1 in turn triggers the update of all its children. When the next record, my_partner_2001, is imported, the same scenario happens: my_partner_1 is updated a second time, which triggers the update of all its children again. But this time the number of children has increased by one, my_partner_2000. So each time a new partner is created, the number of updates behind the scenes increases.
You can spot this scenario by looking at the transaction time, which increases exponentially.
The solution is to use the parameter --ignore. In this case, you should import with the option --ignore=parent_id/id.
Suppose your Odoo CSV file contains 100 records, but after the load you count fewer than 100 new records.
Possible cause:
- One or more records have the same XML_ID. The first record with that XML_ID was inserted, while the other records updated the first one instead of creating new ones.
To check the uniqueness of what you think is a record identifier in the client file:
- create a new CSV file with one column containing the supposed identifier,
- check the uniqueness of the values with a command such as this one, in bash:
sort my_xml_ids.csv | uniq -c | sort -nr
- look for an output line beginning with a number > 1.
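The same check can be done in Python directly on the import file, without extracting the column first. This is a sketch under the assumption that the identifier is in a column named `id` and that the separator is `;`; the function `duplicated_ids` and the sample data are hypothetical.

```python
import csv
import io
from collections import Counter


def duplicated_ids(csv_text, column='id', delimiter=';'):
    """Return {identifier: count} for identifiers that appear more than once."""
    reader = csv.DictReader(io.StringIO(csv_text), delimiter=delimiter)
    counts = Counter(row[column] for row in reader)
    return {xml_id: n for xml_id, n in counts.items() if n > 1}


sample = "id;name\nmy_import.a;A\nmy_import.b;B\nmy_import.a;A again\n"
print(duplicated_ids(sample))  # {'my_import.a': 2}
```

For a real file, the text could be read with `open(path, newline='').read()` and passed to the same function.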
When you run an import, the current company is the company of the user defined by the parameters login
and uid
in the configuration file.
As a rule of thumb, it's advised to separate the imports by company.
Assuming you have to import data for Company_A and Company_B:
- build import files with data of Company_A
- build import files with data of Company_B
- set the import user as a member of Company_A
- import the files with the data of Company_A
- change the import user company to Company_B
- import the files with the data of Company_B
Of course, the translations can be imported with data files tailored for the translation models. But it's a lot easier with the lang key set in the context. Let's take an example with the products.
First, import the source terms. This can be done while importing legacy data. Say we have the following CSV file product_template.csv:
id;name;price
my_module.product_wallet;Wallet;10.0
my_module.product_bicyle;Bicycle;400.0
Import the products with the following command:
odoo_import_thread.py -c connection.conf --file=product_template.csv
The option --model=product.template
is not necessary since the CSV file is correctly named according to the model.
Then, build another CSV file for the translations, say product_template_FR.csv for the French translations. This file contains only the translated terms for the products.
id;name
my_module.product_wallet;Portefeuille
my_module.product_bicyle;Bicyclette
Import this file by setting the language in the context.
odoo_import_thread.py -c connection.conf --file=product_template_FR.csv --model=product.template --context="{'lang': 'fr_FR'}"
And it's done.
Actually, this does not import the translations explicitly. What happens is an update of the product names in the fr_FR language, which is equivalent and makes the translation file more convenient to build: first because it is based on the legacy file, and above all because it lets the ORM manage the translation process.
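Since the --context value needs to be a valid Python dict, it can be worth checking the string before launching an import. The sketch below uses `ast.literal_eval` from the standard library; how the script itself parses the option is not shown here and is an assumption, and `parse_context` is a hypothetical helper.

```python
import ast


def parse_context(text):
    """Parse a context string and make sure it is a Python dict literal."""
    context = ast.literal_eval(text)
    if not isinstance(context, dict):
        raise ValueError('the context must be a dict')
    return context


print(parse_context("{'lang': 'fr_FR'}"))

try:
    # The language code is not quoted, so this is not a valid literal.
    parse_context("{'lang': fr_FR}")
except ValueError as error:
    print('invalid context:', error)
```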
This is an interesting use case of one2many relationship. Let's take a look at the simplified relationship model of account.move
and account.move.line
.
As a rule of thumb, avoid importing one2many relationships, because the inverse many2one relation always exists. So one strategy could be to first import all the account.move in one CSV file, then all the account.move.line in another CSV file. But here this strategy doesn't work, because there is a balance check on the account.move. An account.move must have at least two account.move.line, where the credit amount of one balances the debit amount of the other.
That means the import of the first account.move.line will fail because its amount is not balanced yet. The import of the second account.move.line will also fail because the first one is missing, and so on.
One possible solution is to use a context with 'check_move_validity': False
. In that case you can import first all the account.move, then all the account.move.line. As there is no more balance check, you must be sure that all the account.move are well balanced.
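When the balance check is disabled, the balance can be verified on the transformed lines before the load. This is a minimal sketch, assuming the lines are available as dictionaries with the columns `move_id/id`, `debit` and `credit`; the function `unbalanced_moves` and the XML_IDs are hypothetical.

```python
from collections import defaultdict


def unbalanced_moves(lines, precision=2):
    """Return {move: debit - credit} for the moves whose lines do not balance."""
    totals = defaultdict(float)
    for line in lines:
        totals[line['move_id/id']] += float(line['debit']) - float(line['credit'])
    return {
        move: round(total, precision)
        for move, total in totals.items()
        if round(total, precision) != 0
    }


lines = [
    {'move_id/id': 'my_import.move_1', 'debit': '100.0', 'credit': '0.0'},
    {'move_id/id': 'my_import.move_1', 'debit': '0.0', 'credit': '100.0'},
    {'move_id/id': 'my_import.move_2', 'debit': '50.0', 'credit': '0.0'},
    {'move_id/id': 'my_import.move_2', 'debit': '0.0', 'credit': '40.0'},
]
print(unbalanced_moves(lines))  # {'my_import.move_2': 10.0}
```

An empty result means every move in the sample is balanced.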
Another solution is to import the account.move and the account.move.line together. To do this, you can build a mixed CSV file like this, say account_move.csv
.
Then import it with the --o2m
option.
odoo_import_thread.py -c connection.conf --file=account_move.csv --model=account.move --o2m
The option --o2m prevents the batch from being split while importing the same account.move. The account.move is imported together with its account.move.line, so the balance check can be performed at the end of the transaction.
The drawback of this method is that you cannot set an XML_ID on an account.move.line. See --o2m for more details.
The Odoo CSV Import Export library provides the script odoo_export_thread.py
to export data from Odoo. This script has several options. Type the command odoo_export_thread.py --help
to get the usage.
usage: odoo_export_thread.py [-h] -c CONFIG --file FILENAME --model MODEL
--field FIELDS [--domain DOMAIN]
[--worker WORKER] [--size BATCH_SIZE]
[-s SEPARATOR] [--context CONTEXT]
Import data in batch and in parallel
optional arguments:
-h, --help show this help message and exit
-c CONFIG, --config CONFIG
Configuration File that contains connection parameters
--file FILENAME Output File
--model MODEL Model to Export
--field FIELDS Fields to Export
--domain DOMAIN Filter
--worker WORKER Number of simultaneous connection
--size BATCH_SIZE Number of line to import per connection
-s SEPARATOR, --sep SEPARATOR
CSV separator
--context CONTEXT context that will be passed to the load function, need
to be a valid python dict
(To be continued...)