
Provide a "dry-run" functionality for bulk import/update #13778

Closed
peteeckel opened this issue Sep 15, 2023 · 8 comments
Labels
type: feature Introduction of new functionality to the application

Comments

@peteeckel
Contributor

NetBox version

v3.6.1

Feature type

New functionality

Proposed functionality

Based on the brief discussion in #13773 I suggest implementing a "dry-run" functionality for bulk import/update data.

Use case

This FR needs to be seen in conjunction with #13775 and #13777. Importing or updating data in bulk can be complex and involve large amounts of data, and certain errors such as misspelled or mis-cased headers currently result in data being silently ignored.

In the case of an import, columns with invalid header names are currently silently ignored while the remaining columns are imported, which then requires a subsequent bulk update run with a new data set that includes object IDs. It would be helpful to validate the input data and check for this kind of error before the import is actually executed, so errors can be fixed up front.
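A dry-run check of this kind could be as simple as comparing the CSV header row against the set of importable field names before any rows are processed. A minimal sketch, assuming a hypothetical field list and helper name (this is not NetBox's actual API):

```python
import csv
import io

# Hypothetical set of importable field names for the IPAddress model.
KNOWN_FIELDS = {"id", "address", "status", "dns_name", "description", "tenant"}

def dry_run_check(csv_text, known_fields=KNOWN_FIELDS):
    """Return a list of warnings for unknown CSV columns, without importing anything."""
    reader = csv.reader(io.StringIO(csv_text))
    headers = next(reader, [])
    return [
        f'Field "{h}" is unknown and will not be imported'
        for h in headers
        if h not in known_fields
    ]

# A misspelled "dnsname" header is flagged; "address" and "status" pass.
warnings = dry_run_check("address,status,dnsname\n10.0.0.1/16,active,node1.zone1.example.com\n")
```

If the returned list is non-empty, the import form could display the warnings and stop before touching the database.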

Database changes

None

External dependencies

None

@peteeckel peteeckel added the type: feature Introduction of new functionality to the application label Sep 15, 2023
@peteeckel peteeckel changed the title Provide a "dry-run" functionality for bulk update/import Provide a "dry-run" functionality for bulk import/update Sep 15, 2023
@jeremystretch
Member

I don't see how this would really help. After completing a "dry run" import, you would presumably only be able to see what attributes are listed in the resulting import table; any others that happen not to be displayed in the table cannot be verified.

Additionally, the mechanism by which imported objects are displayed would not permit this behavior. After objects are imported, NetBox redirects the user to a list of objects filtered by request ID. This would not be feasible if the imported objects don't actually exist.

@peteeckel
Contributor Author

peteeckel commented Sep 15, 2023

In combination with the features suggested in #13775 and especially #13777 a dry-run would give the user a list of ignored columns that would not be used in the import without actually performing the import.

Without the "dry-run" the import is performed, but lacks the affected columns. Currently this might not even be noticed. Even with a notice that these columns haven't been imported, fixing the data would at the very least require a bulk update, which needs an additional ID column - which in turn requires exporting the records and generating a new import set.

With a dry-run the user would be aware that there are problematic columns and could fix them before importing, thus avoiding the need for the subsequent update.

@jeremystretch
Member

a dry-run would give the user a list of ignored columns that would not be used in the import without actually performing the import.

How? Maybe an example would help.

@peteeckel
Contributor Author

peteeckel commented Sep 15, 2023

Let's assume someone is trying to import the following data set:

address,status,dnsname
10.0.0.1/16,active,node1.zone1.example.com
10.0.0.2/16,active,node2.zone1.example.com
10.0.0.3/16,active,node3.zone1.example.com
10.0.0.4/16,active,node4.zone1.example.com
10.0.0.5/16,active,node5.zone1.example.com
10.0.0.6/16,active,node6.zone1.example.com
10.0.0.7/16,active,node7.zone1.example.com
[...]
10.0.0.254/16,active,node254.zone1.example.com
10.0.1.1/16,active,node1.zone2.example.com
10.0.1.2/16,active,node2.zone2.example.com
10.0.1.3/16,active,node3.zone2.example.com
10.0.1.4/16,active,node4.zone2.example.com
10.0.1.5/16,active,node5.zone2.example.com
10.0.1.6/16,active,node6.zone2.example.com
10.0.1.7/16,active,node7.zone2.example.com
[...]
10.0.1.254/16,active,node254.zone2.example.com
[...]
10.0.16.1/16,active,node1.zone16.example.com
10.0.16.2/16,active,node2.zone16.example.com
10.0.16.3/16,active,node3.zone16.example.com
10.0.16.4/16,active,node4.zone16.example.com
10.0.16.5/16,active,node5.zone16.example.com
10.0.16.6/16,active,node6.zone16.example.com
10.0.16.7/16,active,node7.zone16.example.com
[...]
10.0.16.254/16,active,node254.zone16.example.com

(and imagine the data being less schematic to add a bit of complexity).

What currently happens, since the dns_name field is optional, is that all data is imported without the dns_name value, because the column header is not spelled correctly.

Now, since the field is missing for all records, the only way to fix it is a bulk update. For that, the user needs the IDs of the IPAddress objects in question, so the CSV data needs to be amended:

id,address,status,dns_name
1,10.0.0.1/16,active,node1.zone1.example.com
2,10.0.0.2/16,active,node2.zone1.example.com
3,10.0.0.3/16,active,node3.zone1.example.com
4,10.0.0.4/16,active,node4.zone1.example.com
5,10.0.0.5/16,active,node5.zone1.example.com
6,10.0.0.6/16,active,node6.zone1.example.com
7,10.0.0.7/16,active,node7.zone1.example.com
[...]
8,10.0.0.254/16,active,node254.zone1.example.com
9,10.0.1.1/16,active,node1.zone2.example.com
10,10.0.1.2/16,active,node2.zone2.example.com
11,10.0.1.3/16,active,node3.zone2.example.com
12,10.0.1.4/16,active,node4.zone2.example.com
13,10.0.1.5/16,active,node5.zone2.example.com
14,10.0.1.6/16,active,node6.zone2.example.com
15,10.0.1.7/16,active,node7.zone2.example.com
[...]
16,10.0.1.254/16,active,node254.zone2.example.com
[...]
17,10.0.16.1/16,active,node1.zone16.example.com
18,10.0.16.2/16,active,node2.zone16.example.com
19,10.0.16.3/16,active,node3.zone16.example.com
20,10.0.16.4/16,active,node4.zone16.example.com
21,10.0.16.5/16,active,node5.zone16.example.com
22,10.0.16.6/16,active,node6.zone16.example.com
23,10.0.16.7/16,active,node7.zone16.example.com
[...]
24,10.0.16.254/16,active,node254.zone16.example.com
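Assembling that amended data set by hand is tedious: the exported IDs have to be joined back onto the original rows, e.g. keyed on the address column. A sketch of that workaround, assuming the export contains at least `id` and `address` columns (the function name is hypothetical):

```python
import csv
import io

def amend_with_ids(original_csv, exported_csv, key="address"):
    """Prepend the exported object IDs to the original rows, matched on `key`."""
    # Build a lookup table from the exported data: key column -> object ID.
    ids = {
        row[key]: row["id"]
        for row in csv.DictReader(io.StringIO(exported_csv))
    }
    reader = csv.DictReader(io.StringIO(original_csv))
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=["id"] + reader.fieldnames)
    writer.writeheader()
    for row in reader:
        writer.writerow({"id": ids[row[key]], **row})
    return out.getvalue()

# The corrected data (header fixed to "dns_name") merged with the exported IDs:
original = "address,status,dns_name\n10.0.0.1/16,active,node1.zone1.example.com\n"
exported = "id,address\n1,10.0.0.1/16\n"
amended = amend_with_ids(original, exported)
```

This is exactly the extra round trip the dry run would avoid: with up-front validation, the header typo is caught before any objects exist.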

A dry run would have returned a message like 'Field "dnsname" is unknown and will not be imported' (provided #13777 gets implemented) without actually importing anything, thereby giving the user the chance to fix the issue by correcting the header field.

@jeremystretch
Member

Ok, I think I understand the concern better, thanks. I believe this would be addressed by #11617, which seeks to raise a validation error on the presence of an unrecognized column header.

In general I don't like the concept of dry runs because in the best case scenario, they require wasting time, and in the worst the user forgets to utilize them in the first place.

@peteeckel
Contributor Author

Absolutely agreed, but in #13773 @pv2b answered that silently ignoring this kind of error was a feature and not a bug, and suggested the dry-run feature as a way to solve the issue. I'd prefer the error message, combined with not accepting erroneous data, as well.

@jeremystretch
Member

in #13773 @pv2b answered that silently ignoring this kind of error was a feature and not a bug

I'll admit it's a bit subjective, but I'd prefer to treat it as a bug per the principle of least astonishment.

@peteeckel
Contributor Author

I'll admit it's a bit subjective, but I'd prefer to treat it as a bug per the principle of least astonishment.

Since I was quite astonished when I stumbled across this behaviour today, I'm totally with you on that. Especially since in many cases you won't even notice that something is missing, e.g. when the misspelled column is not among the columns displayed in the table that appears after the import.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Dec 15, 2023