docs: Edit spellling errors in Read.md files #8264 (#8279)
* fix:corrected spelling errors
* Update explain-packaging-data.md

Co-authored-by: Stéphane Gigandet <stephane@openfoodfacts.org>
oyenuga17 and stephanegigandet authored May 23, 2023
1 parent f0a581c commit 7953dbc
Showing 4 changed files with 138 additions and 88 deletions.
3 changes: 1 addition & 2 deletions docs/dev/explain-packaging-data.md
@@ -160,9 +160,8 @@ Changing the "packagings" value will not change the "packaging_text_[language co

For a single product, we might get partial packaging data from different sources that we map to similar but distinct shapes, like "bottle", "jar" and "jug". It may be difficult to determine if the data concerns a single packaging component, or different components.


### Products with packaging changes

-## Ressources
+## Resources

- 2020 project to start structuring packaging data: https://wiki.openfoodfacts.org/Packagings_data_structure
5 changes: 2 additions & 3 deletions docs/dev/explain-taxonomy-build-cache.md
@@ -3,6 +3,7 @@
Taxonomies have a significant impact on OFF processing and automated test results so need to be rebuilt before running any tests. However, this process takes some time, so the built taxonomy files are cached in a GitHub repository so that they only need to be rebuilt when there is a genuine change.

# How it works

A hash is calculated for all of the source files used to build a particular taxonomy and GitHub is then checked to see if a cache already exists for that hash.

If no cached build is found then the taxonomy is rebuilt and cached locally.
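
The hash-then-look-up flow described above can be sketched as follows. This is only an illustration in Node.js, not the project's code (which is written in Perl): the function names, the choice of SHA-256, and the in-memory `Map` standing in for the GitHub-hosted cache are all assumptions.

```js
const crypto = require("crypto");

// Hypothetical helper: fold the contents of every source file into one hash.
// `sources` maps a file name to its contents. Names are sorted so the result
// does not depend on the order in which the files are listed.
function taxonomySourceHash(sources) {
  const hash = crypto.createHash("sha256");
  for (const name of Object.keys(sources).sort()) {
    hash.update(name);
    hash.update("\0"); // separator so "ab" + "c" differs from "a" + "bc"
    hash.update(sources[name]);
    hash.update("\0");
  }
  return hash.digest("hex");
}

// Hypothetical lookup: `cache` is a Map standing in for the store of built
// taxonomies, and `rebuild` is whatever function performs the real build.
function getTaxonomyBuild(sources, cache, rebuild) {
  const key = taxonomySourceHash(sources);
  if (!cache.has(key)) {
    // No build exists for this exact set of sources: rebuild and remember it.
    cache.set(key, rebuild(sources));
  }
  return cache.get(key);
}
```

Because the key depends only on file contents, an unchanged taxonomy always maps to the same cache entry, and any edit to a source file produces a new key and therefore a rebuild.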
@@ -15,7 +16,7 @@ The GITHUB_TOKEN is a personal access token, created here: https://github.com/se

# Considerations

-In maintianing this code be aware of the following complications...
+In maintaining this code be aware of the following complications...

## Circular Dependencies

@@ -26,5 +27,3 @@ This is currently resolved by building the taxonomy on the fly if it is requeste
## Taxonomy Dependencies

Some taxonomies perform lookups on others, e.g. additives_classes are referenced by additives, so the referenced taxonomy needs to be built first. The build order is determined in the Config_off.pm file.
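
One way to guard a hand-maintained build order against this kind of dependency is a small check like the one below. It is a sketch only: the function name and the shape of the `dependencies` map are assumptions for illustration, and it does not reflect how Config_off.pm actually stores the order.

```js
// Hypothetical check: `buildOrder` is the list of taxonomy names in the order
// they are built, and `dependencies` maps a taxonomy to the taxonomies it
// performs lookups on. Returns one message per dependency that is missing
// from the order or placed after the taxonomy that needs it.
function findOrderViolations(buildOrder, dependencies) {
  const position = new Map(buildOrder.map((name, index) => [name, index]));
  const violations = [];
  for (const name of buildOrder) {
    for (const dep of dependencies[name] || []) {
      if (!position.has(dep) || position.get(dep) > position.get(name)) {
        violations.push(name + " needs " + dep + " to be built first");
      }
    }
  }
  return violations;
}
```

With the example from the text, `findOrderViolations(["additives_classes", "additives"], { additives: ["additives_classes"] })` returns an empty array, while swapping the two names in the order reports one violation.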


4 changes: 2 additions & 2 deletions docs/dev/how-to-learn-perl.md
@@ -2,11 +2,11 @@

Here are some introductory resources to learn Perl:

-### Quick start
+### Quick start

- [Perl Youtube Tutorial](https://www.youtube.com/watch?v=c0k9ieKky7Q) - Perl Enough to be dangerous // FULL COURSE 3 HOURS.
- [Perl - Introduction](https://www.tutorialspoint.com/perl/perl_quick_guide.htm) - Introduction to perl from tutorialspoint
-- [Impatient Perl](https://blob.perl.org/books/impatient-perl/iperl.pdf) - PDF document for people wintrested in learning perl.
+- [Impatient Perl](https://blob.perl.org/books/impatient-perl/iperl.pdf) - PDF document for people interested in learning perl.

### Official Documentation

214 changes: 133 additions & 81 deletions docs/dev/how-to-update-agribalyse-ecoscore.md
@@ -10,7 +10,7 @@ Download the AGRIBALYSE food spreadsheet from the [AGRIBALYSE](https://doc.agrib

In a backend shell run the ssconvert.sh script. This will re-generate the CSV files, including the AGRIBALYSE_version and AGRIBALYSE_summary files. The AGRIBALYSE_summary file is sorted to make for easier comparison with the previous version.

-The Ecoscore calculation just uses the data from the "Detail etape" tab, which is converted to AGRIBALYSE_vf.csv.2 by ssconvert. The Ecoscore.pm module skips the first three lines of this file to ignore headers. This should be checked for each update as the number of header lines has previously changed. Also check that none of the column headings have changed.
+The Ecoscore calculation just uses the data from the "Detail etape" tab, which is converted to AGRIBALYSE_vf.csv.2 by ssconvert. The Ecoscore.pm module skips the first three lines of this file to ignore headers. This should be checked for each update as the number of header lines has previously changed. Also check that none of the column headings have changed.
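
The header check described above could be automated along these lines. This is a rough Node.js sketch under stated assumptions: the function names are hypothetical, the CSV text is assumed to be already loaded into strings, and the header line count is passed in rather than read from Ecoscore.pm.

```js
// Split CSV text into its leading header lines and the remaining data lines.
// `headerLineCount` is the number of lines the consumer skips (three,
// according to the text above, at the time of writing).
function splitHeaderAndData(csvText, headerLineCount) {
  const lines = csvText.split(/\r?\n/);
  // Drop the empty string left behind by a trailing newline.
  if (lines.length > 0 && lines[lines.length - 1] === "") {
    lines.pop();
  }
  return {
    header: lines.slice(0, headerLineCount),
    data: lines.slice(headerLineCount),
  };
}

// Report whether the header lines of the new file differ from the previous
// version. A change in the number of header lines also shows up here, because
// a data row then occupies a header position (or the reverse).
function headerChanged(previousCsv, currentCsv, headerLineCount) {
  const previous = splitHeaderAndData(previousCsv, headerLineCount).header;
  const current = splitHeaderAndData(currentCsv, headerLineCount).header;
  return (
    previous.length !== current.length ||
    previous.some((line, index) => line !== current[index])
  );
}
```

If `headerChanged` returns `true` for the old and new versions of the converted file, both the number of skipped lines and the column headings are worth reviewing by hand before the update is applied.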

## Review and fix any changed Categories

@@ -23,86 +23,133 @@ It is also worth checking the impact the update has had on the main product data
The previous values of the Ecoscore are stored in the previous_data section under ecoscore_data. Before applying an update you will need to delete this section with the following MongoDB script:

```js
db.products.update({}, {$unset: {"ecoscore_data.previous_data":0}});
db.products.update({}, { $unset: { "ecoscore_data.previous_data": 0 } });
```

You can then use the following script from a backend bash shell to update products:

```
./update_all_products.pl --fields categories --compute-ecoscore
```

The process will set the `en:ecoscore_grade_changed` and `en:ecoscore_changed` misc_tags, which can be queried to analyse the results. For example, the following script generates a CSV file that summarises all the categories where the grade has changed:

```js
var results = db.products.aggregate([
var results = db.products
.aggregate([
{
$match: {
misc_tags: "en:ecoscore-grade-changed",
},
},
{
$match: {
misc_tags: "en:ecoscore-grade-changed"
}
}, { $group: {
_id: {en: "$ecoscore_data.agribalyse.name_en",
fr: "$ecoscore_data.agribalyse.name_fr",
code_before: "$ecoscore_data.previous_data.agribalyse.code",
code_after: "$ecoscore_data.agribalyse.code",
before: "$ecoscore_data.previous_data.grade",
after: "$ecoscore_data.grade" },
count: { $sum: 1 }
} }
]).toArray();
print('en.Name,fr.Name,Code Before,Code After,Grade Before,Grade After,Count');
$group: {
_id: {
en: "$ecoscore_data.agribalyse.name_en",
fr: "$ecoscore_data.agribalyse.name_fr",
code_before: "$ecoscore_data.previous_data.agribalyse.code",
code_after: "$ecoscore_data.agribalyse.code",
before: "$ecoscore_data.previous_data.grade",
after: "$ecoscore_data.grade",
},
count: { $sum: 1 },
},
},
])
.toArray();
print("en.Name,fr.Name,Code Before,Code After,Grade Before,Grade After,Count");
results.forEach((result) => {
// eslint-disable-next-line no-underscore-dangle
var id = result._id;
print('"' + (id.en || '').replace(/"/g,'""')
+ '","' + (id.fr || '').replace(/"/g,'""')
+ '",' + id.code_before
+ ',' + id.code_after
+ ',' + id.before
+ ',' + id.after
+ ',' + result.count);
// eslint-disable-next-line no-underscore-dangle
var id = result._id;
print(
'"' +
(id.en || "").replace(/"/g, '""') +
'","' +
(id.fr || "").replace(/"/g, '""') +
'",' +
id.code_before +
"," +
id.code_after +
"," +
id.before +
"," +
id.after +
"," +
result.count
);
});
```
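
The quoting pattern used in the script above (wrap a text field in double quotes and double any embedded quote) can be factored into a small helper. This is an illustrative sketch with hypothetical function names. Unlike the script above, it quotes every field and writes missing values as empty fields rather than the string `undefined`.

```js
// Quote one value for CSV output: wrap it in double quotes and double any
// embedded double quotes. null and undefined become an empty quoted field.
function csvField(value) {
  const text = value === undefined || value === null ? "" : String(value);
  return '"' + text.replace(/"/g, '""') + '"';
}

// Join already-ordered values into one CSV line.
function csvRow(values) {
  return values.map(csvField).join(",");
}
```

A row could then be printed with something like `print(csvRow([id.en, id.fr, id.code_before, id.code_after, id.before, id.after, result.count]))`, which keeps the escaping rule in one place.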

The following script fetches the specific products that have changed:

```js
var products = db.products.find(
var products = db.products
.find(
{
misc_tags: "en:ecoscore-grade-changed",
},
{
misc_tags: "en:ecoscore-grade-changed"
}, { _id: 1,
"ecoscore_data.agribalyse.name_en": 1,
"ecoscore_data.agribalyse.name_fr": 1,
"ecoscore_data_main.agribalyse.code": 1,
"ecoscore_data.previous_data.agribalyse.code": 1,
"ecoscore_data.agribalyse.code" : 1,
"ecoscore_data_main.grade": 1,
"ecoscore_data.previous_data.grade" : 1,
"ecoscore_data.grade" : 1,
"ecoscore_data_main.score": 1,
"ecoscore_data.previous_data.score" : 1,
"ecoscore_data.score" : 1,
"ecoscore_data_main.agribalyse.ef_total": 1,
"ecoscore_data.previous_data.agribalyse.ef_total" : 1,
"ecoscore_data.agribalyse.ef_total" : 1,
"categories_tags": 1}).toArray();

print('_id,en.Name,fr.Name,Code Before Main,Code Before Change,Code After,Grade Before Main,Grade Before Change,Grade After,Score Before Main,Score Before Change,Score After,ef_total Before Main,ef_total Before Change,ef_total After,Categories Tags');
_id: 1,
"ecoscore_data.agribalyse.name_en": 1,
"ecoscore_data.agribalyse.name_fr": 1,
"ecoscore_data_main.agribalyse.code": 1,
"ecoscore_data.previous_data.agribalyse.code": 1,
"ecoscore_data.agribalyse.code": 1,
"ecoscore_data_main.grade": 1,
"ecoscore_data.previous_data.grade": 1,
"ecoscore_data.grade": 1,
"ecoscore_data_main.score": 1,
"ecoscore_data.previous_data.score": 1,
"ecoscore_data.score": 1,
"ecoscore_data_main.agribalyse.ef_total": 1,
"ecoscore_data.previous_data.agribalyse.ef_total": 1,
"ecoscore_data.agribalyse.ef_total": 1,
categories_tags: 1,
}
)
.toArray();

print(
"_id,en.Name,fr.Name,Code Before Main,Code Before Change,Code After,Grade Before Main,Grade Before Change,Grade After,Score Before Main,Score Before Change,Score After,ef_total Before Main,ef_total Before Change,ef_total After,Categories Tags"
);
products.forEach((result) => {
var ecoscore_data_main = result.ecoscore_data_main || {};
var ecoscore_data_main_agribalyse = ecoscore_data_main.agribalyse || {};
// eslint-disable-next-line no-underscore-dangle
print( result._id
+ ',"' + (result.ecoscore_data.agribalyse.name_en || '').replace(/"/g,'""')
+ '","' + (result.ecoscore_data.agribalyse.name_fr || '').replace(/"/g,'""')
+ '",' + ecoscore_data_main_agribalyse.code
+ ',' + result.ecoscore_data.previous_data.agribalyse.code
+ ',' + result.ecoscore_data.agribalyse.code
+ ',' + ecoscore_data_main.grade
+ ',' + result.ecoscore_data.previous_data.grade
+ ',' + result.ecoscore_data.grade
+ ',' + ecoscore_data_main.score
+ ',' + result.ecoscore_data.previous_data.score
+ ',' + result.ecoscore_data.score
+ ',' + ecoscore_data_main_agribalyse.ef_total
+ ',' + result.ecoscore_data.previous_data.agribalyse.ef_total
+ ',' + result.ecoscore_data.agribalyse.ef_total
+ ',"' + result.categories_tags.join(" ") +'"'
);
var ecoscore_data_main = result.ecoscore_data_main || {};
var ecoscore_data_main_agribalyse = ecoscore_data_main.agribalyse || {};
// eslint-disable-next-line no-underscore-dangle
print(
result._id +
',"' +
(result.ecoscore_data.agribalyse.name_en || "").replace(/"/g, '""') +
'","' +
(result.ecoscore_data.agribalyse.name_fr || "").replace(/"/g, '""') +
'",' +
ecoscore_data_main_agribalyse.code +
"," +
result.ecoscore_data.previous_data.agribalyse.code +
"," +
result.ecoscore_data.agribalyse.code +
"," +
ecoscore_data_main.grade +
"," +
result.ecoscore_data.previous_data.grade +
"," +
result.ecoscore_data.grade +
"," +
ecoscore_data_main.score +
"," +
result.ecoscore_data.previous_data.score +
"," +
result.ecoscore_data.score +
"," +
ecoscore_data_main_agribalyse.ef_total +
"," +
result.ecoscore_data.previous_data.agribalyse.ef_total +
"," +
result.ecoscore_data.agribalyse.ef_total +
',"' +
result.categories_tags.join(" ") +
'"'
);
});
```

@@ -114,27 +161,32 @@ Re-run the `update_all_products` script after doing this to assess how many prod

## Add new Categories for new AGRIBALYSE codes

-For any new categories, review the AGRIBALYSE category descriptions to ensure they are concise and unambiguous sucgh that an OFF user is most likely to get a match on a type-ahead search. Give notice of the change on the taxonomies channel in Slack so that additional translations can be added for the new categories.
+For any new categories, review the AGRIBALYSE category descriptions to ensure they are concise and unambiguous such that an OFF user is most likely to get a match on a type-ahead search. Give notice of the change on the taxonomies channel in Slack so that additional translations can be added for the new categories.

It is not necessary to add a category for every single AGRIBALYSE entry. For example, AGRIBALYSE has over 80 codes for different mineral waters but these all have almost exactly the same environmental impact. In cases like this it is acceptable to pick a single representative AGRIBALYSE code as a proxy for the Category in general.

It may be worth doing a final check to see how many category combinations still do not have a match to AGRIBALYSE:

```js
var missing = db.products.aggregate([
var missing = db.products
.aggregate([
{
$match: {
"ecoscore_data.grade": null
}
}, { $group: {
$match: {
"ecoscore_data.grade": null,
},
},
{
$group: {
_id: "$categories_tags",
count: { $sum: 1 }
} }
]).toArray();
print('Category,Count');
count: { $sum: 1 },
},
},
])
.toArray();
print("Category,Count");
missing.forEach((result) => {
// eslint-disable-next-line no-underscore-dangle
var id = result._id;
print('"' + (id.join(',') || '').replace(/"/g,'""')
+ '",' + result.count);
// eslint-disable-next-line no-underscore-dangle
var id = result._id;
print('"' + (id.join(",") || "").replace(/"/g, '""') + '",' + result.count);
});
```