Data providing institutions
Julaiti Alafate edited this page Apr 6, 2020 · 14 revisions
We use the data from 7 institutions for training:
- AGSO
- JAMSTEC/JAMSTEC2
- NGA/NGA2
- NGDC
- NOAA_geodas
- SIO
- US_multi
Two institutions, JAMSTEC and NGA, provide their cruises in two subsets each (JAMSTEC/JAMSTEC2 and NGA/NGA2). We train a separate model on each subset and test the models individually, so that we can evaluate whether a model trained on one subset generalizes to the other.
We divide the data in two steps:
- Chunking: we chunk the measurements in each cruise into multiple small segments. For single-beam measurements, each segment contains around 5,000 measurements; for multi-beam measurements, each segment contains around 100,000 measurements.
- Train/test split: we then split the segments into three sets, for training, validation, and testing respectively.
The data are stored at:
- Raw data: /cryosat3/btozer/CREATE_ML_FEATURES/tsv_all
- Chunked data: /cryosat3/jalafate/bathymetry/data/chulks (the directory name contains a typo)
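The two-step division described above (chunking, then train/validation/test split) can be illustrated with a minimal Python sketch. This is not the code used to produce the chunked data: the function names, the 70/15/15 split ratio, and the random shuffling are assumptions made for illustration only. Only the approximate segment sizes come from the text above.

```python
import random
from typing import List, Sequence, Tuple

# Approximate segment sizes quoted in the text above.
SINGLE_BEAM_CHUNK = 5_000
MULTI_BEAM_CHUNK = 100_000


def chunk_cruise(measurements: Sequence, chunk_size: int) -> List[Sequence]:
    """Cut one cruise into consecutive segments of at most `chunk_size` items.

    Hypothetical helper: the last segment may be shorter than the others.
    """
    return [measurements[i:i + chunk_size]
            for i in range(0, len(measurements), chunk_size)]


def split_segments(segments: Sequence,
                   percents: Tuple[int, int, int] = (70, 15, 15),
                   seed: int = 0) -> Tuple[list, list, list]:
    """Shuffle the segments and split them into train / validate / test.

    The 70/15/15 percentages are an assumption, not a value from the wiki.
    Integer arithmetic avoids floating-point rounding surprises.
    """
    shuffled = list(segments)
    random.Random(seed).shuffle(shuffled)
    n_train = len(shuffled) * percents[0] // 100
    n_valid = len(shuffled) * percents[1] // 100
    train = shuffled[:n_train]
    valid = shuffled[n_train:n_train + n_valid]
    test = shuffled[n_train + n_valid:]  # whatever remains
    return train, valid, test


# Toy single-beam cruise with 98,000 placeholder measurements.
cruise = list(range(98_000))
segments = chunk_cruise(cruise, SINGLE_BEAM_CHUNK)
train, valid, test = split_segments(segments)
print(len(segments), len(train), len(valid), len(test))  # 20 14 3 3
```

Because the split operates on segments rather than on whole cruises, segments from the same cruise can end up in different sets in this sketch; splitting by cruise instead would only require shuffling cruise identifiers before chunking.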
Institution | overall edit rate | edit rate in train set | edit rate in validate set | edit rate in test set |
---|---|---|---|---|
AGSO | 0.72 | 0.64 | 0.35 | 1.35 |
JAMSTEC | 0.57 | 0.66 | 0.42 | 0.23 |
JAMSTEC2 | 6.45 | 5.74 | 6.31 | 12.57 |
NGA | 21.29 | 21.23 | 24.13 | 18.98 |
NGA2 | 0.12 | 0.11 | 0.12 | 0.13 |
NGDC | 4.20 | 3.85 | 4.20 | 5.77 |
NOAA_geodas | 10.39 | 10.64 | 9.84 | 9.82 |
SIO | 13.11 | 12.72 | 15.00 | 13.00 |
US_multi | 5.06 | 4.66 | 4.60 | 7.17 |
Institution | overall edit rate | edit rate in train set | edit rate in validate set | edit rate in test set |
---|---|---|---|---|
AGSO | 0.77 | 0.36 | 2.08 | 0.12 |
JAMSTEC | 0.55 | 0.44 | 1.29 | 0.40 |
JAMSTEC2 | 7.28 | 7.80 | 10.27 | 1.67 |
NGA | 21.53 | 22.74 | 23.16 | 10.69 |
NGA2 | 0.12 | 0.13 | 0.02 | 0.59 |
NGDC | 4.19 | 4.46 | 3.93 | 3.30 |
NOAA_geodas | 10.29 | 10.29 | 8.10 | 12.67 |
SIO | 13.71 | 15.43 | 13.84 | 3.41 |
US_multi | 5.30 | 5.61 | 6.43 | 2.63 |
Institution | overall | train set | validate set | test set |
---|---|---|---|---|
AGSO | 209 (201.32) | 144 (138.94) | 31 (29.16) | 34 (33.22) |
JAMSTEC | 1088 (888.69) | 773 (641.08) | 159 (124.48) | 156 (123.16) |
JAMSTEC2 | 168 (84.81) | 128 (63.52) | 25 (12.50) | 15 (8.88) |
NGA | 2034 (1200.50) | 1408 (825.80) | 302 (180.07) | 324 (194.68) |
NGA2 | 1921 (1918.99) | 1347 (1345.58) | 302 (302.00) | 272 (271.41) |
NGDC | 3974 (1785.70) | 2754 (1240.03) | 611 (271.08) | 609 (274.65) |
NOAA_geodas | 8073 (7527.66) | 5593 (5208.89) | 1267 (1185.24) | 1213 (1133.56) |
SIO | 6037 (3901.72) | 4194 (2716.40) | 933 (608.04) | 910 (577.43) |
US_multi | 772 (565.04) | 534 (388.26) | 116 (88.05) | 122 (88.77) |
Institution | overall | train set | validate set | test set |
---|---|---|---|---|
AGSO | 48 (25.31) | 35 (22.08) | 7 (2.78) | 6 (2.85) |
JAMSTEC | 538 (165.75) | 379 (121.12) | 80 (27.72) | 79 (18.43) |
JAMSTEC2 | 150 (42.70) | 105 (29.93) | 23 (8.47) | 22 (4.98) |
NGA | 1368 (54.42) | 960 (31.48) | 204 (14.35) | 204 (15.97) |
NGA2 | 24 (11.94) | 17 (8.90) | 4 (2.16) | 3 (1.94) |
NGDC | 1040 (445.19) | 731 (328.96) | 155 (58.43) | 154 (60.14) |
NOAA_geodas | 3672 (1681.77) | 2572 (1164.31) | 551 (256.48) | 549 (264.38) |
SIO | 243 (91.31) | 172 (70.02) | 35 (16.36) | 36 (6.87) |
US_multi | 615 (296.38) | 432 (209.54) | 92 (45.92) | 91 (41.23) |
Institution | total num. of cruises | num. of cruises in train set | num. of cruises in validate set | num. of cruises in test set |
---|---|---|---|---|
AGSO | 48 | 35 | 7 | 6 |
JAMSTEC | 538 | 379 | 80 | 79 |
JAMSTEC2 | 150 | 105 | 23 | 22 |
NGA | 1368 | 960 | 204 | 204 |
NGA2 | 24 | 17 | 4 | 3 |
NGDC | 1040 | 731 | 155 | 154 |
NOAA_geodas | 3672 | 2572 | 551 | 549 |
SIO | 243 | 172 | 35 | 36 |
US_multi | 615 | 432 | 92 | 91 |
Institution | total num. of measurements (millions) | num. of measurements in train set (millions) | num. of measurements in validate set (millions) | num. of measurements in test set (millions) |
---|---|---|---|---|
AGSO | 19.63 | 13.53 | 2.84 | 3.26 |
JAMSTEC | 81.37 | 59.04 | 11.23 | 11.10 |
JAMSTEC2 | 5.44 | 4.14 | 0.81 | 0.49 |
NGA | 4.55 | 3.13 | 0.67 | 0.75 |
NGA2 | 9.59 | 6.73 | 1.51 | 1.35 |
NGDC | 110.27 | 76.19 | 17.02 | 17.06 |
NOAA_geodas | 35.28 | 24.38 | 5.56 | 5.34 |
SIO | 39.59 | 27.52 | 6.10 | 5.97 |
US_multi | 45.30 | 31.02 | 6.98 | 7.30 |
Institution | total num. of measurements (millions) | num. of measurements in train set (millions) | num. of measurements in validate set (millions) | num. of measurements in test set (millions) |
---|---|---|---|---|
AGSO | 20.4 | 13.8 | 5.1 | 1.6 |
JAMSTEC | 89.3 | 64.9 | 11.6 | 12.9 |
JAMSTEC2 | 6.1 | 4.5 | 0.8 | 0.8 |
NGA | 4.7 | 3.5 | 0.7 | 0.5 |
NGA2 | 9.6 | 7.1 | 2.3 | 0.3 |
NGDC | 124.9 | 86.2 | 18.3 | 20.4 |
NOAA_geodas | 39.2 | 28.0 | 5.8 | 5.3 |
SIO | 40.9 | 30.4 | 5.4 | 5.1 |
US_multi | 52.5 | 36.0 | 8.7 | 7.9 |
Institution | unknown | multi-beam | grid | single-beam | point measurement |
---|---|---|---|---|---|
AGSO | 21.0 M | 0 | 0 | 0 | 0 |
SIO | 0 | 11.6 M | 29.6 M | 0 | 0 |
NGDC | 0 | 112.8 M | 0 | 12.6 M | 0 |
US_multi | 0 | 52.6 M | 0 | 0 | 0 |
JAMSTEC | 0 | 90.1 M | 0 | 0 | 0 |
JAMSTEC2 | 0 | 6.1 M | 0 | 0 | 0 |
NOAA_geodas | 0 | 0 | 0 | 39.2 M | 0 |
NGA | 0 | 0 | 0 | 4.7 M | 0 |
NGA2 | 0 | 0 | 0 | 9.6 M | 0 |
We are not using the cruises provided by these agencies:
- 3DGBR
- DNC
- IFREMER
- GEBCO
- lakes
- IBCAO
- NAVO
- NOAA
- CCOM
- GEOMAR
This wiki page is generated using the notebook data-process/Count-Lines.ipynb.