Hello, I use Ozone 1.4.0 and I am concerned about hdds.datanode.replication.work.dir (default value /tmp).
1/ Is it used only for replication, or also for erasure coding?
2/ Is it fault tolerant, or is it a point of failure for a datanode?
Regards
Hi @julienlau, this config has been deprecated since #3648. In 1.4.0 datanodes will automatically import containers to a tmp directory on the same storage volume that the container is destined for. The Cloudera docs are outdated.

For context, datanodes need a staging directory to import containers to before moving them to the main working directory. This ensures that partial state is not left behind in the main datanode working directories of the storage volumes if import fails partway through. Previously this was just using

In neither scenario is this directory required for fault tolerance, though. If import fails partway through, the operation will be retried. Datanodes will not report to SCM that they have successfully imported a container until it is fully moved into its final location on the volume. In the old implementation if
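To make the staging idea concrete, here is a minimal sketch of the general "stage, then move into place" pattern described above. It is not Ozone's actual code: the function name `import_container`, the `tmp` and `containers` directory names, and the flat file layout are all hypothetical, chosen only for illustration.

```python
import os
import shutil
import tempfile
from pathlib import Path


def import_container(volume_root: Path, container_id: int, files: dict) -> Path:
    """Write a container into a staging dir, then move it to its final place.

    Hypothetical layout (an assumption, not Ozone's real on-disk format):
      <volume_root>/tmp/         staging area on the same volume
      <volume_root>/containers/  final location of imported containers
    """
    staging_root = volume_root / "tmp"
    final_dir = volume_root / "containers" / str(container_id)
    staging_root.mkdir(parents=True, exist_ok=True)
    final_dir.parent.mkdir(parents=True, exist_ok=True)

    # Staging lives on the same volume, so the final step is a cheap rename
    # rather than a copy across filesystems.
    staging_dir = Path(
        tempfile.mkdtemp(prefix=f"import-{container_id}-", dir=staging_root)
    )
    try:
        for name, data in files.items():
            (staging_dir / name).write_bytes(data)
        # The container only becomes visible in the final location here.
        os.rename(staging_dir, final_dir)
    except BaseException:
        # A failed import leaves nothing behind; the caller can simply retry.
        shutil.rmtree(staging_dir, ignore_errors=True)
        raise
    return final_dir


if __name__ == "__main__":
    root = Path(tempfile.mkdtemp())
    print(import_container(root, 1, {"chunk.data": b"example"}))
```

Because a failed import only ever touches the staging area, losing or wiping that area costs nothing more than a retry, which is why it is not a fault-tolerance concern.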
For EC reconstruction, the containers will be created in place but marked with the RECOVERING state until fully rebuilt. As far as I know, reconstruction will not use this directory. There are cases where we use plain copy/replication for EC containers, for example decommissioning. If we are decommissioning a node that holds replica index 1 (the first set of chunks in the stripe), we will copy that replica to a new node, and this directory will be used in that case.