Bradley J Chambers, DigitalGlobe
- Docker Images
- Filter-only Pipelines
- Status of PCL Filters
- Python Package
- Python Examples
- Filter Roundup
Image | Tag | Size |
---|---|---|
pdal/dependencies | 1.5 | 3.1GB |
pdal/dependencies | latest | 3.31GB |
pdal/pdal | 1.5 | 3.67GB |
pdal/pdal | latest | 3.67GB |
- Images building plugins on top of the PDAL base image can grow even larger |
+++
- Prototype Alpine image with ~80% of the plugins
Image | Tag | Size |
---|---|---|
pdal/dependencies | alpine | 1.07GB |
pdal/pdal | alpine | 365MB |
+++
{
"pipeline":[
"input.las",
{
"type":"filters.whatever",
"some":"options"
},
"output.laz"
]
}
@[3](Inferred reader) @[4-7](Filter with options) @[8](Inferred writer)
+++
pdal pipeline pipeline.json
+++
pdal pipeline pipeline.json --readers.las.filename=input.las --writers.las.filename=output.las
@[1](Override options from the CLI)
+++
{
"pipeline":[
{
"type":"filters.whatever",
"some":"options"
}
]
}
+++
pdal translate input.las output.las --json pipeline.json
+++
Old (PCL) | New (PDAL) |
---|---|
filters.ground |
filters.pmf |
filters.radiusoutlier |
filters.outlier |
filters.statisticaloutlier |
filters.outlier |
filters.height |
filters.hag |
filters.dartsample |
filters.sample |
Native PDAL variants of PCL Plugin filters
+++
{
"pipeline": [
{
"type": "filters.pclblock",
"methods": [
{
"setLeafSize": {
"y": 2.0,
"x": 2.0,
"z": 2.0
},
"name": "VoxelGrid"
}
]
}
]
}
@[5-14](PCL JSON specification bumped to v0.2 → easier to embed in PDAL JSON)
+++
+++
-
The PDAL Python package can be installed via pip.
pip install pdal
-
Once installed, simply
import pdal
Note: The remainder of the presentation will present examples in the context of the PDAL Python package (though CLI samples will be provided as well).
+++
>>> json = u'''
... {
... "pipeline":[
... "./data/isprs/samp11-utm.laz"
... ]
... }'''
>>> p = pdal.Pipeline(json)
@[1-6](Define the pipeline JSON) @[9](Create the pipeline)
+++
>>> print('Is pipeline valid? %s' % p.validate())
Is pipeline valid? True
>>> print('Pipeline processed %d points.' % p.execute())
Pipeline processed 38010 points.
>>> arr = p.arrays[0]
>>> print('Pipeline contains %d array(s).' % (len(p.arrays)))
Pipeline contains 1 array(s).
@[1-2](Check for a valid pipeline)
@[5-6](Execute the pipeline)
@[9-11](Check how many ndarrays
were returned)
+++
>>> print(arr[0])
(512743.63, 5403547.33, 308.68, 0, 1, 1, 0, 0, 2, 0.0, 0, 0)
+++
>>> print(arr['X'][:10])
[ 512743.63 512743.62 512743.61 512743.6 512743.6 512741.5 512741.5
512741.49 512741.48 512741.47]
+++
>>> print(arr['Z'].mean())
356.17143357
+++
>>> import pandas as pd
>>> samp11 = pd.DataFrame(arr, columns=['X','Y','Z'])
>>> samp11.head()
X Y Z
0 512743.63 5403547.33 308.68
1 512743.62 5403547.33 308.70
2 512743.61 5403547.33 308.72
3 512743.60 5403547.34 308.68
4 512743.60 5403547.33 308.73
+++
>>> samp11.describe()
X Y Z
count 38010.000000 3.801000e+04 38010.000000
mean 512767.010570 5.403708e+06 356.171434
std 38.570375 8.587360e+01 29.212680
min 512700.870000 5.403547e+06 295.250000
25% 512733.530000 5.403645e+06 329.060000
50% 512766.940000 5.403705e+06 356.865000
75% 512799.900000 5.403790e+06 385.860000
max 512834.760000 5.403850e+06 404.080000
+++
>>> import seaborn as sns
>>> sns.kdeplot(samp11['Z'], cut=0, shade=True, vertical=True);
+++
>>> med = samp11.median()
>>> print(med)
X 512766.940
Y 5403705.460
Z 356.865
dtype: float64
+++
>>> from scipy import spatial
>>> tree = spatial.cKDTree(samp11)
>>> dists, idx = tree.query(med, k=3)
>>> print(dists)
[ 0.6213091 1.37645378 1.51757207]
+++
>>> samp11.iloc[idx]
X Y Z
31897 512767.16 5403706.02 357.02
31881 512767.93 5403706.29 356.39
31972 512765.75 5403706.19 356.27
+++
- A DimRange is a
- named dimension, and
- range of values.
- Bounds can be inclusive (
[]
) or exclusive (()
). - Ranges can be negated (
!
).
+++
{
"pipeline":[
{
"type":"filters.range",
"limits":"Z[10:]"
},
{
"type":"filters.range",
"limits":"Classification[2:2]"
},
{
"type":"filters.range",
"limits":"Red!(20:40]"
},
{
"type":"filters.assign",
"assignment":"Classification[:]=0"
}
]
}
@[3-6](Select all points with Z greater than or equal to 10) @[7-10](Select all points with classification of 2) @[11-14](Select points with red values less than or equal to 20 as well as those greater than 40) @[15-18](Reassign all classification values to 0)
+++
- Available to
filters.pmf
andfilters.smrf
- Eliminates the need to completely remove points (e.g., noise)
- Instead, points are ignored
+++
{
"pipeline":[
{
"type":"filters.range",
"limits":"Classification![7:7]"
},
{
"type":"filters.smrf"
}
]
}
@[1-11](Noise points are removed!)
+++
{
"pipeline":[
{
"type":"filters.smrf",
"ignore":"Classification[7:7]"
}
]
}
@[1-8](Noise points are left intact, just ignored.)
+++
filters.hag
- Creates new
HeightAboveGround
dimension
+++
- SMRF to segment ground and non-ground returns
- HAG to estimate the
HeightAboveGround
using the return information.
+++
Recall the kernel density of raw elevations...
+++
{
"pipeline":[
"./data/isprs/samp11-utm.laz",
{
"type":"filters.smrf"
},
{
"type":"filters.hag"
}
]
}
@[7-9](Now, consider the HeightAboveGround
dimension)
+++
>>> p = pdal.Pipeline(json)
>>> count = p.execute()
>>> df = pd.DataFrame(p.arrays[0])
>>> sns.kdeplot(df['HeightAboveGround'], cut=0, shade=True, vertical=True);
+++
>>> df[['HeightAboveGround']].describe()
HeightAboveGround
count 15607.000000
mean 5.467956
std 5.006438
min -13.280000
25% 2.110000
50% 3.870000
75% 7.810000
max 63.700000
+++
+++
filters.assign
- Assign a value to a
DimRange
- Handy for resetting classifications
+++
filters.approximatecoplanar
- Ratios of eigenvalues
- Creates a new binary dimension called
Coplanar
+++
{
"pipeline":[
"./data/isprs/samp11-utm.laz",
{
"type":"filters.approximatecoplanar"
}
]
}
+++
+++
{
"pipeline":[
"./data/isprs/samp11-utm.laz",
{
"type":"filters.smrf"
},
{
"type":"filters.hag"
},
{
"type":"filters.range",
"limits":"HeightAboveGround[2:)"
},
{
"type":"filters.approximatecoplanar"
}
]
}
+++
+++
filters.elm
- Extended Local Minimum seeks to identify outliers below the ground surface
- Marks outliers with
Classification
value of 7
+++
filters.estimaterank
- Compute covariance of neighborhoods of points and estimate rank
- Potentially useful for identifying linear features, planes, etc.
- Creates new
Rank
dimension
+++
{
"pipeline":[
"./data/isprs/samp11-utm.laz",
{
"type":"filters.estimaterank"
}
]
}
+++
+++
{
"pipeline":[
"./data/isprs/samp11-utm.laz",
{
"type":"filters.smrf"
},
{
"type":"filters.hag"
},
{
"type":"filters.range",
"limits":"HeightAboveGround[2:)"
},
{
"type":"filters.estimaterank"
}
]
}
+++
+++
filters.groupby
- Split the incoming PointView into separate PointViews by given criteria
- Allows us to operate on each individually (e.g., find centroid of each cluster)
+++
filters.head
andfilters.tail
- Pass only the specified number of points from beginning or ending of the PointView
+++
filters.kdistance
- Compute the distance to the k-th nearest neighbor
- Creates new
KDistance
dimension
+++
- Local Outlier Factor
- Creates three new dimensions
KDistance
LocalReachabilityDistance
LocalOutlierFactor
+++
{
"pipeline":[
"./data/isprs/samp11-utm.laz",
{
"type":"filters.lof"
}
]
}
+++
+++
+++
+++
+++
filters.locate
- Find and return only the min or max point in the incoming PointView (e.g., of a cluster)
+++
filters.mad
- Filter points by evaluating Median Absolute Deviation for a given dimension
+++
+++
filters.iqr
- Filter points by evaluating Interquartile Range for a given dimension
+++
+++
filters.cluster
- Cluster points by proximity (Euclidean distance)
- Iterate over newly added points until no more points can be added
- Creates a new integer dimension specifying the
ClusterID
+++
+++
filters.eigenvalues
- Filters like
filters.approximatecoplanar
use eigenvalues, but analysts may wish to precompute eigenvalues and operate directly on them - Creates three new dimensions
Eigenvalue0
Eigenvalue1
Eigenvalue2
+++
+++
filters.radialdensity
- Return the number of points within sphere of given radius
- Creates new
RadialDensity
dimension
+++
filters.smrf
- New alternative to PMF, still uses morphological operators
- Marks ground points as
Classification
value of 2 (else 1)
+++
filters.voxelcenternearestneighbor
andfilters.voxelcentroidnearestneighbor
- Eventual replacement of PCL
filters.voxelgrid
- Thins the point cloud to one point per voxel
+++
+++
+++
filters.sample
- Poisson disk sampling
- No two points can be closer than a given radius
+++
+++
This tutorial is meant to walk through the use of and theory behind one of PDAL's outlier
filters.
+++
The basic idea of a statistical outlier removal has been implemented in both PCL and PDAL.
+++
We begin by computing the mean distance
+++
A threshold is then formed by
Any point whose mean distance
+++
Let's begin by iterating through our DataFrame, keeping track of the mean distance to our eight nearest neighbors.
>>> dists = []
>>> for _, point in samp11[['X','Y','Z']].iterrows():
... dist, _ = tree.query(point, k=9)
... dists = np.append(dists, dist[1:].mean())
+++
>>> sns.kdeplot(dists, cut=0, shade=True);
KDE plot of mean distances.
+++
Now, we compute the threshold as described.
>>> threshold = dists.mean() + 3 * dists.std()
>>> noise = dists[dists > threshold]
>>> signal = dists[dists <= threshold]
>>> print(noise.size, "points detected with a mean distance exceeding the global threshold of", threshold)
241 points detected with a mean distance exceeding the global threshold of 3.81763516967
+++
>>> sns.kdeplot(signal, cut=0, shade=True);
Updated KDE plot after noise removal.
+++
import numpy as np
import pandas as pd
from scipy import spatial
def sor(ins, outs):
dists=[]
df = pd.DataFrame(ins, columns=['X','Y','Z'])
tree = spatial.cKDTree(df)
for _, point in df.iterrows():
dist, _ = tree.query(point, k=9)
dists = np.append(dists, dist[1:].mean())
threshold = dists.mean() + 3 * dists.std()
outs['Mask'] = dists <= threshold
return True
+++
{
"pipeline":[
"./data/isprs/samp11-utm.laz",
{
"type":"filters.python",
"script":"noise-predicate.py",
"function":"sor",
"module":"anything"
}
]
}
+++
{
"pipeline": [
"./data/isprs/samp11-utm.laz",
{
"type": "filters.smrf"
}, {
"type": "filters.hag"
}, {
"type": "filters.range", "limits": "HeightAboveGround[3:]"
}, {
"type": "filters.cluster", "tolerance": 3
}, {
"type": "filters.groupby", "dimension": "ClusterID"
}, {
"type": "filters.locate", "dimension": "HeightAboveGround", "minmax": "max"
}, {
"type": "filters.merge"
}, {
"type": "filters.range", "limits": "HeightAboveGround[20:]"
}
]
}
+++
>>> p = pdal.Pipeline(json)
>>> count = p.execute()
>>> vo = pd.DataFrame(p.arrays[0], columns=['X','Y','Z','HeightAboveGround'])
>>> vo.describe()
X Y Z HeightAboveGround
count 15.000000 1.500000e+01 15.000000 15.000000
mean 512799.513333 5.403632e+06 352.667333 38.212000
std 23.111817 4.812817e+01 22.661835 16.576486
min 512730.790000 5.403557e+06 317.300000 20.050000
25% 512795.435000 5.403623e+06 333.970000 21.895000
50% 512798.290000 5.403624e+06 354.110000 38.450000
75% 512813.460000 5.403626e+06 367.255000 54.255000
max 512831.280000 5.403739e+06 401.930000 63.700000
+++
>>> vo
X Y Z HeightAboveGround
0 512794.22 5403576.38 317.30 21.99
1 512827.97 5403630.85 329.92 24.45
2 512786.89 5403626.56 366.60 58.15
3 512811.06 5403612.84 326.88 20.26
4 512792.11 5403626.03 368.89 59.78
5 512797.05 5403624.26 338.02 28.91
6 512796.65 5403624.90 350.39 41.28
7 512798.29 5403625.87 361.53 52.31
8 512797.34 5403623.67 347.56 38.45
9 512798.47 5403623.67 354.11 44.89
10 512798.73 5403624.20 365.42 56.20
11 512831.28 5403557.39 323.52 20.96
12 512815.86 5403621.44 370.03 63.70
13 512815.99 5403739.10 367.91 20.05
14 512730.79 5403738.80 401.93 21.80