
Unit vector data from camera #14

Closed
semsitivity opened this issue Sep 29, 2015 · 17 comments
@semsitivity

The o3d3xx PCIC interface can return unit vector matrices related to the camera calibration data, which depend on the IR modulation frequency used.
These vectors already incorporate the intrinsic and extrinsic transformations; applying them to the distance image yields calibrated (and optionally user-adjusted) Cartesian 3D coordinates.
It would be important to have access to this vector matrix, so that some processing can be done on the distance data alone while still being able to compute the 3D coordinates fully or partially.

@graugans
Member

Unit Vector description

The unit vector matrix contains 3 values [ex, ey, ez] for each pixel; the data layout is [ex_1, ey_1, ez_1, ..., ex_N, ey_N, ez_N], where N is the number of pixels. Multiplying a distance measurement by the appropriate components yields the corresponding Cartesian coordinates: [X_i, Y_i, Z_i] = D_i * [ex_i, ey_i, ez_i]. The rotational component of the extrinsic calibration, specified by the user, is already applied to the unit vectors.
Data type: 32-bit float

The image resolution of the unit vector matrix depends on the selected binning mode of the camera.
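
For illustration, a minimal sketch of that per-pixel multiplication in C++ (assuming the distance and unit vector data have already been extracted from their chunks into float buffers; the translational part of the extrinsic calibration, if any, is not handled here):

#include <cstddef>
#include <vector>

// Compute [X_i, Y_i, Z_i] = D_i * [ex_i, ey_i, ez_i] for every pixel.
// `unit_vectors` is laid out as [ex_1, ey_1, ez_1, ..., ex_N, ey_N, ez_N]
// and `distances` holds the N radial distance values; the result uses the
// same interleaved layout as the unit vectors.
std::vector<float>
cartesian_from_distance(const std::vector<float>& distances,
                        const std::vector<float>& unit_vectors)
{
    std::vector<float> xyz(unit_vectors.size());
    for (std::size_t i = 0; i < distances.size(); ++i)
    {
        xyz[3 * i + 0] = distances[i] * unit_vectors[3 * i + 0]; // X_i
        xyz[3 * i + 1] = distances[i] * unit_vectors[3 * i + 1]; // Y_i
        xyz[3 * i + 2] = distances[i] * unit_vectors[3 * i + 2]; // Z_i
    }
    return xyz;
}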

Chunk Type

The chunk type used for the unit vector matrix is: 223

Chunk Header

The Unit Vector data is prefixed with a header; this header is called the chunk header.

| Offset | Name | Description | Size in bytes |
| ------ | ---- | ----------- | ------------- |
| 0x0000 | CHUNK_TYPE | Defines the type of the chunk. Each distinct chunk type has its own value. | 4 |
| 0x0004 | CHUNK_SIZE | Size of the whole image chunk in bytes. | 4 |
| 0x0008 | HEADER_SIZE | Number of bytes starting from 0x0000 until PIXEL_DATA. | 4 |
| 0x000C | HEADER_VERSION | Version number of the header. | 4 |
| 0x0010 | IMAGE_WIDTH | Image width in pixels. | 4 |
| 0x0014 | IMAGE_HEIGHT | Image height in pixels. | 4 |
| 0x0018 | PIXEL_FORMAT | Pixel format. | 4 |
| 0x001C | TIME_STAMP | Timestamp in µs. | 4 |
| 0x0020 | FRAME_COUNT | Frame count according to the algorithm output. | 4 |
| 0x0024 | PIXEL_DATA | The pixel data in the given type and dimension of the image. | - |
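
As an illustrative sketch, the header fields above could be mapped onto a plain struct like the one below (the field names mirror the example code further down; the on-the-wire byte order is an assumption, not something stated in the table):

#include <cstdint>

// Illustrative layout of the chunk header described above.
struct ChunkHeader
{
    uint32_t chunkType;     // 0x0000, e.g. 223 for the unit vector matrix
    uint32_t chunkSize;     // 0x0004, size of the whole chunk in bytes
    uint32_t headerSize;    // 0x0008, bytes from 0x0000 up to PIXEL_DATA
    uint32_t headerVersion; // 0x000C
    uint32_t imageWidth;    // 0x0010, in pixels
    uint32_t imageHeight;   // 0x0014, in pixels
    uint32_t pixelFormat;   // 0x0018, e.g. PF_FORMAT_32F3 for unit vectors
    uint32_t timestamp;     // 0x001C, in microseconds
    uint32_t frameCount;    // 0x0020
    // the pixel data follows at offset 0x0024 (headerSize bytes from the start)
};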

Example Code

This is how the data is assembled on the camera

Currently supported Chunk Types

typedef enum ChunkType {
    CT_RADIAL_DISTANCE_IMAGE  = 100,  
    CT_NORM_AMPLITUDE_IMAGE   = 101,
    CT_AMPLITUDE_IMAGE        = 103,
    CT_CARTESIAN_X_COMPONENT  = 200,
    CT_CARTESIAN_Y_COMPONENT  = 201,
    CT_CARTESIAN_Z_COMPONENT  = 202,
    CT_CARTESIAN_ALL          = 203, 
    CT_UNIT_VECTOR_ALL        = 223,
    CT_CONFIDENCE_IMAGE       = 300,
    CT_DIAGNOSTIC             = 302,
    CT_EXTRINSIC_CALIBRATION  = 400,
    CT_JSON_MODEL             = 500,
    CT_SNAPSHOT_IMAGE         = 600,
    CT_MAX
} ChunkType_t;

Available pixel formats

typedef enum PixelFormat {
    PF_FORMAT_8U   =  0,
    PF_FORMAT_8S   =  1,
    PF_FORMAT_16U  =  2,
    PF_FORMAT_16S  =  3,
    PF_FORMAT_32U  =  4,
    PF_FORMAT_32S  =  5,
    PF_FORMAT_32F  =  6,
    PF_FORMAT_64U  =  7,
    PF_FORMAT_64F  =  8,
    PF_FORMAT_16U2 =  9,
    PF_FORMAT_32F3 = 10,
    PF_MAX
} PixelFormat_t;
chunkHeader.chunkType = ifm::CT_UNIT_VECTOR_ALL;
chunkHeader.chunkSize = sizeof(ifm::ChunkHeader) + result->width * result->height * 3 * sizeof(float);
chunkHeader.headerSize = sizeof(chunkHeader);
chunkHeader.headerVersion = 1;
chunkHeader.imageWidth = result->width;
chunkHeader.imageHeight = result->height;
chunkHeader.pixelFormat = ifm::PF_FORMAT_32F3; // [eX,eY,eZ] triple
chunkHeader.timestamp = result->timestamp;
chunkHeader.frameCount = getFrameCounter();

How to request the Unit Vector data

You can request the data either via the all_unit_vector_matrices keyword in the JSON format sent with the PCIC c command, or by sending I<image-ID>?, where <image-ID> is a two-digit code for the image type:

| ID | Description |
| -- | ----------- |
| 01 | amplitude image |
| 02 | normalized amplitude image |
| 03 | distance image |
| 04 | x-image (distance information) |
| 05 | y-image (distance information) |
| 06 | z-image (distance information) |
| 07 | confidence image (status information) |
| 08 | extrinsic calibration |
| 09 | unit vector matrix (ex, ey, ez) |
| 10 | last result output as formatted for this connection |
| 11 | all distance images: x, y, and z |
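
For illustration, a tiny sketch of building the image-request command content from one of the IDs above; the PCIC envelope around the command (ticket and length framing) is not shown here:

#include <cstdio>
#include <string>

// Build the "I<image-ID>?" command content for a given two-digit image ID,
// e.g. 9 -> "I09?" to request the unit vector matrix.
std::string image_request(int image_id)
{
    char cmd[8];
    std::snprintf(cmd, sizeof(cmd), "I%02d?", image_id);
    return std::string(cmd);
}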

@graugans
Member

The unit vectors provided by the latest official ifm firmware have a bug which makes them useless. This will be fixed in firmware version 1.2.x.

@tpanzarella

Great info, thanks @graugans. I think we need to think about the best way to deal with this in the library. While I completely appreciate the concerns of @semsitivity, there is certainly a whole host of use cases that will continue to want the Cartesian data constructed as a PCL point cloud as it is done now (i.e., not manually converting the depth image when needed). That said, I could imagine a few scenarios:

  1. We provision for pluggable frame grabber implementations that would allow for flexibility in what we ask the camera to return to us.
  2. The current frame grabber could be changed to let the JSON schema drive its parsing rather than exploiting a priori information as it does today. It follows that we would allow a mutator to be called so that the user can specify their desired JSON schema.
  3. We allow for the caller to provide some kind of bitmask (or similar) denoting what images the frame grabber should populate in the ImageBuffer object.

Regardless of what we do, I think we should grab some performance metrics to ensure it is worth the effort. Given that the FrameGrabber is running in a separate thread from the user's algo code, the speed-ups (if any) may not be meaningful. I'm speculating and could only speak definitively to this once we capture some performance data. We should probably let the O3D303's on-board iMX6 be the ground truth architecture for which we grab this performance data as it seems there is consensus among many of us that, ideally, our algo code runs on the camera.

Related to this, I am currently planning (unless there are serious objections) to move the xyzi_image that is currently being constructed in the o3d3xx-ros code into this library as well. It is basically a 4-channel OpenCV image encoding of the point cloud where the first three channels are spatial planes (x, y, z) and the fourth is the amplitude data. You can see that implementation here. NOTE: the notion of numpy_cloud has been changed to a more general xyzi_image, but the code is effectively the same. I think having this also addresses a concern of @semsitivity where (for example) an ROI can be defined in depth or amplitude space and similarly masked directly to the point cloud (i.e., the xyzi_image), since they are now all cv::Mat. I should also note that this will be in addition to the PCL point cloud that is also constructed. Again, I expect the overhead to be minimal as the construction of all the images happens concurrently (i.e., O(n) where n is the number of pixels on the active imager array).

@graugans Quick question, since you mention the extrinsics in your note above... Does the O3D303 apply a rotation and translation to the image data based on the extrinsics configured via the JSON interface? I could figure this out empirically, but since you are here, I thought I'd be lazy and just ask :) FWIW, I like having the ability to store the extrinsics on the camera; however, it would be nice to have a flag or switch to tell the camera whether or not you actually want it to perform the transform. For example, we have a current use case where we want to do all algo processing in the camera frame, and then once we localize our object of interest, we will simply transform the object pose based on the extrinsic calibration.

@graugans
Member

@tpanzarella You can change the extrinsic calibration by an XML-RPC call to the following object:
http://<sensor-ip>/api/rpc/v1/com.ifm.efector/session_<session-id>/edit/device/
The following parameters are available:

  • ExtrinsicCalibRotX
  • ExtrinsicCalibRotY
  • ExtrinsicCalibRotZ
  • ExtrinsicCalibTransX
  • ExtrinsicCalibTransY
  • ExtrinsicCalibTransZ

The unit of the rotation is ° (degrees), not rad. There is no such flag as far as I know, but you can store multiple configurations on the camera and switch between them; we call them applications.
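
As a hedged sketch only: setting one of these parameters could look roughly like the following, using the xmlrpc-c client library and assuming an already-opened edit session (the session_<session-id> part of the URL above) and a setParameter(name, value) method on the device object; both the library choice and the method name are assumptions, not confirmed here.

#include <xmlrpc-c/base.hpp>
#include <xmlrpc-c/client_simple.hpp>
#include <string>

// Set one parameter on the edit/device object, passing name and value as
// strings, e.g. set_device_parameter(url, "ExtrinsicCalibRotX", "90").
void set_device_parameter(const std::string& device_url,
                          const std::string& name,
                          const std::string& value)
{
    xmlrpc_c::clientSimple client;
    xmlrpc_c::value result;
    client.call(device_url, "setParameter", "ss", &result,
                name.c_str(), value.c_str());
}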

@tpanzarella

@graugans Right. I guess the question is, does this simply store the data or does the O3D303 firmware apply this transformation prior to returning the data over the socket interface?

(sorry, I think in my original question, I was not clear. I said "JSON interface" when I meant "xml-rpc interface". I'm just so used to looking at the JSON serialization of the parameters as a result of running o3d3xx-dump. Sorry for the confusion. ... however, the above question still stands in terms of asking if the transformation is applied or not).

@graugans
Member

@tpanzarella The extrinsic calibration is applied during calculation of x,y,z data.

BTW, this pluggable frame grabber approach sounds interesting.

@tpanzarella

@graugans Thanks for the clarification of the extrinsics. As you note in an earlier comment, the effect I am looking for (i.e., storing the extrinsics on the camera while operating in the camera frame) can be achieved by having multiple applications loaded on the camera with different extrinsic calibration values (i.e., one with the real extrinsics and one with all zeros for the rot and trans).

Many thanks again Christian! This is good information to know.

I'll keep thinking about the FrameGrabber solution we have been discussing above. Do you have a current target date for the 1.2.x firmware? It sounds like we have until (at least) then to come up with a direction we'd like to go in to support the use case of @semsitivity.

@graugans
Member

@tpanzarella Last week we did a feature freeze for the 1.2.x branch, but there is still some testing needed.

Frankly, I am not involved in the official release process. My team and I provide release candidates; when those go online is a decision made by the sales team. I guess 1.2.x will be released by the end of October. Maybe I can provide the next test candidate for a beta test.

@semsitivity
Author

@tpanzarella, just a comment about the xyzi_image.
I read in the o3d3xx-ros code that you use a CV_32FC4 OpenCV matrix. I guess the unit you consider here is meters.
My feeling, and tell me if you disagree, is that if we keep the unit in millimeters, a CV_16UC4 would be more suitable, for these reasons:

  • currently, soft-float is used on the board... so this always saves some CPU.
  • the best camera accuracy is around 4 mm if I remember well, and of course worse depending on the range and the configuration.
  • maximum precision is of course better, but I'm not sure it's really pertinent to use a format that takes twice the memory and is probably a bit more CPU time for so little real information in the data.

Concerning the point cloud format, keeping it in meters with doubles seems to be a good choice, but in the OpenCV world, I think unsigned short is closer to reality, even if the unit vectors are given as float or double.
What do you think?
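
For illustration only, a minimal OpenCV sketch of such a meters-to-millimeters conversion (the function name is hypothetical; a signed 16-bit type is used in this sketch because the x and y channels can be negative):

#include <opencv2/core/core.hpp>

// Convert a 4-channel float xyzi image in meters to 16-bit millimeters.
cv::Mat xyzi_to_mm(const cv::Mat& xyzi_m) // expects CV_32FC4, units in meters
{
    cv::Mat xyzi_mm;
    xyzi_m.convertTo(xyzi_mm, CV_16SC4, 1000.0); // scale m -> mm, saturate to int16
    return xyzi_mm;
}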

@tpanzarella

@semsitivity Your point on keeping the xyzi_image in mm (and by association the data as uint16_t) is well taken and works for me too (especially if we keep the point cloud in meters). I don't think we have too many users of the xyzi_image in the ROS code yet, so, I am not concerned with backward compatibility. I'll just be sure to clearly note this change in the documentation and changelog.

I will work this change into the next release of the code. Tentatively, I'm shooting for next week. I am bogged down with lots of other stuff this week and early next week, but I'll have time at the end of next week to update both libo3d3xx and o3d3xx-ros to reflect this (and other) changes.

Thanks for the comment and technical rationale for this request.

@graugans
Member

graugans commented Oct 1, 2015

@semsitivity The camera is using hardware floating point, but we do use the softfp ABI. We are planning to switch to hardfp by the end of the year. The main reason for using unsigned 16-bit integers for the data is to save Ethernet bandwidth; internally everything is calculated in float precision. We started with softfp because there was no hardfp GPU userland. We ran some benchmarks, and the performance difference between softfp and hardfp was ~3-4%.

@graugans
Member

graugans commented Oct 7, 2015

@tpanzarella After digging deeper into the whole extrinsic and unit vector business, I realized that I made a big mistake when I claimed the extrinsic calibration is an application-specific parameter. It is not!

http://<sensor-ip>/api/rpc/v1/com.ifm.efector/session_<session-id>/edit/device/ <-- device !!! 

The whole idea of the extrinsic calibration was for a fixed mounting of the camera.

Anyway, I have started some internal discussion about whether it makes sense to have such a flag. If there is more benefit to such a flag, let me know.

Sorry for my misleading assumption about the extrinsic calibration.

@tpanzarella

No worries, Christian. In fact, I am a bit embarrassed that I did not catch that. It is clear that the extrinsics are configured at the device level, as indicated by the JSON serialization (for example, this). Thanks for the explicit clarification anyway.

In terms of the flag to either apply or not apply the extrinsic transformation, my request is motivated by three concrete concerns:

  1. As you know, the libo3d3xx coordinate frame is not consistent with the O3D303 returned coordinate frame. That said, inspecting the stored extrinsics on the camera would be highly unintuitive to someone using libo3d3xx as they would have to mentally remap the coordinate system in their head while debugging, etc. I believe this would lead to a lot of unnecessarily wasted engineering cycles by many (very smart) people simply due to a misunderstanding. That said, if there were a flag that determined if the transformation were applied or not, those using libo3d3xx could store the extrinsics according to the libo3d3xx coord frame and simply apply the transform manually once getting the point cloud returned to them in the camera frame. Conversely, for those using the built-in O3D303 coord frame, they could set the flag to apply the extrinsics as is if it makes sense for their application. Effectively this decouples the stored parameters from the coordinate system that is in place for the application.
  2. I think it is really useful to be able to store the extrinsics close to the camera and there isn't much closer than directly on the camera. This gets even more valuable for multi-camera systems where you are managing a fleet of n-cameras each with their own unique extrinsic calibration. Now, just because you have an extrinsic calibration in place does not necessarily mean you want to run your vision algorithm in that coord frame. For example, we have a concrete object recognition use case where we want to do all processing in the camera frame, then once we segment out our object of interest, we need to return its pose transformed via the extrinsics. So, we want the camera data in the camera frame, then to be able to query the camera for the extrinsic calibration, and manually transform the final pose to the parent frame.
  3. From a practical engineering perspective, many of our applications involve embedded linux systems where we highly prefer to configure the system to run with a read-only file system to avoid any corruption -- we typically provide a very small read/write filesystem for logging or whatever (sometimes just a block of volatile memory -- e.g., tmpfs or similar). That said, it would be super convenient to be able to store in persistent memory the extrinsics of the camera and since the O3D303 already provides a facility for doing so, we would just exploit that. Again, I realize that this is sort of minutia but it is a real practical concern and of high value to have this option.

In summary, I hope the above three examples help further motivate real-world concerns for: 1) keeping the ability to store the extrinsics on the camera, but, 2) provide a boolean flag to indicate whether or not you actually want the transformation applied by the O3D firmware.

Let me know if anything is unclear.

@semsitivity
Author

This is not an answer, just a remark for @graugans about the information he gave on the extrinsics.
The extrinsics are at the device level, yes, but they are not in degrees; they are in radians.
The translation unit is meters.
Thanks to o3d3xx-config and @tpanzarella's help on how to use it, I finally figured it out.

@graugans
Member

graugans commented Oct 8, 2015

@semsitivity, @tpanzarella Yes, at the moment the values are in m and rad, but this is more a bug than an intended feature. We are discussing this internally; because the general camera interface uses mm and degrees, we may fix this in a future release. We are also preparing some sort of application note on how to deal with the extrinsic calibration and the unit vectors. There are some points to pay attention to. I hope you can wait a couple of days until we can provide a pre-release of the application note to you.

@semsitivity
Author

@tpanzarella, my target was to be able to adapt my algorithm to the device by the end of the month.
The first step is obviously the image capture, which was a blocking problem, and even if it will change really soon, I'm happy to be able to move on to my next development steps.
So don't hesitate to be explicit about what changes in the next release, and I'll adapt my code.

Today, I have this sequence working great:

  • set up the camera application settings for a good quality/precision image
  • capture a single image
  • detect the floor normal vector and the transformation needed to align the points with the floor
  • set other camera settings for a long run at 10 fps with less precision, to take care of temperature
  • set the extrinsic values on the camera with that needed transformation
  • start the image capture loop and all my detection stuff, starting from floor-oriented Cartesian coordinates

I'm happy because, previously, I was applying all the transformations myself. Now I obtain what I was expecting: less CPU consumption + a simplified algorithm, and that's great.

@tpanzarella

@graugans Thanks. Take the time you need to sort this out. Speaking for LPR, we can wait until you are ready.
