Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cglm structs #80

Merged
merged 15 commits into from
May 25, 2019
Merged

cglm structs #80

merged 15 commits into from
May 25, 2019

Conversation

acoto87
Copy link
Contributor

@acoto87 acoto87 commented Apr 4, 2019

Hi there,

I wanted to upload these changes with a minimal and initial implementation for the vec4s type. There are several things missing, like function docs and other struct types. My intention here is to start reviewing it (not to merge this branch yet, it isn't finished by any means), take your feedback as you envision this feature and make the necessary changes.

So far I was able to add a test, and make check pass all of them. I'll make more tests as the feature is developed.

I also did some changes when I start testing to be able to compile on the Raspberry PI, since it doesn't suppor SSE{2} instructions. Specifically the change in the test_quat.c file: #if defined( __SSE__ ) || defined( __SSE2__ )

So, let me know what you think, and start shaping the feature from there.

@acoto87 acoto87 changed the title Cglm structs cglm structs Apr 4, 2019
@coveralls
Copy link

coveralls commented Apr 4, 2019

Coverage Status

Coverage remained the same at 11.081% when pulling 8affe98 on acoto87:cglm-structs into f26601b on recp:master.

include/cglm/structs/vec4-ext.h Outdated Show resolved Hide resolved
include/cglm/structs/vec4-ext.h Outdated Show resolved Hide resolved
include/cglm/types.h Outdated Show resolved Hide resolved
include/cglm/types.h Outdated Show resolved Hide resolved
include/cglm/types.h Outdated Show resolved Hide resolved
include/cglm/types.h Outdated Show resolved Hide resolved
@recp
Copy link
Owner

recp commented Apr 4, 2019

Hi @acoto87

Thank you very much for your contribution,

It looks good except some style nits 👍

  • I think we should move struct types to new header,
  • Please follow existing coding style
    • No C++ comments (//)
    • 2 spaces for tabs
    • No newline before {
    • ...
  • Tests for structs (ot not) ar welcome 🤗

r.

@acoto87
Copy link
Contributor Author

acoto87 commented Apr 4, 2019

Yeah, don't worry, I will fix those styles things, it was just to know if you agreed at first instance with the approach. Will make other suggested changes too.

I will pull latest changes from master too, to be updated.

@recp recp mentioned this pull request Apr 5, 2019
@recp
Copy link
Owner

recp commented Apr 28, 2019

@acoto87 ping.

@acoto87
Copy link
Contributor Author

acoto87 commented Apr 29, 2019

Hey, sorry about the extended delay, I got caught up at work and with other projects as well, but I haven't forgot this PR, will come back soon with more changes.

@acoto87
Copy link
Contributor Author

acoto87 commented May 1, 2019

Hey @recp I'm not sure about how to proceed with struct types for mat3 and mat4, so I'm leaving it right now as it is, and just updating the functions of mat3 and mat4 that use vec3 or vec4.

The main problem is that it feels more natural to have matrices as m[i][j] than having something like m.m_i_j, so it probably makes sense to leave matrix types as it is right now. Thoughts?

@recp
Copy link
Owner

recp commented May 1, 2019

Leaving it as it is seems good to me. But this will make struct API heterogeneous. m.m_i_j is ugly but m.raw[i][j] is also accessible but not as good as m[i][j].

Let's remember that why there is struct API:

  1. Type safety
  2. Return a value instead of dest parameter
  3. Makes codes more readable for who don't like arrays

Since matrices are two dimensional array compiler can throw warnings if wrong type is sent to a function as matrix parameter. In this case, there is no lose for type safety I guess (better ideas?)

If there are no structs for matrices then it will not be possible to return matrices like vectors:

c = glms_mat4_mul(a, b);

/* or */

glm_mat4_mul(a, b, c);

I'm not sure which way to go for now, let's postpone structs for matrix types for now. In the future things may be changed.


EDIT: We could also access columns like below:

typedef union CGLM_ALIGN_MAT mat4s {
  #ifndef CGLM_NO_ANONYMOUS_STRUCT
    struct {
      float m00;
      float m01;
      /* ... */
    };
    struct {
      vec4s col0;
      vec4s col1;
      vec4s col2;
      vec4s col3;
    };
    #endif
    mat4 raw;
} mat4s;
mat.col0 /* --> first column */
mat.col1 /* --> second column */

@acoto87
Copy link
Contributor Author

acoto87 commented May 1, 2019

Yes, I would like to be able to return matrices in functions. Your proposal looks good, I will implement that. What doesn't look very appealing to me is the m.raw[i][j] but is better than m.m_i_j.

What do you think about a macro to make this access "prettier":

#define MAT(m, i, j) (m.raw[i][j])

(the name MAT could be changed)
and then

mat4s m;
float m12 = MAT(m, 1, 2);
...

The user could always access the elements via .raw field, of course. We could even do this for vectors as well.

@recp
Copy link
Owner

recp commented May 2, 2019

+1 for m.m00 instead of m.m_0_0

I would prefer to add namespace to macros to prevent macro name conflicts. For instance Microsoft has used near and far macros in windows.h and we can't use these names (we could override it, then restore it end of headers...), this is why I used nearDist / farDist (or nearVal / farVal) instead of near/ far...

if we add namespace to macros like this:

float m12 = MAT(m, 1, 2);

/* vs. */

float m12 = GLM_MAT(m, 1, 2);

/* vs. */

float m12 = m.raw[1][2] 

/* vs. */

float m12 = m.m12 

which one looks better? I think m.raw[1][2] and m.m12 looks more readable and neat (to me at least)

My suggestion is that let's not define a getter macro for now, we can add it later as improvement...

We may want to use this getter (MAT(m, i, j)) for something else : #81 (waits for feedbacks)

@acoto87
Copy link
Contributor Author

acoto87 commented May 2, 2019

I think m.m12 and m.raw[1][2] are fine. We should keep those two, since one allow readable and neat access (m.m12) and the other one allow passing i and j as variables (m.raw[i][j]). Apart from that, any macro that we come up with is a syntactic sugar to not have a .raw in the middle of accessing a matrix element. In any case, matrix functions in the library will be implemented with .raw access.

@acoto87
Copy link
Contributor Author

acoto87 commented May 4, 2019

@recp How would you like these functions to work?

CGLM_INLINE
void
glm_mat4_transpose_to(mat4 m, mat4 dest) {
  ...
}

CGLM_INLINE
void
glm_mat4_transpose(mat4 m) {
  ...
}

So far, I've this:

CGLM_INLINE
mat4s
glms_mat4_transpose_to(mat4s m) {
  mat4s r;
  glm_mat4_transpose_to(m.raw, r.raw);
  return r;
}

CGLM_INLINE
mat4s
glms_mat4_transpose(mat4s m) {
  glm_mat4_transpose(m.raw);
  return m;
}

This implementation seems like we don't need one of those functions, unless you want glms_mat4_transpose to modify the struct even if it's returned.

There are other similar functions in mat3.h.

@recp
Copy link
Owner

recp commented May 4, 2019

In your implementation, since glms_mat4_transpose() returns result, there is no benefit to use glms_mat4_transpose_to().

We can only keep this version:

CGLM_INLINE
mat4s
glms_mat4_transpose(mat4s m) {
  mat4s r;
  glm_mat4_transpose_to(m.raw, r.raw);
  return r;
}

since glms_mat4_transpose_to() don't use temp variable, it could be better to use that if destination is not same as the input.

Alternative:

CGLM_INLINE
mat4s
glms_mat4_transpose_to(mat4s m) {
  mat4s r;
  glm_mat4_transpose_to(m.raw, r.raw);
  return r;
}

CGLM_INLINE
void
glms_mat4_transpose(mat4s *m) {
  glm_mat4_transpose(m->raw);
}

I think it would be better to keep only first version (mat4s glms_mat4_transpose(mat4s m)) for now. Users can use array api's glm_mat4_transpose_to if they want.

Let's keep the struct api simple for now, it will evolve by time.

include/cglm/structs/mat4.h Outdated Show resolved Hide resolved
include/cglm/structs/mat4.h Outdated Show resolved Hide resolved
float m20;
float m21;
float m22;
};
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't forget that cglm stores matrices as column-major order. So second item must be m10 not m01 right? Am I wrong?

Also formatting that like below could help to keep order correct and more readable:

struct {
  float m00, m10, m20,
        m01, m11, m21,
        m02, m12, m22;
};

/* or */

struct {
  float m00, m10, m20; /* first column  */
  float m01, m11, m21; /* second column */
  float m02, m12, m22; /* third column  */
};

What does m01 mean? First column's first item or first row's first item? What did you think? Or I must do some research for this :S

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe m.m01 should be identical to m[0][1]. So it could be:

struct {
  float m00, m01, m02; /* first column  */
  float m10, m11, m12; /* second column */
  float m20, m21, m22; /* third column  */
};

in this case, your implementation seems ok.

Copy link
Contributor Author

@acoto87 acoto87 May 6, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, what I thought about this is that in reality it doesn't matter as long the consistency is maintained. But your last comment make it more clear, it's true that m.m01 should be the same as m.raw[0][1].

So, should I keep it as:

struct {
  float m00, m01, m02,
        m10, m11, m12,
        m20, m21, m22;
};

But then, what about the:

struct {
  vec3s col1, /* EDIT: vec4s to vec3s */
        col2, 
        col3;

because it will be more like rows and columns and it will not match the memory layout of mat3s anymore?

(everything is the same with mat4s)

Copy link
Contributor Author

@acoto87 acoto87 May 6, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

More thinking about this:

Actually the dominant field in the mat3s is the mat3 raw; field, those are defined as:

vec3 mat3[3];

so, when the user do: m[0][1] is really accessing the second element of the first column (the second element of the first vec3), so every other field in the mat3s struct/union should reflect this:

struct {
  float m00, m10, m20,
        m01, m11, m21,
        m02, m21, m22
};

All of this matter on how you treat matrices in the code, specially in the multiplication of a matrix and vector, and since you treat them as column ordered, it should be like you put it at the start of this thread.

Copy link
Owner

@recp recp May 6, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you mean this:

struct {
  vec3s col1,  col2, col3; /* you used vec4s here */
}

in this case since there is no alignment on vec3s, it should match, am I wrong? We cannot define rows like that since we only have columns. But something like this could be possible:

typedef union mat3s {
#ifndef CGLM_NO_ANONYMOUS_STRUCT
  /* row accessor */
  struct row {
    float m00, m10, m20; /* first row  */
    float m01, m11, m21; /* second row */
    float m02, m12, m22; /* third row  */
  };

  /* column accessor (DEFAULT) */
  struct {
    float m00, m01, m02; /* first column  */
    float m10, m11, m12; /* second column */
    float m20, m21, m22; /* third column  */
  };
  
  /* column vector accessor */
  struct {
    vec3s col0;
    vec3s col1;
    vec3s col2;
  };
#endif
  mat3 raw;
} mat3s;

Example:

m3.m01.    /* first column's first item: same as m3.raw[0][1] */
m3.row.m01 /* first row's first item */
m3.col1    /* first column */

I don't say that we should that, just thoughts...

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe if it's useful to some people in a real case scenario that struct can be included later. Don't know, your call.

Let's postpone to add .row selector/accessor.

For this struct:

struct {
    vec3s col0;
    vec3s col1;
    vec3s col2;
};

which one is better:

m.col0
m.col1
m.col2

/* vec3s col[3]; instead of anonymous struct */
m.col[0]
m.col[1]
m.col[2]

what do you think?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm... both?:

typedef union mat3s {
#ifndef CGLM_NO_ANONYMOUS_STRUCT
  /* column accessor */
  struct {
    float m00, m01, m02; /* first column  */
    float m10, m11, m12; /* second column */
    float m20, m21, m22; /* third column  */
  };
  
  /* column vector accessor */
  struct {
    vec3s col0;
    vec3s col1;
    vec3s col2;
  };
#endif

  /* column array accessor */
  vec3s col[3];

  /* raw accessor */
  mat3 raw;
} mat3s;

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, not both.

Copy link
Contributor Author

@acoto87 acoto87 May 8, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, look a the glms_frustum_planes implementation, how do you like it better?

CGLM_INLINE
void
glms_frustum_planes(mat4s m, vec4s dest[6]) {
  mat4s t;

  t = glms_mat4_transpose(m);

  dest[0] = glms_vec4_add(t.col[3], t.col[0]); /* left   */
  dest[1] = glms_vec4_sub(t.col[3], t.col[0]); /* right  */
  dest[2] = glms_vec4_add(t.col[3], t.col[1]); /* bottom */
  dest[3] = glms_vec4_sub(t.col[3], t.col[1]); /* top    */
  dest[4] = glms_vec4_add(t.col[3], t.col[2]); /* near   */
  dest[5] = glms_vec4_sub(t.col[3], t.col[2]); /* far    */

  ...
}

or

CGLM_INLINE
void
glms_frustum_planes(mat4s m, vec4s dest[6]) {
  mat4s t;

  t = glms_mat4_transpose(m);

  dest[0] = glms_vec4_add(t.col[3], t.col0); /* left   */
  dest[1] = glms_vec4_sub(t.col[3], t.col0); /* right  */
  dest[2] = glms_vec4_add(t.col[3], t.col1); /* bottom */
  dest[3] = glms_vec4_sub(t.col[3], t.col1); /* top    */
  dest[4] = glms_vec4_add(t.col[3], t.col2); /* near   */
  dest[5] = glms_vec4_sub(t.col[3], t.col2); /* far    */

  ...
}

if it must be one or another, I think we should go with t.col[0] because it offer the possibility of t.col[i] where i is a variable previously calculated.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Then let's go with array (col[i]) for column accessors


#ifdef __AVX__
typedef union CGLM_ALIGN_IF(32) mat4s {
#else
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you use CGLM_ALIGN_MAT instead of CGLM_ALIGN_IF. CGLM_ALIGN_MAT already checks these.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, sounds good.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Making this change I notice that in the types.h file:

#ifdef __AVX__
#  define CGLM_ALIGN_MAT CGLM_ALIGN(32)
#else
#  define CGLM_ALIGN_MAT CGLM_ALIGN(16)
#endif

and later mat4 is defined like this:

#ifdef __AVX__
  typedef CGLM_ALIGN_IF(32) vec4  mat4[4];
#else
  typedef CGLM_ALIGN_IF(16) vec4  mat4[4];
#endif

shouldn't that just be:

typedef CGLM_ALIGN_MAT vec4  mat4[4];

CGLM_ALIGN_MAT is defined in terms of CGLM_ALIGN and not CGLM_ALIGN_IF, so I could be wrong here.

Copy link
Owner

@recp recp May 8, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch. I have updated it in on the master.

@acoto87
Copy link
Contributor Author

acoto87 commented May 7, 2019

Here is one thing nice about the struct API, type safety:

image

In the current implementation you rely on been able to pass vec4 as vec3 because they conform the same time: array of floats. In the struct API, you simply can't pass a vec4s to a vec3s and the compiler detect it. In this case, we need to call glms_vec4_distance instead of glms_vec3_distance.

- Added `ivec3s` type
- Struct implementation of: affine.h, box.h, color.h, curve.h, frutum.h, io.h, plane.h, project.h
- Deleted `glms_mat3_transpose_to` and `glms_mat4_transpose_to`
- Bug fixes in mat4.h
@recp
Copy link
Owner

recp commented May 8, 2019

Since glm_vec4_distance() is optimized with SIMD, it would be better to use it instead of vec3_distance.

But keep in mind that corners[GLM_RTF]'s last item and corners[GLM_RTN]'s last item must be same (equal 1 in general). If glm_frustum_corners() is used then this is guaranteed.

Maybe we should add new function called glm_vec4_distance3() which is safer than glm_vec4_distance().

But I'm not sure how glm_vec4_distance3() could work:

  1. Ignore last items; one could be 0, one could be 1 or greater.... One is projected, other one is not...
  2. Make a division like glm_vec4_scale(a[0], 1.0f / a[3], tmp[0]); then call glm_vec4_distance(a, b)

@acoto87
Copy link
Contributor Author

acoto87 commented May 8, 2019

But keep in mind that corners[GLM_RTF]'s last item and corners[GLM_RTN]'s last item must be same (equal 1 in general). If glm_frustum_corners() is used then this is guaranteed.

If we can ensure this, maybe as a precondition of the function, maybe put it in the comments that this is how the function expect it's params, it will not have problems since the w=1 will cancel each other in the distance calculation.

Maybe we should add new function called glm_vec4_distance3() which is safer than glm_vec4_distance().

I think this would be to assume so much about the vectors, for the reasons you put in your comment the vectors could be in any "format" (unprojected, projected, with w=0 or w=1, etc.). So it would be better to avoid dealing with that.

Maybe a solution could be dropping the w value of the vectors and go back to glms_vec3_distance:

CGLM_INLINE
void
glms_frustum_corners_at(vec4s  corners[8],
                        float splitDist,
                        float farDist,
                        vec4s  planeCorners[4]) {
  vec4s corner;
  float dist, sc;

  /* because distance and scale is same for all */
  dist = glms_vec3_distance(glms_vec3(corners[GLM_RTF]), glms_vec3(corners[GLM_RTN]));
  sc   = dist * (splitDist / farDist);

  ...
}

What do you think?

@recp
Copy link
Owner

recp commented May 8, 2019

What do you think?

Actually this is what array api did. It ignores w since it could be different... :(

For now, let's add a comment as your proposal, in the future things may be changed.

Maybe you should use array api by calling glm_frustum_corners_at()? To do that copy of corners array is required but compiler may optimize this step (or maybe not). I'm suggesting this because array api may be more optimized in the future, then the optimization will affect the struct api directly.

Thoughts?

@acoto87
Copy link
Contributor Author

acoto87 commented May 16, 2019

Yeah, I already did some changes to glms_vec*_pack and glms_vec*_unpack array of vec4s and vec4, like this:

CGLM_INLINE
void
glms_vec4_pack(vec4s dst[], vec4 src[], size_t len) {
    size_t i;
	
    for (i = 0; i < len; i++) {
        dst[i].x = src[i][0];
	dst[i].y = src[i][1];
	dst[i].z = src[i][2];
    }
}

CGLM_INLINE
void
glms_vec4_unpack(vec4 dst[], vec4s src[], size_t len) {
    size_t i;

    for (i = 0; i < len; i++) {
	dst[i][0] = src[i].x;
	dst[i][1] = src[i].y;
	dst[i][2] = src[i].z;
    }
}

so we can implement the functions that needs this like this:

CGLM_INLINE
void
glms_frustum_corners_at(vec4s corners[8],
                        float splitDist,
                        float farDist,
                        vec4s planeCorners[4]) {
    vec4 rawCorners[8];
    glms_vec4_unpack(rawCorners, corners, 8);

    vec4 rawPlaneCorners[4];
    glm_frustum_corners_at(rawCorners, splitDist, farDist, rawPlaneCorners);

    glms_vec4_pack(planeCorners, rawPlaneCorners, 8);
}

@acoto87
Copy link
Contributor Author

acoto87 commented May 16, 2019

What should we do about this functions (there are other in similar situation):

/*!
 * @brief decompose affine transform, TODO: extract shear factors.
 *        DON'T pass projected matrix here
 *
 * @param[in]  m affine transfrom
 * @param[out] t translation vector
 * @param[out] r rotation matrix (mat4)
 * @param[out] s scaling vector [X, Y, Z]
 */
CGLM_INLINE
void
glms_decompose(mat4s m, vec4s t, mat4s r, vec3s s) {
    glm_decompose(m.raw, t.raw, r.raw, s.raw);
}

They don't work if implemented like this, because when the user do:

mat4s m = ...;
vec4s t = ...;
mat4s r = ...;
vec3s s = ...;

glms_decompose(m, t, r, s);

their local variables are copied to the function parameter, thus the local variables aren't modified by the function.

So, the question is, what should we do if we want to output multiple values? I don't like to pass pointers (but if there is nothing else, I could live with that due to the nature of the function), and returning a struct bundled with the outputs is not good looking either.

Thoughts?

@acoto87
Copy link
Contributor Author

acoto87 commented May 16, 2019

There are functions like this one:

CGLM_INLINE
void
glms_frustum_corners_at(vec4s corners[8],
                        float splitDist,
                        float farDist,
                        vec4s planeCorners[4]) {

    vec4 rawCorners[8];
    vec4 rawPlaneCorners[4];

    glms_vec4_unpack(rawCorners, corners, 8);
    glm_frustum_corners_at(rawCorners, splitDist, farDist, rawPlaneCorners);
    glms_vec4_pack(planeCorners, rawPlaneCorners, 8);
}

in which makes sense to _unpack and _pack arrays from and to vec4s, because like you point out the non-struct version can be further optimized later and this implementation will benefit for free without any change (just with the overhead of copying the arrays), but there are functions like this one:

CGLM_INLINE
float
glms_aabb_size(vec3s box[2]) {
  return glms_vec3_distance(box[0], box[1]);
}

in which I think it would be less efficient to copy arrays when it can be easily implemented like above. What do you think about this? I think performance is always a priority here, but where do you stand on this type of consistency for the struct API implementation?

- Added `glms_vec4_pack` and `glms_vec4_unpack` to pack and unpack arrays of `vec4s`.
- Added `glms_vec3_pack` and `glms_vec3_unpack` to pack and unpack arrays of `vec3s`.
- Fixes in functions that accumulates in one parameter
-
@recp
Copy link
Owner

recp commented May 18, 2019

Sorry for the late response,

I think using pointers with restrict would be fine:

CGLM_INLINE
void
glms_decompose(mat4s m, vec4s * __restrict t, mat4s * __restrict r, vec3s * __restrict s) {
  glm_decompose(m.raw, t.raw, r.raw, s.raw);
}
mat4s m = ...;
vec4s t;
mat4s r;
vec3s s;

glms_decompose(m, &t, &r, &s);

this isn't so ugly, is it? I think it is acceptable and better than returning a big struct.

CGLM_INLINE
float
glms_aabb_size(vec3s box[2]) {
return glms_vec3_distance(box[0], box[1]);
}
in which I think it would be less efficient to copy arrays when it can be easily implemented like above.

instead of using glms_ or copying structs again, you could get some help from array api:

CGLM_INLINE
float
glms_aabb_size(vec3s box[2]) {
  return glm_vec3_distance(box[0].raw, box[1].raw);
}

I think performance is always a priority here, but where do you stand on this type of consistency for the struct API implementation?

let's make performance + simplicity our major priority...

where do you stand on this type of consistency for the struct API implementation?

I would prefer to not copy structs again and again inside implementation :/ I just don't want to duplicate implementation if it is possible.

  • optimizations: array apis can be more optimized in the future
  • bug fixes: when we fixed a bug in array api, struct api will be fixed too if it is just wrapper otherwise we have to apply same fix here too.
  • ...

it seems struct arrays make things harder for to be simple wrapper. I'll spend some time on this later. But your solution (_pack and _unpack) seems good. I hope compiler will optimize loops inside pack/unpack, if not we can use memcpy maybe...

@acoto87
Copy link
Contributor Author

acoto87 commented May 19, 2019

it seems struct arrays make things harder for to be simple wrapper.

For simple functions with non-array parameters it's fine, the issues begin when dealing with functions that have array as parameters or output values through out parameters. I agree with you that when reusing the API implementation, the struct API will gain performance improvements right out of the box, so we should rely on it for almost all (if not all) struct implementation.

I'm gonna make some checks about the optimizations that the compiler could do with the _pack/_unpack functions, to see if we can allow that overhead in some of the functions. I though of using memcpy, but I wasn't entirely sure about how alignment could affect that, if it safe, memcpy could deliver faster results.

- Now the functions that output mutliple values, such as glms_decompose_rs and glms_decompose receive pointers.
- Added missing comments to struct/vec3 and struct/vec4 files.
@acoto87
Copy link
Contributor Author

acoto87 commented May 20, 2019

@recp Can you check for the overall implementation of the struct API? I think it got to a point where we can check if is a valid implementation as a whole. Let me know what you think should be changed and/or improved.

Still need testing of course.

Gonna make those tests of _pack and _unpack functions shortly.

void
glms_aabb_invalidate(vec3s box[2]) {
// FIX: Modify param
//
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we remove this?

vec4s col1;
vec4s col2;
vec4s col3;
};
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

since we have col[4] array do we still need an anonymous struct for columns?

@recp
Copy link
Owner

recp commented May 21, 2019

@recp Can you check for the overall implementation of the struct API? I think it got to a point where we can check if is a valid implementation as a whole. Let me know what you think should be changed and/or improved.

I think it looks good, thank you.

A few things:

Styles:

  • 2 spaces for indentation (2 spaces to TAB)
  • No C++ style comments ( // )
  • No new line[s] for {

Proposal:

  • struct (singular) vs structs (plural) for folder name? Or glms is another option for folder name?

@recp
Copy link
Owner

recp commented May 21, 2019

Also I have forgot that we have vec2 and mat2 types/funcs which are (or were) in progress: https://github.com/recp/cglm/tree/vec2_mat2 (0729fd4)

I'll implement struct wrappers when they have finished, probably it will not finish before this PR, I guess.

- Include files in `cglms.h`
- Fix style issues (tabs to spaces, open brace without newline)
@acoto87
Copy link
Contributor Author

acoto87 commented May 21, 2019

I fix the style issues, let me know if there is anything else regarding this.
Also complete the cglms.h file to include the files under struct folder.

I prepare an example to see how much the compilers will optimize the _pack and _unpack functions. Here it is: https://godbolt.org/z/cTgeOH. Let me know if what you see is good enough, at first glance I see the GCC optimize quite a bit with -O2. I also include a variant with memcpy to compare.

@recp
Copy link
Owner

recp commented May 22, 2019

It seems compiler did not unroll the loop. Also instead of memcpy or assignments we could use glm_vec4_copy(src[i].raw, dst[i].raw); for vec4 pack/unpack. Maybe it could improve performance a bit. Because vec4 and vec4s are aligned if alignment is not disabled, compiler may not guess it is aligned so it may not use movaps for copying vec4.

EDIT:

We may also optimize glm_vec3_copy later; so to copy individual vectors, using vector's copy function could best option, I guess.

To copy vec4[] to vec4s[] maybe single memcpy could be an option and could be better maybe, just thinking 🤔

@acoto87
Copy link
Contributor Author

acoto87 commented May 22, 2019

To copy vec4[] to vec4s[] maybe single memcpy could be an option and could be better maybe, just thinking

Well yeah, that would be another option, a single memcpy should be faster also. Do you know if there could be any issues with alignment and memcpy?

For now I put glm_vec{3,4}_copy functions instead of direct assignments.

@acoto87
Copy link
Contributor Author

acoto87 commented May 22, 2019

Also I've this signature for _pack and _unpack functions:

glms_vec4_pack(vec4s dst[], vec4 src[], size_t len);
glms_vec4_unpack(vec4 dst[], vec4s src[], size_t len);

do you prefer to switch src and dst parameters?

glms_vec4_pack(vec4 src[], vec4s dst[], size_t len);
glms_vec4_unpack(vec4s src[], vec4 dst[], size_t len);

I modeled initially as a memcpy-like function, which have destination as the first parameter.

@recp
Copy link
Owner

recp commented May 23, 2019

Well yeah, that would be another option, a single memcpy should be faster also. Do you know if there could be any issues with alignment and memcpy?

Actually I'm not sure for certain, but I think compiler may generate instructions for unaligned memory when inlined memcpy. I must do some research for this.

I think real issue is about how vector and struct arrays lay out on the memory. I think memory layout of vec3|4[] and vec4|3s is same but I don't think is it guaranteed by C standards or compilers. I also need to do research about this.

Until we sure, let's ignore memcpy for now.

do you prefer to switch src and dst parameters?

No it looks good as it is.

Do you think it could be better to put dest as first parameter like memcpy or other C functions? Maybe we should create an issue for this to discuss.

@acoto87
Copy link
Contributor Author

acoto87 commented May 23, 2019

Do you think it could be better to put dest as first parameter like memcpy or other C functions? Maybe we should create an issue for this to discuss.

Well is that I was thinking that your API almost always have destination parameters after the source ones (except when the first parameter is an in-out one). I don't mind either way, but was wondering if it would be better to be consistent on that. Your call.

@soulfoam
Copy link

soulfoam commented May 24, 2019

If I may give my opinion, I’d say that I think the API should remain consistent and have src first. BUT, I also dislike the API of src being first. I always use dest first in my own code because most of the standard library is structured that way. When I use cglm I always end up switching the parameters because I type dest first and end up remembering it’s src first, and for some reason I do this every time haha. I’m used to 95% of the code I use taking dest first.

@recp
Copy link
Owner

recp commented May 24, 2019

@acoto87 @soulfoam thank you for your feedbacks.

most of the standard library is structured that way

/* multiply a and b store in dest */

mat4_mul(dest, a, b);    // dest < (a > b)  or  dest = a * b
mat4_mul(a, b, dest);    // (a > b) > dest

in the second one the flow is one direction (a is left matrix, b is right matrix). The first one is similar to assignment but it is not an assignment, actually. Probably this is why I used dst as last parameter. Now I'm not sure everyone is happy for this decision, if not we can update it.

API consistency is really really important but on the other hand, if everyone would like to pass destination as first parameter we can consider to switch src and dst order in the future to make everyone happy. But if we change the order, users may not realize that, this would be really bad. I don't think everyone can catch up this change. I'll create an issue for this later to get more feedbacks.

EDIT:

Similar discussion: https://internals.rust-lang.org/t/memcpy-is-backwards/1797

@recp recp added this to the v0.6.0 milestone May 25, 2019
@recp recp merged commit 56339b9 into recp:master May 25, 2019
@recp
Copy link
Owner

recp commented May 25, 2019

@acoto87 this PR is merged now. Thank you for your contribution, many thanks 🤗

Feel free to create new PRs if you have some local changes, or if you want to add new features or bug fixes...

@acoto87 acoto87 deleted the cglm-structs branch May 26, 2019 07:28
@FrostKiwi
Copy link
Contributor

As a result of how the struct API is implemented, clang + -Wall complains when initializing things.
vec2s topleft = {-1, 1}; results in

src/draw_border.c:149:20: warning: suggest braces around initialization of
      subobject [-Wmissing-braces]
                vec2s topleft = {-1, 1};
                                 ^~~~~
                                 {    }

https://stackoverflow.com/questions/13905200/is-it-wise-to-ignore-gcc-clangs-wmissing-braces-warning

Writing {{-1, 1}} is annoying, so I went with the option of WARN=-Wall -Wno-missing-braces -Wmissing-field-initializers. That disables the missing braces warning, but keeps the warning related to that alive.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants