Maybe use std::bitset instead of packed_t? #73

linas · 2018-12-13T03:26:24Z

packed_t is used to pack bits together. However, std::bitset might do this better, and it also offers more features, functions, etc. It might simplify the code, and might even run faster?


ngeiswei · 2018-12-13T06:53:32Z

I think it might simplify the code a bit; however, the coding complexity, IIRC, is more at the level of instance, which is a std::vector<packed_t>. Ideally we'd replace std::vector<packed_t> with std::bitset, but the size of a std::bitset is a compile-time template parameter and cannot be chosen at run time, which might be a problem.

ngeiswei · 2018-12-13T07:05:05Z

According to https://stackoverflow.com/questions/3134718/define-bitset-size-at-initialization there would be 2 ways to address that:

- std::vector<bool>, which has a curious specialization for bool that may pack the values as bits
- boost.dynamic_bitset https://www.boost.org/doc/libs/1_69_0/libs/dynamic_bitset/dynamic_bitset.html
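To make the size problem concrete, here is a minimal sketch of the first option using only the standard library. The names make_instance and count_set are hypothetical and are not part of MOSES; the point is only that std::bitset takes its width as a template argument, while std::vector<bool> takes it at run time.

```cpp
#include <bitset>
#include <cstddef>
#include <vector>

// std::bitset fixes its width at compile time via the template argument.
using fixed_bits = std::bitset<64>;

// Hypothetical stand-in for an instance whose width is only known at run time.
// std::vector<bool> is a specialization that implementations usually store
// packed, one bit per element.
std::vector<bool> make_instance(std::size_t nbits)
{
    return std::vector<bool>(nbits, false);
}

// std::vector<bool> has no count() member (unlike std::bitset),
// so count the set bits by hand.
std::size_t count_set(const std::vector<bool>& bits)
{
    std::size_t n = 0;
    for (bool b : bits) {
        if (b) ++n;
    }
    return n;
}
```

With this, make_instance(10) gives a 10-bit container whose elements are set with inst[3] = true. The trade-off is that std::vector<bool> lacks the whole-container bitwise operators that std::bitset and boost::dynamic_bitset provide.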

ngeiswei · 2018-12-13T07:11:38Z

I tagged it as "good first issue". It is not an easy task, but it doesn't require a full (or much) understanding of MOSES to tackle it.

linas · 2018-12-13T07:25:17Z

BTW, fairly off-topic, but: take a look at the brand new opencog/util/numeric.h circa line 130: nbits_to_pack -- that function is used to pack bits, and tries to align so that bits are on byte-boundaries or nibble-boundaries, or power-of-two boundaries. Presumably for performance??
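For readers who have not looked at that file, the power-of-two variant of the idea can be sketched roughly as below. This is an illustration only: aligned_field_width is a hypothetical name, and the code is not the actual nbits_to_pack from opencog/util/numeric.h, whose exact rules are not reproduced here.

```cpp
#include <cassert>

// Hypothetical helper, NOT the real nbits_to_pack.
// Rounds a field width up to the next power of two (1, 2, 4, 8, ...),
// so that fields of that width tile a 64-bit word exactly and no field
// straddles a word boundary.
unsigned aligned_field_width(unsigned nbits)
{
    assert(nbits >= 1 && nbits <= 64); // assumed range for this sketch
    unsigned width = 1;
    while (width < nbits) {
        width <<= 1;
    }
    return width;
}
```

Under this scheme a 3-bit field is stored in 4 bits and a 5-bit field in 8 bits, trading some wasted space for fields that never cross a nibble, byte or word boundary.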

But I learned that I can #define ALIGNED_NOT_ACTUALLY_REQUIRED and the moses unit tests still pass, and they don't go any slower (or faster) so there doesn't seem to be any need for byte-alignment (or possibly the unit test cases don't test anything that needs this?)

ngeiswei · 2018-12-13T10:28:19Z

Interesting. I don't understand why alignment would increase performance in the first place but I'm not very knowledgeable about the low level parts of a computer. Well, I had courses but that was always with toy architectures just to get a feel.

linas · 2018-12-14T05:29:25Z

The difference between adding two 64-bit ints and adding 2x 32-bit ints is simply cutting the carry bit-line between bit 31 and bit 32. Thus, from the mid-to-late 1990s onwards, the architectures introduced SIMD instructions (MMX and then SSE on x86) that support many kinds of ops on 2x32 or 4x16 or 8x8 ints with just one instruction. This was a boon that allowed MP3s and YouTube to flourish. But of course, for this to work, you have to be aligned on appropriate boundaries.
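The "cut the carry line" idea can be imitated in plain C++ without any special instructions. The sketch below is a software emulation (often called SWAR), not what the hardware or MOSES does; pack2x32 and add2x32 are hypothetical names. It adds two 32-bit lanes held in one 64-bit word and stops the carry out of bit 31 from leaking into bit 32.

```cpp
#include <cstdint>

// Pack two 32-bit lanes into one 64-bit word (hi lane in bits 32..63).
std::uint64_t pack2x32(std::uint32_t hi, std::uint32_t lo)
{
    return (static_cast<std::uint64_t>(hi) << 32) | lo;
}

// Add both lanes with one 64-bit addition, each lane wrapping modulo 2^32.
std::uint64_t add2x32(std::uint64_t a, std::uint64_t b)
{
    const std::uint64_t top = 0x8000000080000000ULL; // bit 31 and bit 63
    // Add the low 31 bits of each lane; the result cannot carry past
    // the top bit of its own lane.
    std::uint64_t low_sum = (a & ~top) + (b & ~top);
    // Fold the top bits back in with XOR, which discards the carry out.
    return low_sum ^ ((a ^ b) & top);
}
```

The layout matters here in the same way as for real SIMD: the trick only works because each value sits entirely inside its own 32-bit lane.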

There's even a trick where you can do if-statements without any branches at all: you use the sign-bit in one reg to route the result of an operation to one of two different regs. Really slick, since normally conditionals and branches fart out pipeline bubbles.
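A software analogue of that branch-free trick is a mask-based select. This is a sketch under assumptions: select_by_sign is a hypothetical name, and the code uses a bit mask derived from the sign bit rather than the register-routing instruction described above.

```cpp
#include <cstdint>

// Returns if_neg when cond is negative, otherwise if_nonneg,
// without any conditional branch in the source.
std::int32_t select_by_sign(std::int32_t cond,
                            std::int32_t if_neg,
                            std::int32_t if_nonneg)
{
    // Shift the sign bit down to bit 0, then negate in unsigned arithmetic:
    // the mask is all ones if cond < 0, and all zeros otherwise.
    std::uint32_t mask = 0u - (static_cast<std::uint32_t>(cond) >> 31);
    std::uint32_t a = static_cast<std::uint32_t>(if_neg);
    std::uint32_t b = static_cast<std::uint32_t>(if_nonneg);
    return static_cast<std::int32_t>((a & mask) | (b & ~mask));
}
```

Whether this beats a plain if depends on the compiler and CPU; optimizers often emit a conditional move for the straightforward version anyway.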

It's possible that the default build flags on moses don't enable these insns. Hmm. They are often disabled, so that binaries built on a cpu with these insns will also run on a cpu without these insns... That's a good question, I don't know the answer.

linas · 2018-12-14T05:32:21Z

gcc -Q --help=target

gives me:

  -mlong-double-128           		[disabled]
  -mlong-double-64            		[disabled]
  -mlong-double-80            		[enabled]

which explains earlier commentary about long double!!

linas · 2018-12-14T05:34:13Z

and this:

  -msse                       		[disabled]
  -msse2                      		[disabled]
  -msse2avx                   		[disabled]
  -msse3                      		[disabled]
  -msse4                      		[disabled]
  -msse4.1                    		[disabled]
  -msse4.2                    		[disabled]
  -msse4a                     		[disabled]
  -msse5                      		
  -msseregparm                		[disabled]
  -mssse3                     		[disabled]

so that optimization is disabled, as are pretty much all the others that I vaguely recognize.
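A possible cross-check on the listing above is to ask the compiler which instruction sets it is actually targeting, via the macros that GCC and Clang predefine (__SSE2__, __AVX2__ and so on). The sketch below assumes one of those compilers; enabled_simd_features is a hypothetical helper name. Compiling it with the same flags as moses should show whether the listing matches what the build really uses.

```cpp
#include <string>

// Builds a space-separated list of the SIMD extensions that the compiler
// was told to target, based on the macros GCC and Clang predefine.
std::string enabled_simd_features()
{
    std::string out;
#ifdef __SSE__
    out += "sse ";
#endif
#ifdef __SSE2__
    out += "sse2 ";
#endif
#ifdef __SSE4_2__
    out += "sse4.2 ";
#endif
#ifdef __AVX__
    out += "avx ";
#endif
#ifdef __AVX2__
    out += "avx2 ";
#endif
    return out;
}
```

Printing the result once with the default flags and once with -march=native would show which extensions the second build adds.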

linas · 2018-12-14T05:53:39Z

I just recompiled moses with -march=native which turns on all the special insns for my machine. Running unit tests, no speed difference at all in most of them. A handful ran 1% or 2% slower. But selectionUTest runs 15% faster.

linas · 2018-12-14T05:55:03Z

Ooops, no, it makes no difference to selectionUTest; I was looking at the wrong thing. So -march=native is officially a big disappointment...

linas · 2018-12-14T06:16:54Z

-march=native on atomspace is maybe 5% or maybe 10% faster. Or maybe not faster at all? There's a lot of jitter in the numbers; hard to say without a deeper look.

... took a deeper look. Effectively it makes no difference at all.

ngeiswei added the "enhancement" and "good first issue" (Good for newcomers) labels on Dec 13, 2018
