Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement indexed map access (incubating feature) #2068

Closed
wants to merge 1 commit into from

Conversation

mgreter
Copy link
Contributor

@mgreter mgreter commented May 2, 2016

Fixes #1930

Was even simpler than anticipated. Had to add a deque object to the hash map implementation. I also tried to switch to a std::map, but it doesn't seem to support a custom hasher and only works with compare operator (which is not good, since we cannot really compare different types of ast nodes). It wouldn't had much benefit anyway (we would only get an ordered iterator over all pairs). As always with hash maps, it more of a guess which implementation would lead to the best performance.

IMO this implementation is pretty close to being optimal (in terms of performance). Random/Indexed access is constant O(1) due to the use of deque. I didn't put too much thought into the API, but it should be straight forward to use. You can fetch key or value by index. As I already wrote in other threads, the list/map interfaces/types/classes should be refactored and unified at some point (basically hiding all the logic currently implemented in functions.cpp and elsewhere).

If you can point me to other functions that allow random/indexed access to maps, please add them in this thread. Will see if I can also implement them. And yes, this is still missing proper spec test coverage.

@mgreter mgreter added this to the 3.3.7 milestone May 2, 2016
@mgreter mgreter self-assigned this May 2, 2016
@mgreter mgreter force-pushed the bugfix/issue_1930 branch from 5d40866 to e6bf263 Compare May 2, 2016 23:18
@mgreter mgreter force-pushed the bugfix/issue_1930 branch from e6bf263 to 2e0177f Compare May 3, 2016 00:23
@mgreter mgreter force-pushed the bugfix/issue_1930 branch from 2e0177f to 14da927 Compare May 3, 2016 00:33
@coveralls
Copy link

coveralls commented May 3, 2016

Coverage Status

Coverage decreased (-0.1%) to 79.095% when pulling 14da927 on mgreter:bugfix/issue_1930 into d8fb898 on sass:master.

@xzyfer
Copy link
Contributor

xzyfer commented May 3, 2016

it doesn't seem to support a custom hasher

This is why unordered map is used. We maintain the ordering by keeping the keys ordered in a vector on insert.

@xzyfer
Copy link
Contributor

xzyfer commented May 3, 2016

Can you elaborate why you need the ordered vector? Hashed is already ordered. The has itself is an unordered list but the insert order of the keys is tracked.

It looks like Hashed's iterator has been broken. It should iterate over the list_ vector and look up the value in the elements_ hash.

adjust_after_pushing(p);
return *this;
}
Expression* key(size_t i) {
return ordered.at(i * 2);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

keys().at(i)

@mgreter
Copy link
Contributor Author

mgreter commented May 3, 2016

@xzyfer Can you elaborate where "we maintain the ordering by keeping the keys ordered in a vector on insert". I did not find any vector where we "keep" the insertion order, that is exactly why I have added "deque".

Edit. Guess I should have looked more closely. Found it sigh

@xzyfer
Copy link
Contributor

xzyfer commented May 3, 2016

FWIW I don't get email updates when editing comments. It's worth just posting a new comment so I'm not email back out of context replies. Timezones 😞

@mgreter
Copy link
Contributor Author

mgreter commented May 3, 2016

I also added the values to the "vector", since it should make it faster to iterate over values then by fetching each key from the hash map itself. It may also be more effecient to use deque instead of a vector. We can also easily implement optimal iterators over values and keys. To access keys or values you just have to multiply the index by two (and adding 1 for values).

@xzyfer
Copy link
Contributor

xzyfer commented May 3, 2016

The extra memory doesn't seem worth it. Is there a use case for iterating over values?

@xzyfer
Copy link
Contributor

xzyfer commented May 3, 2016

The overwhelming majority of map usage in Sass is direct key look up. We should optimise for that use case above all else.

@mgreter
Copy link
Contributor Author

mgreter commented May 3, 2016

I agree that direct key look up is the most important one, but I would like to guarantee that it doesn't degrade too much in worst case scenario. I need to make some performance assessments first, to determine if this is the case with repeated key access vs. random access via deque. IMO the memory overhead is very little compared to the actual data that is stored (4 to 8 additional bytes per item, so 1MB should fit 128k items on x64 arch). We do not have many places where this is currently used. On a first glimpse it seems to only be the case in map destructors and we could probably refactor that to use the iterator (no need to destroy in insert order).

On the other hand there are the C-API functions that basically have to traverse a map in this way if they need to do so. It seems that i.e. sass_map_get_value etc. are broken anyway currently, since it returns v->map.pairs[i].value (i of type size_t). It looks the performance implications for access by key are "Average case O(1), worst case O(size()).". I figure that depends on how well our hash function avoids collisions. On the other hand each access via keys has to go through the hashing function. So there is always a constant overhead for that. But as I said, not sure how that translates to real world scenarios, so again not sure if the memory/cpu tradeoff is worth it.

But I think it probably makes sense to use "deque" instead of vector in "Hashed", as appending stuff is cheaper for deque and random index access is also O(1). There are always pros and cons.

@xzyfer
Copy link
Contributor

xzyfer commented May 3, 2016

I'm 👍 for make list_ a deque

@xzyfer
Copy link
Contributor

xzyfer commented May 3, 2016

The tradeoff in code complexity is also something to consider. I'll take slight less efficient, user friendly, maintainable code instead of overly complicated, over optimized code any day.

@xzyfer
Copy link
Contributor

xzyfer commented May 3, 2016

With that said adopting deque isn't necessary to fix the issue at hand. I'd
avoid conflating the two is this PR.

@xzyfer xzyfer modified the milestones: 3.3.7, 3.4.1 May 20, 2016
@mgreter mgreter removed their assignment May 22, 2016
@mgreter mgreter closed this May 22, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Support maps in list context (index, nth, etc)
3 participants