Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

remove -DLONGDOUBLE_TYPE=double from Makefile #337

Closed
wants to merge 4 commits into from

Conversation

brodycj
Copy link
Contributor

@brodycj brodycj commented Feb 3, 2020

in order to resolve the long double issue reported in #325

How tested:

Thanks to @AnyhowStep for raising issue #325 and pointing out the suspect -DLONGDOUBLE_TYPE=double flag in https://github.com/kripken/sql.js/issues/325#issuecomment-573239220.

I suspect that this flag was needed in an old version of asm.js, does not seem to be needed in new emsdk version. Note that I have tested with emsdk version 1.39.4; looks like they published emsdk version 1.39.7 pretty recently.

Follow-up steps if this proposal is accepted and merged:

Christopher J. Brody added 2 commits February 3, 2020 15:44
@AnyhowStep
Copy link

Thank you for looking into this! This has been gnawing at me for a while, now.

@brodycj brodycj mentioned this pull request Feb 3, 2020
1 task
Copy link
Member

@lovasoa lovasoa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you very much for your contribution ! Could you please add a test for this ?

A simple

assert.deepEqual(
    db.exec("SELECT 1.7976931348623157e+308 as a"), 
    [{columns:['a'],values:[[1.7976931348623157e+308]]}]
);

would be enough.

Also, does this impact performance ?

@brodycj
Copy link
Contributor Author

brodycj commented Feb 4, 2020

Could you please add a test for this ?

I encountered 2 problems:

  • Exact match of the number value did not work for me, likely due to the characteristics of binary encoding of decimal numbers.
  • The test case I proposed in PR [DRAFT ...] test super-large double value ref: issue #325 #336 succeeds in my work area but will not succeed in GitHub CI until someone rebuilds the dist artifacts. I thought this info was in the description, evidently not clear enough.

I would recommend maintainers consider merging Makefile changes proposed here and in #339 (SQLite 3.31.1 update), rebuild dist, then we can update and merge PR #336 with the added test case.

Also, does this impact performance ?

I cannot say anything for 100% sure without some kind of benchmarking or other measurement. And performance measurements can be a bit challenging to get right due to the number of possible factors that can lead to variance.

I would be very, very surprised if this change could lead to any significant impact, though. The increased amount of generated code should be extremely small in comparison to the amount of processing code that would be used to store and retrieve this kind of data in SQLite, and it should not introduce any new loops.

@lovasoa
Copy link
Member

lovasoa commented Feb 4, 2020

I'm not sure I understand what you are saying about the tests. Can you please add tests to this PR ? The github CI should be re-building dist before launching the tests, so the fact that dist is not updated in the PR itself should not matter.

About the numbers that do not match: either the number that is returned matches exactly what was was input and #325 is fixed, or it does not and the issue is still present. In what case are we ?

@brodycj
Copy link
Contributor Author

brodycj commented Feb 4, 2020

Looks like test is now included. To clarify about the exact match: it will not work due to the super-high precision, and I think this is very well known. So I am testing the following for both inline SELECT and SELECT with parameter:

  • returned value is greater than 1.7976931348623e+308
  • returned value is less than 1.797693134862316e+308
  • returned value is not equal to Infinity

I think and hope this is enough to prove that this behavior is now correct and will be tested moving forward.

Please let me know if there is anything else I need to do to get this proposal ready to merge. Thanks!

@lovasoa
Copy link
Member

lovasoa commented Feb 4, 2020

I included the tests but they do not pass (see CI).

There should not be any precision loss. What is the exact value that is returned and why is it not exactly the input value ?

Also, testing that the returned value is less than 1.797693134862316e+308 is not necessary. In javascript 1.797693134862316e+308 === Infinity

@brodycj
Copy link
Contributor Author

brodycj commented Feb 4, 2020

I don't have much more extra time to investigate this right now. I think I have already demonstrated that my proposal does improve the situation.

@lovasoa
Copy link
Member

lovasoa commented Feb 4, 2020

Ok, thank you for what you have done already @brodybits ! I'll try to spend some time on this to make it work later.

@brodycj
Copy link
Contributor Author

brodycj commented Mar 12, 2020

I think the precision loss observed here is not due to this library and maybe not even due to JavaScript itself, see this photo:

from this tweet:

Go home JavaScript...you've done enough pic.twitter.com/c8orjpz7W4

— Mike Hartington ([...]) March 12, 2020

@lovasoa
Copy link
Member

lovasoa commented Mar 12, 2020

@brodybits What you are showing is just a very well known property of floating point numbers. It has nothing to do with what we are dealing with here: both javascript and sqlite have the notion of a 64-bits IEEE754 number, so it should be possible to get the values from one to the other without any precision loss.

@brodycj
Copy link
Contributor Author

brodycj commented Mar 12, 2020

For 64-bit integer values, I would definitely agree that we should not suffer from loss of precision. But I can still see potential for loss of precision on 64-bit floating point numbers. I found interesting sections in a couple resources from a quick search:

Maybe I am arguing too much and thinking too little, but I am really not surprised by this kind of rounding error. In general, floating points are stored with the fractional part in 2s complement, while human-readable fractional part would be in 10s complement. My understanding from many many years ago is that there is no way to prevent rounding issues between 2s complement and 10s complement, except for using excessive precision beyond what we are really interested in. I wish I had more time to research, investigate, and explain this better.

For now, I will just use this proposal in my own build since I think removing -DLONGDOUBLE_TYPE=double does give improved precision with minuscule percentage resource cost. Sorry that my answers are more argumentative than properly researched and thought out here.

@AnyhowStep
Copy link

AnyhowStep commented Mar 13, 2020

Except, they both use the same format to represent the double data type. And they can both parse and perform math on the numbers with the same answer.

I have an idea I'll need to test when I get home.

  1. sql.js : INSERT INTO myTable VALUES (1.7976931348623157e+308)
  2. sql.js : Export to database file
  3. Any other SQLite implementation : Import database file
  4. SELECT * FROM myTable

If it returns 1.7976931348623157e+308, then we know the problem happens when sending values from the wasm to JS.

If it returns Infinity, then we know the wasm itself is parsing the string 1.7976931348623157e+308 as Infinity before storing it.

For reference, db-fiddle SQLite can handle the number just fine,
https://www.db-fiddle.com/f/kGBw8zdD9oJ8EYXBKHT2zV/0

@kaizhu256
Copy link
Member

tried out @AnyhowStep 's experiment both ways in sql.js <-> python. the value ends up as infinity in both.

  1. save db in sql.js -> import in python
    image

  2. save db in python -> import in sql.js
    image

@AnyhowStep
Copy link

AnyhowStep commented Mar 13, 2020

1.7976931348623157e+308
Not
1.797693134862316e+308

1.797693134862316e+308 evaluating to Infinity is fine.
I made a typo when making the original comment on my phone. Sorry =(

@AnyhowStep
Copy link

AnyhowStep commented Mar 14, 2020

sqlite3 binary to sql.js

sqlite> CREATE TABLE T (version VARCHAR, f REAL);
sqlite> INSERT INTO T VALUES (sqlite_version(), 1.7976931348623157e+308);
sqlite> .save sqlite-1.7976931348623157e+308.db
sqlite> .exit

image

version f
3.31.1 1.7976931348623157e+308

sql.js to sqlite3 binary

image

sqlite> .open ../Downloads/sql.js.1.7976931348623157e+308.db
sqlite> SELECT * FROM T;
3.28.0|Inf

So, looks to me like the wasm can load 1.7976931348623157e+308 just fine from a .db file.
But it can't parse that value from a SQL string.

@kaizhu256
Copy link
Member

yep, 1.7976931348623157e+308 is finite in javascript, sql.js, python.

fyi, there's a subtle sqlite-gotcha, where if you inline it (or CAST from string), its infinity:

image

you can reproduce above by going to https://kaizhu256.github.io/demo-sqljs-csv/ and copy-paste following code in dev-console:

var worker = new Worker("https://kaizhu256.github.io/demo-sqljs-csv/assets.sqljs-v2019.11.5.rollup.js");
worker.onmessage = function (evt) {
    console.log(JSON.stringify(evt.data.results[0], undefined, 4));
}
worker.postMessage({
    action: "exec",
    params: {
        "@str": "1.7976931348623157e+308",
        "@num": 1.7976931348623157e+308,
},
    sql: `
DROP TABLE IF EXISTS myTable;
CREATE TABLE myTable (type TEXT, val REAL);
INSERT INTO myTable (type, val) VALUES ('inline', 1.7976931348623157e+308);
INSERT INTO myTable (type, val) VALUES ('param-string', @str);
INSERT INTO myTable (type, val) VALUES ('param-number', @num);
SELECT * FROM myTable
`
});

/* console-output:
{
    "columns": [
        "type",
        "val"
    ],
    "values": [
        [
            "inline",
            null // Infinity
        ],
        [
            "param-string",
            null // Infinity
        ],
        [
            "param-number",
            1.7976931348623157e+308
        ]
    ]
}
 */

@AnyhowStep
Copy link

AnyhowStep commented Mar 14, 2020

yep, 1.7976931348623157e+308 is finite in ... sql.js ...

Except, it's not.
image


It's weird that using the @num with the number 1.7976931348623157e+308 works...

Did you do something different with your fork?

@kaizhu256
Copy link
Member

@AnyhowStep, it is finite in sql.js. the gotcha is it cannot be inlined in sql-statement:

-- gotcha - infinite (because sqlite has to parse inlined string)
INSERT INTO myTable (type, val) VALUES ('inline', 1.7976931348623157e+308);
-- gotcha - infinite (because sqlite has to cast from string)
INSERT INTO myTable (type, val) VALUES ('param-string', @str);
-- finite
INSERT INTO myTable (type, val) VALUES ('param-number', @num);

@AnyhowStep
Copy link

Invoking your code on https://sql-js.github.io/sql.js/examples/GUI/ doesn't work,

image

So, I'm wondering if your fork is somehow different.

@kaizhu256
Copy link
Member

kaizhu256 commented Mar 14, 2020

@AnyhowStep the difference is that official demo is lagging and doesn't support passing params to workers (see merged commit 632effa). you can clone the gh-pages branch, and apply following patch (and use the debug version of sql.js):

--- a/dist/worker.sql-wasm-debug.js
+++ b/dist/worker.sql-wasm-debug.js
     * @param {string} sql a string containing some SQL text to execute
     * @return {Database.QueryExecResult[]} The results of each statement
     */
-    Database.prototype["exec"] = function exec(sql) {
+    Database.prototype["exec"] = function exec(sql, params) {
         var curresult;
         var stmt;
         if (!this.db) {
...
                 if (pStmt !== NULL) {
                     curresult = null;
                     stmt = new Statement(pStmt, this);
+                    if (params != null) {
+                        stmt.bind(params);
+                    }
                     while (stmt["step"]()) {
                         if (curresult === null) {
                             curresult = {
...
             return postMessage({
                 id: data["id"],
-                results: db.exec(data["sql"])
+                results: db.exec(data["sql"], data["params"])
             });
         case "each":
             if (db === null) {

@AnyhowStep
Copy link

AnyhowStep commented Mar 14, 2020

So.. All that's left is to figure out why inlining causes it to be Infinity on sql.js, but inlining works with the sqlite3 binary?

Also, thanks for taking the time to investigate this! I know not many people get bothered by small precision differences but it's been nagging at me =(

@kaizhu256
Copy link
Member

yes, and maybe CAST <str> AS REAL as well

@lovasoa
Copy link
Member

lovasoa commented Apr 4, 2020

closed in favor of #381

@lovasoa lovasoa closed this Apr 4, 2020
@brodycj brodycj deleted the fix-long-double-issue branch April 5, 2020 02:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants