bug: struct field names in DDLs should be quoted #8771

Closed
1 task done
NickCrews opened this issue Mar 25, 2024 · 5 comments · Fixed by #8777
Labels
bug Incorrect behavior inside of ibis

Comments

@NickCrews
Contributor

What happened?

Found in #8765.

import ibis

url = "https://storage.googleapis.com/ibis-debugging/election2020_single_chunk.jsonl"
t = ibis.read_json(url)
t = t.cache()

This raises ParserException: Parser Error: syntax error at or near "order"

What version of ibis are you using?

main

What backend(s) are you using, if any?

duckdb

Relevant log output

No response

Code of Conduct

  • I agree to follow this project's Code of Conduct
@NickCrews NickCrews added the bug Incorrect behavior inside of ibis label Mar 25, 2024
NickCrews added a commit to NickCrews/ibis that referenced this issue Mar 25, 2024
@cpcloud
Member

cpcloud commented Mar 25, 2024

@NickCrews Thanks for the issue! It would appear that the yaks require shaving 😂

@NickCrews
Contributor Author

🐄 🪒

@cpcloud
Member

cpcloud commented Mar 25, 2024

Based only on the error message I suspect something isn't quoted where it should be, since "order" is part of the "order by" keyword sequence.
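
To illustrate the kind of failure suspected here, the following is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in for DuckDB (an assumption made only so the example runs without extra dependencies; the two engines differ in their error text and keyword lists). The helper name `try_create` is hypothetical. It shows that a reserved word such as `order` is rejected as a bare column name but accepted once it is double-quoted.

```python
import sqlite3


def try_create(column_sql: str) -> str:
    """Run a one-column CREATE TABLE and report whether it parsed."""
    con = sqlite3.connect(":memory:")
    try:
        con.execute(f"CREATE TABLE t ({column_sql} INTEGER)")
        return "ok"
    except sqlite3.OperationalError as exc:
        # The parser rejects the reserved word used as a bare identifier.
        return f"error: {exc}"
    finally:
        con.close()


# Unquoted: the parser reads `order` as the start of ORDER BY and fails.
unquoted = try_create("order")
# Quoted: the same name is treated as an ordinary identifier.
quoted = try_create('"order"')

print(unquoted)
print(quoted)
```

The same reasoning would apply to any identifier position in a DDL statement, including field names nested inside a STRUCT type.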

@NickCrews
Contributor Author

NickCrews commented Mar 25, 2024

You are correct. It looks like struct field names should be quoted. After #8772, I see that the generated SQL is:

CREATE TEMPORARY TABLE "ibis_cache_2qenquif65d65pb2apheifvvfu" (
    "data" STRUCT(
        races STRUCT(
            race_id TEXT,
            race_slug TEXT,
            url TEXT,
            state_page_url TEXT,
            ap_polls_page TEXT,
            race_type TEXT,
            election_type TEXT,
            election_date DATE,
            runoff BOOLEAN,
            race_name TEXT,
            office TEXT,
            officeid TEXT,
            race_rating TEXT,
            seat TEXT,
            seat_name TEXT,
            state_id TEXT,
            state_slug TEXT,
            state_name TEXT,
            state_nyt_abbrev TEXT,
            state_shape TEXT,
            state_aspect_ratio DOUBLE,
            party_id TEXT,
            uncontested BOOLEAN,
            report BOOLEAN,
            result TEXT,
            result_source TEXT,
            gain BOOLEAN,
            lost_seat TEXT,
            votes BIGINT,
            electoral_votes BIGINT,
            absentee_votes BIGINT,
            absentee_counties BIGINT,
            absentee_count_progress TEXT,
            absentee_outstanding JSON,
            absentee_max_ballots JSON,
            provisional_outstanding JSON,
            provisional_count_progress JSON,
            poll_display TEXT,
            poll_countdown_display TEXT,
            poll_waiting_display TEXT,
            poll_time TEXT,
            poll_time_short TEXT,
            precincts_reporting BIGINT,
            precincts_total BIGINT,
            reporting_display TEXT,
            reporting_value TEXT,
            eevp BIGINT,
            tot_exp_vote BIGINT,
            eevp_source TEXT,
            eevp_value TEXT,
            eevp_display TEXT,
            county_data_source TEXT,
            incumbent_party TEXT,
            no_forecast BOOLEAN,
            last_updated TEXT,
            candidates STRUCT(
                candidate_id TEXT,
                candidate_key TEXT,
                first_name TEXT,
                last_name TEXT,
                order BIGINT,
                name_display TEXT,
                party_id TEXT,
                incumbent BOOLEAN,
                runoff BOOLEAN,
                winner BOOLEAN,
                votes BIGINT,
                percent DOUBLE,
                percent_display TEXT,
                electoral_votes BIGINT,
                absentee_votes BIGINT,
                absentee_percent DOUBLE,
                img_url TEXT,
                has_image BOOLEAN,
                link TEXT,
                pronoun TEXT,
                result_source TEXT
            ) [],
            has_incumbent BOOLEAN,
            leader_margin_value DOUBLE,
            leader_margin_votes BIGINT,
            leader_margin_display TEXT,
            leader_margin_name_display TEXT,
            leader_party_id TEXT,
            counties STRUCT(
                fips TEXT,
                name TEXT,
                votes BIGINT,
                absentee_votes BIGINT,
                reporting BIGINT,
                precincts BIGINT,
                absentee_method TEXT,
                eevp BIGINT,
                tot_exp_vote BIGINT,
                eevp_value TEXT,
                eevp_display TEXT,
                eevp_source TEXT,
                turnout_stage BIGINT,
                absentee_count_progress TEXT,
                absentee_outstanding JSON,
                absentee_max_ballots BIGINT,
                provisional_outstanding JSON,
                provisional_count_progress JSON,
                results STRUCT(
                    trumpd BIGINT,
                    bidenj BIGINT,
                    jorgensenj BIGINT,
                    venturaj BIGINT,
                    pierceb BIGINT,
                    blankenshipd BIGINT,
                    de_la_fuenter BIGINT,
                    write_ins BIGINT,
                    la_rivag BIGINT,
                    collinsp BIGINT,
                    westk BIGINT,
                    hawkinsh BIGINT,
                    gammonc BIGINT,
                    carrollb BIGINT,
                    myersj BIGINT,
                    hammonsb BIGINT,
                    charlesm BIGINT,
                    kopitkek BIGINT,
                    mchughj BIGINT,
                    jacob_fambrop BIGINT,
                    huberb BIGINT,
                    hunterd BIGINT,
                    kennedya BIGINT,
                    scottj BIGINT,
                    kishorej BIGINT,
                    kingr BIGINT,
                    simmonsj BIGINT,
                    boddiep BIGINT,
                    hoeflingt BIGINT,
                    segalj BIGINT,
                    tittles BIGINT,
                    none_of_these_candidates BIGINT,
                    paigeh BIGINT,
                    lafontainec BIGINT,
                    duncanr BIGINT,
                    mccormick BIGINT,
                    swingg BIGINT,
                    scalfz BIGINT
                ),
                results_absentee STRUCT(
                    trumpd BIGINT,
                    bidenj BIGINT,
                    jorgensenj BIGINT,
                    venturaj BIGINT,
                    pierceb BIGINT,
                    blankenshipd BIGINT,
                    de_la_fuenter BIGINT,
                    write_ins BIGINT,
                    la_rivag BIGINT,
                    collinsp BIGINT,
                    westk BIGINT,
                    hawkinsh BIGINT,
                    gammonc BIGINT,
                    carrollb BIGINT,
                    myersj BIGINT,
                    hammonsb BIGINT,
                    charlesm BIGINT,
                    kopitkek BIGINT,
                    mchughj BIGINT,
                    jacob_fambrop BIGINT,
                    huberb BIGINT,
                    hunterd BIGINT,
                    kennedya BIGINT,
                    scottj BIGINT,
                    kishorej BIGINT,
                    kingr BIGINT,
                    simmonsj BIGINT,
                    boddiep BIGINT,
                    hoeflingt BIGINT,
                    segalj BIGINT,
                    tittles BIGINT,
                    none_of_these_candidates BIGINT,
                    paigeh BIGINT,
                    lafontainec BIGINT,
                    duncanr BIGINT,
                    mccormick BIGINT,
                    swingg BIGINT,
                    scalfz BIGINT
                ),
                last_updated TEXT,
                leader_margin_value DOUBLE,
                leader_margin_display TEXT,
                leader_margin_name_display TEXT,
                leader_party_id TEXT,
                margin2020 DOUBLE,
                votes2016 BIGINT,
                margin2016 DOUBLE,
                votes2012 BIGINT,
                margin2012 DOUBLE
            ) [],
            votes2016 BIGINT,
            margin2016 DOUBLE,
            clinton2016 BIGINT,
            trump2016 BIGINT,
            votes2012 BIGINT,
            margin2012 DOUBLE,
            expectations_text TEXT,
            expectations_text_short TEXT,
            absentee_ballot_deadline BIGINT,
            absentee_postmark_deadline BIGINT,
            update_sentences STRUCT(
                top_level STRUCT(
                    timestamp BIGINT,
                    is_new BOOLEAN,
                    hide_timestamp BOOLEAN,
                    overrideText TEXT,
                    sentence TEXT,
                    generatedText TEXT,
                    sentence_type TEXT
                ),
                winner_card_leadin STRUCT(
                    timestamp BIGINT,
                    is_new BOOLEAN,
                    hide_timestamp BOOLEAN,
                    overrideText TEXT,
                    sentence TEXT,
                    generatedText TEXT,
                    sentence_type TEXT
                ),
                eevp STRUCT(
                    sentence TEXT,
                    timestamp BIGINT,
                    is_new BOOLEAN,
                    hide_timestamp BOOLEAN,
                    overrideText JSON,
                    generatedText TEXT,
                    sentence_type TEXT
                ),
                eevp_leadin STRUCT(
                    sentence TEXT,
                    timestamp BIGINT,
                    is_new BOOLEAN,
                    hide_timestamp BOOLEAN,
                    overrideText JSON,
                    generatedText TEXT,
                    sentence_type TEXT
                ),
                counties STRUCT(
                    sentence TEXT,
                    timestamp BIGINT,
                    is_new BOOLEAN,
                    hide_timestamp BOOLEAN,
                    overrideText TEXT,
                    generatedText TEXT,
                    sentence_type TEXT
                )
            ),
            race_diff STRUCT(
                race_slug TEXT,
                boolean_things_that_happened STRUCT(
                    zero_votes BOOLEAN,
                    results_expected_within_hour BOOLEAN,
                    show_nothing_votes_decreased BOOLEAN,
                    went_to_runoff BOOLEAN,
                    race_won BOOLEAN,
                    race_just_won BOOLEAN,
                    race_unwon BOOLEAN,
                    additional_votes_reported BOOLEAN,
                    additional_precincts_reported BOOLEAN,
                    has_eevp BOOLEAN,
                    first_precincts_reported BOOLEAN,
                    first_votes_reported BOOLEAN,
                    candidate_took_the_lead BOOLEAN,
                    candidate_took_the_lead_first_alignment BOOLEAN,
                    candidate_took_the_lead_final_alignment BOOLEAN,
                    candidates_tied BOOLEAN,
                    precinct_percentage_points_grew_by_more_than_2_percent BOOLEAN,
                    precinct_percentage_points_grew_by_more_than_5_percent BOOLEAN,
                    eevp_grew_by_more_than_2_percent BOOLEAN,
                    eevp_grew_by_more_than_5_percent BOOLEAN,
                    candidate_lead_margin_grew_by_more_than_1_with_more_than_5_percent_precincts_reporting BOOLEAN,
                    candidate_lead_margin_shrank_by_more_than_1_with_more_than_5_percent_precincts_reporting BOOLEAN,
                    candidate_lead_margin_grew_by_more_than_1_with_more_than_5_eevp BOOLEAN,
                    candidate_lead_margin_shrank_by_more_than_1_with_more_than_5_eevp BOOLEAN,
                    top_two_candidates_less_than_three_percentage_points_away_with_at_least_50_percent_reporting BOOLEAN,
                    prev_top_two_candidates_less_than_three_percentage_points_away_with_at_least_50_percent_reporting BOOLEAN,
                    precinct_percentage_reporting_passes_80 BOOLEAN,
                    precinct_percentage_reporting_passes_90 BOOLEAN,
                    precinct_percentage_reporting_reaches_100 BOOLEAN,
                    top_two_candidates_less_than_three_percentage_points_away_with_at_least_50_eevp BOOLEAN,
                    prev_top_two_candidates_less_than_three_percentage_points_away_with_at_least_50_eevp BOOLEAN,
                    first_votes_for_some_county BOOLEAN,
                    candidate_wins_some_county BOOLEAN,
                    additional_votes_for_at_least_one_county BOOLEAN,
                    first_votes_for_only_one_county BOOLEAN,
                    biden_takes_at_least_one_more_2016_trump_county BOOLEAN,
                    trump_takes_at_least_one_more_2016_clinton_county BOOLEAN,
                    at_least_one_county_flipped BOOLEAN,
                    at_least_one_county_just_flipped BOOLEAN
                ),
                details_about_changes JSON
            ),
            winnerCalledTimestamp BIGINT,
            timeseries STRUCT(
                vote_shares STRUCT(trumpd DOUBLE, bidenj DOUBLE),
                votes BIGINT,
                eevp BIGINT,
                eevp_source TEXT,
                timestamp TEXT
            ) [],
            edison_exit_polls_page TEXT,
            townships STRUCT(
                name TEXT,
                fips TEXT,
                fips_town TEXT,
                votes BIGINT,
                absentee_votes BIGINT,
                reporting BIGINT,
                precincts BIGINT,
                absentee_method TEXT,
                results STRUCT(
                    bidenj BIGINT,
                    trumpd BIGINT,
                    jorgensenj BIGINT,
                    hawkinsh BIGINT,
                    write_ins BIGINT,
                    de_la_fuenter BIGINT,
                    carrollb BIGINT,
                    la_rivag BIGINT,
                    westk BIGINT,
                    paigeh BIGINT,
                    lafontainec BIGINT,
                    kennedya BIGINT,
                    blankenshipd BIGINT,
                    kopitkek BIGINT,
                    segalj BIGINT,
                    collinsp BIGINT,
                    duncanr BIGINT,
                    huberb BIGINT,
                    mccormick BIGINT,
                    pierceb BIGINT,
                    scalfz BIGINT,
                    swingg BIGINT
                ),
                eevp BIGINT,
                tot_exp_vote BIGINT,
                eevp_value TEXT,
                eevp_display TEXT,
                turnout_stage BIGINT,
                last_updated TEXT,
                results_absentee STRUCT(
                    bidenj BIGINT,
                    trumpd BIGINT,
                    jorgensenj BIGINT,
                    hawkinsh BIGINT,
                    write_ins BIGINT,
                    de_la_fuenter BIGINT,
                    carrollb BIGINT,
                    la_rivag BIGINT,
                    westk BIGINT,
                    paigeh BIGINT,
                    lafontainec BIGINT,
                    kennedya BIGINT,
                    blankenshipd BIGINT,
                    kopitkek BIGINT,
                    segalj BIGINT,
                    collinsp BIGINT,
                    duncanr BIGINT,
                    huberb BIGINT,
                    mccormick BIGINT,
                    pierceb BIGINT,
                    scalfz BIGINT,
                    swingg BIGINT
                ),
                leader_margin_value DOUBLE,
                leader_margin_display TEXT,
                leader_margin_name_display TEXT,
                leader_party_id TEXT
            ) [],
            nyt_race_description TEXT,
            precinct_metadata JSON,
            model_metadata JSON,
            electoral_vote_details STRUCT(
                type TEXT,
                name TEXT,
                electoral_votes BIGINT,
                winning_candidate_id TEXT,
                source TEXT,
                accept_ap BOOLEAN
            ) [],
            congressional_districts STRUCT(
                votes BIGINT,
                name TEXT,
                reporting BIGINT,
                precincts BIGINT,
                results STRUCT(
                    jorgensenj BIGINT,
                    trumpd BIGINT,
                    hawkinsh BIGINT,
                    bidenj BIGINT,
                    de_la_fuenter BIGINT,
                    write_ins BIGINT
                )
            ) []
        ) [],
        party_control STRUCT(
            race_type TEXT,
            state_id TEXT,
            needed_for_control BIGINT,
            total BIGINT,
            no_election STRUCT(democrat BIGINT, republican BIGINT, other BIGINT),
            winner TEXT,
            parties STRUCT(
                democrat STRUCT(
                    party_id TEXT,
                    name_display TEXT,
                    name_abbr TEXT,
                    count BIGINT,
                    votes BIGINT,
                    percent TEXT,
                    change BIGINT,
                    flip_count BIGINT,
                    leader_count BIGINT,
                    change_full TEXT,
                    change_abbr TEXT,
                    flip_text_full TEXT,
                    flip_text_extra TEXT,
                    flip_text_abbr TEXT,
                    winner BOOLEAN
                ),
                republican STRUCT(
                    party_id TEXT,
                    name_display TEXT,
                    name_abbr TEXT,
                    count BIGINT,
                    votes BIGINT,
                    percent TEXT,
                    change BIGINT,
                    flip_count BIGINT,
                    leader_count BIGINT,
                    change_full TEXT,
                    change_abbr TEXT,
                    flip_text_full TEXT,
                    flip_text_extra TEXT,
                    flip_text_abbr TEXT,
                    winner BOOLEAN
                ),
                other STRUCT(
                    party_id TEXT,
                    name_display TEXT,
                    name_abbr TEXT,
                    count BIGINT,
                    votes BIGINT,
                    percent TEXT,
                    change BIGINT,
                    flip_count BIGINT,
                    leader_count BIGINT,
                    change_full TEXT,
                    change_abbr TEXT,
                    flip_text_full TEXT,
                    flip_text_extra TEXT,
                    flip_text_abbr TEXT,
                    winner BOOLEAN
                )
            ),
            winnerCalledTimestamp BIGINT
        ) [],
        liveUpdates STRUCT(
            id TEXT,
            author TEXT,
            author_title_or_location TEXT,
            text TEXT,
            link_url TEXT,
            link_text TEXT,
            linked_state_1 TEXT,
            linked_state_2 TEXT,
            linked_state_3 TEXT,
            image_url TEXT,
            image_credit TEXT,
            datetime BIGINT,
            author_headshot TEXT,
            is_today BOOLEAN,
            linked_states STRUCT(state_id TEXT, state_name TEXT, url TEXT) [],
            hide_from_homepage BOOLEAN,
            hide_from_liveblog BOOLEAN,
            include_on_homepage BOOLEAN,
            include_in_reporter_updates_feed BOOLEAN,
            office TEXT,
            type TEXT,
            call_type TEXT,
            race_id TEXT,
            winner STRUCT(
                candidate_id TEXT,
                candidate_key TEXT,
                first_name TEXT,
                last_name TEXT,
                order BIGINT,
                name_display TEXT,
                party_id TEXT,
                incumbent BOOLEAN,
                runoff BOOLEAN,
                winner BOOLEAN,
                votes BIGINT,
                percent DOUBLE,
                percent_display TEXT,
                electoral_votes BIGINT,
                absentee_votes BIGINT,
                absentee_percent DOUBLE,
                img_url TEXT,
                has_image BOOLEAN,
                link TEXT,
                result_source TEXT,
                pronoun TEXT
            ),
            party_id TEXT,
            candidate_last_name TEXT,
            candidate_name_display TEXT,
            candidate_id TEXT,
            race_call_party_winner TEXT,
            state_name TEXT,
            link TEXT,
            bop BOOLEAN
        ) []
    ),
    "meta" STRUCT(version BIGINT, track DATE, timestamp TEXT)
)

@NickCrews NickCrews changed the title bug: caching json read from GS bug: struct fields in DDL schemas should be quoted Mar 25, 2024
@NickCrews
Contributor Author

Minimal repro for a new test:

import ibis

s = ibis.struct({"order": 4})
s.as_table().cache()
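
For reference, here is a rough sketch of what quoting struct field names in a DDL type string could look like. It is not the ibis implementation and not necessarily how #8777 fixes the problem; `quote_ident` and `render_type` are hypothetical helpers, and the table and column names `t` and `s` are placeholders. Struct types are modelled as plain dicts mapping field names to type strings or nested dicts.

```python
def quote_ident(name: str) -> str:
    """Double-quote an identifier, doubling any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def render_type(dtype) -> str:
    """Render a type; dicts become STRUCT(...) with quoted field names."""
    if isinstance(dtype, dict):
        fields = ", ".join(
            f"{quote_ident(field)} {render_type(sub)}"
            for field, sub in dtype.items()
        )
        return f"STRUCT({fields})"
    # Leaf types (e.g. "BIGINT", "TEXT") are passed through unchanged.
    return dtype


# Struct shaped like the minimal repro: a single field named "order".
struct_type = render_type({"order": "BIGINT"})
ddl = (
    f"CREATE TEMPORARY TABLE {quote_ident('t')} "
    f"({quote_ident('s')} {struct_type})"
)
print(ddl)

# Nested structs are quoted at every level.
nested = render_type({"winner": {"order": "BIGINT", "link": "TEXT"}})
print(nested)
```

Quoting every field name unconditionally, rather than checking each one against a keyword list, keeps the generated DDL independent of which words a given backend reserves.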

@NickCrews NickCrews changed the title bug: struct fields in DDL schemas should be quoted bug: struct field names in DDLs should be quoted Mar 25, 2024
@github-project-automation github-project-automation bot moved this from backlog to done in Ibis planning and roadmap Mar 29, 2024