Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

rustfmt is prohibitively slow for sufficiently nested structs #5128

Closed
aarashy opened this issue Dec 8, 2021 · 5 comments · Fixed by #5139
Closed

rustfmt is prohibitively slow for sufficiently nested structs #5128

aarashy opened this issue Dec 8, 2021 · 5 comments · Fixed by #5139

Comments

@aarashy
Copy link

aarashy commented Dec 8, 2021

The following snippet is about 220 lines containing a total of 13570 characters.

Running rustfmt against this file takes an inordinately long amount of time, with the following output from --verbose on
rustfmt 1.4.38-nightly (0fb1c371 2021-12-06)
Spent 0.020 secs in the parsing phase, and 444.417 secs in the formatting phase

This is actually only a substruct of the greater struct which I'm trying to validate. For the complete struct that I'm trying to validate, rustfmt exceeds 10 minutes of runtime, which makes it fatally time-out in my CI.

For context, these symbols come from Prost generated bindings against the https://github.com/pganalyze/libpg_query protobufs.

Is there known to be super-linear runtime scaling w.r.t. the amount of nesting of a struct or code block? This seems to be so slow that it ought to be considered a bug.

See also: #4867

    use libpgquery_sys::proto::{
        self, node, AConst, AExpr, AStar, BoolExpr, ColumnRef, CommonTableExpr, FuncCall, Node,
        RangeVar, RawStmt, ResTarget, SelectStmt, SortBy, TypeCast, TypeName, WindowDef,
        WithClause,
    };

    fn takes_a_long_time_to_rustfmt() {
        let inner_cte = vec![Node {
            node: Some(node::Node::CommonTableExpr(Box::new(CommonTableExpr {
                ctename: String::from("ranked_by_age_within_key"),
                aliascolnames: vec![],
                ctematerialized: libpgquery_sys::proto::CteMaterialize::Default as i32,
                ctequery: Some(Box::new(Node { node: None })),
                location: 29,
                cterecursive: false,
                cterefcount: 0,
                ctecolnames: vec![],
                ctecoltypes: vec![],
                ctecoltypmods: vec![],
                ctecolcollations: vec![],
            }))),
        }];
        let outer_cte = vec![Node {
            node: Some(node::Node::CommonTableExpr(Box::new(CommonTableExpr {
                ctename: String::from("table_name"),
                aliascolnames: vec![],
                ctematerialized: libpgquery_sys::proto::CteMaterialize::Default as i32,
                ctequery: Some(Box::new(Node {
                    node: Some(node::Node::SelectStmt(Box::new(SelectStmt {
                        distinct_clause: vec![],
                        into_clause: None,
                        target_list: vec![
                            Node {
                                node: Some(node::Node::ResTarget(Box::new(ResTarget {
                                    name: String::from("column1"),
                                    indirection: vec![],
                                    val: Some(Box::new(Node {
                                        node: Some(node::Node::ColumnRef(ColumnRef {
                                            fields: vec![Node {
                                                node: Some(node::Node::String(proto::String {
                                                    str: String::from("c1"),
                                                })),
                                            }],
                                            location: 301,
                                        })),
                                    })),
                                    location: 301,
                                }))),
                            },
                            Node {
                                node: Some(node::Node::ResTarget(Box::new(ResTarget {
                                    name: String::from("column2"),
                                    indirection: vec![],
                                    val: Some(Box::new(Node {
                                        node: Some(node::Node::ColumnRef(ColumnRef {
                                            fields: vec![Node {
                                                node: Some(node::Node::String(proto::String {
                                                    str: String::from("c2"),
                                                })),
                                            }],
                                            location: 324,
                                        })),
                                    })),
                                    location: 324,
                                }))),
                            },
                        ],
                        from_clause: vec![Node {
                            node: Some(node::Node::RangeVar(RangeVar {
                                catalogname: String::from(""),
                                schemaname: String::from(""),
                                relname: String::from("ranked_by_age_within_key"),
                                inh: true,
                                relpersistence: String::from("p"),
                                alias: None,
                                location: 347,
                            })),
                        }],
                        where_clause: Some(Box::new(Node {
                            node: Some(node::Node::BoolExpr(Box::new(BoolExpr {
                                xpr: None,
                                boolop: libpgquery_sys::proto::BoolExprType::AndExpr as i32,
                                args: vec![
                                    Node {
                                        node: Some(node::Node::AExpr(Box::new(AExpr {
                                            kind: proto::AExprKind::AexprOp as i32,
                                            name: vec![Node {
                                                node: Some(node::Node::String(proto::String {
                                                    str: String::from("="),
                                                })),
                                            }],
                                            lexpr: Some(Box::new(Node {
                                                node: Some(node::Node::ColumnRef(ColumnRef {
                                                    fields: vec![Node {
                                                        node: Some(node::Node::String(
                                                            proto::String {
                                                                str: String::from("rank_in_key"),
                                                            },
                                                        )),
                                                    }],
                                                    location: 382,
                                                })),
                                            })),
                                            rexpr: Some(Box::new(Node {
                                                node: Some(node::Node::AConst(Box::new(AConst {
                                                    val: Some(Box::new(Node {
                                                        node: Some(node::Node::Integer(
                                                            proto::Integer { ival: 1 },
                                                        )),
                                                    })),
                                                    location: 396,
                                                }))),
                                            })),
                                            location: 394,
                                        }))),
                                    },
                                    Node {
                                        node: Some(node::Node::AExpr(Box::new(AExpr {
                                            kind: proto::AExprKind::AexprOp as i32,
                                            name: vec![Node {
                                                node: Some(node::Node::String(proto::String {
                                                    str: String::from("="),
                                                })),
                                            }],
                                            lexpr: Some(Box::new(Node {
                                                node: Some(node::Node::ColumnRef(ColumnRef {
                                                    fields: vec![Node {
                                                        node: Some(node::Node::String(
                                                            proto::String {
                                                                str: String::from("is_deleted"),
                                                            },
                                                        )),
                                                    }],
                                                    location: 402,
                                                })),
                                            })),
                                            rexpr: Some(Box::new(Node {
                                                node: Some(node::Node::TypeCast(Box::new(
                                                    TypeCast {
                                                        arg: Some(Box::new(Node {
                                                            node: Some(node::Node::AConst(
                                                                Box::new(AConst {
                                                                    val: Some(Box::new(Node {
                                                                        node: Some(
                                                                            node::Node::String(
                                                                                proto::String {
                                                                                    str:
                                                                                        String::from(
                                                                                            "f",
                                                                                        ),
                                                                                },
                                                                            ),
                                                                        ),
                                                                    })),
                                                                    location: 415,
                                                                }),
                                                            )),
                                                        })),
                                                        type_name: Some(TypeName {
                                                            names: vec![
                                                                Node {
                                                                    node: Some(node::Node::String(
                                                                        proto::String {
                                                                            str: String::from(
                                                                                "pg_catalog",
                                                                            ),
                                                                        },
                                                                    )),
                                                                },
                                                                Node {
                                                                    node: Some(node::Node::String(
                                                                        proto::String {
                                                                            str: String::from(
                                                                                "bool",
                                                                            ),
                                                                        },
                                                                    )),
                                                                },
                                                            ],
                                                            type_oid: 0,
                                                            setof: false,
                                                            pct_type: false,
                                                            typmods: vec![],
                                                            typemod: -1,
                                                            array_bounds: vec![],
                                                            location: -1,
                                                        }),
                                                        location: -1,
                                                    },
                                                ))),
                                            })),
                                            location: 413,
                                        }))),
                                    },
                                ],
                                location: 398,
                            }))),
                        })),
                        group_clause: vec![],
                        having_clause: None,
                        window_clause: vec![],
                        values_lists: vec![],
                        sort_clause: vec![],
                        limit_offset: None,
                        limit_count: None,
                        limit_option: libpgquery_sys::proto::LimitOption::Default as i32,
                        locking_clause: vec![],
                        with_clause: Some(WithClause {
                            ctes: inner_cte,
                            recursive: false,
                            location: 24,
                        }),
                        op: libpgquery_sys::proto::SetOperation::SetopNone as i32,
                        all: false,
                        larg: None,
                        rarg: None,
                    }))),
                })),
                location: 5,
                cterecursive: false,
                cterefcount: 0,
                ctecolnames: vec![],
                ctecoltypes: vec![],
                ctecoltypmods: vec![],
                ctecolcollations: vec![],
            }))),
        }];
    }
@aarashy
Copy link
Author

aarashy commented Dec 8, 2021

pg_query.rs.zip

Here is a zip file containing the generated Prost where these structs are defined, which should help anybody repro

@calebcartwright
Copy link
Member

I can't reproduce this locally, nor on the playground with the provided snippet.

Are you sure you've provided a reproducible snippet in the issue description? I appreciate a binary attachment, but need to get to a point where the snippet in the issue description is reproducible, or if not possible (due to needing multiple files, etc.) then a repro repo is preferable over having to download and extract an arbitrary archive.

Finally, most generated code skips formatting, and that's why many generators use one of our several available mechanisms to support ignoring an entire file.

Obviously the latency reported is problematic, but to be fully transparent, with it being related to generated/non-human written code it's also not something I'm likely to have sufficient bandwidth to review any time soon.

@aarashy
Copy link
Author

aarashy commented Dec 8, 2021

Thanks for the prompt response!

Here's a compile-able playground that includes my an extended version of my snippet plus the Prost bindings. rustfmt indeed times out on this.

https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=c598c59219fe59388aaf65a95c5d2799

This new snippet includes 1 function definition and then lots of struct definitions. If you delete the function but keep the struct definitions, rustfmt runs quickly without an issue. So it seems clear that the function snippet is responsible for the slowness/timeout.

Understandable that you may not have bandwidth to look. I was able to unblock myself by running rustfmt on my local machine (it took 12 minutes on the contents of this playground), and then slapping #[rustfmt::skip] on the result.

I also wonder whether having many different struct kinds referenced in a block would in-and-of-itself make rustfmt slower/more resource intensive while formatting the block. Or instead, would an equally nested code block be equally slow if it used 1 recursive struct repeatedly, rather than this code-generated web of different nested structs? Issue #4867 makes it seem like deep nesting causes a problem in-and-of-itself.

@HKalbasi
Copy link
Member

It is because of max_width. Rustfmt currently tries all possible formattings, in order to find best one that fits in max_width limit. When max_width is tight, first tries won't succeed and rustfmt eventually should try near all of them, leads to a exponential runtime.

@aarashy
Copy link
Author

aarashy commented Dec 13, 2021

That's really interesting @HKalbasi . I suppose I could try changing my rustfmt configuration to increase the max_width parameter and see if that makes the runtime acceptable.

It seems like the max_width parameter can only be configured at the crate level and not at the level of an individual file/mod/block using an attribute directive. Do you think making that config more granular is a reasonable feature request?

I wonder if there's something else that can be done for rustfmt to be more intelligent about trying only certain possible formattings, or for the user to give a hint about how a certain block should be formatted to circumvent the exponential runtime.

EDIT:

Even using max_width = 106 was enough to make my snippet format extremely quickly
Spent 0.003 secs in the parsing phase, and 0.055 secs in the formatting phase

However, max_width = 105 takes on the order of minutes again 😸

I will close my issue with the following takeaway:

  • If rustfmt is prohibitively slow for a given snippet, play with the max_width configuration until you are able to make it run in under a second.
  • After doing so, one can just slap #[rustfmt::skip] on any problematic code blocks, and go back to using the default max_width everywhere else in your crate by undoing the configuration change.

If you were able to have granular configuration such as #[rustfmt::max_width=XXXX] with an attribute directive on a mod or block, I think that would be cool, but I figure making the rustfmt configuration dynamic in this way may be difficult, so it's mostly just a nice-to-have.

Another suggestion: Since rustfmt is really struggling to fit within the max_width constraint, I would've liked it if the tool gave me a warning message to relax the constraint, especially when using the --verbose flag.

Thanks for your help!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants