Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimizer rule 'projection_push_down' failed due to unexpected error: Error during planning: Aggregate schema has wrong number of fields. Expected 3 got 8 #3704

Closed
Tracked by #822
andygrove opened this issue Oct 4, 2022 · 8 comments · Fixed by #3710
Assignees
Labels
bug Something isn't working

Comments

@andygrove
Copy link
Member

andygrove commented Oct 4, 2022

Describe the bug
There was a recent regression causing several test failures in a test suite that we use with Dask SQL.

Optimizer rule 'projection_push_down' failed due to unexpected error: Error during planning: Aggregate schema has wrong number of fields. Expected 3 got 8

To Reproduce

#[test]
fn case_when_aggregate() -> Result<()> {
    let sql = "SELECT col_utf8, SUM(CASE WHEN col_int32 > 0 THEN 1 ELSE 0 END) AS n FROM test GROUP BY col_utf8";
    let plan = test_sql(sql)?;
    let expected = "";
    assert_eq!(expected, format!("{:?}", plan));
    Ok(())
}

Fails with

Error: Internal("Optimizer rule 'projection_push_down' failed due to unexpected error: 
  Error during planning: Aggregate schema has wrong number of fields. Expected 1 got 2")

Expected behavior
Should not see this error

Additional context
None

@andygrove andygrove added the bug Something isn't working label Oct 4, 2022
@andygrove andygrove self-assigned this Oct 4, 2022
@alamb
Copy link
Contributor

alamb commented Oct 4, 2022

For what it is worth, I am working through some errors upgrading IOx to use the latest datafusion (see https://github.com/influxdata/influxdb_iox/pull/5792) and I am also hitting errors related to type coercion as well. I will file and fix issues in DataFusion if appropriate when I have debugged it

@andygrove
Copy link
Member Author

I have now added a repro case in this issue

@alamb
Copy link
Contributor

alamb commented Oct 4, 2022

I will work on this later today if no one else beats me to it -- I would like a reason to get more hands on exposure to the new casting code

@andygrove
Copy link
Member Author

andygrove commented Oct 4, 2022

I have been looking into this, and I have found one issue so far. We have some code in type_coercion.rs that attempts to prevent aggregate expression names from being modified:

            // ensure aggregate names don't change:
            // https://github.com/apache/arrow-datafusion/issues/3555
            if matches!(expr, Expr::AggregateFunction { .. }) {
                if let Some((alias, name)) = original_name.zip(expr.name().ok()) {
                    if alias != name {
                        return Ok(expr.alias(&alias));
                    }
                }
            }

This code does not work as intended because it compares the original name to expr.name(), and this method produces the name of the expression as it should appear in the schema, and it will never include any CAST expressions. If I change this to use format!("{}", expr) instead, then the test passes, but it breaks other tests.

@andygrove
Copy link
Member Author

Related: #3706

@alamb
Copy link
Contributor

alamb commented Oct 4, 2022

Starting to look into this

@alamb
Copy link
Contributor

alamb commented Oct 4, 2022

I believe I have found the root cause. Fix incoming

@alamb
Copy link
Contributor

alamb commented Oct 4, 2022

Here is a proposed fix: #3710

It seems to have some sort of other planning regression (something is no longer simplified). I am about to run out of time tonight but I will continue investigation tomorrow

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
2 participants