Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

error hint for getproperty(::Dict, ::Symbol) #54504

Merged
merged 19 commits into from
Jun 12, 2024

Conversation

arhik
Copy link
Contributor

@arhik arhik commented May 17, 2024

error hint for getproperty(::Dict, ::Symbol)

addresses #53618

Co-authored-by: Lilith Orion Hafner lilithhafner@gmail.com

@arhik arhik changed the title Update errorshow.jl error hint for getproperty(::Dict, ::Symbol) May 17, 2024
@arhik
Copy link
Contributor Author

arhik commented May 17, 2024

@oscardssmith This is the least invasive commit with computational overhead on getproperty that I could come up with. I see a 10ns increase in getproperty calls. On mac M1 its jumping from 13ns to 24ns on average. Since Other exceptions are not carrying the information we need, I introduced an Exception type. Not sure if it's worth it. If it is a standard requirement we can make a separate commit on the MemberAccessError exception. (Should come up with better name may be). I am open to discussion on this PR.

@KristofferC
Copy link
Member

If this is to be done it has to be done without any overhead at all for the working case. Isn't it possible to add an error hint for the already thrown exception or is it missing something?

@arhik
Copy link
Contributor Author

arhik commented May 17, 2024

@KristofferC I tried to fit other Exceptions available. I will take a look again. Even if there is an existing exception that fits the purpose, we will have to filter symbols that are actually Dict field members. So the problem should still exist. I am starting to think if this is necessary feature.

EDIT:
Sorry I think I misunderstood you. Already thrown exception should go through similar strategy I think.

Actually there is nuance to it. I will explain the overhead part clearly in the first message. Allow me some time to do that. Simply put user end code like keys(dict) won't see any overhead. but dict.jl using dict.keys anywhere will see an overhead.

base/dict.jl Outdated Show resolved Hide resolved
Copy link
Member

@LilithHafner LilithHafner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for creating this hint! We should definitely add it to the language, though it will take a bit of work to make it fit in nicely with existing systems.

base/boot.jl Outdated Show resolved Hide resolved
base/errorshow.jl Outdated Show resolved Hide resolved
base/errorshow.jl Outdated Show resolved Hide resolved
base/dict.jl Outdated
Comment on lines 119 to 125
function getproperty(d::Dict, x::Symbol)
if x in propertynames(d)
return getfield(d, x)
else
throw(MemberAccessError(typeof(d), x))
end
end
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we're going to add an UndefFieldError or MemberAccessError, instead of special casing Dict, it makes more sense to update the error thrown by getfield itself. This would require digging into the c code:

diff --git a/base/boot.jl b/base/boot.jl
index 4a59bf6279..1569f6277e 100644
--- a/base/boot.jl
+++ b/base/boot.jl
@@ -205,8 +205,8 @@ export
     ErrorException, BoundsError, DivideError, DomainError, Exception,
     InterruptException, InexactError, OutOfMemoryError, ReadOnlyMemoryError,
     OverflowError, StackOverflowError, SegmentationFault, UndefRefError, UndefVarError,
-    TypeError, ArgumentError, MethodError, AssertionError, LoadError, InitError,
-    UndefKeywordError, ConcurrencyViolationError,
+    UndefFieldError, TypeError, ArgumentError, MethodError, AssertionError, LoadError,
+    InitError, UndefKeywordError, ConcurrencyViolationError,
     # AST representation
     Expr, QuoteNode, LineNumberNode, GlobalRef,
     # object model functions
@@ -388,6 +388,11 @@ struct UndefKeywordError <: Exception
     var::Symbol
 end
 
+struct UndefFieldError <: Exception
+    type::DataType
+    field::Symbol
+end
+
 const typemax_UInt = Intrinsics.sext_int(UInt, 0xFF)
 const typemax_Int = Core.Intrinsics.udiv_int(Core.Intrinsics.sext_int(Int, 0xFF), 2)
 
diff --git a/src/datatype.c b/src/datatype.c
index abbec420bb..1a4694a594 100644
--- a/src/datatype.c
+++ b/src/datatype.c
@@ -1736,7 +1736,7 @@ JL_DLLEXPORT int jl_field_index(jl_datatype_t *t, jl_sym_t *fld, int err)
         }
     }
     if (err)
-        jl_has_no_field_error(t->name->name, fld);
+        jl_has_no_field_error(t, fld);
     return -1;
 }
 
diff --git a/src/jl_exported_data.inc b/src/jl_exported_data.inc
index 79ff437841..90920df78a 100644
--- a/src/jl_exported_data.inc
+++ b/src/jl_exported_data.inc
@@ -138,6 +138,7 @@
     XX(jl_uint8_type) \
     XX(jl_undefref_exception) \
     XX(jl_undefvarerror_type) \
+    XX(jl_undeffielderror_type) \
     XX(jl_unionall_type) \
     XX(jl_uniontype_type) \
     XX(jl_upsilonnode_type) \
diff --git a/src/jltypes.c b/src/jltypes.c
index 420b88aeb7..89607a97de 100644
--- a/src/jltypes.c
+++ b/src/jltypes.c
@@ -3700,6 +3700,7 @@ void post_boot_hooks(void)
     jl_diverror_exception  = jl_new_struct_uninit((jl_datatype_t*)core("DivideError"));
     jl_undefref_exception  = jl_new_struct_uninit((jl_datatype_t*)core("UndefRefError"));
     jl_undefvarerror_type  = (jl_datatype_t*)core("UndefVarError");
+    jl_undeffielderror_type  = (jl_datatype_t*)core("UndefFieldError");
     jl_atomicerror_type    = (jl_datatype_t*)core("ConcurrencyViolationError");
     jl_interrupt_exception = jl_new_struct_uninit((jl_datatype_t*)core("InterruptException"));
     jl_boundserror_type    = (jl_datatype_t*)core("BoundsError");
diff --git a/src/julia.h b/src/julia.h
index 0d46f15776..2501040797 100644
--- a/src/julia.h
+++ b/src/julia.h
@@ -868,6 +868,7 @@ extern JL_DLLIMPORT jl_datatype_t *jl_initerror_type JL_GLOBALLY_ROOTED;
 extern JL_DLLIMPORT jl_datatype_t *jl_typeerror_type JL_GLOBALLY_ROOTED;
 extern JL_DLLIMPORT jl_datatype_t *jl_methoderror_type JL_GLOBALLY_ROOTED;
 extern JL_DLLIMPORT jl_datatype_t *jl_undefvarerror_type JL_GLOBALLY_ROOTED;
+extern JL_DLLIMPORT jl_datatype_t *jl_undeffielderror_type JL_GLOBALLY_ROOTED;
 extern JL_DLLIMPORT jl_datatype_t *jl_atomicerror_type JL_GLOBALLY_ROOTED;
 extern JL_DLLIMPORT jl_datatype_t *jl_missingcodeerror_type JL_GLOBALLY_ROOTED;
 extern JL_DLLIMPORT jl_datatype_t *jl_lineinfonode_type JL_GLOBALLY_ROOTED;
@@ -2023,7 +2024,7 @@ JL_DLLEXPORT void JL_NORETURN jl_type_error_rt(const char *fname,
                                                jl_value_t *ty JL_MAYBE_UNROOTED,
                                                jl_value_t *got JL_MAYBE_UNROOTED);
 JL_DLLEXPORT void JL_NORETURN jl_undefined_var_error(jl_sym_t *var, jl_value_t *scope JL_MAYBE_UNROOTED);
-JL_DLLEXPORT void JL_NORETURN jl_has_no_field_error(jl_sym_t *type_name, jl_sym_t *var);
+JL_DLLEXPORT void JL_NORETURN jl_has_no_field_error(jl_datatype_t *t, jl_sym_t *var);
 JL_DLLEXPORT void JL_NORETURN jl_atomic_error(char *str);
 JL_DLLEXPORT void JL_NORETURN jl_bounds_error(jl_value_t *v JL_MAYBE_UNROOTED,
                                               jl_value_t *t JL_MAYBE_UNROOTED);
diff --git a/src/rtutils.c b/src/rtutils.c
index f18a1ac112..29fa145880 100644
--- a/src/rtutils.c
+++ b/src/rtutils.c
@@ -152,9 +152,10 @@ JL_DLLEXPORT void JL_NORETURN jl_undefined_var_error(jl_sym_t *var, jl_value_t *
     jl_throw(jl_new_struct(jl_undefvarerror_type, var, scope));
 }
 
-JL_DLLEXPORT void JL_NORETURN jl_has_no_field_error(jl_sym_t *type_name, jl_sym_t *var)
+JL_DLLEXPORT void JL_NORETURN jl_has_no_field_error(jl_datatype_t *t, jl_sym_t *var)
 {
-    jl_errorf("type %s has no field %s", jl_symbol_name(type_name), jl_symbol_name(var));
+    // jl_errorf("type %s has no field %s", jl_symbol_name(t->name->name), jl_symbol_name(var));
+    jl_throw(jl_new_struct(jl_undeffielderror_type, t, var));
 }
 
 JL_DLLEXPORT void JL_NORETURN jl_atomic_error(char *str) // == jl_exceptionf(jl_atomicerror_type, "%s", str)
diff --git a/src/staticdata.c b/src/staticdata.c
index f6461e0b42..eaa5d66754 100644
--- a/src/staticdata.c
+++ b/src/staticdata.c
@@ -235,6 +235,7 @@ jl_value_t **const*const get_tags(void) {
         INSERT_TAG(jl_loaderror_type);
         INSERT_TAG(jl_initerror_type);
         INSERT_TAG(jl_undefvarerror_type);
+        INSERT_TAG(jl_undeffielderror_type);
         INSERT_TAG(jl_stackovf_exception);
         INSERT_TAG(jl_diverror_exception);
         INSERT_TAG(jl_interrupt_exception);

(This approach also has no overhead)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wow. Can I borrow this and check ?

@JeffBezanson
Copy link
Member

This needs a different name. We use "undef" to mean a null reference. Maybe just FieldError or PropertyError?

@arhik
Copy link
Contributor Author

arhik commented May 22, 2024

I like FieldError. Going with it.

src/jltypes.c Outdated
Comment on lines 3727 to 3731
jl_undefvarerror_type = (jl_datatype_t*)core("UndefVarError");
jl_fielderror_type = (jl_datatype_t*)core("FieldError");
jl_atomicerror_type = (jl_datatype_t*)core("ConcurrencyViolationError");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
jl_undefvarerror_type = (jl_datatype_t*)core("UndefVarError");
jl_fielderror_type = (jl_datatype_t*)core("FieldError");
jl_atomicerror_type = (jl_datatype_t*)core("ConcurrencyViolationError");
jl_undefvarerror_type = (jl_datatype_t*)core("UndefVarError");
jl_fielderror_type = (jl_datatype_t*)core("FieldError");
jl_atomicerror_type = (jl_datatype_t*)core("ConcurrencyViolationError");

Consistent style

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. Will batch it.

@LilithHafner LilithHafner added the needs tests Unit tests are required for this change label May 22, 2024
@LilithHafner
Copy link
Member

This looks pretty good! It should have tests to make sure the new error is being thrown and the hint looks right.

@nospecialize
field = exc.field
type = exc.type
if type == :Dict
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was expecting a type Dict but I was a bit surprised to see I actually get Symbol. I should familiarize with internals more. But if anyone sees this as anomaly let me know.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that's seriously broken. I think the struct construction on the C side is buggy.

@arhik
Copy link
Contributor Author

arhik commented May 24, 2024

This looks pretty good! It should have tests to make sure the new error is being thrown and the hint looks right.

Added tests. Let me know if they are reasonable.

@arhik
Copy link
Contributor Author

arhik commented May 24, 2024

@LilithHafner I raised another PR JuliaSparse/SparseArrays.jl#539 related to this. How do we resolve this dependency. We need SparseArrays PR to be accepted for julia tests to pass. ref (https://buildkite.com/julialang/julia-master/builds/36697#018fa506-1c87-42bf-b902-09ef3cb581ec/826-1173)

LilithHafner pushed a commit to JuliaSparse/SparseArrays.jl that referenced this pull request May 25, 2024
@LilithHafner
Copy link
Member

This PR now includes the changes you made to SparseArrays tests

@LilithHafner LilithHafner removed the needs tests Unit tests are required for this change label May 26, 2024
Copy link
Member

@LilithHafner LilithHafner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the most part, this LGTM, but the fact that the exception's type field is a Symbol when it is type checked to be a Type is very broken.

It would be nice to add a failing test for that, and we definitely can't merge before fixing it.

@arhik
Copy link
Contributor Author

arhik commented May 26, 2024

Understood the concern. I didn't get chance to look into it. I will try make time soon.

@arhik arhik force-pushed the dictErrorHint branch 2 times, most recently from 586d3f0 to 25e9ccb Compare May 27, 2024 13:04
@arhik
Copy link
Contributor Author

arhik commented May 27, 2024

@LilithHafner The type information is lost even before it comes to jl_new_struct. On inspection found issue in codegen. We didn't make corresponding changes in llvm related emit function signatures.

@gbaraldi I made some codegen changes for FieldError. Its trivial but could have unintended consequences. Your view would help.

https://github.com/JuliaLang/julia/pull/54504/files#diff-62cfb2606c6a323a7f26a3eddfa0bf2b819fa33e094561fee09daeb328e3a1e7R4082

@LilithHafner
Copy link
Member

I'm a little out of my depth here, I would appreciate @gbaraldi's feedback or for @gbaraldi to reccomend an alternative reviewer.

@LilithHafner LilithHafner dismissed their stale review May 27, 2024 14:43

out of date

@arhik arhik force-pushed the dictErrorHint branch 2 times, most recently from ad114f1 to b2676de Compare June 8, 2024 14:48
@arhik
Copy link
Contributor Author

arhik commented Jun 8, 2024

@LilithHafner I rebased code. Can you check and validate whenever you are free ?

@arhik
Copy link
Contributor Author

arhik commented Jun 11, 2024

Quickly tried this and pasting this. Let me know if this is what you are implying. (overhead version obviously)

struct PropertyError <: Exception
	val::Any
	x::Symbol
end

using DataFrames

Base.getproperty(d::DataFrame, x::Symbol) = begin
	if x in propertynames(d)
		return d[!, x]
	else
		throw(PropertyError(d, x))
	end
end

function Base.showerror(io::IO, exc::PropertyError)
	@nospecialize
	println(io, "PropertyError: could not access property $(exc.x) on this $(typeof(exc.val)) type\n")
	Base.Experimental.show_error_hints(io, exc)
end

function propertyAccessHandler(io, exc)
	@nospecialize
	x = exc.x
	val = exc.val
	if val isa DataFrame
		println(io, "Looks like there is no column `$x` defined on this DataFrame obj.")
		print(io, "Only following columns are defined on this object: ")
		print(io, "$(propertynames(val))")
	else
		println(io, 
		"""
		A simple else message here
		"""
		)
	end
end


Base.Experimental.register_error_hint(propertyAccessHandler, PropertyError)

# Test

h = DataFrame(ENV)

h.c

image

An implementation without overhead like this PR should do the job.

@oscardssmith
Copy link
Member

I don't think the overhead is a problem here since it's only overhead in throwing a user-facing error.

@oscardssmith
Copy link
Member

@arhik I agree that this would probably be an improvement, but I am tempted to just merge this now and make the change to Any in a followup PR. (how can you not merge when CI is fully green? It's a sign...)

base/errorshow.jl Outdated Show resolved Hide resolved
Co-authored-by: Lilith Orion Hafner <lilithhafner@gmail.com>
@LilithHafner LilithHafner added merge me PR is reviewed. Merge when all tests are passing error messages Better, more actionable error messages labels Jun 11, 2024
@IanButterworth IanButterworth merged commit 7e5d9a3 into JuliaLang:master Jun 12, 2024
9 checks passed
@oscardssmith oscardssmith added collections Data structures holding multiple items, e.g. sets and removed merge me PR is reviewed. Merge when all tests are passing labels Jun 12, 2024
@arhik arhik deleted the dictErrorHint branch June 12, 2024 03:31
@nsajko nsajko mentioned this pull request Jun 17, 2024
arhik added a commit to arhik/julia that referenced this pull request Jun 18, 2024
FieldError was introduced in JuliaLang#54504. This PR adds documentation for the same.
arhik added a commit to arhik/julia that referenced this pull request Jun 18, 2024
FieldError was introduced in JuliaLang#54504. This PR adds documentation for the same.

Update basedocs.jl
arhik added a commit to arhik/julia that referenced this pull request Jun 18, 2024
FieldError was introduced in JuliaLang#54504. This PR adds documentation for the same.
arhik added a commit to arhik/julia that referenced this pull request Jun 18, 2024
FieldError was introduced in JuliaLang#54504. This PR adds documentation for the same.
arhik added a commit to arhik/julia that referenced this pull request Jun 18, 2024
FieldError i introduced in JuliaLang#54504. Updating NEWS.md for broader notice.
arhik added a commit to arhik/julia that referenced this pull request Jun 18, 2024
[Doc] NEWS.md entry for FieldError

FieldError i introduced in JuliaLang#54504. Updating NEWS.md for broader notice.
arhik added a commit to arhik/julia that referenced this pull request Jun 18, 2024
[Doc] NEWS.md entry for FieldError

FieldError i introduced in JuliaLang#54504. Updating NEWS.md for broader notice.
arhik added a commit to arhik/julia that referenced this pull request Jun 19, 2024
FieldError was introduced in JuliaLang#54504. This PR adds documentation for the same.
staticfloat pushed a commit that referenced this pull request Jun 20, 2024
A new Fielderror exception is introduced in #54504 to raise/handle
exceptions from `getfield`.
arhik added a commit to arhik/julia that referenced this pull request Jun 20, 2024
FieldError was introduced in JuliaLang#54504. This PR adds documentation for the same.

add types to signature in docs

Co-Authored-By: Neven Sajko <s@purelymail.com>
Co-authored-by: Lilith Orion Hafner <lilithhafner@gmail.com>
LilithHafner added a commit that referenced this pull request Jun 20, 2024
FieldError was introduced in #54504. This PR adds documentation for the
same.

Fixes #54830

Co-authored-by: Neven Sajko <s@purelymail.com>
Co-authored-by: Lilith Orion Hafner <lilithhafner@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
collections Data structures holding multiple items, e.g. sets error messages Better, more actionable error messages
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants