Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sql: add new Datum type for column name literals #12281

Closed
wants to merge 3 commits into from

Conversation

jordanlewis
Copy link
Member

@jordanlewis jordanlewis commented Dec 12, 2016

This PR adds a new column type, name, that describes fields that are SQL identifiers. This type is primarily for postgres compatibility -there are several fields in pg_catalog tables with type name in
postgres, and at least one ORM (sequelize) expects that the pgwire type returned for these columns corresponds with the name type.

pg_catalog is also updated to use the new type.

This PR ended up being larger than I would have liked. Ideally, adding name shouldn't require adding a new column type to the enum in structured.proto since all name types are going to stay in virtual tables and not be persisted. Unfortunately it seems like we need to add that new column type anyway - it's used in various non-persistence operations that get done during e.g. GROUP BY or array aggregation.

Resolves #12207.

Feedback about ways to simplify this would be very welcome.


This change is Reviewable

@cuongdo
Copy link
Contributor

cuongdo commented Dec 12, 2016

Review status: 0 of 21 files reviewed at latest revision, 2 unresolved discussions, all commit checks successful.


pkg/sql/parser/datum.go, line 512 at r1 (raw file):

		panic(makeUnsupportedComparisonMessage(d, other))
	}
	if *d < *v {

I know this code didn't originate with you, but isn't the rest of the method equivalent to return strings.Compare(*d.BackingDString(), *v.BackingDString())?


pkg/sql/parser/datum.go, line 576 at r1 (raw file):

// DName is the Datum for column name literals.
type DName struct {
	DString DString

Could half of the code below be eliminated by embedding DString? e.g. DString DString -> DString


Comments from Reviewable

@jordanlewis
Copy link
Member Author

Review status: 0 of 21 files reviewed at latest revision, 2 unresolved discussions, all commit checks successful.


pkg/sql/parser/datum.go, line 512 at r1 (raw file):

Previously, cuongdo (Cuong Do) wrote…

I know this code didn't originate with you, but isn't the rest of the method equivalent to return strings.Compare(*d.BackingDString(), *v.BackingDString())?

Good point


pkg/sql/parser/datum.go, line 576 at r1 (raw file):

Previously, cuongdo (Cuong Do) wrote…

Could half of the code below be eliminated by embedding DString? e.g. DString DString -> DString

Yes - I had this originally, but I got rid of it because it made it really hard to track down some issues. There are lots of methods that DString defines, scattered across a few different files, and some of them need to be redefined for DName. I could re-embed the struct now that I've tracked down all of the required redefinitions, but it feels slightly disingenuous. Thoughts?


Comments from Reviewable

@cuongdo
Copy link
Contributor

cuongdo commented Dec 12, 2016

Review status: 0 of 21 files reviewed at latest revision, 2 unresolved discussions, all commit checks successful.


pkg/sql/parser/datum.go, line 576 at r1 (raw file):

Previously, jordanlewis (Jordan Lewis) wrote…

Yes - I had this originally, but I got rid of it because it made it really hard to track down some issues. There are lots of methods that DString defines, scattered across a few different files, and some of them need to be redefined for DName. I could re-embed the struct now that I've tracked down all of the required redefinitions, but it feels slightly disingenuous. Thoughts?

Yeah, I see that now. It's probably better to not embed then.


Comments from Reviewable

@jordanlewis
Copy link
Member Author

Review status: 0 of 21 files reviewed at latest revision, 2 unresolved discussions, all commit checks successful.


pkg/sql/parser/datum.go, line 512 at r1 (raw file):

Previously, jordanlewis (Jordan Lewis) wrote…

Good point

Actually, if you look at the docs of strings.Compare, it seems that it's better not to use it: https://golang.org/src/strings/compare.go?s=490:519#L3 . If I had to guess, this doc is why we're not using strings.Compare in the first place. Weird!


Comments from Reviewable

@jordanlewis
Copy link
Member Author

I should mention that I tried the approach of modifying the Compare and Equals methods on DString and TypeString to allow comparisons between the two types so as to reduce the number of new comparison operators in this pull request. This approach didn't work, because the type system uses the Equals methods on the TypeFoo structs to do overload matching as well as comparisons. Pointers on how to clean this aspect up would be appreciated.

@knz
Copy link
Contributor

knz commented Dec 12, 2016

I think we need nathan's input on this. @nvanbenschoten care to comment?

(Perhaps even we'll need to keep this sit until the end of the week when nathan is available again)


Reviewed 19 of 19 files at r1.
Review status: 17 of 21 files reviewed at latest revision, 4 unresolved discussions, all commit checks successful.


pkg/sql/executor.go, line 1422 at r1 (raw file):

	case parser.TypeTimestampTZ:
	case parser.TypeInterval:
	case parser.TypeStringArray:

Here I'm not sure why we special case the array types. Can't we have a general case with FamilyEqual for all the array types?


pkg/sql/parser/aggregate_builtins.go, line 74 at r1 (raw file):

// Exported for use in documentation.
var Aggregates = map[string][]Builtin{
	"array_agg": {

This ought to be replaced by a single aggregate which takes TypeAny as argument and returns tArray{...} as return value. If you can't figure this out here perhaps file an issue and mark this here with a TODO.


Comments from Reviewable

This commit adds a new column type, `name`, that describes fields that
are SQL identifiers. This type is primarily for postgres compatibility -
there are several fields in pg_catalog tables with type `name` in
postgres, and at least one ORM (sequelize) expects that the pgwire type
returned for these columns corresponds with the `name` type.
This commit changes the type of fields in pg_catalog tables to `name` to
correspond with the types in postgres.
This commit allows TypeString and TypeName to compare to each other via
`Equals` but not via `FamilyEquals`, and adds an extra rule to the
overload finder that attempts to narrow the list of overloads by
`FamilyEquals`-based equality.

This allows us to safely remove the comparison overloads for operations
between names and strings.
@jordanlewis
Copy link
Member Author

Review status: 13 of 23 files reviewed at latest revision, 4 unresolved discussions, some commit checks pending.


pkg/sql/executor.go, line 1422 at r1 (raw file):

Previously, knz (kena) wrote…

Here I'm not sure why we special case the array types. Can't we have a general case with FamilyEqual for all the array types?

Only certain array types are supported. Decimal array types, for instance, are unsupported.


pkg/sql/parser/aggregate_builtins.go, line 74 at r1 (raw file):

Previously, knz (kena) wrote…

This ought to be replaced by a single aggregate which takes TypeAny as argument and returns tArray{...} as return value. If you can't figure this out here perhaps file an issue and mark this here with a TODO.

I made a TODO.


Comments from Reviewable

@jordanlewis
Copy link
Member Author

Ready for review - I'd definitely like some feedback on the modifications I made to the type checker in commit 3. I think the way I've (ab?)used the Type Equals/FamilyEquals methods warrants attention - specifically I'm not sure if it makes sense to have two types that are equal via Equals but not via FamilyEquals. The rationale here is that String and Name should be comparable but are not created via the same constructor, the way FamilyEquals requires.

@nvanbenschoten
Copy link
Member

Most of this looks good! However, I don't think I like what's going on in Equals/FamilyEquals for TypeName. It looks like the change is trying to get around the previous decision to disallow implicit mixed-type comparisons. If we want to revisit that discussion, I'd be happy to, but this doesn't seem like the right approach. The change also looks like it's missing a good amount of testing in the parser package.

An alternative idea that came to mind when I heard about adding this change was to instead add better support for type aliasing without needing to add all new datum types. Right now, we map a large number of type names to a single Datum type and then map this single Datum type back to a single OID. As an example, we have a similar issue to what prompted this change with INT2. We currently map this to TypeInt/DInt, which then maps to oid.T_int8. If instead, we could remember the aliasing while using the same internal Datum representation/behavior, we wouldn't need a new Datum type but could support returning the correct OID. Correct me if I'm wrong, but the only reason we're adding this new Datum type is so we can return the right OID. If so, this seems like a somewhat unscalable solution to a problem we'll likely see again.


Reviewed 13 of 19 files at r1, 1 of 1 files at r2, 3 of 4 files at r3, 4 of 6 files at r4.
Review status: 21 of 23 files reviewed at latest revision, 8 unresolved discussions, all commit checks successful.


pkg/sql/parser/aggregate_builtins.go, line 74 at r1 (raw file):

Previously, jordanlewis (Jordan Lewis) wrote…

I made a TODO.

The issue is that the signature would be array_agg(T, tArray{T}), not array_agg(*, tArray{*}) (ie. the arg type and the array's param type need to be the same, not arbitrary). We don't currently expose a good way to represent this and have pretty much avoided the issue in the past with solutions like this.

@knz: we should chat sometime next week in person about how we'd want to represent cases like this.


pkg/sql/parser/constant.go, line 303 at r4 (raw file):

var (
	strValAvailAllParsable = []Type{

You're missing some code here that would allow literal type inference to properly work with the Name type.


pkg/sql/parser/constant.go, line 330 at r4 (raw file):

	switch typ {
	case TypeString:
		expr.resString = DString(expr.s)

While you're here, could you fix this and the TypeBytes case to use their respective constructors?


pkg/sql/parser/datum.go, line 512 at r1 (raw file):

Previously, jordanlewis (Jordan Lewis) wrote…

Actually, if you look at the docs of strings.Compare, it seems that it's better not to use it: https://golang.org/src/strings/compare.go?s=490:519#L3 . If I had to guess, this doc is why we're not using strings.Compare in the first place. Weird!

Strange! Nice find though


pkg/sql/parser/datum.go, line 576 at r1 (raw file):

Previously, cuongdo (Cuong Do) wrote…

Yeah, I see that now. It's probably better to not embed then.

Maybe just name the field s (capital if it needs to be exported) then.


pkg/sql/parser/expr.go, line 1029 at r4 (raw file):

// validCastTypes returns a set of types that can be cast into the provided type.
func validCastTypes(t Type) []Type {

Don't we need some changes here (and in eval.go) to allow the expected casts to work?


pkg/sql/sqlbase/structured.go, line 1644 at r4 (raw file):

		return ColumnType_STRING
	case parser.TypeName:
		return ColumnType_STRING

Is this intentional?


Comments from Reviewable

@jordanlewis
Copy link
Member Author

TFTR!

I agree that the Equals/FamilyEquals stuff is sketchy. I didn't realize there was a previous decision to disallow implicit mixed-type comparison actually - agreed that this would not be the right approach for a general solution.

I love the idea of switching this approach out for a type aliasing approach. I think that would be so much cleaner - as you note, name types are completely equivalent to string types except for how they are represented over pgwire and SHOW TABLE and the like. Did you already have ideas about how to add type aliasing to the system? Let's talk about it.

In the meantime, I'll put this PR on hold.


Review status: 21 of 23 files reviewed at latest revision, 8 unresolved discussions, some commit checks failed.


pkg/sql/parser/expr.go, line 1029 at r4 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

Don't we need some changes here (and in eval.go) to allow the expected casts to work?

Yeah, I just didn't get to that in this commit.


pkg/sql/sqlbase/structured.go, line 1644 at r4 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

Is this intentional?

No - in fact I'm surprised tests pass with this. I left this here by accident when I was experimenting to see whether we could get away with not adding a new column type kind to structured.proto.


Comments from Reviewable

@jordanlewis
Copy link
Member Author

Closing this - superseded by #12641

@jordanlewis jordanlewis closed this Jan 3, 2017
@jordanlewis jordanlewis deleted the name-type branch June 23, 2017 15:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants