Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added Array and Map literals for the java scala codebase #50

Open
wants to merge 4 commits into
base: main
Choose a base branch
from

Conversation

jbergeroncoveo
Copy link

Please answer these questions before submitting your pull requests. Thanks!

  1. What GitHub issue is this PR addressing? Make sure that there is an accompanying issue to your PR.

    Fixes SNOW-899560: Support Array and Map type as java/scala Literal #47

  2. Fill out the following pre-review checklist:

    • [:white_check_mark: ] I am adding a new automated test(s) to verify correctness of my new code
    • [:x: ] I am adding new logging messages
    • [:x: ] I am adding a new telemetry message
    • [:x: ] I am adding new credentials
    • [:x: ] I am adding a new dependency
  3. Please describe how your code solves the related issue.

Added a trait TLiteral with 3 children:

  1. ArrayLiteral: Represents an array literal
  2. MapLiteral: Represents a map literal
  3. Literal: Represents everything else.

The Array and Map Literal holds list of literals corresponding to their elements. Those lists are used to generate the SQL for those types recursively.

Pre-review checklist

(For Snowflake employees)

  • This change has passed precommit
  • I have reviewed code coverage report for my PR in (Sonarqube)

item: Any => toScala(item)
}
case y: java.util.Iterator[_] =>
toScala(y)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

infinite loop?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm really sorry about that, I copied an old function from our code base and it is indeed an infinite loop 💥
Fixed it by converting to scala iterator and calling the method toScala to each of its elements.

case _ =>
x
}
}
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Utility function that recursively convert a java map/array to a scala map/array

case _ =>
throw new UnsupportedOperationException(
s"Unsupported datatype by ToSql: ${value.getClass.getName} => $dataType")
private[analyzer] def toSql(literal: TLiteral): String = {
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now takes an instance of TLIteral, it is match against the children classes MapLiteral, ArrayLiteral and Literal. The Literal case contains the original code without change except for BinaryType that is moved inside of ArrayLiteral.

case mapLiteral: MapLiteral =>
"OBJECT_CONSTRUCT" + mapLiteral.entriesLiterals.flatMap { case (keyLiteral, valueLiteral) =>
Seq(toSql(keyLiteral), toSql(valueLiteral))
}.mkString("(", ", ", ")")
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I use the ARRAY_CONSTRUCT AND OBJECT_CONSTRUCT here and recursively call toSql to populate the arguments of those methods.

case Seq(_, _*) => Some(MapType(StringType, VariantType))
}
case _ =>
throw ErrorMessage.PLAN_CANNOT_CREATE_LITERAL(value.getClass.getCanonicalName, s"$value")
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All keys of a Map must be of String type otherwise an exception is thrown. Maybe the exception could be more precise here ?

case (v, dType) => DataTypeMapper.toSql(v, Option(dType))
case (v: Seq[Any], _: ArrayType) => DataTypeMapper.toSql(ArrayLiteral(v))
case (v: Map[Any, Any], _: MapType) => DataTypeMapper.toSql(MapLiteral(v))
case (v, dType) => DataTypeMapper.toSql(Literal(v, Option(dType)))
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This part of the code is a bit puzzling me, I think Array and Map are not supported by the Values function, so maybe I should throw something here ?

In any case, the design of the current code is a bit flawed here. Take for example a list of ints that represent DateType, when creating the ArrayLiteral the type is not taken into consideration and is instead infered (so Int). When converting to SQL we will not call the DATE function.

@@ -36,7 +35,8 @@ private[snowpark] object Literal {
case t: Timestamp => Literal(DateTimeUtils.javaTimestampToMicros(t), Option(TimestampType))
case ld: LocalDate => Literal(DateTimeUtils.localDateToDays(ld), Option(DateType))
case d: Date => Literal(DateTimeUtils.javaDateToDays(d), Option(DateType))
case a: Array[Byte] => Literal(a, Option(BinaryType))
case s: Seq[Any] => ArrayLiteral(s)
case m: Map[Any, Any] => MapLiteral(m)
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We build the Array and Map Literal here, the data types are infered recursively in the classes themselves.

item: Any => toScala(item)
}
case y: java.util.Iterator[_] =>
toScala(y)
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm really sorry about that, I copied an old function from our code base and it is indeed an infinite loop 💥
Fixed it by converting to scala iterator and calling the method toScala to each of its elements.

@mpetitclerc
Copy link

Is there something else missing to be able to integrate this change?

@jeanfrancisroy
Copy link

@sfc-gh-bli any news about this PR? I believe the requested changes were addressed.

@mpetitclerc
Copy link

@sfc-gh-bli Is there any reason for this PR to be still open?

I am noticing, there is now a conflict since the PR is open for long time. Is there anything else missing?

Thank you!

@sfc-gh-bli
Copy link
Collaborator

sfc-gh-bli commented May 7, 2024

@jbergeroncoveo @mpetitclerc
We are working on a related feature now.
Since there are a lot code conflicts, we will not merge this PR directly.
However, we will file a new PR base this PR to implement this feature later this month.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

SNOW-899560: Support Array and Map type as java/scala Literal
4 participants