[pull] master from apache:master #124

Open: wants to merge 6,294 commits into base: master
aaf8590
[SPARK-50448][SQL][TESTS] Extract postgres image common conf as Postg…
panbingkun Nov 29, 2024
b45045e
[SPARK-49992][SQL] Default collation resolution for DDL and DML queries
stefankandic Nov 29, 2024
3791de9
[SPARK-50444][BUILD] Upgrade dropwizard metrics to 4.2.29
panbingkun Nov 29, 2024
0c16e93
[SPARK-50446][PYTHON] Concurrent level in Arrow-optimized Python UDF
HyukjinKwon Nov 29, 2024
376bd4a
[SPARK-50333][SQL][FOLLOWUP] Codegen Support for `CsvToStructs`(`from…
panbingkun Nov 29, 2024
aad3575
[SPARK-50453][BUILD] Upgrade Netty to 4.1.115
LuciferYang Nov 29, 2024
cd687ff
[SPARK-50310][PYTHON] Add a flag to disable DataFrameQueryContext for…
itholic Nov 29, 2024
5476610
[SPARK-50311][PYTHON][FOLLOWUP] Remove @remote_only from supported APIs
itholic Nov 29, 2024
d5b534d
[SPARK-50295][DOCS][FOLLOWUP] Document `build-docs` in `docs/README.md`
panbingkun Nov 29, 2024
4b97e11
[SPARK-42746][SQL] Implement LISTAGG function
mikhailnik-db Nov 29, 2024
3fab712
[SPARK-50441][SQL] Fix parametrized identifiers not working when refe…
mihailotim-db Nov 29, 2024
faf74ad
[SPARK-50032][SQL] Allow use of fully qualified collation name
stevomitric Dec 1, 2024
b45e3c0
[MINOR] Use putAll to populate Properties
tedyu Dec 1, 2024
7d46fdb
[SPARK-48148][FOLLOWUP] Fix JSON parser feature flag
chenhao-db Dec 1, 2024
dc73342
[SPARK-50436][PYTHON][TESTS] Use assertDataFrameEqual in pyspark.sql.…
xinrong-meng Dec 1, 2024
e7071c0
[SPARK-50435][PYTHON][TESTS] Use assertDataFrameEqual in pyspark.sql.…
xinrong-meng Dec 2, 2024
05728e4
[SPARK-49294][UI] Add width attribute for shuffle-write-time checkbox
xunxunmimi5577 Dec 2, 2024
171e2ce
[SPARK-50373] Prohibit Variant from set operations
harshmotw-db Dec 2, 2024
f382cdf
[SPARK-50432][BUILD] Remove workaround for THRIFT-4805
pan3793 Dec 2, 2024
ae4625c
[SPARK-50433][DOCS][TESTS] Fix confguring log4j2 guide docs for Spark…
pan3793 Dec 2, 2024
9835009
[SPARK-50452][BUILD] Upgrade jackson to 2.18.2
panbingkun Dec 2, 2024
be12eb7
[SPARK-50465][PYTHON][TESTS] Use assertDataFrameEqual in pyspark.sql.…
xinrong-meng Dec 2, 2024
6c84f15
[SPARK-50468][BUILD] Upgrade Guava to 33.3.1-jre
LuciferYang Dec 2, 2024
d0e2c06
[SPARK-50372][CONNECT][SQL] Make all DF execution path collect observ…
xupefei Dec 2, 2024
31abad9
[SPARK-50471][PYTHON] Support Arrow-based Python Data Source Writer
allisonwang-db Dec 2, 2024
758b5c9
[SPARK-50466][PYTHON] Refine the docstring for string functions - part 1
zhengruifeng Dec 2, 2024
33f248c
[SPARK-50458][CORE][SQL] Proper error handling for unsupported file s…
yaooqinn Dec 2, 2024
75397c9
[MINOR][INFRA] Add a space in the JIRA user assignment message
HyukjinKwon Dec 2, 2024
2041519
[SPARK-47993][DOCS][FOLLOWUP] Update RDD programing guide to Python 3.9+
dongjoon-hyun Dec 2, 2024
4abaab3
[SPARK-50430][CORE][FOLLOW-UP] Keep the logic of manual putting key a…
HyukjinKwon Dec 3, 2024
38ab95b
[SPARK-50467][PYTHON] Add `__all__` for builtin functions
zhengruifeng Dec 3, 2024
f572ad1
[SPARK-50474][BUILD] Upgrade checkstyle to 10.20.2
LuciferYang Dec 3, 2024
98e94af
[SPARK-48898][SQL] Fix Variant shredding bug
cashmand Dec 3, 2024
d427aa3
[SPARK-50473][SQL] Simplify classic Column handling
hvanhovell Dec 3, 2024
6cd1334
[SPARK-50425][BUILD] Bump Apache Parquet to 1.15.0
Fokko Dec 3, 2024
7b974ca
[SPARK-50463][SQL] Fix `ConstantColumnVector` with Columnar to Row co…
richardc-db Dec 3, 2024
d0dbc6c
[SPARK-50470][SQL] Block usage of collations for map keys
Alexvsalexvsalex Dec 3, 2024
ecc33d2
[SPARK-50482][CORE] Deprecated no-op `spark.shuffle.spill` config
dongjoon-hyun Dec 4, 2024
13315ee
[SPARK-50481][CORE] Improve `SortShuffleManager.unregisterShuffle` to…
dongjoon-hyun Dec 4, 2024
0e45e21
[SPARK-50486][PYTHON][DOCS] Refine the docstring for string functions…
zhengruifeng Dec 4, 2024
784a97b
[MINOR][INFRA] Update labeler
zhengruifeng Dec 4, 2024
5fc6b71
[SPARK-50405][SQL] Handle collation type coercion of complex data typ…
stefankandic Dec 4, 2024
45da6f6
[SPARK-50477][INFRA] Add a separate docker file for python 3.9 daily …
zhengruifeng Dec 4, 2024
3d063a0
[SPARK-50487][DOCS] Update broken jira link
huangxiaopingRD Dec 4, 2024
74c3757
[MINOR][DOCS] Add a migration guide for encode/decode unmappable char…
yaooqinn Dec 4, 2024
10e0b61
[SPARK-49670][SQL] Enable trim collation for all passthrough expressions
jovanpavl-db Dec 4, 2024
812a9ad
[MINOR] Fix some typos
Dec 4, 2024
f1eecd3
[SPARK-50485][SQL] Unwrap SparkThrowable in (Unchecked)ExecutionExcep…
yaooqinn Dec 4, 2024
4248397
[SPARK-49695][SQL] Postgres fix xor push-down
andrej-db Dec 4, 2024
fe904e6
[SPARK-49709][CONNECT][SQL] Support ConfigEntry in the RuntimeConfig …
hvanhovell Dec 4, 2024
7278bc7
[SPARK-50489][SQL][PYTHON] Fix self-join after `applyInArrow`
zhengruifeng Dec 5, 2024
af4f37c
[SPARK-50339][SPARK-50360][SS] Enable changelog to store lineage info…
WweiL Dec 5, 2024
0570390
[SPARK-50495][K8S][INFRA][DOCS] Upgrade Volcano to 1.10.0
panbingkun Dec 5, 2024
3628595
[SPARK-50498][PYTHON] Avoid unnecessary py4j call in `listFunctions`
zhengruifeng Dec 5, 2024
faa2b04
[SPARK-50421][CORE] Fix executor related memory config incorrect when…
zjuwangg Dec 5, 2024
ee8db4e
[SPARK-50343][SPARK-50344][SQL] Add SQL pipe syntax for the DROP and …
dtenedor Dec 5, 2024
6add9c8
[SPARK-50501][BUILD] Update cross-spawn to surpress a warning in lint
sarutak Dec 5, 2024
21451fb
[SPARK-50505][DOCS] Fix `spark.storage.replication.proactive` default…
dongjoon-hyun Dec 5, 2024
fc69194
[SPARK-50310][CONNECT][PYTHON] Call `with_origin_to_class` when the `…
itholic Dec 6, 2024
a435b2e
[SPARK-50489][SQL][PYTHON][FOLLOW-UP] Add applyInArrow in `Deduplicat…
zhengruifeng Dec 6, 2024
d67ca73
[SPARK-50132][SQL][PYTHON] Add DataFrame API for Lateral Joins
ueshin Dec 6, 2024
ecd1911
[SPARK-50506][DOCS] Codify Spark Standalone documentation consistently
dongjoon-hyun Dec 6, 2024
b1c118f
[SPARK-50507][PYTHON][TESTS] Group pandas function related tests
zhengruifeng Dec 6, 2024
851f5f2
[SPARK-50492][SS] Fix java.util.NoSuchElementException when event tim…
liviazhu-db Dec 6, 2024
c149942
[SPARK-50449][SQL] Fix SQL Scripting grammar allowing empty bodies fo…
dusantism-db Dec 6, 2024
ede9cfc
[SPARK-50494][INFRA] Add a separate docker file for python 3.10 daily…
zhengruifeng Dec 6, 2024
1d6932c
[SPARK-50478][SQL] Fix StringType matching
jovanm-db Dec 6, 2024
934a387
[SPARK-50483][CORE][SQL] BlockMissingException should be thrown even …
wangyum Dec 6, 2024
be0780b
[SPARK-50350][SQL] Avro: add new function `schema_of_avro` (`scala` s…
panbingkun Dec 6, 2024
28766d4
[SPARK-50514][DOCS] Add `IDENTIFIER clause` page to `menu-sql.yaml`
dongjoon-hyun Dec 7, 2024
c1267d6
[SPARK-50512][SQL][DOCS] Fix `CREATE TABLE` syntax in `sql-pipe-synta…
dongjoon-hyun Dec 7, 2024
2fea84e
[SPARK-48426][DOCS][FOLLOWUP] Add `Operators` page to `sql-ref.md`
dongjoon-hyun Dec 7, 2024
fff6793
[SPARK-50516][SS][MINOR] Fix the init state related test to use Strea…
anishshri-db Dec 7, 2024
bb17665
[SPARK-49249][SQL][FOLLOWUP] Rename `spark.sql.artifact.isolation.(al…
dongjoon-hyun Dec 7, 2024
6c2e87a
[SPARK-50507][PYTHON][TESTS][FOLLOW-UP] Add refactored package into p…
HyukjinKwon Dec 9, 2024
85d92d7
[SPARK-50517][PYTHON][TESTS] Group arrow function related tests
zhengruifeng Dec 9, 2024
88102d3
[SPARK-50497][SQL] Fail queries with proper message if MultiAlias con…
mihailom-db Dec 9, 2024
88c3813
[SPARK-50503][SQL] Prohibit partitioning by Variant data
harshmotw-db Dec 9, 2024
32431cf
[SPARK-50063][SQL][CONNECT] Add support for Variant in the Spark Conn…
harshmotw-db Dec 9, 2024
74eeced
[SPARK-50523][SQL] Fix casts on complex types in collation coercion
stefankandic Dec 9, 2024
290b4b3
[SPARK-49461][SS][TESTS][FOLLOWUP] Add compatibility test for new com…
WweiL Dec 9, 2024
0830a19
[SPARK-50329][SQL] fix InSet$toString
averyqi-db Dec 10, 2024
2cb8183
[SPARK-50480][SQL] Extend CharType and VarcharType from StringType
jovanm-db Dec 10, 2024
156cf16
[SPARK-50524][SQL] Lower `RowBasedKeyValueBatch.spill` warning messag…
dongjoon-hyun Dec 10, 2024
a89fcfc
[SPARK-50507][PYTHON][TESTS][FOLLOW-UP] Add refactored package into p…
HyukjinKwon Dec 10, 2024
5f34af9
[SPARK-50477][INFRA][FOLLOW-UP] Python 3.9 testing image clean up
zhengruifeng Dec 10, 2024
02ebf12
[SPARK-50460][PYTHON][CONNECT] Generalize and simplify Connect except…
itholic Dec 10, 2024
03c5799
[SPARK-50513][SS][SQL] Split EncoderImplicits from SQLImplicits and p…
anishshri-db Dec 10, 2024
bac386d
[SPARK-50528][CONNECT] Move `InvalidCommandInput` to common module
zhengruifeng Dec 10, 2024
12967fe
[SPARK-49349][SQL] Improve error message for LCA with Generate
wangyum Dec 10, 2024
faef3fa
[SPARK-50527][INFRA] Add a separate docker file for python 3.12 daily…
zhengruifeng Dec 10, 2024
559fda7
[SPARK-50530][SQL] Fix bad implicit string type context calculation
stefankandic Dec 10, 2024
e70f8ab
[SPARK-50491][SQL] Fix bug where empty BEGIN END blocks throw an error
dusantism-db Dec 10, 2024
18f0e23
[SPARK-45891][SQL] Rebuild variant binary from shredded data
chenhao-db Dec 10, 2024
2093bae
[SPARK-50536][CORE] Log downloaded archive file sizes in `SparkContex…
dongjoon-hyun Dec 10, 2024
f903efb
[SPARK-50537][CONNECT][PYTHON] Fix compression option being overwritt…
alexkh-db Dec 10, 2024
6cdc96f
[SPARK-50517][PYTHON][TESTS][FOLLOW-UP] Add refactored package into p…
HyukjinKwon Dec 11, 2024
58c77ba
[SPARK-49349][SQL][FOLLOWUP] Rename isContainsUnsupportedLCA function
tedyu Dec 11, 2024
0f27d73
[SPARK-49565][SQL] Add SQL pipe syntax for the FROM operator
jiashenC Dec 11, 2024
2f5728f
[SPARK-50543][BUILD] Fix log printing in `dev/check-protos.py`
LuciferYang Dec 11, 2024
d268e0c
[SPARK-46934][SQL][FOLLOWUP] Read/write roundtrip for struct type wit…
yaooqinn Dec 11, 2024
2fa72d2
[SPARK-50542][INFRA] Add a separate docker file for python 3.13 daily…
zhengruifeng Dec 11, 2024
9394b35
[SPARK-48898][SQL] Set nullability correctly in the Variant schema
cashmand Dec 11, 2024
3bb9a72
[SPARK-50545][CORE][SQL] `AccessControlException` should be thrown ev…
pan3793 Dec 11, 2024
b2c8b30
Revert "[SPARK-48898][SQL] Set nullability correctly in the Variant s…
dongjoon-hyun Dec 11, 2024
48efe3f
[SPARK-50134][SPARK-50132][SQL][CONNECT][PYTHON] Support DataFrame AP…
ueshin Dec 11, 2024
d464e85
[SPARK-50549][DOCS] Use `rouge` `4.5.x` by removing the upper bound `…
dongjoon-hyun Dec 12, 2024
bfc5c22
[SPARK-50517][PYTHON][TESTS][FOLLOW-UP] Add refactored package into p…
HyukjinKwon Dec 12, 2024
1f6cb60
[SPARK-50544][PYTHON][CONNECT] Implement `StructType.toDDL`
zhengruifeng Dec 12, 2024
d84b2d4
[SPARK-50428][SS][PYTHON] Support TransformWithStateInPandas in batch…
bogao007 Dec 12, 2024
032623f
[SPARK-50553][CONNECT] Throw `InvalidPlanInput` for invalid plan message
zhengruifeng Dec 12, 2024
e4be5e6
[SPARK-50552][INFRA] Add a separate docker file for PyPy 3.10 daily b…
zhengruifeng Dec 12, 2024
b5d195a
[SPARK-48344][SQL] Add Frames and Scopes to support Exception Handler…
miland-db Dec 12, 2024
f979bc8
[SPARK-50554][INFRA] Add a separate docker file for Python 3.11 daily…
zhengruifeng Dec 12, 2024
df08177
[SPARK-48416][SQL] Support nested correlated With expression
cloud-fan Dec 12, 2024
429402b
[SPARK-49883][SS] State Store Checkpoint Structure V2 Integration wit…
WweiL Dec 13, 2024
98cef08
[SPARK-50562][INFRA] Apply Python 3.11 image in Java 21 daily build
zhengruifeng Dec 13, 2024
819bac9
[SPARK-50157][SQL] Using SQLConf provided by SparkSession first
beliefer Dec 13, 2024
885915f
[SPARK-50566][MINOR][SS] Fix code style violation for RocksDB file
WweiL Dec 13, 2024
d1db510
[SPARK-50563][BUILD] Upgrade `protobuf-java` to 4.29.1
LuciferYang Dec 13, 2024
3362ec8
[SPARK-50567][INFRA] Apply Python 3.11 image in No-ANSI daily build
zhengruifeng Dec 13, 2024
fbc061d
[SPARK-48898][SQL] Set nullability correctly in the Variant schema
cashmand Dec 13, 2024
f5740d0
[SPARK-50571][INFRA] Apply Python 3.11 image in RocksDB as UI Backend…
zhengruifeng Dec 13, 2024
42ce604
[SPARK-50338][CORE] Make LazyTry exceptions less verbose
Dec 13, 2024
ebd6b7c
[SPARK-50546][SQL] Add subquery cast support to collation type coercion
stefankandic Dec 13, 2024
c1a9fc1
[SPARK-45891][SQL][TESTS] Make VariantSuite test agnostic of the stab…
harshmotw-db Dec 13, 2024
3e7b614
[SPARK_50076] Fix logkeys
michaelzhan-db Dec 13, 2024
5538d85
[SPARK-50540][PYTHON][SS] Fix string schema for StatefulProcessorHandle
bogao007 Dec 14, 2024
3db46bf
[SPARK-50526][SS] Add store encoding format conf into offset log and …
anishshri-db Dec 14, 2024
f8de6c7
[SPARK-50565][SS][TESTS] Add transformWithState correctness test
WweiL Dec 14, 2024
2b9eb08
[SPARK-50515][CORE] Add read-only interface to `SparkConf`
pmenon Dec 14, 2024
976192a
[SPARK-50559][SQL] Store Except, Intersect and Union's outputs as laz…
vladimirg-db Dec 14, 2024
d2965ae
[SPARK-49839][SQL] SPJ: Skip shuffles if possible for sorts
szehon-ho Dec 15, 2024
769e569
[SPARK-49954][SQL][FOLLOWUP] Move states to SchemaOfJsonEvaluator
cloud-fan Dec 16, 2024
3df1769
[SPARK-50574][DOCS] Upgrade `rexml` to 3.3.9
dongjoon-hyun Dec 16, 2024
7d7711d
[SPARK-50576][BUILD] Upgrade `common-text` to `1.13.0`
panbingkun Dec 16, 2024
e7a2e4b
[MINOR][DOCS] Modify a link in running-on-yarn.md
sarutak Dec 16, 2024
279d5ee
[SPARK-50564][PYTHON] Upgrade `protobuf` Python package to 5.29.1
LuciferYang Dec 16, 2024
a6f82c6
[SPARK-49461][SS][TESTS][FOLLOWUP] Move related resource files to cor…
WweiL Dec 16, 2024
ef37f9a
[SPARK-50568][SS][TESTS] Fix `KafkaMicroBatchSourceSuite` to cover `K…
ostronaut Dec 16, 2024
15dcd21
[SPARK-50360][SS][FOLLOWUP][MINOR] Make readVersion lazy value
tedyu Dec 16, 2024
3f8e395
[SPARK-50583][INFRA] Apply Python 3.11 image in PR build
zhengruifeng Dec 16, 2024
d293ba6
[SPARK-50511][PYTHON] Avoid wrapping Python data source error messages
allisonwang-db Dec 16, 2024
0faf9d5
[SPARK-50579][SQL] Fix `truncatedString`
MaxGekk Dec 16, 2024
44ab349
[SPARK-50581][SQL] fix support for UDAF in Dataset.observe()
toms-definity Dec 16, 2024
868c587
[SPARK-50583][INFRA][FOLLOW-UP] Fix 3.5 daily build
zhengruifeng Dec 16, 2024
576caec
[SPARK-50580][BUILD] Upgrade log4j2 to 2.24.3
panbingkun Dec 16, 2024
1c3d580
[SPARK-50592][INFRA] Make 3.5 daily build able to manually trigger
zhengruifeng Dec 17, 2024
4c0f7b5
[SPARK-50594][BUILD][INFRA] Align the gRPC-related Python package ver…
LuciferYang Dec 17, 2024
79026ad
[SPARK-50588][BUILD] `build-docs` skips building R docs on host when …
pan3793 Dec 17, 2024
63c7ca4
[SPARK-50597][SQL] Refactor batch construction in Optimizer.scala and…
anton5798 Dec 17, 2024
accde83
[SPARK-50598][SQL] An initial, no-op PR which adds new parameters to …
milanisvet Dec 17, 2024
2b41131
[MINOR][SS] Minor update to watermark propagation comments
neilramaswamy Dec 18, 2024
229118c
[SPARK-50599][SQL] Create the DataEncoder trait that allows for Avro …
ericm-db Dec 18, 2024
62d49b3
[SPARK-50504][SQL] Enable SQL pipe syntax by default
dtenedor Dec 18, 2024
cb84939
[SPARK-50596][PYTHON] Upgrade Py4J from 0.10.9.7 to 0.10.9.8
HyukjinKwon Dec 18, 2024
9c9bbf6
[SPARK-50604][SQL][TESTS] Extract MariaDBDatabaseOnDocker and upgrade…
panbingkun Dec 18, 2024
76672cb
[SPARK-50472][SQL] Introduce initial implementation of the single-pas…
vladimirg-db Dec 18, 2024
5ef99bd
[SPARK-50606][CONNECT] Fix NPE on uninitiated SessionHolder
yaooqinn Dec 18, 2024
0e1fa6f
[SPARK-49661][SQL] Removing redudant checks of collation in binary co…
jovanpavl-db Dec 18, 2024
0dd90d9
[SPARK-50611][SQL] Improve HybridAnalyzer so it throws ExplicitlyUnsu…
mihailoale-db Dec 18, 2024
788fa5a
[SPARK-50609][INFRA] Eliminate warnings in docker image building
panbingkun Dec 18, 2024
332efb2
[SPARK-49636][SQL] Remove the ANSI config suggestion in INVALID_ARRAY…
mihailom-db Dec 18, 2024
2b1369b
[SPARK-50499][PYTHON] Expose metrics from BasePythonRunner
sebastianhillig-db Dec 18, 2024
2cb66e6
[SPARK-49632][SQL] Remove the ANSI config suggestion in CANNOT_PARSE_…
mihailom-db Dec 18, 2024
7f6d554
[SPARK-49436][CONNECT][SQL] Common interface for SQLContext
xupefei Dec 19, 2024
3a61eef
[SPARK-50619][SQL] Refactor VariantGet.cast to pack the cast arguments
chenhao-db Dec 19, 2024
f4c88ca
[SPARK-50558][SQL] Add configurable logging limits for the number of …
olaky Dec 19, 2024
f593cda
[SPARK-50612][SQL] Normalize ordering in the project list of an inner…
mihailotim-db Dec 19, 2024
f9aaeb4
[SPARK-50586][BUILD] Use CommonJS format for ESLint configuration file
sarutak Dec 19, 2024
09d93ec
[SPARK-50602][SQL] Fix transpose to show a proper error message when …
ueshin Dec 19, 2024
d499400
[SPARK-50243][SQL][CONNECT] Cached classloader for ArtifactManager
xupefei Dec 20, 2024
939f3df
[SPARK-50621][PYTHON] Upgrade Cloudpickle to 3.1.0
HyukjinKwon Dec 20, 2024
78592a0
[SPARK-50615][SQL] Push variant into scan
chenhao-db Dec 20, 2024
629fe3f
[SPARK-50370][SQL] Codegen Support for `json_tuple`
panbingkun Dec 20, 2024
bccdf1f
[SPARK-50483][SPARK-50545][DOC][FOLLOWUP] Mention behavior changes in…
pan3793 Dec 20, 2024
a2e3188
[SPARK-50640][CORE][TESTS] Update `ChecksumBenchmark` by removing `Pu…
dongjoon-hyun Dec 20, 2024
482a27c
[SPARK-50637][SQL] Fix code style for the single-pass Analyzer
vladimirg-db Dec 22, 2024
827d2a0
[SPARK-50638][SQL] Refactor the view resolution into the separate fil…
vladimirg-db Dec 22, 2024
5ac42e2
[SPARK-50534][SPARK-50535][TEST][CONNECT] Fix sporadic test failures
changgyoopark-db Dec 23, 2024
f6f6dae
[SPARK-50590][INFRA] Skip uncessary image build and push
zhengruifeng Dec 23, 2024
f8fd398
[SPARK-50645][INFRA] Make more daily builds able to manually trigger
zhengruifeng Dec 23, 2024
ab95c4e
[SPARK-50527][INFRA][FOLLOW-UP] Python 3.12 image clean up
zhengruifeng Dec 23, 2024
08675b1
[SPARK-50630][SQL] Fix GROUP BY ordinal support for pipe SQL AGGREGAT…
dtenedor Dec 23, 2024
876450c
[SPARK-50646][PYTHON][DOCS] Document explicit style of pyspark plotting
xinrong-meng Dec 23, 2024
7cd5c4a
[SPARK-50641][SQL] Move `GetJsonObjectEvaluator` to `JsonExpressionEv…
panbingkun Dec 23, 2024
a30a3fd
[SPARK-49530][PYTHON] Support pie subplots in pyspark plotting
xinrong-meng Dec 24, 2024
debae71
[SPARK-50651][SQL][DOCS] Add note about octal representation for char…
sarutak Dec 24, 2024
50d49ee
[MINOR][INFRA] Skip step `List Python packages` when `PYTHON_TO_TEST`…
zhengruifeng Dec 24, 2024
202b42e
[SPARK-50649] Fix inconsistencies with casting between different coll…
stefankandic Dec 24, 2024
2c1c4d2
[SPARK-50644][SQL] Read variant struct in Parquet reader
chenhao-db Dec 24, 2024
f9e117e
[SPARK-50659][SQL] Move Union-related errors to QueryCompilationErrors
vladimirg-db Dec 25, 2024
495e248
[MINOR][SQL][DOCS] Fix spacing with SQL configuration documentation
HyukjinKwon Dec 25, 2024
062bc4c
[SPARK-50659][SQL] Refactor Union output computation out to reuse it …
vladimirg-db Dec 25, 2024
2414542
[SPARK-50590][INFRA][FOLLOW-UP] Further skip unnecessary image build …
zhengruifeng Dec 25, 2024
6eb556f
[SPARK-50409][SQL] Fix set statement to ignore `;` at the end of `SET…
mihailom-db Dec 25, 2024
112276d
[SPARK-50360][SS][FOLLOWUP][TESTS] Changelog reader test improvements
WweiL Dec 25, 2024
ef4be07
[SPARK-50220][PYTHON] Support listagg in PySpark
mikhailnik-db Dec 25, 2024
9c9bdab
[SPARK-50657][PYTHON] Upgrade the minimum version of `pyarrow` to 11.0.0
zhengruifeng Dec 25, 2024
4ad7f3d
[SPARK-50647][INFRA] Add a daily build for PySpark with old dependencies
zhengruifeng Dec 26, 2024
be2da52
[SPARK-49632][SQL][FOLLOW-UP] Fix suggestion for `to_date` function
mihailom-db Dec 26, 2024
a483dfd
[SPARK-50650][SQL] Improve logging in single-pass Analyzer
vladimirg-db Dec 26, 2024
2475b35
[SPARK-50665][SQL] Substitute LocalRelation with ComparableLocalRelat…
vladimirg-db Dec 26, 2024
5c075c3
[SPARK-50667][PYTHON][TESTS] Make `jinja2` optional in PySpark Tests
zhengruifeng Dec 26, 2024
38c6ef4
[SPARK-50529][SQL] Change char/varchar behavior under the `spark.sql.…
jovanm-db Dec 26, 2024
ac91a7d
[SPARK-50608][SQL][DOCS] Fix malformed configuration page caused by u…
yaooqinn Dec 26, 2024
7a4114c
[SPARK-50644][FOLLOWUP][SQL] Fix scalar cast in the shredded reader
chenhao-db Dec 26, 2024
92948e7
[SPARK-50675][SQL] Table and view level collations support
dejankrak-db Dec 26, 2024
d3022e9
[SPARK-50672][PYTHON][TESTS] Make `openpyxl` optional in PySpark Tests
zhengruifeng Dec 26, 2024
aac494e
[SPARK-50134][SPARK-50130][SQL][CONNECT] Support DataFrame API for SC…
ueshin Dec 26, 2024
c920210
[SPARK-50578][PYTHON][SS] Add support for new version of state metada…
jingz-db Dec 26, 2024
94f9bb0
[SPARK-50673][ML] Avoid traversing model coefficients twice in `Word2…
zhengruifeng Dec 27, 2024
f3426b7
[SPARK-50669][SQL] Change the signature of TimestampAdd expression
stevomitric Dec 27, 2024
928655b
[SPARK-50677][BUILD] Upgrade jupiter-interface to 0.13.3 and Junit5 t…
LuciferYang Dec 27, 2024
20e4508
[SPARK-50573][SS] Adding State Schema ID to State Rows to schema evol…
ericm-db Dec 27, 2024
783b3e3
[SPARK-50622][SS][MINOR] RocksDB Refactor
WweiL Dec 27, 2024
789702b
[MINOR][PYTHON] Leverage functools.cached_property in `ImageSchema`
HyukjinKwon Dec 27, 2024
b8a8e0d
[MINOR][PYTHON] Leverage functools.cached_property in `SparkSession`
HyukjinKwon Dec 27, 2024
2372bc0
[SPARK-50679][SQL] Duplicated common expressions in different With sh…
cloud-fan Dec 27, 2024
f72ff1b
[SPARK-50681][PYTHON][CONNECT] Cache the parsed schema for MapInXXX a…
zhengruifeng Dec 27, 2024
59fb887
[MINOR][PYTHON] Leverage functools.cached_property in `DataFrame`
HyukjinKwon Dec 27, 2024
1a1fcd6
Revert "[SPARK-50310][CONNECT][PYTHON] Call `with_origin_to_class` wh…
HyukjinKwon Dec 27, 2024
a0539bf
[SPARK-50310][CONNECT][PYTHON][FOLLOW-UP] Delay is_debugging_enabled …
HyukjinKwon Dec 27, 2024
b309db0
[SPARK-50682][SQL] Inner Alias should be canonicalized
cloud-fan Dec 27, 2024
9297c5d
[SPARK-50684][PYTHON] Improve Py4J performance in DataFrameQueryContext
HyukjinKwon Dec 27, 2024
2d320aa
[SPARK-50685][PYTHON] Improve Py4J performance by leveraging getattr
HyukjinKwon Dec 27, 2024
003be89
[SPARK-50687][PYTHON] Optimize the logic to get stack traces for Data…
HyukjinKwon Dec 27, 2024
939129e
[SPARK-50674][PYTHON] Fix check for ‘terminate’ method existence in U…
xinrong-meng Dec 27, 2024
af53ee4
[SPARK-50689][SQL] Enforce deterministic ordering in LCA project lists
mihailotim-db Dec 27, 2024
51b011f
[SPARK-50661][CONNECT][SS] Fix Spark Connect Scala foreachBatch impl.…
haiyangsun-db Dec 28, 2024
c1e51f2
[SPARK-50690][SQL] Fix discrepancy in DESCRIBE TABLE view query outpu…
asl3 Dec 28, 2024
89fb67f
[SPARK-50676][SQL] Remove unused `private lazy val mapValueContainsNu…
LuciferYang Dec 28, 2024
31cb811
[SPARK-50632][BUILD] Upgrade tink to 1.16.0
panbingkun Dec 28, 2024
73a2ebb
[SPARK-50696][PYTHON] Optimize Py4J call for DDL parse method
zhengruifeng Dec 30, 2024
7d5aaaa
[SPARK-50688][SQL] Eliminate ambiguity for rowTag missing in xml writ…
yaooqinn Dec 30, 2024
97ee25a
[SPARK-50691][SQL][FOLLOWUP] Use UnsafeProjection for LocalRelation r…
vladimirg-db Dec 30, 2024
43a9b88
[SPARK-50693][CONNECT] The inputs for TypedScalaUdf should be analyzed
ueshin Dec 30, 2024
4c39d6f
[SPARK-50699][PYTHON] Parse and generate DDL string with a specified …
zhengruifeng Dec 31, 2024
fd8230b
[SPARK-49229][CONNECT] Deduplicate Scala UDF handling in the SparkCon…
xupefei Dec 31, 2024
c4145db
[SPARK-50689][SQL][FOLLOWUP] Enforce deterministic ordering in LCA ag…
mihailotim-db Dec 31, 2024
6099de7
[SPARK-50697][SQL] Enable tail-recursion wherever possible
LuciferYang Dec 31, 2024
5ef556b
[SPARK-50706][PYTHON][TESTS] Skip test_value_state_ttl_expiration in …
HyukjinKwon Dec 31, 2024
8a09817
[SPARK-50701][PYTHON] Make plotting require the minimum plotly version
zhengruifeng Dec 31, 2024
e1fb18d
[SPARK-50692] Add RPAD pushdown support
andrej-db Dec 31, 2024
5334494
[SPARK-50648][CORE] Cleanup zombie tasks in non-running stages when t…
yabola Dec 31, 2024
1c79b54
[SPARK-49491][SQL] Replace AnyRefMap with HashMap
George314159 Jan 1, 2025
3f333a0
[SPARK-50642][CONNECT][SS] Fix the state schema for FlatMapGroupsWith…
huanliwang-db Jan 2, 2025
721a417
[SPARK-50702][PYTHON] Refine the docstring of regexp_count, regexp_ex…
drexlersky Jan 2, 2025
492fcd8
[SPARK-50683][SQL] Inline the common expression in With if used once
zml1206 Jan 2, 2025
5c63484
[SPARK-50614][SQL] Add Variant shredding support for Parquet
cashmand Jan 2, 2025
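
Most titles in the commit list above follow the pattern `[SPARK-NNNNN][COMPONENT]... summary`. As a rough illustration only, and not part of this pull request, the leading tags could be separated with a sketch like the one below. `parse_title` is a hypothetical helper name; truncated titles (those ending in `…`) are passed through exactly as they appear.

```python
import re

# One bracketed tag such as "[SPARK-50448]" or "[SQL]".
TAG = re.compile(r"\[([^\]]+)\]")
JIRA_ID = re.compile(r"SPARK-\d+")


def parse_title(title):
    """Split a commit title into (jira_ids, component_tags, summary).

    Only tags at the very start of the title are consumed, so a title such as
    'Revert "[SPARK-48898]...' is returned whole as the summary.
    """
    jira_ids, components = [], []
    pos = 0
    while True:
        m = TAG.match(title, pos)
        if m is None:
            break
        tag = m.group(1)
        if JIRA_ID.fullmatch(tag):
            jira_ids.append(tag)
        else:
            components.append(tag)
        pos = m.end()
    return jira_ids, components, title[pos:].strip()


if __name__ == "__main__":
    print(parse_title("[SPARK-50453][BUILD] Upgrade Netty to 4.1.115"))
    print(parse_title("[MINOR] Fix some typos"))
```

Tags that do not match `SPARK-<digits>` exactly, for example `[MINOR]` or the `[SPARK_50076]` spelling used by one commit above, end up in the component list under this sketch.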
The diff you're trying to view is too large. We only load the first 3000 changed files.
3 changes: 3 additions & 0 deletions .asf.yaml
@@ -31,8 +31,11 @@ github:
merge: false
squash: true
rebase: true
ghp_branch: master
ghp_path: /docs

notifications:
pullrequests: reviews@spark.apache.org
issues: reviews@spark.apache.org
commits: commits@spark.apache.org
jira_options: link label
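
The hunk header for `.asf.yaml` can be cross-checked against the "3 additions & 0 deletions" summary: a unified-diff header `@@ -a,b +c,d @@` says the hunk covers `b` lines of the old file and `d` lines of the new one, so the net change is `d - b`. The following is a minimal sketch of that arithmetic; `hunk_delta` is a hypothetical name, and an omitted length is treated as 1, as in the unified diff format.

```python
import re

# Matches "@@ -start[,len] +start[,len] @@"; the lengths are optional.
HUNK = re.compile(r"@@ -(\d+)(?:,(\d+))? \+(\d+)(?:,(\d+))? @@")


def hunk_delta(header):
    """Return (new length - old length) for one unified-diff hunk header."""
    m = HUNK.search(header)
    if m is None:
        raise ValueError("not a hunk header: %r" % header)
    old_len = int(m.group(2)) if m.group(2) is not None else 1
    new_len = int(m.group(4)) if m.group(4) is not None else 1
    return new_len - old_len


if __name__ == "__main__":
    # .asf.yaml: 11 new lines minus 8 old lines.
    print(hunk_delta("@@ -31,8 +31,11 @@ github:"))
```

The same check works for the pull request template in this diff: its two hunks (`-9,7 +9,7` and `-47,3 +47,12`) sum to a net change of 9 lines, which agrees with the reported 10 additions and 1 deletion.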
11 changes: 10 additions & 1 deletion .github/PULL_REQUEST_TEMPLATE
@@ -9,7 +9,7 @@ Thanks for sending a pull request! Here are some tips for you:
7. If you want to add a new configuration, please read the guideline first for naming configurations in
'core/src/main/scala/org/apache/spark/internal/config/ConfigEntry.scala'.
8. If you want to add or modify an error type or message, please read the guideline first in
- 'core/src/main/resources/error/README.md'.
+ 'common/utils/src/main/resources/error/README.md'.
-->

### What changes were proposed in this pull request?
@@ -47,3 +47,12 @@ If it was tested in a way different from regular unit tests, please clarify how
If tests were not added, please describe why they were not added and/or why it was difficult to add.
If benchmark tests were added, please run the benchmarks in GitHub Actions for the consistent environment, and the instructions could accord to: https://spark.apache.org/developer-tools.html#github-workflow-benchmarks.
-->


### Was this patch authored or co-authored using generative AI tooling?
<!--
If generative AI tooling has been used in the process of authoring this patch, please include the
phrase: 'Generated-by: ' followed by the name of the tool and its version.
If no, write 'No'.
Please refer to the [ASF Generative Tooling Guidance](https://www.apache.org/legal/generative-tooling.html) for details.
-->
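
The new template section asks for either the phrase `Generated-by: ` followed by a tool name and version, or the word `No`. Purely as an illustration of that rule, a description could be checked with the sketch below. `ai_tooling_answer` and its return values are hypothetical and do not come from any Spark tooling; the only strings taken from the template are the section heading and the two accepted answers.

```python
import re

SECTION = "### Was this patch authored or co-authored using generative AI tooling?"


def ai_tooling_answer(pr_body):
    """Classify the answer as ('generated', tool), ('no', None) or ('missing', None)."""
    _, found, rest = pr_body.partition(SECTION)
    if not found:
        return ("missing", None)
    # Keep only the text up to the next "### " heading, if there is one.
    answer = re.split(r"^### ", rest, maxsplit=1, flags=re.M)[0]
    # Drop the template's HTML comment, which itself mentions 'Generated-by: '.
    answer = re.sub(r"<!--.*?-->", "", answer, flags=re.S).strip()
    m = re.search(r"Generated-by:\s*(.+)", answer)
    if m:
        return ("generated", m.group(1).strip())
    if answer.lower().rstrip(".") == "no":
        return ("no", None)
    return ("missing", None)


if __name__ == "__main__":
    body = SECTION + "\n<!--\nplease include the phrase: 'Generated-by: ' ...\n-->\nNo\n"
    print(ai_tooling_answer(body))
```

Stripping the comment before searching matters here, because the instructional comment contains the phrase `Generated-by: ` and would otherwise be mistaken for an answer.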
309 changes: 192 additions & 117 deletions .github/labeler.yml
@@ -17,144 +17,219 @@
# under the License.
#

#
# Pull Request Labeler Github Action Configuration: https://github.com/marketplace/actions/labeler
#
# Note that we currently cannot use the negatioon operator (i.e. `!`) for miniglob matches as they
# would match any file that doesn't touch them. What's needed is the concept of `any `, which takes a
# list of constraints / globs and then matches all of the constraints for either `any` of the files or
# `all` of the files in the change set.
#
# However, `any`/`all` are not supported in a released version and testing off of the `main` branch
# resulted in some other errors when testing.
#
# An issue has been opened upstream requesting that a release be cut that has support for all/any:
# - https://github.com/actions/labeler/issues/111
#
# While we wait for this issue to be handled upstream, we can remove
# the negated / `!` matches for now and at least have labels again.
#
INFRA:
- ".github/**/*"
- "appveyor.yml"
- "tools/**/*"
- "dev/create-release/**/*"
- ".asf.yaml"
- ".gitattributes"
- ".gitignore"
- "dev/github_jira_sync.py"
- "dev/merge_spark_pr.py"
- "dev/run-tests-jenkins*"
- changed-files:
- any-glob-to-any-file: [
'.github/**/*',
'tools/**/*',
'dev/create-release/**/*',
'.asf.yaml',
'.gitattributes',
'.gitignore',
'dev/merge_spark_pr.py'
]

BUILD:
# Can be supported when a stable release with correct all/any is released
#- any: ['dev/**/*', '!dev/github_jira_sync.py', '!dev/merge_spark_pr.py', '!dev/.rat-excludes']
- "dev/**/*"
- "build/**/*"
- "project/**/*"
- "assembly/**/*"
- "**/*pom.xml"
- "bin/docker-image-tool.sh"
- "bin/find-spark-home*"
- "scalastyle-config.xml"
# These can be added in the above `any` clause (and the /dev/**/* glob removed) when
# `any`/`all` support is released
# - "!dev/github_jira_sync.py"
# - "!dev/merge_spark_pr.py"
# - "!dev/run-tests-jenkins*"
# - "!dev/.rat-excludes"
- changed-files:
- all-globs-to-any-file: [
'dev/**/*',
'!dev/merge_spark_pr.py'
]
- any-glob-to-any-file: [
'build/**/*',
'project/**/*',
'assembly/**/*',
'**/*pom.xml',
'bin/docker-image-tool.sh',
'bin/find-spark-home*',
'scalastyle-config.xml'
]

DOCS:
- "docs/**/*"
- "**/README.md"
- "**/CONTRIBUTING.md"
- changed-files:
- any-glob-to-any-file: [
'docs/**/*',
'**/README.md',
'**/CONTRIBUTING.md',
'python/docs/**/*'
]

EXAMPLES:
- "examples/**/*"
- "bin/run-example*"
# CORE needs to be updated when all/any are released upstream.
- changed-files:
- any-glob-to-any-file: [
'examples/**/*',
'bin/run-example*'
]

CORE:
# - any: ["core/**/*", "!**/*UI.scala", "!**/ui/**/*"] # If any file matches all of the globs defined in the list started by `any`, label is applied.
- "core/**/*"
- "common/kvstore/**/*"
- "common/network-common/**/*"
- "common/network-shuffle/**/*"
- "python/pyspark/**/*.py"
- "python/pyspark/tests/**/*.py"
- changed-files:
- all-globs-to-any-file: [
'core/**/*',
'!**/*UI.scala',
'!**/ui/**/*'
]
- any-glob-to-any-file: [
'common/kvstore/**/*',
'common/network-common/**/*',
'common/network-shuffle/**/*',
'python/pyspark/*.py',
'python/pyspark/tests/**/*.py'
]

SPARK SUBMIT:
- "bin/spark-submit*"
- changed-files:
- any-glob-to-any-file: [
'bin/spark-submit*'
]

SPARK SHELL:
- "repl/**/*"
- "bin/spark-shell*"
- changed-files:
- any-glob-to-any-file: [
'repl/**/*',
'bin/spark-shell*'
]

SQL:
#- any: ["**/sql/**/*", "!python/pyspark/sql/avro/**/*", "!python/pyspark/sql/streaming/**/*", "!python/pyspark/sql/tests/streaming/test_streaming.py"]
- "**/sql/**/*"
- "common/unsafe/**/*"
#- "!python/pyspark/sql/avro/**/*"
#- "!python/pyspark/sql/streaming/**/*"
#- "!python/pyspark/sql/tests/streaming/test_streaming.py"
- "bin/spark-sql*"
- "bin/beeline*"
- "sbin/*thriftserver*.sh"
- "**/*SQL*.R"
- "**/DataFrame.R"
- "**/*WindowSpec.R"
- "**/*catalog.R"
- "**/*column.R"
- "**/*functions.R"
- "**/*group.R"
- "**/*schema.R"
- "**/*types.R"
- changed-files:
- all-globs-to-any-file: [
'**/sql/**/*',
'!python/**/avro/**/*',
'!python/**/protobuf/**/*',
'!python/**/streaming/**/*'
]
- any-glob-to-any-file: [
'common/unsafe/**/*',
'common/sketch/**/*',
'common/variant/**/*',
'bin/spark-sql*',
'bin/beeline*',
'sbin/*thriftserver*.sh',
'**/*SQL*.R',
'**/DataFrame.R',
'**/*WindowSpec.R',
'**/*catalog.R',
'**/*column.R',
'**/*functions.R',
'**/*group.R',
'**/*schema.R',
'**/*types.R'
]

AVRO:
- "connector/avro/**/*"
- "python/pyspark/sql/avro/**/*"
- changed-files:
- any-glob-to-any-file: [
'connector/avro/**/*',
'python/**/avro/**/*'
]

DSTREAM:
- "streaming/**/*"
- "data/streaming/**/*"
- "connector/kinesis*"
- "connector/kafka*"
- "python/pyspark/streaming/**/*"
- changed-files:
- any-glob-to-any-file: [
'streaming/**/*',
'data/streaming/**/*',
'connector/kinesis-asl/**/*',
'connector/kinesis-asl-assembly/**/*',
'connector/kafka-0-10/**/*',
'connector/kafka-0-10-assembly/**/*',
'connector/kafka-0-10-token-provider/**/*',
'python/pyspark/streaming/**/*'
]

GRAPHX:
- "graphx/**/*"
- "data/graphx/**/*"
- changed-files:
- any-glob-to-any-file: [
'graphx/**/*',
'data/graphx/**/*'
]

ML:
- "**/ml/**/*"
- "**/*mllib_*.R"
- changed-files:
- any-glob-to-any-file: [
'**/ml/**/*',
'**/*mllib_*.R'
]

MLLIB:
- "**/spark/mllib/**/*"
- "mllib-local/**/*"
- "python/pyspark/mllib/**/*"
- changed-files:
- any-glob-to-any-file: [
'**/mllib/**/*',
'mllib-local/**/*'
]

STRUCTURED STREAMING:
- "**/sql/**/streaming/**/*"
- "connector/kafka-0-10-sql/**/*"
- "python/pyspark/sql/streaming/**/*"
- "python/pyspark/sql/tests/streaming/test_streaming.py"
- "**/*streaming.R"
- changed-files:
- any-glob-to-any-file: [
'**/sql/**/streaming/**/*',
'connector/kafka-0-10-sql/**/*',
'python/pyspark/sql/**/streaming/**/*',
'**/*streaming.R'
]

PYTHON:
- "bin/pyspark*"
- "**/python/**/*"
- changed-files:
- any-glob-to-any-file: [
'bin/pyspark*',
'**/python/**/*'
]

PANDAS API ON SPARK:
- "python/pyspark/pandas/**/*"
- changed-files:
- any-glob-to-any-file: [
'python/pyspark/pandas/**/*'
]

R:
- "**/r/**/*"
- "**/R/**/*"
- "bin/sparkR*"
- changed-files:
- any-glob-to-any-file: [
'**/r/**/*',
'**/R/**/*',
'bin/sparkR*'
]

YARN:
- "resource-managers/yarn/**/*"
MESOS:
- "resource-managers/mesos/**/*"
- "sbin/*mesos*.sh"
- changed-files:
- any-glob-to-any-file: [
'resource-managers/yarn/**/*'
]

KUBERNETES:
- "resource-managers/kubernetes/**/*"
- changed-files:
- any-glob-to-any-file: [
'bin/docker-image-tool.sh',
'resource-managers/kubernetes/**/*'
]

WINDOWS:
- "**/*.cmd"
- "R/pkg/tests/fulltests/test_Windows.R"
- changed-files:
- any-glob-to-any-file: [
'**/*.cmd',
'R/pkg/tests/fulltests/test_Windows.R'
]

WEB UI:
- "**/ui/**/*"
- "**/*UI.scala"
- changed-files:
- any-glob-to-any-file: [
'**/ui/**/*',
'**/*UI.scala'
]

DEPLOY:
- "sbin/**/*"
- changed-files:
- any-glob-to-any-file: [
'sbin/**/*'
]

CONNECT:
- "connector/connect/**/*"
- "**/sql/sparkconnect/**/*"
- "python/pyspark/sql/**/connect/**/*"
- changed-files:
- any-glob-to-any-file: [
'sql/connect/**/*',
'connector/connect/**/*',
'python/**/connect/**/*'
]

PROTOBUF:
- "connector/protobuf/**/*"
- "python/pyspark/sql/protobuf/**/*"
- changed-files:
- any-glob-to-any-file: [
'connector/protobuf/**/*',
'python/**/protobuf/**/*'
]
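
The rewritten labeler configuration relies on two match kinds, `any-glob-to-any-file` and `all-globs-to-any-file`. As I understand the actions/labeler documentation, the first applies when any listed glob matches any changed file, and the second applies when a single changed file satisfies every listed glob, including the negated `!` ones. The sketch below illustrates that reading in Python. It is an approximation and not the action's implementation: the glob translation is a simplified stand-in for minimatch, all function names are hypothetical, and dotfile handling is ignored.

```python
import re


def glob_to_regex(glob):
    """Translate a simplified minimatch-style glob into a compiled regex."""
    parts = []
    i = 0
    while i < len(glob):
        if glob.startswith("**/", i):
            parts.append("(?:.*/)?")  # zero or more leading directories
            i += 3
        elif glob.startswith("**", i):
            parts.append(".*")        # anything, including "/"
            i += 2
        elif glob[i] == "*":
            parts.append("[^/]*")     # anything within one path segment
            i += 1
        elif glob[i] == "?":
            parts.append("[^/]")
            i += 1
        else:
            parts.append(re.escape(glob[i]))
            i += 1
    return re.compile("".join(parts) + r"\Z")


def matches(glob, path):
    """True if path satisfies glob; a leading '!' inverts the result."""
    negated = glob.startswith("!")
    pattern = glob[1:] if negated else glob
    hit = glob_to_regex(pattern).match(path) is not None
    return hit != negated


def any_glob_to_any_file(globs, files):
    """Some glob matches some changed file."""
    return any(matches(g, f) for g in globs for f in files)


def all_globs_to_any_file(globs, files):
    """Some single changed file satisfies every glob."""
    return any(all(matches(g, f) for g in globs) for f in files)


# Globs copied from the CORE and WEB UI entries in the diff above.
CORE_ALL = ["core/**/*", "!**/*UI.scala", "!**/ui/**/*"]
WEB_UI_ANY = ["**/ui/**/*", "**/*UI.scala"]

if __name__ == "__main__":
    changed = ["core/src/main/scala/org/apache/spark/ui/SparkUI.scala"]
    print("CORE:", all_globs_to_any_file(CORE_ALL, changed))
    print("WEB UI:", any_glob_to_any_file(WEB_UI_ANY, changed))
```

Under this reading, a change that touches only a file under `core/**/ui/` would receive the WEB UI label but not CORE, which appears to be the intent of the negated globs that the old configuration could only leave in comments.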