[SPARK-45410][INFRA] Add Python GitHub Action Daily Job
### What changes were proposed in this pull request?

The Python language is becoming more and more important, and we already have **11** Python-related test pipelines.

This PR aims to add a `Python GitHub Action Daily` job
- To set up a framework for adding other Python versions later.
- To offload PyPy3 from the main PR and commit builders to this daily job.

### Why are the changes needed?

This will improve our test coverage and save a significant amount of time in the main builders.

Currently, we run two Python executables on every commit:
```
========================================================================
Running PySpark tests
========================================================================
Running PySpark tests. Output is in /__w/spark/spark/python/unit-tests.log
Will test against the following Python executables: ['python3.9', 'pypy3']
```
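The PR routes this through a single `--python-executables` flag that takes a comma-separated list (see the `dev/run-tests.py` changes below). A minimal sketch of how such a flag can be parsed into the executable list shown in the log above; the helper name `parse_python_executables` is hypothetical and only for illustration:

```python
import argparse

def parse_python_executables(argv):
    # Hypothetical helper: mirrors the new --python-executables option,
    # which defaults to python3.9 and accepts a comma-separated list.
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--python-executables",
        type=str,
        default="python3.9",
        help="A comma-separated list of Python executables to test against",
    )
    opts = parser.parse_args(argv)
    # Split the comma-separated value into individual executable names.
    return [x.strip() for x in opts.python_executables.split(",")]

print(parse_python_executables(["--python-executables", "python3.9,pypy3"]))
# → ['python3.9', 'pypy3']
```

With the flag in place, the main builders can pass only `python3.9` while the daily job passes `pypy3`.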

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Pass the CIs.

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes apache#43209 from dongjoon-hyun/SPARK-45410.

Authored-by: Dongjoon Hyun <dhyun@apple.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
dongjoon-hyun committed Oct 4, 2023
1 parent d5c8dfc commit 8d4dc9c
Showing 3 changed files with 55 additions and 2 deletions.
3 changes: 2 additions & 1 deletion .github/workflows/build_and_test.yml
@@ -366,6 +366,7 @@ jobs:
pyspark-pandas-connect-part3
env:
MODULES_TO_TEST: ${{ matrix.modules }}
PYTHON_TO_TEST: 'python3.9'
HADOOP_PROFILE: ${{ inputs.hadoop }}
HIVE_PROFILE: hive2.3
GITHUB_PREV_SHA: ${{ github.event.before }}
@@ -447,7 +448,7 @@ jobs:
export SKIP_PACKAGING=false
echo "Python Packaging Tests Enabled!"
fi
./dev/run-tests --parallelism 1 --modules "$MODULES_TO_TEST"
./dev/run-tests --parallelism 1 --modules "$MODULES_TO_TEST" --python-executables "$PYTHON_TO_TEST"
- name: Upload coverage to Codecov
if: fromJSON(inputs.envs).PYSPARK_CODECOV == 'true'
uses: codecov/codecov-action@v2
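The step above forwards the new `PYTHON_TO_TEST` environment variable into the test runner's `--python-executables` flag. A sketch of how that command is assembled from the job's env block; `build_command` is a hypothetical helper for illustration, not part of the PR:

```python
def build_command(env):
    # Mirrors the workflow step:
    #   ./dev/run-tests --parallelism 1 --modules "$MODULES_TO_TEST" \
    #       --python-executables "$PYTHON_TO_TEST"
    return [
        "./dev/run-tests",
        "--parallelism", "1",
        "--modules", env["MODULES_TO_TEST"],
        "--python-executables", env["PYTHON_TO_TEST"],
    ]

cmd = build_command({"MODULES_TO_TEST": "pyspark-core", "PYTHON_TO_TEST": "python3.9"})
print(" ".join(cmd))
```

The daily PyPy3 job reuses the same step and only swaps the value of `PYTHON_TO_TEST`.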
44 changes: 44 additions & 0 deletions .github/workflows/build_python.yml
@@ -0,0 +1,44 @@
#
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied. See the License for the
# specific language governing permissions and limitations
# under the License.
#

name: "Python - PyPy3.8 (master)"

on:
schedule:
- cron: '0 15 * * *'

jobs:
run-build:
permissions:
packages: write
name: Run
uses: ./.github/workflows/build_and_test.yml
if: github.repository == 'apache/spark'
with:
java: 17
branch: master
hadoop: hadoop3
envs: >-
{
"PYTHON_TO_TEST": "pypy3"
}
jobs: >-
{
"pyspark": "true"
}
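The daily workflow above passes `envs` as a JSON string, and the reusable `build_and_test.yml` workflow extracts values from it with GitHub's `fromJSON()` expression. A rough Python equivalent of that lookup, with `python3.9` standing in as the fallback the main builders use:

```python
import json

# The envs payload the scheduled job sends to the reusable workflow.
envs = '{"PYTHON_TO_TEST": "pypy3"}'

# Equivalent of fromJSON(inputs.envs).PYTHON_TO_TEST with a default.
python_to_test = json.loads(envs).get("PYTHON_TO_TEST", "python3.9")
print(python_to_test)  # → pypy3
```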
10 changes: 9 additions & 1 deletion dev/run-tests.py
@@ -375,7 +375,7 @@ def run_scala_tests(build_tool, extra_profiles, test_modules, excluded_tags, inc
run_scala_tests_sbt(test_modules, test_profiles)


def run_python_tests(test_modules, parallelism, with_coverage=False):
def run_python_tests(test_modules, test_pythons, parallelism, with_coverage=False):
set_title_and_block("Running PySpark tests", "BLOCK_PYSPARK_UNIT_TESTS")

if with_coverage:
@@ -390,6 +390,7 @@ def run_python_tests(test_modules, parallelism, with_coverage=False):
if test_modules != [modules.root]:
command.append("--modules=%s" % ",".join(m.name for m in test_modules))
command.append("--parallelism=%i" % parallelism)
command.append("--python-executables=%s" % test_pythons)
run_cmd(command)


@@ -423,6 +424,12 @@ def parse_opts():
default=8,
help="The number of suites to test in parallel (default %(default)d)",
)
parser.add_argument(
"--python-executables",
type=str,
default="python3.9",
help="A comma-separated list of Python executables to test against (default: %(default)s)",
)
parser.add_argument(
"-m",
"--modules",
@@ -651,6 +658,7 @@ def main():
if modules_with_python_tests and not os.environ.get("SKIP_PYTHON"):
run_python_tests(
modules_with_python_tests,
opts.python_executables,
opts.parallelism,
with_coverage=os.environ.get("PYSPARK_CODECOV", "false") == "true",
)
