first commit
Some checks failed
Self-hosted runner (nightly-past-ci-caller) / Get number (push) Has been cancelled
Self-hosted runner (nightly-past-ci-caller) / TensorFlow 2.11 (push) Has been cancelled
Self-hosted runner (nightly-past-ci-caller) / TensorFlow 2.10 (push) Has been cancelled
Self-hosted runner (nightly-past-ci-caller) / TensorFlow 2.9 (push) Has been cancelled
Self-hosted runner (nightly-past-ci-caller) / TensorFlow 2.8 (push) Has been cancelled
Self-hosted runner (nightly-past-ci-caller) / TensorFlow 2.7 (push) Has been cancelled
Self-hosted runner (nightly-past-ci-caller) / TensorFlow 2.6 (push) Has been cancelled
Self-hosted runner (nightly-past-ci-caller) / TensorFlow 2.5 (push) Has been cancelled
Self-hosted runner (benchmark) / Benchmark (aws-g5-4xlarge-cache) (push) Has been cancelled
Build documentation / build (push) Has been cancelled
Build documentation / build_other_lang (push) Has been cancelled
CodeQL Security Analysis / CodeQL Analysis (push) Has been cancelled
New model PR merged notification / Notify new model (push) Has been cancelled
PR CI / pr-ci (push) Has been cancelled
Slow tests on important models (on Push - A10) / Get all modified files (push) Has been cancelled
Secret Leaks / trufflehog (push) Has been cancelled
Update Transformers metadata / build_and_package (push) Has been cancelled
Slow tests on important models (on Push - A10) / Model CI (push) Has been cancelled
Check Tiny Models / Check tiny models (push) Has been cancelled
Self-hosted runner (Intel Gaudi3 scheduled CI caller) / Model CI (push) Has been cancelled
Self-hosted runner (Intel Gaudi3 scheduled CI caller) / Pipeline CI (push) Has been cancelled
Self-hosted runner (Intel Gaudi3 scheduled CI caller) / Example CI (push) Has been cancelled
Self-hosted runner (Intel Gaudi3 scheduled CI caller) / DeepSpeed CI (push) Has been cancelled
Self-hosted runner (Intel Gaudi3 scheduled CI caller) / Trainer/FSDP CI (push) Has been cancelled
Nvidia CI - Flash Attn / Setup (push) Has been cancelled
Nvidia CI - Flash Attn / Model CI (push) Has been cancelled
Nvidia CI / Setup (push) Has been cancelled
Nvidia CI / Model CI (push) Has been cancelled
Nvidia CI / Torch pipeline CI (push) Has been cancelled
Nvidia CI / Example CI (push) Has been cancelled
Nvidia CI / Trainer/FSDP CI (push) Has been cancelled
Nvidia CI / DeepSpeed CI (push) Has been cancelled
Nvidia CI / Quantization CI (push) Has been cancelled
Nvidia CI / Kernels CI (push) Has been cancelled
Doctests / Setup (push) Has been cancelled
Doctests / Call doctest jobs (push) Has been cancelled
Doctests / Send results to webhook (push) Has been cancelled
Extras Smoke Test / Get supported Python versions (push) Has been cancelled
Extras Smoke Test / Test extras on Python ${{ matrix.python-version }} (push) Has been cancelled
Extras Smoke Test / Check Slack token availability (push) Has been cancelled
Extras Smoke Test / Notify failures to Slack (push) Has been cancelled
Self-hosted runner (AMD scheduled CI caller) / Trigger Scheduled AMD CI (push) Has been cancelled
Stale Bot / Close Stale Issues (push) Has been cancelled
Some checks failed
Self-hosted runner (nightly-past-ci-caller) / Get number (push) Has been cancelled
Self-hosted runner (nightly-past-ci-caller) / TensorFlow 2.11 (push) Has been cancelled
Self-hosted runner (nightly-past-ci-caller) / TensorFlow 2.10 (push) Has been cancelled
Self-hosted runner (nightly-past-ci-caller) / TensorFlow 2.9 (push) Has been cancelled
Self-hosted runner (nightly-past-ci-caller) / TensorFlow 2.8 (push) Has been cancelled
Self-hosted runner (nightly-past-ci-caller) / TensorFlow 2.7 (push) Has been cancelled
Self-hosted runner (nightly-past-ci-caller) / TensorFlow 2.6 (push) Has been cancelled
Self-hosted runner (nightly-past-ci-caller) / TensorFlow 2.5 (push) Has been cancelled
Self-hosted runner (benchmark) / Benchmark (aws-g5-4xlarge-cache) (push) Has been cancelled
Build documentation / build (push) Has been cancelled
Build documentation / build_other_lang (push) Has been cancelled
CodeQL Security Analysis / CodeQL Analysis (push) Has been cancelled
New model PR merged notification / Notify new model (push) Has been cancelled
PR CI / pr-ci (push) Has been cancelled
Slow tests on important models (on Push - A10) / Get all modified files (push) Has been cancelled
Secret Leaks / trufflehog (push) Has been cancelled
Update Transformers metadata / build_and_package (push) Has been cancelled
Slow tests on important models (on Push - A10) / Model CI (push) Has been cancelled
Check Tiny Models / Check tiny models (push) Has been cancelled
Self-hosted runner (Intel Gaudi3 scheduled CI caller) / Model CI (push) Has been cancelled
Self-hosted runner (Intel Gaudi3 scheduled CI caller) / Pipeline CI (push) Has been cancelled
Self-hosted runner (Intel Gaudi3 scheduled CI caller) / Example CI (push) Has been cancelled
Self-hosted runner (Intel Gaudi3 scheduled CI caller) / DeepSpeed CI (push) Has been cancelled
Self-hosted runner (Intel Gaudi3 scheduled CI caller) / Trainer/FSDP CI (push) Has been cancelled
Nvidia CI - Flash Attn / Setup (push) Has been cancelled
Nvidia CI - Flash Attn / Model CI (push) Has been cancelled
Nvidia CI / Setup (push) Has been cancelled
Nvidia CI / Model CI (push) Has been cancelled
Nvidia CI / Torch pipeline CI (push) Has been cancelled
Nvidia CI / Example CI (push) Has been cancelled
Nvidia CI / Trainer/FSDP CI (push) Has been cancelled
Nvidia CI / DeepSpeed CI (push) Has been cancelled
Nvidia CI / Quantization CI (push) Has been cancelled
Nvidia CI / Kernels CI (push) Has been cancelled
Doctests / Setup (push) Has been cancelled
Doctests / Call doctest jobs (push) Has been cancelled
Doctests / Send results to webhook (push) Has been cancelled
Extras Smoke Test / Get supported Python versions (push) Has been cancelled
Extras Smoke Test / Test extras on Python ${{ matrix.python-version }} (push) Has been cancelled
Extras Smoke Test / Check Slack token availability (push) Has been cancelled
Extras Smoke Test / Notify failures to Slack (push) Has been cancelled
Self-hosted runner (AMD scheduled CI caller) / Trigger Scheduled AMD CI (push) Has been cancelled
Stale Bot / Close Stale Issues (push) Has been cancelled
This commit is contained in:
9
.github/workflows/TROUBLESHOOT.md
vendored
Normal file
9
.github/workflows/TROUBLESHOOT.md
vendored
Normal file
@@ -0,0 +1,9 @@
|
||||
# Troubleshooting
|
||||
|
||||
This is a document explaining how to deal with various issues on github-actions self-hosted CI. The entries may include actual solutions or pointers to Issues that cover those.
|
||||
|
||||
## GitHub Actions (self-hosted CI)
|
||||
|
||||
* Deepspeed
|
||||
|
||||
- if jit build hangs, clear out `rm -rf ~/.cache/torch_extensions/` reference: https://github.com/huggingface/transformers/pull/12723
|
||||
85
.github/workflows/add-model-like.yml
vendored
Normal file
85
.github/workflows/add-model-like.yml
vendored
Normal file
@@ -0,0 +1,85 @@
|
||||
name: Add model like runner
|
||||
|
||||
on:
|
||||
push:
|
||||
branches:
|
||||
- none # put main here when this is fixed
|
||||
#pull_request:
|
||||
# paths:
|
||||
# - "src/**"
|
||||
# - "tests/**"
|
||||
# - ".github/**"
|
||||
# types: [opened, synchronize, reopened]
|
||||
|
||||
permissions:
|
||||
contents: read
|
||||
|
||||
jobs:
|
||||
run_tests_templates_like:
|
||||
name: "Add new model like template tests"
|
||||
runs-on: ubuntu-22.04
|
||||
steps:
|
||||
- uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
|
||||
with:
|
||||
persist-credentials: false
|
||||
|
||||
- name: Install dependencies
|
||||
run: |
|
||||
sudo apt -y update && sudo apt install -y libsndfile1-dev
|
||||
|
||||
- name: Load cached virtual environment
|
||||
uses: actions/cache@0057852bfaa89a56745cba8c7296529d2fc39830 # v4.3.0
|
||||
id: cache
|
||||
with:
|
||||
path: ~/venv/
|
||||
key: v4-tests_model_like-${{ hashFiles('setup.py') }}
|
||||
|
||||
- name: Create virtual environment on cache miss
|
||||
if: steps.cache.outputs.cache-hit != 'true'
|
||||
run: |
|
||||
python -m venv ~/venv && . ~/venv/bin/activate
|
||||
pip install --upgrade pip!=21.3
|
||||
pip install -e .[dev]
|
||||
|
||||
- name: Check transformers location
|
||||
# make `transformers` available as package (required since we use `-e` flag) and check it's indeed from the repo.
|
||||
run: |
|
||||
. ~/venv/bin/activate
|
||||
python setup.py develop
|
||||
transformers_install=$(pip list -e | grep transformers)
|
||||
transformers_install_array=($transformers_install)
|
||||
transformers_loc=${transformers_install_array[-1]}
|
||||
transformers_repo_loc=$(pwd .)
|
||||
if [ "$transformers_loc" != "$transformers_repo_loc" ]; then
|
||||
echo "transformers is from $transformers_loc but it shoud be from $transformers_repo_loc/src."
|
||||
echo "A fix is required. Stop testing."
|
||||
exit 1
|
||||
fi
|
||||
|
||||
- name: Create model files
|
||||
run: |
|
||||
. ~/venv/bin/activate
|
||||
transformers add-new-model-like --config_file tests/fixtures/add_distilbert_like_config.json --path_to_repo .
|
||||
make style
|
||||
make fix-copies
|
||||
|
||||
- name: Run all PyTorch modeling test
|
||||
run: |
|
||||
. ~/venv/bin/activate
|
||||
python -m pytest -n 2 --dist=loadfile -s --make-reports=tests_new_models tests/bert_new/test_modeling_bert_new.py
|
||||
|
||||
- name: Run style changes
|
||||
run: |
|
||||
. ~/venv/bin/activate
|
||||
make style && make quality && make repo-consistency
|
||||
|
||||
- name: Failure short reports
|
||||
if: ${{ always() }}
|
||||
run: cat reports/tests_new_models/failures_short.txt
|
||||
|
||||
- name: Test suite reports artifacts
|
||||
if: ${{ always() }}
|
||||
uses: actions/upload-artifact@ea165f8d65b6e75b540449e92b4886f43607fa02 # v4.6.2
|
||||
with:
|
||||
name: run_all_tests_new_models_test_reports
|
||||
path: reports/tests_new_models
|
||||
68
.github/workflows/anti-slop.yml
vendored
Normal file
68
.github/workflows/anti-slop.yml
vendored
Normal file
@@ -0,0 +1,68 @@
|
||||
name: Anti-Slop
|
||||
|
||||
permissions:
|
||||
contents: read
|
||||
issues: read
|
||||
pull-requests: write
|
||||
|
||||
on:
|
||||
pull_request_target:
|
||||
types: [opened, reopened]
|
||||
|
||||
jobs:
|
||||
anti-slop:
|
||||
runs-on: ubuntu-latest
|
||||
steps:
|
||||
- uses: peakoss/anti-slop@85daca1880e9e1af197fc06ea03349daf08f4202 # v0.2.1
|
||||
with:
|
||||
# -- Failure threshold --
|
||||
# Require both enabled checks to fail before labeling while we validate the signals
|
||||
max-failures: 2
|
||||
|
||||
# -- Do NOT close or lock, just label --
|
||||
close-pr: false
|
||||
lock-pr: false
|
||||
failure-add-pr-labels: "Code agent slop"
|
||||
failure-pr-message: |
|
||||
This PR was flagged by our automated quality checks. If you're a genuine
|
||||
contributor, please reply here and a maintainer will review your PR.
|
||||
|
||||
Common reasons for flagging:
|
||||
- New GitHub account
|
||||
- Unusually high number of repository forks in a 24-hour window
|
||||
|
||||
We appreciate your contribution and apologize if this is a false positive!
|
||||
|
||||
# -- Account checks --
|
||||
# Start with two conservative, high-signal checks and iterate from there
|
||||
min-account-age: 30
|
||||
max-daily-forks: 7
|
||||
|
||||
# -- Disabled checks (keep minimal) --
|
||||
blocked-source-branches: ""
|
||||
blocked-paths: ""
|
||||
detect-spam-usernames: false
|
||||
min-profile-completeness: 0
|
||||
require-description: false
|
||||
require-linked-issue: false
|
||||
require-conventional-title: false
|
||||
require-pr-template: false
|
||||
strict-pr-template-sections: ""
|
||||
optional-pr-template-sections: ""
|
||||
max-additional-pr-template-sections: 0
|
||||
max-description-length: 0
|
||||
require-conventional-commits: false
|
||||
require-commit-author-match: false
|
||||
require-maintainer-can-modify: false
|
||||
require-final-newline: false
|
||||
max-added-comments: 0
|
||||
max-emoji-count: 0
|
||||
max-code-references: 0
|
||||
max-commit-message-length: 0
|
||||
min-repo-merged-prs: 0
|
||||
min-repo-merge-ratio: 0
|
||||
min-global-merge-ratio: 0
|
||||
|
||||
# -- Exemptions --
|
||||
exempt-author-association: "OWNER,MEMBER,COLLABORATOR"
|
||||
exempt-label: "exempt"
|
||||
31
.github/workflows/assign-reviewers.yml
vendored
Normal file
31
.github/workflows/assign-reviewers.yml
vendored
Normal file
@@ -0,0 +1,31 @@
|
||||
name: Assign PR Reviewers
|
||||
on:
|
||||
pull_request_target:
|
||||
branches:
|
||||
- main
|
||||
types: [ready_for_review]
|
||||
|
||||
permissions:
|
||||
contents: read
|
||||
|
||||
jobs:
|
||||
assign_reviewers:
|
||||
permissions:
|
||||
pull-requests: write
|
||||
runs-on: ubuntu-22.04
|
||||
steps:
|
||||
- uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
|
||||
with:
|
||||
persist-credentials: false
|
||||
- name: Set up Python
|
||||
uses: actions/setup-python@a26af69be951a213d495a4c3e4e4022e16d87065 # v5.6.0
|
||||
with:
|
||||
python-version: '3.13'
|
||||
- name: Install dependencies
|
||||
run: |
|
||||
python -m pip install --upgrade pip
|
||||
pip install PyGithub
|
||||
- name: Run assignment script
|
||||
env:
|
||||
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
|
||||
run: python .github/scripts/assign_reviewers.py
|
||||
65
.github/workflows/benchmark.yml
vendored
Normal file
65
.github/workflows/benchmark.yml
vendored
Normal file
@@ -0,0 +1,65 @@
|
||||
name: Self-hosted runner (benchmark)
|
||||
|
||||
on:
|
||||
push:
|
||||
branches: [main]
|
||||
pull_request:
|
||||
types: [ opened, labeled, reopened, synchronize ]
|
||||
|
||||
concurrency:
|
||||
group: ${{ github.workflow }}-${{ github.head_ref || github.run_id }}
|
||||
cancel-in-progress: true
|
||||
|
||||
env:
|
||||
HF_HOME: /mnt/cache
|
||||
DATASET_ID: hf-benchmarks/transformers
|
||||
MODEL_ID: meta-llama/Llama-3.1-8B-Instruct
|
||||
|
||||
permissions:
|
||||
contents: read
|
||||
|
||||
jobs:
|
||||
benchmark:
|
||||
name: Benchmark
|
||||
strategy:
|
||||
matrix:
|
||||
# group: [aws-g5-4xlarge-cache, aws-p4d-24xlarge-plus] (A100 runner is not enabled)
|
||||
group: [aws-g5-4xlarge-cache]
|
||||
runs-on:
|
||||
group: ${{ matrix.group }}
|
||||
if: |
|
||||
(github.event_name == 'pull_request' && contains( github.event.pull_request.labels.*.name, 'run-benchmark') )||
|
||||
(github.event_name == 'push' && github.ref == 'refs/heads/main')
|
||||
container:
|
||||
image: huggingface/transformers-all-latest-gpu
|
||||
options: --gpus all --privileged --ipc host
|
||||
steps:
|
||||
- name: Get repo
|
||||
uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
|
||||
with:
|
||||
fetch-depth: 1
|
||||
persist-credentials: false
|
||||
|
||||
- name: Install benchmark script dependencies
|
||||
run: python3 -m pip install -r benchmark_v2/requirements.txt kernels
|
||||
|
||||
- name: Reinstall transformers in edit mode (remove the one installed during docker image build)
|
||||
run: python3 -m pip uninstall -y transformers && python3 -m pip install -e ".[torch]"
|
||||
|
||||
- name: Run benchmark
|
||||
run: |
|
||||
git config --global --add safe.directory /__w/transformers/transformers
|
||||
if [ "$GITHUB_EVENT_NAME" = "pull_request" ]; then
|
||||
commit_id=$(echo "${{ github.event.pull_request.head.sha }}")
|
||||
elif [ "$GITHUB_EVENT_NAME" = "push" ]; then
|
||||
commit_id=$GITHUB_SHA
|
||||
fi
|
||||
commit_msg=$(git show -s --format=%s | cut -c1-70)
|
||||
python3 benchmark_v2/run_benchmarks.py -b 32 -s 128 -n 256 --level 2 --branch-name "$BRANCH_NAME" --commit-id "$commit_id" --commit-message "$commit_msg" --model-id "$MODEL_ID" --log-level INFO --push-result-to-dataset "$DATASET_ID"
|
||||
env:
|
||||
HF_TOKEN: ${{ secrets.HF_HUB_READ_TOKEN }}
|
||||
PUSH_TO_HUB_TOKEN: ${{ secrets.PUSH_TO_HUB_TOKEN }}
|
||||
# Enable this to see debug logs
|
||||
# HF_HUB_VERBOSITY: debug
|
||||
# TRANSFORMERS_VERBOSITY: debug
|
||||
BRANCH_NAME: ${{ github.head_ref || github.ref_name }}
|
||||
91
.github/workflows/benchmark_v2.yml
vendored
Normal file
91
.github/workflows/benchmark_v2.yml
vendored
Normal file
@@ -0,0 +1,91 @@
|
||||
name: Benchmark v2 Framework
|
||||
|
||||
on:
|
||||
workflow_dispatch:
|
||||
inputs:
|
||||
runner:
|
||||
description: 'Runner to use for the benchmark job'
|
||||
required: true
|
||||
type: string
|
||||
container_image:
|
||||
description: 'Container image to use'
|
||||
required: true
|
||||
type: string
|
||||
container_options:
|
||||
description: 'Container options'
|
||||
required: false
|
||||
type: string
|
||||
default: ''
|
||||
commit_sha:
|
||||
description: 'Commit SHA to benchmark'
|
||||
required: false
|
||||
type: string
|
||||
run_id:
|
||||
description: 'Run ID for tracking'
|
||||
required: false
|
||||
type: string
|
||||
benchmark_repo_id:
|
||||
description: 'Repository ID to push benchmark results'
|
||||
required: true
|
||||
type: string
|
||||
|
||||
env:
|
||||
HF_HOME: /mnt/cache
|
||||
TRANSFORMERS_IS_CI: yes
|
||||
# For gated repositories, we still need to agree to share information on the Hub repo. page in order to get access.
|
||||
# This token is created under the bot `hf-transformers-bot`.
|
||||
HF_TOKEN: ${{ secrets.HF_HUB_READ_TOKEN }}
|
||||
|
||||
permissions:
|
||||
contents: read
|
||||
|
||||
jobs:
|
||||
benchmark-v2:
|
||||
name: Benchmark v2
|
||||
runs-on: ${{ inputs.runner }}
|
||||
if: |
|
||||
(github.event_name == 'pull_request' && contains( github.event.pull_request.labels.*.name, 'run-benchmark')) ||
|
||||
(github.event_name == 'schedule')
|
||||
container:
|
||||
image: ${{ inputs.container_image }}
|
||||
options: ${{ inputs.container_options }}
|
||||
steps:
|
||||
- name: Get repo
|
||||
uses: actions/checkout@34e114876b0b11c390a56381ad16ebd13914f8d5 # v4.3.1
|
||||
with:
|
||||
ref: ${{ inputs.commit_sha || github.sha }}
|
||||
persist-credentials: false
|
||||
|
||||
- name: Install benchmark dependencies
|
||||
run: |
|
||||
python3 -m pip install -r benchmark_v2/requirements.txt
|
||||
|
||||
- name: Reinstall transformers in edit mode
|
||||
run: |
|
||||
python3 -m pip uninstall -y transformers
|
||||
python3 -m pip install -e ".[torch]"
|
||||
|
||||
- name: Show installed libraries and their versions
|
||||
run: |
|
||||
python3 -m pip list
|
||||
python3 -c "import torch; print(f'PyTorch version: {torch.__version__}')"
|
||||
python3 -c "import torch; print(f'CUDA available: {torch.cuda.is_available()}')"
|
||||
python3 -c "import torch; print(f'CUDA device count: {torch.cuda.device_count()}')" || true
|
||||
nvidia-smi || true
|
||||
|
||||
- name: Run benchmark v2
|
||||
working-directory: benchmark_v2
|
||||
env:
|
||||
HF_TOKEN: ${{ secrets.HF_HUB_READ_TOKEN }}
|
||||
COMMIT_ID: ${{ inputs.commit_sha || github.sha }}
|
||||
RUN_ID: ${{ inputs.run_id }}
|
||||
BENCHMARK_REPO_ID: ${{ inputs.benchmark_repo_id }}
|
||||
UPLOAD_TOKEN: ${{ secrets.TRANSFORMERS_CI_RESULTS_UPLOAD_TOKEN }}
|
||||
run: |
|
||||
echo "Running benchmarks"
|
||||
python3 run_benchmarks.py \
|
||||
--commit-id "$COMMIT_ID" \
|
||||
--run-id "$RUN_ID" \
|
||||
--push-to-hub "$BENCHMARK_REPO_ID" \
|
||||
--token "$UPLOAD_TOKEN" \
|
||||
--log-level INFO
|
||||
20
.github/workflows/benchmark_v2_a10_caller.yml
vendored
Normal file
20
.github/workflows/benchmark_v2_a10_caller.yml
vendored
Normal file
@@ -0,0 +1,20 @@
|
||||
name: Benchmark v2 Scheduled Runner - A10 Single-GPU
|
||||
|
||||
on:
|
||||
workflow_dispatch:
|
||||
|
||||
permissions:
|
||||
contents: read
|
||||
|
||||
jobs:
|
||||
benchmark-v2-default:
|
||||
name: Benchmark v2 - Default Models
|
||||
uses: ./.github/workflows/benchmark_v2.yml
|
||||
with:
|
||||
runner: aws-g5-4xlarge-cache-use1-public-80
|
||||
container_image: huggingface/transformers-all-latest-gpu
|
||||
container_options: --gpus all --privileged --ipc host --shm-size "16gb"
|
||||
commit_sha: ${{ github.sha }}
|
||||
run_id: ${{ github.run_id }}
|
||||
benchmark_repo_id: hf-internal-testing/transformers-daily-benchmarks
|
||||
secrets: inherit
|
||||
20
.github/workflows/benchmark_v2_mi325_caller.yml
vendored
Normal file
20
.github/workflows/benchmark_v2_mi325_caller.yml
vendored
Normal file
@@ -0,0 +1,20 @@
|
||||
name: Benchmark v2 Scheduled Runner - MI325 Single-GPU
|
||||
|
||||
on:
|
||||
workflow_dispatch:
|
||||
|
||||
permissions:
|
||||
contents: read
|
||||
|
||||
jobs:
|
||||
benchmark-v2-default:
|
||||
name: Benchmark v2 - Default Models
|
||||
uses: ./.github/workflows/benchmark_v2.yml
|
||||
with:
|
||||
runner: amd-mi325-ci-1gpu
|
||||
container_image: huggingface/transformers-pytorch-amd-gpu
|
||||
container_options: --device /dev/kfd --device /dev/dri --env ROCR_VISIBLE_DEVICES --shm-size "16gb" --ipc host -v /mnt/cache/.cache/huggingface:/mnt/cache
|
||||
commit_sha: ${{ github.sha }}
|
||||
run_id: ${{ github.run_id }}
|
||||
benchmark_repo_id: hf-internal-testing/transformers-daily-benchmarks
|
||||
secrets: inherit
|
||||
84
.github/workflows/build-ci-docker-images.yml
vendored
Normal file
84
.github/workflows/build-ci-docker-images.yml
vendored
Normal file
@@ -0,0 +1,84 @@
|
||||
name: Build pr ci-docker
|
||||
|
||||
on:
|
||||
push:
|
||||
branches:
|
||||
- push-ci-image # for now let's only build on this branch
|
||||
repository_dispatch:
|
||||
workflow_call:
|
||||
inputs:
|
||||
image_postfix:
|
||||
required: true
|
||||
type: string
|
||||
schedule:
|
||||
- cron: "6 0 * * *"
|
||||
|
||||
|
||||
concurrency:
|
||||
group: ${{ github.workflow }}
|
||||
cancel-in-progress: true
|
||||
|
||||
permissions:
|
||||
contents: read
|
||||
|
||||
jobs:
|
||||
build:
|
||||
runs-on: ubuntu-22.04
|
||||
|
||||
if: ${{ contains(github.event.head_commit.message, '[build-ci-image]') || contains(github.event.head_commit.message, '[push-ci-image]') && '!cancelled()' || github.event_name == 'schedule' }}
|
||||
|
||||
strategy:
|
||||
matrix:
|
||||
file: ["quality", "consistency", "custom-tokenizers", "torch-light", "exotic-models", "examples-torch"]
|
||||
continue-on-error: true
|
||||
|
||||
steps:
|
||||
-
|
||||
name: Set tag
|
||||
env:
|
||||
COMMIT_MESSAGE: ${{ github.event.head_commit.message }}
|
||||
run: |
|
||||
if echo "$COMMIT_MESSAGE" | grep -q '\[build-ci-image\]'; then
|
||||
echo "TAG=huggingface/transformers-${{ matrix.file }}:dev" >> "$GITHUB_ENV"
|
||||
echo "setting it to DEV!"
|
||||
else
|
||||
echo "TAG=huggingface/transformers-${{ matrix.file }}" >> "$GITHUB_ENV"
|
||||
|
||||
fi
|
||||
-
|
||||
name: Set up Docker Buildx
|
||||
uses: docker/setup-buildx-action@8d2750c68a42422c14e847fe6c8ac0403b4cbd6f # v3.12.0
|
||||
-
|
||||
name: Check out code
|
||||
uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
|
||||
with:
|
||||
persist-credentials: false
|
||||
-
|
||||
name: Login to DockerHub
|
||||
uses: docker/login-action@c94ce9fb468520275223c153574b00df6fe4bcc9 # v3.7.0
|
||||
with:
|
||||
username: ${{ secrets.DOCKERHUB_USERNAME }}
|
||||
password: ${{ secrets.DOCKERHUB_PASSWORD }}
|
||||
-
|
||||
name: Build ${{ matrix.file }}.dockerfile
|
||||
uses: docker/build-push-action@ca052bb54ab0790a636c9b5f226502c73d547a25 # v5.4.0
|
||||
with:
|
||||
context: ./docker
|
||||
build-args: |
|
||||
REF=${{ github.sha }}
|
||||
file: "./docker/${{ matrix.file }}.dockerfile"
|
||||
push: ${{ contains(github.event.head_commit.message, 'ci-image]') || github.event_name == 'schedule' }}
|
||||
tags: ${{ env.TAG }}
|
||||
|
||||
notify:
|
||||
runs-on: ubuntu-22.04
|
||||
if: ${{ contains(github.event.head_commit.message, '[build-ci-image]') || contains(github.event.head_commit.message, '[push-ci-image]') && '!cancelled()' || github.event_name == 'schedule' }}
|
||||
steps:
|
||||
- name: Post to Slack
|
||||
if: ${{ contains(github.event.head_commit.message, '[push-ci-image]') && github.event_name != 'schedule' }}
|
||||
uses: huggingface/hf-workflows/.github/actions/post-slack@a88e7fa2eaee28de5a4d6142381b1fb792349b67 # main
|
||||
with:
|
||||
slack_channel: "#transformers-ci-circleci-images"
|
||||
title: 🤗 New docker images for CircleCI are pushed.
|
||||
status: ${{ job.status }}
|
||||
slack_token: ${{ secrets.SLACK_CIFEEDBACK_BOT_TOKEN }}
|
||||
321
.github/workflows/build-docker-images.yml
vendored
Normal file
321
.github/workflows/build-docker-images.yml
vendored
Normal file
@@ -0,0 +1,321 @@
|
||||
name: Build docker images (scheduled)
|
||||
|
||||
on:
|
||||
push:
|
||||
branches:
|
||||
- build_ci_docker_image*
|
||||
repository_dispatch:
|
||||
workflow_dispatch:
|
||||
workflow_call:
|
||||
inputs:
|
||||
image_postfix:
|
||||
required: true
|
||||
type: string
|
||||
schedule:
|
||||
- cron: "17 0 * * *"
|
||||
|
||||
concurrency:
|
||||
group: docker-images-builds
|
||||
cancel-in-progress: false
|
||||
|
||||
permissions:
|
||||
contents: read
|
||||
|
||||
jobs:
|
||||
latest-docker:
|
||||
name: "Latest PyTorch [dev]"
|
||||
runs-on:
|
||||
group: aws-general-8-plus
|
||||
steps:
|
||||
-
|
||||
name: Set up Docker Buildx
|
||||
uses: docker/setup-buildx-action@8d2750c68a42422c14e847fe6c8ac0403b4cbd6f # v3.12.0
|
||||
-
|
||||
name: Check out code
|
||||
uses: actions/checkout@34e114876b0b11c390a56381ad16ebd13914f8d5 # v4.3.1
|
||||
with:
|
||||
persist-credentials: false
|
||||
-
|
||||
name: Login to DockerHub
|
||||
uses: docker/login-action@c94ce9fb468520275223c153574b00df6fe4bcc9 # v3.7.0
|
||||
with:
|
||||
username: ${{ secrets.DOCKERHUB_USERNAME }}
|
||||
password: ${{ secrets.DOCKERHUB_PASSWORD }}
|
||||
-
|
||||
name: Build and push
|
||||
uses: docker/build-push-action@ca052bb54ab0790a636c9b5f226502c73d547a25 # v5.4.0
|
||||
with:
|
||||
context: ./docker/transformers-all-latest-gpu
|
||||
build-args: |
|
||||
REF=main
|
||||
push: true
|
||||
tags: huggingface/transformers-all-latest-gpu${{ inputs.image_postfix }}
|
||||
|
||||
- name: Post to Slack
|
||||
if: always()
|
||||
uses: huggingface/hf-workflows/.github/actions/post-slack@63657f571a92cc9759159442936061c51d6d9ae4 # main
|
||||
with:
|
||||
slack_channel: ${{ secrets.CI_SLACK_CHANNEL_DOCKER }}
|
||||
title: 🤗 Results of the transformers-all-latest-gpu docker build
|
||||
status: ${{ job.status }}
|
||||
slack_token: ${{ secrets.SLACK_CIFEEDBACK_BOT_TOKEN }}
|
||||
|
||||
flash-attn-ci-image:
|
||||
name: "PyTorch with Flash Attn [dev]"
|
||||
runs-on:
|
||||
group: aws-general-8-plus
|
||||
steps:
|
||||
-
|
||||
name: Set up Docker Buildx
|
||||
uses: docker/setup-buildx-action@8d2750c68a42422c14e847fe6c8ac0403b4cbd6f # v3.12.0
|
||||
-
|
||||
name: Check out code
|
||||
uses: actions/checkout@34e114876b0b11c390a56381ad16ebd13914f8d5 # v4.3.1
|
||||
with:
|
||||
persist-credentials: false
|
||||
-
|
||||
name: Login to DockerHub
|
||||
uses: docker/login-action@c94ce9fb468520275223c153574b00df6fe4bcc9 # v3.7.0
|
||||
with:
|
||||
username: ${{ secrets.DOCKERHUB_USERNAME }}
|
||||
password: ${{ secrets.DOCKERHUB_PASSWORD }}
|
||||
-
|
||||
name: Build and push
|
||||
uses: docker/build-push-action@ca052bb54ab0790a636c9b5f226502c73d547a25 # v5.4.0
|
||||
with:
|
||||
context: ./docker/transformers-all-latest-gpu
|
||||
build-args: |
|
||||
REF=main
|
||||
PYTORCH=2.8.0
|
||||
TORCHCODEC=0.7.0
|
||||
FLASH_ATTN=yes
|
||||
push: true
|
||||
tags: huggingface/transformers-all-latest-gpu${{ inputs.image_postfix }}:flash-attn
|
||||
|
||||
- name: Post to Slack
|
||||
if: always()
|
||||
uses: huggingface/hf-workflows/.github/actions/post-slack@63657f571a92cc9759159442936061c51d6d9ae4 # main
|
||||
with:
|
||||
slack_channel: ${{ secrets.CI_SLACK_CHANNEL_DOCKER }}
|
||||
title: 🤗 Results of the transformers-all-latest-gpu docker build
|
||||
status: ${{ job.status }}
|
||||
slack_token: ${{ secrets.SLACK_CIFEEDBACK_BOT_TOKEN }}
|
||||
|
||||
latest-torch-deepspeed-docker:
|
||||
name: "Latest PyTorch + DeepSpeed"
|
||||
runs-on:
|
||||
group: aws-general-8-plus
|
||||
steps:
|
||||
-
|
||||
name: Set up Docker Buildx
|
||||
uses: docker/setup-buildx-action@8d2750c68a42422c14e847fe6c8ac0403b4cbd6f # v3.12.0
|
||||
-
|
||||
name: Check out code
|
||||
uses: actions/checkout@34e114876b0b11c390a56381ad16ebd13914f8d5 # v4.3.1
|
||||
with:
|
||||
persist-credentials: false
|
||||
-
|
||||
name: Login to DockerHub
|
||||
uses: docker/login-action@c94ce9fb468520275223c153574b00df6fe4bcc9 # v3.7.0
|
||||
with:
|
||||
username: ${{ secrets.DOCKERHUB_USERNAME }}
|
||||
password: ${{ secrets.DOCKERHUB_PASSWORD }}
|
||||
-
|
||||
name: Build and push
|
||||
uses: docker/build-push-action@ca052bb54ab0790a636c9b5f226502c73d547a25 # v5.4.0
|
||||
with:
|
||||
context: ./docker/transformers-pytorch-deepspeed-latest-gpu
|
||||
build-args: |
|
||||
REF=main
|
||||
push: true
|
||||
tags: huggingface/transformers-pytorch-deepspeed-latest-gpu${{ inputs.image_postfix }}
|
||||
|
||||
- name: Post to Slack
|
||||
if: always()
|
||||
uses: huggingface/hf-workflows/.github/actions/post-slack@63657f571a92cc9759159442936061c51d6d9ae4 # main
|
||||
with:
|
||||
slack_channel: ${{ secrets.CI_SLACK_CHANNEL_DOCKER}}
|
||||
title: 🤗 Results of the transformers-pytorch-deepspeed-latest-gpu docker build
|
||||
status: ${{ job.status }}
|
||||
slack_token: ${{ secrets.SLACK_CIFEEDBACK_BOT_TOKEN }}
|
||||
|
||||
doc-builder:
|
||||
name: "Doc builder"
|
||||
runs-on:
|
||||
group: aws-general-8-plus
|
||||
steps:
|
||||
-
|
||||
name: Set up Docker Buildx
|
||||
uses: docker/setup-buildx-action@8d2750c68a42422c14e847fe6c8ac0403b4cbd6f # v3.12.0
|
||||
-
|
||||
name: Check out code
|
||||
uses: actions/checkout@34e114876b0b11c390a56381ad16ebd13914f8d5 # v4.3.1
|
||||
with:
|
||||
persist-credentials: false
|
||||
-
|
||||
name: Login to DockerHub
|
||||
uses: docker/login-action@c94ce9fb468520275223c153574b00df6fe4bcc9 # v3.7.0
|
||||
with:
|
||||
username: ${{ secrets.DOCKERHUB_USERNAME }}
|
||||
password: ${{ secrets.DOCKERHUB_PASSWORD }}
|
||||
-
|
||||
name: Build and push
|
||||
uses: docker/build-push-action@ca052bb54ab0790a636c9b5f226502c73d547a25 # v5.4.0
|
||||
with:
|
||||
context: ./docker/transformers-doc-builder
|
||||
push: true
|
||||
tags: huggingface/transformers-doc-builder
|
||||
|
||||
- name: Post to Slack
|
||||
if: always()
|
||||
uses: huggingface/hf-workflows/.github/actions/post-slack@63657f571a92cc9759159442936061c51d6d9ae4 # main
|
||||
with:
|
||||
slack_channel: ${{ secrets.CI_SLACK_CHANNEL_DOCKER }}
|
||||
title: 🤗 Results of the huggingface/transformers-doc-builder docker build
|
||||
status: ${{ job.status }}
|
||||
slack_token: ${{ secrets.SLACK_CIFEEDBACK_BOT_TOKEN }}
|
||||
|
||||
latest-pytorch-amd:
|
||||
name: "Latest PyTorch (AMD) [dev]"
|
||||
runs-on:
|
||||
group: aws-highcpu-32-priv
|
||||
steps:
|
||||
-
|
||||
name: Set up Docker Buildx
|
||||
uses: docker/setup-buildx-action@8d2750c68a42422c14e847fe6c8ac0403b4cbd6f # v3.12.0
|
||||
-
|
||||
name: Check out code
|
||||
uses: actions/checkout@34e114876b0b11c390a56381ad16ebd13914f8d5 # v4.3.1
|
||||
with:
|
||||
persist-credentials: false
|
||||
-
|
||||
name: Login to DockerHub
|
||||
uses: docker/login-action@c94ce9fb468520275223c153574b00df6fe4bcc9 # v3.7.0
|
||||
with:
|
||||
username: ${{ secrets.DOCKERHUB_USERNAME }}
|
||||
password: ${{ secrets.DOCKERHUB_PASSWORD }}
|
||||
-
|
||||
name: Build and push
|
||||
uses: docker/build-push-action@ca052bb54ab0790a636c9b5f226502c73d547a25 # v5.4.0
|
||||
with:
|
||||
context: ./docker/transformers-pytorch-amd-gpu
|
||||
build-args: |
|
||||
REF=main
|
||||
push: true
|
||||
tags: huggingface/transformers-pytorch-amd-gpu${{ inputs.image_postfix }}
|
||||
|
||||
- name: Post to Slack
|
||||
if: always()
|
||||
uses: huggingface/hf-workflows/.github/actions/post-slack@63657f571a92cc9759159442936061c51d6d9ae4 # main
|
||||
with:
|
||||
slack_channel: ${{ secrets.CI_SLACK_CHANNEL_DOCKER }}
|
||||
title: 🤗 Results of the huggingface/transformers-pytorch-amd-gpu build
|
||||
status: ${{ job.status }}
|
||||
slack_token: ${{ secrets.SLACK_CIFEEDBACK_BOT_TOKEN }}
|
||||
|
||||
cache-latest-pytorch-amd:
|
||||
name: "Cache Latest Pytorch (AMD) Image"
|
||||
needs: latest-pytorch-amd
|
||||
runs-on:
|
||||
group: amd-mi325-1gpu
|
||||
steps:
|
||||
-
|
||||
name: Login to DockerHub
|
||||
uses: docker/login-action@c94ce9fb468520275223c153574b00df6fe4bcc9 # v3.7.0
|
||||
with:
|
||||
username: ${{ secrets.DOCKERHUB_USERNAME }}
|
||||
password: ${{ secrets.DOCKERHUB_PASSWORD }}
|
||||
|
||||
-
|
||||
name: Pull and save docker image to cache
|
||||
run: |
|
||||
image="huggingface/transformers-pytorch-amd-gpu"
|
||||
final_path="/mnt/image-cache/transformers-pytorch-amd-gpu.tar"
|
||||
tmp_path="${final_path}.tmp"
|
||||
|
||||
echo "Pulling image: ${image}"
|
||||
docker pull "${image}"
|
||||
|
||||
echo "Saving to temp file: ${tmp_path}"
|
||||
docker save "${image}" -o "${tmp_path}"
|
||||
|
||||
echo "Moving to final path: ${final_path}"
|
||||
mv -f "${tmp_path}" "${final_path}"
|
||||
|
||||
echo "Cache populated successfully at ${final_path}"
|
||||
|
||||
latest-pytorch-deepspeed-amd:
|
||||
name: "PyTorch + DeepSpeed (AMD) [dev]"
|
||||
runs-on:
|
||||
group: aws-general-8-plus
|
||||
steps:
|
||||
-
|
||||
name: Set up Docker Buildx
|
||||
uses: docker/setup-buildx-action@8d2750c68a42422c14e847fe6c8ac0403b4cbd6f # v3.12.0
|
||||
-
|
||||
name: Check out code
|
||||
uses: actions/checkout@34e114876b0b11c390a56381ad16ebd13914f8d5 # v4.3.1
|
||||
with:
|
||||
persist-credentials: false
|
||||
-
|
||||
name: Login to DockerHub
|
||||
uses: docker/login-action@c94ce9fb468520275223c153574b00df6fe4bcc9 # v3.7.0
|
||||
with:
|
||||
username: ${{ secrets.DOCKERHUB_USERNAME }}
|
||||
password: ${{ secrets.DOCKERHUB_PASSWORD }}
|
||||
-
|
||||
name: Build and push
|
||||
uses: docker/build-push-action@ca052bb54ab0790a636c9b5f226502c73d547a25 # v5.4.0
|
||||
with:
|
||||
context: ./docker/transformers-pytorch-deepspeed-amd-gpu
|
||||
build-args: |
|
||||
REF=main
|
||||
push: true
|
||||
tags: huggingface/transformers-pytorch-deepspeed-amd-gpu${{ inputs.image_postfix }}
|
||||
|
||||
- name: Post to Slack
|
||||
if: always()
|
||||
uses: huggingface/hf-workflows/.github/actions/post-slack@63657f571a92cc9759159442936061c51d6d9ae4 # main
|
||||
with:
|
||||
slack_channel: ${{ secrets.CI_SLACK_CHANNEL_DOCKER }}
|
||||
title: 🤗 Results of the transformers-pytorch-deepspeed-amd-gpu build
|
||||
status: ${{ job.status }}
|
||||
slack_token: ${{ secrets.SLACK_CIFEEDBACK_BOT_TOKEN }}
|
||||
|
||||
latest-quantization-torch-docker:
|
||||
name: "Latest Pytorch + Quantization [dev]"
|
||||
runs-on:
|
||||
group: aws-general-8-plus
|
||||
steps:
|
||||
-
|
||||
name: Set up Docker Buildx
|
||||
uses: docker/setup-buildx-action@8d2750c68a42422c14e847fe6c8ac0403b4cbd6f # v3.12.0
|
||||
-
|
||||
name: Check out code
|
||||
uses: actions/checkout@34e114876b0b11c390a56381ad16ebd13914f8d5 # v4.3.1
|
||||
with:
|
||||
persist-credentials: false
|
||||
-
|
||||
name: Login to DockerHub
|
||||
uses: docker/login-action@c94ce9fb468520275223c153574b00df6fe4bcc9 # v3.7.0
|
||||
with:
|
||||
username: ${{ secrets.DOCKERHUB_USERNAME }}
|
||||
password: ${{ secrets.DOCKERHUB_PASSWORD }}
|
||||
-
|
||||
name: Build and push
|
||||
uses: docker/build-push-action@ca052bb54ab0790a636c9b5f226502c73d547a25 # v5.4.0
|
||||
with:
|
||||
context: ./docker/transformers-quantization-latest-gpu
|
||||
build-args: |
|
||||
REF=main
|
||||
push: true
|
||||
tags: huggingface/transformers-quantization-latest-gpu${{ inputs.image_postfix }}
|
||||
|
||||
- name: Post to Slack
|
||||
if: always()
|
||||
uses: huggingface/hf-workflows/.github/actions/post-slack@63657f571a92cc9759159442936061c51d6d9ae4 # main
|
||||
with:
|
||||
slack_channel: ${{ secrets.CI_SLACK_CHANNEL_DOCKER }}
|
||||
title: 🤗 Results of the transformers-quantization-latest-gpu build
|
||||
status: ${{ job.status }}
|
||||
slack_token: ${{ secrets.SLACK_CIFEEDBACK_BOT_TOKEN }}
|
||||
80
.github/workflows/build-nightly-ci-docker-images.yml
vendored
Normal file
80
.github/workflows/build-nightly-ci-docker-images.yml
vendored
Normal file
@@ -0,0 +1,80 @@
|
||||
name: Build docker images (Nightly CI)
|
||||
|
||||
on:
|
||||
workflow_call:
|
||||
inputs:
|
||||
job:
|
||||
required: true
|
||||
type: string
|
||||
push:
|
||||
branches:
|
||||
- build_nightly_ci_docker_image*
|
||||
|
||||
concurrency:
|
||||
group: docker-images-builds
|
||||
cancel-in-progress: false
|
||||
|
||||
permissions:
|
||||
contents: read
|
||||
|
||||
jobs:
|
||||
latest-with-torch-nightly-docker:
|
||||
name: "Nightly PyTorch"
|
||||
if: inputs.job == 'latest-with-torch-nightly-docker' || inputs.job == ''
|
||||
runs-on:
|
||||
group: aws-general-8-plus
|
||||
steps:
|
||||
-
|
||||
name: Set up Docker Buildx
|
||||
uses: docker/setup-buildx-action@885d1462b80bc1c1c7f0b00334ad271f09369c55 # v2.10.0
|
||||
-
|
||||
name: Check out code
|
||||
uses: actions/checkout@34e114876b0b11c390a56381ad16ebd13914f8d5 # v4.3.1
|
||||
with:
|
||||
persist-credentials: false
|
||||
-
|
||||
name: Login to DockerHub
|
||||
uses: docker/login-action@465a07811f14bebb1938fbed4728c6a1ff8901fc # v2.2.0
|
||||
with:
|
||||
username: ${{ secrets.DOCKERHUB_USERNAME }}
|
||||
password: ${{ secrets.DOCKERHUB_PASSWORD }}
|
||||
-
|
||||
name: Build and push
|
||||
uses: docker/build-push-action@1104d471370f9806843c095c1db02b5a90c5f8b6 # v3.3.1
|
||||
with:
|
||||
context: ./docker/transformers-all-latest-gpu
|
||||
build-args: |
|
||||
REF=main
|
||||
PYTORCH=pre
|
||||
push: true
|
||||
tags: huggingface/transformers-all-latest-torch-nightly-gpu
|
||||
|
||||
nightly-torch-deepspeed-docker:
|
||||
name: "Nightly PyTorch + DeepSpeed"
|
||||
if: inputs.job == 'nightly-torch-deepspeed-docker' || inputs.job == ''
|
||||
runs-on:
|
||||
group: aws-g4dn-2xlarge-cache
|
||||
steps:
|
||||
-
|
||||
name: Set up Docker Buildx
|
||||
uses: docker/setup-buildx-action@885d1462b80bc1c1c7f0b00334ad271f09369c55 # v2.10.0
|
||||
-
|
||||
name: Check out code
|
||||
uses: actions/checkout@34e114876b0b11c390a56381ad16ebd13914f8d5 # v4.3.1
|
||||
with:
|
||||
persist-credentials: false
|
||||
-
|
||||
name: Login to DockerHub
|
||||
uses: docker/login-action@465a07811f14bebb1938fbed4728c6a1ff8901fc # v2.2.0
|
||||
with:
|
||||
username: ${{ secrets.DOCKERHUB_USERNAME }}
|
||||
password: ${{ secrets.DOCKERHUB_PASSWORD }}
|
||||
-
|
||||
name: Build and push
|
||||
uses: docker/build-push-action@1104d471370f9806843c095c1db02b5a90c5f8b6 # v3.3.1
|
||||
with:
|
||||
context: ./docker/transformers-pytorch-deepspeed-nightly-gpu
|
||||
build-args: |
|
||||
REF=main
|
||||
push: true
|
||||
tags: huggingface/transformers-pytorch-deepspeed-nightly-gpu
|
||||
108
.github/workflows/build-past-ci-docker-images.yml
vendored
Normal file
108
.github/workflows/build-past-ci-docker-images.yml
vendored
Normal file
@@ -0,0 +1,108 @@
|
||||
name: Build docker images (Past CI)
|
||||
|
||||
on:
|
||||
push:
|
||||
branches:
|
||||
- build_past_ci_docker_image*
|
||||
|
||||
concurrency:
|
||||
group: docker-images-builds
|
||||
cancel-in-progress: false
|
||||
|
||||
permissions:
|
||||
contents: read
|
||||
|
||||
jobs:
|
||||
past-pytorch-docker:
|
||||
name: "Past PyTorch Docker"
|
||||
strategy:
|
||||
fail-fast: false
|
||||
matrix:
|
||||
version: ["1.13", "1.12", "1.11"]
|
||||
runs-on:
|
||||
group: aws-general-8-plus
|
||||
steps:
|
||||
-
|
||||
name: Set up Docker Buildx
|
||||
uses: docker/setup-buildx-action@885d1462b80bc1c1c7f0b00334ad271f09369c55 # v2.10.0
|
||||
-
|
||||
name: Check out code
|
||||
uses: actions/checkout@34e114876b0b11c390a56381ad16ebd13914f8d5 # v4.3.1
|
||||
with:
|
||||
persist-credentials: false
|
||||
-
|
||||
id: get-base-image
|
||||
name: Get Base Image
|
||||
env:
|
||||
framework_version: ${{ matrix.version }}
|
||||
run: |
|
||||
echo "base_image=$(python3 -c 'import os; from utils.past_ci_versions import past_versions_testing; base_image = past_versions_testing["pytorch"][os.environ["framework_version"]]["base_image"]; print(base_image)')" >> $GITHUB_OUTPUT
|
||||
-
|
||||
name: Print Base Image
|
||||
run: |
|
||||
echo ${{ steps.get-base-image.outputs.base_image }}
|
||||
-
|
||||
name: Login to DockerHub
|
||||
uses: docker/login-action@465a07811f14bebb1938fbed4728c6a1ff8901fc # v2.2.0
|
||||
with:
|
||||
username: ${{ secrets.DOCKERHUB_USERNAME }}
|
||||
password: ${{ secrets.DOCKERHUB_PASSWORD }}
|
||||
-
|
||||
name: Build and push
|
||||
uses: docker/build-push-action@1104d471370f9806843c095c1db02b5a90c5f8b6 # v3.3.1
|
||||
with:
|
||||
context: ./docker/transformers-past-gpu
|
||||
build-args: |
|
||||
REF=main
|
||||
BASE_DOCKER_IMAGE=${{ steps.get-base-image.outputs.base_image }}
|
||||
FRAMEWORK=pytorch
|
||||
VERSION=${{ matrix.version }}
|
||||
push: true
|
||||
tags: huggingface/transformers-pytorch-past-${{ matrix.version }}-gpu
|
||||
|
||||
past-tensorflow-docker:
|
||||
name: "Past TensorFlow Docker"
|
||||
strategy:
|
||||
fail-fast: false
|
||||
matrix:
|
||||
version: ["2.11", "2.10", "2.9", "2.8", "2.7", "2.6", "2.5"]
|
||||
runs-on:
|
||||
group: aws-general-8-plus
|
||||
steps:
|
||||
-
|
||||
name: Set up Docker Buildx
|
||||
uses: docker/setup-buildx-action@885d1462b80bc1c1c7f0b00334ad271f09369c55 # v2.10.0
|
||||
-
|
||||
name: Check out code
|
||||
uses: actions/checkout@34e114876b0b11c390a56381ad16ebd13914f8d5 # v4.3.1
|
||||
with:
|
||||
persist-credentials: false
|
||||
-
|
||||
id: get-base-image
|
||||
name: Get Base Image
|
||||
env:
|
||||
framework_version: ${{ matrix.version }}
|
||||
run: |
|
||||
echo "base_image=$(python3 -c 'import os; from utils.past_ci_versions import past_versions_testing; base_image = past_versions_testing["tensorflow"][os.environ["framework_version"]]["base_image"]; print(base_image)')" >> $GITHUB_OUTPUT
|
||||
-
|
||||
name: Print Base Image
|
||||
run: |
|
||||
echo ${{ steps.get-base-image.outputs.base_image }}
|
||||
-
|
||||
name: Login to DockerHub
|
||||
uses: docker/login-action@465a07811f14bebb1938fbed4728c6a1ff8901fc # v2.2.0
|
||||
with:
|
||||
username: ${{ secrets.DOCKERHUB_USERNAME }}
|
||||
password: ${{ secrets.DOCKERHUB_PASSWORD }}
|
||||
-
|
||||
name: Build and push
|
||||
uses: docker/build-push-action@1104d471370f9806843c095c1db02b5a90c5f8b6 # v3.3.1
|
||||
with:
|
||||
context: ./docker/transformers-past-gpu
|
||||
build-args: |
|
||||
REF=main
|
||||
BASE_DOCKER_IMAGE=${{ steps.get-base-image.outputs.base_image }}
|
||||
FRAMEWORK=tensorflow
|
||||
VERSION=${{ matrix.version }}
|
||||
push: true
|
||||
tags: huggingface/transformers-tensorflow-past-${{ matrix.version }}-gpu
|
||||
38
.github/workflows/build_documentation.yml
vendored
Normal file
38
.github/workflows/build_documentation.yml
vendored
Normal file
@@ -0,0 +1,38 @@
|
||||
name: Build documentation
|
||||
|
||||
on:
|
||||
workflow_dispatch:
|
||||
push:
|
||||
branches:
|
||||
- main
|
||||
- doc-builder*
|
||||
- v*-release
|
||||
- use_templates
|
||||
|
||||
permissions:
|
||||
contents: read
|
||||
|
||||
jobs:
|
||||
build:
|
||||
uses: huggingface/doc-builder/.github/workflows/build_main_documentation.yml@2430c1ec91d04667414e2fa31ecfc36c153ea391 # main
|
||||
with:
|
||||
commit_sha: ${{ github.sha }}
|
||||
package: transformers
|
||||
notebook_folder: transformers_doc
|
||||
languages: en
|
||||
custom_container: huggingface/transformers-doc-builder
|
||||
secrets:
|
||||
token: ${{ secrets.HUGGINGFACE_PUSH }}
|
||||
hf_token: ${{ secrets.HF_DOC_BUILD_PUSH }}
|
||||
|
||||
build_other_lang:
|
||||
uses: huggingface/doc-builder/.github/workflows/build_main_documentation.yml@2430c1ec91d04667414e2fa31ecfc36c153ea391 # main
|
||||
with:
|
||||
commit_sha: ${{ github.sha }}
|
||||
package: transformers
|
||||
notebook_folder: transformers_doc
|
||||
languages: ar de es fr hi it ja ko pt ro tr zh
|
||||
custom_container: huggingface/transformers-doc-builder
|
||||
secrets:
|
||||
token: ${{ secrets.HUGGINGFACE_PUSH }}
|
||||
hf_token: ${{ secrets.HF_DOC_BUILD_PUSH }}
|
||||
42
.github/workflows/build_pr_documentation.yml
vendored
Normal file
42
.github/workflows/build_pr_documentation.yml
vendored
Normal file
@@ -0,0 +1,42 @@
|
||||
name: Build PR Documentation
|
||||
|
||||
on:
|
||||
pull_request:
|
||||
merge_group:
|
||||
|
||||
concurrency:
|
||||
group: ${{ github.workflow }}-${{ github.head_ref || github.run_id }}
|
||||
cancel-in-progress: true
|
||||
|
||||
permissions:
|
||||
contents: read
|
||||
|
||||
jobs:
|
||||
build:
|
||||
if: github.event_name == 'pull_request'
|
||||
uses: huggingface/doc-builder/.github/workflows/build_pr_documentation.yml@90b4ee2c10b81b5c1a6367c4e6fc9e2fb510a7e3 # main
|
||||
with:
|
||||
commit_sha: ${{ github.event.pull_request.head.sha }}
|
||||
pr_number: ${{ github.event.number }}
|
||||
package: transformers
|
||||
languages: en
|
||||
|
||||
# Satisfy required check in merge queue without actually building docs
|
||||
skip_merge_queue:
|
||||
if: github.event_name == 'merge_group'
|
||||
runs-on: ubuntu-latest
|
||||
steps:
|
||||
- run: echo "Skipping doc build in merge queue"
|
||||
|
||||
doc_build_status_check:
|
||||
needs: [build, skip_merge_queue]
|
||||
if: always()
|
||||
runs-on: ubuntu-latest
|
||||
steps:
|
||||
- run: |
|
||||
if [[ "${{ needs.build.result }}" == "success" || "${{ needs.build.result }}" == "skipped" ]] && \
|
||||
[[ "${{ needs.skip_merge_queue.result }}" == "success" || "${{ needs.skip_merge_queue.result }}" == "skipped" ]]; then
|
||||
echo "OK"
|
||||
else
|
||||
exit 1
|
||||
fi
|
||||
26
.github/workflows/check-workflow-permissions.yml
vendored
Normal file
26
.github/workflows/check-workflow-permissions.yml
vendored
Normal file
@@ -0,0 +1,26 @@
|
||||
---
|
||||
name: Check Permissions Advisor
|
||||
|
||||
on:
|
||||
workflow_dispatch:
|
||||
inputs:
|
||||
workflow_name:
|
||||
description: 'Workflow file name'
|
||||
type: string
|
||||
run_count:
|
||||
description: 'Number of runs to analyze'
|
||||
type: string
|
||||
default: "10"
|
||||
|
||||
permissions:
|
||||
contents: read
|
||||
|
||||
jobs:
|
||||
advisor:
|
||||
uses: huggingface/security-workflows/.github/workflows/permissions-advisor-reusable.yml@1b6a139c28db347498b30338da6a602e0a06f56c # main
|
||||
permissions:
|
||||
actions: read
|
||||
contents: read
|
||||
with:
|
||||
workflow_name: ${{ inputs.workflow_name }}
|
||||
run_count: ${{ fromJSON(inputs.run_count) }}
|
||||
414
.github/workflows/check_failed_tests.yml
vendored
Normal file
414
.github/workflows/check_failed_tests.yml
vendored
Normal file
@@ -0,0 +1,414 @@
|
||||
name: Process failed tests
|
||||
|
||||
on:
|
||||
workflow_call:
|
||||
inputs:
|
||||
docker:
|
||||
required: true
|
||||
type: string
|
||||
job:
|
||||
required: true
|
||||
type: string
|
||||
slack_report_channel:
|
||||
required: true
|
||||
type: string
|
||||
ci_event:
|
||||
required: true
|
||||
type: string
|
||||
report_repo_id:
|
||||
required: true
|
||||
type: string
|
||||
commit_sha:
|
||||
required: false
|
||||
type: string
|
||||
pr_number:
|
||||
required: false
|
||||
type: string
|
||||
max_num_runners:
|
||||
required: false
|
||||
type: number
|
||||
default: 4
|
||||
outputs:
|
||||
is_check_failures_ok:
|
||||
description: "Whether the failure checking infrastructure succeeded"
|
||||
value: ${{ jobs.check_new_failures.result != 'failure' && jobs.process_new_failures_with_commit_info.result != 'failure' }}
|
||||
|
||||
env:
|
||||
HF_HOME: /mnt/cache
|
||||
TRANSFORMERS_IS_CI: yes
|
||||
OMP_NUM_THREADS: 8
|
||||
MKL_NUM_THREADS: 8
|
||||
RUN_SLOW: yes
|
||||
# For gated repositories, we still need to agree to share information on the Hub repo. page in order to get access.
|
||||
# This token is created under the bot `hf-transformers-bot`.
|
||||
HF_TOKEN: ${{ secrets.HF_HUB_READ_TOKEN }}
|
||||
TF_FORCE_GPU_ALLOW_GROWTH: true
|
||||
CUDA_VISIBLE_DEVICES: 0,1
|
||||
|
||||
|
||||
permissions:
|
||||
contents: read
|
||||
|
||||
jobs:
|
||||
setup_check_new_failures:
|
||||
name: "Setup matrix for finding commits"
|
||||
runs-on: ubuntu-22.04
|
||||
outputs:
|
||||
matrix: ${{ steps.set-matrix.outputs.matrix }}
|
||||
n_runners: ${{ steps.set-matrix.outputs.n_runners }}
|
||||
process: ${{ steps.set-matrix.outputs.process }}
|
||||
steps:
|
||||
- uses: actions/download-artifact@3e5f45b2cfb9172054b4087a40e8e0b5a5461e7c # v8.0.1
|
||||
continue-on-error: true
|
||||
with:
|
||||
name: ci_results_${{ inputs.job }}
|
||||
path: ci_results_${{ inputs.job }}
|
||||
|
||||
- name: Set matrix
|
||||
id: set-matrix
|
||||
env:
|
||||
job: ${{ inputs.job }}
|
||||
max_num_runners: ${{ inputs.max_num_runners }}
|
||||
run: |
|
||||
python3 - << 'EOF'
|
||||
import json, os, math
|
||||
|
||||
print("Script started")
|
||||
|
||||
job = os.environ["job"]
|
||||
filepath = f"ci_results_{job}/new_failures.json"
|
||||
|
||||
print(f"Looking for file: {filepath}")
|
||||
print(f"File exists: {os.path.isfile(filepath)}")
|
||||
|
||||
if not os.path.isfile(filepath):
|
||||
print("File not found, setting process=false")
|
||||
with open(os.environ["GITHUB_OUTPUT"], "a") as f:
|
||||
f.write("process=false\n")
|
||||
exit(0)
|
||||
|
||||
with open(filepath) as f:
|
||||
reports = json.load(f)
|
||||
|
||||
print(f"Loaded reports with {len(reports)} models")
|
||||
|
||||
n_tests = sum(
|
||||
len(model_data.get("failures", model_data).get("single-gpu", []))
|
||||
for model_data in reports.values()
|
||||
)
|
||||
|
||||
print(f"n_tests: {n_tests}")
|
||||
|
||||
max_num_runners = int(os.environ["max_num_runners"])
|
||||
|
||||
TESTS_PER_RUNNER = 10
|
||||
n_runners = max(1, min(max_num_runners, math.ceil(n_tests / TESTS_PER_RUNNER)))
|
||||
|
||||
print(f"n_runners: {n_runners}")
|
||||
|
||||
with open(os.environ["GITHUB_OUTPUT"], "a") as f:
|
||||
f.write(f"matrix={json.dumps(list(range(n_runners)))}\n")
|
||||
f.write(f"n_runners={n_runners}\n")
|
||||
f.write("process=true\n")
|
||||
|
||||
print("Done")
|
||||
EOF
|
||||
|
||||
|
||||
check_new_failures:
|
||||
name: "Find commits for new failing tests"
|
||||
needs: setup_check_new_failures
|
||||
if: needs.setup_check_new_failures.outputs.process == 'true'
|
||||
strategy:
|
||||
matrix:
|
||||
run_idx: ${{ fromJson(needs.setup_check_new_failures.outputs.matrix) }}
|
||||
runs-on:
|
||||
group: aws-g5-4xlarge-cache
|
||||
outputs:
|
||||
process: ${{ needs.setup_check_new_failures.outputs.process }}
|
||||
container:
|
||||
image: ${{ inputs.docker }}
|
||||
options: --gpus all --shm-size "16gb" --ipc host -v /mnt/cache/.cache/huggingface:/mnt/cache/
|
||||
steps:
|
||||
- uses: actions/download-artifact@3e5f45b2cfb9172054b4087a40e8e0b5a5461e7c # v8.0.1
|
||||
with:
|
||||
name: ci_results_${{ inputs.job }}
|
||||
path: /transformers/ci_results_${{ inputs.job }}
|
||||
|
||||
- uses: actions/download-artifact@3e5f45b2cfb9172054b4087a40e8e0b5a5461e7c # v8.0.1
|
||||
env:
|
||||
ACTIONS_ARTIFACT_MAX_ARTIFACT_COUNT: 2000
|
||||
with:
|
||||
pattern: setup_values*
|
||||
path: setup_values
|
||||
merge-multiple: true
|
||||
github-token: ${{ secrets.GITHUB_TOKEN }}
|
||||
|
||||
- name: Prepare some setup values
|
||||
run: |
|
||||
if [ -f setup_values/prev_workflow_run_id.txt ]; then
|
||||
echo "PREV_WORKFLOW_RUN_ID=$(cat setup_values/prev_workflow_run_id.txt)" >> $GITHUB_ENV
|
||||
else
|
||||
echo "PREV_WORKFLOW_RUN_ID=" >> $GITHUB_ENV
|
||||
fi
|
||||
|
||||
- name: Update clone
|
||||
working-directory: /transformers
|
||||
env:
|
||||
commit_sha: ${{ inputs.commit_sha || github.sha }}
|
||||
run: |
|
||||
git fetch origin "$commit_sha" && git checkout "$commit_sha"
|
||||
|
||||
- name: Get `START_SHA`
|
||||
working-directory: /transformers/utils
|
||||
env:
|
||||
commit_sha: ${{ inputs.commit_sha || github.sha }}
|
||||
run: |
|
||||
echo "START_SHA=$commit_sha" >> $GITHUB_ENV
|
||||
|
||||
# This is used if the CI is triggered from a pull request `self-comment-ci.yml` (after security check is verified)
|
||||
- name: Extract the base commit on `main` (of the merge commit created by Github) if it is a PR
|
||||
id: pr_info
|
||||
if: ${{ inputs.pr_number != '' }}
|
||||
uses: actions/github-script@d7906e4ad0b1822421a7e6a35d5ca353c962f410 # v6.4.1
|
||||
env:
|
||||
PR_NUMBER: ${{ inputs.pr_number }}
|
||||
COMMIT_SHA: ${{ inputs.commit_sha }}
|
||||
with:
|
||||
script: |
|
||||
const pull_number = parseInt(process.env.PR_NUMBER, 10);
|
||||
const commit_sha = process.env.COMMIT_SHA;
|
||||
|
||||
const { data: pr } = await github.rest.pulls.get({
|
||||
owner: context.repo.owner,
|
||||
repo: context.repo.repo,
|
||||
pull_number,
|
||||
});
|
||||
|
||||
const { data: merge_commit } = await github.rest.repos.getCommit({
|
||||
owner: pr.base.repo.owner.login,
|
||||
repo: pr.base.repo.name,
|
||||
ref: commit_sha,
|
||||
});
|
||||
|
||||
core.setOutput('merge_commit_base_sha', merge_commit.parents[0].sha);
|
||||
|
||||
# Usually, `END_SHA` should be the commit of the last previous workflow run of the **SAME** (scheduled) workflow.
|
||||
# (This is why we don't need to specify `workflow_id` which would be fetched automatically in the python script.)
|
||||
- name: Get `END_SHA` from previous CI runs of the same workflow
|
||||
working-directory: /transformers/utils
|
||||
if: ${{ inputs.pr_number == '' }}
|
||||
env:
|
||||
ACCESS_TOKEN: ${{ secrets.ACCESS_REPO_INFO_TOKEN }}
|
||||
run: |
|
||||
echo "END_SHA=$(TOKEN="$ACCESS_TOKEN" python3 -c 'import os; from get_previous_daily_ci import get_last_daily_ci_run_commit; commit=get_last_daily_ci_run_commit(token=os.environ["TOKEN"], workflow_run_id=os.environ["PREV_WORKFLOW_RUN_ID"]); print(commit)')" >> $GITHUB_ENV
|
||||
|
||||
# However, for workflow runs triggered by `issue_comment` (for pull requests), we want to check against the
|
||||
# parent commit (on `main`) of the `merge_commit` (dynamically created by GitHub). In this case, the goal is to
|
||||
# see if a reported failing test is actually ONLY failing on the `merge_commit`.
|
||||
- name: Set `END_SHA`
|
||||
if: ${{ inputs.pr_number != '' }}
|
||||
env:
|
||||
merge_commit_base_sha: ${{ steps.pr_info.outputs.merge_commit_base_sha }}
|
||||
run: |
|
||||
echo "END_SHA=$merge_commit_base_sha" >> $GITHUB_ENV
|
||||
|
||||
- name: Reinstall transformers in edit mode (remove the one installed during docker image build)
|
||||
working-directory: /transformers
|
||||
run: python3 -m pip uninstall -y transformers && python3 -m pip install -e .
|
||||
|
||||
- name: NVIDIA-SMI
|
||||
run: |
|
||||
nvidia-smi
|
||||
|
||||
- name: Environment
|
||||
working-directory: /transformers
|
||||
run: |
|
||||
python3 utils/print_env.py
|
||||
|
||||
- name: Install pytest-flakefinder
|
||||
run: python3 -m pip install pytest-flakefinder
|
||||
|
||||
- name: Show installed libraries and their versions
|
||||
working-directory: /transformers
|
||||
run: pip freeze
|
||||
|
||||
- name: Check failed tests
|
||||
working-directory: /transformers
|
||||
env:
|
||||
job: ${{ inputs.job }}
|
||||
n_runners: ${{ needs.setup_check_new_failures.outputs.n_runners }}
|
||||
run_idx: ${{ matrix.run_idx }}
|
||||
pr_number: ${{ inputs.pr_number }}
|
||||
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
|
||||
run: python3 utils/check_bad_commit.py --start_commit "$START_SHA" --end_commit "$END_SHA" --file "ci_results_${job}/new_failures.json" --output_file "new_failures_with_bad_commit_${job}_${run_idx}.json"
|
||||
|
||||
- name: Show results
|
||||
working-directory: /transformers
|
||||
env:
|
||||
job: ${{ inputs.job }}
|
||||
run_idx: ${{ matrix.run_idx }}
|
||||
run: |
|
||||
ls -l "new_failures_with_bad_commit_${job}_${run_idx}.json"
|
||||
cat "new_failures_with_bad_commit_${job}_${run_idx}.json"
|
||||
|
||||
- name: Upload artifacts
|
||||
uses: actions/upload-artifact@ea165f8d65b6e75b540449e92b4886f43607fa02 # v4.6.2
|
||||
with:
|
||||
name: new_failures_with_bad_commit_${{ inputs.job }}_${{ matrix.run_idx }}
|
||||
path: /transformers/new_failures_with_bad_commit_${{ inputs.job }}_${{ matrix.run_idx }}.json
|
||||
|
||||
process_new_failures_with_commit_info:
|
||||
name: "process bad commit reports"
|
||||
needs: check_new_failures
|
||||
if: needs.check_new_failures.outputs.process == 'true'
|
||||
runs-on:
|
||||
group: aws-g5-4xlarge-cache
|
||||
container:
|
||||
image: ${{ inputs.docker }}
|
||||
options: --gpus all --shm-size "16gb" --ipc host -v /mnt/cache/.cache/huggingface:/mnt/cache/
|
||||
steps:
|
||||
- uses: actions/download-artifact@3e5f45b2cfb9172054b4087a40e8e0b5a5461e7c # v8.0.1
|
||||
with:
|
||||
name: ci_results_${{ inputs.job }}
|
||||
path: /transformers/ci_results_${{ inputs.job }}
|
||||
|
||||
- uses: actions/download-artifact@3e5f45b2cfb9172054b4087a40e8e0b5a5461e7c # v8.0.1
|
||||
env:
|
||||
ACTIONS_ARTIFACT_MAX_ARTIFACT_COUNT: 2000
|
||||
with:
|
||||
pattern: new_failures_with_bad_commit_${{ inputs.job }}*
|
||||
path: /transformers/new_failures_with_bad_commit_${{ inputs.job }}
|
||||
merge-multiple: true
|
||||
github-token: ${{ secrets.GITHUB_TOKEN }}
|
||||
|
||||
- name: Check files
|
||||
working-directory: /transformers
|
||||
env:
|
||||
job: ${{ inputs.job }}
|
||||
run: |
|
||||
ls -la /transformers
|
||||
ls -la "/transformers/new_failures_with_bad_commit_${job}"
|
||||
|
||||
# Currently, we only run with a single runner by using `run_idx: [1]`. We might try to run with multiple runners
|
||||
# to further reduce the false positive caused by flaky tests, which requires further processing to merge reports.
|
||||
- name: Merge files
|
||||
shell: bash
|
||||
working-directory: /transformers
|
||||
env:
|
||||
job: ${{ inputs.job }}
|
||||
run: |
|
||||
python3 - << 'EOF'
|
||||
import json
|
||||
import glob
|
||||
import os
|
||||
|
||||
job = os.environ["job"]
|
||||
pattern = f"/transformers/new_failures_with_bad_commit_{job}/new_failures_with_bad_commit_{job}_*.json"
|
||||
files = sorted(glob.glob(pattern))
|
||||
|
||||
if not files:
|
||||
print(f"No files found matching: {pattern}")
|
||||
exit(1)
|
||||
|
||||
print(f"Found {len(files)} file(s) to merge: {files}")
|
||||
|
||||
merged = {}
|
||||
for filepath in files:
|
||||
with open(filepath) as f:
|
||||
data = json.load(f)
|
||||
|
||||
for model, model_results in data.items():
|
||||
if model not in merged:
|
||||
merged[model] = {}
|
||||
for gpu_type, failures in model_results.items():
|
||||
if gpu_type not in merged[model]:
|
||||
merged[model][gpu_type] = []
|
||||
merged[model][gpu_type].extend(failures)
|
||||
|
||||
print(f"filepath: {filepath}")
|
||||
print(len(data))
|
||||
|
||||
output_path = "/transformers/new_failures_with_bad_commit.json"
|
||||
with open(output_path, "w") as f:
|
||||
json.dump(merged, f, indent=4)
|
||||
|
||||
print(f"Merged {len(files)} file(s) into {output_path}")
|
||||
print(f"n_items: {len(merged)}")
|
||||
print(merged)
|
||||
EOF
|
||||
|
||||
- name: Update clone
|
||||
working-directory: /transformers
|
||||
env:
|
||||
commit_sha: ${{ inputs.commit_sha || github.sha }}
|
||||
run: |
|
||||
git fetch origin "$commit_sha" && git checkout "$commit_sha"
|
||||
|
||||
- name: Process report
|
||||
shell: bash
|
||||
working-directory: /transformers
|
||||
env:
|
||||
ACCESS_REPO_INFO_TOKEN: ${{ secrets.ACCESS_REPO_INFO_TOKEN }}
|
||||
TRANSFORMERS_CI_RESULTS_UPLOAD_TOKEN: ${{ secrets.TRANSFORMERS_CI_RESULTS_UPLOAD_TOKEN }}
|
||||
JOB_NAME: ${{ inputs.job }}
|
||||
REPORT_REPO_ID: ${{ inputs.report_repo_id }}
|
||||
run: |
|
||||
{
|
||||
echo 'REPORT_TEXT<<EOF'
|
||||
python3 utils/process_bad_commit_report.py
|
||||
echo EOF
|
||||
} >> "$GITHUB_ENV"
|
||||
|
||||
- name: Show results
|
||||
working-directory: /transformers
|
||||
run: |
|
||||
ls -l new_failures_with_bad_commit.json
|
||||
cat new_failures_with_bad_commit.json
|
||||
|
||||
- name: Upload artifacts
|
||||
uses: actions/upload-artifact@ea165f8d65b6e75b540449e92b4886f43607fa02 # v4.6.2
|
||||
with:
|
||||
name: new_failures_with_bad_commit_${{ inputs.job }}
|
||||
path: |
|
||||
/transformers/new_failures_with_bad_commit.json
|
||||
/transformers/new_failures_with_bad_commit_url.txt
|
||||
|
||||
- name: Prepare Slack report title
|
||||
working-directory: /transformers
|
||||
env:
|
||||
ci_event: ${{ inputs.ci_event }}
|
||||
job: ${{ inputs.job }}
|
||||
run: |
|
||||
pip install slack_sdk
|
||||
echo "title=$(python3 -c 'import sys; import os; sys.path.append("utils"); from utils.notification_service import job_to_test_map; ci_event = os.environ["ci_event"]; job = os.environ["job"]; test_name = job_to_test_map[job]; title = f"New failed tests of {ci_event}" + ":" + f" {test_name}"; print(title)')" >> $GITHUB_ENV
|
||||
|
||||
- name: Send processed report
|
||||
if: ${{ !endsWith(env.REPORT_TEXT, '{}') }}
|
||||
uses: slackapi/slack-github-action@6c661ce58804a1a20f6dc5fbee7f0381b469e001
|
||||
with:
|
||||
# Slack channel id, channel name, or user id to post message.
|
||||
# See also: https://api.slack.com/methods/chat.postMessage#channels
|
||||
channel-id: '#${{ inputs.slack_report_channel }}'
|
||||
# For posting a rich message using Block Kit
|
||||
payload: |
|
||||
{
|
||||
"blocks": [
|
||||
{
|
||||
"type": "header",
|
||||
"text": {
|
||||
"type": "plain_text",
|
||||
"text": "${{ env.title }}"
|
||||
}
|
||||
},
|
||||
{
|
||||
"type": "section",
|
||||
"text": {
|
||||
"type": "mrkdwn",
|
||||
"text": "${{ env.REPORT_TEXT }}"
|
||||
}
|
||||
}
|
||||
]
|
||||
}
|
||||
env:
|
||||
SLACK_BOT_TOKEN: ${{ secrets.SLACK_CIFEEDBACK_BOT_TOKEN }}
|
||||
63
.github/workflows/check_tiny_models.yml
vendored
Normal file
63
.github/workflows/check_tiny_models.yml
vendored
Normal file
@@ -0,0 +1,63 @@
|
||||
name: Check Tiny Models
|
||||
|
||||
on:
|
||||
push:
|
||||
branches:
|
||||
- check_tiny_models*
|
||||
repository_dispatch:
|
||||
schedule:
|
||||
- cron: "0 2 * * *"
|
||||
|
||||
env:
|
||||
TOKEN: ${{ secrets.TRANSFORMERS_HUB_BOT_HF_TOKEN }}
|
||||
HF_TOKEN: ${{ secrets.HF_HUB_READ_TOKEN }}
|
||||
|
||||
permissions:
|
||||
contents: read
|
||||
|
||||
jobs:
|
||||
check_tiny_models:
|
||||
name: Check tiny models
|
||||
runs-on: ubuntu-22.04
|
||||
steps:
|
||||
- name: Checkout transformers
|
||||
uses: actions/checkout@34e114876b0b11c390a56381ad16ebd13914f8d5 # v4.3.1
|
||||
with:
|
||||
fetch-depth: 2
|
||||
persist-credentials: false
|
||||
|
||||
- uses: actions/checkout@34e114876b0b11c390a56381ad16ebd13914f8d5 # v4.3.1
|
||||
with:
|
||||
persist-credentials: false
|
||||
- name: Set up Python 3.10
|
||||
uses: actions/setup-python@a26af69be951a213d495a4c3e4e4022e16d87065 # v5.6.0
|
||||
with:
|
||||
# Semantic version range syntax or exact version of a Python version
|
||||
python-version: '3.10'
|
||||
# Optional - x64 or x86 architecture, defaults to x64
|
||||
architecture: 'x64'
|
||||
|
||||
- name: Install
|
||||
run: |
|
||||
sudo apt-get -y update && sudo apt-get install -y libsndfile1-dev espeak-ng cmake
|
||||
pip install --upgrade pip
|
||||
python -m pip install -U .[dev]
|
||||
# For `FastSpeech2ConformerConfig` and `FastSpeech2ConformerWithHifiGanConfig`
|
||||
python -m pip install g2p-en
|
||||
# For `Pop2PianoConfig`
|
||||
python -m pip install essentia==2.1b6.dev1034
|
||||
# For `mamba` type models to avoid the error `self._specs = frozenset(specifiers) #TypeError: 'int' object is not iterable`
|
||||
python -m pip install -U "kernels>=0.12.2" "einops"
|
||||
# For `Pop2PianoProcessor` (otherwise multiprocess error with `Placeholder` unpickable issue)
|
||||
python -m pip install -U "pretty_midi"
|
||||
|
||||
- name: Create all tiny models (locally)
|
||||
run: |
|
||||
python utils/create_dummy_models.py tiny_local_models --all --num_workers 4
|
||||
|
||||
- name: Local tiny model reports artifacts
|
||||
if: ${{ always() }}
|
||||
uses: actions/upload-artifact@ea165f8d65b6e75b540449e92b4886f43607fa02 # v4.6.2
|
||||
with:
|
||||
name: tiny_local_model_creation_reports
|
||||
path: tiny_local_models/reports
|
||||
263
.github/workflows/circleci-failure-summary-comment.yml
vendored
Normal file
263
.github/workflows/circleci-failure-summary-comment.yml
vendored
Normal file
@@ -0,0 +1,263 @@
|
||||
name: CircleCI Failure Summary Comment
|
||||
|
||||
on:
|
||||
pull_request_target:
|
||||
types: [opened, synchronize, reopened]
|
||||
|
||||
permissions:
|
||||
contents: read
|
||||
|
||||
jobs:
|
||||
comment:
|
||||
runs-on: ubuntu-22.04
|
||||
permissions:
|
||||
pull-requests: write
|
||||
steps:
|
||||
- name: Checkout repository
|
||||
uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
|
||||
with:
|
||||
persist-credentials: false
|
||||
|
||||
- name: Setup Python
|
||||
uses: actions/setup-python@a26af69be951a213d495a4c3e4e4022e16d87065 # v5.6.0
|
||||
with:
|
||||
python-version: "3.13"
|
||||
|
||||
- name: Install dependencies
|
||||
run: python -m pip install huggingface_hub
|
||||
|
||||
- name: Wait for CircleCI check suite completion
|
||||
env:
|
||||
GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
|
||||
COMMIT_SHA: ${{ github.event.pull_request.head.sha }}
|
||||
GITHUB_REPOSITORY: ${{ github.repository }}
|
||||
run: |
|
||||
# Exit on error, undefined variables, or pipe failures
|
||||
set -euo pipefail
|
||||
|
||||
echo "Waiting for CircleCI check suite to complete..."
|
||||
# Timeout after 30 minutes (1800 seconds)
|
||||
end=$((SECONDS + 1800))
|
||||
|
||||
while [ $SECONDS -lt $end ]; do
|
||||
# Query GitHub API for check suites associated with this commit
|
||||
# || echo "" allows retry on transient API failures instead of exiting
|
||||
suite_json=$(gh api "repos/${GITHUB_REPOSITORY}/commits/${COMMIT_SHA}/check-suites" \
|
||||
--jq '.check_suites[] | select(.app.slug == "circleci-checks")' || echo "")
|
||||
|
||||
if [ -z "$suite_json" ]; then
|
||||
echo "CircleCI check suite not found yet, retrying..."
|
||||
else
|
||||
status=$(echo "$suite_json" | jq -r '.status')
|
||||
conclusion=$(echo "$suite_json" | jq -r '.conclusion // empty')
|
||||
echo "CircleCI status: $status, conclusion: $conclusion"
|
||||
|
||||
# Check suite is done when status is "completed" AND conclusion is set
|
||||
if [ "$status" = "completed" ] && [ -n "$conclusion" ]; then
|
||||
echo "Check suite completed successfully"
|
||||
exit 0
|
||||
fi
|
||||
fi
|
||||
|
||||
# Poll every 20 seconds
|
||||
sleep 20
|
||||
done
|
||||
|
||||
echo "ERROR: Timed out waiting for CircleCI check suite"
|
||||
exit 1
|
||||
|
||||
- name: Get CircleCI run's artifacts and upload them to Hub
|
||||
id: circleci
|
||||
env:
|
||||
COMMIT_SHA: ${{ github.event.pull_request.head.sha }}
|
||||
REPO: ${{ github.repository }}
|
||||
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
|
||||
run: |
|
||||
# Step 1: Get CircleCI check suite ID
|
||||
echo "Getting check suites for commit ${COMMIT_SHA}..."
|
||||
check_suites=$(curl -s -H "Authorization: token ${GITHUB_TOKEN}" \
|
||||
"https://api.github.com/repos/${REPO}/commits/${COMMIT_SHA}/check-suites")
|
||||
|
||||
circleci_suite_id=$(echo "$check_suites" | jq -r '.check_suites[] | select(.app.slug == "circleci-checks") | .id' | head -n 1)
|
||||
echo "CircleCI check suite ID: ${circleci_suite_id}"
|
||||
|
||||
# Step 2: Get check runs from the CircleCI suite
|
||||
echo "Getting check runs for suite ${circleci_suite_id}..."
|
||||
check_runs=$(curl -s -H "Authorization: token ${GITHUB_TOKEN}" \
|
||||
"https://api.github.com/repos/${REPO}/check-suites/${circleci_suite_id}/check-runs")
|
||||
|
||||
# Step 3: Extract workflow ID from the "run_tests" check run
|
||||
workflow_id=$(echo "$check_runs" | jq -r '.check_runs[] | select(.name == "run_tests") | .details_url' | grep -oP 'workflows/\K[a-f0-9-]+')
|
||||
echo "CircleCI Workflow ID: ${workflow_id}"
|
||||
|
||||
# Step 4: Get all jobs in the workflow
|
||||
echo "Getting jobs for workflow ${workflow_id}..."
|
||||
jobs=$(curl -s \
|
||||
"https://circleci.com/api/v2/workflow/${workflow_id}/job")
|
||||
|
||||
# Step 5: Extract collection_job details
|
||||
# "first // empty": if collection_job is absent or has job_number=null, jq outputs nothing
|
||||
# (empty string). Without this, jq -r would output the literal string "null", making
|
||||
# [ -z "$collection_job_number" ] false and bypassing the early-exit check below.
|
||||
collection_job_number=$(echo "$jobs" | jq -r '[.items[] | select(.name == "collection_job") | .job_number] | first // empty')
|
||||
collection_job_id=$(echo "$jobs" | jq -r '[.items[] | select(.name == "collection_job") | .id] | first // empty')
|
||||
echo "CircleCI Collection job number: ${collection_job_number}"
|
||||
echo "CircleCI Collection job ID: ${collection_job_id}"
|
||||
|
||||
# When only the "empty" job ran (no tests selected), there is no collection_job.
|
||||
# Exit gracefully to avoid curl hitting a broken URL.
|
||||
if [ -z "$collection_job_number" ]; then
|
||||
echo "No collection_job found (only empty job ran - no tests were selected). Skipping."
|
||||
echo "artifact_found=false" >> $GITHUB_OUTPUT
|
||||
exit 0
|
||||
fi
|
||||
|
||||
# Step 6: Get artifacts list
|
||||
echo "Getting artifacts for job ${collection_job_number}..."
|
||||
artifacts=$(curl -s \
|
||||
"https://circleci.com/api/v2/project/gh/${REPO}/${collection_job_number}/artifacts")
|
||||
|
||||
# Print for debugging; "|| true" prevents failure if the response is not valid JSON.
|
||||
echo "$artifacts" | jq '.' || true
|
||||
|
||||
# Step 7: Download failure_summary.json specifically
|
||||
# .items // [] : use empty array if .items is null (avoids "Cannot iterate over null")
|
||||
# .[] : iterate over each artifact object in the array
|
||||
# select(...) : keep only the artifact whose .path matches
|
||||
# | .url : extract the download URL from the matched artifact
|
||||
# first // empty: take the first match; outputs nothing (not "null") if no match found
|
||||
failure_summary_url=$(echo "$artifacts" | jq -r '[.items // [] | .[] | select(.path == "outputs/failure_summary.json") | .url] | first // empty')
|
||||
|
||||
if [ -z "$failure_summary_url" ]; then
|
||||
echo "failure_summary.json not found in artifacts - PR may not have latest main merged. Skipping."
|
||||
echo "artifact_found=false" >> $GITHUB_OUTPUT
|
||||
exit 0
|
||||
fi
|
||||
|
||||
echo "Downloading failure_summary.json from: ${failure_summary_url}"
|
||||
mkdir -p outputs
|
||||
curl -s -L "${failure_summary_url}" -o outputs/failure_summary.json
|
||||
ls -la outputs
|
||||
|
||||
echo "Downloaded failure_summary.json successfully"
|
||||
|
||||
# Verify the file was downloaded
|
||||
if [ ! -f outputs/failure_summary.json ]; then
|
||||
echo "Failed to download failure_summary.json - skipping."
|
||||
echo "artifact_found=false" >> $GITHUB_OUTPUT
|
||||
exit 0
|
||||
fi
|
||||
|
||||
echo "File size: $(wc -c < outputs/failure_summary.json) bytes"
|
||||
|
||||
# Export variables for next steps
|
||||
echo "artifact_found=true" >> $GITHUB_OUTPUT
|
||||
echo "workflow_id=${workflow_id}" >> $GITHUB_OUTPUT
|
||||
echo "collection_job_number=${collection_job_number}" >> $GITHUB_OUTPUT
|
||||
|
||||
- name: Upload summaries to Hub
|
||||
if: steps.circleci.outputs.artifact_found == 'true'
|
||||
env:
|
||||
HF_TOKEN: ${{ secrets.HF_CI_WRITE_TOKEN }}
|
||||
CIRCLECI_RESULTS_DATASET_ID: "transformers-community/circleci-test-results"
|
||||
PR_NUMBER: ${{ github.event.pull_request.number }}
|
||||
COMMIT_SHA: ${{ github.event.pull_request.head.sha }}
|
||||
run: |
|
||||
python << 'EOF'
|
||||
import os
|
||||
from pathlib import Path
|
||||
from huggingface_hub import HfApi
|
||||
|
||||
# Setup paths
|
||||
pr_number = os.environ["PR_NUMBER"]
|
||||
commit_short = os.environ["COMMIT_SHA"][:12]
|
||||
folder_path = f"pr-{pr_number}/sha-{commit_short}"
|
||||
|
||||
# Create folder and move file
|
||||
Path(folder_path).mkdir(parents=True, exist_ok=True)
|
||||
Path("outputs/failure_summary.json").rename(f"{folder_path}/failure_summary.json")
|
||||
|
||||
# Upload to Hub
|
||||
dataset_id = os.environ["CIRCLECI_RESULTS_DATASET_ID"]
|
||||
api = HfApi(token=os.environ["HF_TOKEN"])
|
||||
api.upload_folder(
|
||||
commit_message=f"Update CircleCI artifacts for PR {pr_number} ({commit_short})",
|
||||
folder_path=folder_path,
|
||||
path_in_repo=folder_path,
|
||||
repo_id=dataset_id,
|
||||
repo_type="dataset",
|
||||
)
|
||||
|
||||
print(f"Uploaded {folder_path} to {dataset_id}")
|
||||
EOF
|
||||
|
||||
- name: Delete existing CircleCI summary comments
|
||||
if: steps.circleci.outputs.artifact_found == 'true'
|
||||
env:
|
||||
PR_NUMBER: ${{ github.event.pull_request.number }}
|
||||
uses: actions/github-script@f28e40c7f34bde8b3046d885e986cb6290c5673b # v7.1.0
|
||||
with:
|
||||
script: |
|
||||
const PR_NUMBER = parseInt(process.env.PR_NUMBER, 10);
|
||||
|
||||
// Get all comments on the PR
|
||||
const { data: comments } = await github.rest.issues.listComments({
|
||||
owner: context.repo.owner,
|
||||
repo: context.repo.repo,
|
||||
issue_number: PR_NUMBER
|
||||
});
|
||||
|
||||
// Find existing bot comments that start with "View the CircleCI Test Summary for this PR:"
|
||||
const existingComments = comments.filter(comment =>
|
||||
comment.user.login === 'github-actions[bot]' &&
|
||||
comment.body.startsWith('View the CircleCI Test Summary for this PR:')
|
||||
);
|
||||
|
||||
// Delete all matching comments
|
||||
for (const comment of existingComments) {
|
||||
console.log(`Deleting comment #${comment.id}`);
|
||||
await github.rest.issues.deleteComment({
|
||||
owner: context.repo.owner,
|
||||
repo: context.repo.repo,
|
||||
comment_id: comment.id
|
||||
});
|
||||
}
|
||||
|
||||
console.log(`Deleted ${existingComments.length} old CircleCI summary comment(s)`);
|
||||
|
||||
- name: Post comment with helper link
|
||||
if: steps.circleci.outputs.artifact_found == 'true'
|
||||
env:
|
||||
GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
|
||||
GITHUB_REPOSITORY: ${{ github.repository }}
|
||||
PR_NUMBER: ${{ github.event.pull_request.number }}
|
||||
PR_SHA: ${{ github.event.pull_request.head.sha }}
|
||||
run: |
|
||||
COMMIT_SHORT="${PR_SHA:0:12}"
|
||||
SUMMARY_FILE="pr-${PR_NUMBER}/sha-${COMMIT_SHORT}/failure_summary.json"
|
||||
|
||||
if [ ! -f "$SUMMARY_FILE" ]; then
|
||||
echo "failure_summary.json missing, skipping comment."
|
||||
exit 0
|
||||
fi
|
||||
|
||||
failures=$(jq '.failures | length' "$SUMMARY_FILE")
|
||||
if [ "$failures" -eq 0 ]; then
|
||||
echo "No failures detected, skipping PR comment."
|
||||
exit 0
|
||||
fi
|
||||
|
||||
# Build Space URL with encoded parameters
|
||||
repo_enc=$(jq -rn --arg v "$GITHUB_REPOSITORY" '$v|@uri')
|
||||
pr_enc=$(jq -rn --arg v "$PR_NUMBER" '$v|@uri')
|
||||
sha_short="${PR_SHA:0:6}"
|
||||
sha_enc=$(jq -rn --arg v "$sha_short" '$v|@uri')
|
||||
SPACE_URL="https://huggingface.co/spaces/transformers-community/circle-ci-viz?pr=${pr_enc}&sha=${sha_enc}"
|
||||
|
||||
# Post comment (using printf for proper newlines)
|
||||
gh api \
|
||||
--method POST \
|
||||
-H "Accept: application/vnd.github+json" \
|
||||
-H "X-GitHub-Api-Version: 2022-11-28" \
|
||||
"repos/${GITHUB_REPOSITORY}/issues/${PR_NUMBER}/comments" \
|
||||
-f body="$(printf "View the CircleCI Test Summary for this PR:\n\n%s" "$SPACE_URL")"
|
||||
26
.github/workflows/codeql.yml
vendored
Normal file
26
.github/workflows/codeql.yml
vendored
Normal file
@@ -0,0 +1,26 @@
|
||||
---
|
||||
name: CodeQL Security Analysis
|
||||
|
||||
on:
|
||||
push:
|
||||
branches: ["main", "fix_security_issue_*"]
|
||||
# pull_request:
|
||||
# branches: ["main"]
|
||||
workflow_dispatch:
|
||||
|
||||
permissions:
|
||||
contents: read
|
||||
|
||||
jobs:
|
||||
codeql:
|
||||
name: CodeQL Analysis
|
||||
uses: huggingface/security-workflows/.github/workflows/codeql-reusable.yml@1b6a139c28db347498b30338da6a602e0a06f56c # main
|
||||
permissions:
|
||||
security-events: write
|
||||
packages: read
|
||||
actions: read
|
||||
contents: read
|
||||
with:
|
||||
languages: '["actions"]'
|
||||
queries: 'security-extended,security-and-quality'
|
||||
runner: 'ubuntu-latest'
|
||||
52
.github/workflows/collated-reports.yml
vendored
Normal file
52
.github/workflows/collated-reports.yml
vendored
Normal file
@@ -0,0 +1,52 @@
|
||||
name: CI collated reports
|
||||
|
||||
on:
|
||||
workflow_call:
|
||||
inputs:
|
||||
job:
|
||||
required: true
|
||||
type: string
|
||||
report_repo_id:
|
||||
required: true
|
||||
type: string
|
||||
machine_type:
|
||||
required: true
|
||||
type: string
|
||||
gpu_name:
|
||||
description: Name of the GPU used for the job. Its enough that the value contains the name of the GPU, e.g. "noise-h100-more-noise". Case insensitive.
|
||||
required: true
|
||||
type: string
|
||||
|
||||
permissions:
|
||||
contents: read
|
||||
|
||||
jobs:
|
||||
collated_reports:
|
||||
name: Collated reports
|
||||
runs-on: ubuntu-22.04
|
||||
if: always()
|
||||
steps:
|
||||
- uses: actions/checkout@34e114876b0b11c390a56381ad16ebd13914f8d5 # v4.3.1
|
||||
with:
|
||||
persist-credentials: false
|
||||
- uses: actions/download-artifact@d3f86a106a0bac45b974a628896c90dbdf5c8093 # v4.3.0
|
||||
|
||||
- name: Collated reports
|
||||
shell: bash
|
||||
env:
|
||||
ACCESS_REPO_INFO_TOKEN: ${{ secrets.ACCESS_REPO_INFO_TOKEN }}
|
||||
CI_SHA: ${{ github.sha }}
|
||||
TRANSFORMERS_CI_RESULTS_UPLOAD_TOKEN: ${{ secrets.TRANSFORMERS_CI_RESULTS_UPLOAD_TOKEN }}
|
||||
MACHINE_TYPE: ${{ inputs.machine_type }}
|
||||
JOB: ${{ inputs.job }}
|
||||
REPORT_REPO_ID: ${{ inputs.report_repo_id }}
|
||||
GPU_NAME: ${{ inputs.gpu_name }}
|
||||
run: |
|
||||
pip install huggingface_hub
|
||||
python3 utils/collated_reports.py \
|
||||
--path . \
|
||||
--machine-type "$MACHINE_TYPE" \
|
||||
--commit-hash "$CI_SHA" \
|
||||
--job "$JOB" \
|
||||
--report-repo-id "$REPORT_REPO_ID" \
|
||||
--gpu-name "$GPU_NAME"
|
||||
87
.github/workflows/doctest_job.yml
vendored
Normal file
87
.github/workflows/doctest_job.yml
vendored
Normal file
@@ -0,0 +1,87 @@
|
||||
name: Doctest job
|
||||
|
||||
on:
|
||||
workflow_call:
|
||||
inputs:
|
||||
job_splits:
|
||||
required: true
|
||||
type: string
|
||||
split_keys:
|
||||
required: true
|
||||
type: string
|
||||
|
||||
env:
|
||||
HF_HOME: /mnt/cache
|
||||
TRANSFORMERS_IS_CI: yes
|
||||
RUN_SLOW: yes
|
||||
OMP_NUM_THREADS: 16
|
||||
MKL_NUM_THREADS: 16
|
||||
TF_FORCE_GPU_ALLOW_GROWTH: true
|
||||
|
||||
permissions:
|
||||
contents: read
|
||||
|
||||
jobs:
|
||||
run_doctests:
|
||||
name: " "
|
||||
strategy:
|
||||
max-parallel: 8 # 8 jobs at a time
|
||||
fail-fast: false
|
||||
matrix:
|
||||
split_keys: ${{ fromJson(inputs.split_keys) }}
|
||||
runs-on:
|
||||
group: aws-g5-4xlarge-cache
|
||||
container:
|
||||
image: huggingface/transformers-all-latest-gpu
|
||||
options: --gpus all --shm-size "16gb" --ipc host -v /mnt/cache/.cache/huggingface:/mnt/cache/
|
||||
steps:
|
||||
- name: Update clone
|
||||
working-directory: /transformers
|
||||
run: git fetch && git checkout ${{ github.sha }}
|
||||
|
||||
- name: Reinstall transformers in edit mode (remove the one installed during docker image build)
|
||||
working-directory: /transformers
|
||||
run: python3 -m pip uninstall -y transformers && python3 -m pip install -e .[flax]
|
||||
|
||||
- name: GPU visibility
|
||||
working-directory: /transformers
|
||||
run: |
|
||||
python3 utils/print_env.py
|
||||
|
||||
- name: Show installed libraries and their versions
|
||||
run: pip freeze
|
||||
|
||||
- name: Get doctest files
|
||||
working-directory: /transformers
|
||||
run: |
|
||||
echo "${{ toJson(fromJson(inputs.job_splits)[matrix.split_keys]) }}" > doc_tests.txt
|
||||
cat doc_tests.txt
|
||||
|
||||
- name: Set `split_keys`
|
||||
shell: bash
|
||||
run: |
|
||||
echo "${MATRIX_SPLIT_KEYS}"
|
||||
split_keys=${MATRIX_SPLIT_KEYS}
|
||||
split_keys=${split_keys//'/'/'_'}
|
||||
echo "split_keys"
|
||||
echo "split_keys=$split_keys" >> $GITHUB_ENV
|
||||
env:
|
||||
MATRIX_SPLIT_KEYS: ${{ matrix.split_keys }}
|
||||
|
||||
- name: Run doctests
|
||||
working-directory: /transformers
|
||||
run: |
|
||||
cat doc_tests.txt
|
||||
python3 -m pytest -v --make-reports "doc_tests_gpu_${split_keys}" --doctest-modules $(cat doc_tests.txt) -sv --doctest-continue-on-failure --doctest-glob="*.md"
|
||||
|
||||
- name: Failure short reports
|
||||
if: ${{ failure() }}
|
||||
continue-on-error: true
|
||||
run: cat "/transformers/reports/doc_tests_gpu_${split_keys}/failures_short.txt"
|
||||
|
||||
- name: "Test suite reports artifacts: doc_tests_gpu_test_reports_${{ env.split_keys }}"
|
||||
if: ${{ always() }}
|
||||
uses: actions/upload-artifact@ea165f8d65b6e75b540449e92b4886f43607fa02 # v4.6.2
|
||||
with:
|
||||
name: doc_tests_gpu_test_reports_${{ env.split_keys }}
|
||||
path: /transformers/reports/doc_tests_gpu_${{ env.split_keys }}
|
||||
94
.github/workflows/doctests.yml
vendored
Normal file
94
.github/workflows/doctests.yml
vendored
Normal file
@@ -0,0 +1,94 @@
|
||||
name: Doctests
|
||||
|
||||
on:
|
||||
push:
|
||||
branches:
|
||||
- run_doctest*
|
||||
repository_dispatch:
|
||||
schedule:
|
||||
- cron: "17 2 * * *"
|
||||
|
||||
env:
|
||||
NUM_SLICES: 3
|
||||
|
||||
permissions:
|
||||
contents: read
|
||||
|
||||
jobs:
|
||||
setup:
|
||||
name: Setup
|
||||
runs-on:
|
||||
group: aws-g5-4xlarge-cache
|
||||
container:
|
||||
image: huggingface/transformers-all-latest-gpu
|
||||
options: --gpus all --shm-size "16gb" --ipc host -v /mnt/cache/.cache/huggingface:/mnt/cache/
|
||||
outputs:
|
||||
job_splits: ${{ steps.set-matrix.outputs.job_splits }}
|
||||
split_keys: ${{ steps.set-matrix.outputs.split_keys }}
|
||||
steps:
|
||||
- name: Update clone
|
||||
working-directory: /transformers
|
||||
run: |
|
||||
git fetch && git checkout ${{ github.sha }}
|
||||
|
||||
- name: Reinstall transformers in edit mode (remove the one installed during docker image build)
|
||||
working-directory: /transformers
|
||||
run: python3 -m pip uninstall -y transformers && python3 -m pip install -e .
|
||||
|
||||
- name: Show installed libraries and their versions
|
||||
working-directory: /transformers
|
||||
run: pip freeze
|
||||
|
||||
- name: Check values for matrix
|
||||
working-directory: /transformers
|
||||
run: |
|
||||
python3 utils/split_doctest_jobs.py
|
||||
python3 utils/split_doctest_jobs.py --only_return_keys --num_splits ${{ env.NUM_SLICES }}
|
||||
|
||||
- id: set-matrix
|
||||
working-directory: /transformers
|
||||
name: Set values for matrix
|
||||
run: |
|
||||
echo "job_splits=$(python3 utils/split_doctest_jobs.py)" >> $GITHUB_OUTPUT
|
||||
echo "split_keys=$(python3 utils/split_doctest_jobs.py --only_return_keys --num_splits ${{ env.NUM_SLICES }})" >> $GITHUB_OUTPUT
|
||||
|
||||
call_doctest_job:
|
||||
name: "Call doctest jobs"
|
||||
needs: setup
|
||||
strategy:
|
||||
max-parallel: 1 # 1 split at a time (in `doctest_job.yml`, we set `8` to run 8 jobs at the same time)
|
||||
fail-fast: false
|
||||
matrix:
|
||||
split_keys: ${{ fromJson(needs.setup.outputs.split_keys) }}
|
||||
uses: ./.github/workflows/doctest_job.yml
|
||||
with:
|
||||
job_splits: ${{ needs.setup.outputs.job_splits }}
|
||||
split_keys: ${{ toJson(matrix.split_keys) }}
|
||||
secrets: inherit
|
||||
|
||||
send_results:
|
||||
name: Send results to webhook
|
||||
runs-on: ubuntu-22.04
|
||||
if: always()
|
||||
needs: [call_doctest_job]
|
||||
steps:
|
||||
- uses: actions/checkout@34e114876b0b11c390a56381ad16ebd13914f8d5 # v4.3.1
|
||||
with:
|
||||
persist-credentials: false
|
||||
- uses: actions/download-artifact@d3f86a106a0bac45b974a628896c90dbdf5c8093 # v4.3.0
|
||||
- name: Send message to Slack
|
||||
env:
|
||||
CI_SLACK_BOT_TOKEN: ${{ secrets.CI_SLACK_BOT_TOKEN }}
|
||||
ACCESS_REPO_INFO_TOKEN: ${{ secrets.ACCESS_REPO_INFO_TOKEN }}
|
||||
# Use `CI_SLACK_CHANNEL_DUMMY_TESTS` when doing experimentation
|
||||
SLACK_REPORT_CHANNEL: ${{ secrets.CI_SLACK_CHANNEL_ID_DAILY_DOCS }}
|
||||
run: |
|
||||
pip install slack_sdk
|
||||
python utils/notification_service_doc_tests.py
|
||||
|
||||
- name: "Upload results"
|
||||
if: ${{ always() }}
|
||||
uses: actions/upload-artifact@ea165f8d65b6e75b540449e92b4886f43607fa02 # v4.6.2
|
||||
with:
|
||||
name: doc_test_results
|
||||
path: doc_test_results
|
||||
211
.github/workflows/extras-smoke-test.yml
vendored
Normal file
211
.github/workflows/extras-smoke-test.yml
vendored
Normal file
@@ -0,0 +1,211 @@
|
||||
name: Extras Smoke Test
|
||||
|
||||
on:
|
||||
schedule:
|
||||
# Run every night at 3 AM UTC
|
||||
- cron: "0 3 * * *"
|
||||
env:
|
||||
SLACK_CHANNEL_ID: '#transformers-gh-ci-central'
|
||||
|
||||
permissions:
|
||||
contents: read
|
||||
|
||||
jobs:
|
||||
get-python-versions:
|
||||
name: Get supported Python versions
|
||||
runs-on: ubuntu-latest
|
||||
outputs:
|
||||
versions: ${{ steps.extract-versions.outputs.versions }}
|
||||
steps:
|
||||
- name: Checkout code
|
||||
uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
|
||||
with:
|
||||
persist-credentials: false
|
||||
|
||||
- name: Install setuptools
|
||||
run: |
|
||||
python -m pip install --upgrade pip
|
||||
pip install setuptools
|
||||
|
||||
- name: Extract Python versions from setup.py
|
||||
id: extract-versions
|
||||
run: |
|
||||
VERSIONS=$(python utils/extract_metadata.py python-versions)
|
||||
echo "Supported Python versions: $VERSIONS"
|
||||
echo "versions=$VERSIONS" >> $GITHUB_OUTPUT
|
||||
|
||||
test-extras:
|
||||
name: Test extras on Python ${{ matrix.python-version }}
|
||||
needs: get-python-versions
|
||||
runs-on: ubuntu-latest
|
||||
strategy:
|
||||
fail-fast: false
|
||||
matrix:
|
||||
python-version: ${{ fromJson(needs.get-python-versions.outputs.versions) }}
|
||||
|
||||
steps:
|
||||
- name: Checkout code
|
||||
uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
|
||||
with:
|
||||
persist-credentials: false
|
||||
|
||||
- name: Set up Python ${{ matrix.python-version }}
|
||||
uses: actions/setup-python@a26af69be951a213d495a4c3e4e4022e16d87065 # v5.6.0
|
||||
with:
|
||||
python-version: ${{ matrix.python-version }}
|
||||
allow-prereleases: true
|
||||
|
||||
- name: Install base dependencies
|
||||
run: |
|
||||
python -m pip install --upgrade pip
|
||||
pip install setuptools
|
||||
|
||||
- name: Extract extras for this Python version
|
||||
id: get-extras
|
||||
run: |
|
||||
python utils/extract_metadata.py extras > extras_list.txt
|
||||
echo "Found $(wc -l < extras_list.txt) extras for Python ${MATRIX_PYTHON_VERSION}"
|
||||
cat extras_list.txt
|
||||
env:
|
||||
MATRIX_PYTHON_VERSION: ${{ matrix.python-version }}
|
||||
|
||||
- name: Install base package
|
||||
run: |
|
||||
echo "Installing base package..."
|
||||
pip install -e .
|
||||
|
||||
- name: Test all extras
|
||||
id: test-extras
|
||||
run: |
|
||||
mkdir -p failure_reports
|
||||
failed=0
|
||||
|
||||
while IFS= read -r extra; do
|
||||
echo "=== Testing extra: $extra on Python ${MATRIX_PYTHON_VERSION} ==="
|
||||
if ! pip install -e .[$extra]; then
|
||||
echo "❌ Failed to install extra: $extra"
|
||||
cat > failure_reports/failure-${MATRIX_PYTHON_VERSION}-${extra}.json << EOF
|
||||
{
|
||||
"python_version": "${MATRIX_PYTHON_VERSION}",
|
||||
"extra": "${extra}",
|
||||
"job_url": "${{ github.server_url }}/${{ github.repository }}/actions/runs/${{ github.run_id }}"
|
||||
}
|
||||
EOF
|
||||
failed=$((failed + 1))
|
||||
else
|
||||
echo "✓ Successfully installed extra: $extra"
|
||||
fi
|
||||
done < extras_list.txt
|
||||
|
||||
if [ $failed -gt 0 ]; then
|
||||
echo "❌ $failed extra(s) failed to install"
|
||||
exit 1
|
||||
fi
|
||||
env:
|
||||
MATRIX_PYTHON_VERSION: ${{ matrix.python-version }}
|
||||
|
||||
- name: Verify installation
|
||||
run: |
|
||||
python -c "import transformers; print(f'Transformers version: {transformers.__version__}')"
|
||||
python -c "from transformers import pipeline; print('Successfully imported pipeline')"
|
||||
|
||||
- name: Upload failure report
|
||||
if: always()
|
||||
uses: actions/upload-artifact@ea165f8d65b6e75b540449e92b4886f43607fa02 # v4.6.2
|
||||
with:
|
||||
name: failure-report-${{ matrix.python-version }}
|
||||
path: failure_reports/
|
||||
retention-days: 1
|
||||
if-no-files-found: ignore
|
||||
|
||||
precheck-slack:
|
||||
name: Check Slack token availability
|
||||
runs-on: ubuntu-latest
|
||||
outputs:
|
||||
has_slack_token: ${{ steps.chk.outputs.has_token }}
|
||||
steps:
|
||||
- id: chk
|
||||
env:
|
||||
SLACK_BOT_TOKEN: ${{ secrets.SLACK_CIFEEDBACK_BOT_TOKEN }}
|
||||
run: |
|
||||
if [ -n "$SLACK_BOT_TOKEN" ]; then
|
||||
echo "has_token=true" >> "$GITHUB_OUTPUT"
|
||||
else
|
||||
echo "has_token=false" >> "$GITHUB_OUTPUT"
|
||||
fi
|
||||
|
||||
notify-failures:
|
||||
name: Notify failures to Slack
|
||||
needs: [test-extras, precheck-slack]
|
||||
runs-on: ubuntu-latest
|
||||
if: always() && needs.precheck-slack.outputs.has_slack_token == 'true' && needs.test-extras.result != 'success'
|
||||
steps:
|
||||
- name: Checkout code
|
||||
uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
|
||||
with:
|
||||
persist-credentials: false
|
||||
|
||||
- name: Set up Python
|
||||
uses: actions/setup-python@a26af69be951a213d495a4c3e4e4022e16d87065 # v5.6.0
|
||||
with:
|
||||
python-version: '3.11'
|
||||
|
||||
- name: Download all failure reports
|
||||
uses: actions/download-artifact@d3f86a106a0bac45b974a628896c90dbdf5c8093 # v4.3.0
|
||||
with:
|
||||
pattern: failure-report-*
|
||||
path: failure_reports/
|
||||
merge-multiple: true
|
||||
continue-on-error: true
|
||||
|
||||
- name: Aggregate failures
|
||||
run: |
|
||||
python utils/aggregate_failure_reports.py \
|
||||
--input-dir failure_reports \
|
||||
--output all_failures.json
|
||||
|
||||
- name: Format Slack message
|
||||
env:
|
||||
FAILURES_FILE: all_failures.json
|
||||
WORKFLOW_URL: ${{ github.server_url }}/${{ github.repository }}/actions/runs/${{ github.run_id }}
|
||||
run: |
|
||||
python utils/format_extras_slack_message.py \
|
||||
--failures "$FAILURES_FILE" \
|
||||
--workflow-url "$WORKFLOW_URL"
|
||||
|
||||
- name: Send Slack notification
|
||||
if: env.SLACK_MESSAGE != ''
|
||||
uses: slackapi/slack-github-action@6c661ce58804a1a20f6dc5fbee7f0381b469e001
|
||||
with:
|
||||
channel-id: ${{ env.SLACK_CHANNEL_ID }}
|
||||
payload: |
|
||||
{
|
||||
"blocks": [
|
||||
{
|
||||
"type": "header",
|
||||
"text": {
|
||||
"type": "plain_text",
|
||||
"text": "${{ env.SLACK_TITLE }}"
|
||||
}
|
||||
},
|
||||
{
|
||||
"type": "section",
|
||||
"text": {
|
||||
"type": "mrkdwn",
|
||||
"text": "${{ env.SLACK_MESSAGE }}"
|
||||
}
|
||||
},
|
||||
{
|
||||
"type": "divider"
|
||||
},
|
||||
{
|
||||
"type": "section",
|
||||
"text": {
|
||||
"type": "mrkdwn",
|
||||
"text": "<${{ env.SLACK_WORKFLOW_URL }}|View workflow run>"
|
||||
}
|
||||
}
|
||||
]
|
||||
}
|
||||
env:
|
||||
SLACK_BOT_TOKEN: ${{ secrets.SLACK_CIFEEDBACK_BOT_TOKEN }}
|
||||
174
.github/workflows/get-pr-info.yml
vendored
Normal file
174
.github/workflows/get-pr-info.yml
vendored
Normal file
@@ -0,0 +1,174 @@
|
||||
name: Get PR commit SHA
|
||||
on:
|
||||
workflow_call:
|
||||
inputs:
|
||||
pr_number:
|
||||
required: true
|
||||
type: string
|
||||
outputs:
|
||||
PR_HEAD_REPO_FULL_NAME:
|
||||
description: "The full name of the repository from which the pull request is created"
|
||||
value: ${{ jobs.get-pr-info.outputs.PR_HEAD_REPO_FULL_NAME }}
|
||||
PR_BASE_REPO_FULL_NAME:
|
||||
description: "The full name of the repository to which the pull request is created"
|
||||
value: ${{ jobs.get-pr-info.outputs.PR_BASE_REPO_FULL_NAME }}
|
||||
PR_HEAD_REPO_OWNER:
|
||||
description: "The owner of the repository from which the pull request is created"
|
||||
value: ${{ jobs.get-pr-info.outputs.PR_HEAD_REPO_OWNER }}
|
||||
PR_BASE_REPO_OWNER:
|
||||
description: "The owner of the repository to which the pull request is created"
|
||||
value: ${{ jobs.get-pr-info.outputs.PR_BASE_REPO_OWNER }}
|
||||
PR_HEAD_REPO_NAME:
|
||||
description: "The name of the repository from which the pull request is created"
|
||||
value: ${{ jobs.get-pr-info.outputs.PR_HEAD_REPO_NAME }}
|
||||
PR_BASE_REPO_NAME:
|
||||
description: "The name of the repository to which the pull request is created"
|
||||
value: ${{ jobs.get-pr-info.outputs.PR_BASE_REPO_NAME }}
|
||||
PR_HEAD_REF:
|
||||
description: "The branch name of the pull request in the head repository"
|
||||
value: ${{ jobs.get-pr-info.outputs.PR_HEAD_REF }}
|
||||
PR_BASE_REF:
|
||||
description: "The branch name in the base repository (to merge into)"
|
||||
value: ${{ jobs.get-pr-info.outputs.PR_BASE_REF }}
|
||||
PR_HEAD_SHA:
|
||||
description: "The head sha of the pull request branch in the head repository"
|
||||
value: ${{ jobs.get-pr-info.outputs.PR_HEAD_SHA }}
|
||||
PR_BASE_SHA:
|
||||
description: "The head sha of the target branch in the base repository"
|
||||
value: ${{ jobs.get-pr-info.outputs.PR_BASE_SHA }}
|
||||
PR_MERGE_COMMIT_SHA:
|
||||
description: "The sha of the merge commit for the pull request (created by GitHub) in the base repository"
|
||||
value: ${{ jobs.get-pr-info.outputs.PR_MERGE_COMMIT_SHA }}
|
||||
PR_MERGE_COMMIT_BASE_SHA:
|
||||
description: "The sha of the parent commit of the merge commit on the target branch in the base repository"
|
||||
value: ${{ jobs.get-pr-info.outputs.PR_MERGE_COMMIT_BASE_SHA }}
|
||||
PR_HEAD_COMMIT_DATE:
|
||||
description: "The date of the head sha of the pull request branch in the head repository"
|
||||
value: ${{ jobs.get-pr-info.outputs.PR_HEAD_COMMIT_DATE }}
|
||||
PR_MERGE_COMMIT_DATE:
|
||||
description: "The date of the merge commit for the pull request (created by GitHub) in the base repository"
|
||||
value: ${{ jobs.get-pr-info.outputs.PR_MERGE_COMMIT_DATE }}
|
||||
PR_HEAD_COMMIT_TIMESTAMP:
|
||||
description: "The timestamp of the head sha of the pull request branch in the head repository"
|
||||
value: ${{ jobs.get-pr-info.outputs.PR_HEAD_COMMIT_TIMESTAMP }}
|
||||
PR_MERGE_COMMIT_TIMESTAMP:
|
||||
description: "The timestamp of the merge commit for the pull request (created by GitHub) in the base repository"
|
||||
value: ${{ jobs.get-pr-info.outputs.PR_MERGE_COMMIT_TIMESTAMP }}
|
||||
PR:
|
||||
description: "The PR"
|
||||
value: ${{ jobs.get-pr-info.outputs.PR }}
|
||||
PR_FILES:
|
||||
description: "The files touched in the PR"
|
||||
value: ${{ jobs.get-pr-info.outputs.PR_FILES }}
|
||||
|
||||
|
||||
permissions:
|
||||
contents: read
|
||||
|
||||
jobs:
|
||||
get-pr-info:
|
||||
runs-on: ubuntu-22.04
|
||||
name: Get PR commit SHA better
|
||||
outputs:
|
||||
PR_HEAD_REPO_FULL_NAME: ${{ steps.pr_info.outputs.head_repo_full_name }}
|
||||
PR_BASE_REPO_FULL_NAME: ${{ steps.pr_info.outputs.base_repo_full_name }}
|
||||
PR_HEAD_REPO_OWNER: ${{ steps.pr_info.outputs.head_repo_owner }}
|
||||
PR_BASE_REPO_OWNER: ${{ steps.pr_info.outputs.base_repo_owner }}
|
||||
PR_HEAD_REPO_NAME: ${{ steps.pr_info.outputs.head_repo_name }}
|
||||
PR_BASE_REPO_NAME: ${{ steps.pr_info.outputs.base_repo_name }}
|
||||
PR_HEAD_REF: ${{ steps.pr_info.outputs.head_ref }}
|
||||
PR_BASE_REF: ${{ steps.pr_info.outputs.base_ref }}
|
||||
PR_HEAD_SHA: ${{ steps.pr_info.outputs.head_sha }}
|
||||
PR_BASE_SHA: ${{ steps.pr_info.outputs.base_sha }}
|
||||
PR_MERGE_COMMIT_BASE_SHA: ${{ steps.pr_info.outputs.merge_commit_base_sha }}
|
||||
PR_MERGE_COMMIT_SHA: ${{ steps.pr_info.outputs.merge_commit_sha }}
|
||||
PR_HEAD_COMMIT_DATE: ${{ steps.pr_info.outputs.head_commit_date }}
|
||||
PR_MERGE_COMMIT_DATE: ${{ steps.pr_info.outputs.merge_commit_date }}
|
||||
PR_HEAD_COMMIT_TIMESTAMP: ${{ steps.get_timestamps.outputs.head_commit_timestamp }}
|
||||
PR_MERGE_COMMIT_TIMESTAMP: ${{ steps.get_timestamps.outputs.merge_commit_timestamp }}
|
||||
PR: ${{ steps.pr_info.outputs.pr }}
|
||||
PR_FILES: ${{ steps.pr_info.outputs.files }}
|
||||
if: ${{ inputs.pr_number != '' }}
|
||||
steps:
|
||||
- name: Extract PR details
|
||||
id: pr_info
|
||||
uses: actions/github-script@d7906e4ad0b1822421a7e6a35d5ca353c962f410 # v6.4.1
|
||||
env:
|
||||
PR_NUMBER: ${{ inputs.pr_number }}
|
||||
with:
|
||||
script: |
|
||||
const pull_number = parseInt(process.env.PR_NUMBER, 10);
|
||||
|
||||
const { data: pr } = await github.rest.pulls.get({
|
||||
owner: context.repo.owner,
|
||||
repo: context.repo.repo,
|
||||
pull_number,
|
||||
});
|
||||
|
||||
const { data: head_commit } = await github.rest.repos.getCommit({
|
||||
owner: pr.head.repo.owner.login,
|
||||
repo: pr.head.repo.name,
|
||||
ref: pr.head.ref
|
||||
});
|
||||
|
||||
const { data: merge_commit } = await github.rest.repos.getCommit({
|
||||
owner: pr.base.repo.owner.login,
|
||||
repo: pr.base.repo.name,
|
||||
ref: pr.merge_commit_sha,
|
||||
});
|
||||
|
||||
const { data: files } = await github.rest.pulls.listFiles({
|
||||
owner: context.repo.owner,
|
||||
repo: context.repo.repo,
|
||||
pull_number,
|
||||
});
|
||||
|
||||
core.setOutput('head_repo_full_name', pr.head.repo.full_name);
|
||||
core.setOutput('base_repo_full_name', pr.base.repo.full_name);
|
||||
core.setOutput('head_repo_owner', pr.head.repo.owner.login);
|
||||
core.setOutput('base_repo_owner', pr.base.repo.owner.login);
|
||||
core.setOutput('head_repo_name', pr.head.repo.name);
|
||||
core.setOutput('base_repo_name', pr.base.repo.name);
|
||||
core.setOutput('head_ref', pr.head.ref);
|
||||
core.setOutput('base_ref', pr.base.ref);
|
||||
core.setOutput('head_sha', pr.head.sha);
|
||||
core.setOutput('base_sha', pr.base.sha);
|
||||
core.setOutput('merge_commit_base_sha', merge_commit.parents[0].sha);
|
||||
core.setOutput('merge_commit_sha', pr.merge_commit_sha);
|
||||
core.setOutput('pr', pr);
|
||||
|
||||
core.setOutput('head_commit_date', head_commit.commit.committer.date);
|
||||
core.setOutput('merge_commit_date', merge_commit.commit.committer.date);
|
||||
|
||||
core.setOutput('files', files);
|
||||
|
||||
console.log('PR head commit:', {
|
||||
head_commit: head_commit,
|
||||
commit: head_commit.commit,
|
||||
date: head_commit.commit.committer.date
|
||||
});
|
||||
|
||||
console.log('PR merge commit:', {
|
||||
merge_commit: merge_commit,
|
||||
commit: merge_commit.commit,
|
||||
date: merge_commit.commit.committer.date
|
||||
});
|
||||
|
||||
console.log('PR Info:', {
|
||||
pr_info: pr
|
||||
});
|
||||
|
||||
- name: Convert dates to timestamps
|
||||
id: get_timestamps
|
||||
env:
|
||||
head_commit_date: ${{ steps.pr_info.outputs.head_commit_date }}
|
||||
merge_commit_date: ${{ steps.pr_info.outputs.merge_commit_date }}
|
||||
run: |
|
||||
echo "$head_commit_date"
|
||||
echo "$merge_commit_date"
|
||||
head_commit_timestamp=$(date -d "$head_commit_date" +%s)
|
||||
merge_commit_timestamp=$(date -d "$merge_commit_date" +%s)
|
||||
echo "$head_commit_timestamp"
|
||||
echo "$merge_commit_timestamp"
|
||||
echo "head_commit_timestamp=$head_commit_timestamp" >> $GITHUB_OUTPUT
|
||||
echo "merge_commit_timestamp=$merge_commit_timestamp" >> $GITHUB_OUTPUT
|
||||
45
.github/workflows/get-pr-number.yml
vendored
Normal file
45
.github/workflows/get-pr-number.yml
vendored
Normal file
@@ -0,0 +1,45 @@
|
||||
name: Get PR number
|
||||
on:
|
||||
workflow_call:
|
||||
outputs:
|
||||
PR_NUMBER:
|
||||
description: "The extracted PR number"
|
||||
value: ${{ jobs.get-pr-number.outputs.PR_NUMBER }}
|
||||
|
||||
permissions:
|
||||
contents: read
|
||||
|
||||
jobs:
|
||||
get-pr-number:
|
||||
runs-on: ubuntu-22.04
|
||||
name: Get PR number
|
||||
outputs:
|
||||
PR_NUMBER: ${{ steps.set_pr_number.outputs.PR_NUMBER }}
|
||||
steps:
|
||||
- name: Get PR number
|
||||
shell: bash
|
||||
env:
|
||||
issue_number: ${{ github.event.issue.number }}
|
||||
is_pull_request_issue: ${{ github.event.issue.pull_request != null }}
|
||||
pr_number: ${{ github.event.pull_request.number }}
|
||||
is_pull_request: ${{ github.event.pull_request != null }}
|
||||
event_number: ${{ github.event.number }}
|
||||
run: |
|
||||
if [[ "$issue_number" != "" && "$is_pull_request_issue" == "true" ]]; then
|
||||
echo "PR_NUMBER=$issue_number" >> $GITHUB_ENV
|
||||
elif [[ "$pr_number" != "" ]]; then
|
||||
echo "PR_NUMBER=$pr_number" >> $GITHUB_ENV
|
||||
elif [[ "$is_pull_request" == "true" ]]; then
|
||||
echo "PR_NUMBER=$event_number" >> $GITHUB_ENV
|
||||
else
|
||||
echo "PR_NUMBER=" >> $GITHUB_ENV
|
||||
fi
|
||||
|
||||
- name: Check PR number
|
||||
shell: bash
|
||||
run: |
|
||||
echo "$PR_NUMBER"
|
||||
|
||||
- name: Set PR number
|
||||
id: set_pr_number
|
||||
run: echo "PR_NUMBER=$PR_NUMBER" >> "$GITHUB_OUTPUT"
|
||||
219
.github/workflows/model_jobs.yml
vendored
Normal file
219
.github/workflows/model_jobs.yml
vendored
Normal file
@@ -0,0 +1,219 @@
|
||||
name: model jobs
|
||||
|
||||
on:
|
||||
workflow_call:
|
||||
inputs:
|
||||
folder_slices:
|
||||
required: true
|
||||
type: string
|
||||
machine_type:
|
||||
required: true
|
||||
type: string
|
||||
slice_id:
|
||||
required: true
|
||||
type: number
|
||||
docker:
|
||||
required: true
|
||||
type: string
|
||||
commit_sha:
|
||||
required: false
|
||||
type: string
|
||||
report_name_prefix:
|
||||
required: false
|
||||
default: run_models_gpu
|
||||
type: string
|
||||
runner_type:
|
||||
required: false
|
||||
type: string
|
||||
report_repo_id:
|
||||
required: false
|
||||
type: string
|
||||
pytest_marker:
|
||||
required: false
|
||||
type: string
|
||||
|
||||
env:
|
||||
HF_HOME: /mnt/cache
|
||||
TRANSFORMERS_IS_CI: yes
|
||||
OMP_NUM_THREADS: 8
|
||||
MKL_NUM_THREADS: 8
|
||||
RUN_SLOW: yes
|
||||
# For gated repositories, we still need to agree to share information on the Hub repo. page in order to get access.
|
||||
# This token is created under the bot `hf-transformers-bot`.
|
||||
HF_TOKEN: ${{ secrets.HF_HUB_READ_TOKEN }}
|
||||
TF_FORCE_GPU_ALLOW_GROWTH: true
|
||||
CUDA_VISIBLE_DEVICES: 0,1
|
||||
|
||||
permissions:
|
||||
contents: read
|
||||
|
||||
jobs:
|
||||
run_models_gpu:
|
||||
name: " "
|
||||
strategy:
|
||||
max-parallel: 8
|
||||
fail-fast: false
|
||||
matrix:
|
||||
folders: ${{ fromJson(inputs.folder_slices)[inputs.slice_id] }}
|
||||
runs-on:
|
||||
group: '${{ inputs.machine_type }}'
|
||||
container:
|
||||
image: ${{ inputs.docker }}
|
||||
options: --gpus all --shm-size "16gb" --ipc host -v /mnt/cache/.cache/huggingface:/mnt/cache/
|
||||
outputs:
|
||||
machine_type: ${{ steps.set_machine_type.outputs.machine_type }}
|
||||
steps:
|
||||
- name: Echo input and matrix info
|
||||
shell: bash
|
||||
env:
|
||||
folder_slices: ${{ inputs.folder_slices }}
|
||||
matrix_folders: ${{ matrix.folders }}
|
||||
slice_data: ${{ toJson(fromJson(inputs.folder_slices)[inputs.slice_id]) }}
|
||||
run: |
|
||||
echo "$folder_slices"
|
||||
echo "$matrix_folders"
|
||||
echo "$slice_data"
|
||||
|
||||
- name: Echo folder ${{ matrix.folders }}
|
||||
shell: bash
|
||||
# For folders like `models/bert`, set an env. var. (`matrix_folders`) to `models_bert`, which will be used to
|
||||
# set the artifact folder names (because the character `/` is not allowed).
|
||||
env:
|
||||
matrix_folders_raw: ${{ matrix.folders }}
|
||||
run: |
|
||||
echo "$matrix_folders_raw"
|
||||
matrix_folders="${matrix_folders_raw/'models/'/'models_'}"
|
||||
echo "$matrix_folders"
|
||||
echo "matrix_folders=$matrix_folders" >> $GITHUB_ENV
|
||||
|
||||
- name: Update clone
|
||||
working-directory: /transformers
|
||||
env:
|
||||
commit_sha: ${{ inputs.commit_sha || github.sha }}
|
||||
run: |
|
||||
git fetch origin "$commit_sha" && git checkout "$commit_sha"
|
||||
|
||||
- name: Reinstall transformers in edit mode (remove the one installed during docker image build)
|
||||
working-directory: /transformers
|
||||
run: python3 -m pip uninstall -y transformers && python3 -m pip install -e .
|
||||
|
||||
- name: Update / Install some packages (for Past CI)
|
||||
if: ${{ contains(inputs.docker, '-past-') }}
|
||||
working-directory: /transformers
|
||||
run: |
|
||||
python3 -m pip install -U datasets
|
||||
|
||||
- name: Update / Install some packages (for Past CI)
|
||||
if: ${{ contains(inputs.docker, '-past-') && contains(inputs.docker, '-pytorch-') }}
|
||||
working-directory: /transformers
|
||||
run: |
|
||||
python3 -m pip install --no-cache-dir git+https://github.com/huggingface/accelerate@main#egg=accelerate
|
||||
|
||||
- name: NVIDIA-SMI
|
||||
run: |
|
||||
nvidia-smi
|
||||
|
||||
- name: Environment
|
||||
working-directory: /transformers
|
||||
run: |
|
||||
python3 utils/print_env.py
|
||||
|
||||
- name: Show installed libraries and their versions
|
||||
working-directory: /transformers
|
||||
run: pip freeze
|
||||
|
||||
- name: Set `machine_type` for report and artifact names
|
||||
id: set_machine_type
|
||||
working-directory: /transformers
|
||||
shell: bash
|
||||
env:
|
||||
input_machine_type: ${{ inputs.machine_type }}
|
||||
run: |
|
||||
echo "$input_machine_type"
|
||||
|
||||
if [ "$input_machine_type" = "aws-g5-4xlarge-cache" ]; then
|
||||
machine_type=single-gpu
|
||||
elif [ "$input_machine_type" = "aws-g5-12xlarge-cache" ]; then
|
||||
machine_type=multi-gpu
|
||||
else
|
||||
machine_type="$input_machine_type"
|
||||
fi
|
||||
|
||||
echo "$machine_type"
|
||||
echo "machine_type=$machine_type" >> $GITHUB_ENV
|
||||
echo "machine_type=$machine_type" >> $GITHUB_OUTPUT
|
||||
|
||||
- name: Create report directory if it doesn't exist
|
||||
shell: bash
|
||||
env:
|
||||
report_name_prefix: ${{ inputs.report_name_prefix }}
|
||||
run: |
|
||||
mkdir -p "/transformers/reports/${machine_type}_${report_name_prefix}_${matrix_folders}_test_reports"
|
||||
echo "dummy" > "/transformers/reports/${machine_type}_${report_name_prefix}_${matrix_folders}_test_reports/dummy.txt"
|
||||
ls -la "/transformers/reports/${machine_type}_${report_name_prefix}_${matrix_folders}_test_reports"
|
||||
|
||||
- name: Run all tests on GPU
|
||||
working-directory: /transformers
|
||||
env:
|
||||
report_name_prefix: ${{ inputs.report_name_prefix }}
|
||||
pytest_marker: ${{ inputs.pytest_marker }}
|
||||
model: ${{ matrix.folders }}
|
||||
run: |
|
||||
# Map short names to actual test paths for trainer/distributed tests
|
||||
test_path="tests/${model}"
|
||||
if [ "$model" = "fsdp" ]; then
|
||||
test_path="tests/trainer/distributed/test_trainer_distributed_fsdp.py"
|
||||
elif [ "$model" = "ddp" ]; then
|
||||
test_path="tests/trainer/distributed/test_trainer_distributed_ddp.py"
|
||||
elif [ "$model" = "trainer" ] && [ "$report_name_prefix" = "run_trainer_and_fsdp_gpu" ]; then
|
||||
test_path="tests/trainer --ignore=tests/trainer/distributed"
|
||||
fi
|
||||
script -q -c "PATCH_TESTING_METHODS_TO_COLLECT_OUTPUTS=yes _PATCHED_TESTING_METHODS_OUTPUT_DIR=/transformers/reports/${machine_type}_${report_name_prefix}_${matrix_folders}_test_reports python3 -m pytest -rsfE -v -m '${pytest_marker}' --make-reports=${machine_type}_${report_name_prefix}_${matrix_folders}_test_reports ${test_path}" test_outputs.txt
|
||||
ls -la
|
||||
# Extract the exit code from the output file
|
||||
EXIT_CODE=$(tail -1 test_outputs.txt | grep -o 'COMMAND_EXIT_CODE="[0-9]*"' | cut -d'"' -f2)
|
||||
exit ${EXIT_CODE:-1}
|
||||
|
||||
- name: Failure short reports
|
||||
if: ${{ failure() }}
|
||||
# This step is only to show information on Github Actions log.
|
||||
# Always mark this step as successful, even if the report directory or the file `failures_short.txt` in it doesn't exist
|
||||
continue-on-error: true
|
||||
env:
|
||||
report_name_prefix: ${{ inputs.report_name_prefix }}
|
||||
run: cat "/transformers/reports/${machine_type}_${report_name_prefix}_${matrix_folders}_test_reports/failures_short.txt"
|
||||
|
||||
- name: Captured information
|
||||
if: ${{ failure() }}
|
||||
continue-on-error: true
|
||||
env:
|
||||
report_name_prefix: ${{ inputs.report_name_prefix }}
|
||||
run: |
|
||||
cat "/transformers/reports/${machine_type}_${report_name_prefix}_${matrix_folders}_test_reports/captured_info.txt"
|
||||
|
||||
- name: Copy test_outputs.txt
|
||||
if: ${{ always() }}
|
||||
continue-on-error: true
|
||||
env:
|
||||
report_name_prefix: ${{ inputs.report_name_prefix }}
|
||||
run: |
|
||||
cp /transformers/test_outputs.txt "/transformers/reports/${machine_type}_${report_name_prefix}_${matrix_folders}_test_reports"
|
||||
|
||||
- name: "Test suite reports artifacts: ${{ env.machine_type }}_${{ inputs.report_name_prefix }}_${{ env.matrix_folders }}_test_reports"
|
||||
if: ${{ always() }}
|
||||
uses: actions/upload-artifact@ea165f8d65b6e75b540449e92b4886f43607fa02 # v4.6.2
|
||||
with:
|
||||
name: ${{ env.machine_type }}_${{ inputs.report_name_prefix }}_${{ env.matrix_folders }}_test_reports
|
||||
path: /transformers/reports/${{ env.machine_type }}_${{ inputs.report_name_prefix }}_${{ env.matrix_folders }}_test_reports
|
||||
|
||||
collated_reports:
|
||||
name: Collated Reports
|
||||
if: ${{ always() && inputs.runner_type != '' }}
|
||||
needs: run_models_gpu
|
||||
uses: huggingface/transformers/.github/workflows/collated-reports.yml@6abd9725ee7d809dc974991f8ff6c958afb63a3a # main
|
||||
with:
|
||||
job: run_models_gpu
|
||||
report_repo_id: ${{ inputs.report_repo_id }}
|
||||
gpu_name: ${{ inputs.runner_type }}
|
||||
machine_type: ${{ needs.run_models_gpu.outputs.machine_type }}
|
||||
secrets: inherit
|
||||
143
.github/workflows/model_jobs_intel_gaudi.yml
vendored
Normal file
143
.github/workflows/model_jobs_intel_gaudi.yml
vendored
Normal file
@@ -0,0 +1,143 @@
|
||||
name: model jobs
|
||||
|
||||
on:
|
||||
workflow_call:
|
||||
inputs:
|
||||
folder_slices:
|
||||
required: true
|
||||
type: string
|
||||
slice_id:
|
||||
required: true
|
||||
type: number
|
||||
runner:
|
||||
required: true
|
||||
type: string
|
||||
machine_type:
|
||||
required: true
|
||||
type: string
|
||||
report_name_prefix:
|
||||
required: false
|
||||
default: run_models_gpu
|
||||
type: string
|
||||
|
||||
env:
|
||||
RUN_SLOW: yes
|
||||
PT_HPU_LAZY_MODE: 0
|
||||
TRANSFORMERS_IS_CI: yes
|
||||
PT_ENABLE_INT64_SUPPORT: 1
|
||||
HF_TOKEN: ${{ secrets.HF_HUB_READ_TOKEN }}
|
||||
HF_HOME: /mnt/cache/.cache/huggingface
|
||||
|
||||
permissions:
|
||||
contents: read
|
||||
|
||||
jobs:
|
||||
run_models_gpu:
|
||||
name: " "
|
||||
strategy:
|
||||
max-parallel: 8
|
||||
fail-fast: false
|
||||
matrix:
|
||||
folders: ${{ fromJson(inputs.folder_slices)[inputs.slice_id] }}
|
||||
runs-on:
|
||||
group: ${{ inputs.runner }}
|
||||
container:
|
||||
image: vault.habana.ai/gaudi-docker/1.21.1/ubuntu22.04/habanalabs/pytorch-installer-2.6.0:latest
|
||||
options: --runtime=habana
|
||||
-v /mnt/cache/.cache/huggingface:/mnt/cache/.cache/huggingface
|
||||
--env OMPI_MCA_btl_vader_single_copy_mechanism=none
|
||||
--env HABANA_VISIBLE_DEVICES
|
||||
--env HABANA_VISIBLE_MODULES
|
||||
--cap-add=sys_nice
|
||||
--shm-size=64G
|
||||
steps:
|
||||
- name: Echo input and matrix info
|
||||
shell: bash
|
||||
env:
|
||||
FOLDER_SLICES: ${{ inputs.folder_slices }}
|
||||
MATRIX_FOLDERS: ${{ matrix.folders }}
|
||||
SLICE: ${{ toJson(fromJson(inputs.folder_slices)[inputs.slice_id]) }}
|
||||
run: |
|
||||
echo "$FOLDER_SLICES"
|
||||
echo "$MATRIX_FOLDERS"
|
||||
echo "$SLICE"
|
||||
|
||||
- name: Echo folder ${{ matrix.folders }}
|
||||
shell: bash
|
||||
env:
|
||||
MATRIX_FOLDERS: ${{ matrix.folders }}
|
||||
run: |
|
||||
echo "$MATRIX_FOLDERS"
|
||||
matrix_folders="${MATRIX_FOLDERS/'models/'/'models_'}"
|
||||
echo "$matrix_folders"
|
||||
echo "matrix_folders=$matrix_folders" >> "$GITHUB_ENV"
|
||||
|
||||
- name: Checkout
|
||||
uses: actions/checkout@34e114876b0b11c390a56381ad16ebd13914f8d5 # v4.3.1
|
||||
with:
|
||||
fetch-depth: 0
|
||||
persist-credentials: false
|
||||
|
||||
- name: Install dependencies
|
||||
run: |
|
||||
pip install -e .[testing,torch] "numpy<2.0.0" scipy scikit-learn
|
||||
|
||||
- name: HL-SMI
|
||||
run: |
|
||||
hl-smi
|
||||
echo "HABANA_VISIBLE_DEVICES=${HABANA_VISIBLE_DEVICES}"
|
||||
echo "HABANA_VISIBLE_MODULES=${HABANA_VISIBLE_MODULES}"
|
||||
|
||||
- name: Environment
|
||||
run: python3 utils/print_env.py
|
||||
|
||||
- name: Show installed libraries and their versions
|
||||
run: pip freeze
|
||||
|
||||
- name: Set `machine_type` for report and artifact names
|
||||
shell: bash
|
||||
env:
|
||||
MACHINE_TYPE: ${{ inputs.machine_type }}
|
||||
run: |
|
||||
if [ "$MACHINE_TYPE" = "1gaudi" ]; then
|
||||
machine_type=single-gpu
|
||||
elif [ "$MACHINE_TYPE" = "2gaudi" ]; then
|
||||
machine_type=multi-gpu
|
||||
else
|
||||
machine_type="$MACHINE_TYPE"
|
||||
fi
|
||||
echo "machine_type=$machine_type" >> "$GITHUB_ENV"
|
||||
|
||||
- name: Run all tests on Gaudi
|
||||
env:
|
||||
REPORT_NAME_PREFIX: ${{ inputs.report_name_prefix }}
|
||||
MATRIX_FOLDERS: ${{ matrix.folders }}
|
||||
run: |
|
||||
REPORTS="${machine_type}_${REPORT_NAME_PREFIX}_${MATRIX_FOLDERS}_test_reports"
|
||||
python3 -m pytest -v --make-reports="$REPORTS" "tests/${MATRIX_FOLDERS}"
|
||||
|
||||
- name: Failure short reports
|
||||
if: ${{ failure() }}
|
||||
continue-on-error: true
|
||||
env:
|
||||
REPORT_NAME_PREFIX: ${{ inputs.report_name_prefix }}
|
||||
MATRIX_FOLDERS: ${{ matrix.folders }}
|
||||
run: cat "reports/${machine_type}_${REPORT_NAME_PREFIX}_${MATRIX_FOLDERS}_test_reports/failures_short.txt"
|
||||
|
||||
- name: Run test
|
||||
shell: bash
|
||||
env:
|
||||
REPORT_NAME_PREFIX: ${{ inputs.report_name_prefix }}
|
||||
MATRIX_FOLDERS: ${{ matrix.folders }}
|
||||
run: |
|
||||
REPORTS="${machine_type}_${REPORT_NAME_PREFIX}_${MATRIX_FOLDERS}_test_reports"
|
||||
mkdir -p "reports/$REPORTS"
|
||||
echo "hello" > "reports/$REPORTS/hello.txt"
|
||||
echo "$REPORTS"
|
||||
|
||||
- name: "Test suite reports artifacts: ${{ env.machine_type }}_${{ inputs.report_name_prefix }}_${{ env.matrix_folders }}_test_reports"
|
||||
if: ${{ always() }}
|
||||
uses: actions/upload-artifact@ea165f8d65b6e75b540449e92b4886f43607fa02 # v4.6.2
|
||||
with:
|
||||
name: ${{ env.machine_type }}_${{ inputs.report_name_prefix }}_${{ env.matrix_folders }}_test_reports
|
||||
path: reports/${{ env.machine_type }}_${{ inputs.report_name_prefix }}_${{ matrix.folders }}_test_reports
|
||||
72
.github/workflows/new_model_pr_merged_notification.yml
vendored
Normal file
72
.github/workflows/new_model_pr_merged_notification.yml
vendored
Normal file
@@ -0,0 +1,72 @@
|
||||
# Used to notify core maintainers about new model PR being merged
|
||||
name: New model PR merged notification
|
||||
|
||||
on:
|
||||
push:
|
||||
branches:
|
||||
- main
|
||||
paths:
|
||||
- 'src/transformers/models/*/modeling_*'
|
||||
|
||||
permissions:
|
||||
contents: read
|
||||
|
||||
jobs:
|
||||
notify_new_model:
|
||||
name: Notify new model
|
||||
runs-on: ubuntu-22.04
|
||||
steps:
|
||||
- uses: actions/checkout@34e114876b0b11c390a56381ad16ebd13914f8d5 # v4.3.1
|
||||
with:
|
||||
fetch-depth: 0
|
||||
persist-credentials: false
|
||||
- name: Check new model
|
||||
shell: bash
|
||||
run: |
|
||||
python -m pip install gitpython
|
||||
python -c 'from utils.pr_slow_ci_models import get_new_model; new_model = get_new_model(diff_with_last_commit=True); print(new_model)' | tee output.txt
|
||||
echo "NEW_MODEL=$(tail -n 1 output.txt)" >> $GITHUB_ENV
|
||||
echo "COMMIT_SHA=$(git log -1 --format=%H)" >> $GITHUB_ENV
|
||||
|
||||
- name: print commit sha
|
||||
if: ${{ env.NEW_MODEL != ''}}
|
||||
shell: bash
|
||||
run: |
|
||||
echo "$COMMIT_SHA"
|
||||
|
||||
- name: print new model
|
||||
if: ${{ env.NEW_MODEL != ''}}
|
||||
shell: bash
|
||||
run: |
|
||||
echo "$NEW_MODEL"
|
||||
|
||||
- name: Notify
|
||||
if: ${{ env.NEW_MODEL != ''}}
|
||||
uses: slackapi/slack-github-action@6c661ce58804a1a20f6dc5fbee7f0381b469e001
|
||||
with:
|
||||
# Slack channel id, channel name, or user id to post message.
|
||||
# See also: https://api.slack.com/methods/chat.postMessage#channels
|
||||
channel-id: transformers-new-model-notification
|
||||
# For posting a rich message using Block Kit
|
||||
payload: |
|
||||
{
|
||||
"blocks": [
|
||||
{
|
||||
"type": "header",
|
||||
"text": {
|
||||
"type": "plain_text",
|
||||
"text": "New model!",
|
||||
"emoji": true
|
||||
}
|
||||
},
|
||||
{
|
||||
"type": "section",
|
||||
"text": {
|
||||
"type": "mrkdwn",
|
||||
"text": "<https://github.com/huggingface/transformers/commit/${{ env.COMMIT_SHA }}|New model: ${{ env.NEW_MODEL }}> GH_ArthurZucker, GH_lysandrejik, GH_ydshieh\ncommit SHA: ${{ env.COMMIT_SHA }}"
|
||||
}
|
||||
}
|
||||
]
|
||||
}
|
||||
env:
|
||||
SLACK_BOT_TOKEN: ${{ secrets.SLACK_CIFEEDBACK_BOT_TOKEN }}
|
||||
23
.github/workflows/pr-ci-caller.yml
vendored
Normal file
23
.github/workflows/pr-ci-caller.yml
vendored
Normal file
@@ -0,0 +1,23 @@
|
||||
name: PR CI
|
||||
|
||||
on:
|
||||
pull_request:
|
||||
push:
|
||||
branches:
|
||||
- main
|
||||
|
||||
concurrency:
|
||||
# Group by PR number, commit SHA (main) to avoid cross-branch cancellation, or branch ref for cancellation within each branch
|
||||
group: ${{ github.workflow }}-${{ github.event.pull_request.number || (github.ref == 'refs/heads/main' && github.sha) || github.ref }}
|
||||
cancel-in-progress: true
|
||||
|
||||
permissions:
|
||||
contents: read
|
||||
|
||||
jobs:
|
||||
pr-ci:
|
||||
# Allow `ydshieh2` for testing during development
|
||||
if: github.event_name == 'push' || contains(fromJSON('["MEMBER","OWNER","COLLABORATOR"]'), github.event.pull_request.author_association) || github.event.pull_request.user.login == 'ydshieh2'
|
||||
# DON'T use commit sha here before the CI migration is finished
|
||||
uses: huggingface/transformers-test-ci/.github/workflows/pr-ci_dynamic_caller_example.yml@main # main
|
||||
|
||||
71
.github/workflows/pr-ci-post-dashboard-link.yml
vendored
Normal file
71
.github/workflows/pr-ci-post-dashboard-link.yml
vendored
Normal file
@@ -0,0 +1,71 @@
|
||||
name: Post CI Dashboard Link
|
||||
|
||||
on:
|
||||
workflow_run:
|
||||
workflows: ["PR CI"]
|
||||
types: [completed]
|
||||
|
||||
permissions:
|
||||
pull-requests: write
|
||||
|
||||
jobs:
|
||||
post-link:
|
||||
if: github.event.workflow_run.event == 'pull_request'
|
||||
runs-on: ubuntu-22.04
|
||||
steps:
|
||||
- uses: actions/github-script@v7
|
||||
with:
|
||||
script: |
|
||||
const headSha = context.payload.workflow_run.head_sha;
|
||||
const runId = context.payload.workflow_run.id;
|
||||
|
||||
const { data: prs } = await github.rest.pulls.list({
|
||||
owner: context.repo.owner,
|
||||
repo: context.repo.repo,
|
||||
state: 'open',
|
||||
});
|
||||
|
||||
const pr = prs.find(pr => pr.head.sha === headSha);
|
||||
if (!pr) {
|
||||
console.log(`No open PR found for SHA ${headSha}, skipping`);
|
||||
return;
|
||||
}
|
||||
|
||||
// Check if the code quality job failed (causes all test jobs to be skipped)
|
||||
const { data: jobsData } = await github.rest.actions.listJobsForWorkflowRun({
|
||||
owner: context.repo.owner,
|
||||
repo: context.repo.repo,
|
||||
run_id: runId,
|
||||
per_page: 100,
|
||||
});
|
||||
const qualityJob = jobsData.jobs.find(j => j.name.includes('Check code quality'));
|
||||
const qualityFailed = qualityJob && qualityJob.conclusion === 'failure';
|
||||
|
||||
// Delete any existing dashboard comment before posting a fresh one
|
||||
const { data: comments } = await github.rest.issues.listComments({
|
||||
owner: context.repo.owner,
|
||||
repo: context.repo.repo,
|
||||
issue_number: pr.number,
|
||||
});
|
||||
for (const comment of comments) {
|
||||
if (comment.body.includes('**CI Observability Dashboard:** [View test results in Grafana]') ||
|
||||
comment.body.includes('**CI Dashboard:** [View test results in Grafana]')) {
|
||||
await github.rest.issues.deleteComment({
|
||||
owner: context.repo.owner,
|
||||
repo: context.repo.repo,
|
||||
comment_id: comment.id,
|
||||
});
|
||||
}
|
||||
}
|
||||
|
||||
const url = `https://transformers-ci.lor-e.huggingface.cool/d/pytest-observability-by-pr/pytest-observability-branch?var-pr=${pr.number}`;
|
||||
let body = `**CI Dashboard:** [View test results in Grafana](${url})`;
|
||||
if (qualityFailed) {
|
||||
body += `\n\n> ⚠️ **Code quality check failed** — all test jobs were skipped. Fix the code quality issues and push again to run tests.`;
|
||||
}
|
||||
await github.rest.issues.createComment({
|
||||
owner: context.repo.owner,
|
||||
repo: context.repo.repo,
|
||||
issue_number: pr.number,
|
||||
body,
|
||||
});
|
||||
390
.github/workflows/pr-repo-consistency-bot.yml
vendored
Normal file
390
.github/workflows/pr-repo-consistency-bot.yml
vendored
Normal file
@@ -0,0 +1,390 @@
|
||||
name: PR Repo. Consistency Bot
|
||||
|
||||
on:
|
||||
issue_comment:
|
||||
types:
|
||||
- created
|
||||
branches-ignore:
|
||||
- main
|
||||
concurrency:
|
||||
group: ${{ github.workflow }}-${{ github.event.issue.number }}-${{ startsWith(github.event.comment.body, '@bot /repo') || startsWith(github.event.comment.body, '@bot /style') }}
|
||||
cancel-in-progress: true
|
||||
permissions:
|
||||
contents: read
|
||||
|
||||
|
||||
jobs:
|
||||
get-pr-number:
|
||||
name: Get PR number
|
||||
if: ${{ github.event.issue.state == 'open' && contains(fromJSON('["ydshieh", "ArthurZucker", "zucchini-nlp", "molbap", "gante", "LysandreJik", "Cyrilvallez", "Rocketknight1", "SunMarc", "eustlb", "MekkCyber", "vasqu", "ivarflakstad", "stevhliu", "ebezzam", "remi-or", "itazap", "3outeille", "IlyasMoutawwakil", "tarekziade"]'), github.actor) && (startsWith(github.event.comment.body, '@bot /repo') || startsWith(github.event.comment.body, '@bot /style')) }}
|
||||
uses: ./.github/workflows/get-pr-number.yml
|
||||
|
||||
get-pr-info:
|
||||
name: Get PR commit SHA
|
||||
needs: get-pr-number
|
||||
if: ${{ needs.get-pr-number.outputs.PR_NUMBER != ''}}
|
||||
uses: ./.github/workflows/get-pr-info.yml
|
||||
with:
|
||||
pr_number: ${{ needs.get-pr-number.outputs.PR_NUMBER }}
|
||||
|
||||
check-timestamps:
|
||||
name: Check timestamps (security check)
|
||||
runs-on: ubuntu-22.04
|
||||
needs: get-pr-info
|
||||
outputs:
|
||||
VERIFIED_PR_HEAD_SHA: ${{ needs.get-pr-info.outputs.PR_HEAD_SHA }}
|
||||
steps:
|
||||
- name: Verify `merge_commit` timestamp is older than the issue comment timestamp
|
||||
env:
|
||||
COMMENT_DATE: ${{ github.event.comment.created_at }}
|
||||
PR_MERGE_COMMIT_TIMESTAMP: ${{ needs.get-pr-info.outputs.PR_MERGE_COMMIT_TIMESTAMP }}
|
||||
run: |
|
||||
COMMENT_TIMESTAMP=$(date -d "${COMMENT_DATE}" +"%s")
|
||||
echo "COMMENT_DATE: $COMMENT_DATE"
|
||||
echo "COMMENT_TIMESTAMP: $COMMENT_TIMESTAMP"
|
||||
if [ $COMMENT_TIMESTAMP -le $PR_MERGE_COMMIT_TIMESTAMP ]; then
|
||||
echo "Last commit on the pull request is newer than the issue comment triggering this run! Abort!";
|
||||
exit -1;
|
||||
fi
|
||||
|
||||
init_comment_with_url:
|
||||
name: Init Comment on PR
|
||||
runs-on: ubuntu-22.04
|
||||
needs: [get-pr-number, check-timestamps]
|
||||
outputs:
|
||||
comment_id: ${{ steps.init_comment.outputs.comment_id }}
|
||||
permissions:
|
||||
pull-requests: write
|
||||
steps:
|
||||
- name: Delete existing bot comment if it exists
|
||||
env:
|
||||
PR_NUMBER: ${{ needs.get-pr-number.outputs.PR_NUMBER }}
|
||||
uses: actions/github-script@d7906e4ad0b1822421a7e6a35d5ca353c962f410 # v6.4.1
|
||||
with:
|
||||
script: |
|
||||
const PR_NUMBER = parseInt(process.env.PR_NUMBER, 10);
|
||||
|
||||
// Get all comments on the PR
|
||||
const { data: comments } = await github.rest.issues.listComments({
|
||||
owner: context.repo.owner,
|
||||
repo: context.repo.repo,
|
||||
issue_number: PR_NUMBER
|
||||
});
|
||||
|
||||
// Find existing bot comments that start with "Repo. Consistency" or "Style fix"
|
||||
const existingComments = comments.filter(comment =>
|
||||
comment.user.login === 'github-actions[bot]' &&
|
||||
(comment.body.startsWith('Repo. Consistency') || comment.body.startsWith('Style fix'))
|
||||
);
|
||||
|
||||
if (existingComments.length > 0) {
|
||||
// Get the most recent comment
|
||||
const mostRecentComment = existingComments
|
||||
.sort((a, b) => new Date(b.created_at) - new Date(a.created_at))[0];
|
||||
|
||||
console.log(`Deleting most recent comment #${mostRecentComment.id}`);
|
||||
await github.rest.issues.deleteComment({
|
||||
owner: context.repo.owner,
|
||||
repo: context.repo.repo,
|
||||
comment_id: mostRecentComment.id
|
||||
});
|
||||
}
|
||||
|
||||
- name: Comment on PR with workflow run link
|
||||
id: init_comment
|
||||
env:
|
||||
PR_NUMBER: ${{ needs.get-pr-number.outputs.PR_NUMBER }}
|
||||
COMMENT_BODY: ${{ github.event.comment.body }}
|
||||
uses: actions/github-script@d7906e4ad0b1822421a7e6a35d5ca353c962f410 # v6.4.1
|
||||
with:
|
||||
script: |
|
||||
const PR_NUMBER = parseInt(process.env.PR_NUMBER, 10);
|
||||
const COMMENT_BODY = process.env.COMMENT_BODY;
|
||||
const runUrl = `${process.env.GITHUB_SERVER_URL}/${process.env.GITHUB_REPOSITORY}/actions/runs/${process.env.GITHUB_RUN_ID}`
|
||||
|
||||
// Determine which command was used
|
||||
const isStyleFix = COMMENT_BODY.startsWith('@bot /style');
|
||||
const messagePrefix = isStyleFix ? 'Style fix' : 'Repo. Consistency fix';
|
||||
|
||||
const { data: botComment } = await github.rest.issues.createComment({
|
||||
owner: context.repo.owner,
|
||||
repo: context.repo.repo,
|
||||
issue_number: PR_NUMBER,
|
||||
body: `${messagePrefix} is beginning .... [View the workflow run here](${runUrl}).`
|
||||
});
|
||||
core.setOutput('comment_id', botComment.id);
|
||||
|
||||
run-repo-consistency-checks:
|
||||
runs-on: ubuntu-22.04
|
||||
needs: [get-pr-info, check-timestamps, init_comment_with_url]
|
||||
permissions:
|
||||
contents: read
|
||||
pull-requests: read
|
||||
outputs:
|
||||
changes_detected: ${{ steps.run_repo_checks.outputs.changes_detected || steps.run_style_checks.outputs.changes_detected }}
|
||||
util_scripts_modified: ${{ steps.check_util_scripts.outputs.util_scripts_modified }}
|
||||
steps:
|
||||
# Checkout the trusted base repository (main branch) - this is safe
|
||||
- name: Checkout base repository
|
||||
uses: actions/checkout@34e114876b0b11c390a56381ad16ebd13914f8d5 # v4.3.1
|
||||
with:
|
||||
ref: main
|
||||
persist-credentials: false
|
||||
|
||||
- name: Set up Python
|
||||
uses: actions/setup-python@7f4fc3e22c37d6ff65e88745f38bd3157c663f7c # v4.9.1
|
||||
with:
|
||||
python-version: "3.10"
|
||||
|
||||
- name: Install dependencies from trusted main branch
|
||||
run: |
|
||||
python -m pip install --upgrade pip
|
||||
pip install -e ".[quality]"
|
||||
pip install --no-cache-dir --upgrade 'torch' 'torchaudio' 'torchvision' --index-url https://download.pytorch.org/whl/cpu
|
||||
|
||||
- name: Fetch and checkout PR code manually
|
||||
env:
|
||||
PR_HEAD_REPO_FULL_NAME: ${{ needs.get-pr-info.outputs.PR_HEAD_REPO_FULL_NAME }}
|
||||
PR_HEAD_REF: ${{ needs.get-pr-info.outputs.PR_HEAD_REF }}
|
||||
PR_HEAD_SHA: ${{ needs.check-timestamps.outputs.VERIFIED_PR_HEAD_SHA }}
|
||||
run: |
|
||||
# Create separate directory for PR code
|
||||
mkdir -p pr-repo
|
||||
cd pr-repo
|
||||
|
||||
# Initialize git and fetch with full history
|
||||
git init
|
||||
git remote add pr-origin "https://github.com/${PR_HEAD_REPO_FULL_NAME}.git"
|
||||
git fetch pr-origin "${PR_HEAD_REF}"
|
||||
git checkout "${PR_HEAD_SHA}"
|
||||
|
||||
# Also fetch main branch from upstream for comparison (required by `check_modular_conversion`)
|
||||
git remote add upstream https://github.com/${{ github.repository }}.git
|
||||
git fetch upstream main:main
|
||||
|
||||
- name: Check if util scripts are modified in PR
|
||||
id: check_util_scripts
|
||||
uses: actions/github-script@d7906e4ad0b1822421a7e6a35d5ca353c962f410 # v6.4.1
|
||||
with:
|
||||
script: |
|
||||
// Re-fetch the PR file list from the API rather than using get-pr-info's PR_FILES
|
||||
// output, to avoid template injection and E2BIG issues (see pr_slow_ci_suggestion.yml).
|
||||
const UTIL_SCRIPTS = new Set([
|
||||
'setup.py',
|
||||
'utils/custom_init_isort.py',
|
||||
'utils/sort_auto_mappings.py',
|
||||
'utils/check_doc_toc.py',
|
||||
'utils/check_copies.py',
|
||||
'utils/check_modular_conversion.py',
|
||||
'utils/check_dummies.py',
|
||||
'utils/check_pipeline_typing.py',
|
||||
'utils/check_doctest_list.py',
|
||||
'utils/check_docstrings.py',
|
||||
'utils/add_dates.py',
|
||||
]);
|
||||
|
||||
const files = await github.paginate(github.rest.pulls.listFiles, {
|
||||
owner: context.repo.owner,
|
||||
repo: context.repo.repo,
|
||||
pull_number: context.payload.issue.number,
|
||||
});
|
||||
|
||||
const modified = files.some(f => UTIL_SCRIPTS.has(f.filename));
|
||||
core.setOutput('util_scripts_modified', modified ? 'true' : 'false');
|
||||
|
||||
- name: Install editable transformers from PR branch
|
||||
if: steps.check_util_scripts.outputs.util_scripts_modified != 'true'
|
||||
run: |
|
||||
cd pr-repo
|
||||
pip install -e .
|
||||
|
||||
- name: Run repo consistency checks with trusted script
|
||||
id: run_repo_checks
|
||||
if: steps.check_util_scripts.outputs.util_scripts_modified != 'true' && startsWith(github.event.comment.body, '@bot /repo')
|
||||
run: |
|
||||
# Continue on errors (like Makefile's - prefix)
|
||||
set +e
|
||||
|
||||
# Run commands in PR directory (with the copied trusted scripts)
|
||||
cd pr-repo
|
||||
|
||||
# Run style commands
|
||||
ruff check examples tests src utils scripts benchmark benchmark_v2 setup.py conftest.py --fix
|
||||
ruff format examples tests src utils scripts benchmark benchmark_v2 setup.py conftest.py
|
||||
python utils/custom_init_isort.py
|
||||
python utils/sort_auto_mappings.py
|
||||
|
||||
# Run fix-repo commands
|
||||
python setup.py deps_table_update
|
||||
python utils/check_doc_toc.py --fix_and_overwrite
|
||||
python utils/check_copies.py --fix_and_overwrite
|
||||
python utils/check_modular_conversion.py --fix_and_overwrite
|
||||
python utils/check_dummies.py --fix_and_overwrite
|
||||
python utils/check_pipeline_typing.py --fix_and_overwrite
|
||||
python utils/check_doctest_list.py --fix_and_overwrite
|
||||
python utils/check_docstrings.py --fix_and_overwrite
|
||||
python utils/add_dates.py
|
||||
|
||||
# Check if there are changes
|
||||
if [ -n "$(git status --porcelain)" ]; then
|
||||
echo "changes_detected=true" >> $GITHUB_OUTPUT
|
||||
else
|
||||
echo "changes_detected=false" >> $GITHUB_OUTPUT
|
||||
fi
|
||||
|
||||
- name: Run style checks with trusted script
|
||||
id: run_style_checks
|
||||
if: steps.check_util_scripts.outputs.util_scripts_modified != 'true' && startsWith(github.event.comment.body, '@bot /style')
|
||||
run: |
|
||||
# Continue on errors (like Makefile's - prefix)
|
||||
set +e
|
||||
|
||||
# Run commands in PR directory (with the copied trusted scripts)
|
||||
cd pr-repo
|
||||
|
||||
# Run style commands
|
||||
ruff check examples tests src utils scripts benchmark benchmark_v2 setup.py conftest.py --fix
|
||||
ruff format examples tests src utils scripts benchmark benchmark_v2 setup.py conftest.py
|
||||
python utils/sort_auto_mappings.py
|
||||
|
||||
# Run fix-repo commands
|
||||
python setup.py deps_table_update
|
||||
python utils/check_doc_toc.py --fix_and_overwrite
|
||||
python utils/check_docstrings.py --fix_and_overwrite
|
||||
|
||||
# Check if there are changes
|
||||
if [ -n "$(git status --porcelain)" ]; then
|
||||
echo "changes_detected=true" >> $GITHUB_OUTPUT
|
||||
else
|
||||
echo "changes_detected=false" >> $GITHUB_OUTPUT
|
||||
fi
|
||||
|
||||
- name: Save modified files
|
||||
if: steps.check_util_scripts.outputs.util_scripts_modified != 'true' && (steps.run_repo_checks.outputs.changes_detected == 'true' || steps.run_style_checks.outputs.changes_detected == 'true')
|
||||
run: |
|
||||
cd pr-repo
|
||||
mkdir -p ../artifact-staging
|
||||
git diff --name-only > ../artifact-staging/modified-files.txt
|
||||
# Copy each modified file
|
||||
while IFS= read -r file; do
|
||||
mkdir -p "../artifact-staging/pr-repo/$(dirname "$file")"
|
||||
cp "$file" "../artifact-staging/pr-repo/$file"
|
||||
done < ../artifact-staging/modified-files.txt
|
||||
|
||||
- name: Upload modified files
|
||||
if: steps.check_util_scripts.outputs.util_scripts_modified != 'true' && (steps.run_repo_checks.outputs.changes_detected == 'true' || steps.run_style_checks.outputs.changes_detected == 'true')
|
||||
uses: actions/upload-artifact@ea165f8d65b6e75b540449e92b4886f43607fa02 # v4.6.2
|
||||
with:
|
||||
name: modified-files
|
||||
path: artifact-staging/
|
||||
|
||||
commit-and-comment:
|
||||
runs-on: ubuntu-22.04
|
||||
needs: [get-pr-number, get-pr-info, check-timestamps, init_comment_with_url, run-repo-consistency-checks]
|
||||
if: always()
|
||||
permissions:
|
||||
pull-requests: write
|
||||
contents: write
|
||||
steps:
|
||||
- name: Download modified files
|
||||
if: needs.run-repo-consistency-checks.outputs.changes_detected == 'true'
|
||||
uses: actions/download-artifact@d3f86a106a0bac45b974a628896c90dbdf5c8093 # v4.3.0
|
||||
with:
|
||||
name: modified-files
|
||||
|
||||
- name: Push changes to fork using git
|
||||
if: needs.run-repo-consistency-checks.outputs.changes_detected == 'true'
|
||||
env:
|
||||
PR_HEAD_REF: ${{ needs.get-pr-info.outputs.PR_HEAD_REF }}
|
||||
PR_HEAD_SHA: ${{ needs.check-timestamps.outputs.VERIFIED_PR_HEAD_SHA }}
|
||||
PR_HEAD_REPO_FULL_NAME: ${{ needs.get-pr-info.outputs.PR_HEAD_REPO_FULL_NAME }}
|
||||
GITHUB_TOKEN: ${{ secrets.HF_STYLE_BOT_ACTION }}
|
||||
run: |
|
||||
# Initialize a fresh git repository for pushing
|
||||
mkdir push-repo
|
||||
cd push-repo
|
||||
git init
|
||||
|
||||
# Configure git
|
||||
git config user.name "github-actions[bot]"
|
||||
git config user.email "github-actions[bot]@users.noreply.github.com"
|
||||
|
||||
# Add fork as remote with token
|
||||
git remote add origin "https://x-access-token:${GITHUB_TOKEN}@github.com/${PR_HEAD_REPO_FULL_NAME}.git"
|
||||
|
||||
# Fetch only the specific branch
|
||||
git fetch origin "${PR_HEAD_REF}"
|
||||
|
||||
# Checkout the branch
|
||||
git checkout -b "${PR_HEAD_REF}" "origin/${PR_HEAD_REF}"
|
||||
|
||||
# Verify we're on the correct SHA
|
||||
current_sha=$(git rev-parse HEAD)
|
||||
if [ "$current_sha" != "$PR_HEAD_SHA" ]; then
|
||||
echo "❌ Error: Branch has been updated since workflow started"
|
||||
echo "Expected SHA: $PR_HEAD_SHA"
|
||||
echo "Current SHA: $current_sha"
|
||||
exit 1
|
||||
fi
|
||||
|
||||
# Copy modified files from artifact
|
||||
echo "Copying modified files..."
|
||||
while IFS= read -r file; do
|
||||
if [ -n "$file" ]; then
|
||||
echo " - $file"
|
||||
mkdir -p "$(dirname "$file")"
|
||||
cp "../pr-repo/$file" "$file"
|
||||
fi
|
||||
done < ../modified-files.txt
|
||||
|
||||
# Check if there are changes
|
||||
if [ -n "$(git status --porcelain)" ]; then
|
||||
git add .
|
||||
git commit -m "Apply repo consistency fixes"
|
||||
git push origin "HEAD:${PR_HEAD_REF}"
|
||||
echo "✅ Changes pushed successfully"
|
||||
else
|
||||
echo "No changes to commit"
|
||||
fi
|
||||
|
||||
- name: Prepare final comment message
|
||||
id: prepare_final_comment
|
||||
if: needs.init_comment_with_url.result == 'success'
|
||||
env:
|
||||
CHANGES_DETECTED: ${{ needs.run-repo-consistency-checks.outputs.changes_detected }}
|
||||
UTIL_SCRIPTS_MODIFIED: ${{ needs.run-repo-consistency-checks.outputs.util_scripts_modified }}
|
||||
COMMENT_BODY: ${{ github.event.comment.body }}
|
||||
run: |
|
||||
# Determine which command was used
|
||||
if [[ "$COMMENT_BODY" == "@bot /style"* ]]; then
|
||||
MESSAGE_PREFIX="Style fix"
|
||||
else
|
||||
MESSAGE_PREFIX="Repo. Consistency"
|
||||
fi
|
||||
|
||||
if [ "$UTIL_SCRIPTS_MODIFIED" = 'true' ]; then
|
||||
echo "final_comment=${MESSAGE_PREFIX}: \`setup.py\` or some script files under the \`utils/\` directory are modified in this PR. Please run style/repo. checks/fixes locally." >> $GITHUB_OUTPUT
|
||||
elif [ "$CHANGES_DETECTED" = 'true' ]; then
|
||||
echo "final_comment=${MESSAGE_PREFIX} bot fixed some files and pushed the changes." >> $GITHUB_OUTPUT
|
||||
else
|
||||
echo "final_comment=${MESSAGE_PREFIX} fix runs successfully without any file modified." >> $GITHUB_OUTPUT
|
||||
fi
|
||||
|
||||
- name: Comment on PR
|
||||
if: needs.init_comment_with_url.result == 'success'
|
||||
uses: actions/github-script@d7906e4ad0b1822421a7e6a35d5ca353c962f410 # v6.4.1
|
||||
env:
|
||||
PR_NUMBER: ${{ needs.get-pr-number.outputs.PR_NUMBER }}
|
||||
COMMENT_ID: ${{ needs.init_comment_with_url.outputs.comment_id }}
|
||||
FINAL_COMMENT: ${{ steps.prepare_final_comment.outputs.final_comment }}
|
||||
with:
|
||||
script: |
|
||||
const pr_number = parseInt(process.env.PR_NUMBER, 10);
|
||||
const comment_id = parseInt(process.env.COMMENT_ID, 10);
|
||||
const body = process.env.FINAL_COMMENT;
|
||||
await github.rest.issues.updateComment({
|
||||
owner: context.repo.owner,
|
||||
repo: context.repo.repo,
|
||||
comment_id,
|
||||
body,
|
||||
});
|
||||
138
.github/workflows/pr_build_doc_with_comment.yml
vendored
Normal file
138
.github/workflows/pr_build_doc_with_comment.yml
vendored
Normal file
@@ -0,0 +1,138 @@
|
||||
name: PR - build doc via comment
|
||||
on:
|
||||
issue_comment:
|
||||
types:
|
||||
- created
|
||||
branches-ignore:
|
||||
- main
|
||||
concurrency:
|
||||
group: ${{ github.workflow }}-${{ github.event.issue.number }}-${{ startsWith(github.event.comment.body, 'build-doc') }}
|
||||
cancel-in-progress: true
|
||||
permissions: {}
|
||||
|
||||
|
||||
jobs:
|
||||
get-pr-number:
|
||||
name: Get PR number
|
||||
if: ${{ github.event.issue.state == 'open' && contains(fromJSON('["ydshieh", "ArthurZucker", "zucchini-nlp", "molbap", "gante", "LysandreJik", "Cyrilvallez", "Rocketknight1", "SunMarc", "eustlb", "MekkCyber", "vasqu", "ivarflakstad", "stevhliu", "ebezzam", "itazap", "tarekziade"]'), github.actor) && (startsWith(github.event.comment.body, 'build-doc')) }}
|
||||
uses: ./.github/workflows/get-pr-number.yml
|
||||
|
||||
get-pr-info:
|
||||
name: Get PR commit SHA
|
||||
needs: get-pr-number
|
||||
if: ${{ needs.get-pr-number.outputs.PR_NUMBER != ''}}
|
||||
uses: ./.github/workflows/get-pr-info.yml
|
||||
with:
|
||||
pr_number: ${{ needs.get-pr-number.outputs.PR_NUMBER }}
|
||||
|
||||
verity_pr_commit:
|
||||
name: Verity PR commit corresponds to a specific event by comparing timestamps
|
||||
if: ${{ needs.get-pr-number.outputs.PR_NUMBER != ''}}
|
||||
runs-on: ubuntu-22.04
|
||||
needs: get-pr-info
|
||||
env:
|
||||
COMMENT_DATE: ${{ github.event.comment.created_at }}
|
||||
PR_MERGE_COMMIT_DATE: ${{ needs.get-pr-info.outputs.PR_MERGE_COMMIT_DATE }}
|
||||
PR_MERGE_COMMIT_TIMESTAMP: ${{ needs.get-pr-info.outputs.PR_MERGE_COMMIT_TIMESTAMP }}
|
||||
steps:
|
||||
- run: |
|
||||
COMMENT_TIMESTAMP=$(date -d "${COMMENT_DATE}" +"%s")
|
||||
echo "COMMENT_DATE: $COMMENT_DATE"
|
||||
echo "PR_MERGE_COMMIT_DATE: $PR_MERGE_COMMIT_DATE"
|
||||
echo "COMMENT_TIMESTAMP: $COMMENT_TIMESTAMP"
|
||||
echo "PR_MERGE_COMMIT_TIMESTAMP: $PR_MERGE_COMMIT_TIMESTAMP"
|
||||
if [ $COMMENT_TIMESTAMP -le $PR_MERGE_COMMIT_TIMESTAMP ]; then
|
||||
echo "Last commit on the pull request is newer than the issue comment triggering this run! Abort!";
|
||||
exit -1;
|
||||
fi
|
||||
|
||||
create_run:
|
||||
name: Create run
|
||||
needs: [get-pr-number, get-pr-info]
|
||||
if: ${{ needs.get-pr-number.outputs.PR_NUMBER != '' }}
|
||||
permissions:
|
||||
statuses: write
|
||||
runs-on: ubuntu-22.04
|
||||
steps:
|
||||
- name: Create Run
|
||||
id: create_run
|
||||
env:
|
||||
GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
|
||||
# Create a commit status (pending) for a run of this workflow. The status has to be updated later in `update_run_status`.
|
||||
# See https://docs.github.com/en/rest/commits/statuses?apiVersion=2022-11-28#create-a-commit-status
|
||||
GITHUB_RUN_URL: https://github.com/${{ github.repository }}/actions/runs/${{ github.run_id }}
|
||||
NEEDS_GET_PR_INFO_OUTPUTS_PR_HEAD_SHA: ${{ needs.get-pr-info.outputs.PR_HEAD_SHA }}
|
||||
run: |
|
||||
gh api \
|
||||
--method POST \
|
||||
-H "Accept: application/vnd.github+json" \
|
||||
-H "X-GitHub-Api-Version: 2022-11-28" \
|
||||
repos/${{ github.repository }}/statuses/${NEEDS_GET_PR_INFO_OUTPUTS_PR_HEAD_SHA} \
|
||||
-f "target_url=$GITHUB_RUN_URL" -f "state=pending" -f "description=Custom doc building job" -f "context=custom-doc-build"
|
||||
|
||||
reply_to_comment:
|
||||
name: Reply to the comment
|
||||
if: ${{ needs.create_run.result == 'success' }}
|
||||
needs: [get-pr-number, create_run]
|
||||
permissions:
|
||||
pull-requests: write
|
||||
runs-on: ubuntu-22.04
|
||||
steps:
|
||||
- name: Reply to the comment
|
||||
env:
|
||||
GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
|
||||
GITHUB_RUN_URL: https://github.com/${{ github.repository }}/actions/runs/${{ github.run_id }}
|
||||
NEEDS_GET_PR_NUMBER_OUTPUTS_PR_NUMBER: ${{ needs.get-pr-number.outputs.PR_NUMBER }}
|
||||
run: |
|
||||
gh api \
|
||||
--method POST \
|
||||
-H "Accept: application/vnd.github+json" \
|
||||
-H "X-GitHub-Api-Version: 2022-11-28" \
|
||||
repos/${{ github.repository }}/issues/${NEEDS_GET_PR_NUMBER_OUTPUTS_PR_NUMBER}/comments \
|
||||
-f "body=[Building docs for all languages...](${GITHUB_RUN_URL})"
|
||||
|
||||
build-doc:
|
||||
name: Build doc
|
||||
needs: [get-pr-number, get-pr-info]
|
||||
if: ${{ needs.get-pr-number.outputs.PR_NUMBER != '' }}
|
||||
uses: huggingface/doc-builder/.github/workflows/build_pr_documentation.yml@093eb65f2e8745457987df060dc392e6bcf1347a # main
|
||||
with:
|
||||
commit_sha: ${{ needs.get-pr-info.outputs.PR_HEAD_SHA }}
|
||||
pr_number: ${{ needs.get-pr-number.outputs.PR_NUMBER }}
|
||||
package: transformers
|
||||
languages: ar de en es fr hi it ja ko pt zh
|
||||
|
||||
update_run_status:
|
||||
name: Update Check Run Status
|
||||
needs: [ get-pr-info, create_run, build-doc ]
|
||||
permissions:
|
||||
statuses: write
|
||||
if: ${{ always() && needs.create_run.result == 'success' }}
|
||||
runs-on: ubuntu-22.04
|
||||
env:
|
||||
GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
|
||||
GITHUB_RUN_URL: https://github.com/${{ github.repository }}/actions/runs/${{ github.run_id }}
|
||||
STATUS_OK: ${{ contains(fromJSON('["skipped", "success"]'), needs.build-doc.result) }}
|
||||
steps:
|
||||
- name: Get `build-doc` job status
|
||||
run: |
|
||||
echo "${{ needs.build-doc.result }}"
|
||||
echo $STATUS_OK
|
||||
if [ "$STATUS_OK" = "true" ]; then
|
||||
echo "STATUS=success" >> $GITHUB_ENV
|
||||
else
|
||||
echo "STATUS=failure" >> $GITHUB_ENV
|
||||
fi
|
||||
|
||||
- name: Update PR commit statuses
|
||||
run: |
|
||||
echo "${{ needs.build-doc.result }}"
|
||||
echo "${STATUS}"
|
||||
gh api \
|
||||
--method POST \
|
||||
-H "Accept: application/vnd.github+json" \
|
||||
-H "X-GitHub-Api-Version: 2022-11-28" \
|
||||
repos/${{ github.repository }}/statuses/${NEEDS_GET_PR_INFO_OUTPUTS_PR_HEAD_SHA} \
|
||||
-f "target_url=$GITHUB_RUN_URL" -f "state=${STATUS}" -f "description=Custom doc building job" -f "context=custom-doc-build"
|
||||
env:
|
||||
NEEDS_GET_PR_INFO_OUTPUTS_PR_HEAD_SHA: ${{ needs.get-pr-info.outputs.PR_HEAD_SHA }}
|
||||
188
.github/workflows/pr_slow_ci_suggestion.yml
vendored
Normal file
188
.github/workflows/pr_slow_ci_suggestion.yml
vendored
Normal file
@@ -0,0 +1,188 @@
|
||||
name: PR slow CI - Suggestion
|
||||
on:
|
||||
pull_request_target:
|
||||
types: [opened, synchronize, reopened]
|
||||
|
||||
permissions:
|
||||
contents: read
|
||||
|
||||
jobs:
|
||||
get-pr-number:
|
||||
name: Get PR number
|
||||
uses: ./.github/workflows/get-pr-number.yml
|
||||
|
||||
get-pr-info:
|
||||
name: Get PR commit SHA
|
||||
needs: get-pr-number
|
||||
if: ${{ needs.get-pr-number.outputs.PR_NUMBER != ''}}
|
||||
uses: ./.github/workflows/get-pr-info.yml
|
||||
with:
|
||||
pr_number: ${{ needs.get-pr-number.outputs.PR_NUMBER }}
|
||||
|
||||
get-jobs:
|
||||
name: Get test files to run
|
||||
runs-on: ubuntu-22.04
|
||||
needs: [get-pr-number, get-pr-info]
|
||||
outputs:
|
||||
jobs: ${{ steps.get_jobs.outputs.jobs_to_run }}
|
||||
steps:
|
||||
# This checkout to the main branch
|
||||
- uses: actions/checkout@34e114876b0b11c390a56381ad16ebd13914f8d5 # v4.3.1
|
||||
with:
|
||||
fetch-depth: "0"
|
||||
persist-credentials: false
|
||||
|
||||
# Refetch the PR file list from the API instead of receiving it as an
|
||||
# input from `get-pr-info.yml`. The previous approaches either expanded
|
||||
# attacker-controlled content into a shell/JS source via `${{ ... }}`
|
||||
# (template injection, fixed in #45956) or passed the full JSON through
|
||||
# an env var (`E2BIG` / "Argument list too long" once the patch payload
|
||||
# exceeds the per-env-var kernel limit, ~128KB). Fetching inside the
|
||||
# action keeps the JSON in Node heap memory and writes it straight to
|
||||
# disk, so it never crosses an execve argv/envp boundary.
|
||||
- name: Write pr_files file
|
||||
uses: actions/github-script@d7906e4ad0b1822421a7e6a35d5ca353c962f410 # v6.4.1
|
||||
env:
|
||||
PR_NUMBER: ${{ needs.get-pr-number.outputs.PR_NUMBER }}
|
||||
with:
|
||||
script: |
|
||||
const fs = require('node:fs');
|
||||
const files = await github.paginate(github.rest.pulls.listFiles, {
|
||||
owner: context.repo.owner,
|
||||
repo: context.repo.repo,
|
||||
pull_number: parseInt(process.env.PR_NUMBER, 10),
|
||||
});
|
||||
fs.writeFileSync('pr_files.txt', JSON.stringify(files));
|
||||
|
||||
- name: Get repository content
|
||||
id: repo_content
|
||||
uses: actions/github-script@d7906e4ad0b1822421a7e6a35d5ca353c962f410 # v6.4.1
|
||||
env:
|
||||
PR_HEAD_REPO_OWNER: ${{ needs.get-pr-info.outputs.PR_HEAD_REPO_OWNER }}
|
||||
PR_HEAD_REPO_NAME: ${{ needs.get-pr-info.outputs.PR_HEAD_REPO_NAME }}
|
||||
PR_HEAD_SHA: ${{ needs.get-pr-info.outputs.PR_HEAD_SHA }}
|
||||
with:
|
||||
script: |
|
||||
const fs = require('node:fs');
|
||||
const { PR_HEAD_REPO_OWNER, PR_HEAD_REPO_NAME, PR_HEAD_SHA } = process.env;
|
||||
|
||||
const { data: tests_dir } = await github.rest.repos.getContent({
|
||||
owner: PR_HEAD_REPO_OWNER,
|
||||
repo: PR_HEAD_REPO_NAME,
|
||||
path: 'tests',
|
||||
ref: PR_HEAD_SHA,
|
||||
});
|
||||
|
||||
const { data: tests_models_dir } = await github.rest.repos.getContent({
|
||||
owner: PR_HEAD_REPO_OWNER,
|
||||
repo: PR_HEAD_REPO_NAME,
|
||||
path: 'tests/models',
|
||||
ref: PR_HEAD_SHA,
|
||||
});
|
||||
|
||||
const { data: tests_quantization_dir } = await github.rest.repos.getContent({
|
||||
owner: PR_HEAD_REPO_OWNER,
|
||||
repo: PR_HEAD_REPO_NAME,
|
||||
path: 'tests/quantization',
|
||||
ref: PR_HEAD_SHA,
|
||||
});
|
||||
|
||||
// Write to files instead of outputs
|
||||
fs.writeFileSync('tests_dir.txt', JSON.stringify(tests_dir, null, 2));
|
||||
fs.writeFileSync('tests_models_dir.txt', JSON.stringify(tests_models_dir, null, 2));
|
||||
fs.writeFileSync('tests_quantization_dir.txt', JSON.stringify(tests_quantization_dir, null, 2));
|
||||
|
||||
- name: Run script to get jobs to run
|
||||
id: get_jobs
|
||||
run: |
|
||||
python utils/get_pr_run_slow_jobs.py | tee output.txt
|
||||
echo "jobs_to_run: $(tail -n 1 output.txt)"
|
||||
echo "jobs_to_run=$(tail -n 1 output.txt)" >> $GITHUB_OUTPUT
|
||||
|
||||
send_comment:
|
||||
# Will delete the previous comment and send a new one if:
|
||||
# - either the content is changed
|
||||
# - or the previous comment is 30 minutes or more old
|
||||
name: Send a comment to suggest jobs to run
|
||||
if: ${{ needs.get-jobs.outputs.jobs != '' }}
|
||||
needs: [get-pr-number, get-jobs]
|
||||
permissions:
|
||||
pull-requests: write
|
||||
runs-on: ubuntu-22.04
|
||||
steps:
|
||||
- name: Check and update comment if needed
|
||||
uses: actions/github-script@f28e40c7f34bde8b3046d885e986cb6290c5673b # v7.1.0
|
||||
env:
|
||||
BODY: "\n\nrun-slow: ${{ needs.get-jobs.outputs.jobs }}"
|
||||
PR_NUMBER: ${{ needs.get-pr-number.outputs.PR_NUMBER }}
|
||||
with:
|
||||
script: |
|
||||
const prNumber = parseInt(process.env.PR_NUMBER, 10);
|
||||
const commentPrefix = "**[For maintainers]** Suggested jobs to run (before merge)";
|
||||
const thirtyMinutesAgo = new Date(Date.now() - 30 * 60 * 1000); // 30 minutes ago
|
||||
const newBody = `${commentPrefix}${process.env.BODY}`;
|
||||
|
||||
// Get all comments on the PR
|
||||
const { data: comments } = await github.rest.issues.listComments({
|
||||
owner: context.repo.owner,
|
||||
repo: context.repo.repo,
|
||||
issue_number: prNumber
|
||||
});
|
||||
|
||||
// Find existing comments that start with our prefix
|
||||
const existingComments = comments.filter(comment =>
|
||||
comment.user.login === 'github-actions[bot]' &&
|
||||
comment.body.startsWith(commentPrefix)
|
||||
);
|
||||
|
||||
let shouldCreateNewComment = true;
|
||||
let commentsToDelete = [];
|
||||
|
||||
if (existingComments.length > 0) {
|
||||
// Get the most recent comment
|
||||
const mostRecentComment = existingComments
|
||||
.sort((a, b) => new Date(b.created_at) - new Date(a.created_at))[0];
|
||||
|
||||
const commentDate = new Date(mostRecentComment.created_at);
|
||||
const isOld = commentDate < thirtyMinutesAgo;
|
||||
const isDifferentContent = mostRecentComment.body !== newBody;
|
||||
|
||||
console.log(`Most recent comment created: ${mostRecentComment.created_at}`);
|
||||
console.log(`Is older than 30 minutes: ${isOld}`);
|
||||
console.log(`Has different content: ${isDifferentContent}`);
|
||||
|
||||
if (isOld || isDifferentContent) {
|
||||
// Delete all existing comments and create new one
|
||||
commentsToDelete = existingComments;
|
||||
console.log(`Will delete ${commentsToDelete.length} existing comment(s) and create new one`);
|
||||
} else {
|
||||
// Content is same and comment is recent, skip
|
||||
shouldCreateNewComment = false;
|
||||
console.log('Comment is recent and content unchanged, skipping update');
|
||||
}
|
||||
} else {
|
||||
console.log('No existing comments found, will create new one');
|
||||
}
|
||||
|
||||
// Delete old comments if needed
|
||||
for (const comment of commentsToDelete) {
|
||||
console.log(`Deleting comment #${comment.id} (created: ${comment.created_at})`);
|
||||
await github.rest.issues.deleteComment({
|
||||
owner: context.repo.owner,
|
||||
repo: context.repo.repo,
|
||||
comment_id: comment.id
|
||||
});
|
||||
}
|
||||
|
||||
// Create new comment if needed
|
||||
if (shouldCreateNewComment) {
|
||||
await github.rest.issues.createComment({
|
||||
owner: context.repo.owner,
|
||||
repo: context.repo.repo,
|
||||
issue_number: prNumber,
|
||||
body: newBody
|
||||
});
|
||||
console.log('✅ New comment created');
|
||||
} else {
|
||||
console.log('ℹ️ No comment update needed');
|
||||
}
|
||||
162
.github/workflows/push-important-models.yml
vendored
Normal file
162
.github/workflows/push-important-models.yml
vendored
Normal file
@@ -0,0 +1,162 @@
|
||||
name: Slow tests on important models (on Push - A10)
|
||||
|
||||
on:
|
||||
push:
|
||||
branches: [ main ]
|
||||
|
||||
permissions:
|
||||
contents: read
|
||||
|
||||
jobs:
|
||||
get_modified_models:
|
||||
name: "Get all modified files"
|
||||
runs-on: ubuntu-latest
|
||||
outputs:
|
||||
matrix: ${{ steps.set-matrix.outputs.matrix }}
|
||||
steps:
|
||||
- name: Check out code
|
||||
uses: actions/checkout@34e114876b0b11c390a56381ad16ebd13914f8d5 # v4.3.1
|
||||
with:
|
||||
persist-credentials: false
|
||||
|
||||
- name: Get changed files using `actions/github-script`
|
||||
id: get-changed-files
|
||||
uses: actions/github-script@f28e40c7f34bde8b3046d885e986cb6290c5673b # v7.1.0
|
||||
with:
|
||||
script: |
|
||||
let files = [];
|
||||
|
||||
// Only handle push events
|
||||
if (context.eventName === 'push') {
|
||||
const afterSha = context.payload.after;
|
||||
const branchName = context.payload.ref.replace('refs/heads/', '');
|
||||
|
||||
let baseSha;
|
||||
|
||||
if (branchName === 'main') {
|
||||
console.log('Push to main branch, comparing to parent commit');
|
||||
// Get the parent commit of the pushed commit
|
||||
const { data: commit } = await github.rest.repos.getCommit({
|
||||
owner: context.repo.owner,
|
||||
repo: context.repo.repo,
|
||||
ref: afterSha
|
||||
});
|
||||
baseSha = commit.parents[0]?.sha;
|
||||
if (!baseSha) {
|
||||
throw new Error('No parent commit found for the pushed commit');
|
||||
}
|
||||
} else {
|
||||
console.log(`Push to branch ${branchName}, comparing to main`);
|
||||
baseSha = 'main';
|
||||
}
|
||||
|
||||
const { data: comparison } = await github.rest.repos.compareCommits({
|
||||
owner: context.repo.owner,
|
||||
repo: context.repo.repo,
|
||||
base: baseSha,
|
||||
head: afterSha
|
||||
});
|
||||
|
||||
// Include added, modified, and renamed files
|
||||
files = comparison.files
|
||||
.filter(file => file.status === 'added' || file.status === 'modified' || file.status === 'renamed')
|
||||
.map(file => file.filename);
|
||||
}
|
||||
|
||||
// Include all files under src/transformers/ (not just models subdirectory)
|
||||
const filteredFiles = files.filter(file =>
|
||||
file.startsWith('src/transformers/')
|
||||
);
|
||||
|
||||
core.setOutput('changed_files', filteredFiles.join(' '));
|
||||
core.setOutput('any_changed', filteredFiles.length > 0 ? 'true' : 'false');
|
||||
|
||||
- name: Parse changed files with Python
|
||||
if: steps.get-changed-files.outputs.any_changed == 'true'
|
||||
env:
|
||||
CHANGED_FILES: ${{ steps.get-changed-files.outputs.changed_files }}
|
||||
id: set-matrix
|
||||
run: |
|
||||
python3 - << 'EOF'
|
||||
import os
|
||||
import sys
|
||||
import json
|
||||
|
||||
# Add the utils directory to Python path
|
||||
sys.path.insert(0, 'utils')
|
||||
|
||||
# Import the important models list
|
||||
from important_files import IMPORTANT_MODELS
|
||||
|
||||
print(f"Important models: {IMPORTANT_MODELS}")
|
||||
|
||||
# Get the changed files from the previous step
|
||||
changed_files_str = os.environ.get('CHANGED_FILES', '')
|
||||
changed_files = changed_files_str.split() if changed_files_str else []
|
||||
|
||||
# Filter to only Python files
|
||||
python_files = [f for f in changed_files if f.endswith('.py')]
|
||||
print(f"Python files changed: {python_files}")
|
||||
|
||||
result_models = set()
|
||||
|
||||
# Specific files that trigger all models
|
||||
transformers_utils_files = [
|
||||
'modeling_utils.py',
|
||||
'modeling_rope_utils.py',
|
||||
'modeling_flash_attention_utils.py',
|
||||
'modeling_attn_mask_utils.py',
|
||||
'cache_utils.py',
|
||||
'masking_utils.py',
|
||||
'pytorch_utils.py'
|
||||
]
|
||||
|
||||
# Single loop through all Python files
|
||||
for file in python_files:
|
||||
# Check for files under src/transformers/models/
|
||||
if file.startswith('src/transformers/models/'):
|
||||
remaining_path = file[len('src/transformers/models/'):]
|
||||
if '/' in remaining_path:
|
||||
model_dir = remaining_path.split('/')[0]
|
||||
if model_dir in IMPORTANT_MODELS:
|
||||
result_models.add(model_dir)
|
||||
print(f"Added model directory: {model_dir}")
|
||||
|
||||
# Check for specific files under src/transformers/ or src/transformers/generation/ files
|
||||
elif file.startswith('src/transformers/generation/') or \
|
||||
(file.startswith('src/transformers/') and os.path.basename(file) in transformers_utils_files):
|
||||
print(f"Found core file: {file} - including all important models")
|
||||
result_models.update(IMPORTANT_MODELS)
|
||||
break # No need to continue once we include all models
|
||||
|
||||
# Convert to sorted list and create matrix
|
||||
result_list = sorted(list(result_models))
|
||||
print(f"Final model list: {result_list}")
|
||||
|
||||
if result_list:
|
||||
matrix_json = json.dumps(result_list)
|
||||
print(f"matrix={matrix_json}")
|
||||
|
||||
# Write to GITHUB_OUTPUT
|
||||
with open(os.environ['GITHUB_OUTPUT'], 'a') as f:
|
||||
f.write(f"matrix={matrix_json}\n")
|
||||
else:
|
||||
print("matrix=[]")
|
||||
with open(os.environ['GITHUB_OUTPUT'], 'a') as f:
|
||||
f.write("matrix=[]\n")
|
||||
EOF
|
||||
|
||||
model-ci:
|
||||
name: Model CI
|
||||
uses: ./.github/workflows/self-scheduled.yml
|
||||
needs: get_modified_models
|
||||
if: needs.get_modified_models.outputs.matrix != '' && needs.get_modified_models.outputs.matrix != '[]'
|
||||
with:
|
||||
job: run_models_gpu
|
||||
slack_report_channel: "#transformers-ci-push"
|
||||
docker: huggingface/transformers-all-latest-gpu:flash-attn
|
||||
ci_event: push
|
||||
report_repo_id: hf-internal-testing/transformers_ci_push
|
||||
commit_sha: ${{ github.sha }}
|
||||
subdirs: ${{ needs.get_modified_models.outputs.matrix }}
|
||||
secrets: inherit
|
||||
52
.github/workflows/release-conda.yml
vendored
Normal file
52
.github/workflows/release-conda.yml
vendored
Normal file
@@ -0,0 +1,52 @@
|
||||
name: Release - Conda
|
||||
|
||||
on:
|
||||
push:
|
||||
tags:
|
||||
- v*
|
||||
branches:
|
||||
- conda_*
|
||||
|
||||
env:
|
||||
ANACONDA_API_TOKEN: ${{ secrets.ANACONDA_API_TOKEN }}
|
||||
|
||||
permissions:
|
||||
contents: read
|
||||
|
||||
jobs:
|
||||
build_and_package:
|
||||
runs-on: ubuntu-22.04
|
||||
defaults:
|
||||
run:
|
||||
shell: bash -l {0}
|
||||
|
||||
steps:
|
||||
- name: Checkout repository
|
||||
uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
|
||||
with:
|
||||
persist-credentials: false
|
||||
|
||||
- name: Install miniconda
|
||||
uses: conda-incubator/setup-miniconda@9f54435e0e72c53962ee863144e47a4b094bfd35 # v2.3.0
|
||||
with:
|
||||
auto-update-conda: true
|
||||
auto-activate-base: false
|
||||
python-version: 3.8
|
||||
activate-environment: "build-transformers"
|
||||
channels: huggingface
|
||||
|
||||
- name: Setup conda env
|
||||
run: |
|
||||
conda install -c defaults anaconda-client conda-build
|
||||
|
||||
- name: Extract version
|
||||
run: echo "TRANSFORMERS_VERSION=`python setup.py --version`" >> $GITHUB_ENV
|
||||
|
||||
- name: Build conda packages
|
||||
run: |
|
||||
conda info
|
||||
conda list
|
||||
conda-build .github/conda
|
||||
|
||||
- name: Upload to Anaconda
|
||||
run: anaconda upload `conda-build .github/conda --output` --force
|
||||
72
.github/workflows/release.yml
vendored
Normal file
72
.github/workflows/release.yml
vendored
Normal file
@@ -0,0 +1,72 @@
|
||||
name: Release
|
||||
on:
|
||||
push:
|
||||
tags:
|
||||
- v*
|
||||
branches:
|
||||
- 'v*-release'
|
||||
|
||||
permissions:
|
||||
contents: read
|
||||
|
||||
jobs:
|
||||
build_and_test:
|
||||
name: build release
|
||||
runs-on: ubuntu-latest
|
||||
steps:
|
||||
- uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
|
||||
with:
|
||||
persist-credentials: false
|
||||
|
||||
- name: set up python
|
||||
uses: actions/setup-python@a26af69be951a213d495a4c3e4e4022e16d87065 # v5.6.0
|
||||
with:
|
||||
python-version: "3.13"
|
||||
|
||||
- run: pip install setuptools
|
||||
- run: pip install -e .
|
||||
- run: make build-release
|
||||
|
||||
- run: pip uninstall -y transformers
|
||||
- run: pip install dist/*.whl
|
||||
|
||||
- run: python -c "from transformers import *"
|
||||
|
||||
- run: pip install -e .[torch]
|
||||
- run: python -c "from transformers import pipeline; classifier = pipeline('text-classification'); assert classifier('What a nice release')[0]['score'] > 0"
|
||||
|
||||
- run: pip install twine
|
||||
- run: twine check --strict dist/*
|
||||
|
||||
- name: Upload build artifacts
|
||||
uses: actions/upload-artifact@ea165f8d65b6e75b540449e92b4886f43607fa02 # v4.6.2
|
||||
with:
|
||||
name: python-dist
|
||||
path: |
|
||||
dist/**
|
||||
build/**
|
||||
|
||||
upload_package:
|
||||
needs: build_and_test
|
||||
if: startsWith(github.ref, 'refs/tags/')
|
||||
runs-on: ubuntu-latest
|
||||
environment: pypi-release
|
||||
permissions:
|
||||
id-token: write
|
||||
|
||||
steps:
|
||||
- uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
|
||||
with:
|
||||
persist-credentials: false
|
||||
|
||||
- name: Download build artifacts
|
||||
uses: actions/download-artifact@d3f86a106a0bac45b974a628896c90dbdf5c8093 # v4.3.0
|
||||
with:
|
||||
name: python-dist
|
||||
path: .
|
||||
|
||||
- name: Publish package distributions to TestPyPI
|
||||
uses: pypa/gh-action-pypi-publish@ed0c53931b1dc9bd32cbe73a98c7f6766f8a527e # release/v1
|
||||
with:
|
||||
verbose: true
|
||||
|
||||
472
.github/workflows/self-comment-ci.yml
vendored
Normal file
472
.github/workflows/self-comment-ci.yml
vendored
Normal file
@@ -0,0 +1,472 @@
|
||||
name: PR comment GitHub CI
|
||||
|
||||
on:
|
||||
issue_comment:
|
||||
types:
|
||||
- created
|
||||
branches-ignore:
|
||||
- main
|
||||
concurrency:
|
||||
group: ${{ github.workflow }}-${{ github.event.issue.number }}-${{ startsWith(github.event.comment.body, 'run-slow') || startsWith(github.event.comment.body, 'run slow') || startsWith(github.event.comment.body, 'run_slow') }}
|
||||
cancel-in-progress: true
|
||||
permissions:
|
||||
contents: read
|
||||
|
||||
env:
|
||||
HF_HOME: /mnt/cache
|
||||
TRANSFORMERS_IS_CI: yes
|
||||
OMP_NUM_THREADS: 8
|
||||
MKL_NUM_THREADS: 8
|
||||
RUN_SLOW: yes
|
||||
# For gated repositories, we still need to agree to share information on the Hub repo. page in order to get access.
|
||||
# This token is created under the bot `hf-transformers-bot`.
|
||||
HF_TOKEN: ${{ secrets.HF_HUB_READ_TOKEN }}
|
||||
TF_FORCE_GPU_ALLOW_GROWTH: true
|
||||
CUDA_VISIBLE_DEVICES: 0,1
|
||||
|
||||
|
||||
jobs:
|
||||
get-pr-number:
|
||||
name: Get PR number
|
||||
if: ${{ github.event.issue.state == 'open' && contains(fromJSON('["ydshieh", "ArthurZucker", "zucchini-nlp", "molbap", "LysandreJik", "Cyrilvallez", "Rocketknight1", "SunMarc", "eustlb", "vasqu", "ivarflakstad", "stevhliu", "ebezzam", "remi-or", "itazap", "3outeille", "IlyasMoutawwakil", "tarekziade", "yonigozlan", "guarin"]'), github.actor) && (startsWith(github.event.comment.body, 'run-slow') || startsWith(github.event.comment.body, 'run slow') || startsWith(github.event.comment.body, 'run_slow')) }}
|
||||
uses: ./.github/workflows/get-pr-number.yml
|
||||
|
||||
get-pr-info:
|
||||
name: Get PR commit SHA
|
||||
needs: get-pr-number
|
||||
if: ${{ needs.get-pr-number.outputs.PR_NUMBER != ''}}
|
||||
uses: ./.github/workflows/get-pr-info.yml
|
||||
with:
|
||||
pr_number: ${{ needs.get-pr-number.outputs.PR_NUMBER }}
|
||||
|
||||
check-timestamps:
|
||||
name: Check timestamps (security check)
|
||||
runs-on: ubuntu-22.04
|
||||
needs: get-pr-info
|
||||
outputs:
|
||||
PR_HEAD_SHA: ${{ needs.get-pr-info.outputs.PR_HEAD_SHA }}
|
||||
PR_MERGE_SHA: ${{ needs.get-pr-info.outputs.PR_MERGE_COMMIT_SHA }}
|
||||
steps:
|
||||
- name: Verify `merge_commit` timestamp is older than the issue comment timestamp
|
||||
env:
|
||||
COMMENT_DATE: ${{ github.event.comment.created_at }}
|
||||
PR_MERGE_COMMIT_TIMESTAMP: ${{ needs.get-pr-info.outputs.PR_MERGE_COMMIT_TIMESTAMP }}
|
||||
run: |
|
||||
COMMENT_TIMESTAMP=$(date -d "${COMMENT_DATE}" +"%s")
|
||||
echo "COMMENT_DATE: $COMMENT_DATE"
|
||||
echo "COMMENT_TIMESTAMP: $COMMENT_TIMESTAMP"
|
||||
if [ $COMMENT_TIMESTAMP -le $PR_MERGE_COMMIT_TIMESTAMP ]; then
|
||||
echo "Last commit on the pull request is newer than the issue comment triggering this run! Abort!";
|
||||
exit -1;
|
||||
fi
|
||||
|
||||
# use a python script to handle this complex logic.
|
||||
get-tests:
|
||||
runs-on: ubuntu-22.04
|
||||
needs: [get-pr-number, check-timestamps]
|
||||
outputs:
|
||||
models: ${{ steps.models_to_run.outputs.models }}
|
||||
quantizations: ${{ steps.models_to_run.outputs.quantizations }}
|
||||
steps:
|
||||
- uses: actions/checkout@34e114876b0b11c390a56381ad16ebd13914f8d5 # v4.3.1
|
||||
with:
|
||||
fetch-depth: "0"
|
||||
ref: "refs/pull/${{ needs.get-pr-number.outputs.PR_NUMBER }}/merge"
|
||||
persist-credentials: false
|
||||
|
||||
- name: Verify merge commit SHA
|
||||
env:
|
||||
VERIFIED_PR_MERGE_SHA: ${{ needs.check-timestamps.outputs.PR_MERGE_SHA }}
|
||||
run: |
|
||||
PR_MERGE_SHA=$(git log -1 --format=%H)
|
||||
if [ $PR_MERGE_SHA != $VERIFIED_PR_MERGE_SHA ]; then
|
||||
echo "The merged commit SHA is not the same as the verified one! Security issue detected, abort the workflow!";
|
||||
exit -1;
|
||||
fi
|
||||
|
||||
- name: Get models to test
|
||||
env:
|
||||
PR_COMMENT: ${{ github.event.comment.body }}
|
||||
run: |
|
||||
python -m pip install GitPython
|
||||
python utils/pr_slow_ci_models.py --message "$PR_COMMENT" | tee output.txt
|
||||
echo "models=$(tail -n 1 output.txt)" >> $GITHUB_ENV
|
||||
python utils/pr_slow_ci_models.py --message "$PR_COMMENT" --quantization | tee output2.txt
|
||||
echo "quantizations=$(tail -n 1 output2.txt)" >> $GITHUB_ENV
|
||||
|
||||
- name: Show models to test
|
||||
id: models_to_run
|
||||
run: |
|
||||
echo "$models"
|
||||
echo "models=$models" >> $GITHUB_OUTPUT
|
||||
echo "$quantizations"
|
||||
echo "quantizations=$quantizations" >> $GITHUB_OUTPUT
|
||||
|
||||
# Report back if we are not able to get the tests (for example, security check is failing)
|
||||
report_error_earlier:
|
||||
name: Report error earlier
|
||||
if: ${{ always() && needs.get-pr-info.result == 'success' && needs.get-tests.result != 'success' }}
|
||||
needs: [get-pr-number, get-pr-info, get-tests]
|
||||
permissions:
|
||||
pull-requests: write
|
||||
runs-on: ubuntu-22.04
|
||||
steps:
|
||||
- name: Reply to the comment
|
||||
env:
|
||||
GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
|
||||
GITHUB_RUN_URL: https://github.com/${{ github.repository }}/actions/runs/${{ github.run_id }}
|
||||
PREFIX: '[Workflow Run ⚙️](https://github.com/${{ github.repository }}/actions/runs/${{ github.run_id }})\n\n'
|
||||
github_repository: ${{ github.repository }}
|
||||
pr_number: ${{ needs.get-pr-number.outputs.PR_NUMBER }}
|
||||
run: |
|
||||
gh api \
|
||||
--method POST \
|
||||
-H "Accept: application/vnd.github+json" \
|
||||
-H "X-GitHub-Api-Version: 2022-11-28" \
|
||||
"repos/${github_repository}/issues/${pr_number}/comments" \
|
||||
-f body="$(echo -e "$PREFIX")💔 This comment contains \`run-slow\`, but unknown error occurred and [the workflow run]($GITHUB_RUN_URL) aborted!"
|
||||
|
||||
reply_to_comment:
|
||||
name: Reply to the comment
|
||||
if: ${{ needs.get-tests.outputs.models != '[]' || needs.get-tests.outputs.quantizations != '[]' }}
|
||||
needs: [get-pr-number, get-tests]
|
||||
permissions:
|
||||
pull-requests: write
|
||||
runs-on: ubuntu-22.04
|
||||
steps:
|
||||
- name: Reply to the comment
|
||||
env:
|
||||
GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
|
||||
PREFIX: '[Workflow Run ⚙️](https://github.com/${{ github.repository }}/actions/runs/${{ github.run_id }})'
|
||||
INFO: '\n\nThis comment contains `run-slow`, running the specified jobs'
|
||||
BODY: '\n\nmodels: ${{ needs.get-tests.outputs.models }}\nquantizations: ${{ needs.get-tests.outputs.quantizations }}'
|
||||
github_repository: ${{ github.repository }}
|
||||
pr_number: ${{ needs.get-pr-number.outputs.PR_NUMBER }}
|
||||
run: |
|
||||
gh api \
|
||||
--method POST \
|
||||
-H "Accept: application/vnd.github+json" \
|
||||
-H "X-GitHub-Api-Version: 2022-11-28" \
|
||||
"repos/${github_repository}/issues/${pr_number}/comments" \
|
||||
-f body="$(echo -e "$PREFIX")$(echo -e "$INFO"): $(echo -e "$BODY")"
|
||||
|
||||
create_run:
|
||||
name: Create run
|
||||
needs: [check-timestamps, reply_to_comment]
|
||||
permissions:
|
||||
statuses: write
|
||||
runs-on: ubuntu-22.04
|
||||
steps:
|
||||
- name: Create Run
|
||||
id: create_run
|
||||
env:
|
||||
GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
|
||||
# Create a commit status (pending) for a run of this workflow. The status has to be updated later in `update_run_status`.
|
||||
# See https://docs.github.com/en/rest/commits/statuses?apiVersion=2022-11-28#create-a-commit-status
|
||||
GITHUB_RUN_URL: https://github.com/${{ github.repository }}/actions/runs/${{ github.run_id }}
|
||||
github_repository: ${{ github.repository }}
|
||||
pr_head_sha: ${{ needs.check-timestamps.outputs.PR_HEAD_SHA }}
|
||||
run: |
|
||||
gh api \
|
||||
--method POST \
|
||||
-H "Accept: application/vnd.github+json" \
|
||||
-H "X-GitHub-Api-Version: 2022-11-28" \
|
||||
"repos/${github_repository}/statuses/${pr_head_sha}" \
|
||||
-f "target_url=$GITHUB_RUN_URL" -f "state=pending" -f "description=Slow CI job" -f "context=pytest/custom-tests"
|
||||
|
||||
model-ci:
|
||||
name: Model CI
|
||||
if: ${{ needs.get-tests.outputs.models != '[]' }}
|
||||
uses: ./.github/workflows/self-scheduled.yml
|
||||
needs: [get-pr-number, check-timestamps, get-tests, create_run]
|
||||
with:
|
||||
job: run_models_gpu
|
||||
slack_report_channel: "#transformers-ci-pr"
|
||||
docker: huggingface/transformers-all-latest-gpu
|
||||
ci_event: PR Comment CI
|
||||
report_repo_id: hf-internal-testing/transformers_pr_ci
|
||||
commit_sha: ${{ needs.check-timestamps.outputs.PR_MERGE_SHA }}
|
||||
subdirs: ${{ needs.get-tests.outputs.models }}
|
||||
pr_number: ${{ needs.get-pr-number.outputs.PR_NUMBER }}
|
||||
secrets: inherit
|
||||
|
||||
quantization-ci:
|
||||
name: Quantization CI
|
||||
if: ${{ needs.get-tests.outputs.quantizations != '[]' }}
|
||||
uses: ./.github/workflows/self-scheduled.yml
|
||||
needs: [get-pr-number, check-timestamps, get-tests, create_run]
|
||||
with:
|
||||
job: run_quantization_torch_gpu
|
||||
slack_report_channel: "#transformers-ci-pr"
|
||||
docker: huggingface/transformers-quantization-latest-gpu
|
||||
ci_event: PR Comment CI
|
||||
report_repo_id: hf-internal-testing/transformers_pr_ci
|
||||
commit_sha: ${{ needs.check-timestamps.outputs.PR_MERGE_SHA }}
|
||||
subdirs: ${{ needs.get-tests.outputs.quantizations }}
|
||||
pr_number: ${{ needs.get-pr-number.outputs.PR_NUMBER }}
|
||||
secrets: inherit
|
||||
|
||||
report:
|
||||
name: Check & Report
|
||||
needs: [get-pr-number, get-pr-info, check-timestamps, create_run, model-ci, quantization-ci]
|
||||
permissions:
|
||||
pull-requests: write
|
||||
statuses: write
|
||||
if: ${{ always() && needs.create_run.result == 'success' }}
|
||||
runs-on: ubuntu-22.04
|
||||
steps:
|
||||
- uses: actions/download-artifact@d3f86a106a0bac45b974a628896c90dbdf5c8093 # v4.3.0
|
||||
with:
|
||||
pattern: new_failures_with_bad_commit_{run_models_gpu,run_quantization_torch_gpu}
|
||||
path: ./new_failures
|
||||
merge-multiple: false
|
||||
|
||||
- name: List downloaded artifacts
|
||||
run: |
|
||||
echo "Downloaded artifact files:"
|
||||
if [ -d "./new_failures/" ]; then
|
||||
find ./new_failures/ -type f
|
||||
else
|
||||
echo "No artifacts downloaded (directory doesn't exist)"
|
||||
fi
|
||||
|
||||
- name: Show reports from jobs
|
||||
run: |
|
||||
echo "=== Model CI Report ==="
|
||||
if [ -f "./new_failures/new_failures_with_bad_commit_run_models_gpu/new_failures_with_bad_commit.json" ]; then
|
||||
cat ./new_failures/new_failures_with_bad_commit_run_models_gpu/new_failures_with_bad_commit.json
|
||||
else
|
||||
echo "No model CI report found"
|
||||
fi
|
||||
|
||||
echo ""
|
||||
echo "=== Quantization CI Report ==="
|
||||
if [ -f "./new_failures/new_failures_with_bad_commit_run_quantization_torch_gpu/new_failures_with_bad_commit.json" ]; then
|
||||
cat ./new_failures/new_failures_with_bad_commit_run_quantization_torch_gpu/new_failures_with_bad_commit.json
|
||||
else
|
||||
echo "No quantization CI report found"
|
||||
fi
|
||||
|
||||
- name: Process and filter reports
|
||||
run: |
|
||||
# Preprocess with Python
|
||||
python3 << 'PYTHON_SCRIPT'
|
||||
import json
|
||||
import os
|
||||
from pathlib import Path
|
||||
|
||||
def count_failures(data):
|
||||
"""
|
||||
Count total number of failures (excluding None commits)
|
||||
"""
|
||||
total = 0
|
||||
for model, model_result in data.items():
|
||||
for device, failures in model_result.items():
|
||||
# Count failures where commit is not None
|
||||
total += sum(
|
||||
1 for failure in failures
|
||||
if isinstance(failure, dict) and failure.get('bad_commit') is not None
|
||||
)
|
||||
return total
|
||||
|
||||
def filter_and_format_report(data):
|
||||
"""
|
||||
Filter out entries where commit is `None` (failing tests who status is not certain) and format as text
|
||||
"""
|
||||
lines = []
|
||||
|
||||
for model, model_result in data.items():
|
||||
model_lines = []
|
||||
for device, failures in model_result.items():
|
||||
|
||||
filtered_failures = [
|
||||
failure for failure in failures
|
||||
if isinstance(failure, dict) and failure.get('bad_commit') is not None
|
||||
]
|
||||
|
||||
# Add tests to model lines
|
||||
for idx, failure in enumerate(filtered_failures):
|
||||
if idx == 0:
|
||||
job_link = failure['job_link']
|
||||
model_lines.append(f"- [{model}]({job_link}):")
|
||||
|
||||
test_name = failure['test']
|
||||
status = failure.get('status', '')
|
||||
if "DIFFERENT error message" in status:
|
||||
indicator = "(❌ ⟹ ❌)"
|
||||
else:
|
||||
indicator = "(✅ ⟹ ❌)"
|
||||
|
||||
model_lines.append(f" {test_name} {indicator}")
|
||||
|
||||
if len(model_lines) > 0:
|
||||
lines.extend(model_lines)
|
||||
lines.append("") # Empty line between models
|
||||
|
||||
return "\n".join(lines).strip()
|
||||
|
||||
# Read reports from downloaded artifact files
|
||||
model_report_path = Path('./new_failures/new_failures_with_bad_commit_run_models_gpu/new_failures_with_bad_commit.json')
|
||||
quant_report_path = Path('./new_failures/new_failures_with_bad_commit_run_quantization_torch_gpu/new_failures_with_bad_commit.json')
|
||||
|
||||
# Read URL files if they exist
|
||||
model_url_path = Path('./new_failures/new_failures_with_bad_commit_run_models_gpu/new_failures_with_bad_commit_url.txt')
|
||||
quant_url_path = Path('./new_failures/new_failures_with_bad_commit_run_quantization_torch_gpu/new_failures_with_bad_commit_url.txt')
|
||||
|
||||
model_url = None
|
||||
if model_url_path.exists():
|
||||
with open(model_url_path, 'r') as f:
|
||||
model_url = f.read().strip()
|
||||
|
||||
quant_url = None
|
||||
if quant_url_path.exists():
|
||||
with open(quant_url_path, 'r') as f:
|
||||
quant_url = f.read().strip()
|
||||
|
||||
model_report = {}
|
||||
if model_report_path.exists():
|
||||
with open(model_report_path, 'r') as f:
|
||||
model_report = json.load(f)
|
||||
|
||||
quant_report = {}
|
||||
if quant_report_path.exists():
|
||||
with open(quant_report_path, 'r') as f:
|
||||
quant_report = json.load(f)
|
||||
|
||||
# Count failures
|
||||
model_count = count_failures(model_report)
|
||||
quant_count = count_failures(quant_report)
|
||||
|
||||
formatted_model = filter_and_format_report(model_report)
|
||||
formatted_quant = filter_and_format_report(quant_report)
|
||||
|
||||
# Write to files
|
||||
with open('model_ci.txt', 'w') as f:
|
||||
if formatted_model:
|
||||
if model_url:
|
||||
f.write(f"❌ **[{model_count} new failed tests from this PR]({model_url})** 😭\n\n")
|
||||
else:
|
||||
f.write(f"❌ **{model_count} new failed tests from this PR** 😭\n\n")
|
||||
f.write(formatted_model)
|
||||
f.write('\n')
|
||||
|
||||
with open('quantization_ci.txt', 'w') as f:
|
||||
if formatted_quant:
|
||||
if quant_url:
|
||||
f.write(f"❌ **[{quant_count} new failed tests from this PR]({quant_url})** 😭\n\n")
|
||||
else:
|
||||
f.write(f"❌ **{quant_count} new failed tests from this PR** 😭\n\n")
|
||||
f.write(formatted_quant)
|
||||
f.write('\n')
|
||||
PYTHON_SCRIPT
|
||||
|
||||
- name: Post results as PR comment
|
||||
env:
|
||||
GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
|
||||
GITHUB_RUN_URL: https://github.com/${{ github.repository }}/actions/runs/${{ github.run_id }}
|
||||
github_repository: ${{ github.repository }}
|
||||
pr_number: ${{ needs.get-pr-number.outputs.PR_NUMBER }}
|
||||
pr_head_repo: ${{ needs.get-pr-info.outputs.PR_HEAD_REPO_FULL_NAME }}
|
||||
run_commit: ${{ needs.get-pr-info.outputs.PR_MERGE_COMMIT_SHA }}
|
||||
pr_commit: ${{ needs.get-pr-info.outputs.PR_HEAD_SHA }}
|
||||
main_commit: ${{ needs.get-pr-info.outputs.PR_MERGE_COMMIT_BASE_SHA }}
|
||||
model_ci_result: ${{ needs.model-ci.result }}
|
||||
quantization_ci_result: ${{ needs.quantization-ci.result }}
|
||||
model_infrastructure_ok: ${{ needs.model-ci.outputs.is_infrastructure_ok }}
|
||||
quant_infrastructure_ok: ${{ needs.quantization-ci.outputs.is_infrastructure_ok }}
|
||||
run: |
|
||||
# Create commit links (8 chars display, full SHA in URL)
|
||||
run_commit_link="[${run_commit:0:8}](https://github.com/${github_repository}/commit/${run_commit})"
|
||||
pr_commit_link="[${pr_commit:0:8}](https://github.com/${pr_head_repo}/commit/${pr_commit})"
|
||||
main_commit_link="[${main_commit:0:8}](https://github.com/${github_repository}/commit/${main_commit})"
|
||||
|
||||
{
|
||||
echo '## CI Results'
|
||||
echo "[Workflow Run ⚙️]($GITHUB_RUN_URL)"
|
||||
echo ''
|
||||
|
||||
echo '### Commit Info'
|
||||
echo "| Context | Commit | Description |"
|
||||
echo "|---------|--------|-------------|"
|
||||
echo "| RUN | ${run_commit_link} | workflow commit (merge commit) |"
|
||||
echo "| PR | ${pr_commit_link} | branch commit (from PR) |"
|
||||
echo "| main | ${main_commit_link} | base commit (on \`main\`) |"
|
||||
echo ''
|
||||
|
||||
# Check infrastructure health - simple and clean!
|
||||
infrastructure_error=false
|
||||
|
||||
# Model CI
|
||||
if [[ "$model_ci_result" != "skipped" && "$model_ci_result" != "cancelled" ]]; then
|
||||
if [[ "$model_infrastructure_ok" == "false" ]]; then
|
||||
echo "⚠️ **Model CI failed to report results**"
|
||||
echo ''
|
||||
infrastructure_error=true
|
||||
fi
|
||||
fi
|
||||
|
||||
# Quantization CI
|
||||
if [[ "$quantization_ci_result" != "skipped" && "$quantization_ci_result" != "cancelled" ]]; then
|
||||
if [[ "$quant_infrastructure_ok" == "false" ]]; then
|
||||
echo "⚠️ **Quantization CI failed to report results**"
|
||||
echo ''
|
||||
infrastructure_error=true
|
||||
fi
|
||||
fi
|
||||
|
||||
if [[ "$infrastructure_error" == "true" ]]; then
|
||||
echo 'The test failure analysis could not be completed. Please check the [workflow run]('$GITHUB_RUN_URL') for details.'
|
||||
echo "STATUS=error" >> $GITHUB_ENV
|
||||
|
||||
# Check if both jobs were skipped or cancelled
|
||||
elif [[ "$model_ci_result" == "skipped" || "$model_ci_result" == "cancelled" ]] && \
|
||||
[[ "$quantization_ci_result" == "skipped" || "$quantization_ci_result" == "cancelled" ]]; then
|
||||
echo '⚠️ No test being reported (jobs are skipped or cancelled)!'
|
||||
echo "STATUS=error" >> $GITHUB_ENV
|
||||
|
||||
# Check if either file has content
|
||||
elif [ -s model_ci.txt ] || [ -s quantization_ci.txt ]; then
|
||||
echo "STATUS=failure" >> $GITHUB_ENV
|
||||
|
||||
# Check if model_ci.txt has content
|
||||
if [ -s model_ci.txt ]; then
|
||||
echo '### Model CI Report'
|
||||
echo ''
|
||||
cat model_ci.txt
|
||||
echo ''
|
||||
fi
|
||||
|
||||
# Check if quantization_ci.txt has content
|
||||
if [ -s quantization_ci.txt ]; then
|
||||
echo '### Quantization CI Report'
|
||||
echo ''
|
||||
cat quantization_ci.txt
|
||||
echo ''
|
||||
fi
|
||||
else
|
||||
echo "STATUS=success" >> $GITHUB_ENV
|
||||
echo '✅ No failing test specific to this PR 🎉 👏 !'
|
||||
fi
|
||||
} > comment_body.txt
|
||||
|
||||
gh api \
|
||||
--method POST \
|
||||
-H "Accept: application/vnd.github+json" \
|
||||
-H "X-GitHub-Api-Version: 2022-11-28" \
|
||||
"repos/${github_repository}/issues/${pr_number}/comments" \
|
||||
-F body=@comment_body.txt
|
||||
|
||||
- name: Update PR commit statuses
|
||||
env:
|
||||
GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
|
||||
GITHUB_RUN_URL: https://github.com/${{ github.repository }}/actions/runs/${{ github.run_id }}
|
||||
github_repository: ${{ github.repository }}
|
||||
pr_head_sha: ${{ needs.check-timestamps.outputs.PR_HEAD_SHA }}
|
||||
# The env. variable `STATUS` used here is set in the previous step
|
||||
run: |
|
||||
gh api \
|
||||
--method POST \
|
||||
-H "Accept: application/vnd.github+json" \
|
||||
-H "X-GitHub-Api-Version: 2022-11-28" \
|
||||
"repos/${github_repository}/statuses/${pr_head_sha}" \
|
||||
-f "target_url=$GITHUB_RUN_URL" -f "state=$STATUS" -f "description=Slow CI job" -f "context=pytest/custom-tests"
|
||||
63
.github/workflows/self-nightly-caller.yml
vendored
Normal file
63
.github/workflows/self-nightly-caller.yml
vendored
Normal file
@@ -0,0 +1,63 @@
|
||||
name: Nvidia CI with nightly torch
|
||||
|
||||
on:
|
||||
repository_dispatch:
|
||||
# triggered when the daily scheduled Nvidia CI is completed.
|
||||
# This way, we can compare the results more easily.
|
||||
workflow_run:
|
||||
workflows: ["Nvidia CI"]
|
||||
branches: ["main"]
|
||||
types: [completed]
|
||||
push:
|
||||
branches:
|
||||
- run_ci_with_nightly_torch*
|
||||
|
||||
# Used for `push` to easily modify the target workflow runs to compare against
|
||||
env:
|
||||
prev_workflow_run_id: ""
|
||||
other_workflow_run_id: ""
|
||||
|
||||
|
||||
permissions:
|
||||
contents: read
|
||||
|
||||
jobs:
|
||||
build_nightly_torch_ci_images:
|
||||
name: Build CI Docker Images with nightly torch
|
||||
uses: ./.github/workflows/build-nightly-ci-docker-images.yml
|
||||
with:
|
||||
job: latest-with-torch-nightly-docker
|
||||
secrets: inherit
|
||||
|
||||
setup:
|
||||
name: Setup
|
||||
runs-on: ubuntu-22.04
|
||||
steps:
|
||||
- name: Setup
|
||||
env:
|
||||
PREV_WORKFLOW_RUN_ID: ${{ inputs.prev_workflow_run_id || env.prev_workflow_run_id }}
|
||||
OTHER_WORKFLOW_RUN_ID: ${{ inputs.other_workflow_run_id || env.other_workflow_run_id }}
|
||||
run: |
|
||||
mkdir "setup_values"
|
||||
echo "$PREV_WORKFLOW_RUN_ID" > "setup_values/prev_workflow_run_id.txt"
|
||||
echo "$OTHER_WORKFLOW_RUN_ID" > "setup_values/other_workflow_run_id.txt"
|
||||
|
||||
- name: Upload artifacts
|
||||
uses: actions/upload-artifact@ea165f8d65b6e75b540449e92b4886f43607fa02 # v4.6.2
|
||||
with:
|
||||
name: setup_values
|
||||
path: setup_values
|
||||
|
||||
model-ci:
|
||||
name: Model CI
|
||||
needs: build_nightly_torch_ci_images
|
||||
uses: ./.github/workflows/self-scheduled.yml
|
||||
with:
|
||||
job: run_models_gpu
|
||||
slack_report_channel: "#transformers-ci-past-future"
|
||||
docker: huggingface/transformers-all-latest-torch-nightly-gpu
|
||||
ci_event: Nightly CI
|
||||
runner_type: "a10"
|
||||
report_repo_id: hf-internal-testing/transformers_daily_ci_with_torch_nightly
|
||||
commit_sha: ${{ github.event.workflow_run.head_sha || github.sha }}
|
||||
secrets: inherit
|
||||
102
.github/workflows/self-nightly-past-ci-caller.yml
vendored
Normal file
102
.github/workflows/self-nightly-past-ci-caller.yml
vendored
Normal file
@@ -0,0 +1,102 @@
|
||||
name: Self-hosted runner (nightly-past-ci-caller)
|
||||
|
||||
on:
|
||||
schedule:
|
||||
- cron: "17 2,14 * * *"
|
||||
push:
|
||||
branches:
|
||||
- run_past_ci*
|
||||
|
||||
permissions:
|
||||
contents: read
|
||||
|
||||
jobs:
|
||||
get_number:
|
||||
name: Get number
|
||||
runs-on: ubuntu-22.04
|
||||
outputs:
|
||||
run_number: ${{ steps.get_number.outputs.run_number }}
|
||||
steps:
|
||||
- name: Get number
|
||||
id: get_number
|
||||
run: |
|
||||
echo "${{ github.run_number }}"
|
||||
echo "$(python3 -c 'print(int(${{ github.run_number }}) % 10)')"
|
||||
echo "run_number=$(python3 -c 'print(int(${{ github.run_number }}) % 10)')" >> $GITHUB_OUTPUT
|
||||
|
||||
run_past_ci_tensorflow_2-11:
|
||||
name: TensorFlow 2.11
|
||||
needs: get_number
|
||||
if: needs.get_number.outputs.run_number == 3 && (cancelled() != true) && ((github.event_name == 'push') && startsWith(github.ref_name, 'run_past_ci'))
|
||||
uses: ./.github/workflows/self-past-caller.yml
|
||||
with:
|
||||
framework: tensorflow
|
||||
version: "2.11"
|
||||
sha: ${{ github.sha }}
|
||||
secrets: inherit
|
||||
|
||||
run_past_ci_tensorflow_2-10:
|
||||
name: TensorFlow 2.10
|
||||
needs: get_number
|
||||
if: needs.get_number.outputs.run_number == 4 && (cancelled() != true) && ((github.event_name == 'push') && startsWith(github.ref_name, 'run_past_ci'))
|
||||
uses: ./.github/workflows/self-past-caller.yml
|
||||
with:
|
||||
framework: tensorflow
|
||||
version: "2.10"
|
||||
sha: ${{ github.sha }}
|
||||
secrets: inherit
|
||||
|
||||
run_past_ci_tensorflow_2-9:
|
||||
name: TensorFlow 2.9
|
||||
needs: get_number
|
||||
if: needs.get_number.outputs.run_number == 5 && (cancelled() != true) && ((github.event_name == 'push') && startsWith(github.ref_name, 'run_past_ci'))
|
||||
uses: ./.github/workflows/self-past-caller.yml
|
||||
with:
|
||||
framework: tensorflow
|
||||
version: "2.9"
|
||||
sha: ${{ github.sha }}
|
||||
secrets: inherit
|
||||
|
||||
run_past_ci_tensorflow_2-8:
|
||||
name: TensorFlow 2.8
|
||||
needs: get_number
|
||||
if: needs.get_number.outputs.run_number == 6 && (cancelled() != true) && ((github.event_name == 'push') && startsWith(github.ref_name, 'run_past_ci'))
|
||||
uses: ./.github/workflows/self-past-caller.yml
|
||||
with:
|
||||
framework: tensorflow
|
||||
version: "2.8"
|
||||
sha: ${{ github.sha }}
|
||||
secrets: inherit
|
||||
|
||||
run_past_ci_tensorflow_2-7:
|
||||
name: TensorFlow 2.7
|
||||
needs: get_number
|
||||
if: needs.get_number.outputs.run_number == 7 && (cancelled() != true) && ((github.event_name == 'push') && startsWith(github.ref_name, 'run_past_ci'))
|
||||
uses: ./.github/workflows/self-past-caller.yml
|
||||
with:
|
||||
framework: tensorflow
|
||||
version: "2.7"
|
||||
sha: ${{ github.sha }}
|
||||
secrets: inherit
|
||||
|
||||
run_past_ci_tensorflow_2-6:
|
||||
name: TensorFlow 2.6
|
||||
needs: get_number
|
||||
if: needs.get_number.outputs.run_number == 8 && (cancelled() != true) && ((github.event_name == 'push') && startsWith(github.ref_name, 'run_past_ci'))
|
||||
uses: ./.github/workflows/self-past-caller.yml
|
||||
with:
|
||||
framework: tensorflow
|
||||
version: "2.6"
|
||||
sha: ${{ github.sha }}
|
||||
secrets: inherit
|
||||
|
||||
run_past_ci_tensorflow_2-5:
|
||||
name: TensorFlow 2.5
|
||||
needs: get_number
|
||||
if: needs.get_number.outputs.run_number == 9 && (cancelled() != true) && ((github.event_name == 'push') && startsWith(github.ref_name, 'run_past_ci'))
|
||||
uses: ./.github/workflows/self-past-caller.yml
|
||||
with:
|
||||
framework: tensorflow
|
||||
version: "2.5"
|
||||
sha: ${{ github.sha }}
|
||||
secrets: inherit
|
||||
43
.github/workflows/self-past-caller.yml
vendored
Normal file
43
.github/workflows/self-past-caller.yml
vendored
Normal file
@@ -0,0 +1,43 @@
|
||||
name: Self-hosted runner (past-ci)
|
||||
|
||||
|
||||
on:
|
||||
workflow_call:
|
||||
inputs:
|
||||
framework:
|
||||
required: true
|
||||
type: string
|
||||
version:
|
||||
required: true
|
||||
type: string
|
||||
# Use this to control the commit to test against
|
||||
sha:
|
||||
default: 'main'
|
||||
required: false
|
||||
type: string
|
||||
|
||||
permissions:
|
||||
contents: read
|
||||
|
||||
jobs:
|
||||
model-ci:
|
||||
name: Model CI
|
||||
uses: ./.github/workflows/self-scheduled.yml
|
||||
with:
|
||||
job: run_models_gpu
|
||||
slack_report_channel: "#transformers-ci-past-future"
|
||||
runner: past-ci
|
||||
docker: huggingface/transformers-${{ inputs.framework }}-past-${{ inputs.version }}-gpu
|
||||
ci_event: Past CI - ${{ inputs.framework }}-${{ inputs.version }}
|
||||
secrets: inherit
|
||||
|
||||
deepspeed-ci:
|
||||
name: DeepSpeed CI
|
||||
uses: ./.github/workflows/self-scheduled.yml
|
||||
with:
|
||||
job: run_torch_cuda_extensions_gpu
|
||||
slack_report_channel: "#transformers-ci-past-future"
|
||||
runner: past-ci
|
||||
docker: huggingface/transformers-${{ inputs.framework }}-past-${{ inputs.version }}-gpu
|
||||
ci_event: Past CI - ${{ inputs.framework }}-${{ inputs.version }}
|
||||
secrets: inherit
|
||||
17
.github/workflows/self-scheduled-amd-caller.yml
vendored
Normal file
17
.github/workflows/self-scheduled-amd-caller.yml
vendored
Normal file
@@ -0,0 +1,17 @@
|
||||
name: Self-hosted runner (AMD scheduled CI caller)
|
||||
|
||||
on:
|
||||
schedule:
|
||||
- cron: "17 5 * * *"
|
||||
|
||||
permissions:
|
||||
contents: read
|
||||
|
||||
jobs:
|
||||
run_scheduled_amd_ci:
|
||||
name: Trigger Scheduled AMD CI
|
||||
runs-on: ubuntu-22.04
|
||||
if: ${{ always() }}
|
||||
steps:
|
||||
- name: Trigger scheduled AMD CI via workflow_run
|
||||
run: echo "Trigger scheduled AMD CI via workflow_run"
|
||||
62
.github/workflows/self-scheduled-amd-mi250-caller.yml
vendored
Normal file
62
.github/workflows/self-scheduled-amd-mi250-caller.yml
vendored
Normal file
@@ -0,0 +1,62 @@
|
||||
name: Self-hosted runner (AMD mi250 scheduled CI caller)
|
||||
|
||||
on:
|
||||
workflow_run:
|
||||
workflows: ["Self-hosted runner (AMD scheduled CI caller)"]
|
||||
branches: ["main"]
|
||||
types: [completed]
|
||||
push:
|
||||
branches:
|
||||
- run_amd_scheduled_ci_caller*
|
||||
|
||||
permissions:
|
||||
contents: read
|
||||
|
||||
jobs:
|
||||
model-ci:
|
||||
name: Model CI
|
||||
uses: huggingface/hf-workflows/.github/workflows/transformers_amd_ci_scheduled.yaml@63657f571a92cc9759159442936061c51d6d9ae4 # main
|
||||
with:
|
||||
job: run_models_gpu
|
||||
slack_report_channel: "#transformers-ci-daily-amd"
|
||||
runner: mi250
|
||||
docker: huggingface/transformers-pytorch-amd-gpu
|
||||
ci_event: Scheduled CI (AMD) - mi250
|
||||
report_repo_id: optimum-amd/transformers_daily_ci
|
||||
secrets: inherit
|
||||
|
||||
torch-pipeline:
|
||||
name: Torch pipeline CI
|
||||
uses: huggingface/hf-workflows/.github/workflows/transformers_amd_ci_scheduled.yaml@63657f571a92cc9759159442936061c51d6d9ae4 # main
|
||||
with:
|
||||
job: run_pipelines_torch_gpu
|
||||
slack_report_channel: "#transformers-ci-daily-amd"
|
||||
runner: mi250
|
||||
docker: huggingface/transformers-pytorch-amd-gpu
|
||||
ci_event: Scheduled CI (AMD) - mi250
|
||||
report_repo_id: optimum-amd/transformers_daily_ci
|
||||
secrets: inherit
|
||||
|
||||
example-ci:
|
||||
name: Example CI
|
||||
uses: huggingface/hf-workflows/.github/workflows/transformers_amd_ci_scheduled.yaml@63657f571a92cc9759159442936061c51d6d9ae4 # main
|
||||
with:
|
||||
job: run_examples_gpu
|
||||
slack_report_channel: "#transformers-ci-daily-amd"
|
||||
runner: mi250
|
||||
docker: huggingface/transformers-pytorch-amd-gpu
|
||||
ci_event: Scheduled CI (AMD) - mi250
|
||||
report_repo_id: optimum-amd/transformers_daily_ci
|
||||
secrets: inherit
|
||||
|
||||
deepspeed-ci:
|
||||
name: DeepSpeed CI
|
||||
uses: huggingface/hf-workflows/.github/workflows/transformers_amd_ci_scheduled.yaml@63657f571a92cc9759159442936061c51d6d9ae4 # main
|
||||
with:
|
||||
job: run_torch_cuda_extensions_gpu
|
||||
slack_report_channel: "#transformers-ci-daily-amd"
|
||||
runner: mi250
|
||||
docker: huggingface/transformers-pytorch-deepspeed-amd-gpu
|
||||
ci_event: Scheduled CI (AMD) - mi250
|
||||
report_repo_id: optimum-amd/transformers_daily_ci
|
||||
secrets: inherit
|
||||
72
.github/workflows/self-scheduled-amd-mi325-caller.yml
vendored
Normal file
72
.github/workflows/self-scheduled-amd-mi325-caller.yml
vendored
Normal file
@@ -0,0 +1,72 @@
|
||||
name: Self-hosted runner scale set (AMD mi325 scheduled CI caller)
|
||||
|
||||
# Note: For every job in this workflow, the name of the runner scale set is finalized in the runner yaml i.e. huggingface/hf-workflows/.github/workflows/transformers_amd_ci_scheduled_arc_scale_set.yaml
|
||||
# For example, 1gpu scale set: amd-mi325-ci-1gpu
|
||||
# 2gpu scale set: amd-mi325-ci-2gpu
|
||||
# Important: Do not pin the reusable workflow ref to a SHA. AMD runner groups only route jobs for
|
||||
# workflows referenced at @main; a pinned SHA causes jobs to wait for a runner until the 24h timeout.
|
||||
|
||||
on:
|
||||
workflow_run:
|
||||
workflows: ["Self-hosted runner (AMD scheduled CI caller)"]
|
||||
branches: ["main"]
|
||||
types: [completed]
|
||||
push:
|
||||
branches:
|
||||
- run_amd_scheduled_ci_caller*
|
||||
|
||||
permissions:
|
||||
contents: read
|
||||
|
||||
jobs:
|
||||
model-ci:
|
||||
name: Model CI
|
||||
uses: huggingface/hf-workflows/.github/workflows/transformers_amd_ci_scheduled_arc_scale_set.yaml@main
|
||||
with:
|
||||
job: run_models_gpu
|
||||
slack_report_channel: "#amd-hf-ci"
|
||||
runner_group: amd-mi325
|
||||
docker: huggingface/transformers-pytorch-amd-gpu
|
||||
ci_event: Scheduled CI (AMD) - mi325
|
||||
report_repo_id: optimum-amd/transformers_daily_ci
|
||||
env_file: /etc/podinfo/gha-gpu-isolation-settings
|
||||
secrets: inherit
|
||||
|
||||
torch-pipeline:
|
||||
name: Torch pipeline CI
|
||||
uses: huggingface/hf-workflows/.github/workflows/transformers_amd_ci_scheduled_arc_scale_set.yaml@main
|
||||
with:
|
||||
job: run_pipelines_torch_gpu
|
||||
slack_report_channel: "#amd-hf-ci"
|
||||
runner_group: amd-mi325
|
||||
docker: huggingface/transformers-pytorch-amd-gpu
|
||||
ci_event: Scheduled CI (AMD) - mi325
|
||||
report_repo_id: optimum-amd/transformers_daily_ci
|
||||
env_file: /etc/podinfo/gha-gpu-isolation-settings
|
||||
secrets: inherit
|
||||
|
||||
example-ci:
|
||||
name: Example CI
|
||||
uses: huggingface/hf-workflows/.github/workflows/transformers_amd_ci_scheduled_arc_scale_set.yaml@main
|
||||
with:
|
||||
job: run_examples_gpu
|
||||
slack_report_channel: "#amd-hf-ci"
|
||||
runner_group: amd-mi325
|
||||
docker: huggingface/transformers-pytorch-amd-gpu
|
||||
ci_event: Scheduled CI (AMD) - mi325
|
||||
report_repo_id: optimum-amd/transformers_daily_ci
|
||||
env_file: /etc/podinfo/gha-gpu-isolation-settings
|
||||
secrets: inherit
|
||||
|
||||
deepspeed-ci:
|
||||
name: DeepSpeed CI
|
||||
uses: huggingface/hf-workflows/.github/workflows/transformers_amd_ci_scheduled_arc_scale_set.yaml@main
|
||||
with:
|
||||
job: run_torch_cuda_extensions_gpu
|
||||
slack_report_channel: "#amd-hf-ci"
|
||||
runner_group: amd-mi325
|
||||
docker: huggingface/transformers-pytorch-deepspeed-amd-gpu
|
||||
ci_event: Scheduled CI (AMD) - mi325
|
||||
report_repo_id: optimum-amd/transformers_daily_ci
|
||||
env_file: /etc/podinfo/gha-gpu-isolation-settings
|
||||
secrets: inherit
|
||||
66
.github/workflows/self-scheduled-amd-mi355-caller.yml
vendored
Normal file
66
.github/workflows/self-scheduled-amd-mi355-caller.yml
vendored
Normal file
@@ -0,0 +1,66 @@
|
||||
name: Self-hosted runner scale set (AMD mi355 scheduled CI caller)
|
||||
|
||||
# Note: For every job in this workflow, the name of the runner scale set is finalized in the runner yaml i.e. huggingface/hf-workflows/.github/workflows/transformers_amd_ci_scheduled_arc_scale_set.yaml
|
||||
# For example, 1gpu : amd-mi355-ci-1gpu
|
||||
# 2gpu : amd-mi355-ci-2gpu
|
||||
|
||||
on:
|
||||
workflow_run:
|
||||
workflows: ["Self-hosted runner (AMD scheduled CI caller)"]
|
||||
branches: ["main"]
|
||||
types: [completed]
|
||||
push:
|
||||
branches:
|
||||
- run_amd_scheduled_ci_caller*
|
||||
|
||||
permissions:
|
||||
contents: read
|
||||
|
||||
jobs:
|
||||
model-ci:
|
||||
name: Model CI
|
||||
uses: huggingface/hf-workflows/.github/workflows/transformers_amd_ci_scheduled_arc_scale_set.yaml@63657f571a92cc9759159442936061c51d6d9ae4 # main
|
||||
with:
|
||||
job: run_models_gpu
|
||||
slack_report_channel: "#amd-hf-ci"
|
||||
runner_group: hfc-amd-mi355
|
||||
docker: huggingface/transformers-pytorch-amd-gpu
|
||||
ci_event: Scheduled CI (AMD) - mi355
|
||||
report_repo_id: hf-transformers-bot/transformers-ci-dummy
|
||||
secrets: inherit
|
||||
|
||||
torch-pipeline:
|
||||
name: Torch pipeline CI
|
||||
uses: huggingface/hf-workflows/.github/workflows/transformers_amd_ci_scheduled_arc_scale_set.yaml@63657f571a92cc9759159442936061c51d6d9ae4 # main
|
||||
with:
|
||||
job: run_pipelines_torch_gpu
|
||||
slack_report_channel: "#amd-hf-ci"
|
||||
runner_group: hfc-amd-mi355
|
||||
docker: huggingface/transformers-pytorch-amd-gpu
|
||||
ci_event: Scheduled CI (AMD) - mi355
|
||||
report_repo_id: hf-transformers-bot/transformers-ci-dummy
|
||||
secrets: inherit
|
||||
|
||||
example-ci:
|
||||
name: Example CI
|
||||
uses: huggingface/hf-workflows/.github/workflows/transformers_amd_ci_scheduled_arc_scale_set.yaml@63657f571a92cc9759159442936061c51d6d9ae4 # main
|
||||
with:
|
||||
job: run_examples_gpu
|
||||
slack_report_channel: "#amd-hf-ci"
|
||||
runner_group: hfc-amd-mi355
|
||||
docker: huggingface/transformers-pytorch-amd-gpu
|
||||
ci_event: Scheduled CI (AMD) - mi355
|
||||
report_repo_id: hf-transformers-bot/transformers-ci-dummy
|
||||
secrets: inherit
|
||||
|
||||
deepspeed-ci:
|
||||
name: DeepSpeed CI
|
||||
uses: huggingface/hf-workflows/.github/workflows/transformers_amd_ci_scheduled_arc_scale_set.yaml@63657f571a92cc9759159442936061c51d6d9ae4 # main
|
||||
with:
|
||||
job: run_torch_cuda_extensions_gpu
|
||||
slack_report_channel: "#amd-hf-ci"
|
||||
runner_group: hfc-amd-mi355
|
||||
docker: huggingface/testing-rocm7.0-preview
|
||||
ci_event: Scheduled CI (AMD) - mi355
|
||||
report_repo_id: hf-transformers-bot/transformers-ci-dummy
|
||||
secrets: inherit
|
||||
138
.github/workflows/self-scheduled-caller.yml
vendored
Normal file
138
.github/workflows/self-scheduled-caller.yml
vendored
Normal file
@@ -0,0 +1,138 @@
|
||||
name: Nvidia CI
|
||||
|
||||
on:
|
||||
repository_dispatch:
|
||||
schedule:
|
||||
- cron: "17 2 * * *"
|
||||
push:
|
||||
branches:
|
||||
- run_nvidia_ci*
|
||||
workflow_dispatch:
|
||||
inputs:
|
||||
prev_workflow_run_id:
|
||||
description: 'previous workflow run id to compare'
|
||||
type: string
|
||||
required: false
|
||||
default: ""
|
||||
other_workflow_run_id:
|
||||
description: 'other workflow run id to compare'
|
||||
type: string
|
||||
required: false
|
||||
default: ""
|
||||
|
||||
|
||||
# Used for `push` to easily modify the target workflow runs to compare against
|
||||
env:
|
||||
prev_workflow_run_id: ""
|
||||
other_workflow_run_id: ""
|
||||
|
||||
|
||||
permissions:
|
||||
contents: read
|
||||
|
||||
jobs:
|
||||
setup:
|
||||
name: Setup
|
||||
runs-on: ubuntu-22.04
|
||||
steps:
|
||||
- name: Setup
|
||||
env:
|
||||
prev_workflow_run_id: ${{ inputs.prev_workflow_run_id || env.prev_workflow_run_id }}
|
||||
other_workflow_run_id: ${{ inputs.other_workflow_run_id || env.other_workflow_run_id }}
|
||||
run: |
|
||||
mkdir "setup_values"
|
||||
echo "$prev_workflow_run_id" > "setup_values/prev_workflow_run_id.txt"
|
||||
echo "$other_workflow_run_id" > "setup_values/other_workflow_run_id.txt"
|
||||
|
||||
- name: Upload artifacts
|
||||
uses: actions/upload-artifact@ea165f8d65b6e75b540449e92b4886f43607fa02 # v4.6.2
|
||||
with:
|
||||
name: setup_values
|
||||
path: setup_values
|
||||
|
||||
model-ci:
|
||||
name: Model CI
|
||||
uses: ./.github/workflows/self-scheduled.yml
|
||||
with:
|
||||
job: run_models_gpu
|
||||
slack_report_channel: "#transformers-ci-daily-models"
|
||||
docker: huggingface/transformers-all-latest-gpu
|
||||
ci_event: Daily CI
|
||||
runner_type: "a10"
|
||||
report_repo_id: hf-internal-testing/transformers_daily_ci
|
||||
commit_sha: ${{ github.sha }}
|
||||
secrets: inherit
|
||||
|
||||
torch-pipeline:
|
||||
name: Torch pipeline CI
|
||||
uses: ./.github/workflows/self-scheduled.yml
|
||||
with:
|
||||
job: run_pipelines_torch_gpu
|
||||
slack_report_channel: "#transformers-ci-daily-pipeline-torch"
|
||||
docker: huggingface/transformers-all-latest-gpu
|
||||
ci_event: Daily CI
|
||||
report_repo_id: hf-internal-testing/transformers_daily_ci
|
||||
commit_sha: ${{ github.sha }}
|
||||
secrets: inherit
|
||||
|
||||
example-ci:
|
||||
name: Example CI
|
||||
uses: ./.github/workflows/self-scheduled.yml
|
||||
with:
|
||||
job: run_examples_gpu
|
||||
slack_report_channel: "#transformers-ci-daily-examples"
|
||||
docker: huggingface/transformers-all-latest-gpu
|
||||
ci_event: Daily CI
|
||||
report_repo_id: hf-internal-testing/transformers_daily_ci
|
||||
commit_sha: ${{ github.sha }}
|
||||
secrets: inherit
|
||||
|
||||
trainer-fsdp-ci:
|
||||
name: Trainer/FSDP CI
|
||||
uses: ./.github/workflows/self-scheduled.yml
|
||||
with:
|
||||
job: run_trainer_and_fsdp_gpu
|
||||
slack_report_channel: "#transformers-ci-daily-training"
|
||||
docker: huggingface/transformers-all-latest-gpu
|
||||
runner_type: "a10"
|
||||
ci_event: Daily CI
|
||||
report_repo_id: hf-internal-testing/transformers_daily_ci
|
||||
commit_sha: ${{ github.sha }}
|
||||
secrets: inherit
|
||||
|
||||
deepspeed-ci:
|
||||
name: DeepSpeed CI
|
||||
uses: ./.github/workflows/self-scheduled.yml
|
||||
with:
|
||||
job: run_torch_cuda_extensions_gpu
|
||||
slack_report_channel: "#transformers-ci-daily-training"
|
||||
docker: huggingface/transformers-pytorch-deepspeed-latest-gpu
|
||||
ci_event: Daily CI
|
||||
working-directory-prefix: /workspace
|
||||
report_repo_id: hf-internal-testing/transformers_daily_ci
|
||||
commit_sha: ${{ github.sha }}
|
||||
secrets: inherit
|
||||
|
||||
quantization-ci:
|
||||
name: Quantization CI
|
||||
uses: ./.github/workflows/self-scheduled.yml
|
||||
with:
|
||||
job: run_quantization_torch_gpu
|
||||
slack_report_channel: "#transformers-ci-daily-quantization"
|
||||
docker: huggingface/transformers-quantization-latest-gpu
|
||||
ci_event: Daily CI
|
||||
report_repo_id: hf-internal-testing/transformers_daily_ci
|
||||
commit_sha: ${{ github.sha }}
|
||||
secrets: inherit
|
||||
|
||||
kernels-ci:
|
||||
name: Kernels CI
|
||||
uses: ./.github/workflows/self-scheduled.yml
|
||||
with:
|
||||
job: run_kernels_gpu
|
||||
slack_report_channel: "#transformers-ci-daily-kernels"
|
||||
docker: huggingface/transformers-all-latest-gpu
|
||||
ci_event: Daily CI
|
||||
report_repo_id: hf-internal-testing/transformers_daily_ci
|
||||
commit_sha: ${{ github.sha }}
|
||||
secrets: inherit
|
||||
66
.github/workflows/self-scheduled-flash-attn-caller.yml
vendored
Normal file
66
.github/workflows/self-scheduled-flash-attn-caller.yml
vendored
Normal file
@@ -0,0 +1,66 @@
|
||||
name: Nvidia CI - Flash Attn
|
||||
|
||||
on:
|
||||
repository_dispatch:
|
||||
schedule:
|
||||
- cron: "17 2 * * *"
|
||||
push:
|
||||
branches:
|
||||
- run_nvidia_ci_flash_attn*
|
||||
workflow_dispatch:
|
||||
inputs:
|
||||
prev_workflow_run_id:
|
||||
description: 'previous workflow run id to compare'
|
||||
type: string
|
||||
required: false
|
||||
default: ""
|
||||
other_workflow_run_id:
|
||||
description: 'other workflow run id to compare'
|
||||
type: string
|
||||
required: false
|
||||
default: ""
|
||||
|
||||
|
||||
# Used for `push` to easily modify the target workflow runs to compare against
|
||||
env:
|
||||
prev_workflow_run_id: ""
|
||||
other_workflow_run_id: ""
|
||||
|
||||
|
||||
permissions:
|
||||
contents: read
|
||||
|
||||
jobs:
|
||||
setup:
|
||||
name: Setup
|
||||
runs-on: ubuntu-22.04
|
||||
steps:
|
||||
- name: Setup
|
||||
env:
|
||||
PREV_WORKFLOW_RUN_ID: ${{ inputs.prev_workflow_run_id || env.prev_workflow_run_id }}
|
||||
OTHER_WORKFLOW_RUN_ID: ${{ inputs.other_workflow_run_id || env.other_workflow_run_id }}
|
||||
run: |
|
||||
mkdir "setup_values"
|
||||
echo "$PREV_WORKFLOW_RUN_ID" > "setup_values/prev_workflow_run_id.txt"
|
||||
echo "$OTHER_WORKFLOW_RUN_ID" > "setup_values/other_workflow_run_id.txt"
|
||||
|
||||
- name: Upload artifacts
|
||||
uses: actions/upload-artifact@ea165f8d65b6e75b540449e92b4886f43607fa02 # v4.6.2
|
||||
with:
|
||||
name: setup_values
|
||||
path: setup_values
|
||||
|
||||
|
||||
model-ci:
|
||||
name: Model CI
|
||||
uses: ./.github/workflows/self-scheduled.yml
|
||||
with:
|
||||
job: run_models_gpu
|
||||
slack_report_channel: "#transformers-ci-flash-attn"
|
||||
docker: huggingface/transformers-all-latest-gpu:flash-attn
|
||||
ci_event: Daily CI
|
||||
runner_type: "a10"
|
||||
report_repo_id: hf-internal-testing/transformers_flash_attn_ci
|
||||
commit_sha: ${{ github.sha }}
|
||||
pytest_marker: "flash_attn_test or flash_attn_3_test or flash_attn_4_test or all_flash_attn_test"
|
||||
secrets: inherit
|
||||
350
.github/workflows/self-scheduled-intel-gaudi.yml
vendored
Normal file
350
.github/workflows/self-scheduled-intel-gaudi.yml
vendored
Normal file
@@ -0,0 +1,350 @@
|
||||
name: Self-hosted runner (scheduled-intel-gaudi)
|
||||
|
||||
on:
|
||||
workflow_call:
|
||||
inputs:
|
||||
job:
|
||||
required: true
|
||||
type: string
|
||||
slack_report_channel:
|
||||
required: true
|
||||
type: string
|
||||
runner_scale_set:
|
||||
required: true
|
||||
type: string
|
||||
ci_event:
|
||||
required: true
|
||||
type: string
|
||||
report_repo_id:
|
||||
required: true
|
||||
type: string
|
||||
|
||||
env:
|
||||
NUM_SLICES: 2
|
||||
RUN_SLOW: yes
|
||||
PT_HPU_LAZY_MODE: 0
|
||||
TRANSFORMERS_IS_CI: yes
|
||||
PT_ENABLE_INT64_SUPPORT: 1
|
||||
HF_TOKEN: ${{ secrets.HF_HUB_READ_TOKEN }}
|
||||
HF_HOME: /mnt/cache/.cache/huggingface
|
||||
|
||||
permissions:
|
||||
contents: read
|
||||
|
||||
jobs:
|
||||
setup:
|
||||
if: contains(fromJSON('["run_models_gpu", "run_trainer_and_fsdp_gpu"]'), inputs.job)
|
||||
name: Setup
|
||||
runs-on: ubuntu-latest
|
||||
outputs:
|
||||
slice_ids: ${{ steps.set-matrix.outputs.slice_ids }}
|
||||
folder_slices: ${{ steps.set-matrix.outputs.folder_slices }}
|
||||
quantization_matrix: ${{ steps.set-matrix.outputs.quantization_matrix }}
|
||||
steps:
|
||||
- name: Checkout
|
||||
uses: actions/checkout@34e114876b0b11c390a56381ad16ebd13914f8d5 # v4.3.1
|
||||
with:
|
||||
fetch-depth: 0
|
||||
persist-credentials: false
|
||||
|
||||
- name: Set up Python
|
||||
uses: actions/setup-python@a26af69be951a213d495a4c3e4e4022e16d87065 # v5.6.0
|
||||
with:
|
||||
python-version: "3.10"
|
||||
|
||||
- id: set-matrix
|
||||
if: contains(fromJSON('["run_models_gpu", "run_trainer_and_fsdp_gpu"]'), inputs.job)
|
||||
name: Identify models to test
|
||||
working-directory: tests
|
||||
env:
|
||||
JOB: ${{ inputs.job }}
|
||||
run: |
|
||||
if [ "$JOB" = "run_models_gpu" ]; then
|
||||
echo "folder_slices=$(python3 ../utils/split_model_tests.py --num_splits "$NUM_SLICES")" >> "$GITHUB_OUTPUT"
|
||||
echo "slice_ids=$(python3 -c 'import os, sys; print(list(range(int(os.environ[\"NUM_SLICES\"]))))')" >> "$GITHUB_OUTPUT"
|
||||
elif [ "$JOB" = "run_trainer_and_fsdp_gpu" ]; then
|
||||
echo "folder_slices=[['trainer'], ['fsdp']]" >> "$GITHUB_OUTPUT"
|
||||
echo "slice_ids=[0, 1]" >> "$GITHUB_OUTPUT"
|
||||
fi
|
||||
|
||||
- id: set-matrix-quantization
|
||||
if: ${{ inputs.job == 'run_quantization_torch_gpu' }}
|
||||
name: Identify quantization method to test
|
||||
working-directory: tests
|
||||
run: |
|
||||
echo "quantization_matrix=$(python3 -c 'import os; tests = os.getcwd(); quantization_tests = os.listdir(os.path.join(tests, "quantization")); d = sorted(list(filter(os.path.isdir, [f"quantization/{x}" for x in quantization_tests]))) ; print(d)')" >> $GITHUB_OUTPUT
|
||||
|
||||
run_models_gpu:
|
||||
if: ${{ inputs.job == 'run_models_gpu' }}
|
||||
name: " "
|
||||
needs: setup
|
||||
strategy:
|
||||
fail-fast: false
|
||||
matrix:
|
||||
machine_type: [1gaudi, 2gaudi]
|
||||
slice_id: ${{ fromJSON(needs.setup.outputs.slice_ids) }}
|
||||
uses: ./.github/workflows/model_jobs_intel_gaudi.yml
|
||||
with:
|
||||
slice_id: ${{ matrix.slice_id }}
|
||||
machine_type: ${{ matrix.machine_type }}
|
||||
folder_slices: ${{ needs.setup.outputs.folder_slices }}
|
||||
runner: ${{ inputs.runner_scale_set }}-${{ matrix.machine_type }}
|
||||
secrets: inherit
|
||||
|
||||
run_trainer_and_fsdp_gpu:
|
||||
if: ${{ inputs.job == 'run_trainer_and_fsdp_gpu' }}
|
||||
name: " "
|
||||
needs: setup
|
||||
strategy:
|
||||
fail-fast: false
|
||||
matrix:
|
||||
machine_type: [1gaudi, 2gaudi]
|
||||
slice_id: ${{ fromJSON(needs.setup.outputs.slice_ids) }}
|
||||
uses: ./.github/workflows/model_jobs_intel_gaudi.yml
|
||||
with:
|
||||
slice_id: ${{ matrix.slice_id }}
|
||||
machine_type: ${{ matrix.machine_type }}
|
||||
folder_slices: ${{ needs.setup.outputs.folder_slices }}
|
||||
runner: ${{ inputs.runner_scale_set }}-${{ matrix.machine_type }}
|
||||
report_name_prefix: run_trainer_and_fsdp_gpu
|
||||
secrets: inherit
|
||||
|
||||
run_pipelines_torch_gpu:
|
||||
if: ${{ inputs.job == 'run_pipelines_torch_gpu' }}
|
||||
name: Pipelines
|
||||
strategy:
|
||||
fail-fast: false
|
||||
matrix:
|
||||
machine_type: [1gaudi, 2gaudi]
|
||||
runs-on:
|
||||
group: ${{ inputs.runner_scale_set }}-${{ matrix.machine_type }}
|
||||
container:
|
||||
image: vault.habana.ai/gaudi-docker/1.21.1/ubuntu22.04/habanalabs/pytorch-installer-2.6.0:latest
|
||||
options: --runtime=habana
|
||||
-v /mnt/cache/.cache/huggingface:/mnt/cache/.cache/huggingface
|
||||
--env OMPI_MCA_btl_vader_single_copy_mechanism=none
|
||||
--env HABANA_VISIBLE_DEVICES
|
||||
--env HABANA_VISIBLE_MODULES
|
||||
--cap-add=sys_nice
|
||||
--shm-size=64G
|
||||
steps:
|
||||
- name: Checkout
|
||||
uses: actions/checkout@34e114876b0b11c390a56381ad16ebd13914f8d5 # v4.3.1
|
||||
with:
|
||||
fetch-depth: 0
|
||||
persist-credentials: false
|
||||
|
||||
- name: Install dependencies
|
||||
run: |
|
||||
pip install -e .[testing,torch] "numpy<2.0.0" scipy scikit-learn librosa soundfile
|
||||
|
||||
- name: HL-SMI
|
||||
run: |
|
||||
hl-smi
|
||||
echo "HABANA_VISIBLE_DEVICES=${HABANA_VISIBLE_DEVICES}"
|
||||
echo "HABANA_VISIBLE_MODULES=${HABANA_VISIBLE_MODULES}"
|
||||
|
||||
- name: Environment
|
||||
run: python3 utils/print_env.py
|
||||
|
||||
- name: Show installed libraries and their versions
|
||||
run: pip freeze
|
||||
|
||||
- name: Set `machine_type` for report and artifact names
|
||||
shell: bash
|
||||
run: |
|
||||
if [ "${{ matrix.machine_type }}" = "1gaudi" ]; then
|
||||
machine_type=single-gpu
|
||||
elif [ "${{ matrix.machine_type }}" = "2gaudi" ]; then
|
||||
machine_type=multi-gpu
|
||||
else
|
||||
machine_type=${{ matrix.machine_type }}
|
||||
fi
|
||||
echo "machine_type=$machine_type" >> $GITHUB_ENV
|
||||
|
||||
- name: Run all pipeline tests on Intel Gaudi
|
||||
run: |
|
||||
python3 -m pytest -v --make-reports="${machine_type}_run_pipelines_torch_gpu_test_reports" tests/pipelines -m "not not_device_test"
|
||||
|
||||
- name: Failure short reports
|
||||
if: ${{ failure() }}
|
||||
continue-on-error: true
|
||||
run: |
|
||||
cat "reports/${machine_type}_run_pipelines_torch_gpu_test_reports/failures_short.txt"
|
||||
|
||||
- name: "Test suite reports artifacts: ${{ env.machine_type }}_run_pipelines_torch_gpu_test_reports"
|
||||
if: ${{ always() }}
|
||||
uses: actions/upload-artifact@ea165f8d65b6e75b540449e92b4886f43607fa02 # v4.6.2
|
||||
with:
|
||||
name: ${{ env.machine_type }}_run_pipelines_torch_gpu_test_reports
|
||||
path: reports/${{ env.machine_type }}_run_pipelines_torch_gpu_test_reports
|
||||
|
||||
run_examples_gpu:
|
||||
if: ${{ inputs.job == 'run_examples_gpu' }}
|
||||
name: Examples directory
|
||||
strategy:
|
||||
fail-fast: false
|
||||
matrix:
|
||||
machine_type: [1gaudi]
|
||||
runs-on:
|
||||
group: ${{ inputs.runner_scale_set }}-${{ matrix.machine_type }}
|
||||
container:
|
||||
image: vault.habana.ai/gaudi-docker/1.21.1/ubuntu22.04/habanalabs/pytorch-installer-2.6.0:latest
|
||||
options: --runtime=habana
|
||||
-v /mnt/cache/.cache/huggingface:/mnt/cache/.cache/huggingface
|
||||
--env OMPI_MCA_btl_vader_single_copy_mechanism=none
|
||||
--env HABANA_VISIBLE_DEVICES
|
||||
--env HABANA_VISIBLE_MODULES
|
||||
--cap-add=sys_nice
|
||||
--shm-size=64G
|
||||
steps:
|
||||
- name: Checkout
|
||||
uses: actions/checkout@34e114876b0b11c390a56381ad16ebd13914f8d5 # v4.3.1
|
||||
with:
|
||||
fetch-depth: 0
|
||||
persist-credentials: false
|
||||
|
||||
- name: Install dependencies
|
||||
run: |
|
||||
pip install -e .[testing,torch] "numpy<2.0.0" scipy scikit-learn librosa soundfile
|
||||
|
||||
- name: HL-SMI
|
||||
run: |
|
||||
hl-smi
|
||||
echo "HABANA_VISIBLE_DEVICES=${HABANA_VISIBLE_DEVICES}"
|
||||
echo "HABANA_VISIBLE_MODULES=${HABANA_VISIBLE_MODULES}"
|
||||
|
||||
- name: Environment
|
||||
run: |
|
||||
python3 utils/print_env.py
|
||||
|
||||
- name: Show installed libraries and their versions
|
||||
run: |
|
||||
pip freeze
|
||||
|
||||
- name: Set `machine_type` for report and artifact names
|
||||
shell: bash
|
||||
run: |
|
||||
if [ "${{ matrix.machine_type }}" = "1gaudi" ]; then
|
||||
machine_type=single-gpu
|
||||
elif [ "${{ matrix.machine_type }}" = "2gaudi" ]; then
|
||||
machine_type=multi-gpu
|
||||
else
|
||||
machine_type=${{ matrix.machine_type }}
|
||||
fi
|
||||
echo "machine_type=$machine_type" >> $GITHUB_ENV
|
||||
|
||||
- name: Run examples tests on Intel Gaudi
|
||||
run: |
|
||||
pip install -r examples/pytorch/_tests_requirements.txt
|
||||
python3 -m pytest -v --make-reports="${machine_type}_run_examples_gpu_test_reports" examples/pytorch -m "not not_device_test"
|
||||
|
||||
- name: Failure short reports
|
||||
if: ${{ failure() }}
|
||||
continue-on-error: true
|
||||
run: |
|
||||
cat "reports/${machine_type}_run_examples_gpu_test_reports/failures_short.txt"
|
||||
|
||||
- name: "Test suite reports artifacts: ${{ env.machine_type }}_run_examples_gpu_test_reports"
|
||||
if: ${{ always() }}
|
||||
uses: actions/upload-artifact@ea165f8d65b6e75b540449e92b4886f43607fa02 # v4.6.2
|
||||
with:
|
||||
name: ${{ env.machine_type }}_run_examples_gpu_test_reports
|
||||
path: reports/${{ env.machine_type }}_run_examples_gpu_test_reports
|
||||
|
||||
run_torch_cuda_extensions_gpu:
|
||||
if: ${{ inputs.job == 'run_torch_cuda_extensions_gpu' }}
|
||||
name: Intel Gaudi deepspeed tests
|
||||
strategy:
|
||||
fail-fast: false
|
||||
matrix:
|
||||
machine_type: [1gaudi, 2gaudi]
|
||||
runs-on:
|
||||
group: ${{ inputs.runner_scale_set }}-${{ matrix.machine_type }}
|
||||
container:
|
||||
image: vault.habana.ai/gaudi-docker/1.21.1/ubuntu22.04/habanalabs/pytorch-installer-2.6.0:latest
|
||||
options: --runtime=habana
|
||||
-v /mnt/cache/.cache/huggingface:/mnt/cache/.cache/huggingface
|
||||
--env OMPI_MCA_btl_vader_single_copy_mechanism=none
|
||||
--env HABANA_VISIBLE_DEVICES
|
||||
--env HABANA_VISIBLE_MODULES
|
||||
--cap-add=sys_nice
|
||||
--shm-size=64G
|
||||
steps:
|
||||
- name: Checkout
|
||||
uses: actions/checkout@34e114876b0b11c390a56381ad16ebd13914f8d5 # v4.3.1
|
||||
with:
|
||||
fetch-depth: 0
|
||||
persist-credentials: false
|
||||
|
||||
- name: Install dependencies
|
||||
run: |
|
||||
pip install -e .[testing,torch] "numpy<2.0.0" scipy scikit-learn librosa soundfile
|
||||
pip install git+https://github.com/HabanaAI/DeepSpeed.git@1.20.0
|
||||
|
||||
- name: HL-SMI
|
||||
run: |
|
||||
hl-smi
|
||||
echo "HABANA_VISIBLE_DEVICES=${HABANA_VISIBLE_DEVICES}"
|
||||
echo "HABANA_VISIBLE_MODULES=${HABANA_VISIBLE_MODULES}"
|
||||
|
||||
- name: Environment
|
||||
run: |
|
||||
python3 utils/print_env.py
|
||||
|
||||
- name: Show installed libraries and their versions
|
||||
run: |
|
||||
pip freeze
|
||||
|
||||
- name: Set `machine_type` for report and artifact names
|
||||
shell: bash
|
||||
run: |
|
||||
if [ "${{ matrix.machine_type }}" = "1gaudi" ]; then
|
||||
machine_type=single-gpu
|
||||
elif [ "${{ matrix.machine_type }}" = "2gaudi" ]; then
|
||||
machine_type=multi-gpu
|
||||
else
|
||||
machine_type=${{ matrix.machine_type }}
|
||||
fi
|
||||
echo "machine_type=$machine_type" >> $GITHUB_ENV
|
||||
|
||||
- name: Run all deepspeed tests on intel Gaudi
|
||||
run: |
|
||||
python3 -m pytest -v --make-reports="${machine_type}_run_torch_cuda_extensions_gpu_test_reports" tests/deepspeed -m "not not_device_test"
|
||||
|
||||
- name: Failure short reports
|
||||
if: ${{ failure() }}
|
||||
continue-on-error: true
|
||||
run: |
|
||||
cat "reports/${machine_type}_run_torch_cuda_extensions_gpu_test_reports/failures_short.txt"
|
||||
|
||||
- name: "Test suite reports artifacts: ${{ env.machine_type }}_run_torch_cuda_extensions_gpu_test_reports"
|
||||
if: ${{ always() }}
|
||||
uses: actions/upload-artifact@ea165f8d65b6e75b540449e92b4886f43607fa02 # v4.6.2
|
||||
with:
|
||||
name: ${{ env.machine_type }}_run_torch_cuda_extensions_gpu_test_reports
|
||||
path: reports/${{ env.machine_type }}_run_torch_cuda_extensions_gpu_test_reports
|
||||
|
||||
send_results:
|
||||
name: Slack Report
|
||||
needs:
|
||||
[
|
||||
setup,
|
||||
run_models_gpu,
|
||||
run_examples_gpu,
|
||||
run_torch_cuda_extensions_gpu,
|
||||
run_pipelines_torch_gpu,
|
||||
run_trainer_and_fsdp_gpu,
|
||||
]
|
||||
if: ${{ always() }}
|
||||
uses: ./.github/workflows/slack-report.yml
|
||||
with:
|
||||
job: ${{ inputs.job }}
|
||||
setup_status: ${{ needs.setup.result }}
|
||||
slack_report_channel: ${{ inputs.slack_report_channel }}
|
||||
quantization_matrix: ${{ needs.setup.outputs.quantization_matrix }}
|
||||
folder_slices: ${{ needs.setup.outputs.folder_slices }}
|
||||
report_repo_id: ${{ inputs.report_repo_id }}
|
||||
ci_event: ${{ inputs.ci_event }}
|
||||
|
||||
secrets: inherit
|
||||
70
.github/workflows/self-scheduled-intel-gaudi3-caller.yml
vendored
Normal file
70
.github/workflows/self-scheduled-intel-gaudi3-caller.yml
vendored
Normal file
@@ -0,0 +1,70 @@
|
||||
name: Self-hosted runner (Intel Gaudi3 scheduled CI caller)
|
||||
|
||||
on:
|
||||
repository_dispatch:
|
||||
workflow_dispatch:
|
||||
schedule:
|
||||
- cron: "17 2 * * *"
|
||||
|
||||
permissions:
|
||||
contents: read
|
||||
|
||||
jobs:
|
||||
model-ci:
|
||||
name: Model CI
|
||||
uses: ./.github/workflows/self-scheduled-intel-gaudi.yml
|
||||
with:
|
||||
job: run_models_gpu
|
||||
ci_event: Scheduled CI (Intel) - Gaudi3
|
||||
runner_scale_set: itac-bm-emr-gaudi3-dell
|
||||
slack_report_channel: "#transformers-ci-daily-intel-gaudi3"
|
||||
report_repo_id: optimum-intel/transformers_daily_ci_intel_gaudi3
|
||||
|
||||
secrets: inherit
|
||||
|
||||
pipeline-ci:
|
||||
name: Pipeline CI
|
||||
uses: ./.github/workflows/self-scheduled-intel-gaudi.yml
|
||||
with:
|
||||
job: run_pipelines_torch_gpu
|
||||
ci_event: Scheduled CI (Intel) - Gaudi3
|
||||
runner_scale_set: itac-bm-emr-gaudi3-dell
|
||||
slack_report_channel: "#transformers-ci-daily-intel-gaudi3"
|
||||
report_repo_id: optimum-intel/transformers_daily_ci_intel_gaudi3
|
||||
|
||||
secrets: inherit
|
||||
|
||||
example-ci:
|
||||
name: Example CI
|
||||
uses: ./.github/workflows/self-scheduled-intel-gaudi.yml
|
||||
with:
|
||||
job: run_examples_gpu
|
||||
ci_event: Scheduled CI (Intel) - Gaudi3
|
||||
runner_scale_set: itac-bm-emr-gaudi3-dell
|
||||
slack_report_channel: "#transformers-ci-daily-intel-gaudi3"
|
||||
report_repo_id: optimum-intel/transformers_daily_ci_intel_gaudi3
|
||||
|
||||
secrets: inherit
|
||||
|
||||
deepspeed-ci:
|
||||
name: DeepSpeed CI
|
||||
uses: ./.github/workflows/self-scheduled-intel-gaudi.yml
|
||||
with:
|
||||
job: run_torch_cuda_extensions_gpu
|
||||
ci_event: Scheduled CI (Intel) - Gaudi3
|
||||
runner_scale_set: itac-bm-emr-gaudi3-dell
|
||||
slack_report_channel: "#transformers-ci-daily-intel-gaudi3"
|
||||
report_repo_id: optimum-intel/transformers_daily_ci_intel_gaudi3
|
||||
|
||||
secrets: inherit
|
||||
|
||||
trainer-fsdp-ci:
|
||||
name: Trainer/FSDP CI
|
||||
uses: ./.github/workflows/self-scheduled-intel-gaudi.yml
|
||||
with:
|
||||
job: run_trainer_and_fsdp_gpu
|
||||
ci_event: Scheduled CI (Intel) - Gaudi3
|
||||
runner_scale_set: itac-bm-emr-gaudi3-dell
|
||||
slack_report_channel: "#transformers-ci-daily-intel-gaudi3"
|
||||
report_repo_id: optimum-intel/transformers_daily_ci_intel_gaudi3
|
||||
secrets: inherit
|
||||
678
.github/workflows/self-scheduled.yml
vendored
Normal file
678
.github/workflows/self-scheduled.yml
vendored
Normal file
@@ -0,0 +1,678 @@
|
||||
name: Nvidia CI (job definitions)
|
||||
|
||||
# Note that each job's dependencies go into a corresponding docker file.
|
||||
#
|
||||
# For example for `run_torch_cuda_extensions_gpu` the docker image is
|
||||
# `huggingface/transformers-pytorch-deepspeed-latest-gpu`, which can be found at
|
||||
# `docker/transformers-pytorch-deepspeed-latest-gpu/Dockerfile`
|
||||
|
||||
on:
|
||||
workflow_call:
|
||||
inputs:
|
||||
job:
|
||||
required: true
|
||||
type: string
|
||||
slack_report_channel:
|
||||
required: true
|
||||
type: string
|
||||
docker:
|
||||
required: true
|
||||
type: string
|
||||
ci_event:
|
||||
required: true
|
||||
type: string
|
||||
working-directory-prefix:
|
||||
default: ''
|
||||
required: false
|
||||
type: string
|
||||
report_repo_id:
|
||||
required: true
|
||||
type: string
|
||||
commit_sha:
|
||||
required: false
|
||||
type: string
|
||||
runner_type:
|
||||
required: false
|
||||
type: string
|
||||
subdirs:
|
||||
default: ""
|
||||
required: false
|
||||
type: string
|
||||
pytest_marker:
|
||||
required: false
|
||||
type: string
|
||||
pr_number:
|
||||
required: false
|
||||
type: string
|
||||
outputs:
|
||||
is_infrastructure_ok:
|
||||
description: "Whether the CI infrastructure (slack reporting and failure checking) succeeded"
|
||||
value: ${{ jobs.send_results.outputs.is_slack_reporting_job_ok == 'true' && jobs.check_new_failures.outputs.is_check_failures_ok == 'true' }}
|
||||
|
||||
|
||||
env:
|
||||
HF_HOME: /mnt/cache
|
||||
TRANSFORMERS_IS_CI: yes
|
||||
OMP_NUM_THREADS: 8
|
||||
MKL_NUM_THREADS: 8
|
||||
RUN_SLOW: yes
|
||||
# For gated repositories, we still need to agree to share information on the Hub repo. page in order to get access.
|
||||
# This token is created under the bot `hf-transformers-bot`.
|
||||
HF_TOKEN: ${{ secrets.HF_HUB_READ_TOKEN }}
|
||||
TF_FORCE_GPU_ALLOW_GROWTH: true
|
||||
CUDA_VISIBLE_DEVICES: 0,1
|
||||
|
||||
permissions:
|
||||
contents: read
|
||||
|
||||
jobs:
|
||||
setup:
|
||||
name: Setup
|
||||
if: contains(fromJSON('["run_models_gpu", "run_trainer_and_fsdp_gpu", "run_quantization_torch_gpu"]'), inputs.job)
|
||||
strategy:
|
||||
matrix:
|
||||
machine_type: [aws-g5-4xlarge-cache, aws-g5-12xlarge-cache]
|
||||
runs-on:
|
||||
group: '${{ matrix.machine_type }}'
|
||||
container:
|
||||
image: huggingface/transformers-all-latest-gpu
|
||||
options: --gpus all --shm-size "16gb" --ipc host -v /mnt/cache/.cache/huggingface:/mnt/cache/
|
||||
outputs:
|
||||
folder_slices: ${{ steps.set-matrix.outputs.folder_slices }}
|
||||
slice_ids: ${{ steps.set-matrix.outputs.slice_ids }}
|
||||
quantization_matrix: ${{ steps.set-matrix-quantization.outputs.quantization_matrix }}
|
||||
steps:
|
||||
- name: Update clone
|
||||
working-directory: /transformers
|
||||
env:
|
||||
commit_sha: ${{ inputs.commit_sha || github.sha }}
|
||||
run: |
|
||||
git fetch origin $commit_sha
|
||||
git fetch && git checkout $commit_sha
|
||||
|
||||
- name: Cleanup
|
||||
working-directory: /transformers
|
||||
run: |
|
||||
rm -rf tests/__pycache__
|
||||
rm -rf tests/models/__pycache__
|
||||
rm -rf reports
|
||||
|
||||
- name: Show installed libraries and their versions
|
||||
working-directory: /transformers
|
||||
run: pip freeze
|
||||
|
||||
- id: set-matrix
|
||||
if: contains(fromJSON('["run_models_gpu", "run_trainer_and_fsdp_gpu"]'), inputs.job)
|
||||
name: Identify models to test
|
||||
working-directory: /transformers/tests
|
||||
env:
|
||||
job: ${{ inputs.job }}
|
||||
subdirs: ${{ inputs.subdirs }}
|
||||
NUM_SLICES: 2
|
||||
run: |
|
||||
if [ "$job" = "run_models_gpu" ]; then
|
||||
python3 ../utils/split_model_tests.py --subdirs "$subdirs" --num_splits "$NUM_SLICES" > folder_slices.txt
|
||||
echo "folder_slices=$(cat folder_slices.txt)" >> $GITHUB_OUTPUT
|
||||
python3 -c "import ast; folder_slices = ast.literal_eval(open('folder_slices.txt').read()); open('slice_ids.txt', 'w').write(str(list(range(len(folder_slices)))))"
|
||||
echo "slice_ids=$(cat slice_ids.txt)" >> $GITHUB_OUTPUT
|
||||
elif [ "$job" = "run_trainer_and_fsdp_gpu" ]; then
|
||||
echo "folder_slices=[['trainer'], ['fsdp'], ['ddp']]" >> $GITHUB_OUTPUT
|
||||
echo "slice_ids=[0, 1, 2]" >> $GITHUB_OUTPUT
|
||||
fi
|
||||
|
||||
- id: set-matrix-quantization
|
||||
if: ${{ inputs.job == 'run_quantization_torch_gpu' }}
|
||||
name: Identify quantization method to test
|
||||
working-directory: /transformers/tests
|
||||
env:
|
||||
subdirs: ${{ inputs.subdirs || 'None' }}
|
||||
run: |
|
||||
echo "quantization_matrix=$(python3 -c 'import ast; import os; tests = os.getcwd(); quantization_tests = os.listdir(os.path.join(tests, "quantization")); subdirs = ast.literal_eval(os.environ["subdirs"]); quantization_tests = [x.removeprefix("quantization/") for x in subdirs] if subdirs is not None else quantization_tests; d = sorted(list(filter(os.path.isdir, [f"quantization/{x}" for x in quantization_tests]))); print(d)')" >> $GITHUB_OUTPUT
|
||||
|
||||
- name: NVIDIA-SMI
|
||||
run: |
|
||||
nvidia-smi
|
||||
|
||||
run_models_gpu:
|
||||
if: ${{ inputs.job == 'run_models_gpu' }}
|
||||
name: " "
|
||||
needs: setup
|
||||
strategy:
|
||||
fail-fast: false
|
||||
matrix:
|
||||
machine_type: [aws-g5-4xlarge-cache, aws-g5-12xlarge-cache]
|
||||
slice_id: ${{ fromJSON(needs.setup.outputs.slice_ids) }}
|
||||
uses: ./.github/workflows/model_jobs.yml
|
||||
with:
|
||||
folder_slices: ${{ needs.setup.outputs.folder_slices }}
|
||||
machine_type: ${{ matrix.machine_type }}
|
||||
slice_id: ${{ matrix.slice_id }}
|
||||
docker: ${{ inputs.docker }}
|
||||
commit_sha: ${{ inputs.commit_sha || github.sha }}
|
||||
runner_type: ${{ inputs.runner_type }}
|
||||
report_repo_id: ${{ inputs.report_repo_id }}
|
||||
pytest_marker: ${{ inputs.pytest_marker }}
|
||||
secrets: inherit
|
||||
|
||||
run_trainer_and_fsdp_gpu:
|
||||
if: ${{ inputs.job == 'run_trainer_and_fsdp_gpu' }}
|
||||
name: " "
|
||||
needs: setup
|
||||
strategy:
|
||||
fail-fast: false
|
||||
matrix:
|
||||
machine_type: [aws-g5-4xlarge-cache, aws-g5-12xlarge-cache]
|
||||
slice_id: [0, 1, 2]
|
||||
uses: ./.github/workflows/model_jobs.yml
|
||||
with:
|
||||
folder_slices: ${{ needs.setup.outputs.folder_slices }}
|
||||
machine_type: ${{ matrix.machine_type }}
|
||||
slice_id: ${{ matrix.slice_id }}
|
||||
docker: ${{ inputs.docker }}
|
||||
commit_sha: ${{ inputs.commit_sha || github.sha }}
|
||||
runner_type: ${{ inputs.runner_type }}
|
||||
report_repo_id: ${{ inputs.report_repo_id }}
|
||||
report_name_prefix: run_trainer_and_fsdp_gpu
|
||||
secrets: inherit
|
||||
|
||||
run_pipelines_torch_gpu:
|
||||
if: ${{ inputs.job == 'run_pipelines_torch_gpu' }}
|
||||
name: PyTorch pipelines
|
||||
strategy:
|
||||
fail-fast: false
|
||||
matrix:
|
||||
machine_type: [aws-g5-4xlarge-cache, aws-g5-12xlarge-cache]
|
||||
runs-on:
|
||||
group: '${{ matrix.machine_type }}'
|
||||
container:
|
||||
image: huggingface/transformers-all-latest-gpu
|
||||
options: --gpus all --shm-size "16gb" --ipc host -v /mnt/cache/.cache/huggingface:/mnt/cache/
|
||||
steps:
|
||||
- name: Update clone
|
||||
working-directory: /transformers
|
||||
env:
|
||||
commit_sha: ${{ inputs.commit_sha || github.sha }}
|
||||
run: git fetch && git checkout "$commit_sha"
|
||||
|
||||
- name: Reinstall transformers in edit mode (remove the one installed during docker image build)
|
||||
working-directory: /transformers
|
||||
run: python3 -m pip uninstall -y transformers && python3 -m pip install -e .
|
||||
|
||||
- name: NVIDIA-SMI
|
||||
run: |
|
||||
nvidia-smi
|
||||
|
||||
- name: Environment
|
||||
working-directory: /transformers
|
||||
run: |
|
||||
python3 utils/print_env.py
|
||||
|
||||
- name: Show installed libraries and their versions
|
||||
working-directory: /transformers
|
||||
run: pip freeze
|
||||
|
||||
- name: Set `machine_type` for report and artifact names
|
||||
working-directory: /transformers
|
||||
shell: bash
|
||||
env:
|
||||
matrix_machine_type: ${{ matrix.machine_type }}
|
||||
run: |
|
||||
echo "$matrix_machine_type"
|
||||
|
||||
if [ "$matrix_machine_type" = "aws-g5-4xlarge-cache" ]; then
|
||||
machine_type=single-gpu
|
||||
elif [ "$matrix_machine_type" = "aws-g5-12xlarge-cache" ]; then
|
||||
machine_type=multi-gpu
|
||||
else
|
||||
machine_type="$matrix_machine_type"
|
||||
fi
|
||||
|
||||
echo "$machine_type"
|
||||
echo "machine_type=$machine_type" >> $GITHUB_ENV
|
||||
|
||||
- name: Run all pipeline tests on GPU
|
||||
working-directory: /transformers
|
||||
run: |
|
||||
python3 -m pytest -n 1 -v --dist=loadfile --make-reports="${machine_type}_run_pipelines_torch_gpu_test_reports" tests/pipelines
|
||||
|
||||
- name: Failure short reports
|
||||
if: ${{ failure() }}
|
||||
continue-on-error: true
|
||||
run: cat "/transformers/reports/${machine_type}_run_pipelines_torch_gpu_test_reports/failures_short.txt"
|
||||
|
||||
- name: "Test suite reports artifacts: ${{ env.machine_type }}_run_pipelines_torch_gpu_test_reports"
|
||||
if: ${{ always() }}
|
||||
uses: actions/upload-artifact@ea165f8d65b6e75b540449e92b4886f43607fa02 # v4.6.2
|
||||
with:
|
||||
name: ${{ env.machine_type }}_run_pipelines_torch_gpu_test_reports
|
||||
path: /transformers/reports/${{ env.machine_type }}_run_pipelines_torch_gpu_test_reports
|
||||
|
||||
run_examples_gpu:
|
||||
if: ${{ inputs.job == 'run_examples_gpu' }}
|
||||
name: Examples directory
|
||||
strategy:
|
||||
fail-fast: false
|
||||
matrix:
|
||||
machine_type: [aws-g5-4xlarge-cache]
|
||||
runs-on:
|
||||
group: '${{ matrix.machine_type }}'
|
||||
container:
|
||||
image: huggingface/transformers-all-latest-gpu
|
||||
options: --gpus all --shm-size "16gb" --ipc host -v /mnt/cache/.cache/huggingface:/mnt/cache/
|
||||
steps:
|
||||
- name: Update clone
|
||||
working-directory: /transformers
|
||||
env:
|
||||
commit_sha: ${{ inputs.commit_sha || github.sha }}
|
||||
run: git fetch && git checkout "$commit_sha"
|
||||
|
||||
- name: Reinstall transformers in edit mode (remove the one installed during docker image build)
|
||||
working-directory: /transformers
|
||||
run: python3 -m pip uninstall -y transformers && python3 -m pip install -e .
|
||||
|
||||
- name: NVIDIA-SMI
|
||||
run: |
|
||||
nvidia-smi
|
||||
|
||||
- name: Environment
|
||||
working-directory: /transformers
|
||||
run: |
|
||||
python3 utils/print_env.py
|
||||
|
||||
- name: Show installed libraries and their versions
|
||||
working-directory: /transformers
|
||||
run: pip freeze
|
||||
|
||||
- name: Set `machine_type` for report and artifact names
|
||||
working-directory: /transformers
|
||||
shell: bash
|
||||
env:
|
||||
matrix_machine_type: ${{ matrix.machine_type }}
|
||||
run: |
|
||||
echo "$matrix_machine_type"
|
||||
|
||||
if [ "$matrix_machine_type" = "aws-g5-4xlarge-cache" ]; then
|
||||
machine_type=single-gpu
|
||||
elif [ "$matrix_machine_type" = "aws-g5-12xlarge-cache" ]; then
|
||||
machine_type=multi-gpu
|
||||
else
|
||||
machine_type="$matrix_machine_type"
|
||||
fi
|
||||
|
||||
echo "$machine_type"
|
||||
echo "machine_type=$machine_type" >> $GITHUB_ENV
|
||||
|
||||
- name: Run examples tests on GPU
|
||||
working-directory: /transformers
|
||||
run: |
|
||||
pip install -r examples/pytorch/_tests_requirements.txt
|
||||
python3 -m pytest -v --make-reports="${machine_type}_run_examples_gpu_test_reports" examples/pytorch
|
||||
|
||||
- name: Failure short reports
|
||||
if: ${{ failure() }}
|
||||
continue-on-error: true
|
||||
run: cat "/transformers/reports/${machine_type}_run_examples_gpu_test_reports/failures_short.txt"
|
||||
|
||||
- name: "Test suite reports artifacts: ${{ env.machine_type }}_run_examples_gpu_test_reports"
|
||||
if: ${{ always() }}
|
||||
uses: actions/upload-artifact@ea165f8d65b6e75b540449e92b4886f43607fa02 # v4.6.2
|
||||
with:
|
||||
name: ${{ env.machine_type }}_run_examples_gpu_test_reports
|
||||
path: /transformers/reports/${{ env.machine_type }}_run_examples_gpu_test_reports
|
||||
|
||||
run_torch_cuda_extensions_gpu:
|
||||
if: ${{ inputs.job == 'run_torch_cuda_extensions_gpu' }}
|
||||
name: Torch CUDA extension tests
|
||||
strategy:
|
||||
fail-fast: false
|
||||
matrix:
|
||||
machine_type: [aws-g5-4xlarge-cache, aws-g5-12xlarge-cache]
|
||||
runs-on:
|
||||
group: '${{ matrix.machine_type }}'
|
||||
container:
|
||||
image: ${{ inputs.docker }}
|
||||
options: --gpus all --shm-size "16gb" --ipc host -v /mnt/cache/.cache/huggingface:/mnt/cache/
|
||||
steps:
|
||||
- name: Update clone
|
||||
working-directory: ${{ inputs.working-directory-prefix }}/transformers
|
||||
env:
|
||||
commit_sha: ${{ inputs.commit_sha || github.sha }}
|
||||
run: git fetch && git checkout "$commit_sha"
|
||||
|
||||
- name: Reinstall transformers in edit mode (remove the one installed during docker image build)
|
||||
working-directory: ${{ inputs.working-directory-prefix }}/transformers
|
||||
run: python3 -m pip uninstall -y transformers && python3 -m pip install -e .
|
||||
|
||||
- name: Update / Install some packages (for Past CI)
|
||||
if: ${{ contains(inputs.docker, '-past-') && contains(inputs.docker, '-pytorch-') }}
|
||||
working-directory: ${{ inputs.working-directory-prefix }}/transformers
|
||||
run: |
|
||||
python3 -m pip install -U datasets
|
||||
python3 -m pip install --no-cache-dir git+https://github.com/huggingface/accelerate@main#egg=accelerate
|
||||
|
||||
- name: Remove cached torch extensions
|
||||
run: rm -rf /github/home/.cache/torch_extensions/
|
||||
|
||||
# To avoid unknown test failures
|
||||
- name: Pre build DeepSpeed *again* (for daily CI)
|
||||
if: ${{ contains(inputs.ci_event, 'Daily CI') }}
|
||||
working-directory: ${{ inputs.working-directory-prefix }}/
|
||||
run: |
|
||||
python3 -m pip uninstall -y deepspeed
|
||||
DS_DISABLE_NINJA=1 DS_BUILD_CPU_ADAM=1 DS_BUILD_FUSED_ADAM=1 python3 -m pip install deepspeed --no-build-isolation --config-settings="--build-option=build_ext" --config-settings="--build-option=-j8" --no-cache -v --disable-pip-version-check
|
||||
|
||||
# To avoid unknown test failures
|
||||
- name: Pre build DeepSpeed *again* (for nightly & Past CI)
|
||||
if: ${{ contains(inputs.ci_event, 'Nightly CI') || contains(inputs.ci_event, 'Past CI') }}
|
||||
working-directory: ${{ inputs.working-directory-prefix }}/
|
||||
run: |
|
||||
python3 -m pip uninstall -y deepspeed
|
||||
rm -rf DeepSpeed
|
||||
git clone https://github.com/deepspeedai/DeepSpeed && cd DeepSpeed && rm -rf build
|
||||
DS_BUILD_CPU_ADAM=1 DS_BUILD_FUSED_ADAM=1 python3 -m pip install . --no-build-isolation --config-settings="--build-option=build_ext" --config-settings="--build-option=-j8" --no-cache -v --disable-pip-version-check
|
||||
|
||||
- name: NVIDIA-SMI
|
||||
run: |
|
||||
nvidia-smi
|
||||
|
||||
- name: Environment
|
||||
working-directory: ${{ inputs.working-directory-prefix }}/transformers
|
||||
run: |
|
||||
python3 utils/print_env.py
|
||||
|
||||
- name: Show installed libraries and their versions
|
||||
working-directory: ${{ inputs.working-directory-prefix }}/transformers
|
||||
run: pip freeze
|
||||
|
||||
- name: Set `machine_type` for report and artifact names
|
||||
working-directory: ${{ inputs.working-directory-prefix }}/transformers
|
||||
shell: bash
|
||||
env:
|
||||
matrix_machine_type: ${{ matrix.machine_type }}
|
||||
run: |
|
||||
echo "$matrix_machine_type"
|
||||
|
||||
if [ "$matrix_machine_type" = "aws-g5-4xlarge-cache" ]; then
|
||||
machine_type=single-gpu
|
||||
elif [ "$matrix_machine_type" = "aws-g5-12xlarge-cache" ]; then
|
||||
machine_type=multi-gpu
|
||||
else
|
||||
machine_type="$matrix_machine_type"
|
||||
fi
|
||||
|
||||
echo "$machine_type"
|
||||
echo "machine_type=$machine_type" >> $GITHUB_ENV
|
||||
|
||||
- name: Run all tests on GPU
|
||||
working-directory: ${{ inputs.working-directory-prefix }}/transformers
|
||||
run: |
|
||||
script -q -c "python3 -m pytest -v -rsfE --make-reports=${machine_type}_run_torch_cuda_extensions_gpu_test_reports tests/trainer/distributed/test_trainer_distributed_deepspeed.py" test_outputs.txt
|
||||
|
||||
- name: Failure short reports
|
||||
if: ${{ failure() }}
|
||||
continue-on-error: true
|
||||
env:
|
||||
working_directory_prefix: ${{ inputs.working-directory-prefix }}
|
||||
run: cat "${working_directory_prefix}/transformers/reports/${machine_type}_run_torch_cuda_extensions_gpu_test_reports/failures_short.txt"
|
||||
|
||||
- name: "Test suite reports artifacts: ${{ env.machine_type }}_run_torch_cuda_extensions_gpu_test_reports"
|
||||
if: ${{ always() }}
|
||||
uses: actions/upload-artifact@ea165f8d65b6e75b540449e92b4886f43607fa02 # v4.6.2
|
||||
with:
|
||||
name: ${{ env.machine_type }}_run_torch_cuda_extensions_gpu_test_reports
|
||||
path: ${{ inputs.working-directory-prefix }}/transformers/reports/${{ env.machine_type }}_run_torch_cuda_extensions_gpu_test_reports
|
||||
|
||||
run_quantization_torch_gpu:
|
||||
if: ${{ inputs.job == 'run_quantization_torch_gpu' }}
|
||||
name: " "
|
||||
needs: setup
|
||||
strategy:
|
||||
max-parallel: 4
|
||||
fail-fast: false
|
||||
matrix:
|
||||
folders: ${{ fromJson(needs.setup.outputs.quantization_matrix) }}
|
||||
machine_type: [aws-g5-4xlarge-cache, aws-g5-12xlarge-cache]
|
||||
runs-on:
|
||||
group: '${{ matrix.machine_type }}'
|
||||
container:
|
||||
image: huggingface/transformers-quantization-latest-gpu
|
||||
options: --gpus all --shm-size "16gb" --ipc host -v /mnt/cache/.cache/huggingface:/mnt/cache/
|
||||
steps:
|
||||
- name: Echo folder ${{ matrix.folders }}
|
||||
shell: bash
|
||||
env:
|
||||
matrix_folders_raw: ${{ matrix.folders }}
|
||||
run: |
|
||||
echo "$matrix_folders_raw"
|
||||
matrix_folders="${matrix_folders_raw/'quantization/'/'quantization_'}"
|
||||
echo "$matrix_folders"
|
||||
echo "matrix_folders=$matrix_folders" >> $GITHUB_ENV
|
||||
|
||||
- name: Update clone
|
||||
working-directory: /transformers
|
||||
env:
|
||||
commit_sha: ${{ inputs.commit_sha || github.sha }}
|
||||
run: git fetch origin "$commit_sha" && git checkout "$commit_sha"
|
||||
|
||||
- name: Reinstall transformers in edit mode (remove the one installed during docker image build)
|
||||
working-directory: /transformers
|
||||
run: python3 -m pip uninstall -y transformers && python3 -m pip install -e .
|
||||
|
||||
- name: NVIDIA-SMI
|
||||
run: |
|
||||
nvidia-smi
|
||||
|
||||
- name: Environment
|
||||
working-directory: /transformers
|
||||
run: |
|
||||
python3 utils/print_env.py
|
||||
|
||||
- name: Show installed libraries and their versions
|
||||
working-directory: /transformers
|
||||
run: pip freeze
|
||||
|
||||
- name: Set `machine_type` for report and artifact names
|
||||
working-directory: /transformers
|
||||
shell: bash
|
||||
env:
|
||||
matrix_machine_type: ${{ matrix.machine_type }}
|
||||
run: |
|
||||
echo "$matrix_machine_type"
|
||||
|
||||
if [ "$matrix_machine_type" = "aws-g5-4xlarge-cache" ]; then
|
||||
machine_type=single-gpu
|
||||
elif [ "$matrix_machine_type" = "aws-g5-12xlarge-cache" ]; then
|
||||
machine_type=multi-gpu
|
||||
else
|
||||
machine_type="$matrix_machine_type"
|
||||
fi
|
||||
|
||||
echo "$machine_type"
|
||||
echo "machine_type=$machine_type" >> $GITHUB_ENV
|
||||
|
||||
- name: Run quantization tests on GPU
|
||||
working-directory: /transformers
|
||||
env:
|
||||
folders: ${{ matrix.folders }}
|
||||
run: |
|
||||
python3 -m pytest -v --make-reports="${machine_type}_run_quantization_torch_gpu_${matrix_folders}_test_reports" tests/${folders}
|
||||
|
||||
- name: Failure short reports
|
||||
if: ${{ failure() }}
|
||||
continue-on-error: true
|
||||
run: cat "/transformers/reports/${machine_type}_run_quantization_torch_gpu_${matrix_folders}_test_reports/failures_short.txt"
|
||||
|
||||
- name: "Test suite reports artifacts: ${{ env.machine_type }}_run_quantization_torch_gpu_${{ env.matrix_folders }}_test_reports"
|
||||
if: ${{ always() }}
|
||||
uses: actions/upload-artifact@ea165f8d65b6e75b540449e92b4886f43607fa02 # v4.6.2
|
||||
with:
|
||||
name: ${{ env.machine_type }}_run_quantization_torch_gpu_${{ env.matrix_folders }}_test_reports
|
||||
path: /transformers/reports/${{ env.machine_type }}_run_quantization_torch_gpu_${{ env.matrix_folders }}_test_reports
|
||||
|
||||
run_kernels_gpu:
|
||||
if: ${{ inputs.job == 'run_kernels_gpu' }}
|
||||
name: Kernel tests
|
||||
strategy:
|
||||
fail-fast: false
|
||||
matrix:
|
||||
machine_type: [aws-g5-4xlarge-cache]
|
||||
runs-on:
|
||||
group: '${{ matrix.machine_type }}'
|
||||
container:
|
||||
image: ${{ inputs.docker }}
|
||||
options: --gpus all --shm-size "16gb" --ipc host -v /mnt/cache/.cache/huggingface:/mnt/cache/
|
||||
steps:
|
||||
- name: Update clone
|
||||
working-directory: /transformers
|
||||
env:
|
||||
commit_sha: ${{ inputs.commit_sha || github.sha }}
|
||||
run: git fetch && git checkout "$commit_sha"
|
||||
|
||||
- name: Reinstall transformers in edit mode
|
||||
working-directory: /transformers
|
||||
run: python3 -m pip uninstall -y transformers && python3 -m pip install -e .[testing]
|
||||
|
||||
- name: Install kernels
|
||||
working-directory: /transformers
|
||||
run: python3 -m pip install -U kernels
|
||||
|
||||
- name: NVIDIA-SMI
|
||||
run: nvidia-smi
|
||||
|
||||
- name: Environment
|
||||
working-directory: /transformers
|
||||
run: python3 utils/print_env.py
|
||||
|
||||
- name: Show installed libraries and their versions
|
||||
working-directory: /transformers
|
||||
run: pip freeze
|
||||
|
||||
- name: Set `machine_type` for report and artifact names
|
||||
working-directory: /transformers
|
||||
shell: bash
|
||||
env:
|
||||
matrix_machine_type: ${{ matrix.machine_type }}
|
||||
run: |
|
||||
echo "$matrix_machine_type"
|
||||
|
||||
if [ "$matrix_machine_type" = "aws-g5-4xlarge-cache" ]; then
|
||||
machine_type=single-gpu
|
||||
elif [ "$matrix_machine_type" = "aws-g5-12xlarge-cache" ]; then
|
||||
machine_type=multi-gpu
|
||||
else
|
||||
machine_type="$matrix_machine_type"
|
||||
fi
|
||||
|
||||
echo "$machine_type"
|
||||
echo "machine_type=$machine_type" >> $GITHUB_ENV
|
||||
|
||||
- name: Run kernel tests on GPU
|
||||
working-directory: /transformers
|
||||
run: |
|
||||
python3 -m pytest -v --make-reports="${machine_type}_run_kernels_gpu_test_reports" tests/kernels/test_kernels.py
|
||||
|
||||
- name: Failure short reports
|
||||
if: ${{ failure() }}
|
||||
continue-on-error: true
|
||||
run: cat "/transformers/reports/${machine_type}_run_kernels_gpu_test_reports/failures_short.txt"
|
||||
|
||||
- name: "Test suite reports artifacts: ${{ env.machine_type }}_run_kernels_gpu_test_reports"
|
||||
if: ${{ always() }}
|
||||
uses: actions/upload-artifact@ea165f8d65b6e75b540449e92b4886f43607fa02 # v4.6.2
|
||||
with:
|
||||
name: ${{ env.machine_type }}_run_kernels_gpu_test_reports
|
||||
path: /transformers/reports/${{ env.machine_type }}_run_kernels_gpu_test_reports
|
||||
|
||||
run_extract_warnings:
|
||||
# Let's only do this for the job `run_models_gpu` to simplify the (already complex) logic.
|
||||
if: ${{ always() && inputs.job == 'run_models_gpu' }}
|
||||
name: Extract warnings in CI artifacts
|
||||
runs-on: ubuntu-22.04
|
||||
needs: [setup, run_models_gpu]
|
||||
steps:
|
||||
# Checkout in order to run `utils/extract_warnings.py`. Avoid **explicit** checkout (i.e. don't specify `ref`) for
|
||||
# security reason.
|
||||
- name: Checkout transformers
|
||||
uses: actions/checkout@34e114876b0b11c390a56381ad16ebd13914f8d5 # v4.3.1
|
||||
with:
|
||||
persist-credentials: false
|
||||
|
||||
- name: Install transformers
|
||||
run: pip install transformers
|
||||
|
||||
- name: Show installed libraries and their versions
|
||||
run: pip freeze
|
||||
|
||||
- name: Create output directory
|
||||
run: mkdir warnings_in_ci
|
||||
|
||||
- uses: actions/download-artifact@3e5f45b2cfb9172054b4087a40e8e0b5a5461e7c # v8.0.1
|
||||
env:
|
||||
ACTIONS_ARTIFACT_MAX_ARTIFACT_COUNT: 2000
|
||||
with:
|
||||
path: warnings_in_ci
|
||||
github-token: ${{ secrets.GITHUB_TOKEN }}
|
||||
|
||||
- name: Show artifacts
|
||||
run: echo "$(python3 -c 'import os; d = os.listdir(); print(d)')"
|
||||
working-directory: warnings_in_ci
|
||||
|
||||
- name: Extract warnings in CI artifacts
|
||||
env:
|
||||
github_run_id: ${{ github.run_id }}
|
||||
access_token: ${{ secrets.ACCESS_REPO_INFO_TOKEN }}
|
||||
run: |
|
||||
python3 utils/extract_warnings.py --workflow_run_id "$github_run_id" --output_dir warnings_in_ci --token "$access_token" --from_gh
|
||||
echo "$(python3 -c 'import os; import json; fp = open("warnings_in_ci/selected_warnings.json"); d = json.load(fp); d = "\n".join(d); print(d)')"
|
||||
|
||||
- name: Upload artifact
|
||||
if: ${{ always() }}
|
||||
uses: actions/upload-artifact@ea165f8d65b6e75b540449e92b4886f43607fa02 # v4.6.2
|
||||
with:
|
||||
name: warnings_in_ci
|
||||
path: warnings_in_ci/selected_warnings.json
|
||||
|
||||
send_results:
|
||||
name: Slack Report
|
||||
needs: [
|
||||
setup,
|
||||
run_models_gpu,
|
||||
run_trainer_and_fsdp_gpu,
|
||||
run_pipelines_torch_gpu,
|
||||
run_examples_gpu,
|
||||
run_torch_cuda_extensions_gpu,
|
||||
run_quantization_torch_gpu,
|
||||
run_kernels_gpu,
|
||||
run_extract_warnings
|
||||
]
|
||||
if: always() && !cancelled()
|
||||
uses: ./.github/workflows/slack-report.yml
|
||||
with:
|
||||
job: ${{ inputs.job }}
|
||||
# This would be `skipped` if `setup` is skipped.
|
||||
setup_status: ${{ needs.setup.result }}
|
||||
slack_report_channel: ${{ inputs.slack_report_channel }}
|
||||
# This would be an empty string if `setup` is skipped.
|
||||
folder_slices: ${{ needs.setup.outputs.folder_slices }}
|
||||
quantization_matrix: ${{ needs.setup.outputs.quantization_matrix }}
|
||||
ci_event: ${{ inputs.ci_event }}
|
||||
report_repo_id: ${{ inputs.report_repo_id }}
|
||||
commit_sha: ${{ inputs.commit_sha || github.sha }}
|
||||
|
||||
secrets: inherit
|
||||
|
||||
check_new_failures:
|
||||
if: ${{ always() && needs.send_results.result == 'success' }}
|
||||
name: Check new failures
|
||||
needs: send_results
|
||||
uses: ./.github/workflows/check_failed_tests.yml
|
||||
with:
|
||||
docker: ${{ inputs.docker }}
|
||||
commit_sha: ${{ inputs.commit_sha || github.sha }}
|
||||
job: ${{ inputs.job }}
|
||||
slack_report_channel: ${{ inputs.slack_report_channel }}
|
||||
ci_event: ${{ inputs.ci_event }}
|
||||
report_repo_id: ${{ inputs.report_repo_id }}
|
||||
pr_number: ${{ inputs.pr_number }}
|
||||
|
||||
secrets: inherit
|
||||
120
.github/workflows/slack-report.yml
vendored
Normal file
120
.github/workflows/slack-report.yml
vendored
Normal file
@@ -0,0 +1,120 @@
|
||||
name: CI slack report
|
||||
|
||||
on:
|
||||
workflow_call:
|
||||
inputs:
|
||||
job:
|
||||
required: true
|
||||
type: string
|
||||
slack_report_channel:
|
||||
required: true
|
||||
type: string
|
||||
setup_status:
|
||||
required: true
|
||||
type: string
|
||||
folder_slices:
|
||||
required: true
|
||||
type: string
|
||||
quantization_matrix:
|
||||
required: true
|
||||
type: string
|
||||
ci_event:
|
||||
required: true
|
||||
type: string
|
||||
report_repo_id:
|
||||
required: true
|
||||
type: string
|
||||
commit_sha:
|
||||
required: false
|
||||
type: string
|
||||
outputs:
|
||||
is_slack_reporting_job_ok:
|
||||
description: "Whether the send_results job succeeded (not failed)"
|
||||
value: ${{ jobs.send_results.result != 'failure' }}
|
||||
|
||||
env:
|
||||
TRANSFORMERS_CI_RESULTS_UPLOAD_TOKEN: ${{ secrets.TRANSFORMERS_CI_RESULTS_UPLOAD_TOKEN }}
|
||||
|
||||
permissions:
|
||||
contents: read
|
||||
|
||||
jobs:
|
||||
send_results:
|
||||
name: Send results to webhook
|
||||
runs-on: ubuntu-22.04
|
||||
if: always() && !cancelled()
|
||||
steps:
|
||||
- name: Preliminary job status
|
||||
shell: bash
|
||||
# For the meaning of these environment variables, see the job `Setup`
|
||||
env:
|
||||
setup_status: ${{ inputs.setup_status }}
|
||||
run: |
|
||||
echo "Setup status: $setup_status"
|
||||
|
||||
- uses: actions/checkout@34e114876b0b11c390a56381ad16ebd13914f8d5 # v4.3.1
|
||||
with:
|
||||
fetch-depth: 2
|
||||
# Security: checkout to the `main` branch for untrusted triggers (issue_comment, pull_request_target), otherwise use the specified ref
|
||||
ref: ${{ (github.event_name == 'issue_comment' || github.event_name == 'pull_request_target') && 'main' || (inputs.commit_sha || github.sha) }}
|
||||
persist-credentials: false
|
||||
|
||||
- uses: actions/download-artifact@3e5f45b2cfb9172054b4087a40e8e0b5a5461e7c # v8.0.1
|
||||
env:
|
||||
ACTIONS_ARTIFACT_MAX_ARTIFACT_COUNT: 2000
|
||||
with:
|
||||
github-token: ${{ secrets.GITHUB_TOKEN }}
|
||||
|
||||
- name: Prepare some setup values
|
||||
run: |
|
||||
if [ -f setup_values/prev_workflow_run_id.txt ]; then
|
||||
echo "PREV_WORKFLOW_RUN_ID=$(cat setup_values/prev_workflow_run_id.txt)" >> $GITHUB_ENV
|
||||
else
|
||||
echo "PREV_WORKFLOW_RUN_ID=" >> $GITHUB_ENV
|
||||
fi
|
||||
|
||||
if [ -f setup_values/other_workflow_run_id.txt ]; then
|
||||
echo "OTHER_WORKFLOW_RUN_ID=$(cat setup_values/other_workflow_run_id.txt)" >> $GITHUB_ENV
|
||||
else
|
||||
echo "OTHER_WORKFLOW_RUN_ID=" >> $GITHUB_ENV
|
||||
fi
|
||||
|
||||
- name: Send message to Slack
|
||||
shell: bash
|
||||
env:
|
||||
CI_SLACK_BOT_TOKEN: ${{ secrets.CI_SLACK_BOT_TOKEN }}
|
||||
CI_SLACK_CHANNEL_ID: ${{ secrets.CI_SLACK_CHANNEL_ID }}
|
||||
CI_SLACK_CHANNEL_ID_DAILY: ${{ secrets.CI_SLACK_CHANNEL_ID_DAILY }}
|
||||
CI_SLACK_CHANNEL_DUMMY_TESTS: ${{ secrets.CI_SLACK_CHANNEL_DUMMY_TESTS }}
|
||||
SLACK_REPORT_CHANNEL: ${{ inputs.slack_report_channel }}
|
||||
ACCESS_REPO_INFO_TOKEN: ${{ secrets.ACCESS_REPO_INFO_TOKEN }}
|
||||
CI_EVENT: ${{ inputs.ci_event }}
|
||||
# This `CI_TITLE` would be empty for `schedule` or `workflow_run` events.
|
||||
CI_TITLE: ${{ github.event.head_commit.message }}
|
||||
CI_SHA: ${{ inputs.commit_sha || github.sha }}
|
||||
CI_TEST_JOB: ${{ inputs.job }}
|
||||
SETUP_STATUS: ${{ inputs.setup_status }}
|
||||
REPORT_REPO_ID: ${{ inputs.report_repo_id }}
|
||||
quantization_matrix: ${{ inputs.quantization_matrix }}
|
||||
folder_slices: ${{ inputs.folder_slices }}
|
||||
# We pass `needs.setup.outputs.matrix` as the argument. A processing in `notification_service.py` to change
|
||||
# `models/bert` to `models_bert` is required, as the artifact names use `_` instead of `/`.
|
||||
# For a job that doesn't depend on (i.e. `needs`) `setup`, the value for `inputs.folder_slices` would be an
|
||||
# empty string, and the called script still get one argument (which is the emtpy string).
|
||||
run: |
|
||||
pip install huggingface_hub
|
||||
pip install slack_sdk
|
||||
pip show slack_sdk
|
||||
pip install .
|
||||
if [ "$quantization_matrix" != "" ]; then
|
||||
python utils/notification_service.py "$quantization_matrix"
|
||||
else
|
||||
python utils/notification_service.py "$folder_slices"
|
||||
fi
|
||||
|
||||
# Upload the directory containing CI reports prepared in `notification_service.py`
|
||||
- name: Failure table artifacts
|
||||
uses: actions/upload-artifact@ea165f8d65b6e75b540449e92b4886f43607fa02 # v4.6.2
|
||||
with:
|
||||
name: ci_results_${{ inputs.job }}
|
||||
path: ci_results_${{ inputs.job }}
|
||||
155
.github/workflows/ssh-runner.yml
vendored
Normal file
155
.github/workflows/ssh-runner.yml
vendored
Normal file
@@ -0,0 +1,155 @@
|
||||
name: SSH into our runners
|
||||
|
||||
on:
|
||||
workflow_dispatch:
|
||||
inputs:
|
||||
runner_type:
|
||||
description: 'Type of runner to test (a10)'
|
||||
required: true
|
||||
docker_image:
|
||||
description: 'Name of the Docker image'
|
||||
required: true
|
||||
num_gpus:
|
||||
description: 'Type of the number of gpus to use (`single` or `multi`)'
|
||||
required: true
|
||||
|
||||
env:
|
||||
HF_TOKEN: ${{ secrets.HF_HUB_READ_TOKEN }}
|
||||
HF_HOME: /mnt/cache
|
||||
TRANSFORMERS_IS_CI: yes
|
||||
OMP_NUM_THREADS: 8
|
||||
MKL_NUM_THREADS: 8
|
||||
RUN_SLOW: yes # For gated repositories, we still need to agree to share information on the Hub repo. page in order to get access. # This token is created under the bot `hf-transformers-bot`.
|
||||
TF_FORCE_GPU_ALLOW_GROWTH: true
|
||||
CUDA_VISIBLE_DEVICES: 0,1
|
||||
|
||||
permissions:
|
||||
contents: read
|
||||
|
||||
jobs:
|
||||
get_runner:
|
||||
name: "Get runner to use"
|
||||
runs-on: ubuntu-22.04
|
||||
outputs:
|
||||
RUNNER: ${{ steps.set_runner.outputs.RUNNER }}
|
||||
steps:
|
||||
- name: Get runner to use
|
||||
shell: bash
|
||||
env:
|
||||
NUM_GPUS: ${{ github.event.inputs.num_gpus }}
|
||||
RUNNER_TYPE: ${{ github.event.inputs.runner_type }}
|
||||
run: |
|
||||
if [[ "$NUM_GPUS" == "single" && "$RUNNER_TYPE" == "a10" ]]; then
|
||||
echo "RUNNER=aws-g5-4xlarge-cache-ssh" >> $GITHUB_ENV
|
||||
elif [[ "$NUM_GPUS" == "multi" && "$RUNNER_TYPE" == "a10" ]]; then
|
||||
echo "RUNNER=aws-g5-12xlarge-cache-ssh" >> $GITHUB_ENV
|
||||
else
|
||||
echo "RUNNER=" >> $GITHUB_ENV
|
||||
fi
|
||||
|
||||
- name: Set runner to use
|
||||
id: set_runner
|
||||
run: |
|
||||
echo "$RUNNER"
|
||||
echo "RUNNER=$RUNNER" >> $GITHUB_OUTPUT
|
||||
|
||||
ssh_runner:
|
||||
name: "SSH"
|
||||
needs: get_runner
|
||||
runs-on:
|
||||
group: ${{ needs.get_runner.outputs.RUNNER }}
|
||||
container:
|
||||
image: ${{ github.event.inputs.docker_image }}
|
||||
steps:
|
||||
- name: Update clone
|
||||
working-directory: /transformers
|
||||
env:
|
||||
commit_sha: ${{ github.sha }}
|
||||
run: |
|
||||
git fetch && git checkout "$commit_sha"
|
||||
|
||||
- name: Cleanup
|
||||
working-directory: /transformers
|
||||
run: |
|
||||
rm -rf tests/__pycache__
|
||||
rm -rf tests/models/__pycache__
|
||||
rm -rf reports
|
||||
|
||||
- name: Show installed libraries and their versions
|
||||
working-directory: /transformers
|
||||
run: pip freeze
|
||||
|
||||
- name: NVIDIA-SMI
|
||||
run: |
|
||||
nvidia-smi
|
||||
|
||||
- name: Create python alias
|
||||
run: |
|
||||
ln -sf $(which python3) /usr/local/bin/python
|
||||
ln -sf $(which pip3) /usr/local/bin/pip
|
||||
echo "✅ python -> python3 symlink created"
|
||||
|
||||
- name: Install psutil for memory monitor
|
||||
run: |
|
||||
pip install psutil --break-system-packages
|
||||
|
||||
- name: Download memory monitor script
|
||||
working-directory: /transformers
|
||||
run: |
|
||||
apt-get update && apt-get install -y curl
|
||||
curl -o memory_monitor.py https://raw.githubusercontent.com/huggingface/transformers/refs/heads/utility_scripts/utils/memory_monitor.py
|
||||
|
||||
- name: Start memory monitor
|
||||
working-directory: /transformers
|
||||
continue-on-error: true # Don't fail workflow if monitor has issues
|
||||
run: |
|
||||
python3 memory_monitor.py --threshold 90 --interval 1 > memory_monitor.log 2>&1 &
|
||||
echo $! > memory_monitor.pid
|
||||
echo "Memory monitor started with PID $(cat memory_monitor.pid)"
|
||||
# Give it a moment to start
|
||||
sleep 2
|
||||
# Verify it's running
|
||||
ps aux | grep memory_monitor | grep -v grep || echo "Warning: memory monitor may not be running"
|
||||
|
||||
- name: Install utilities
|
||||
run: |
|
||||
apt-get install -y nano
|
||||
|
||||
- name: Setup automatic environment for SSH login
|
||||
run: |
|
||||
# Create shared environment setup
|
||||
cat > /root/.env_setup << 'EOF'
|
||||
# Auto-setup (non-sensitive vars)
|
||||
export HF_HOME=/mnt/cache
|
||||
export TRANSFORMERS_IS_CI=yes
|
||||
export OMP_NUM_THREADS=8
|
||||
export MKL_NUM_THREADS=8
|
||||
export RUN_SLOW=yes
|
||||
export TF_FORCE_GPU_ALLOW_GROWTH=true
|
||||
export CUDA_VISIBLE_DEVICES=0,1
|
||||
|
||||
cd /transformers 2>/dev/null || true
|
||||
|
||||
# Remind user to set token if needed
|
||||
if [ -z "$HF_TOKEN" ]; then
|
||||
echo "⚠️ HF_TOKEN not set. Set it with:"
|
||||
echo " export HF_TOKEN=hf_xxxxx"
|
||||
else
|
||||
echo "✅ HF_TOKEN is set"
|
||||
fi
|
||||
|
||||
echo "📁 Working directory: $(pwd)"
|
||||
EOF
|
||||
|
||||
# Source from both .bash_profile and .bashrc
|
||||
echo 'source /root/.env_setup' >> /root/.bash_profile
|
||||
echo 'source /root/.env_setup' >> /root/.bashrc
|
||||
|
||||
- name: Tailscale # In order to be able to SSH when a test fails
|
||||
uses: huggingface/tailscale-action@7d53c9737e53934c30290b5524d1c9b4a7c98c8a # main
|
||||
with:
|
||||
authkey: ${{ secrets.TAILSCALE_SSH_AUTHKEY }}
|
||||
slackChannel: ${{ secrets.SLACK_CIFEEDBACK_CHANNEL }}
|
||||
slackToken: ${{ secrets.SLACK_CIFEEDBACK_BOT_TOKEN }}
|
||||
waitForSSH: true
|
||||
sshTimeout: 15m
|
||||
34
.github/workflows/stale.yml
vendored
Normal file
34
.github/workflows/stale.yml
vendored
Normal file
@@ -0,0 +1,34 @@
|
||||
name: Stale Bot
|
||||
|
||||
on:
|
||||
schedule:
|
||||
- cron: "0 8 * * *"
|
||||
|
||||
permissions:
|
||||
contents: read
|
||||
|
||||
jobs:
|
||||
close_stale_issues:
|
||||
name: Close Stale Issues
|
||||
if: github.repository == 'huggingface/transformers'
|
||||
runs-on: ubuntu-22.04
|
||||
permissions:
|
||||
issues: write
|
||||
env:
|
||||
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
|
||||
steps:
|
||||
- uses: actions/checkout@34e114876b0b11c390a56381ad16ebd13914f8d5 # v4.3.1
|
||||
with:
|
||||
persist-credentials: false
|
||||
|
||||
- name: Setup Python
|
||||
uses: actions/setup-python@a26af69be951a213d495a4c3e4e4022e16d87065 # v5.6.0
|
||||
with:
|
||||
python-version: 3.8
|
||||
|
||||
- name: Install requirements
|
||||
run: |
|
||||
pip install PyGithub
|
||||
- name: Close stale issues
|
||||
run: |
|
||||
python scripts/stale.py
|
||||
97
.github/workflows/trl-ci-bot.yml
vendored
Normal file
97
.github/workflows/trl-ci-bot.yml
vendored
Normal file
@@ -0,0 +1,97 @@
|
||||
# This workflow allows trusted contributors to trigger TRL CI runs against
|
||||
# specific Transformers commits by commenting `/trl-ci` on a PR in the TRL repo.
|
||||
# It is meant to be used during the ongoing Trainer refactor/unbloat in Transformers,
|
||||
# to help evaluate the downstream impact on TRL.
|
||||
name: TRL CI bot
|
||||
|
||||
on:
|
||||
issue_comment:
|
||||
types: [created]
|
||||
|
||||
permissions:
|
||||
contents: read
|
||||
pull-requests: read
|
||||
issues: read
|
||||
|
||||
jobs:
|
||||
dispatch:
|
||||
if: >
|
||||
github.event.issue.pull_request &&
|
||||
contains(github.event.comment.body, '/trl-ci')
|
||||
runs-on: ubuntu-latest
|
||||
|
||||
steps:
|
||||
- name: Gate on trusted commenter
|
||||
id: trust
|
||||
run: |
|
||||
assoc="${{ github.event.comment.author_association }}"
|
||||
case "$assoc" in
|
||||
MEMBER|OWNER|COLLABORATOR) echo "trusted=true" >> $GITHUB_OUTPUT ;;
|
||||
*) echo "trusted=false" >> $GITHUB_OUTPUT ;;
|
||||
esac
|
||||
|
||||
- name: Reject untrusted commenter
|
||||
if: steps.trust.outputs.trusted != 'true'
|
||||
run: |
|
||||
echo "::error::Untrusted commenter (${{ github.event.comment.author_association }}); aborting."
|
||||
exit 1
|
||||
|
||||
- name: Fetch PR head SHA + number
|
||||
if: steps.trust.outputs.trusted == 'true'
|
||||
id: pr
|
||||
env:
|
||||
GH_TOKEN: ${{ github.token }}
|
||||
PR_URL: ${{ github.event.issue.pull_request.url }}
|
||||
run: |
|
||||
sha=$(gh api "$PR_URL" --jq .head.sha)
|
||||
number=$(gh api "$PR_URL" --jq .number)
|
||||
echo "sha=$sha" >> "$GITHUB_OUTPUT"
|
||||
echo "number=$number" >> "$GITHUB_OUTPUT"
|
||||
|
||||
- name: Dispatch TRL workflow
|
||||
if: steps.trust.outputs.trusted == 'true'
|
||||
id: dispatch
|
||||
env:
|
||||
GH_TOKEN: ${{ secrets.TRL_CI_DISPATCH_TOKEN }}
|
||||
STEPS_PR_OUTPUTS_SHA: ${{ steps.pr.outputs.sha }}
|
||||
run: |
|
||||
gh workflow run "Tests against Transformers branch" \
|
||||
-R huggingface/trl \
|
||||
-f transformers_ref=${STEPS_PR_OUTPUTS_SHA}
|
||||
|
||||
- name: Find TRL workflow run URL
|
||||
if: steps.trust.outputs.trusted == 'true'
|
||||
id: find_run
|
||||
env:
|
||||
GH_TOKEN: ${{ secrets.TRL_CI_DISPATCH_TOKEN }}
|
||||
run: |
|
||||
echo "Waiting for workflow to appear..."
|
||||
for i in {1..10}; do
|
||||
run_url=$(gh api \
|
||||
repos/huggingface/trl/actions/workflows \
|
||||
--jq '.workflows[] | select(.name=="Tests against Transformers branch") | .id')
|
||||
|
||||
if [ -n "$run_url" ]; then
|
||||
run=$(gh api \
|
||||
repos/huggingface/trl/actions/workflows/$run_url/runs \
|
||||
-f event=workflow_dispatch \
|
||||
-f per_page=5 \
|
||||
--jq '.workflow_runs[0].html_url')
|
||||
if [ -n "$run" ]; then
|
||||
echo "url=$run" >> $GITHUB_OUTPUT
|
||||
break
|
||||
fi
|
||||
fi
|
||||
sleep 5
|
||||
done
|
||||
|
||||
- name: Comment back on PR with link
|
||||
if: steps.trust.outputs.trusted == 'true'
|
||||
env:
|
||||
GH_TOKEN: ${{ github.token }}
|
||||
STEPS_PR_OUTPUTS_SHA: ${{ steps.pr.outputs.sha }}
|
||||
STEPS_FIND_RUN_OUTPUTS_URL: ${{ steps.find_run.outputs.url }}
|
||||
run: |
|
||||
gh api -X POST \
|
||||
/repos/${{ github.repository }}/issues/${{ github.event.issue.number }}/comments \
|
||||
-f body="🚀 **TRL CI triggered** against transformers commit \`${STEPS_PR_OUTPUTS_SHA}\`\n\n🔗 **Run:** ${STEPS_FIND_RUN_OUTPUTS_URL}"
|
||||
21
.github/workflows/trufflehog.yml
vendored
Normal file
21
.github/workflows/trufflehog.yml
vendored
Normal file
@@ -0,0 +1,21 @@
|
||||
on:
|
||||
push:
|
||||
|
||||
name: Secret Leaks
|
||||
|
||||
permissions:
|
||||
contents: read
|
||||
|
||||
jobs:
|
||||
trufflehog:
|
||||
runs-on: ubuntu-latest
|
||||
steps:
|
||||
- name: Checkout code
|
||||
uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
|
||||
with:
|
||||
fetch-depth: 0
|
||||
persist-credentials: false
|
||||
- name: Secret Scanning
|
||||
uses: trufflesecurity/trufflehog@6bd2d14f7a4bc1e569fa3550efa7ec632a4fa67b # main
|
||||
with:
|
||||
extra_args: --results=verified,unknown
|
||||
34
.github/workflows/update_metdata.yml
vendored
Normal file
34
.github/workflows/update_metdata.yml
vendored
Normal file
@@ -0,0 +1,34 @@
|
||||
name: Update Transformers metadata
|
||||
|
||||
on:
|
||||
push:
|
||||
branches:
|
||||
- main
|
||||
- update_transformers_metadata*
|
||||
|
||||
permissions:
|
||||
contents: read
|
||||
|
||||
jobs:
|
||||
build_and_package:
|
||||
runs-on: ubuntu-22.04
|
||||
defaults:
|
||||
run:
|
||||
shell: bash -l {0}
|
||||
|
||||
steps:
|
||||
- uses: actions/checkout@34e114876b0b11c390a56381ad16ebd13914f8d5 # v4.3.1
|
||||
with:
|
||||
persist-credentials: false
|
||||
|
||||
- name: Setup environment
|
||||
run: |
|
||||
pip install --upgrade pip
|
||||
pip install datasets pandas
|
||||
pip install .[torch]
|
||||
|
||||
- name: Update metadata
|
||||
env:
|
||||
HF_TOKEN: ${{ secrets.LYSANDRE_HF_TOKEN }}
|
||||
run: |
|
||||
python utils/update_metadata.py --token "$HF_TOKEN" --commit_sha ${{ github.sha }}
|
||||
19
.github/workflows/upload_pr_documentation.yml
vendored
Normal file
19
.github/workflows/upload_pr_documentation.yml
vendored
Normal file
@@ -0,0 +1,19 @@
|
||||
name: Upload PR Documentation
|
||||
|
||||
on:
|
||||
workflow_run:
|
||||
workflows: ["Build PR Documentation"]
|
||||
types:
|
||||
- completed
|
||||
|
||||
permissions:
|
||||
contents: read
|
||||
|
||||
jobs:
|
||||
build:
|
||||
uses: huggingface/doc-builder/.github/workflows/upload_pr_documentation.yml@9ad2de8582b56c017cb530c1165116d40433f1c6 # main
|
||||
with:
|
||||
package_name: transformers
|
||||
secrets:
|
||||
hf_token: ${{ secrets.HF_DOC_BUILD_PUSH }}
|
||||
comment_bot_token: ${{ secrets.COMMENT_BOT_TOKEN }}
|
||||
Reference in New Issue
Block a user