CI for InstructLab

Unit tests

All unit tests currently live in the tests/ directory.
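
For local development, a run of the unit tests might look like the sketch below. It assumes pytest is available in your environment (for example, installed via the project's development requirements); the single-module path in the second command is illustrative only.

```shell
# Run the full unit test suite from the repository root.
pytest tests/

# Run one test module while iterating on a change (the path shown is illustrative).
pytest tests/test_config.py -v
```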

Functional tests

The functional test script can be found at scripts/functional-tests.sh.
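
A minimal sketch of running it locally, assuming a repository checkout with ilab and its development dependencies installed:

```shell
# Run the functional test suite from the repository root.
./scripts/functional-tests.sh
```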

End-to-end (E2E) tests

The end-to-end test script can be found at scripts/basic-workflow-tests.sh. This script takes arguments that control which features are exercised, so test coverage can be varied based on the resources available on a given test runner.

There is currently a default E2E job that runs automatically on all PRs and after commits merge to main or release branches.

There are other E2E jobs that can be triggered manually from the Actions page of the repository. These jobs run on a variety of instance types and can be launched at the discretion of repo maintainers.

E2E Test Coverage

You can specify the following flags to test various features of ilab with the basic-workflow-tests.sh script. Examples of their use can be found in the E2E workflow files described below, and an example invocation follows the table.

| Flag | Feature |
| ---- | ------- |
| e | Run model evaluation |
| m | Run minimal configuration |
| M | Use Mixtral model (4-bit quantized) |
| f | Run “fullsize” training |
| F | Run “fullsize” SDG |
| g | Run with Granite model |
| v | Run with vLLM for serving |
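
As an example, a run that covers minimal configuration, the Granite model, and evaluation might look like the sketch below. It assumes the script is invoked from the repository root and that the flags are passed as standard single-character options; check the E2E workflow files for the exact invocations used in CI.

```shell
# Sketch: run the basic workflow with minimal configuration (-m),
# the Granite model (-g), and model evaluation (-e).
./scripts/basic-workflow-tests.sh -m -g -e
```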

Trigger via GitHub Web UI

The manually launched E2E jobs take an input field that specifies the PR number or git branch to run against. If you run them against a PR, they automatically post a comment to the PR when the tests begin and end, making it easier for those involved in the PR to follow the results. A command-line alternative using the GitHub CLI is sketched after the steps below.

  1. Visit the Actions tab.

  2. Click on the E2E test workflow on the left side of the page.

  3. Click on the Run workflow button on the right side of the page.

  4. Enter a branch name or a PR number in the input field.

  5. Click the green Run workflow button.

Here is an example of using the GitHub Web UI to launch an E2E workflow:

[Figure: GitHub Actions Run Workflow example]
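
As an alternative to the Web UI, a workflow can also be dispatched with the GitHub CLI. The sketch below is illustrative only: the workflow file name is taken from the table in the next section, but the input name (pr_or_branch) is an assumption and should be checked against the workflow_dispatch inputs defined in the workflow file.

```shell
# Dispatch an E2E workflow from the command line with the GitHub CLI.
# NOTE: the input name "pr_or_branch" is assumed; confirm it against the
# workflow_dispatch inputs in the workflow file before running this.
gh workflow run e2e-nvidia-a10g-x1.yml --ref main -f pr_or_branch=1234

# Confirm the run started.
gh run list --workflow=e2e-nvidia-a10g-x1.yml
```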

Current GPU-enabled Runners

The project currently uses the following runners for the E2E jobs:

  • The GitHub built-in GPU runner, referred to as ubuntu-gpu in our workflow files. Only e2e.yml uses this runner.

  • Ephemeral GitHub runners launched on demand on AWS. Most workflows work this way, granting us access to a wider variety of infrastructure at lower cost.

E2E Workflows

| File | T-Shirt Size | Runner Host | Instance Type | GPU Type | OS |
| ---- | ------------ | ----------- | ------------- | -------- | -- |
| e2e.yml | Small | GitHub | N/A | 1 x NVIDIA Tesla T4 w/ 16 GB VRAM | Ubuntu |
| e2e-nvidia-t4-x1.yml | Small | AWS | g4dn.2xlarge | 1 x NVIDIA Tesla T4 w/ 16 GB VRAM | CentOS Stream 9 |
| e2e-nvidia-a10g-x1.yml | Medium | AWS | g5.2xlarge | 1 x NVIDIA A10G w/ 24 GB VRAM | CentOS Stream 9 |
| e2e-nvidia-a10g-x4.yml | Large | AWS | g5.12xlarge | 4 x NVIDIA A10G w/ 24 GB VRAM (96 GB total) | CentOS Stream 9 |

E2E Test Matrix

| Area | Feature | e2e.yml (small) | e2e-nvidia-t4-x1.yml (small) | e2e-nvidia-a10g-x1.yml (medium) | e2e-nvidia-a10g-x4.yml (large) |
| ---- | ------- | --------------- | ---------------------------- | ------------------------------- | ------------------------------ |
| Serving | llama-cpp | | | | ✅ (temporary) |
| Serving | vllm | | | | |
| Generate | simple | | | | |
| Generate | full | | | | |
| Training | legacy+Linux | | | | |
| Training | legacy+Linux+4-bit-quant | | | | |
| Training | training-lib | | | | ✅ (*1) |
| Eval | eval | | | ✅ (*2) | ❌ |

Points of clarification (*):

  1. The training-lib testing does not use the output of the Generate step (https://github.com/instructlab/instructlab/issues/1655).

  2. The eval testing does not evaluate the output of the Training step (https://github.com/instructlab/instructlab/issues/1540).