Tags: google/caliban
image_tag, stdin experiments, CPU mode by default for TPU

- `--experiment_config` can now take experiment configs via stdin (pipes, yay!). Specify `--experiment_config stdin`, in any casing, and the script will wait for your input. As an example, this command pipes in a config and also passes `--dry_run` to show the series of jobs that WILL be submitted once the `--dry_run` flag is removed:

  ```
  cat experiment.json | caliban cloud -e gpu --experiment_config stdin --dry_run trainer.train
  ```

  You could also pipe in the output of a nontrivial Python script that generates a JSON list of dicts.
- New `--image_tag` argument to `caliban cloud`. If you supply it, caliban bypasses the Docker build and push steps and uses that image directly. This is useful if you want to submit a job quickly without going through a no-op build and push, OR if you want to broadcast an experiment to some existing container.
- If you supply a `--tpu_spec` and DON'T supply an explicit `--gpu_spec`, caliban will default to CPU mode. `--gpu_spec` and `--nogpu` are still incompatible. You can use a GPU and TPU spec together without problems.

Change-Id: Ieea1467163f374bab01010c6a439dfff5877920f
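The stdin option for `--experiment_config` described above accepts a JSON list of dicts, so a small generator script can stand in for a hand-written `experiment.json`. The following is a minimal sketch, not part of caliban itself: the helper `build_experiments` and the hyperparameter names `learning_rate` and `batch_size` are hypothetical, chosen only for illustration.

```python
import itertools
import json
import sys


def build_experiments(grid):
    """Expand a dict of option lists into a list of dicts, one per job."""
    keys = sorted(grid)
    combos = itertools.product(*(grid[key] for key in keys))
    return [dict(zip(keys, combo)) for combo in combos]


# Hypothetical hyperparameters; substitute the flags your trainer accepts.
GRID = {
    "learning_rate": [0.01, 0.001],
    "batch_size": [32, 64],
}

if __name__ == "__main__":
    # Write the JSON list of dicts to stdout so it can be piped onward.
    json.dump(build_experiments(GRID), sys.stdout, indent=2)
    sys.stdout.write("\n")
```

Assuming the script is saved as `generate_config.py` (a hypothetical name), it could replace the `cat` step in the command above: `python generate_config.py | caliban cloud -e gpu --experiment_config stdin --dry_run trainer.train`. Keeping `--dry_run` on the first attempt shows which jobs the generated config would produce before anything is submitted.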
Experiment Config, GPU Spec and Machine Type

- `caliban cloud` now supports:
  - `--gpu_spec`, which you can use to configure the GPU count and type for your job.
  - `--machine_type` allows you to specify the machine type for all jobs that run.
  - `--experiment_config` lets you submit batches of jobs at once.
  - `--force` skips all validations and forces a submission with the specified config.
  - `--dry_run` will generate logs showing what WOULD happen if you submit a batch of jobs.

  All of these validate as early as possible so that it's not possible to attempt to submit a job that has obvious mismatches in GPU type, count, machine type and region.
- The `caliban cloud --stream_logs` argument is now gone; the command prints, so this is easy enough to run without special help, and the argument made batch job submission difficult.

Change-Id: Ia0afcf5000390a6c50a9e8532e6994aaab2683dd
Add new flags, whole project --version flag

This PR:

- adds flags for region and project_id in cloud mode
- adds a robust "--version" option
- adds a flag that lets you specify a custom jupyterlab version
- removes the default project_id of MY project!
- updates the README with new options

Change-Id: Ia3db1f96ffecc1e3123402000e75f8e312b4256c