This guide will walk you through the basics of using Daestro to run your compute workloads across multiple cloud providers. Daestro simplifies the process of managing and executing jobs on cloud infrastructure.
Core Concepts
Before diving into the steps, let’s define some key concepts (a sketch of how they fit together follows this list):
- Cloud Auth: Credentials that grant Daestro permission to manage resources within your cloud provider accounts (e.g., DigitalOcean, Vultr, AWS, etc.).
- Compute Environment: A specification defining the type of compute instance (VM), its location (region/zone), and the associated Cloud Auth. Think of it as a template for launching servers.
- Job Queue: A prioritized queue that manages the execution of jobs. It controls concurrency, priority, and the Compute Environments used for running jobs.
- Container Registry Auth: Credentials used to access private container registries (like Docker Hub or private registries) to pull your application’s Docker images.
- Job Definition: A blueprint for your job, specifying the Docker image, commands, resource requirements, and other settings.
- Job: A runnable instance of a Job Definition, submitted to a Job Queue.
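To make the relationships between these concepts concrete, here is a minimal sketch of how they fit together. The class and field names below are illustrative assumptions, not Daestro’s actual schema:

```python
from dataclasses import dataclass, field
from typing import List, Dict, Optional

# Illustrative only -- these classes are assumptions, not Daestro's actual schema.

@dataclass
class CloudAuth:
    name: str        # e.g. "My AWS Production Account"
    provider: str    # e.g. "aws", "digitalocean", "vultr"

@dataclass
class ComputeEnvironment:
    name: str
    cloud_auth: CloudAuth    # account used to launch instances
    instance_type: str       # e.g. "t3.medium"
    location: str            # e.g. "us-east-1"

@dataclass
class JobQueue:
    name: str
    priority: int            # 1-1000; higher-priority queues run first
    max_concurrency: int     # 0 means no limit
    compute_environments: List[ComputeEnvironment] = field(default_factory=list)

@dataclass
class JobDefinition:
    name: str
    docker_image: str        # e.g. "myusername/myimage:latest"
    command: List[str] = field(default_factory=list)
    command_parameters: Dict[str, str] = field(default_factory=dict)
    environment: Dict[str, str] = field(default_factory=dict)

@dataclass
class Job:
    definition: JobDefinition
    queue: JobQueue
    name: Optional[str] = None    # Daestro generates a random name if omitted
```

In short: a Compute Environment points at a Cloud Auth, a Job Queue points at one or more Compute Environments, and a Job pairs a Job Definition with a Job Queue.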
Steps to Run Your First Job
Follow these steps to get your first job running on Daestro:
1. Add a Cloud Auth
Daestro needs permission to manage compute instances within your cloud provider accounts. You’ll create a “Cloud Auth” entry for each account you want to use.
- Navigate: Go to the “Cloud Auth” section in the Daestro console.
- Create New: Click “Add Cloud Auth” (or a similar button).
- Select Provider: Choose your cloud provider (e.g., DigitalOcean, Vultr) from the list.
- Name: Enter a descriptive name for this Cloud Auth entry (e.g., “My AWS Production Account”). This name will help you identify it later.
- Enter Credentials: Provide the necessary API credentials for your cloud provider account (a quick way to sanity-check these credentials is shown after this list). This typically involves:
- Vultr: Access token
- DigitalOcean: Access token
- AWS: Access Key ID and Secret Access Key.
- Google Cloud: Service Account Key JSON file.
- Azure: Subscription ID, Client ID, Client Secret, and Tenant ID.
- Save: Click “Save” to store the Cloud Auth entry.
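Before saving a Cloud Auth entry, it can help to confirm the credentials actually work. As one example, an AWS Access Key ID and Secret Access Key can be checked locally with boto3; this is a generic AWS check, not a Daestro feature, and the key values below are placeholders:

```python
import boto3

# Sanity-check an AWS key pair before saving it as a Cloud Auth entry.
# Replace the placeholders with your own Access Key ID and Secret Access Key.
sts = boto3.client(
    "sts",
    aws_access_key_id="AKIA...",        # placeholder
    aws_secret_access_key="wJalr...",   # placeholder
)

# Succeeds (and prints the account and ARN) only if the credentials are valid.
print(sts.get_caller_identity())
```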
2. Create a Compute Environment
A Compute Environment defines where and how your jobs will run (which cloud provider, instance type, and location).
- Navigate: Go to the “Compute Environments” section.
- Create New: Click “Add Compute Environment” (or similar).
- Name: Enter a descriptive name.
- Cloud Auth: Select the Cloud Auth entry you created in the previous step. This links the Compute Environment to your cloud provider account.
- Instance Type: Choose the desired instance type (e.g., t3.medium on AWS, n1-standard-1 on Google Cloud). The available options will depend on the selected Cloud Auth and provider (see the availability check after this list).
- Location: Select the specific region (and optionally, zone) where you want your instances to be launched (e.g., us-east-1 on AWS, us-central1-a on Google Cloud).
- Save: Click “Save”.
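If you are unsure whether an instance type is offered in your chosen region, you can check with the provider’s own API before creating the Compute Environment. For example, on AWS (a generic boto3 check, not part of Daestro; it assumes AWS credentials are available in your environment):

```python
import boto3

# Check that t3.medium is offered in us-east-1 before using it in a Compute Environment.
ec2 = boto3.client("ec2", region_name="us-east-1")

offerings = ec2.describe_instance_type_offerings(
    LocationType="region",
    Filters=[{"Name": "instance-type", "Values": ["t3.medium"]}],
)

available = bool(offerings["InstanceTypeOfferings"])
print("t3.medium available in us-east-1:", available)
```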
3. Define a Job Queue
Job Queues manage the order and concurrency of job execution.
- Navigate: Go to the “Job Queue” section.
- Create New: Click “Add Job Queue” (or similar).
- Name: Enter a unique name for the Job Queue (1-128 characters: a-z, A-Z, 0-9, and hyphens).
- Priority: Set a priority value between 1 and 1000. Higher values indicate higher priority. Jobs in a higher-priority queue will be processed before jobs in lower-priority queues.
- Max Concurrency: Specify the maximum number of jobs that can run concurrently within this queue. Set to 0 for no limit (see the scheduling sketch after this list).
- Max Idle Time (seconds): Set the maximum time (in seconds) that a compute instance can remain idle (without running any jobs) before it’s automatically terminated. This helps optimize resource utilization and cost.
- Compute Environments: Select one or more Compute Environments to associate with this Job Queue. Daestro will use these environments to launch instances for running jobs. It’s recommended to add multiple Compute Environments for fault tolerance (if one fails, another can be used). You can add up to 5.
- Save: Click “Save”.
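To make the priority and concurrency settings concrete, here is a minimal sketch of how a scheduler could apply them. It is purely illustrative and is not Daestro’s actual scheduling code:

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Queue:
    name: str
    priority: int                 # 1-1000; higher is dispatched first
    max_concurrency: int          # 0 means unlimited
    pending: List[str] = field(default_factory=list)
    running: List[str] = field(default_factory=list)

def dispatch(queues: List[Queue]) -> None:
    """Start pending jobs, visiting higher-priority queues first."""
    for q in sorted(queues, key=lambda q: q.priority, reverse=True):
        while q.pending:
            if q.max_concurrency and len(q.running) >= q.max_concurrency:
                break  # this queue is at its concurrency limit
            q.running.append(q.pending.pop(0))

# Example: the high-priority queue is drained first, and "batch" never
# exceeds its max concurrency of 2.
urgent = Queue("urgent", priority=900, max_concurrency=0, pending=["a", "b"])
batch = Queue("batch", priority=100, max_concurrency=2, pending=["c", "d", "e"])
dispatch([urgent, batch])
print(urgent.running)  # ['a', 'b']
print(batch.running)   # ['c', 'd']
```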
4. Add Container Registry Authentication (Optional)
If your job uses a Docker image from a private container registry, you’ll need to provide authentication credentials.
- Navigate: Go to the “Container Registry Auth” section.
- Create New: Click “Add Registry Auth” (or similar).
- Registry URL: Enter the URL of your container registry (not required for Docker Hub).
- Username: Enter your username for the registry.
- Access Token: Enter your access token (or password) for the registry. Daestro will encrypt this token using AES-256-GCM for secure storage. An example of testing these credentials locally follows this list.
- Save: Click “Save”.
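If you want to confirm the username and access token work before saving them, you can test them outside Daestro, for example with the Docker SDK for Python. This is not a Daestro feature; it assumes a local Docker daemon, the docker package, and placeholder values:

```python
import docker

# Verify registry credentials locally before adding them to Daestro.
client = docker.from_env()

# Raises docker.errors.APIError if the credentials are rejected.
client.login(
    username="myusername",             # placeholder
    password="my-access-token",        # placeholder (token or password)
    registry="registry.example.com",   # omit for Docker Hub
)
print("Login succeeded")
```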
5. Create a Job Definition
The Job Definition is the template for your job.
- Navigate: Go to the “Job Definitions” section.
- Create New: Click “Add Job Definition” (or similar).
- Name: Enter a unique name for the Job Definition (1-128 characters: a-z, A-Z, 0-9, and hyphens).
- Execution Timeout (seconds): Set the maximum time (in seconds) a job can run before it’s automatically cancelled.
- Docker Image: Specify the full name and tag of your Docker image (e.g., myusername/myimage:latest). If the image is in Docker Hub, use the username/image:tag format.
- Container Registry Auth (optional): If your Docker image is in a private registry, select the Container Registry Auth entry you created earlier.
- Command (optional): Provide the command to execute within the container, as an array of strings. This overrides the default command defined in the Docker image, and can itself be overridden when submitting the job. Example: ["python", "myscript.py"]
- Command Parameters (optional): Define key-value pairs for variables that can be used in your command. Use the format Param::<key> in your command to refer to these parameters (see the sketch after this list). Example:
- Command: ["python", "myscript.py", "--input", "Param::input_file"]
- Command Parameters: {"input_file": "data.csv"}
- Users can override these values when submitting the job.
- Environment Variables (optional): Define key-value pairs for environment variables to be set within the container. These values will be encrypted. If you mark a variable as “sensitive,” its value will not be displayed again in the console. Users can override these values during job submission. The total size of all environment variables is limited to 8KB.
- Save: Click “Save”.
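To illustrate how Param:: placeholders resolve, and to show a quick check against the 8KB environment-variable limit, here is a small sketch. It mimics the behaviour conceptually and is not Daestro’s actual implementation:

```python
def resolve_command(command, params):
    """Replace Param::<key> tokens in the command with their parameter values."""
    resolved = []
    for arg in command:
        if arg.startswith("Param::"):
            key = arg[len("Param::"):]
            resolved.append(params[key])
        else:
            resolved.append(arg)
    return resolved

command = ["python", "myscript.py", "--input", "Param::input_file"]
params = {"input_file": "data.csv"}
print(resolve_command(command, params))
# ['python', 'myscript.py', '--input', 'data.csv']

# Rough check against the 8KB total size limit for environment variables.
env = {"LOG_LEVEL": "info", "API_URL": "https://example.com"}
total_bytes = sum(len(k) + len(v) for k, v in env.items())
assert total_bytes <= 8 * 1024, "environment variables exceed the 8KB limit"
```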
6. Submit a Job
Now you can submit a job to be executed.
- Navigate: Go to the “Jobs” section (or a similar section for job submission).
- Create New: Click “Submit Job” (or similar).
- Name (optional): Enter a name for the job (letters, numbers, and dashes only). If you leave this blank, Daestro will generate a random name.
- Job Definition: Select the Job Definition you created earlier.
- Job Queue: Select the Job Queue you want to use for this job.
- Command (optional): Override the command defined in the Job Definition, if needed.
- Command Parameters (optional): Override the command parameters defined in the Job Definition, if needed. Provide values only for the keys you want to change (the merge sketch after this list shows how overrides combine with defaults).
- Environment Variables (optional): Override the environment variables defined in the Job Definition, if needed. Provide values for any keys that you want to change.
- Submit: Click “Submit” to run the job.
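The overrides you provide at submission time apply on top of the Job Definition’s defaults; keys you leave out keep their original values. A minimal sketch of that merge, with placeholder keys and values chosen only for illustration:

```python
# Defaults from the Job Definition.
definition_params = {"input_file": "data.csv", "mode": "full"}
definition_env = {"LOG_LEVEL": "info", "REGION": "us-east-1"}

# Overrides supplied when submitting the Job.
override_params = {"input_file": "fresh.csv"}
override_env = {"LOG_LEVEL": "debug"}

# Overridden keys win; everything else keeps the Job Definition's value.
effective_params = {**definition_params, **override_params}
effective_env = {**definition_env, **override_env}

print(effective_params)  # {'input_file': 'fresh.csv', 'mode': 'full'}
print(effective_env)     # {'LOG_LEVEL': 'debug', 'REGION': 'us-east-1'}
```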
Daestro will now use the selected Job Queue and its associated Compute Environments to launch an instance, pull your Docker image, and execute your job. You can monitor the job’s progress in the Daestro console.