Commit 8bc8ca1

feat!: autoscaler with scaling schedules
Add the ability to use an autoscaler to scale down to zero outside the defined schedules.

Only non-stateful MIGs can be used with autoscalers, so this commit also removes the responsibility of creating the home folder disk (atlantis-disk-0) from the MIG, effectively making it a stateless MIG. Nonetheless, destroying the group will not destroy the disk.

Add resources for the disk and the autoscaler, and a usage example. Update the README.

BREAKING CHANGE: the 50GB stateful disk is no longer created by the mig, which makes the mig no longer stateful. Additionally, if terraform destroy is executed, the disk is destroyed.

Signed-off-by: David Costa <davidamorimc@gmail.com>
1 parent b070ef7 commit 8bc8ca1

6 files changed, 208 additions and 24 deletions
README.md

Lines changed: 6 additions & 0 deletions
@@ -48,6 +48,8 @@ This Terraform module deploys various resources to run Atlantis on Google Comput

 - **Confidential VM** - A Confidential VM is a type of Compute Engine VM that ensures that your data and applications stay private and encrypted even while in use. You can use a Confidential VM as part of your security strategy so you do not expose sensitive data or workloads during processing. Note that Confidential VM [does not support live migration](https://cloud.google.com/confidential-computing/confidential-vm/docs/error-messages#live_migration_isnt_supported), so if this feature is enabled, `onHostMaintenance` will be set to `TERMINATE`.

+- **Scale to zero** - Use [scaling schedules](https://cloud.google.com/compute/docs/autoscaler/scaling-schedules#schedule_configuration_options) so that the instance group only scales up when configured, and down to zero otherwise. Useful to minimize costs.
+
 ## Prerequisites

 This module expects that you already own or create the below resources yourself.
@@ -67,6 +69,7 @@ Here are some examples to choose from. Look at the prerequisites above to find o
 - [Secure Environment Variables](https://github.com/runatlantis/terraform-gce-atlantis/tree/master/examples/secure-env-vars)
 - [Cloud Armor](https://github.com/runatlantis/terraform-gce-atlantis/tree/master/examples/cloud-armor)
 - [Shared VPC](https://github.com/runatlantis/terraform-gce-atlantis/tree/master/examples/shared-vpc)
+- [Scale to zero](https://github.com/runatlantis/atlantis-on-gcp-vm/tree/master/examples/autoscaling)

 ```hcl
 module "atlantis" {
@@ -213,8 +216,10 @@ You can check the status of the certificate in the Google Cloud Console.
 | Name | Type |
 |------|------|
 | [google-beta_google_compute_instance_group_manager.default](https://registry.terraform.io/providers/hashicorp/google-beta/latest/docs/resources/google_compute_instance_group_manager) | resource |
+| [google_compute_autoscaler.default](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/compute_autoscaler) | resource |
 | [google_compute_backend_service.default](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/compute_backend_service) | resource |
 | [google_compute_backend_service.iap](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/compute_backend_service) | resource |
+| [google_compute_disk.persistent](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/compute_disk) | resource |
 | [google_compute_firewall.lb_health_check](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/compute_firewall) | resource |
 | [google_compute_global_address.default](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/compute_global_address) | resource |
 | [google_compute_global_forwarding_rule.https](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/compute_global_forwarding_rule) | resource |
@@ -235,6 +240,7 @@ You can check the status of the certificate in the Google Cloud Console.
 | Name | Description | Type | Default | Required |
 |------|-------------|------|---------|:--------:|
 | <a name="input_args"></a> [args](#input\_args) | Arguments to override the container image default command (CMD). | `list(string)` | `null` | no |
+| <a name="input_autoscaling"></a> [autoscaling](#input\_autoscaling) | Set schedules so that the instance group only scales up when configured | <pre>object({<br/> schedules = list(object({<br/> name = string<br/> description = string<br/> schedule = string<br/> time_zone = string<br/> duration_sec = number<br/> }))<br/> })</pre> | `null` | no |
 | <a name="input_block_project_ssh_keys_enabled"></a> [block\_project\_ssh\_keys\_enabled](#input\_block\_project\_ssh\_keys\_enabled) | Blocks the use of project-wide public SSH keys | `bool` | `false` | no |
 | <a name="input_command"></a> [command](#input\_command) | Command to override the container image ENTRYPOINT | `list(string)` | `null` | no |
 | <a name="input_default_backend_security_policy"></a> [default\_backend\_security\_policy](#input\_default\_backend\_security\_policy) | Name of the security policy to apply to the default backend service | `string` | `null` | no |

examples/autoscaling/README.md

Lines changed: 35 additions & 0 deletions
@@ -0,0 +1,35 @@
+# Example usage
+
+This example uses [scaling schedules](https://cloud.google.com/compute/docs/autoscaler/scaling-schedules#schedule_configuration_options) to deploy Atlantis only during business hours.
+
+The schedules follow the syntax [described in the documentation](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/compute_autoscaler#nested_scaling_schedules), but in short:
+
+- The time zone must be a time zone from the tz database: <http://en.wikipedia.org/wiki/Tz_database>
+- The schedule field uses the extended cron format
+
+> [!NOTE]
+> It takes 2 to 3 minutes from the beginning of the scheduled time for the instance to be ready to serve requests. After the scheduled end time, it takes approximately 10 minutes for the instance to be destroyed.
+
+Read through the below before you deploy this module.
+
+- [Prerequisites](#prerequisites)
+- [How to deploy](#how-to-deploy)
+- [After it's successfully deployed](#after-its-successfully-deployed)
+
+## Prerequisites
+
+This module expects that you already own or create the below resources yourself.
+
+- Google network, subnetwork and a Cloud NAT
+- Service account, [specifics can be found here](../../README.md#service-account)
+- Domain, [specifics can be found here](../../README.md#dns-record)
+
+If you prefer an example that includes the above resources, see [`complete example`](https://github.com/runatlantis/atlantis-on-gcp-vm/tree/master/examples/complete).
+
+## How to deploy
+
+See [`main.tf`](https://github.com/runatlantis/atlantis-on-gcp-vm/tree/master/examples/autoscaling/main.tf) and the [`server-atlantis.yaml`](https://github.com/runatlantis/atlantis-on-gcp-vm/tree/master/examples/autoscaling/server-atlantis.yaml).
+
+## After it's successfully deployed
+
+Once you're done, see [Configuring Webhooks for Atlantis](https://www.runatlantis.io/docs/configuring-webhooks.html#configuring-webhooks).
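
The schedule fields described in this README can be read as in the sketch below. It is an illustration rather than part of the commit: the schedule name, time zone and hours are hypothetical, and the field-by-field comment assumes the usual five-field cron layout (minute, hour, day of month, month, day of week).

```hcl
locals {
  # Hypothetical value for the module's `autoscaling` input.
  autoscaling = {
    schedules = [
      {
        name        = "weekday-mornings" # hypothetical name
        description = "Run Atlantis on weekday mornings"

        # minute 0, hour 9, any day of month, any month, Monday to Friday
        schedule = "0 9 * * 1-5"

        # Any name from the tz database
        time_zone = "Europe/Lisbon"

        # The window stays open for 4 hours after each start time
        duration_sec = 4 * 60 * 60
      },
    ]
  }
}
```

Passing `local.autoscaling` as the module's `autoscaling` argument would request one instance from 09:00 to 13:00 Lisbon time on weekdays, subject to the startup and teardown delays mentioned in the note above.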

examples/autoscaling/main.tf

Lines changed: 93 additions & 0 deletions
@@ -0,0 +1,93 @@
+locals {
+  project_id   = "<your-project-id>"
+  network      = "<your-network>"
+  subnetwork   = "<your-subnetwork>"
+  region       = "<your-region>"
+  zone         = "<your-zone>"
+  domain       = "<example.com>"
+  managed_zone = "<your-managed-zone>"
+
+  github_repo_allow_list = "github.com/example/*"
+  github_user            = "<your-github-handle>"
+  github_token           = "<your-github-token>"
+  github_webhook_secret  = "<your-github-webhook-secret>"
+}
+
+# Create a service account and attach the required Cloud Logging permissions to it.
+resource "google_service_account" "atlantis" {
+  account_id   = "atlantis"
+  display_name = "Service Account for Atlantis"
+  project      = local.project_id
+}
+
+resource "google_project_iam_member" "atlantis_log_writer" {
+  role    = "roles/logging.logWriter"
+  member  = "serviceAccount:${google_service_account.atlantis.email}"
+  project = local.project_id
+}
+
+resource "google_project_iam_member" "atlantis_metric_writer" {
+  role    = "roles/monitoring.metricWriter"
+  member  = "serviceAccount:${google_service_account.atlantis.email}"
+  project = local.project_id
+}
+
+module "atlantis" {
+  source     = "bschaatsbergen/atlantis/gce"
+  name       = "atlantis"
+  network    = local.network
+  subnetwork = local.subnetwork
+  region     = local.region
+  zone       = local.zone
+  service_account = {
+    email  = google_service_account.atlantis.email
+    scopes = ["cloud-platform"]
+  }
+  # Note: environment variables are shown in the Google Cloud UI
+  # See the `examples/secure-env-vars` if you want to protect sensitive information
+  env_vars = {
+    ATLANTIS_GH_USER           = local.github_user
+    ATLANTIS_GH_TOKEN          = local.github_token
+    ATLANTIS_GH_WEBHOOK_SECRET = local.github_webhook_secret
+    ATLANTIS_REPO_ALLOWLIST    = local.github_repo_allow_list
+    ATLANTIS_ATLANTIS_URL      = "https://${local.domain}"
+    ATLANTIS_REPO_CONFIG_JSON  = jsonencode(yamldecode(file("${path.module}/server-atlantis.yaml")))
+  }
+
+  autoscaling = {
+    schedules = [
+      # Monday through Friday, between 7h30 and 19h30
+      {
+        name         = "business-hours"
+        description  = "Deploy during business hours"
+        schedule     = "30 07 * * 1-5"
+        time_zone    = "Europe/London"
+        duration_sec = 12 * 60 * 60
+      },
+      # Monday through Friday, all day
+      # {
+      #   name         = "mon-fri"
+      #   description  = "Deploy during weekdays"
+      #   schedule     = "00 00 * * 1-5"
+      #   time_zone    = "Europe/London"
+      #   duration_sec = 24 * 60 * 60
+      # },
+    ]
+  }
+
+  domain  = local.domain
+  project = local.project_id
+}
+
+# As your DNS records might be managed at another registrar's site, we create the DNS record outside of the module.
+# This record is mandatory in order to provision the managed SSL certificate successfully.
+resource "google_dns_record_set" "default" {
+  name         = "${local.domain}."
+  type         = "A"
+  ttl          = 60
+  managed_zone = local.managed_zone
+  rrdatas = [
+    module.atlantis.ip_address
+  ]
+  project = local.project_id
+}

examples/autoscaling/server-atlantis.yaml

Lines changed: 6 additions & 0 deletions
@@ -0,0 +1,6 @@
+repos:
+  - id: /.*/
+    apply_requirements: [mergeable]
+    allowed_overrides: [apply_requirements, workflow]
+    allow_custom_workflows: true
+    delete_source_branch_on_merge: true

main.tf

Lines changed: 53 additions & 24 deletions
@@ -175,24 +175,10 @@ resource "google_compute_instance_template" "default" {

   # Persistent disk for Atlantis
   disk {
-    device_name  = "atlantis-disk-0"
-    disk_type    = var.persistent_disk_type
-    mode         = "READ_WRITE"
-    disk_size_gb = var.persistent_disk_size_gb
-    auto_delete  = false
-    labels = merge(
-      local.atlantis_labels,
-      {
-        "disk-type" = "data"
-      },
-    )
-
-    dynamic "disk_encryption_key" {
-      for_each = var.disk_kms_key_self_link != null ? [1] : []
-      content {
-        kms_key_self_link = var.disk_kms_key_self_link
-      }
-    }
+    device_name = "atlantis-disk-0"
+    mode        = "READ_WRITE"
+    source      = google_compute_disk.persistent.name
+    auto_delete = false
   }

   network_interface {
@@ -226,6 +212,27 @@ resource "google_compute_instance_template" "default" {
   }
 }

+resource "google_compute_disk" "persistent" {
+  name = var.name
+  type = var.persistent_disk_type
+  size = var.persistent_disk_size_gb
+  zone = var.zone
+  labels = merge(
+    local.atlantis_labels,
+    {
+      "disk-type" = "data"
+    },
+  )
+
+  dynamic "disk_encryption_key" {
+    for_each = var.disk_kms_key_self_link != null ? [1] : []
+    content {
+      kms_key_self_link = var.disk_kms_key_self_link
+    }
+  }
+  project = var.project
+}
+
 resource "google_compute_health_check" "default" {
   name               = var.name
   check_interval_sec = 1
@@ -272,17 +279,13 @@ resource "google_compute_instance_group_manager" "default" {
     port = local.atlantis_port
   }

-  stateful_disk {
-    device_name = "atlantis-disk-0"
-    delete_rule = "NEVER"
-  }
-
   auto_healing_policies {
     health_check      = google_compute_health_check.default_instance_group_manager.id
     initial_delay_sec = 30
   }

-  target_size = 1
+  # We cannot set target_size when using an autoscaler
+  target_size = var.autoscaling == null ? 1 : null

   update_policy {
     type = "PROACTIVE"
@@ -296,6 +299,32 @@ resource "google_compute_instance_group_manager" "default" {
   provider = google-beta
 }

+resource "google_compute_autoscaler" "default" {
+  count = var.autoscaling == null ? 0 : 1
+
+  name   = var.name
+  zone   = var.zone
+  target = google_compute_instance_group_manager.default.id
+
+  autoscaling_policy {
+    max_replicas    = 1 # Allow at most one instance
+    min_replicas    = 0 # Allow scaling down to zero
+    cooldown_period = 60
+
+    dynamic "scaling_schedules" {
+      for_each = var.autoscaling.schedules == null ? [] : var.autoscaling.schedules
+      content {
+        name                  = scaling_schedules.value.name
+        description           = scaling_schedules.value.description
+        min_required_replicas = 1
+        schedule              = scaling_schedules.value.schedule
+        time_zone             = scaling_schedules.value.time_zone
+        duration_sec          = scaling_schedules.value.duration_sec
+      }
+    }
+  }
+}
+
 resource "google_compute_global_address" "default" {
   name    = var.name
   project = var.project
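
Because the disk is now an ordinary Terraform resource, `terraform destroy` removes it, as the BREAKING CHANGE note in the commit message says. One possible guard, which is not part of this commit, is Terraform's `prevent_destroy` lifecycle argument. The sketch below is a standalone illustration: the name and zone are hypothetical, and the type and size simply mirror the `pd-ssd` default and the 50GB mentioned in the commit message.

```hcl
resource "google_compute_disk" "persistent" {
  name = "atlantis"       # hypothetical name
  type = "pd-ssd"
  size = 50               # GB
  zone = "europe-west1-b" # hypothetical zone

  lifecycle {
    # Terraform rejects any plan that would destroy this resource.
    prevent_destroy = true
  }
}
```

With this argument set, a destroy plan fails with an error until the argument is removed from the configuration, which trades some convenience for protection of the Atlantis data.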

variables.tf

Lines changed: 15 additions & 0 deletions
@@ -218,4 +218,19 @@ variable "persistent_disk_type" {
   type        = string
   description = "The type of persistent disk that Atlantis uses to store its data on"
   default     = "pd-ssd"
+
+}
+
+variable "autoscaling" {
+  description = "Set schedules so that the instance group only scales up when configured"
+  type = object({
+    schedules = list(object({
+      name         = string
+      description  = string
+      schedule     = string
+      time_zone    = string
+      duration_sec = number
+    }))
+  })
+  default = null
 }
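
The new variable accepts any number for `duration_sec`, so a mistake would only surface when the provider calls the API. A `validation` block could catch it at plan time. The sketch below is a hypothetical extension, not part of this commit, and the 300-second lower bound is an assumption about the autoscaler's minimum schedule duration that should be checked against the provider documentation.

```hcl
variable "autoscaling" {
  description = "Set schedules so that the instance group only scales up when configured"
  type = object({
    schedules = list(object({
      name         = string
      description  = string
      schedule     = string
      time_zone    = string
      duration_sec = number
    }))
  })
  default = null

  # Hypothetical addition: reject windows shorter than the assumed minimum.
  validation {
    condition = var.autoscaling == null ? true : alltrue([
      for s in var.autoscaling.schedules : s.duration_sec >= 300
    ])
    error_message = "Each schedule's duration_sec must be at least 300."
  }
}
```

The `null` check mirrors the pattern the commit uses for `count` and `target_size`, so leaving `autoscaling` unset still passes validation.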
