Atlantis: The Terraform Automation Powerhouse
E19

Atlantis is my second favourite
spin off of the Stargate series.

That's not what this episode is about.

Wait.

What was your first?

In all seriousness, though, David
couldn't even stop asking questions

because it was so interesting.

Terraform automation is a very
important aspect of production

engineering, so I was genuinely excited.

You weren't the only one.

Today, we got to talk with one of the
maintainers of Project Atlantis, a

Terraform pull request automation system.

So let's go deep, deep undersea, and
explore the wonderful world of Atlantis.

Not Stargate.

Alright, while David is having some
tea, uh, we are welcoming a new

guest today, uh, to talk a little
bit about the Atlantis project.

Uh, Pepe do you wanna introduce yourself
and tell us a little bit about you?

Sure.

I'm, uh, my name is, uh, Pepe.

Well, it's my nickname.

My real name is Jose, but everybody
calls me Pepe in the tech industry.

So let's, let's, let's give it Pepe.

I have been with the Atlantis project
for about four years, almost, I think.

I am not the original developer of
the Atlantis project; I inherited it

from, uh, Luke and Mishra, who
both work at HashiCorp now.

and I became involved with the
project due to the fact that I

needed a feature really badly.

I was like, I, I need to do this.

And I used to work at Sonatype, which
has a lot of open source

projects, and basically they supported
me in introducing a feature, and somehow

I ended up owning the project with,
uh, Dylan and, uh, Cheerio, who are the

other core maintainers of the project.

So, yeah.

I know you said HashiCorp.

Yes, yes.

Yeah.

Yeah.

Yeah.

I was gonna say, I know, I know you said
HashiCorp, but all I heard was IBM, so.

Brutal.

Brutal.

There was a big discussion
at some point.

We were kind of nervous at some point
about it because Atlantis was

created in Vancouver, uh, literally
half a block away from where I used to work.

So maybe me and Luke and Mishra
ate in the same restaurant on the

corner, uh, of the Ray Wall Street,
because it was created at Hootsuite.

So when they were at Hootsuite,
they created Atlantis, and

then they open sourced it.

Hootsuite was happy to do that, and
then it became an open source project.

And then, um, Mishra and
Luke moved to HashiCorp.

And when I joined, the
Atlantis project was a bit...

It didn't have much activity.

Because, where I used to work,
at Sonatype, we have, uh, a

lot of open source libraries,

I asked, because
we're using Atlantis:

What if we actually fork Atlantis
under Sonatype and we maintain it?

And they were okay.

There's a bunch of, uh,
Go developers there.

So, um, and I was not a Golang developer
at all, and I'm not a Golang developer.

I can't call myself a Golang developer,
so I learned Golang just to build

this feature, and that's when basically
that kind of sparked Luke to

try to find more,
uh, contributors.

Um, so this was basically the first
time that Atlantis ever had external

contributors, except for Mishra and Luke.

And that's how we became part of the
project then, and, well, a few years after

that, after the change of license of,
uh, Terraform, uh, we got an exception from

HashiCorp, because we have the binary
of Terraform within the Atlantis image.

So we were not sure if that was
in breach of the new license.

And then at the same time, that's
when we started asking the question,

should we move it to the CNCF?

Should we apply to be a CNCF project?

And that's when basically,
uh, Luke dropped from the project

and gave us ownership of the
whole project, to the three of us.

Awesome.

For anyone listening, let's
make two assumptions right now.

One, they know what Terraform is and
two, they don't know what Atlantis is.

Can you give us the kind of
the rundown of what is Atlantis

and how do people integrate it?

Yeah, so Atlantis is basically,
um, I would say maybe one of the

first, if not the first GitOps
workflow integration with Terraform.

Um, so, uh, the idea is that developers
don't have to go and use, you know, their

local environment or another
tool to run their Terraform workflows to

deploy their infrastructure, and it was moving
that responsibility, or that tooling,

to the left, closer to the developers.

So Atlantis integrates with the
VCS, which could be GitHub, GitLab,

Azure DevOps, Gitea, Bitbucket.

And basically you create a webhook
configuration in your repo, and then

the user, the developer, can just
basically interact with Atlantis commands

as comments within the PR, and Atlantis
will respond with the Terraform plan

against the workflow that is defined.

Atlantis can do it automatically,
like if you have a few Terraform

files in the repo, it has auto
discovery, uh, auto planning, and,

and it discovers those files, and
then it will basically tell you.

Hey, here's your plan, and it
will create a formatted comment

within your PR so you can see it.

And then people that review
the PR can see it too.

And then it's like, okay, this is
what is supposed to, uh, what, uh,

Terraform wants to, uh, uh, plan.

And then you can have a better
idea of what the, the outcome

of that PR is going to be.

So then you get kind of like a sense
of what it is going to look like in the

current infrastructure after the PR is
merged, uh, when the deployment happens.

So all that interaction
is based on commands.

So you do atlantis plan and
then you get the comment.

If you have, uh, autoplan
enabled, it will do it by itself.

Um, and then, after the reviews and
after the rules that you have in your repo,

Atlantis will look at those rules
in the VCS, and then it will

say, okay, well now this is
approved, it's ready for merge.

And then you run atlantis apply, and
then it runs terraform apply for you.

And Atlantis is not a wrapper.

Atlantis just runs your
binary of Terraform.

So people use it to run CDKTF, some people
use it to run Terragrunt, some people

use it to run Atmos from Cloud Posse.

So you can actually run anything that
outputs a Terraform plan at some point.

So you can glue a lot of
scripts in between, run policies

and a bunch of other things.

So that's basically what Atlantis is. You
can extend it quite a lot, but in

essence it is basically, uh, a GitOps
interaction with your Terraform workflows.
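
As a rough illustration of the comment-driven flow described above, here is a minimal Python sketch of how a PR comment could be recognized as a command. This is not Atlantis's actual parser (Atlantis is written in Go); the `Command` class and `parse_comment` function are hypothetical names, and the sketch only handles the `plan` and `apply` commands with the `-d` (directory), `-w` (workspace), and `-p` (project) flags.

```python
import shlex
from dataclasses import dataclass
from typing import Optional


@dataclass
class Command:
    """A parsed PR comment command (hypothetical structure for illustration)."""
    name: str                        # "plan" or "apply"
    directory: Optional[str] = None  # value of -d
    workspace: Optional[str] = None  # value of -w
    project: Optional[str] = None    # value of -p


def parse_comment(body: str) -> Optional[Command]:
    """Return a Command if the comment looks like `atlantis plan|apply ...`, else None."""
    try:
        tokens = shlex.split(body.strip())
    except ValueError:
        # Unbalanced quotes and similar: treat as an ordinary comment.
        return None
    if len(tokens) < 2 or tokens[0] != "atlantis" or tokens[1] not in ("plan", "apply"):
        return None

    cmd = Command(name=tokens[1])
    flags = {"-d": "directory", "-w": "workspace", "-p": "project"}
    i = 2
    while i < len(tokens):
        attr = flags.get(tokens[i])
        if attr is None or i + 1 >= len(tokens):
            return None  # unknown flag or missing value: ignore the comment
        setattr(cmd, attr, tokens[i + 1])
        i += 2
    return cmd


if __name__ == "__main__":
    print(parse_comment("atlantis plan -d envs/prod"))
    print(parse_comment("looks good to me"))
```

A webhook handler would call something like `parse_comment` on each new PR comment and ignore everything that returns `None`, so ordinary review discussion never triggers a run.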

Yeah.

I never knew that angle. I use
CDKTF quite a lot these days.

And I like, I've not used it in
production, but I like the Atlas

project and the use of HCL for
database schemas and, and migrations.

And I didn't even think for a second that
you could run Atlantis as your kinda

CI/CD glue at that point to get kind of
visibility into what is gonna change.

And I just want to focus on,
uh, kind of that one statement

is like for anyone listening to
that description, it was great.

Right?

The take home is Atlantis seems
to be that window into what will

change if I merge this pull request?

Is that, is

That's correct.

That that's exactly correct.

Yeah.

Yeah, yeah.

It's kind of like the snapshot
of, like, what is this PR going

to change in my infrastructure?

Obviously, we can get into
discussions about, you know, apply

before merge and apply after merge.
Atlantis is apply before

merge, which has its drawbacks.

But basically, at the point that the
command was run, the infrastructure

looked like that, and your code was
trying to change that snapshot in time

of the infrastructure in that moment.

I guess this goes down to that
wonderful naive assumption that many

people make: if they merge a pull
request, the code that's gonna end up in

the main branch is the code that was
in the PR, and that's not always the

That's correct, because in between,
this is, this is kinda like a tabs

versus spaces conversation in any
tool that relates to Terraform, due

to the state and how the state works.

Because the state is not a synchronous
entity that by itself will

actually check the status of the
infrastructure; it is only updated by triggers.

Any tool that uses a GitOps flow
will have a delay between what reality

is, which is your infrastructure, your
live infrastructure, and the PR.

So if someone goes in and has the
ability to change manually something

that you want to change in your plan,
that plan is already out of date.

Damn those ClickOps-ers.

But anyway, uh, you
don't work for IBM.

Right.

No, I don't, I don't, I work
for a consulting company.

Yeah.

So I can say, does, does
Atlantis work with OpenTofu?

Yes, it does.

Yeah, yeah, yeah, yeah.

Yeah.

So we, we added, uh, OpenTofu support.

Actually, this is a, a bit of a
misconception about Atlantis, or

maybe it's not well documented,
so maybe it's our fault too.

So you can run anything in
Atlantis that produces Terraform

plan output.

A plan file.

As long as it runs Terraform, or you
generate a Terraform plan file,

you can do anything you want.

You can create a Jira ticket in
between and then run Terraform.

You can do all that stuff because
you can add your custom

workflow with your custom script.

A lot of people even use, you know,
connections with AD to figure out if

users are part of a group in AD before
the plan gets run, and a bunch of things.

So you can customize this, uh, really
heavily, so you can run anything in it.

So before we actually had
official OpenTofu support,

people were already using
OpenTofu, because you can customize the

workflow to just run the tofu binary
instead of the Terraform binary.

So there's a blog post
on our site about that.

And then we released official OpenTofu
support, and what I mean by official

support is that now Atlantis is able
to automatically download the version

of OpenTofu that you described in the,
in the workflow, uh, required versions.

So that's the difference between kind of
like official and unofficial support:

that we are able to download it, because
we had that capability for Terraform

and we wanted to offer the same in OpenTofu.
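
To make the "customize the workflow to just run the tofu binary" idea concrete, here is a small Python sketch that builds a server-side repo config as JSON, with a custom workflow whose steps call `tofu` directly. This is an illustration under assumptions, not the configuration from the blog post mentioned above: the workflow name `opentofu`, the function name `tofu_workflow`, and the exact command-line arguments are made up for the example, and the key names (`repos`, `workflows`, `steps`, `run`, and the `$PLANFILE` variable) follow my reading of the Atlantis custom workflow docs, so check them against the version you run.

```python
import json


def tofu_workflow(binary: str = "tofu") -> dict:
    """Build a repo config whose plan/apply steps invoke `binary` directly.

    Assumption: the config is handed to the server as JSON; the workflow
    name "opentofu" is an arbitrary label chosen for this sketch.
    """
    return {
        # Apply the custom workflow to every repo the server handles.
        "repos": [{"id": "/.*/", "workflow": "opentofu"}],
        "workflows": {
            "opentofu": {
                "plan": {
                    "steps": [
                        {"run": f"{binary} init -input=false"},
                        # The plan must be written where Atlantis expects it.
                        {"run": f"{binary} plan -input=false -out $PLANFILE"},
                    ]
                },
                "apply": {
                    "steps": [
                        {"run": f"{binary} apply $PLANFILE"},
                    ]
                },
            }
        },
    }


if __name__ == "__main__":
    print(json.dumps(tofu_workflow(), indent=2))
```

The same shape works for any tool that ends up producing a plan file, which is the point made in the conversation: the steps are just shell commands, so a wrapper script or an extra check can be slotted in before the plan step.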

Do you find that you have like some
usage stats at this point where you can

say like, okay, you know, 50% of people
are using OpenTofu versus Terraform.

I'm, I'm actually kind of
curious if you have that.

Actually I do.

No, they're, they're not,
they're not, uh, up to date.

Because we did a survey, uh, a while back.

I remember sending
this, um, screenshot to the

OpenTofu, uh, Slack channel.

Because at that point, uh,
OpenTofu was, how long?

I can't recall the version
of OpenTofu, but about

20 or 30% of the people were using
OpenTofu at that point already.

So that's way higher than I expected.
I was very surprised because, at that point,

OpenTofu was in, like, super, super early
stages and people were already using it.

I was like, wow, that's, that's crazy that
a lot of people are using it, so, yeah.

Yeah.

All right.

I'm gonna ask a, a flippant question.

Please don't hate me too much, right?

But, uh.

Atlantis is a wrapper, if you will,
of Terraform or OpenTofu, et cetera.

It basically just gets plan
output and then post to a comment.

Right.

But you've been maintaining this
project for four years, I'm assuming

it's got a forward trajectory.

Things are changing, like is it not done?

Like what comes next?

Like what are you,

Um, yeah.

Ooh, that's a hard question
to answer right now.

We literally had a meeting
last week, uh, or two weeks

ago to talk about the future.

This is a problem.

We are all working, we have our day
jobs, as many, or pretty much 99%

of contributors in an open source
project do; they do not work a hundred

percent of the time on the project.

So, and a lot of us have
changed kind of like, uh,

responsibilities within our jobs.

So, or we are more busy than before
or are less busy, but we need to focus

on learning something new or whatever.

So we are all kind of like
in a transition period where,

uh, we are trying to figure
out what we are going to do.

There's new people that joined us, um,
and they're core contributors, not

yet maintainers, so we are hoping that
they will have the time to join us

as maintainers.

It's a lot of commitment.

It's a lot of code.

We support all those
VCSs that I mentioned.

It's hard to keep up with the API
changes of the VCSs and then, on

top of that, try to move towards

the future, the future of Atlantis.
Right now, the short-term future is

that we are going to release version
1.0 as official, um, due to the fact

that it has been used in production for
many, many years already, but we never

moved to basically follow SemVer

correctly. We are already in
production, therefore we should

be at, uh, version 1.0.

Uh, but for that we need to do
some work because, um, you know,

the docs, for example, don't
reflect versioning of the features.

So we want to reflect that we already
deprecated a bunch of, uh, old, uh,

configs, uh, and, um, parameters
that we don't support anymore.

So we did that work.

So there's a few other things
that we need to do to, to get to,

like release 1.0 and, and then.

2.0 will be basically what we
want, we would like to see in the

future, but for that we need a lot
of more maintainers to help us out.

So ideally what will happen with
Atlantis is that it is going to

be a queueing system for PRs that will
be synchronous to the ClusterAPI.

So ideally what will happen is that
we will use something like,

I don't know, Temporal or something
like that, that will be able to manage

queues of PRs, incoming PRs for
multiple VCSs at the same time, and then

they can basically have worker nodes
that will run the plan, and there

will be some sort of shared metadata
storage where, if the node goes down, we

can still find information and so on,

and have statuses globally across
the whole cluster, which is not true

right now. You can run multiple Atlantis
instances in HA mode, but it's a bit

tricky and a bit slow because you need
to share storage and things like that.

So hopefully we'll get,
we can get to that point.

Um, interestingly enough, the Atlantis
project had a, um, a huge

amount of contribution from Lyft.

Um, there were two developers from Lyft.

Lyft uses Atlantis quite
heavily, uh, internally.

And some of
the features that we have right now,

for example live logs and things
like that, are actually due to

their contributions, and at some point
they wanted to actually create, uh,

something called Neptune under our repo

and to be able to use Temporal
as a queueing system for Atlantis, so that

was what they were building inside.

So we were basically going to move
towards that, but then there were

some disagreements between HashiCorp
and then that could not happen.

Hopefully, you know, in the future, that's
where we will get to, where we can

have like a more distributed system for
Atlantis instead of having, like... Atlantis

actually scales really well vertically.

We have people that have, uh, 500
PRs a day going through an Atlantis

server, um, and only one Atlantis
server, for multiple environments.

So it really is really, uh,
performant, but obviously all that

depends on how you, you know, structure
your Terraform projects and so on.

No.

So, yeah.

I guess I wanna make sure I
captured the, the kinda problem

statement there correctly, right?

Like, GitHub now has merge queues, so
we have some sort of guarantee that,

well, we do have guarantees that pull
requests will be merged in a certain order.

So it's not the terraform apply,

I think, that is causing
problems here, right?

Correct me if I'm wrong, the challenge
that you have, and why horizontal

scale and Temporal may be really
useful, is that if you've got multiple

PRs being opened at the same time and
they're all running a terraform plan,

those will be very different depending
on when they're actioned sequentially, or

even if they happen at the same time,
the plans would actually be incorrect,

and that's the problem that if you push
this, well, I guess if you push it through

Temporal or some sort of queuing system,
it still doesn't give you a guaranteed

output, but it's gonna be a, a more
consistent, I don't know, like that's

I, yeah.

The problem is mostly, like, which
Atlantis server picks up the PR, and if

that Atlantis server dies and then
someone, you know, runs another command,

will we have enough information to
continue the work that was lost?

And the other problem is that
parallelism in Terraform plans is hard.

Within one specific instance, like if you
run multiple plans of the same repo

on your computer, it is actually tricky.

You could potentially lock the project
completely, and then the next person,

if you have locking enabled,
will not be able to, uh, use it.

You can use workspaces
and do things like that.

So there is a batch of
problems that the queue and

worker nodes could solve

that not only pertain to the Terraform
plan in itself, but to who manages

that request at that particular moment,

and if we can run multiple PRs
from the same repo, for example,

on, uh, multiple different servers.

Yeah.

And then synchronize that metadata back.
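
The queue-plus-workers idea discussed above can be sketched in a few lines of Python. This is a toy, single-process model, not how Atlantis or Temporal work internally: `ProjectLocks`, `worker`, and `run` are hypothetical names, the in-memory dictionary stands in for the shared metadata storage, and "planning" is reduced to recording a status. What it does show is the locking behaviour from the conversation: the first PR to reach a project takes the lock, and a second PR against the same project is reported as locked rather than planned in parallel.

```python
import queue
import threading


class ProjectLocks:
    """In-memory stand-in for shared lock storage (hypothetical)."""

    def __init__(self):
        self._held = {}                  # (repo, directory) -> PR number
        self._mutex = threading.Lock()

    def try_acquire(self, key, pr):
        """Take the lock for `key` unless another PR already holds it."""
        with self._mutex:
            holder = self._held.setdefault(key, pr)
            return holder == pr

    def release(self, key, pr):
        """Drop the lock, e.g. after the PR is merged or closed."""
        with self._mutex:
            if self._held.get(key) == pr:
                del self._held[key]


def worker(jobs, locks, results):
    """Pull (pr, project_key) jobs until a None sentinel arrives."""
    while True:
        job = jobs.get()
        if job is None:
            jobs.task_done()
            return
        pr, key = job
        status = "planned" if locks.try_acquire(key, pr) else "locked"
        results.append((pr, key, status))
        jobs.task_done()


def run(jobs_in, n_workers=2):
    """Feed jobs to a pool of workers and return their recorded statuses."""
    jobs, locks, results = queue.Queue(), ProjectLocks(), []
    threads = [
        threading.Thread(target=worker, args=(jobs, locks, results))
        for _ in range(n_workers)
    ]
    for t in threads:
        t.start()
    for job in jobs_in:
        jobs.put(job)
    for _ in threads:
        jobs.put(None)   # one shutdown sentinel per worker
    jobs.join()
    for t in threads:
        t.join()
    return results


if __name__ == "__main__":
    prod = ("org/repo", "envs/prod")
    for row in run([(1, prod), (2, prod), (3, ("org/repo", "envs/dev"))]):
        print(row)
```

In a real multi-server setup the hard parts are exactly the ones raised in the discussion: the lock table and job state have to live somewhere every worker can reach, and a job picked up by a worker that dies has to be recoverable by another one.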

Okay, so how deep is the Git
and GitHub integration then?

And the use case is the one,

again, I just talked about. If we
look at two pull requests, they run

a plan, and then one gets merged.

We're now at the point where pull request
B is outta sync with the main branch.

Now, I could go to the pull request
and say, rebase, which would

trigger a rebuild because the
actual pull request was updated.

But does Atlantis have enough
understanding to block or at least warn

the user before merging that pull request,

Well, that would be very
tricky with multiple servers.

So that would be a challenge if
we move to that worker type of

No, no, that would be very tricky.

We have different modes
of repo settings for, like,

divergent and so on.

I would say that they are not as
stable as you would think,

or, I wouldn't say not as stable;

it's that

the definition of divergent for
certain people is different from

what we offer in the code, and
that's where it gets a bit tricky.

Um, so yeah, so the problem of merging,
or the workflow, how you work with the repo,

will directly affect how, you know, the
multiple worker nodes will interact too.

So all those kind of
like design decisions are going

to be really hard to get there,
but I mean, it is possible.

So it's a matter of, like, having
enough people to, like, you know, sit down,

write it down, test it, and figure it
out. But we are not the only ones that,

you know, want to see something like that.
A lot of people... there is a huge issue in

Atlantis that has been there for many,
many years, of people talking about how

we can get Atlantis to be real HA, um,
instead of, you know, having to do this

kinda like shared storage. And we have Redis
support for locking and some metadata.

So it kinda works, but it
doesn't actually do it at a huge

scale for, like, people that have, you
know, 500, a thousand PRs a day from

multiple teams and things like that.

So.

That will be the goal.

Yeah.

But it's going to be tricky, that's
for sure, and the multiple VCS, uh,

thing is a huge deal for us.

You will not believe how complicated
it gets from one VCS client

library we have to the other one.

So from GitLab to GitHub,
what is a status?

Just a simple status.

What is a done status?

How is it defined within the VCS?

It's always a problem.

Um, and what constitutes a mergeable PR
within GitLab, Azure DevOps, and GitHub?

They're totally different.

So, that part is actually really hard.

We would love to see more
contribution from the VCSs themselves.

Into Atlantis, but that's a hard one.

We do have contributions from
GitLab, actually, uh, because

GitLab uses Atlantis internally.

Um, so every now and then we actually
do have, uh, PRs from GitLab coming

in to change certain definitions because
the API changed and so on.

So, so yeah.

And another part of the 2.0
and the future is, like, to create

a pluggable VCS interface so that
people that have other types of VCSs,

or they want to, uh, you know, for
example, play with version two of

the GitHub API or GitLab API, can do
it easily without breaking the core.

So that's another thing that
we are thinking about. So many,

many things in the list of, like,
what we would like to see for 2.0.

So I'm gonna be the one to ask it
'cause I know David is thinking it: is

Rust anywhere in that conversation?

Mostly I'm trolling David, but you know.

Yeah, I asked that question not long
ago to the other core, uh, contributors

and we were talking about it.

I guess the answer is
kind of simple in a way,

if you think about it. Because we are a
CNCF project now, um, we are moving to the

incubation project and then hopefully at
some point we are going to be released

as a graduated CNCF project.

The ecosystem in the CNCF
is Golang, and

that's the simple answer to it.

Yeah.

So if Rust had a major, I
guess, appearance within the

other, um, CNCF projects, it would
be an easier sell in that case.

So, um, and there are a lot of things
within the ecosystem of

the CNCF that support, I guess,
you know, Golang more than others.

Um, so if I was going to maybe ask for
help for something within Atlantis,

which is already in Golang, I might have
more luck finding someone within the CNCF

to help us out, compared with Rust.

You.

But it's a but but it's a choice.

It's a total choice.

I mean, you don't have to write
it in Golang; if you want, do it

in C. I mean, there's a lot of stuff
that is already in other

languages, but Golang is the main
language within the CNCF projects.

I have thoughts on that, but
I'll save that for later.

I mean, you also are consuming Terraform,
I'm assuming, as a Go module, right?

So you,

No, we actually download the
binary. No, we download the binary.

So when Terraform
changed the license, that's

why we didn't have an issue, uh, with
HashiCorp, or one of the reasons.

But, but yeah, so we don't,
we just download the binary.

That's, that's what we do, and
then we just run against it.

That's all we do.

Yeah.

All right.

Well, I'm gonna come back to some more
hard questions then, because Atlantis

is obviously a very cool project.

It's in the CNCF, it has, you
know, lots of contributors forward

momentum to 1.0 in a future 2.0.

Right.

But we're talking about a project that
has the ability to execute Terraform,

which is obviously downloaded as a binary,
so there's one security concern there.

I'm sure there's checks and balances in
place, but you know, we'll get there.

It also has the ability to run against all
of my cloud native infrastructure, whether

that be on AWS, GCP, any integrations,
Datadog, New Relic, anything

I can use Terraform for these days, which
is pretty much everything.

So you've got access to all of
my credentials: security concern.

And we've put this into a repository,
which may or may not be an open source,

publicly available repository with
CI/CD runs, with log outputs, and the

ability for people to execute commands
or comments, which I'm assuming

is how Atlantis is orchestrated.

So security concern.

So maybe you can guide me through
what, what checks and balances are

in place to make sure that Atlantis
is a secure project for me to run

in my production infrastructure,
whether it's open source or not.

Yeah, I think that the official
answer is you need to follow the cloud

provider guidelines for security
when running any Terraform code.

Doesn't matter if you run it in Atlantis.

So those guidelines need to be
already in your project to be secure,

because we run whatever you
want within your Terraform project.

So if you're running local-exec
and then doing

Bitcoin mining, we can't stop it.

That would be in your code, within
your Terraform project.

So, um, so after Atlantis
runs, um, it gets a credential somehow

to pass to the provider,
which, that's kind of

like the first big security
problem that you want to solve.

Then we run any code that is there.

So it is crucially important that whatever
gets to the repo obviously has

some sort of checks and balances within
the company or the project, whether

it's public or not, and that the
code is, you know, set to be safe. And

for that you can use any tools
for IaC, you know, security checks, you

know, static code analysis tools like
Snyk, FOSSA, or whatever supports

Terraform. And after that, then it
becomes kind of like an interaction

between how you deploy Atlantis
into the cloud provider and how you are

interacting with your cloud provider.

And that is where kind of like the

Well-Architected Framework type of
doc will come into consideration,

and then you realize, okay, so
what is the safest

way to pass credentials to a CI/CD pipeline

to then run Terraform? And
Atlantis will fit in between there.

So usually what people will do is
they will deploy

Atlantis in GCP or Azure or, uh, AWS and
they will use role assumption, and

then the role assumption... some people
use a centralized management

account where they have an Atlantis, uh,
server or deployment that basically has

god mode to assume roles everywhere.

Then the responsibility of having
the roles set up properly, so

that you don't allow all the services,
for example, or all the Lambdas, and you

actually create your role policies to
constrain to the least access possible, is

the responsibility of the person that, you
know, is maintaining the AWS account.

So that's why I say it's not just
Atlantis. It's actually far more

important how the project is
being built within, like, the security

constraints, uh, and recommendations of the
cloud provider, and how the cloud provider

recommends you to then assume those
different role levels

so that you
can interact with the cloud provider.

Uh, Atlantis does have some checks in
the sense of, like, for example, it

doesn't allow you to bypass or to, uh,
traverse the directory where the PR is,

because at some point there was, um,
uh, the possibility to, uh, be able to

grab files outside of your repo. So we
have now a setting that defaults to only

allowing the PR repository to be looked at.

And if you want to inject files, there
are other methods to do so, but they

will only pertain to that checkout,
because what Atlantis does,

when it gets an event, a webhook event,
the first thing it does is check out the

code of that repo, of that PR, at that

hash, you know, and so it downloads the
repo, creates a repository structure

within Atlantis, and that's where all
your code is and where all the commands

live and where the plan file is
going to be. Everything is there.

So now you cannot, obviously, traverse
between directories. In very old versions,

you could potentially do that.

And then, one thing that I would say
that Atlantis could do better, and that we

don't really have a good way to do...
a lot of people solve this problem with,

uh, policy as code.

So we have policy checks, so
you can run, um, uh, Conftest

policies against your code.

So then you can check against provider
versions and what kind of providers

are allowed, because we have a mechanism
called pre-workflow hooks where

you can run any type of tooling or
script before Atlantis actually runs.

So then you can do all
those checks yourself.

But it will be nice to see, for example,
that we add, uh, some sort of way to

list the providers that are allowed to
run within Atlantis, and then Atlantis

will actually not run that project
if it has those, uh, kind of like

an extension of the policy checks,
but in an easy configurable setting.

We don't currently have that, and that
has been kind of like one of the things

that's like, it would be nice if we
could potentially do that, because one

of the biggest issues, I guess, that I
see as a consultant, and have seen

in comments on the Slack, is that, um,

they have the Atlantis server, they
have all this set up, with

all these roles and permissions.

But now anyone that creates a PR can add
a provider or can add, a new resource that

will execute something and they want to
not allow anyone to do so, but they don't

have those capabilities, maybe at the VCS
level because they don't have admin rights

to that or whatever, but they want to do
it in the Atlantis side, but they can't.

So that's when they get crafty and then
create a pre-workflow script and so on.

So I think that there is more work
to do there, but, like I said before,

since Atlantis just runs your binary
of Terraform, I guess the best

way for you to accommodate for any
security issues is to actually

maybe create your own image, put in all
the tooling that you want to wrap around

your different projects, and then
maybe, if you want, you create a custom,

uh, workflow that will run those tools
before your project actually runs

through Atlantis, and then you can kind
of like avoid some of those kinda like

middleman injection attacks, let's
say, if we want to call them that way,

to avoid those kind of like rogue user
issues, which are more common than not,

actually, in all the GitOps tooling.

So, so, yeah.
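
The "get crafty with a pre-workflow script" approach to restricting providers could look something like the Python sketch below. Everything here is an assumption made for illustration: the allowlist contents, the function names `provider_sources` and `disallowed`, and the idea of running the script from a pre-workflow hook with the checkout directory as its argument. The parsing is deliberately naive (a regex plus brace counting over `required_providers` blocks, which ignores braces inside strings or comments), so a real check would be better served by a proper HCL parser or a policy tool such as Conftest. Whether a non-zero exit actually blocks the plan depends on how the server is configured.

```python
import re
import sys
from pathlib import Path

# Hypothetical allowlist; source addresses are compared exactly as written.
ALLOWED = {"hashicorp/aws", "hashicorp/random"}

BLOCK_RE = re.compile(r"required_providers\s*\{")
SOURCE_RE = re.compile(r'\bsource\s*=\s*"([^"]+)"')


def provider_sources(hcl: str) -> set:
    """Return provider source addresses declared in required_providers blocks."""
    found = set()
    for match in BLOCK_RE.finditer(hcl):
        depth, i = 1, match.end()
        # Walk forward to the brace that closes this required_providers block.
        while i < len(hcl) and depth > 0:
            if hcl[i] == "{":
                depth += 1
            elif hcl[i] == "}":
                depth -= 1
            i += 1
        found.update(SOURCE_RE.findall(hcl[match.end():i]))
    return found


def disallowed(root: Path, allowed=ALLOWED) -> set:
    """Collect every declared provider under `root` that is not on the allowlist."""
    bad = set()
    for tf_file in root.rglob("*.tf"):
        bad |= provider_sources(tf_file.read_text()) - set(allowed)
    return bad


if __name__ == "__main__":
    bad = disallowed(Path(sys.argv[1] if len(sys.argv) > 1 else "."))
    if bad:
        print("providers not on the allowlist:", ", ".join(sorted(bad)))
        sys.exit(1)
```

Note that this only looks at providers that are declared explicitly; it says nothing about what resources or provisioners in the PR do, which is why the repository review process and least-privilege roles described earlier still carry most of the weight.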

Interesting.

That was a great answer.

You covered a lot more there

Yeah.

Sorry.

expecting you to, but
No, no, that's great.

Right, because what we want is
people to be, you know, we want

'em to be excited by the project.

We want 'em to get value from it, and
from that you need to make sure that

there's security postures in the right
place and that they're comfortable

with this kind of workload running.

Right?

Because the more automation we bring,
the more that attack surface expands.

So let's.

Let's assume people listening
are, this sounds great.

I, I'm happy with my security posture.

I wanna start running
this a couple of times.

Now.

You've mentioned the concept of an
Atlantis server, and I want to just

understand, is that a prerequisite
for people to get started or can

they just use a GitHub action to
run some sort of Atlantis binary?

Like what is that, that
getting started process?

We actually do have, in our
survey, we do have a lot of people

that want to see an Atlantis GitHub
Action, but we don't have that yet.

So Atlantis is a binary that you need to
run listening on a port, to listen

to, uh, webhook requests, so no matter
where you run it, uh, some people have run

it even in App Runner and things like that,

it has to be listening
as a daemon somewhere.

So most probably, most people use a
container to do that, in any type of

cloud-flavored service that
can run a container, which are many.

And we do offer a helm chart for
people that are running Kubernetes, so

that they can install the help chart.

So at the end of the day, you have to
have this, uh, binary running and

listening on port 4141, which is the default,
and then that will become your

Atlantis instance, basically. And you will
have to configure your Atlantis

instance, um, to be able to handle kind
of like the forecast of, you know, PRs and

plans that you might have, depending on
the size of your project. And size is a

word that is used very widely within, um,
GitOps, but the size of the Terraform

project is very important to be
able to size your instance or your server.
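
To make the "binary listening on a port for webhook requests" idea concrete, here is a minimal Python sketch of such a listener. It is not Atlantis's code (Atlantis is a Go binary). The secret value, the handler class, and the helper names are hypothetical. The signature check follows GitHub's documented `X-Hub-Signature-256` scheme, and the sketch only acknowledges a delivery rather than running a plan.

```python
import hashlib
import hmac
from http.server import BaseHTTPRequestHandler, HTTPServer

PORT = 4141              # the default port mentioned in the episode
SECRET = b"change-me"    # hypothetical shared webhook secret


def signature_is_valid(secret: bytes, body: bytes, header: str) -> bool:
    """Check a GitHub-style 'sha256=<hex>' HMAC signature over the raw body."""
    expected = "sha256=" + hmac.new(secret, body, hashlib.sha256).hexdigest()
    # Constant-time comparison to avoid leaking how many characters matched.
    return hmac.compare_digest(expected.encode(), (header or "").encode())


class WebhookHandler(BaseHTTPRequestHandler):
    """Hypothetical handler: verifies the delivery and acknowledges it."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        body = self.rfile.read(length)
        signature = self.headers.get("X-Hub-Signature-256", "")
        if not signature_is_valid(SECRET, body, signature):
            self.send_response(401)  # reject unsigned or forged deliveries
            self.end_headers()
            return
        # A real server would queue a plan for the pull request here.
        self.send_response(200)
        self.end_headers()


def serve() -> None:
    """Run as a long-lived process (a daemon), as described above."""
    HTTPServer(("0.0.0.0", PORT), WebhookHandler).serve_forever()
```

Calling `serve()` inside a container image is what turns this into something the VCS can send webhooks to. Everything beyond receiving the request (cloning, planning, commenting back) is left out here.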

So can you run Atlantis

using Atlantis?

Is that what I'm hearing?

Uh, no, no.

Well,

I'm trolling a little bit.

No, no, no.

Actually, that's a good question.

So, funny enough, a lot
of people do. A lot of people

have, for example, a Helm chart
integration, uh, in a pipeline

in Terraform that then updates itself.

Great.

Okay.

I, I had to ask 'cause I thought that
that would be really entertaining if it's

Yeah, yeah, yeah.

Actually, in many projects that I, I
set up, uh, Atlantis was, uh, basically

resource zero of the infrastructure,
and it will, it will run within this

account that I was mentioning before
that will assume roles in all the

different accounts, and Atlantis
will actually create the first

resources in every single account.

So in a way, Atlantis was the
bootstrapping of all the other accounts,

but at the same time, I needed to update
Atlantis after the accounts were created

to like update maps and things like that.

And basically you could potentially
run, in the Atlantis, uh, repo,

atlantis plan, and it will actually
update itself with the new values.

Yeah, as long as they are not destructive

things that you're doing within
the Terraform code, then it

should be fine.
Otherwise, it will destroy itself.

No.

So that could happen.

It's potentially possible.

I think I have one more question.

Uh, so again, under the assumption,
the interest of the audience is piqued.

They're like, this sounds really cool.

Uh, let's assume there's 10,000 people
that go, I wanna go do this tomorrow.

Is there ever a use case or
ever an instance where you

would tell any of those people?

No.

Atlantis is not the right project for
you now. Like, should all Terraform

be under management of Atlantis, or is
there a "wait until you hit these certain

constraints and then it becomes important"?

Yeah, that's a, that's
a very good question.

I am,

We are really planning on having this
tough of questions on a podcast for you.

yes.

Yes.

confident as like he's
getting hard questions.

That's a very good question.

I think that is not an
Atlantis responsibility.

I think that is a business decision, a
security-constrained decision

within the person's project,
company, and so on.

You make that decision before
you use Atlantis, and then you

decide if you want GitOps

or any type of automation
tooling connected to that project.

I mean, for example, we could
potentially say that a project that

is creating new, for example, let's
talk about AWS because I'm more

familiar with it, but imagine that I'm
creating the AWS accounts and I'm

setting up SCPs, uh, service control
policies, within the account and so on.

You might want to say, you know,
at some point, or maybe the security

team will say, you know what, we
want to manage that, but we don't

want that as a GitOps approach because
we run those so infrequently that we

would prefer to actually do it in a
pipeline that is very simplistic, and

we don't want Atlantis in between.

We want to provide Atlantis the
required access, and that's about it, which

is more like a brownfield
scenario, and that happens quite often,

where the security posture of the
company doesn't allow you, you know,

to run semi-automated tools
on a high-security or high

access level Terraform project.

So it depends mostly on that.

Not that you can't run anything in
Atlantis. You can run absolutely every

single piece of code in Atlantis, and
I have done the same, kind of like,

workflow of creating SCPs and creating
accounts and so on within Atlantis, but

that Atlantis basically had god access.

No.

So if you are okay with that,
then that's, that's fine.

Which is true of every other project out
there: Terraform Cloud, Terramate, anything

that can automate Terraform workflows.

Okay.

I've got one more now just because we're
talking about god access mode, right.

But say.

I mean, if they pop in
my head, I'm gonna ask them.

That's to my own detriment.

But if I run Atlantis, let's use AWS
because that's the one you just mentioned.

Right.

And I, I assign a role to that instance.

I'm assuming I can use, you know, you
said assume role a couple of times.

Right.

But workload identity, I can say
this workload can assume this

role because it is this workload.

So, I mean, again, we're.

You know, it's not deflection, but
when we talk about the security

of Atlantis, it does just come
down to good cloud practices, the

standard things you would have to do
regardless of running Atlantis or not.
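
The "this workload can assume this role" idea maps, on AWS, to a trust policy attached to the role in each target account. Below is a minimal Python sketch that builds such a policy document. The account ID, the role name, and the `trust_policy_for` helper are hypothetical. How the Atlantis instance obtains its own identity (instance profile, workload identity, and so on) is outside this sketch.

```python
import json

# Hypothetical ARN of the role attached to the Atlantis workload.
ATLANTIS_ROLE_ARN = "arn:aws:iam::111111111111:role/atlantis-server"


def trust_policy_for(principal_arn: str) -> dict:
    """Build an IAM trust policy that lets one principal assume a role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                # Only this principal may assume the role.
                "Principal": {"AWS": principal_arn},
                "Action": "sts:AssumeRole",
            }
        ],
    }


def render(policy: dict) -> str:
    """Serialize the policy as JSON for use in IAM or Terraform."""
    return json.dumps(policy, indent=2)
```

`render(trust_policy_for(ATLANTIS_ROLE_ARN))` gives the JSON you would attach to the role in a target account. The permissions that role grants are a separate policy, and that is where the access discussed here is actually decided.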

That's correct.

So that's why I said,
you know, we get a

lot of questions from companies that
send me Slack messages directly:

Hey, um, you know, we want to use
it, but it's not PCI compliant.

It is not HIPAA compliant.

But is that
an Atlantis responsibility?

Where does the line
need to be drawn, you know?

So, and, and a lot of people actually
get confused and say, oh, but

Atlantis will have all this access.

No, no, no.

You are giving access to Atlantis.

Atlantis just doesn't have any access.

It doesn't want you to set up AWS. We don't
have an AWS client within the library.

We just run Terraform and your Terraform.

That is in your binary.

So if you create your image,
you have to put Terraform there.

Otherwise, it doesn't run.

We don't have a client built in, in Go.

So, um, that's where people actually
get confused, because they think

that Atlantis is the one that has
access, but in reality it's the same as

me as the administrator, the DevOps
person or platform engineer that has

access to run Terraform. It's the same deal.

It's not different.

The difference, I guess, is that the
automated tooling, which in this

case will be Atlantis,
most probably will have more

access than you running locally.

Um, but.

Um, but that, but that will be it.

One, one thing that you could do in
Atlantis, which we recommend most of

the time, is to run multiple Atlantis,
uh, installations or containers,

servers in different accounts.

That's for sure.

We always recommend, like if someone
asks, if you really want to make

the blast radius of access, I guess, of
the Atlantis server smaller, just

run an Atlantis server in each account.

And then we have, uh, a
way in Atlantis to configure multiple

Atlantis servers using the same repo.

So if you have a monorepo to do all
those things, you can actually

use the Atlantis configuration to
be able to say atlantis-dev

plan, and then it would just
run plan on dev and so on.

And then you can apply and
follow approval processes and

approval rules within the repo.

That will be actually, uh,
the, the best approach.

And a lot of people do that
currently. Um, obviously, uh,

it depends on whether your VCS supports
multiple webhooks and things like that.

Some VCSs are different.

Um, but, um, you know, with GitHub and GitLab,
for example, you know, that's

totally normal, and people do that.
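
The "atlantis-dev plan" pattern described above can be sketched as a routing rule: every server receives the same pull request comment, and each one acts only when the first word matches the name it was configured with. This is a minimal Python illustration of that rule, not Atlantis's implementation. The server names and both helper functions are hypothetical.

```python
from typing import Optional, Tuple


def parse_comment(comment: str) -> Optional[Tuple[str, str]]:
    """Split a PR comment into (executable name, command), if it has both."""
    parts = comment.strip().split()
    if len(parts) < 2:
        return None  # e.g. "lgtm" is not a command
    return parts[0], parts[1]


def should_handle(server_name: str, comment: str) -> bool:
    """A server reacts only to comments addressed to its own name."""
    parsed = parse_comment(comment)
    return parsed is not None and parsed[0] == server_name


# Hypothetical servers, one per account, all receiving the same webhook.
SERVERS = ["atlantis-dev", "atlantis-prod"]


def responders(comment: str) -> list:
    """List the servers that would act on a given comment."""
    return [name for name in SERVERS if should_handle(name, comment)]
```

With this rule, a comment of `atlantis-dev plan` is picked up only by the dev server, so the prod server's credentials never come into play for that run.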

I mean, I'll, I'll only
challenge one thing there.

It's not the best approach.

The best approach is to have
no infrastructure and no code.

Correct.

Yeah. And the best security is to
disconnect the RJ45 from your rack.

That's just pull it

Yeah, just turn it all off.

And I think on that note, we
have had a wonderful episode.

So any last words for our
audience before we let you go today?

Well, I guess that if anyone wants
to use Atlantis and/or has any

questions about it, we are always
very active in the CNCF Slack, in

the Atlantis channel.

Um, so you're welcome to join,
ask questions, try it out.

I will try to, you know, guide
you within my time limits in

anything that you are trying to do,
and we'll do our best to help you

out, you know, with the community.

So, yeah.

Very nice.

Well, thank you so much for joining us.

We're really happy to have had you.

Yeah.

Thank you for the interview.

That's amazing.

It was a pretty good time, so thank you.

Thanks for joining us.

If you want to keep up with us,
consider subscribing to the podcast

on your favorite podcasting app, or
even go to cloudnativecompass.fm,

and if you want us to talk with someone
specific or cover a specific topic, reach

out to us on any social media platform

And until next time, when exploring
the cloud native landscape, on three,

on three.

1, 2, 3. Don't forget your compass.

Don't forget

your compass.

Creators and Guests

David Flanagan
Host
I teach people advanced Kubernetes & Cloud Native patterns and practices. I am the founder of the Rawkode Academy and KubeHuddle, and I co-organise Kubernetes London.
Laura Santamaria
Host
🌻💙💛Developer Advocate 🥑 I ❤️ DevOps. Recovering Earth/Atmo Sci educator, cloud aficionado. Curator #AMinuteOnTheMic; cohost The Hallway Track, ex-PulumiTV.
Jose (PePe) Amengual
Guest
Principal @Slalom Build | HashiCorp Core Contributor 2022 | Atlantis Core Maintainer