[tor-project] GitLab Runner updates
Jim Newsome
jnewsome at torproject.org
Tue Jun 21 15:33:52 UTC 2022
On 6/20/22 09:20, Antoine Beaupré wrote:
>> While
>> it's fairly straightforward to install a gitlab-runner and execute
>> locally, as far as I can tell a malicious GitLab installation could
>> still send a modified "script" (post-processed .gitlab-ci.yml) or repo
>> checkout down to the runner. Maybe there's some way to audit this, but I
>> couldn't find an obvious one. Maybe configuring the runner to log at
>> debug level would record enough?
>> https://docs.gitlab.com/runner/configuration/advanced-configuration.html#the-global-section
> That's not what I mean. I don't mean installing your own runner locally
> and hooking it up with GitLab. I mean installing the gitlab-runner
> package (only!) and *not* hooking it up in GitLab.
>
> Instead, you run the job completely locally, without involving GitLab at
> all. That's done with the `gitlab-runner exec` command:
>
> https://docs.gitlab.com/runner/commands/#gitlab-runner-exec
>
> We have docs about this here:
>
> https://gitlab.torproject.org/tpo/tpa/team/-/wikis/service/ci#running-a-job-locally
>
> This removes a large part of the attack surface because GitLab is taken
> out of the equation. It reduces the stack to:
>
> * your local computer and operating system
> * your git repository
> * git
> * gitlab-runner
> * the executor (e.g. Docker) and its image
>
> It's still pretty darn large, but it's better than before. :)
Ahhh right, I'd forgotten about `gitlab-runner`'s `exec` feature.
Unfortunately the current implementation of the feature is a bit hacky
and not super well-documented. IIUC they took it from a 3rd party pull
request, tried to rip it back out, but too many people screamed so it's
still there in a semi-zombie state. It looks like they're working on
designing a new implementation that they'll be happier with.
https://gitlab.com/gitlab-org/gitlab-runner/-/issues/2797
The current version only runs a single job, not a whole pipeline, so you
still need some wrapper logic for multi-job pipelines to run the jobs in
the right order, copy artifacts between them, initialize pipeline-level
variables, etc.
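To illustrate the ordering part of that wrapper logic, here is a minimal sketch. It assumes the .gitlab-ci.yml has already been parsed into a dict, and the job names, scripts, and variables are hypothetical. It only orders jobs by stage and merges pipeline-level variables into each job; it ignores `needs`, `rules`, and artifact copying.

```python
# Hypothetical parsed pipeline; in practice this would come from loading
# .gitlab-ci.yml with a YAML parser. Jobs are deliberately listed out of order.
PIPELINE = {
    "stages": ["build", "test", "package"],
    "variables": {"DEBIAN_FRONTEND": "noninteractive"},
    "make-deb": {"stage": "package", "script": ["./package.sh"]},
    "unit-tests": {"stage": "test", "script": ["./test.sh"],
                   "variables": {"VERBOSE": "1"}},
    "build-tor": {"stage": "build", "script": ["./build.sh"]},
}

# Top-level keys that are configuration rather than jobs (not exhaustive).
RESERVED = {"stages", "variables", "default", "include", "workflow", "image",
            "services", "before_script", "after_script", "cache"}


def plan(pipeline):
    """Return (job name, environment, script) tuples in stage order."""
    # Fallback stage list when none is declared (.pre/.post omitted here).
    stages = pipeline.get("stages", ["build", "test", "deploy"])
    global_vars = pipeline.get("variables", {})
    jobs = [
        (name, job) for name, job in pipeline.items()
        if name not in RESERVED
        and not name.startswith(".")      # hidden jobs are templates, not runnable
        and isinstance(job, dict)
    ]
    # sorted() is stable, so jobs within one stage keep their file order.
    jobs.sort(key=lambda item: stages.index(item[1].get("stage", "test")))
    return [
        (name, {**global_vars, **job.get("variables", {})}, job.get("script", []))
        for name, job in jobs
    ]


if __name__ == "__main__":
    for name, env, script in plan(PIPELINE):
        print(name, env, script)
```

A driver would then hand each tuple to whatever actually runs the job (e.g. `gitlab-runner exec` or Docker directly), one stage at a time.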
For the debian package build I got it partly working, but couldn't find
a way to run a single job out of a parameterized matrix (which they use
to build for multiple platforms and architectures). Given the other
headaches and lack of documentation I shelved this approach for the
moment
(https://gitlab.torproject.org/tpo/core/tor/-/issues/40615#note_2808336).
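For context, a wrapper can expand the matrix itself and pick one combination to run. The sketch below assumes the matrix follows the shape of GitLab's `parallel: matrix` keyword (a list of mappings from variable name to a value or list of values); the variable names and values are hypothetical, not taken from the actual package build.

```python
import itertools


def expand_matrix(matrix):
    """Expand a `parallel: matrix`-style list into one variables dict per job."""
    combos = []
    for entry in matrix:
        names = list(entry)
        # A scalar value is treated as a one-element list.
        value_lists = [v if isinstance(v, list) else [v] for v in entry.values()]
        # Cartesian product of all value lists within this entry.
        for values in itertools.product(*value_lists):
            combos.append(dict(zip(names, values)))
    return combos


def select(matrix, **wanted):
    """Return only the combinations whose variables match `wanted`."""
    return [
        combo for combo in expand_matrix(matrix)
        if all(combo.get(key) == value for key, value in wanted.items())
    ]


# Hypothetical matrix: two distributions on two architectures, plus one extra.
MATRIX = [
    {"DISTRO": ["bullseye", "bookworm"], "ARCH": ["amd64", "arm64"]},
    {"DISTRO": "sid", "ARCH": ["amd64"]},
]

if __name__ == "__main__":
    for combo in select(MATRIX, DISTRO="bookworm", ARCH="arm64"):
        print(combo)
```

The selected dict can then be passed to the job as environment variables, which sidesteps having the runner pick a matrix entry.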
I agree that this feature is potentially very useful. The "v2" proposal
of the feature would run a whole pipeline, but communicate with GitLab
to help do so, which may defeat the purpose again from our perspective
(at least without some careful auditing of the communication between
GitLab and the runner).
https://gitlab.com/gitlab-org/gitlab-runner/-/issues/2797#proposal
>> For that issue I ended up hacking together a small python script that
>> processes the .gitlab-ci.yml into something to feed directly through
>> Docker. It's currently a bit hacky and specialized for the Debian tor
>> package build. I think it could be generalized further to be reusable if
>> that's of interest (maybe using Docker Compose to orchestrate jobs
>> within a pipeline), but am still thinking about whether there's a better
>> way...
>> https://gitlab.torproject.org/jnewsome/reproduce-tor-debian-build/-/blob/main/reproduce_pipeline.py
> Note that @eighthave has done a similar thing for F-Droid, you might
> want to collaborate.
Thanks, good to know!
> I think the improvement of that over the above is that you remove the
> "gitlab-runner" part of the attack surface. It's a pretty large attack
> surface because the runners are a surprisingly large amount of code, but
> I wonder if it's worth the trouble...
>
> What's the threat model here specifically? Backdoored gitlab-runner code?
Right - I agree there's not much security benefit over the
`gitlab-runner exec` approach. I just found I ultimately wasn't getting
that much benefit out of it since I was already having to write all the
pipeline-orchestration, and got tired of wrestling with the lack of
documentation etc :).
>> Right now my top candidate we haven't tried yet is to install a full
>> local GitLab in addition to a local gitlab-runner; maybe using their
>> published Docker images: https://docs.gitlab.com/ee/install/docker.html
>> This seems like the least engineering effort (~none) but a bit more work
>> for every individual wanting to do such a local build.
> Other organisations run *two* GitLab instances for that purpose, by the
> way. GitLab.com included, from what I understand.
Interesting.
>> Keeping as much logic out of the .gitlab-ci.yml as possible so that the
>> gitlab yml is trivial to manually reproduce outside of gitlab (e.g. run
>> `./build.sh`) is probably ideal, though gives up some gitlab
>> functionality.
> What functionality are you thinking of here?
For example the debian package build in particular makes heavy use of
yml templating. The same thing could be achieved other ways - e.g.
moving the yml snippets out to shell files/functions that can be invoked
by the other "job scripts", but it adds more indirection and
fragmentation vs having everything in one place in the yml file.
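To make the templating cost concrete: anything that reproduces the pipeline outside GitLab has to resolve the templates itself. A minimal sketch, assuming the templating is done with the `extends` keyword (the actual build may rely on YAML anchors or `include` instead) and using hypothetical job names. It merges mappings key by key and replaces lists wholesale, and does no cycle detection.

```python
import copy


def deep_merge(base, override):
    """Merge `override` into a copy of `base`; dicts merge, other values replace."""
    result = copy.deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(result.get(key), dict):
            result[key] = deep_merge(result[key], value)
        else:
            # Lists (e.g. `script`) are replaced, not concatenated.
            result[key] = copy.deepcopy(value)
    return result


def resolve_extends(jobs, name):
    """Return job `name` with all of its `extends` parents merged in."""
    job = jobs[name]
    parents = job.get("extends", [])
    if isinstance(parents, str):
        parents = [parents]
    merged = {}
    # Later parents override earlier ones; the job itself overrides all parents.
    for parent in parents:
        merged = deep_merge(merged, resolve_extends(jobs, parent))
    own_keys = {k: v for k, v in job.items() if k != "extends"}
    return deep_merge(merged, own_keys)


# Hypothetical jobs: a hidden template and one job that extends it.
JOBS = {
    ".build-template": {
        "image": "debian:stable",
        "variables": {"ARCH": "amd64", "JOBS": "4"},
        "script": ["./build.sh"],
    },
    "build-arm64": {
        "extends": ".build-template",
        "variables": {"ARCH": "arm64"},
    },
}

if __name__ == "__main__":
    print(resolve_extends(JOBS, "build-arm64"))
```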
For multi-job pipelines, you also still end up duplicating the outer
orchestration of the jobs in both the yml and some other driver script.
You can mitigate this by using fewer jobs (maybe just 1), but that's
again giving up some GitLab functionality.
> Thanks for the input! :)
:)