[tor-project] gitolite to gitlab migration completed (TPA-RFC-36)

Al Smith smith at torproject.org
Wed May 1 15:20:45 UTC 2024


Congratulations on reaching this milestone!


On 5/1/24 8:16 AM, Antoine Beaupré wrote:
> Hi again everyone!
>
> This is the last update of the Gitolite migration. It's a little more
> detailed than previous updates, so I made it in a blog post:
>
> https://blog.torproject.org/gitolite-gitlab-migration/
>
> ... of which I attach a copy here, in case you're the kind of person
> who prefers reading email to the web. :)
>
> Enjoy!
>
> ----
>
> Tor has finally completed a long migration from legacy Git
> infrastructure ([Gitolite and GitWeb][]) to our self-hosted
> [GitLab][] server.
>
>   [GitLab]: https://gitlab.torproject.org/tpo/tpa/team/-/wikis/howto/gitlab
>   [Gitolite and GitWeb]: https://gitlab.torproject.org/tpo/tpa/team/-/wikis/howto/git
>
> Git repository addresses have therefore changed. Many of you probably
> have made the switch already, but if not, you will need to change:
>
>      https://git.torproject.org/
>
> to:
>
>      https://gitlab.torproject.org/
>
> in your Git configuration.
>
> The [GitWeb front page][] is now an archived listing of all the
> repositories before the migration. Inactive Git repositories were
> archived in the GitLab [legacy/gitolite namespace][], and the
> `gitweb.torproject.org` and `git.torproject.org` web sites now
> redirect to GitLab.
>
>   [legacy/gitolite namespace]: https://gitlab.torproject.org/legacy/gitolite/
>   [GitWeb front page]: https://gitweb.torproject.org/
>
> We made a best effort to reproduce the original Gitolite repositories
> faithfully and also to avoid duplicating too much data in the
> migration. But it's *possible* that some data present in Gitolite was
> not migrated to GitLab.
>
> User repositories are particularly at risk, because they were
> migrated in bulk and "re-forked" from their upstreams to avoid
> wasting disk space. If a user had a project with a matching name,
> it was *assumed* to have the right data, which might be inaccurate.
>
> The two virtual machines responsible for the legacy service (`cupani`
> for `git-rw.torproject.org` and `vineale` for `git.torproject.org` and
> `gitweb.torproject.org`) have been shut down. Their disks will remain
> for 3 months (until the end of July 2024) and their backups for
> another year after that (until the end of July 2025), after which
> point all the data from those hosts will be destroyed, with only the
> GitLab archives remaining.
>
> The rest of this article expands on how this was done and what kind of
> problems we faced during the migration.
>
> # Where is the code?
>
> Normally, nothing should be lost. All repositories in gitolite have
> been either explicitly migrated by their owners, forcibly migrated by
> the sysadmin team ([TPA][]), or explicitly destroyed at their owner's
> request.
>
>   [TPA]: https://gitlab.torproject.org/tpo/tpa/team/
>
> An exhaustive [rewrite map][] translates gitolite projects to GitLab
> projects. Some of those projects actually redirect to their *parent*
> in cases of empty repositories that were obvious forks. Destroyed
> repositories redirect to the GitLab front page.
>
>   [rewrite map]: https://archive.torproject.org/websites/gitolite2gitlab.txt
>
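
A minimal sketch of how the rewrite map could be consulted from a script, assuming the file follows Apache's plain-text `RewriteMap` format (one whitespace-separated key/value pair per line, with `#` starting a comment line). The `parse_rewrite_map` name and the sample entries are illustrative assumptions, not copied from the published map.

```python
def parse_rewrite_map(text):
    """Parse Apache txt-style RewriteMap content into a dict."""
    mapping = {}
    for line in text.splitlines():
        line = line.strip()
        # skip blank lines and comment lines
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        if len(fields) < 2:
            continue  # malformed entry, ignore it
        key, value = fields[0], fields[1]
        # keep the first entry when a key is repeated (a choice made
        # for this sketch, not a documented property of the real map)
        mapping.setdefault(key, value)
    return mapping


# Hypothetical sample entries, for illustration only.
SAMPLE = """\
# gitolite project -> GitLab project
tor tpo/core/tor
user/dgoulet/tor legacy/gitolite/user/dgoulet/tor
"""

if __name__ == "__main__":
    rewrite_map = parse_rewrite_map(SAMPLE)
    print(rewrite_map.get("tor", "NOT_FOUND"))
    print(rewrite_map.get("no/such/project", "NOT_FOUND"))
```

A missing key falls back to `NOT_FOUND` here only to mirror the default value used in the Apache rules quoted later in this message.
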
> Because the migration happened progressively, it's technically
> possible that commits pushed to Gitolite after a repository was
> migrated were lost. We took great care to avoid that scenario. First, we
> adopted a proposal ([TPA-RFC-36][]) in June 2023 to announce the
> transition. Then, in [March 2024][], we locked down all repositories
> from any further changes. Around that time, only a [handful of
> repositories][] had changes made after the adoption date, and we
> examined each repository carefully to make sure nothing was lost.
>
>   [handful of repositories]: https://gitlab.torproject.org/tpo/tpa/team/-/issues/41214#note_2983302 "handful of repositories"
>   [March 2024]: https://gitlab.torproject.org/tpo/tpa/team/-/issues/41213
>   [TPA-RFC-36]: https://gitlab.torproject.org/tpo/tpa/team/-/wikis/policy/tpa-rfc-36-gitolite-gitweb-retirement
>
> Still, we built a [diff of all the changes in the git references][]
> that archivists can peruse to check for data loss. It's large (6MiB+)
> because a lot of repositories were migrated before the mass migration
> and then kept evolving in GitLab. Many other repositories were rebuilt
> in GitLab from their parent to restore a fork relationship, which
> added extra references to those clones.
>
>   [diff of all the changes in the git references]: https://gitlab.torproject.org/tpo/tpa/team/-/issues/41215#note_3023924
>
> A note to amateur archivists out there: it's probably too late for one
> last crawl now. The Git repositories now all redirect to GitLab and
> are effectively unavailable in their original form.
>
> That said, the GitWeb site was crawled into the [Internet Archive][] [in
> February 2024][], so at least some copy of it is available in the
> [Wayback Machine][]. At that point, however, many developers had already
> migrated their projects to GitLab, so the copies there were already
> possibly out of date compared with the repositories in GitLab.
>
>   [Wayback Machine]: https://web.archive.org/web/20240204162238/https://gitweb.torproject.org/
>   [in February 2024]: https://gitlab.torproject.org/tpo/tpa/team/-/issues/41218#note_2992296
>   [Internet Archive]: https://archive.org/
>
> [Software Heritage][] also has a copy of all repositories hosted on
> Gitolite [since June 2023][] and has kept mirroring the
> repositories, where they will hopefully be kept for eternity. There's
> an [issue][] where the main website can't find the repositories when
> you search for `gitweb.torproject.org`; instead, [search for
> `git.torproject.org`][].
>
>   [search for `git.torproject.org`]: https://archive.softwareheritage.org/browse/search/?q=git.torproject.org&visit_type=git&with_content=true&with_visit=true
>   [issue]: https://gitlab.softwareheritage.org/swh/devel/swh-web/-/issues/4787
>   [since June 2023]: https://gitlab.softwareheritage.org/swh/infra/sysadm-environment/-/issues/4939
>   [Software Heritage]: https://www.softwareheritage.org/
>
> In any case, if you believe data is missing, please do let us know by
> [opening an issue with TPA][].
>
>   [opening an issue with TPA]: https://gitlab.torproject.org/tpo/tpa/team/-/issues/new
>
> # Why?
>
> This project has been long in the making. The first [discussion about
> migrating from gitolite to GitLab][] started in 2020 (almost 4 years
> ago). But [going further back][], the first GitLab experiment was in
> 2016, almost a decade ago.
>
>   [going further back]: https://gitlab.torproject.org/tpo/tpa/team/-/wikis/howto/trac#history
>
>   [discussion about migrating from gitolite to GitLab]: https://gitlab.torproject.org/tpo/tpa/team/-/issues/40472
>
> The current GitLab server dates from 2019, [replacing Trac for issue
> tracking in 2020][]. It was originally supposed to host only mirrors
> for merge requests and issue trackers but, naturally, one thing led to
> another and eventually, GitLab had grown a container registry,
> continuous integration (CI) runners, GitLab Pages, and, of course,
> hosted most Git repositories.
>
>   [replacing Trac for issue tracking in 2020]: https://blog.torproject.org/from-trac-into-gitlab-for-tor/
>
> There were hesitations about moving to GitLab for code hosting. We had
> [discussions about the increased attack surface][] and [ways to
> mitigate that][], but, ultimately, it seems the issues were not that
> serious and the community embraced GitLab.
>
>   [ways to mitigate that]: https://gitlab.torproject.org/tpo/tpa/gitlab/-/issues/98
>   [discussions about the increased attack surface]: https://gitlab.torproject.org/tpo/tpa/gitlab/-/issues/81
>
> TPA actually migrated its most critical repositories out of shared
> hosting entirely, into specific servers (e.g. the Puppet Git
> repository is just on the Puppet server now), leveraging Git's
> decentralized nature and removing an entire attack surface from our
> infrastructure. Some of those repositories are *mirrored* back into
> GitLab, but the authoritative copy is not on GitLab.
>
> In any case, the proposal to migrate from Gitolite to GitLab was
> effectively just formalizing a *fait accompli*.
>
> # How to migrate from Gitolite / cgit to GitLab
>
> The progressive migration was a challenge. If you intend to migrate
> between hosting platforms, we strongly recommend picking a "flag day"
> during which you migrate *all* repositories *at once*. This ensures a
> smoother transition and avoids elaborate rewrite rules.
>
> When Gitolite access was shut down, we had repositories on both GitLab
> and Gitolite, without a clear relationship between the two. A priori,
> the plan then was to import all the remaining Gitolite repositories
> into the `legacy/gitolite` namespace, but that seemed wasteful,
> particularly for large repositories like [Tor Browser][], which uses
> nearly a gigabyte of disk space. So we took special care to avoid
> duplicating repositories.
>
>   [Tor Browser]: https://gitlab.torproject.org/tpo/applications/tor-browser
>
> When the [mass migration][] started, only 71 of the 538 Gitolite
> repositories were marked `Migrated to GitLab` in the `gitolite.conf`
> file. So, given that we had *hundreds* of repositories to migrate, we
> developed some automation to "[save time][]". We already automate
> similar ad-hoc tasks with [Fabric][], so we used that framework here
> as well. (Our normal configuration management tool is [Puppet][],
> which is a poor fit here.)
>
>   [Puppet]: https://gitlab.torproject.org/tpo/tpa/team/-/wikis/howto/puppet
>   [Fabric]: https://gitlab.torproject.org/tpo/tpa/team/-/wikis/howto/fabric/
>   [save time]: https://xkcd.com/1205/
>   [mass migration]: https://gitlab.torproject.org/tpo/tpa/team/-/issues/41215
>
> So a relatively [large amount of Python code][] was produced to
> basically do the following:
>
>   [large amount of Python code]: https://gitlab.torproject.org/tpo/tpa/fabric-tasks/-/blob/85121b4a8a293cebb0d9dfd68ebf26e2cc95ed76/fabric_tpa/gitolite.py
>
>   1. check if all on-disk repositories are listed in `gitolite.conf`
>      (and vice versa) and either add missing repositories or delete
>      them from disk if garbage
>   2. for each repository in `gitolite.conf`, skip it if its category
>      is marked `Migrated to GitLab`; otherwise:
>   3. find a matching GitLab project by name, prompting the user when
>      there are multiple matches
>   4. if a match is found, redirect if the repository is non-empty
>      * we have GitLab projects that *look* like the real thing, but are
>        only present to host migrated Trac issues
>      * in such cases we cloned the Gitolite project locally and pushed
>        to the existing repository instead
>   5. otherwise, a new repository is created in the `legacy/gitolite`
>      namespace, using the "import" mechanism in GitLab to automatically
>      import the repository from Gitolite, creating redirections and
>      updating `gitolite.conf` to document the change
>
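
The per-repository decision in steps 2 to 5 can be sketched as a small pure function. This is only one reading of the list above: it assumes that "the repository" in step 4 means the matching GitLab project's repository, and the `Action` and `plan_migration` names are hypothetical and do not come from the fabric-tasks code.

```python
from enum import Enum


class Action(Enum):
    SKIP = "skip"
    REDIRECT = "redirect to the existing GitLab project"
    PUSH_TO_EXISTING = "push Gitolite content to the existing project"
    IMPORT_TO_LEGACY = "import into legacy/gitolite"


def plan_migration(category, match, match_is_empty):
    """Pick an action for one repository listed in gitolite.conf.

    category: the repository's category in gitolite.conf
    match: path of the selected GitLab project, or None if none was found
    match_is_empty: whether that GitLab project's repository is empty
    """
    if category == "Migrated to GitLab":
        return Action.SKIP  # step 2
    if match is None:
        return Action.IMPORT_TO_LEGACY  # step 5
    if match_is_empty:
        # step 4, sub-bullets: e.g. a project that only hosts
        # migrated Trac issues
        return Action.PUSH_TO_EXISTING
    return Action.REDIRECT  # step 4


if __name__ == "__main__":
    print(plan_migration("Attic", "tpo/core/tor", False))
```

Keeping the decision separate from the side effects (API calls, pushes, config rewrites) makes it easy to dry-run over the whole `gitolite.conf` before touching anything.
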
> User repositories (those under the `user/` directory in Gitolite) were
> handled specially. First, the existing redirection map was checked to
> see if a similarly named project was migrated (so that,
> e.g. `user/dgoulet/tor` is properly treated as a fork of
> `tpo/core/tor`). Then the parent project was forked in GitLab and the
> Gitolite project force-pushed to the fork. This allows us to show the
> fork relationship in GitLab and, more importantly, benefit from the
> "pool" feature in GitLab which deduplicates disk usage between forks.
>
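
A rough sketch of the parent lookup described above, assuming that "similarly named" means an exact match on the last path component and that only non-user repositories are considered as parents. Both are assumptions made for illustration; the `find_parent_candidates` name and the sample map are hypothetical.

```python
def find_parent_candidates(user_repo, rewrite_map):
    """Return GitLab projects already mapped from a same-named repository.

    user_repo: a Gitolite path such as "user/dgoulet/tor"
    rewrite_map: dict of Gitolite path -> GitLab path
    """
    name = user_repo.rsplit("/", 1)[-1]
    candidates = []
    for gitolite_path, gitlab_path in rewrite_map.items():
        if gitolite_path.startswith("user/"):
            continue  # assumption: other user repositories are not parents
        same_name = gitolite_path.rsplit("/", 1)[-1] == name
        if same_name and gitlab_path not in candidates:
            candidates.append(gitlab_path)
    return candidates


if __name__ == "__main__":
    # Hypothetical entries, for illustration only.
    sample_map = {
        "tor": "tpo/core/tor",
        "torspec": "tpo/core/torspec",
        "user/asn/tor": "legacy/gitolite/user/asn/tor",
    }
    print(find_parent_candidates("user/dgoulet/tor", sample_map))
```

An empty result corresponds to the case where no relationship is found and the repository is imported on its own.
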
> Sometimes, we found no such relationships. Then we simply imported
> multiple repositories with similar names in the `legacy/gitolite`
> namespace, sometimes creating forks between user repositories, on a
> first-come-first-served basis from the `gitolite.conf` order.
>
> The code used in this migration is now available publicly. We
> encourage other groups planning to migrate from Gitolite/GitWeb to
> GitLab to use (and contribute to) our [fabric-tasks][] repository,
> even though it does have its fair share of hard-coded assertions.
>
>   [fabric-tasks]: https://gitlab.torproject.org/tpo/tpa/fabric-tasks/
>
> The main entry point is the `gitolite.mass-repos-migration` task. A
> typical migration job looked like:
>
> ```
> anarcat@angela:fabric-tasks$ fab -H cupani.torproject.org gitolite.mass-repos-migration
> [...]
> INFO: skipping project project/help/infra in category Migrated to GitLab
> INFO: skipping project project/help/wiki in category Migrated to GitLab
> INFO: skipping project project/jenkins/jobs in category Migrated to GitLab
> INFO: skipping project project/jenkins/tools in category Migrated to GitLab
> INFO: searching for projects matching fastlane
> INFO: Successfully connected to https://gitlab.torproject.org
> import gitolite project project/tor-browser/fastlane into gitlab legacy/gitolite/project/tor-browser/fastlane with desc 'Tor Browser app store and deployment configuration for Fastlane'? [Y/n]
> INFO: importing gitolite project project/tor-browser/fastlane into gitlab legacy/gitolite/project/tor-browser/fastlane with desc 'Tor Browser app store and deployment configuration for Fastlane'
> INFO: building a new connect to cupani
> INFO: defaulting name to fastlane
> INFO: importing project into GitLab
> INFO: Successfully connected to https://gitlab.torproject.org
> INFO: loading group legacy/gitolite/project/tor-browser
> INFO: archiving project
> INFO: creating repository fastlane (fastlane) in namespace legacy/gitolite/project/tor-browser from https://git.torproject.org/project/tor-browser/fastlane into https://gitlab.torproject.org/legacy/gitolite/project/tor-browser/fastlane
> INFO: migrating Gitolite repository project/tor-browser/fastlane to GitLab project legacy/gitolite/project/tor-browser/fastlane
> INFO: uploading 399 bytes to /srv/git.torproject.org/repositories/project/tor-browser/fastlane.git/hooks/pre-receive
> INFO: making /srv/git.torproject.org/repositories/project/tor-browser/fastlane.git/hooks/pre-receive executable
> INFO: adding entry to rewrite_map /home/anarcat/src/tor/tor-puppet/modules/profile/files/git/gitolite2gitlab.txt
> INFO: modifying gitolite.conf to add: "config gitweb.category = Migrated to GitLab"
> INFO: rewriting gitolite config /home/anarcat/src/tor/gitolite-admin/conf/gitolite.conf to change project project/tor-browser/fastlane to category Migrated to GitLab
> INFO: skipping project project/bridges/bridgedb-admin in category Migrated to GitLab
> [...]
> ```
>
> In the above, you can see migrated repositories being skipped, then the
> [fastlane project][] being archived into GitLab. Here is another example
> with a later version of the script, processing only user repositories and
> showing the interactive prompt and a force-push into a fork:
>
>   [fastlane project]: https://gitlab.torproject.org/legacy/gitolite/project/tor-browser/fastlane
>
> ```
> $ fab -H cupani.torproject.org  gitolite.mass-repos-migration --include 'user/.*' --exclude '.*tor-?browser.*'
> INFO: skipping project user/aagbsn/bridgedb in category Migrated to GitLab
> [...]
> INFO: skipping project user/phw/atlas in category Migrated to GitLab
> INFO: processing project user/phw/obfsproxy (Philipp's obfsproxy repository) in category Users' development repositories (Attic)
> INFO: Successfully connected to https://gitlab.torproject.org
> INFO: user repository detected, trying to find fork phw/obfsproxy
> WARNING: no existing fork found, entering user fork subroutine
> INFO: found 6 GitLab projects matching 'obfsproxy' (https://gitweb.torproject.org/user/phw/obfsproxy.git)
> 0 legacy/gitolite/debian/obfsproxy
> 1 legacy/gitolite/debian/obfsproxy-legacy
> 2 legacy/gitolite/user/asn/obfsproxy
> 3 legacy/gitolite/user/ioerror/obfsproxy
> 4 tpo/anti-censorship/pluggable-transports/obfsproxy
> 5 tpo/anti-censorship/pluggable-transports/obfsproxy-legacy
> select parent to fork from, or enter to abort: ^G4
> INFO: repository is not empty: in-pack: 2104, packs: 1, size-pack: 414
> fork project tpo/anti-censorship/pluggable-transports/obfsproxy into legacy/gitolite/user/phw/obfsproxy^G [Y/n]
> INFO: loading project tpo/anti-censorship/pluggable-transports/obfsproxy
> INFO: forking project user/phw/obfsproxy into namespace legacy/gitolite/user/phw
> INFO: waiting for fork to complete...
> INFO: fork status: started, sleeping...
> INFO: fork finished
> INFO: cloning and force pushing from user/phw/obfsproxy to legacy/gitolite/user/phw/obfsproxy
> INFO: deleting branch protection: <class 'gitlab.v4.objects.branches.ProjectProtectedBranch'> => {'id': 2723, 'name': 'master', 'push_access_levels': [{'id': 2864, 'access_level': 40, 'access_level_description': 'Maintainers', 'deploy_key_id': None}], 'merge_access_levels': [{'id': 2753, 'access_level': 40, 'access_level_description': 'Maintainers'}], 'allow_force_push': False}
> INFO: cloning repository git-rw.torproject.org:/srv/git.torproject.org/repositories/user/phw/obfsproxy.git in /tmp/tmp6orvjggy/user/phw/obfsproxy
> Cloning into bare repository '/tmp/tmp6orvjggy/user/phw/obfsproxy'...
> INFO: pushing to GitLab: https://gitlab.torproject.org/legacy/gitolite/user/phw/obfsproxy
> remote:
> remote: To create a merge request for bug_10887, visit:
> remote:   https://gitlab.torproject.org/legacy/gitolite/user/phw/obfsproxy/-/merge_requests/new?merge_request%5Bsource_branch%5D=bug_10887
> remote:
> [...]
> To ssh://gitlab.torproject.org/legacy/gitolite/user/phw/obfsproxy
>   + 2bf9d09...a8e54d5 master -> master (forced update)
>   * [new branch]      bug_10887 -> bug_10887
> [...]
> INFO: migrating repo
> INFO: migrating Gitolite repository https://gitweb.torproject.org/user/phw/obfsproxy.git to GitLab project https://gitlab.torproject.org/legacy/gitolite/user/phw/obfsproxy
> INFO: adding entry to rewrite_map /home/anarcat/src/tor/tor-puppet/modules/profile/files/git/gitolite2gitlab.txt
> INFO: modifying gitolite.conf to add: "config gitweb.category = Migrated to GitLab"
> INFO: rewriting gitolite config /home/anarcat/src/tor/gitolite-admin/conf/gitolite.conf to change project user/phw/obfsproxy to category Migrated to GitLab
> INFO: processing project user/phw/scramblesuit (Philipp's ScrambleSuit repository) in category Users' development repositories (Attic)
> INFO: user repository detected, trying to find fork phw/scramblesuit
> WARNING: no existing fork found, entering user fork subroutine
> WARNING: no matching gitlab project found for user/phw/scramblesuit
> INFO: user fork subroutine failed, resuming normal procedure
> INFO: searching for projects matching scramblesuit
> import gitolite project user/phw/scramblesuit into gitlab legacy/gitolite/user/phw/scramblesuit with desc 'Philipp's ScrambleSuit repository'?^G [Y/n]
> INFO: checking if remote repo https://git.torproject.org/user/phw/scramblesuit exists
> INFO: importing gitolite project user/phw/scramblesuit into gitlab legacy/gitolite/user/phw/scramblesuit with desc 'Philipp's ScrambleSuit repository'
> INFO: importing project into GitLab
> INFO: Successfully connected to https://gitlab.torproject.org
> INFO: loading group legacy/gitolite/user/phw
> INFO: creating repository scramblesuit (scramblesuit) in namespace legacy/gitolite/user/phw from https://git.torproject.org/user/phw/scramblesuit into https://gitlab.torproject.org/legacy/gitolite/user/phw/scramblesuit
> INFO: archiving project
> INFO: migrating Gitolite repository https://gitweb.torproject.org/user/phw/scramblesuit.git to GitLab project https://gitlab.torproject.org/legacy/gitolite/user/phw/scramblesuit
> INFO: adding entry to rewrite_map /home/anarcat/src/tor/tor-puppet/modules/profile/files/git/gitolite2gitlab.txt
> INFO: modifying gitolite.conf to add: "config gitweb.category = Migrated to GitLab"
> INFO: rewriting gitolite config /home/anarcat/src/tor/gitolite-admin/conf/gitolite.conf to change project user/phw/scramblesuit to category Migrated to GitLab
> [...]
> ```
>
> Sharp eyes will also notice the [bell used as a notification mechanism][]
> in this transcript.
>
>   [bell used as a notification mechanism]: https://anarc.at/blog/2022-11-08-modern-bell-urgency/
>
> A lot of the code is now useless for us, but some, like "commit and
> push" or [`is-repo-empty`][] live on in the [git module][] and, of
> course, the [gitlab module][] has grown some legs along the
> way. We've also found fun bugs, like a [file descriptor exhaustion in
> bash][], among other oddities. The [retirement milestone][] and
> [issue 41215][] have a detailed log of the migration, for those
> curious.
>
>   [issue 41215]: https://gitlab.torproject.org/tpo/tpa/team/-/issues/41215
>   [retirement milestone]: https://gitlab.torproject.org/groups/tpo/tpa/-/milestones/11#tab-issues
>   [file descriptor exhaustion in bash]: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=642504
>   [gitlab module]: https://gitlab.torproject.org/tpo/tpa/fabric-tasks/-/blob/85121b4a8a293cebb0d9dfd68ebf26e2cc95ed76/fabric_tpa/gitlab.py
>   [git module]: https://gitlab.torproject.org/tpo/tpa/fabric-tasks/-/blob/85121b4a8a293cebb0d9dfd68ebf26e2cc95ed76/fabric_tpa/git.py
>   [`is-repo-empty`]: https://gitlab.torproject.org/tpo/tpa/fabric-tasks/-/blob/85121b4a8a293cebb0d9dfd68ebf26e2cc95ed76/fabric_tpa/git.py#L120-153
>
> This was a challenging project, but it feels nice to have this behind
> us. This gets rid of 2 of the 4 remaining machines running Debian
> "old-old-stable", which moves us a bit further ahead in our overdue
> [bullseye upgrades milestone][].
>
>   [bullseye upgrades milestone]: https://gitlab.torproject.org/groups/tpo/tpa/-/milestones/5#tab-issues
>
> Full transparency: we tested GPT-3.5, GPT-4, and other large language
> models to see if they could answer the question "write a set of
> rewrite rules to redirect GitWeb to GitLab". This has become a
> standard LLM test for your faithful writer to figure out how good an
> LLM is at technical responses. None of them gave an accurate,
> complete, and functional response, for the record.
>
> The actual rewrite rules as of this writing follow, for humans who
> actually like working answers provided by expert humans instead of
> artificial intelligences, which currently seem to be glorified,
> mansplaining interns.
>
> ## git.torproject.org rewrite rules
>
> Those rules are relatively simple in that they rewrite a single URL to
> its equivalent GitLab counterpart in a 1:1 fashion. They rely on the
> [rewrite map][] mentioned above, of course.
>
>   [rewrite map]: https://archive.torproject.org/websites/gitolite2gitlab.txt
>
> ```
> RewriteEngine on
> # this RewriteMap connects the gitweb projects to their GitLab
> # equivalent
> RewriteMap gitolite2gitlab "txt:/etc/apache2/gitolite2gitlab.txt"
> # if this becomes a performance bottleneck, convert to a DBM map with:
> #
> #  $ httxt2dbm -i mapfile.txt -o mapfile.map
> #
> # and:
> #
> # RewriteMap mapname "dbm:/etc/apache/mapfile.map"
> #
> # according to reports lavamind found online, we hit such a
> # performance bottleneck only around millions of entries, which is not our case
>
> # those two rules can go away once all the projects are
> # migrated to GitLab
> #
> # this matches the request URI so we can check the RewriteMap
> # for a match next
> #
> # WARNING: this won't match URLs without .git in them, which
> # *do* work now. one possibility would be to match the request
> # URI (without query string!) with:
> #
> # /git/(.*)(.git)?/(((branches|hooks|info|objects/).*)|git-.*|upload-pack|receive-pack|HEAD|config|description)?.
> #
> # I haven't been able to figure out the actual structure of
> # those URLs, so it's really hard to figure out the boundaries
> # of the project name here. I stopped after poring over the
> # http-backend.c code in git itself.
> # https://www.git-scm.com/docs/http-protocol is also
> # kind of incomplete and unsatisfying.
> RewriteCond %{REQUEST_URI} ^/(git/)?(.*).git/.*$
> # this makes the RewriteRule match only if there's a match in
> # the rewrite map
> RewriteCond ${gitolite2gitlab:%2|NOT_FOUND} !NOT_FOUND
> RewriteRule ^/(git/)?(.*).git/(.*)$ https://gitlab.torproject.org/${gitolite2gitlab:$2}.git/$3 [R=302,L]
>
> # Fallback everything else to GitLab
> RewriteRule (.*) https://gitlab.torproject.org [R=302,L]
> ```
>
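
To make the effect of these rules easier to follow, here is a small Python emulation of the `RewriteCond`/`RewriteRule` pair and the fallback. It is a sketch rather than a replacement for the Apache configuration: it reuses the same regular expression, ignores query strings, and takes the rewrite map as a plain dict. The `redirect_git` name and the sample entry are hypothetical.

```python
import re

# Same pattern as the RewriteRule above (the dot before "git" is
# unescaped there too, so it matches any character).
GIT_URL = re.compile(r"^/(git/)?(.*).git/(.*)$")
GITLAB = "https://gitlab.torproject.org"


def redirect_git(path, rewrite_map):
    """Return the redirect target for a git.torproject.org request path."""
    m = GIT_URL.match(path)
    # only redirect into a project if it is present in the rewrite map
    if m and m.group(2) in rewrite_map:
        return f"{GITLAB}/{rewrite_map[m.group(2)]}.git/{m.group(3)}"
    # fallback: everything else goes to the GitLab front page
    return GITLAB


if __name__ == "__main__":
    sample_map = {"tor": "tpo/core/tor"}  # hypothetical entry
    for path in ("/tor.git/info/refs", "/git/tor.git/HEAD", "/nothing"):
        print(path, "->", redirect_git(path, sample_map))
```

Running a list of known URLs through a function like this is a cheap way to check a rewrite map before deploying it.
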
> ## gitweb.torproject.org rewrite rules
>
> Those are the vastly more complicated GitWeb to GitLab rewrite
> rules.
>
> Note that we say "GitWeb" but we were actually *not* running
> [GitWeb][] but [cgit][], as the former didn't actually scale for us.
>
>   [cgit]: https://git.zx2c4.com/cgit/
>   [GitWeb]: https://git-scm.com/docs/gitweb
>
> ```
> RewriteEngine on
> # this RewriteMap connects the gitweb projects to their GitLab
> # equivalent
> RewriteMap gitolite2gitlab "txt:/etc/apache2/gitolite2gitlab.txt"
>
> # special rule to process targets of the old spec.tpo site and
> # bring them to the right redirect on the new spec.tpo site. that should turn, for example:
> #
> # https://gitweb.torproject.org/torspec.git/tree/address-spec.txt
> #
> # into:
> #
> # https://spec.torproject.org/address-spec
> RewriteRule ^/torspec.git/tree/(.*).txt$ https://spec.torproject.org/$1 [R=302]
>
> # list of endpoints taken from cgit's cmd.c
>
> # those two RewriteCond are necessary because we don't move
> # all repositories at once. once the migration is completed,
> # they can be removed.
> #
> # and yes, they are copied all over the place below
> #
> # create a match for the project name to check if the project
> # has been moved to GitLab
> RewriteCond %{REQUEST_URI} ^/(.*).git(/.*)?$
> # this makes the RewriteRule match only if there's a match in
> # the rewrite map
> RewriteCond ${gitolite2gitlab:%1|NOT_FOUND} !NOT_FOUND
> # main project page, like summary below
> RewriteRule ^/(.*).git/?$ https://gitlab.torproject.org/${gitolite2gitlab:$1}/ [R=302,L]
>
> # summary
> RewriteCond %{REQUEST_URI} ^/(.*).git/.*$
> RewriteCond ${gitolite2gitlab:%1|NOT_FOUND} !NOT_FOUND
> RewriteRule ^/(.*).git/summary/?$ https://gitlab.torproject.org/${gitolite2gitlab:$1}/ [R=302,L]
>
> # about
> RewriteCond %{REQUEST_URI} ^/(.*).git/.*$
> RewriteCond ${gitolite2gitlab:%1|NOT_FOUND} !NOT_FOUND
> RewriteRule ^/(.*).git/about/?$ https://gitlab.torproject.org/${gitolite2gitlab:$1}/ [R=302,L]
>
> # commit
> RewriteCond %{REQUEST_URI} ^/(.*).git/.*$
> RewriteCond ${gitolite2gitlab:%1|NOT_FOUND} !NOT_FOUND
> RewriteCond "%{QUERY_STRING}" "(.*(?:^|&))id=([^&]*)(&.*)?$"
> RewriteRule ^/(.*).git/commit/? https://gitlab.torproject.org/${gitolite2gitlab:$1}/-/commit/%2 [R=302,L,QSD]
> RewriteCond %{REQUEST_URI} ^/(.*).git/.*$
> RewriteCond ${gitolite2gitlab:%1|NOT_FOUND} !NOT_FOUND
> RewriteRule ^/(.*).git/commit/? https://gitlab.torproject.org/${gitolite2gitlab:$1}/-/commits/HEAD [R=302,L]
>
> # diff, incomplete because can diff arbitrary refs and files in cgit but not in GitLab, hard to parse
> RewriteCond %{REQUEST_URI} ^/(.*).git/.*$
> RewriteCond ${gitolite2gitlab:%1|NOT_FOUND} !NOT_FOUND
> RewriteCond %{QUERY_STRING} id=([^&]*)
> RewriteRule ^/(.*).git/diff/? https://gitlab.torproject.org/${gitolite2gitlab:$1}/-/commit/%1 [R=302,L,QSD]
>
> # patch
> RewriteCond %{REQUEST_URI} ^/(.*).git/.*$
> RewriteCond ${gitolite2gitlab:%1|NOT_FOUND} !NOT_FOUND
> RewriteCond %{QUERY_STRING} id=([^&]*)
> RewriteRule ^/(.*).git/patch/? https://gitlab.torproject.org/${gitolite2gitlab:$1}/-/commit/%1.patch [R=302,L,QSD]
>
> # rawdiff, incomplete because can show only one file diff, which GitLab cannot
> RewriteCond %{REQUEST_URI} ^/(.*).git/.*$
> RewriteCond ${gitolite2gitlab:%1|NOT_FOUND} !NOT_FOUND
> RewriteCond %{QUERY_STRING} id=([^&]*)
> RewriteRule ^/(.*).git/rawdiff/?$ https://gitlab.torproject.org/${gitolite2gitlab:$1}/-/commit/%1.diff [R=302,L,QSD]
>
> # log
> RewriteCond %{REQUEST_URI} ^/(.*).git/.*$
> RewriteCond ${gitolite2gitlab:%1|NOT_FOUND} !NOT_FOUND
> RewriteCond %{QUERY_STRING} h=([^&]*)
> RewriteRule ^/(.*).git/log/?$ https://gitlab.torproject.org/${gitolite2gitlab:$1}/-/commits/%1 [R=302,L,QSD]
> RewriteCond %{REQUEST_URI} ^/(.*).git/.*$
> RewriteCond ${gitolite2gitlab:%1|NOT_FOUND} !NOT_FOUND
> RewriteRule ^/(.*).git/log/?$ https://gitlab.torproject.org/${gitolite2gitlab:$1}/-/commits/HEAD [R=302,L]
> RewriteCond %{REQUEST_URI} ^/(.*).git/.*$
> RewriteCond ${gitolite2gitlab:%1|NOT_FOUND} !NOT_FOUND
> RewriteRule ^/(.*).git/log(/?.*)$ https://gitlab.torproject.org/${gitolite2gitlab:$1}/-/commits/HEAD$2 [R=302,L]
>
> # atom
> RewriteCond %{REQUEST_URI} ^/(.*).git/.*$
> RewriteCond ${gitolite2gitlab:%1|NOT_FOUND} !NOT_FOUND
> RewriteCond %{QUERY_STRING} h=([^&]*)
> RewriteRule ^/(.*).git/atom/?$ https://gitlab.torproject.org/${gitolite2gitlab:$1}/-/commits/%1 [R=302,L,QSD]
> RewriteCond %{REQUEST_URI} ^/(.*).git/.*$
> RewriteCond ${gitolite2gitlab:%1|NOT_FOUND} !NOT_FOUND
> RewriteRule ^/(.*).git/atom/?$ https://gitlab.torproject.org/${gitolite2gitlab:$1}/-/commits/HEAD [R=302,L,QSD]
>
> # refs, incomplete because two pages in GitLab, defaulting to "tags"
> RewriteCond %{REQUEST_URI} ^/(.*).git/.*$
> RewriteCond ${gitolite2gitlab:%1|NOT_FOUND} !NOT_FOUND
> RewriteRule ^/(.*).git/refs/?$ https://gitlab.torproject.org/${gitolite2gitlab:$1}/-/tags [R=302,L]
> RewriteCond %{REQUEST_URI} ^/(.*).git/.*$
> RewriteCond ${gitolite2gitlab:%1|NOT_FOUND} !NOT_FOUND
> RewriteCond %{QUERY_STRING} h=([^&]*)
> RewriteRule ^/(.*).git/tag/? https://gitlab.torproject.org/${gitolite2gitlab:$1}/-/tags/%1 [R=302,L,QSD]
>
> # tree
> RewriteCond %{REQUEST_URI} ^/(.*).git/.*$
> RewriteCond ${gitolite2gitlab:%1|NOT_FOUND} !NOT_FOUND
> RewriteCond %{QUERY_STRING} id=([^&]*)
> RewriteRule ^/(.*).git/tree(/?.*)$ https://gitlab.torproject.org/${gitolite2gitlab:$1}/-/tree/%1$2 [R=302,L,QSD]
> RewriteCond %{REQUEST_URI} ^/(.*).git/.*$
> RewriteCond ${gitolite2gitlab:%1|NOT_FOUND} !NOT_FOUND
> RewriteRule ^/(.*).git/tree(/?.*)$ https://gitlab.torproject.org/${gitolite2gitlab:$1}/-/tree/HEAD$2 [R=302,L]
>
> # /-/tree has no good default in GitLab, revert to HEAD which is a good
> # approximation (we can't assume "master" here anymore)
> RewriteCond %{REQUEST_URI} ^/(.*).git/.*$
> RewriteCond ${gitolite2gitlab:%1|NOT_FOUND} !NOT_FOUND
> RewriteRule ^/(.*).git/tree/?$ https://gitlab.torproject.org/${gitolite2gitlab:$1}/-/tree/HEAD [R=302,L]
>
> # plain
> RewriteCond %{REQUEST_URI} ^/(.*).git/.*$
> RewriteCond ${gitolite2gitlab:%1|NOT_FOUND} !NOT_FOUND
> RewriteCond %{QUERY_STRING} h=([^&]*)
> RewriteRule ^/(.*).git/plain(/?.*)$ https://gitlab.torproject.org/${gitolite2gitlab:$1}/-/raw/%1$2 [R=302,L,QSD]
> RewriteCond %{REQUEST_URI} ^/(.*).git/.*$
> RewriteCond ${gitolite2gitlab:%1|NOT_FOUND} !NOT_FOUND
> RewriteRule ^/(.*).git/plain(/?.*)$ https://gitlab.torproject.org/${gitolite2gitlab:$1}/-/raw/HEAD$2 [R=302,L]
>
> # blame: disabled
> #RewriteCond %{REQUEST_URI} ^/(.*).git/.*$
> #RewriteCond ${gitolite2gitlab:%1|NOT_FOUND} !NOT_FOUND
> #RewriteCond %{QUERY_STRING} h=([^&]*)
> #RewriteRule ^/(.*).git/blame(/?.*)$ https://gitlab.torproject.org/${gitolite2gitlab:$1}/-/blame/%1$2 [R=302,L,QSD]
> # same default as tree above
> #RewriteCond %{REQUEST_URI} ^/(.*).git/.*$
> #RewriteCond ${gitolite2gitlab:%1|NOT_FOUND} !NOT_FOUND
> #RewriteRule ^/(.*).git/blame(/?.*)$ https://gitlab.torproject.org/${gitolite2gitlab:$1}/-/blame/HEAD/$2 [R=302,L]
>
> # stats
> RewriteCond %{REQUEST_URI} ^/(.*).git/.*$
> RewriteCond ${gitolite2gitlab:%1|NOT_FOUND} !NOT_FOUND
> RewriteRule ^/(.*).git/stats/?$ https://gitlab.torproject.org/${gitolite2gitlab:$1}/-/graphs/HEAD [R=302,L]
>
> # still TODO:
> # repolist: once migration is complete
> #
> # cannot be done:
> # atom: needs a feed token, user must be logged in
> # blob: no direct equivalent
> # info: not working on main cgit website?
> # ls_cache: not working, irrelevant?
> # objects: undocumented?
> # snapshot: pattern too hard to match on cgit's side
>
> # special case, we keep a copy of the main index on the archive
> RewriteRule ^/?$ https://archive.torproject.org/websites/gitweb.torproject.org.html [R=302,L]
> # Fallback: everything else to GitLab
> RewriteRule .* https://gitlab.torproject.org [R=302,L]
> ```
>
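
As an illustration of how the query-string rules work, the two "commit" rules can be emulated in Python as follows. This is a simplified sketch: it handles only the `commit` endpoint, uses the rule's own capture for the map lookup, and takes the rewrite map as a dict. The `redirect_commit` name and the sample entry are hypothetical.

```python
import re
from urllib.parse import urlsplit

GITLAB = "https://gitlab.torproject.org"
# Patterns copied from the "commit" rules above.
COMMIT = re.compile(r"^/(.*).git/commit/?")
COMMIT_ID = re.compile(r"(.*(?:^|&))id=([^&]*)(&.*)?$")


def redirect_commit(url, rewrite_map):
    """Emulate the two "commit" rules; return None if they do not apply."""
    parts = urlsplit(url)
    m = COMMIT.match(parts.path)
    if not m or m.group(1) not in rewrite_map:
        return None
    project = rewrite_map[m.group(1)]
    q = COMMIT_ID.search(parts.query)
    if q:
        # first rule: reuse the id and drop the query string (QSD)
        return f"{GITLAB}/{project}/-/commit/{q.group(2)}"
    # second rule: no id, so point at the commit list for HEAD
    # (Apache would also carry over any remaining query string here)
    return f"{GITLAB}/{project}/-/commits/HEAD"


if __name__ == "__main__":
    sample_map = {"tor": "tpo/core/tor"}  # hypothetical entry
    print(redirect_commit(
        "https://gitweb.torproject.org/tor.git/commit/?id=abc123",
        sample_map,
    ))
```

The other endpoints follow the same shape: match the path, look up the project, then pick a GitLab URL depending on whether `id` or `h` is present in the query string.
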
> The reference copy of those is available in our (currently private)
> Puppet git repository.