How do you organise multiple git repositories so that all of them are backed up together?
  • Time: 2008-08-31 13:54:20

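With SVN, I had a single big repository that I kept on a server and checked out on a few machines. This was a pretty good backup system, and it let me work easily on any of the machines. I could check out a specific project, commit, and have it update the "master" project, or I could check out the entire thing.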

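Now, I have a bunch of git repositories for various projects, several of which are on github. I also have the SVN repository I mentioned, imported via the git-svn command.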

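Basically, I like having all my code (not just projects, but random snippets and scripts, and things like my CV, articles I've written, websites I've made and so on) in one big repository that I can easily clone onto remote machines, or onto memory sticks/hard drives as a backup.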

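The problem is that it is a private repository, and git doesn't allow checking out a specific folder (which I could then push to github as a separate project, while having the changes appear in both the master repo and the sub-repos).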

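I could use the git submodule system, but it doesn't act how I want it to either (submodules are pointers to other repositories and don't really contain the actual code, so they are useless for backup).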

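Currently I have a folder of git repos (for example, ~/code_projects/proj1/.git/ and ~/code_projects/proj2/.git/). After making changes to proj1 I do git push github, then I copy the files into ~/Documents/code/python/projects/proj1/ and do a single commit (instead of the numerous ones in the individual repos). Then I do git push backupdrive1, git push mymemorystick, etc.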

So, the question: how do you manage your personal code and projects with git repositories, and keep them synced and backed up?

Best answer

I would strongly advise against putting unrelated data in a given Git repository. The overhead of creating new repositories is quite low, and that is a feature that makes it possible to keep different lineages completely separate.

Fighting that idea means ending up with unnecessarily tangled history, which makes administration more difficult and, more importantly, makes "archeology" tools less useful because of the resulting dilution. Also, as you mentioned, Git assumes that the "unit of cloning" is the repository, and practically has to do so because of its distributed nature.

One solution is to keep every project/package/etc. as its own bare repository (i.e., without working tree) under a blessed hierarchy, like:

/repos/a.git
/repos/b.git
/repos/c.git

Once a few conventions have been established, it becomes trivial to apply administrative operations (backup, packing, web publishing) to the complete hierarchy, which serves a role not entirely dissimilar to "monolithic" SVN repositories. Working with these repositories also becomes somewhat similar to SVN workflows, with the addition that one can use local commits and branches:

svn checkout   --> git clone
svn update     --> git pull
svn commit     --> git push
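
As an illustration of applying one administrative operation to the whole hierarchy, here is a minimal sketch of a backup pass. The function name `backup_all_repos` and both directory arguments are assumptions made for this example; it repacks each bare repository and writes a single-file bundle of all its refs.

```shell
# Sketch: repack and bundle every bare repository under a root directory.
# usage: backup_all_repos <repo root> <backup dir>   (both paths are assumptions)
backup_all_repos() {
    repo_root=$1
    backup_dir=$2
    mkdir -p "$backup_dir" || return 1
    for repo in "$repo_root"/*.git; do
        [ -d "$repo" ] || continue               # the glob matched nothing
        name=$(basename "$repo" .git)
        # Skip repositories without any refs; there is nothing to bundle yet.
        [ -n "$(git --git-dir="$repo" for-each-ref --count=1)" ] || continue
        git --git-dir="$repo" gc --quiet         # housekeeping / packing
        git --git-dir="$repo" bundle create "$backup_dir/$name.bundle" --all || return 1
    done
}
```

Called as, say, `backup_all_repos /repos /mnt/backupdrive1/bundles`, this leaves one `.bundle` file per project, and each bundle can itself be used as the source of a `git clone`.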

You can have multiple remotes in each working clone, for the ease of synchronizing between the multiple parties:

$ cd ~/dev
$ git clone /repos/foo.git       # or the one from github, ...
$ cd foo
$ git remote add github ...
$ git remote add memorystick ...

You can then fetch/pull from each of the "sources", work and commit locally, and then push ("backup") to each of these remotes when you are ready with something like (note how that pushes the same commits and history to each of the remotes!):

$ for remote in origin github memorystick; do git push $remote; done
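
If the list of remotes changes over time, a variant that asks git for the configured remotes avoids hard-coding their names. This is only a sketch; the function name `push_all_remotes` is made up for this example.

```shell
# Sketch: push the current branch to every remote configured in this clone.
# Returns non-zero if any push failed, so an unplugged drive does not go unnoticed.
push_all_remotes() {
    status=0
    for remote in $(git remote); do
        if ! git push "$remote" HEAD; then
            echo "push to $remote failed" >&2
            status=1
        fi
    done
    return $status
}
```

Run inside a working clone, this pushes the checked-out branch under the same name to each remote in turn and keeps going if one of them is unreachable.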

The easiest way to turn an existing working repository ~/dev/foo into such a bare repository is probably:

$ cd ~/dev
$ git clone --bare foo /repos/foo.git
$ mv foo foo.old
$ git clone /repos/foo.git

which is mostly equivalent to an svn import, but does not throw the existing "local" history away.
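
Before deleting the renamed working copy, it may be worth checking that the bare repository really has the same branch and tag tips. The helper below is a sketch with a made-up name (`same_refs`); it compares only refs/heads and refs/tags, so uncommitted changes and stashes in the old working copy are not covered.

```shell
# Sketch: succeed only if two repositories have identical branch and tag tips.
# usage: same_refs <repo dir> <other repo dir>
same_refs() {
    a=$(git -C "$1" for-each-ref --format='%(objectname) %(refname)' refs/heads refs/tags) || return 1
    b=$(git -C "$2" for-each-ref --format='%(objectname) %(refname)' refs/heads refs/tags) || return 1
    [ "$a" = "$b" ]
}
```

For the paths used above this could be run as `same_refs ~/dev/foo.old /repos/foo.git && echo "safe to remove foo.old"`.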

Note: submodules are a mechanism to include shared related lineages, so I indeed wouldn't consider them an appropriate tool for the problem you are trying to solve.

Other answers

I would like to add to Damien's answer, where he recommends:

$ for remote in origin github memorystick; do git push $remote; done

You can set up a special remote that pushes to all the individual real remotes with one command; I found it at http://marc.info/?l=git&m=116231242118202&w=2:

So for "git push" (where it makes sense to push the same branches multiple times), you can actually do what I do:

  • .git/config contains:

    [remote "all"]
    url = master.kernel.org:/pub/scm/linux/kernel/git/torvalds/linux-2.6
    url = login.osdl.org:linux-2.6.git
    
  • and now git push all master will push the "master" branch to both
    of those remote repositories.
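
The same configuration can be produced from the command line instead of editing .git/config by hand. The following is a minimal sketch; the function name `add_all_remote` is made up here, while `git remote add` and `git remote set-url --add` are standard git commands.

```shell
# Sketch: create a remote named "all" that has several URLs.
# usage: add_all_remote <url> [<url>...]
add_all_remote() {
    git remote add all "$1" || return 1
    shift
    for url in "$@"; do
        # --add appends another url line instead of replacing the first one
        git remote set-url --add all "$url" || return 1
    done
}
```

After `add_all_remote <first url> <second url>`, a `git push all master` pushes to each URL in turn. Fetching from such a remote only contacts the first URL, so it is best treated as push-only.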

You can also save yourself from typing the URLs twice by using the following construction:

[url "<actual url base>"]
    insteadOf = <other url base>
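
As a sketch of how this rewrite rule could be set from the command line, the helper below writes it into the current repository's config. The function name `set_url_shorthand` and the `gh:` prefix in the usage note are hypothetical, chosen only for illustration.

```shell
# Sketch: make <short prefix> expand to <actual url base> in this repository.
# usage: set_url_shorthand <short prefix> <actual url base>
set_url_shorthand() {
    git config "url.$2.insteadOf" "$1"
}
```

For example, after `set_url_shorthand gh: git@github.com:`, the command `git ls-remote --get-url gh:someuser/someproject` shows the expanded URL without contacting the server, which is a quick way to check that the rule matches.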

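I am also curious about suggested ways to handle this, and will describe the current setup that I use (with SVN). I have basically created a repository that contains a mini filesystem hierarchy, including its own bin and lib directories. There is a script in the root of this tree that sets up your environment by adding these bin, lib, and other directories to the proper environment variables. So the root directory essentially looks like: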

./bin/            # prepended to $PATH
./lib/            # prepended to $LD_LIBRARY_PATH
./lib/python/     # prepended to $PYTHONPATH
./setup_env.bash  # sets up the environment
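
The actual setup_env.bash is not shown, so the following is only a guess at its core, written as a function that takes the checkout root as an argument. The name `setup_env` and the argument convention are assumptions; the three variables are the ones listed above.

```shell
# Sketch: prepend a checkout's bin and lib directories to the environment.
# usage: setup_env <checkout root>   (must be called in the current shell, not a subshell)
setup_env() {
    root=$(cd "$1" && pwd) || return 1           # absolute path of the checkout
    export PATH="$root/bin:$PATH"
    # ${VAR:+:$VAR} adds the separator only when the variable was already set
    export LD_LIBRARY_PATH="$root/lib${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
    export PYTHONPATH="$root/lib/python${PYTHONPATH:+:$PYTHONPATH}"
}
```

Because the variables must change in the caller's shell, a script like this has to be sourced (for example `. ./setup_env.bash`) rather than executed.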

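Now inside ./bin and ./lib there are the multiple projects and their corresponding libraries. I know this isn't a standard project layout, but it is very easy for someone else in my group to check out the repo, run the setup_env.bash script, and have the most up-to-date versions of all of the projects locally in their checkout. They don't have to worry about installing or updating /usr/bin or /usr/lib, and it keeps it simple to have multiple checkouts with a very localized environment per checkout. Someone can also just rm the entire repository and not worry about uninstalling any programs.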

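This is working fine for us and I'm not sure if we'll change it. The problem is that there are many projects in this one big repository. Is there a standard git/Hg/bzr way of creating an environment like this and breaking the projects out into their own repositories?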

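I haven't tried nesting git repositories yet, because I haven't run into a situation where I needed to. As I've read on the #git channel, git seems to get confused by nested repositories, i.e. when you try to git init inside a git repository. The only way to manage a nested git structure is to use either git submodule or Android's repo utility.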

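As for the backup responsibility you're describing, I say delegate it. I usually put the "origin" repository for each project on a network drive at work, which is backed up regularly by the IT techs using their backup strategy of choice. It is simple and I don't have to worry about it.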

What about using mr to manage your multiple Git repos at once:

The mr(1) command can checkout, update, or perform other actions on a set of repositories as if they were one combined repository. It supports any combination of subversion, git, cvs, mercurial, bzr, darcs, vcsh, fossil and veracity repositories, and support for other revision control systems can easily be added. [...]

It is extremely configurable via simple shell scripting. Some examples of things it can do include:

[...]

  • When updating a git repository, pull from two different upstreams and merge the two together.
  • Run several repository updates in parallel, greatly speeding up the update process.
  • Remember actions that failed due to a laptop being offline, so they can be retried when it comes back online.

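There is another method for having nested git repos, but it doesn't solve the problem you're after. Still, for others who are looking for the solution I was looking for: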

In the top-level git repo, just hide the folder containing the nested git repo by listing it in .gitignore. This makes it easy to have two separate (but nested!) git repos.
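
A minimal sketch of that setup, with a made-up helper name (`nest_repo`) and placeholder directory arguments:

```shell
# Sketch: create a nested repository and hide it from the outer repository.
# usage: nest_repo <outer repo dir> <nested dir name>
nest_repo() {
    git init -q "$1/$2" || return 1
    # A leading slash anchors the pattern to the outer repo's top level;
    # the trailing slash makes it match directories only.
    echo "/$2/" >> "$1/.gitignore"
}
```

After this, `git status` in the outer repository no longer lists the nested directory, and each repository keeps its own history. The nested one is not included when the outer one is cloned or pushed, so it needs its own backup.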




