Git vs. Subversion – Which to Use for Your Next Project

I recently did some research on whether to use Git or Subversion for a new project, and decided to include it in my blog (with permission).  While I don’t formally give attributions, any items in quotes came from other resources on the web.

 

Git Advantages

  1. Just as Subversion was the next evolution of open source code control after CVS, Git is the next evolutionary step after Subversion
  2. Git offers distributed and federated branching, as opposed to Subversion’s limitation of a single server with multiple clients.  A Git client can clone a remote Git repo.  The user can make changes in units of work and commit each one locally, repeating that cycle as often as needed.  When ready, they push their changes back to the remote repo, where everyone else can later pull them.  This gives the developer the safety of committing small changes as individual units, as part of a much larger change, and then submitting that larger change as a single unit when ready (while continually pulling the latest changes from the remote repo)
  3. In addition, other users can be set up to clone and pull from the user’s local repo, resulting in a federated model.  Nothing in Git itself designates which repo is the official master; that is decided purely by convention and agreement.  Git uses a peer-to-peer model, while Subversion is client-server.  This becomes even more important should the official repo be lost for some reason, since any full clone can be used to restore it
  4. Because it is a distributed model, the workflow is established by the developer, not by the centralized repository owner.  “Git does not depend on a centralized server, but does have the ability to synchronize with other Git repositories – to push and pull changes between them. This means that you can add multiple remote repositories to your project, some read-only and some possibly with write access as well, meaning you can have nearly any type of workflow you can think of”
  5. “Due to being distributed, you inherently do not have to give commit access to other people in order for them to use the versioning features. Instead, you decide when to merge what from whom.  That is, because Subversion controls access, in order for daily checkins to be allowed – for example – the user requires commit access. In Git, users are able to have version control of their own work while the source is controlled by the repo owner.”
  6. Creating a new branch in Git is much quicker (about 5 seconds), easier, and less centralized than in Subversion.  A developer can make that decision locally without having to consult or impact anyone else.  Until all parties agree, that branch can remain invisible to the rest of the team.   This allows more experimentation, parallel development, and rollback of failed prototypes or Scrum spikes with no impact on the rest of the team.  A new branch requires only 41 bytes (a file holding the 40-character SHA-1 of a commit plus a newline), and deleting a branch means just deleting that single file (though there are commands to do it).  “Creating a repository is a trivial operation: mkdir foo; cd foo; git init. That’s it”
  7. Branches and labels (tags) are not just copies that can be altered, but are true first-class citizens in Git.  Git has real tags, as opposed to Subversion’s use of branch-like copies to simulate tags.  At each point in history, Git generates a SHA-1 hash that identifies the state of the code, so it is easy to trace the history if someone tries to tamper with the code or mistakenly deploys the wrong code into production environments.  Git has a very strong audit trail
  8. Integrating and merging branches is far easier and less conflict-ridden (with fewer chances of accidents or problems) in Git.  Git has very strong merge algorithms.  Developers can do full merges locally before having to push the merge back into the main branch
  9. “Branch merging is simpler and more automatic in Git. In Subversion you need to remember what was the last revision you merged from so you can generate the correct merge command. Git does this automatically, and always does it right. Which means there’s less chance of making a mistake when merging two branches together”
  1. “Branch merges are recorded as part of the proper history of the repository. If I merge two branches together, or if I merge a branch back into the trunk it came from, that merge operation is recorded as part of the repository history as having been performed by me, and when. It’s hard to dispute who performed the merge when it’s right there in the log”
  2. “If you have partial merges for a work in progress, you will take advantage of the Git staging area (index) to commit only what you need [break it up and check in what you want now], stash the rest, and move on to another branch.”  By “stash,” he means that if you are working on a project and a bug comes in from production, you can stash your current work using a built-in Git function, seamlessly switch to the production branch, make the code change, commit it, and then unstash your work and continue working just as you were before
  3. When you clone with Git, you get the entirety of the repo, not just the one branch.  Your local copy has the full history, branches, merges, versions, and everything else.  This is what allows you to work fully offline, without a network connection.  In addition, each new branch created carries forward the pre-branch history
  4. It is faster.  “Since all operations (except for push and fetch) are local there is no network latency involved to a) perform a diff, b) view file history, c) commit changes, d) merge branches, e) obtain any other revision of a file (not just the prior committed revision), or f) switch branches”
  5. Git stores its information in a more compressed manner than Subversion, which offsets the size cost of keeping the entire repository history locally.  “Git’s file format is very good at compressing data, despite it’s a very simple format. The Mozilla project’s CVS repository is about 3 GB; it’s about 12 GB in Subversion’s fsfs format. In Git it’s around 300 MB”
  6. “The repository’s internal file formats are incredibly simple. This means repair is very easy to do, but even better because it’s so simple it’s very hard to get corrupted. I don’t think anyone has ever had a Git repository get corrupted. I’ve seen Subversion with fsfs corrupt itself. And I’ve seen Berkeley DB corrupt itself too many times to trust my code to the bdb backend of Subversion”
  7. Git does not scatter metadata folders through every subdirectory the way SVN does with its little .svn folders, which can cause minor problems sometimes (Subversion 1.7 and later also moved to a single top-level .svn folder, so this applies to older working copies).  All the Git information is stored in a single .git folder at the top level of the repo.  In SVN, I’ve dealt with developers from novice to expert, and the novices and intermediates tend to introduce file conflicts when they copy a folder from another SVN project in order to reuse it.  In Git, I think you just copy the folder and it works, because Git doesn’t put .git folders in its subfolders.
  8. “SVN is the third implementation of a revision control: RCS, then CVS and finally SVN manage directories of versioned data. SVN offers VCS features (labeling and merging), but its tag is just a directory copy (like a branch, except you are not “supposed” to touch anything in a tag directory), and its merge is still complicated, currently based on meta-data added to remember what has already been merged.  Git is a file content management (a tool made to merge files), evolved into a true Version Control System, based on a DAG (Directed Acyclic Graph) of commits, where branches are part of the history of datas (and not a data itself), and where tags are a true meta-data.”  In other words, Git began as a tool for merging file content and evolved into a true VCS, and that origin is what makes it so much more powerful than Subversion
  9. You have to go with a DVCS; it is like a quantum leap in source management
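
The local-commit, private-branch, stash, and local-merge workflow described in the list above can be sketched roughly as follows.  This is only an illustration under assumptions: the repository names (central.git, dev), the branch names (main, experiment), the file names, and the throwaway identity are all made up for the demo, and a local bare repository stands in for the shared remote.

```shell
#!/bin/sh
set -e

# Scratch area; every path and name below is hypothetical.
work=$(mktemp -d)
cd "$work"

# A bare repository stands in for the shared "official" remote.
git init -q --bare central.git
git --git-dir=central.git symbolic-ref HEAD refs/heads/main

# Clone it and set a throwaway identity so commits succeed anywhere.
git clone -q central.git dev
cd dev
git config user.name "Demo Dev"
git config user.email "dev@example.com"
git symbolic-ref HEAD refs/heads/main   # name the first branch explicitly

# Two small units of work, each committed locally (no network needed).
echo "step one" > notes.txt
git add notes.txt
git commit -q -m "First unit of work"
echo "step two" >> notes.txt
git commit -q -am "Second unit of work"

# A private experiment branch; nobody else sees it unless it is pushed.
git checkout -q -b experiment
echo "prototype" > spike.txt
git add spike.txt
git commit -q -m "Try an idea"

# Interrupted by a production bug: shelve the unfinished edit with stash.
echo "half-finished" >> spike.txt
git stash -q
git checkout -q main
echo "hotfix" > fix.txt
git add fix.txt
git commit -q -m "Fix production bug"

# Resume the experiment exactly where it was left.
git checkout -q experiment
git stash pop -q
git commit -q -am "Finish the idea"

# Merge locally into main, then publish everything with a single push.
git checkout -q main
git merge -q --no-edit experiment
git push -q origin main

local_commits=$(git rev-list --count main)
remote_commits=$(git --git-dir="$work/central.git" rev-list --count main)
remote_experiment=$(git --git-dir="$work/central.git" branch --list experiment)
echo "local: $local_commits, remote: $remote_commits"
```

Everything before the final push happens on the developer’s machine.  The experiment branch name never reaches the remote; only its commits arrive there, as part of the merged main branch.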

 

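The audit-trail and real-tag points above can be checked with a small experiment.  This is a sketch under assumptions rather than a recommended procedure: the file name, tag name, commit messages, and identity are invented for the demo, and amending a commit is used only as a stand-in for tampering.

```shell
#!/bin/sh
set -e

# Throwaway repository; all names here are hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo Dev"
git config user.email "dev@example.com"

echo "release code" > app.txt
git add app.txt
git commit -q -m "Release candidate"
released=$(git rev-parse HEAD)

# An annotated tag is a real object with its own author, date, and message.
git tag -a v1.0 -m "Version 1.0"
tag_type=$(git cat-file -t v1.0)

# "Tamper" with the release by rewriting the commit under the same message.
echo "sneaky change" >> app.txt
git commit -q --amend -a -m "Release candidate"
rewritten=$(git rev-parse HEAD)

# The tag still names the original commit, so the rewrite is detectable.
tagged=$(git rev-parse "v1.0^{commit}")

# Object IDs depend only on content: same bytes, same ID.
id_a=$(printf 'same bytes\n' | git hash-object --stdin)
id_b=$(printf 'same bytes\n' | git hash-object --stdin)
id_c=$(printf 'same bytes!\n' | git hash-object --stdin)

echo "tag object type: $tag_type"
echo "released:  $released"
echo "rewritten: $rewritten"
```

Because the hash is derived from the content, the rewritten commit cannot keep the ID of the one that was tagged, which is what makes the wrong code easy to spot.
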
Subversion Advantages

 The following are the advantages of Subversion:

  1. You can check out just part of a branch instead of having to check out the entire thing
  2. Subversion is stronger in storing and managing very large binary files.  “SVN is the only VCS (distributed or not) that doesn’t choke on my TrueCrypt files (please correct me if there’s another VCS that handles 500MB+ files effectively). This is because diff comparisons are streamed (this is a very essential point). Rsync is unacceptable because it’s not 2-way.”
  3. There were earlier problems with using Git on Windows back in 2008 due to lack of support, but that has been addressed at this point
  4. If your development is linear and simpler (without requiring branches and parallel work), you should stick with Subversion
  5. Because Subversion has been around longer, it may have better tool support.  This was more of a problem around 5 years ago, but Git has mainstream tool adoption at this point
  6. Most people already know how to use Subversion rather than Git.  Adopting Git will involve some internal training (which I can do), not only on how to use Git, but on how to use Git as it was intended (and not as if one were using Subversion)
  7. “Walking through versions is simpler in Subversion because it uses sequential revision numbers (1,2,3,..); Git uses unpredictable SHA-1 hashes. Walking backwards in Git is easy using the “^” syntax, but there is no easy way to walk forward”
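
To make the last point concrete, here is a small sketch of walking backward and forward through Git history.  The repository, file name, commit messages, and identity are made up for the demo, and the forward step shown is just one possible workaround (listing the history oldest-first), not a built-in equivalent of “^”.

```shell
#!/bin/sh
set -e

# Throwaway repository with three linear commits; all names are hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo Dev"
git config user.email "dev@example.com"

for n in 1 2 3; do
  echo "revision $n" > file.txt
  git add file.txt
  git commit -q -m "Revision $n"
done

# Walking backward: each ^ (or ~1) steps from a commit to its parent.
parent_subject=$(git log -1 --format=%s HEAD^)
grandparent_subject=$(git log -1 --format=%s HEAD~2)

# Walking forward has no dedicated syntax, because a commit records its
# parents but not its children. One workaround: list the commits between
# the starting point and HEAD oldest-first and take the first entry.
start=$(git rev-parse HEAD~2)
next=$(git rev-list --reverse --ancestry-path "$start"..HEAD | head -n 1)
next_subject=$(git log -1 --format=%s "$next")

# A rough substitute for a sequential revision number: count the commits.
revision_count=$(git rev-list --count HEAD)

echo "parent of HEAD:      $parent_subject"
echo "grandparent of HEAD: $grandparent_subject"
echo "child of the first:  $next_subject"
echo "commits so far:      $revision_count"
```

The forward walk only works relative to a known descendant (HEAD here), which is the practical difference from Subversion’s revision numbers.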
