This is a guest post from reader Björn Stiel.
Let’s Git Excel Under Version Control!
Git has become one of the most popular and trusted version control systems. In fact, so popular that even Microsoft moved their Windows codebase to Git. Why is Git so popular? Well, Git’s branching model is simple and works (branches provide isolated environments for changes without interfering with production-quality code on master and other branches). Also, Git is very flexible. It can be extended and customized in a myriad of imaginable (and unimagibale) ways, which is the basis for any successful ecosystem.
So what about version-controlling Excel workbooks then? If Git works great for code and is customizable, what does it actually take to make it work for Excel workbooks, too? Let’s get started with what happens when we put an Excel workbook file under Git version-control (I’m assuming some basic Git knowledge here, so if you are an absolute Git noob, have a look the excellent Atlassian Git tutorial, or, if you are are very short on time, check out the simple Git guide.
Let’s hit the ground running and get started with an example repository that contains a simple Excel workbook named Book1.xlsb. Clone the repo…
|
C:\Users\Bjoern>git clone https://github.com/ZoomerAnalytics/git-xltrail-examples.git |
…and have a look at Book1’s version history (aka the file’s commit log):
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25
|
C:\Users\Bjoern>git log Book1.xlsb commit 429ee1ff383b8c706aa69c6a87f3a2c50fa1bcd1 Author: Bjoern Stiel <bjoern.stiel@zoomeranalytics.com> Date: Fri Feb 9 14:17:14 2018 +0000 Added new VBA Module commit 04b45b99c883e5d184a20cfd73e4556ef8d06bfd Author: Bjoern Stiel <bjoern.stiel@zoomeranalytics.com> Date: Fri Feb 9 14:14:34 2018 +0000 Added README.md commit f9b5dadca880808ea1e73e252cc8ee236f98499e Author: Bjoern Stiel <bjoern.stiel@zoomeranalytics.com> Date: Fri Feb 9 14:14:16 2018 +0000 Added ExtractNumber VBA function commit 37b7bfde9464209217dc57214d16a6cd707d6f21 Author: Bjoern Stiel <bjoern.stiel@zoomeranalytics.com> Date: Fri Feb 9 14:06:33 2018 +0000 First commit |
So, you might wonder what changes were actually made in the latest commit “Added new VBA Module”? That is, you want to see the changes (the diff) between commits 04b45b99c883e5d184a20cfd73e4556ef8d06bfd
and 429ee1ff383b8c706aa69c6a87f3a2c50fa1bcd1
:
|
C:\Users\Bjoern>git diff 04b45b99c883e5d184a20cfd73e4556ef8d06bfd 429ee1ff383b8c706aa69c6a87f3a2c50fa1bcd1 diff --git a/Book1.xlsb b/Book1.xlsb index 24f1c48..3a9d8e7 100644 Binary files a/Book1.xlsb and b/Book1.xlsb differ |
Well, that isn’t very helpful, is it? The problem is that Git does not understand Excel workbook files; to Git, an Excel workbook is just any binary file and therefore a black box.
Fortunately, thanks to Git’s extensibility and modularity, we can configure it to use another application to “diff” Excel file formats. Question is: Which application can diff Excel workbooks? Microsoft’s Spreadsheet Compare (which is part of Office Professional Plus and Office 365) is one option but there is quite a bit of technical DIY plumbing involved. If you are interested in the details, you can find them here.
In order to make Git Excel-ready almost out of the box, I created git-xltrail, a free, open-source Git extension. git-xltrail comes with a custom Excel workbook differ that understands the VBA code inside your workbook (handling spreadsheets is on the roadmap). And git-xltrail also takes care of the correct Git configuration so that all the Excel oddities (such as temporary files) are handled.
To get started, download and run the latest installer version. This installs the Git command-line extension, the workbook differ, and configures a few environment variables (more details can be found here). As soon as the installation is complete, open a new command line window:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19
|
C:\Users\Bjoern>git xltrail git-xltrail/0.1.0 (windows; Python 3.6.4;) git xltrail <command> [<args>] Git xltrail is a system for managing Excel workbook files in association with a Git repository. Git xltrail: * installs a special git-diff for Excel files * makes Git ignore temporary Excel files via .gitignore Commands -------- * git xltrail install: Install Git xltrail. * git xltrail uninstall: Uninstall Git xltrail. * git xltrail version: Report the version number. |
Run git xltrail install
once to make git-xltrail work across all your (existing and future) repositories (the fire-and-forget solution). Alternatively, git-xltrail can be installed on a per-repository basis. In the repository’s root folder, run:
|
C:\Users\Bjoern>git xltrail install --local |
When installing git-xltrail in local mode, git-xltrail creates .gitattributes
(or modifies in case it already exists); make sure .gitattributes
is tracked as part of your repository.
With git-xltrail installed (either globally or locally in the example repository), revisit the example repository’s
commit history and compare the latest two versions. This time, you actually get a meaningful diff:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44
|
C:\Users\Bjoern>git diff 04b45b99c883e5d184a20cfd73e4556ef8d06bfd 429ee1ff383b8c706aa69c6a87f3a2c50fa1bcd1 diff --xltrail a/Book1.xlsb b/Book1.xlsb --- /dev/null +++ b/Book1.xlsb/VBA/Module2 +Option Explicit + +Function Count_once(Count_range As Range) As Long + +Dim strAddress As String +Dim lMaxRow As Long, lEndCol As Long, lStartCol As Long +Dim lColCount As Long +Dim lLoop As Long, lArrElement As Long +Dim lArray() As Long + + + + lMaxRow = Rows.Count + + lColCount = Count_range.Columns.Count + + lEndCol = Count_range.Columns(lColCount).Column + + ReDim lArray(lColCount) + + lStartCol = Count_range.Columns(1).Column + + For lLoop = lStartCol To lEndCol + lArray(lArrElement) = Cells(lMaxRow, lLoop).End(xlUp).Row + lArrElement = lArrElement + 1 + Next lLoop + + lMaxRow = WorksheetFunction.Max(lArray) + + strAddress = Range(Count_range.Cells(1, 1), _ + Cells(lMaxRow, lEndCol)).Address + + + Count_once = Evaluate("sumproduct((" & strAddress & "<>"""")/" _ + & "countif(" & strAddress & "," & strAddress & "&""""))") - 1 + +End Function + + |
Much more helpful this time round. You can try it yourself by editing the VBA inside your working copy’s Book1.xlsb (and/or the text file README.md in the repository to see how it works when you edit Excel- and non-Excel files in the same commit) and compare your working copy versus your working version:
With this under your toolbelt, you can cross-check expected versus actual code changes and use branching so that you don’t mess up your production workbooks. In short, you can write better VBA code. What’s next for git-xltrail? We are planning to support merging (and to make our diffs support not only VBA but also worksheets). If you would like to see a feature, you are very welcome to open an issue or contribute to the project.
Documentation is available: