Excel Version Control with Git

This is a guest post from reader Björn Stiel.


Let’s Git Excel Under Version Control!

Git has become one of the most popular and trusted version control systems. In fact, it is so popular that even Microsoft moved their Windows codebase to Git. Why is Git so popular? Well, Git’s branching model is simple and works (branches provide isolated environments for changes without interfering with production-quality code on master and other branches). Also, Git is very flexible. It can be extended and customized in a myriad of imaginable (and unimaginable) ways, which is the basis for any successful ecosystem.

So what about version-controlling Excel workbooks then? If Git works great for code and is customizable, what does it actually take to make it work for Excel workbooks, too? Let’s get started with what happens when we put an Excel workbook file under Git version control. (I’m assuming some basic Git knowledge here, so if you are an absolute Git noob, have a look at the excellent Atlassian Git tutorial or, if you are very short on time, check out the simple Git guide.)

Let’s hit the ground running and get started with an example repository that contains a simple Excel workbook named Book1.xlsb. Clone the repo…

…and have a look at Book1’s version history (aka the file’s commit log):

So, you might wonder what changes were actually made in the latest commit “Added new VBA Module”? That is, you want to see the changes (the diff) between commits 04b45b99c883e5d184a20cfd73e4556ef8d06bfd and 429ee1ff383b8c706aa69c6a87f3a2c50fa1bcd1:

Well, that isn’t very helpful, is it? The problem is that Git does not understand Excel workbook files; to Git, an Excel workbook is just any binary file and therefore a black box.

Fortunately, thanks to Git’s extensibility and modularity, we can configure it to use another application to “diff” Excel file formats. Question is: Which application can diff Excel workbooks? Microsoft’s Spreadsheet Compare (which is part of Office Professional Plus and Office 365) is one option but there is quite a bit of technical DIY plumbing involved. If you are interested in the details, you can find them here.

In order to make Git Excel-ready almost out of the box, I created git-xltrail, a free, open-source Git extension. git-xltrail comes with a custom Excel workbook differ that understands the VBA code inside your workbook (handling spreadsheets is on the roadmap). And git-xltrail also takes care of the correct Git configuration so that all the Excel oddities (such as temporary files) are handled.

To get started, download and run the latest installer version. This installs the Git command-line extension, the workbook differ, and configures a few environment variables (more details can be found here). As soon as the installation is complete, open a new command line window:

Run git xltrail install once to make git-xltrail work across all your (existing and future) repositories (the fire-and-forget solution). Alternatively, git-xltrail can be installed on a per-repository basis. In the repository’s root folder, run:

When installed in local mode, git-xltrail creates .gitattributes (or modifies it if it already exists); make sure .gitattributes is tracked as part of your repository.

With git-xltrail installed (either globally or locally in the example repository), revisit the example repository’s commit history and compare the latest two versions. This time, you actually get a meaningful diff:

Much more helpful this time round. You can try it yourself by editing the VBA inside your working copy’s Book1.xlsb (and/or the text file README.md in the repository, to see how it works when you edit Excel and non-Excel files in the same commit) and comparing your working copy against the last committed version:

With this in your toolbelt, you can cross-check expected versus actual code changes and use branching so that you don’t mess up your production workbooks. In short, you can write better VBA code. What’s next for git-xltrail? We are planning to support merging (and to make our diffs support not only VBA but also worksheets). If you would like to see a feature, you are very welcome to open an issue or contribute to the project.

Documentation is available:

Userform Textbox Autocomplete

I’m working on a project where the user types some stuff into a textbox. A good portion of the time, what the user will type will match one of the last few things he typed. I wanted the textbox to autocomplete if there was a match to a list. Pretty simple, I think. For purposes of this demonstration, I’m going to match to a list of random sentences in a listbox.

I had to use that old disable-events-in-a-userform trick; otherwise, setting the .Text property would call the Change event again.
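For anyone who hasn’t seen the trick, here is a minimal sketch of it. Application.EnableEvents has no effect on userform controls, so a module-level flag does the job instead. The control name tbxEntry and the LTrim$ call are made up for illustration; they are not from the original project.

```vba
' Code-behind for a userform; tbxEntry is a hypothetical textbox name.
Option Explicit

Private mbEventsDisabled As Boolean

Private Sub tbxEntry_Change()
    ' Ignore changes that this procedure made itself
    If mbEventsDisabled Then Exit Sub

    mbEventsDisabled = True
    ' Without the flag, this assignment would fire tbxEntry_Change again
    Me.tbxEntry.Text = LTrim$(Me.tbxEntry.Text)
    mbEventsDisabled = False
End Sub
```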

I only look at the first five characters. After that, you just have to type what you want. If there’s a match, I set the .Text property to the matching sentence and set the selection so that the user can continue typing. It all worked very nicely except for backspacing. In the above screenshot, I’ve typed He but the textbox contains the whole sentence. If I hit backspace in this situation, I delete only the highlighted portion and I’m left with He, which then autocompletes right back to the full sentence. In effect, backspace does nothing.

I was hoping to find a simple and elegant solution. Instead, I did this.

I’m using a module-level variable to determine if the backspace was pressed while in the textbox. If it was and there’s still at least one character, I simply shorten the sEntered variable by one character. That leaves the whole SelStart and SelLength mechanism working as expected.
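Roughly, the whole mechanism might look like the sketch below. It is a reconstruction from the description above, not the original code: the control names tbxEntry and lbxSentences, the flag names, and the case-insensitive comparison are all assumptions. Only sEntered, SelStart, SelLength, and the five-character limit come from the text.

```vba
' Code-behind for a userform; tbxEntry and lbxSentences are hypothetical control names.
Option Explicit

Private mbEventsDisabled As Boolean
Private mbBackspace As Boolean

Private Sub tbxEntry_KeyDown(ByVal KeyCode As MSForms.ReturnInteger, ByVal Shift As Integer)
    ' Remember whether the key that triggers the next Change event is backspace
    mbBackspace = (KeyCode = vbKeyBack)
End Sub

Private Sub tbxEntry_Change()
    Dim sEntered As String
    Dim sItem As String
    Dim i As Long

    If mbEventsDisabled Then Exit Sub

    sEntered = Me.tbxEntry.Text

    ' Backspace only removed the highlighted completion, so drop one typed character too
    If mbBackspace And Len(sEntered) > 0 Then
        sEntered = Left$(sEntered, Len(sEntered) - 1)
    End If
    mbBackspace = False

    ' Only the first five characters are matched
    If Len(sEntered) = 0 Or Len(sEntered) > 5 Then Exit Sub

    For i = 0 To Me.lbxSentences.ListCount - 1
        sItem = CStr(Me.lbxSentences.List(i))
        If StrComp(Left$(sItem, Len(sEntered)), sEntered, vbTextCompare) = 0 Then
            mbEventsDisabled = True
            Me.tbxEntry.Text = sItem
            ' Highlight the completed part so typing continues after what was entered
            Me.tbxEntry.SelStart = Len(sEntered)
            Me.tbxEntry.SelLength = Len(sItem) - Len(sEntered)
            mbEventsDisabled = False
            Exit For
        End If
    Next i
End Sub
```

One caveat with this sketch: it shortens sEntered on every backspace, even when nothing is highlighted, so a backspace in unmatched text that brings it back to five characters or fewer can trigger a completion on one character less than what remains.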

Todo.txt TDD Part 3

As mentioned at the end of Part 2, after the creation date, the rest of the string is called the Description. It can contain projects that start with a plus sign (+), contexts that start with an at symbol (@), or key/value pairs with a colon (:). We’ll test the projects piece now.

I’m testing zero, one, and two projects. Now let’s update Raw to make this pass.

This loops through the rest of the elements of the split array and looks for a plus sign at the start. If it finds one, it creates a Project instance and adds it to the Projects collection class. The contexts will be handled similarly.
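As a rough illustration of that loop, here is a standalone sketch for a standard module. It is not the Raw property from the workbook: ProjectsIn is a hypothetical function, it returns plain strings instead of Project instances, and it scans every element rather than only those after the creation date.

```vba
Option Explicit

' Hypothetical stand-in for the projects part of Raw.
Public Function ProjectsIn(ByVal sRaw As String) As Collection
    Dim vaSplit As Variant
    Dim colProjects As Collection
    Dim i As Long

    Set colProjects = New Collection
    vaSplit = Split(sRaw, " ")

    For i = LBound(vaSplit) To UBound(vaSplit)
        ' A project is a plus sign followed by at least one character
        If Len(vaSplit(i)) > 1 And Left$(vaSplit(i), 1) = "+" Then
            colProjects.Add Mid$(vaSplit(i), 2)
        End If
    Next i

    Set ProjectsIn = colProjects
End Function

' Zero, one, and two projects
Public Sub TEST_ProjectsIn()
    Debug.Assert ProjectsIn("Call Mom").Count = 0
    Debug.Assert ProjectsIn("Call Mom +Family").Count = 1
    Debug.Assert ProjectsIn("Call Mom +Family +Phone").Item(2) = "Phone"
End Sub
```

Swapping the "+" for "@" gives the same sketch for contexts.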

The final special case inside the description is key/value pairs.

Again I’m testing zero, one, and two instances.
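Key/value pairs could be picked out in much the same way. The sketch below is an assumption about how that might be done, not the code in the workbook: KeyValuesIn is a hypothetical function that treats any element with exactly one colon, not at either end, as a pair and stores each pair as a two-element array.

```vba
Option Explicit

' Hypothetical stand-in for the key/value part of Raw.
Public Function KeyValuesIn(ByVal sRaw As String) As Collection
    Dim vaSplit As Variant
    Dim colPairs As Collection
    Dim sElement As String
    Dim lColon As Long
    Dim i As Long

    Set colPairs = New Collection
    vaSplit = Split(sRaw, " ")

    For i = LBound(vaSplit) To UBound(vaSplit)
        sElement = vaSplit(i)
        lColon = InStr(1, sElement, ":")
        ' Need text on both sides of the colon and no second colon
        If lColon > 1 And lColon < Len(sElement) Then
            If InStr(lColon + 1, sElement, ":") = 0 Then
                colPairs.Add Array(Left$(sElement, lColon - 1), Mid$(sElement, lColon + 1))
            End If
        End If
    Next i

    Set KeyValuesIn = colPairs
End Function

' Zero, one, and two pairs
Public Sub TEST_KeyValuesIn()
    Dim colPairs As Collection

    Debug.Assert KeyValuesIn("Call Mom").Count = 0

    Set colPairs = KeyValuesIn("Call Mom due:2016-05-30 pri:A")
    Debug.Assert colPairs.Count = 2
    Debug.Assert colPairs.Item(1)(0) = "due"
    Debug.Assert colPairs.Item(1)(1) = "2016-05-30"
End Sub
```

Note that a URL such as http://example.com contains a single colon, so this simple rule would count it as a pair too.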

Everything else is the description.

Here are the changes to the bottom of Raw.

And that’s it. A properly parsed Todo.txt string ready to be used in your application. And if I make any changes to my app, I can run these tests to make sure I didn’t break anything.

You can download TodoTxt.zip

Series:

  1. todo-txt-tdd-part-1/
  2. todo-txt-tdd-part-2/
  3. todo-txt-tdd-part-3/

Todo.txt TDD Part 2

In Part 1, I started writing tests and then writing code to make them pass. Let’s continue with more tests.

The next test will be for an incomplete todo with no priority and a completion date.

Hey, it already passes. Let’s add some tests for when there’s a priority and a completion date.

I expected these would already pass as a result of my refactoring, and they did. The next part of the spec says “Optional Creation Date, must be specified if completion date is”. First, I just want to test that it exists. That is, if there’s a completion date, there must be a creation date.

This fails because I haven’t parsed the creation date yet. So let’s do that.
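Here is one way the creation date could be reached, sketched as a standalone function for a standard module. CreationDate, ElementAt, and IsoToDate are hypothetical names, and the order of elements (x, priority, completion date, creation date) is assumed from the tests described in this series. The real code lives in the Raw property of CTodo.

```vba
Option Explicit

' Returns the element at lIndex, or an empty string when lIndex is out of range
Private Function ElementAt(ByVal vaSplit As Variant, ByVal lIndex As Long) As String
    If lIndex >= LBound(vaSplit) And lIndex <= UBound(vaSplit) Then
        ElementAt = vaSplit(lIndex)
    End If
End Function

' Converts yyyy-mm-dd text to a Date without relying on regional settings
Private Function IsoToDate(ByVal sIso As String) As Date
    IsoToDate = DateSerial(CInt(Left$(sIso, 4)), CInt(Mid$(sIso, 6, 2)), CInt(Right$(sIso, 2)))
End Function

Public Function CreationDate(ByVal sRaw As String) As Date
    Dim vaSplit As Variant
    Dim lNext As Long
    Dim bComplete As Boolean

    vaSplit = Split(sRaw, " ")
    lNext = LBound(vaSplit)

    ' Skip the completion marker
    If ElementAt(vaSplit, lNext) = "x" Then
        bComplete = True
        lNext = lNext + 1
    End If

    ' Skip the priority
    If ElementAt(vaSplit, lNext) Like "([A-Z])" Then lNext = lNext + 1

    ' For a completed todo, the first date is the completion date
    If bComplete And ElementAt(vaSplit, lNext) Like "####-##-##" Then lNext = lNext + 1

    If ElementAt(vaSplit, lNext) Like "####-##-##" Then
        CreationDate = IsoToDate(ElementAt(vaSplit, lNext))
    End If
End Function

Public Sub TEST_CreationDate()
    Debug.Assert CreationDate("x (A) 2016-05-20 2016-04-30 Call Mom") = DateSerial(2016, 4, 30)
    Debug.Assert CreationDate("2016-04-30 Call Mom") = DateSerial(2016, 4, 30)
    Debug.Assert CreationDate("Call Mom") = 0
End Sub
```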

My test passes, but I broke a previous one. In my prior completion date testing, I didn’t include a creation date because I wasn’t that far in the spec yet. I need to rewrite those tests.

I added a creation date to the Raw for each of those tests, and now all tests pass. Now I can move on to testing what the creation date actually is.

This test already passes. Once I get past the creation date, the rest of the string is called the Description. It can contain projects that start with a plus sign (+), contexts that start with an at symbol (@), or key/value pairs with a colon (:). We’ll test those in the next part.

You can download TodoTxt.zip

Series:

  1. todo-txt-tdd-part-1/
  2. todo-txt-tdd-part-2/
  3. todo-txt-tdd-part-3/

Todo.txt TDD Part 1

Earlier, I wrote a post inviting you to try your hand at test-first development. This post is the first in a series of how I did it. In the previous post, I had all the tests written, but here I’m starting from scratch and writing the tests as I go. Well, I’m not starting from scratch in that the classes are already set up. If you want to see what the classes look like, download the workbook from the previous post or the one at the bottom of this post.

First, create the property in CTodo that will parse the string. There’s nothing in it, but we’ll get to that shortly.

Write a test. This test will determine if the todo item is complete. Per the spec, the first thing in the string is an “x” if it’s complete.

Now write the simplest code to make the test pass. I probably could have written simpler code than this, but don’t get too hung up on that. Just write simple code and don’t try to solve the next test – only this test.

When I split the string on a space, the Complete property is set to whether the first element is “x”. The test runs successfully. Next, write a test for incomplete todos.

Oh goodness, that test already runs successfully. There’s no “x”, so Complete is set to False. Next, write a test for a completed todo with a priority. Per the spec, the first element after the optional “x” is a capital letter in parentheses.

This test fails on Debug.Assert clsTodo.Priority = "A", so it’s time to write the simplest code to make it pass.

The Priority property is set to the second character of the second element. The test passes. Did we break anything? Let’s see.

Nope, everything passes so far. Time for the next test. Check the priority for an incomplete todo.

It fails, so let’s write some code.

If my first element is an “x”, get the second element, otherwise get the first element. Pretty simple and the test passes. Every test I write, I add to the TEST_All() procedure to make sure I don’t break any prior tests. The next part of the spec is an optional completion date. Let’s start with a completed todo with no priority and a completion date.

My new test passes, but I get an error in one of my old ones. Plus this code is getting pretty ugly. When your code is ugly or repetitive, it’s time to refactor. Instead of a bunch of nested If’s, I’ll just move a pointer down the line.

I use lNext to keep track of where I am in the array. If the first element is an “x”, I advance the pointer. Then I check vaSplit(lNext) rather than a specific element number. All my tests pass.
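The pointer idea can be sketched outside the class like this. It is an approximation for a standard module, not the Raw property itself: the TTodoHeader type and ParseHeader function are hypothetical, and the sketch assumes a completion date is only read when the todo is complete. Only vaSplit and lNext are names from the post.

```vba
Option Explicit

' Hypothetical stand-in for the CTodo properties set by Raw
Public Type TTodoHeader
    Complete As Boolean
    Priority As String
    CompletionDate As Date
End Type

Public Function ParseHeader(ByVal sRaw As String) As TTodoHeader
    Dim vaSplit As Variant
    Dim lNext As Long
    Dim tResult As TTodoHeader

    vaSplit = Split(sRaw, " ")
    lNext = LBound(vaSplit)   ' pointer to the element being examined

    ' Optional "x": advance the pointer only if it is there
    If lNext <= UBound(vaSplit) Then
        If vaSplit(lNext) = "x" Then
            tResult.Complete = True
            lNext = lNext + 1
        End If
    End If

    ' Optional priority: a capital letter in parentheses
    If lNext <= UBound(vaSplit) Then
        If vaSplit(lNext) Like "([A-Z])" Then
            tResult.Priority = Mid$(vaSplit(lNext), 2, 1)
            lNext = lNext + 1
        End If
    End If

    ' Optional completion date in yyyy-mm-dd form
    If tResult.Complete And lNext <= UBound(vaSplit) Then
        If vaSplit(lNext) Like "####-##-##" Then
            tResult.CompletionDate = DateSerial(CInt(Left$(vaSplit(lNext), 4)), _
                CInt(Mid$(vaSplit(lNext), 6, 2)), CInt(Right$(vaSplit(lNext), 2)))
            lNext = lNext + 1
        End If
    End If

    ParseHeader = tResult
End Function

Public Sub TEST_ParseHeader()
    Dim tTodo As TTodoHeader

    tTodo = ParseHeader("x (A) 2016-05-20 Call Mom")
    Debug.Assert tTodo.Complete
    Debug.Assert tTodo.Priority = "A"
    Debug.Assert tTodo.CompletionDate = DateSerial(2016, 5, 20)

    tTodo = ParseHeader("(B) Call Mom")
    Debug.Assert Not tTodo.Complete
    Debug.Assert tTodo.Priority = "B"
End Sub
```

Because every check looks at vaSplit(lNext) instead of a fixed position, adding another optional element is one more block rather than another level of nesting.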

In the next installment, I keep writing tests, writing code, and refactoring.

The below workbook has all the tests and the completed Raw property. It also has a userform, but it’s not complete.

You can download TodoTxt.zip

Series:

  1. todo-txt-tdd-part-1/
  2. todo-txt-tdd-part-2/
  3. todo-txt-tdd-part-3/

Test First Todo.txt

In my ongoing struggle to find a todo list app I like, I took a look at Todo.txt. I ended up going back to GoodTodo, but I was intrigued by the text based system. I wanted to build something in Excel to be an interface to Todo.txt and I used a test-first methodology to parse the file. You may remember my first foray into test-first development when I converted numbers into words.

Later this week, I’ll post how I wrote the parser using tests. If you’ve ever wanted to try to write code using test-first, here’s your chance. Download the workbook below. I’ve set up all the classes and the tests. I even wrote enough code to make the first test pass. If you’re interested in this exercise, follow these steps:

  1. Download, unzip, and open the workbook
  2. Go to the MTest module and run the TEST_All sub and see that it passes
  3. Uncomment each line in TEST_All one at a time
  4. Run TEST_All to see that it doesn’t pass. If it passes, uncomment the next test
  5. If it doesn’t pass, go to the Raw property in the CTodo module and write just enough code to get the test to pass
  6. When your code looks cumbersome or you see a pattern emerge, refactor Raw
  7. Repeat until all the lines in TEST_All are uncommented
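If you haven’t opened the workbook yet, the shape of the exercise is something like the sketch below. Only TEST_All is a name from the workbook. The individual test names and the IsComplete function are made up so that the sketch compiles on its own; the real tests exercise the Raw property of CTodo.

```vba
Option Explicit

' Hypothetical function under test
Private Function IsComplete(ByVal sRaw As String) As Boolean
    IsComplete = (Left$(sRaw, 2) = "x ")
End Function

Public Sub TEST_All()
    TEST_Complete
    'TEST_Incomplete     ' uncomment the next test, run, then write code until it passes
    Debug.Print "All uncommented tests passed"
End Sub

Private Sub TEST_Complete()
    Debug.Assert IsComplete("x Call Mom")
End Sub

Private Sub TEST_Incomplete()
    Debug.Assert Not IsComplete("Call Mom")
End Sub
```

A failing Debug.Assert stops the code on that line in the VBA editor, which is how you know which test to work on.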

Check back later this week to see what I came up with.

You can download TodoTxtTestFirst.zip

Ribbon customUI Text Editor

Over at my site, yoursumbuddy, I’ve published an Excel addin for creating, editing and validating the XML that makes up Excel Ribbons. I developed it because I can’t install executables at work and want an addin that I can just run from Excel. It works on ribbons in workbooks and addins and runs in Excel 2010 and later.

Along the way I learned about XPATH, SAX, and DOM as they relate to XML and VBA. I’ve already forgotten what those things are and how they work, but they were important for a tool that downloads customUI/customUI14 XML, validates multiple errors, creates and/or modifies the .rel entries, and then uploads it all back into the zip file that is an Excel workbook. Whew. Should you care to learn more, my unpassword-protected code is available for your reading pleasure. If you poke around you’ll see that my code is built on the work of Jan Karel Pieterse, Ken Puls, Ron de Bruin and even keepitcool. Thanks to all of them!

Here are a couple of pictures of the addin form:

form with tips

form with highlighted error

To learn more and download the addin, just visit the yoursumbuddy Ribbon customUI XML Editor page.