Renumbering Arrays in Code

I’ve got this bit of code where I’m listing table fields that I’m going to eventually Join into a SELECT statement.

As you can see, I needed to add a new field in position 1. Now I’m faced with renumbering the rest of the array. Terrible. So I wrote this:
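Something along these lines will do the job (a sketch only: it assumes the lines to renumber are sitting on the clipboard, and it needs a reference to the Microsoft Forms 2.0 Object Library for the DataObject):

    Sub RenumberLines()
        'Sketch: read the copied lines from the clipboard, renumber the index
        'inside the parentheses, and print the result to the Immediate Window,
        'ready to paste back over the original code.
        Const lFirst As Long = 2    'the number the first existing line should get - adjust to suit

        Dim doClip As MSForms.DataObject
        Dim vaLines As Variant
        Dim vaLine As Variant
        Dim vaLineStart As Variant
        Dim i As Long

        Set doClip = New MSForms.DataObject
        doClip.GetFromClipboard
        vaLines = Split(doClip.GetText, vbNewLine)

        For i = LBound(vaLines) To UBound(vaLines)
            If Len(Trim$(vaLines(i))) > 0 Then
                vaLine = Split(vaLines(i), ")")         'e.g. "fields(17" and " = ""BOLState"""
                vaLineStart = Split(vaLine(0), "(")     'e.g. "fields" and "17"
                Debug.Print vaLineStart(0) & "(" & lFirst + i & ")" & vaLine(1)
            End If
        Next i
    End Sub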

Now I can copy the code, run this procedure, and paste the results.

Ahhh. Satisfying. Here’s how the stuff inside the loop works.

The first Split, on the closing parenthesis, breaks the line into:

vaLine
(0)  fields(17
(1)   = "BOLState"

Splitting vaLine(0) again, this time on the opening parenthesis, results in:

vaLineStart
(0)  fields
(1)  17

Then I just concatenate the relevant parts back together with a different number.

Counting Files by Date

Someone told me we are posting more frequently lately. (For non-accountants, posting means taking the entered transactions and updating other files with the information.) Ever the skeptic, I decide to see for myself. Whenever we post, we produce a pre-post report in the form of

Pre-Post_Sales_Journal_yyyymmddhhmmss.TXT

PATH is a module level constant pointing to the folder.
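The listing itself isn't reproduced here, but a minimal sketch of the kind of loop described below might look like this (the worksheet layout and the PATH value are placeholders):

    'PATH below is a placeholder; point it at wherever the pre-post reports live
    Const PATH As String = "C:\PrePostReports\"

    Sub CountPostings()
        'Sketch: loop the report files, pull the timestamp out of each
        'filename, and bump the count next to that date on Sheet1
        Dim sFile As String
        Dim sStamp As String
        Dim dtPost As Date
        Dim vMatch As Variant

        sFile = Dir(PATH & "Pre-Post_Sales_Journal_*.TXT")

        Do While Len(sFile) > 0
            'yyyymmddhhmmss is the fourth underscore-delimited piece of the base name
            sStamp = Split(Split(sFile, ".")(0), "_")(3)
            dtPost = DateSerial(Left$(sStamp, 4), Mid$(sStamp, 5, 2), Mid$(sStamp, 7, 2))

            'find the date in column A and add 1 to the cell to its right
            vMatch = Application.Match(CLng(dtPost), Sheet1.Columns(1), 0)
            If Not IsError(vMatch) Then
                Sheet1.Cells(vMatch, 2).Value = Sheet1.Cells(vMatch, 2).Value + 1
            End If

            sFile = Dir
        Loop
    End Sub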

If this was more than a one-off program, I would have written this line in a way that you could read it. The inner Split creates an array like

[0] = Pre-Post_Sales_Journal_yyyymmddhhmmss, [1] = TXT

and I take the first element (the zeroth index) of the array. Then I split that further

[0] = Pre-Post, [1] = Sales, [2] = Journal, [3] = yyyymmddhhmmss

and I take the fourth element (index = 3) of that array. That’s my date in string form.
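For example, with a made-up timestamp, straight from the Immediate Window:

    ?Split(Split("Pre-Post_Sales_Journal_20230415093000.TXT", ".")(0), "_")(3)
    20230415093000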

I put a bunch of dates in column A of sheet1 for as far back as I wanted to go. Then I add 1 to the cell to the right of the date. It turns out we are posting more frequently.

Inconsistent ListRow Copy

Here’s a cautionary tale for you. Let’s say you want to let users copy rows from a Source table row by row to a Dest table, by pushing a button:

…so you whip up a bit of code like this, and assign it to that button:
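Something like this sketch, say, where the ListObject names Source and Dest are assumptions:

    Sub CopyRowByRow()
        Dim loSource As ListObject
        Dim loDest As ListObject
        Dim lr As ListRow

        Set loSource = Range("Source").ListObject
        Set loDest = Range("Dest").ListObject

        For Each lr In loSource.ListRows
            'copy each source row to a freshly added row in Dest, via the clipboard
            lr.Range.Copy Destination:=loDest.ListRows.Add.Range
        Next lr
    End Sub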

And when you run it, it works just fine:

…until, that is, someone hides an entire column in the Source table, sets a filter that similarly hides some rows, and leaves a cell in that table selected before running the code again. In which case you get this:

As you can see from the above:

  • For any ListRows in your source that happen to be visible, only the cells from the visible columns get copied, to a contiguous block in the Dest table, but
  • For hidden ListRows, all cells get copied

Add to this the fact that everything works just fine if the user happened to select a cell outside the table before triggering the code:

…and you’ve got the makings of a hard-to-diagnose bug that will eat up hours of your time trying to replicate.

The fix? Don’t use the .copy method. Just set the values of the second range directly to the values of the first:

…which works fine, and is faster anyhow:

In case you’re wondering what happens if you bring the whole DataBodyRange through from the Source Table using that dangerous .Copy method i.e using code like this:

…then again the results depend on whether a cell is selected in the Source table:

…or not:

Again, avoid the inconsistency by setting the values of the second range directly to the values of the first:
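A sketch of the whole-body equivalent (it assumes both tables have the same number of columns):

    Sub CopyWholeBodyValues()
        Dim loSource As ListObject, loDest As ListObject
        Set loSource = Range("Source").ListObject
        Set loDest = Range("Dest").ListObject

        With loSource.DataBodyRange
            'size Dest to match Source, then write the values straight across
            loDest.Resize loDest.HeaderRowRange.Resize(.Rows.Count + 1)
            loDest.DataBodyRange.Value = .Value
        End With
    End Sub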

…which works fine, fast:

Here’s a sample file: LRCopy Test

Always Use Stored Procedures

I take data that has been entered in Excel and I store it in SQL Server. A lot. I do that a lot. The proper way to do that is to create a stored procedure for every database operation you need and to execute that stored procedure from VBA. The quick and dirty way is to build a SQL string and execute it. As you might have guessed from the title, I chose the quick and dirty way and was recently bit in the ass.

Here’s the long and the short of it: Some numbers got formatted as dates and it really screwed stuff up. I had some code that looked similar to
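(A sketch of that pattern: cn is an open ADODB.Connection, vaData is a variant array read straight from a worksheet range, and the loop counter i comes from code not shown; only the Blend table and ManifestID field names are from the real thing.)

    sSql = "INSERT INTO Blend (ManifestID) VALUES (" & vaData(i, 1) & ")"
    cn.Execute sSql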

The field ManifestID is a BIGINT and vaData(i,1) contained 4/15/2023. The ManifestID was 45031; someone (me) mistook that for a date that had lost its formatting and promptly fixed (broke) the formatting. I noticed that several dozen entries in Blend had a ManifestID of zero. SQL Server dutifully took 4/15/2023, did the division (4 divided by 15 divided by 2,023), came up with zero, and put zero in the field.

After some self-flagellation, I wondered if a stored procedure would have caught this error. I assumed that when I tried to pass a date into a BIGINT parameter, the code would error out and I would have avoided this whole mess. But I was wrong. Instead, the stored procedure converted the date to its integer value – not by dividing like in the SQL String method, but by some conversion that I didn’t think was possible. Excel stores dates as the number of days since 12/31/1899. That’s not unique, but I’m pretty sure SQL server doesn’t store them that way. And how would ADO or T-SQL know to convert it in that way?

I devised a test. First create a table
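(The table and column names here are my own placeholders, not the originals.)

    CREATE TABLE dbo.BigIntTest (
        ID INT IDENTITY(1,1) PRIMARY KEY,
        ManifestID BIGINT NULL
    );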

Next, create a stored procedure to insert records
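(Again a sketch, matching the placeholder table above.)

    CREATE PROCEDURE dbo.InsertBigIntTest
        @ManifestID BIGINT
    AS
    BEGIN
        INSERT INTO dbo.BigIntTest (ManifestID) VALUES (@ManifestID);
    END;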

Then I wrote some code to insert rows
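A sketch of that code: it assumes a reference to Microsoft ActiveX Data Objects, uses the placeholder table and procedure names from the two snippets above, and the connection string is obviously made up.

    Sub TestBigIntInsert()
        Dim cn As ADODB.Connection
        Dim cmd As ADODB.Command
        Dim vaFormats As Variant
        Dim vaData(1 To 1, 1 To 1) As Variant
        Dim i As Long

        Set cn = New ADODB.Connection
        cn.Open "Provider=SQLOLEDB;Data Source=MyServer;Initial Catalog=MyDb;Integrated Security=SSPI;"

        vaFormats = Array("General", "m/d/yyyy")

        For i = LBound(vaFormats) To UBound(vaFormats)
            Sheet1.Range("G1").NumberFormat = vaFormats(i)
            'put G1 into a variant array to simulate the real-world situation
            vaData(1, 1) = Sheet1.Range("G1").Value

            'quick and dirty: build a string and execute it right off the connection
            cn.Execute "INSERT INTO dbo.BigIntTest (ManifestID) VALUES (" & vaData(1, 1) & ")"

            'the right way: a stored procedure with a typed parameter
            Set cmd = New ADODB.Command
            With cmd
                .ActiveConnection = cn
                .CommandType = adCmdStoredProc
                .CommandText = "dbo.InsertBigIntTest"
                .Parameters.Append .CreateParameter("@ManifestID", adBigInt, adParamInput, , vaData(1, 1))
                .Execute
            End With
        Next i

        cn.Close
    End Sub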

In the code, I define two formats in an array: General and m/d/yyyy. I loop through that array and apply the formats to cell G1, where I have an unsuspecting integer. In the first pass, it’s formatted as General and looks like a proper integer. I build up an INSERT INTO SQL string and execute it right off the connection. Then, still inside the loop, I do it the right way: create a command object, add a parameter, and execute it.

In the second iteration of the loop, cell G1 gets formatted as a date and it all happens again.

I was expecting an error, so I had an error handler that printed out the whole table whenever things bombed. But it never bombed. It executed just fine.

With the integer formatted as a number, both the string method and the stored procedure method inserted properly. That’s the first two 45000’s. The third 45000 is the string method when the integer is formatted as a date. That’s the one where SQL does division. The last 45000 is the one I thought would error out. But passing in a date to a BIGINT parameter converted it to the proper number. I even put G1 into a variant array to simulate my real world situation.

I still don’t know, and am interested to know, what is doing the conversion. But in the meantime I’m happy to learn my lesson and vow to use stored procedures like a good boy.

Junk Chart? Cover Art!

Yes, I have all of Edward Tufte’s beautiful books, as well as quite a few from Stephen Few and others. Yes, I know that you shouldn’t embellish charts with images and unnecessary bits for the sake of it. But even so, I just love this cover from The Economist that just arrived in the mail:

It’s not a junk chart when it’s cover art.

If only someone in the PowerBI community would create a custom visual for this, and upload it to the Office store. Actually, let’s go one bigger: If only Microsoft open-sourced the Excel visualization engine like they have done with PowerBI visuals.

Don’t know what I’m talking about? Microsoft enables external developers to create their own custom visuals for Power BI and, even better, to share those visuals with the wider PowerBI community. Heck, they’ve even published the code for all their own Microsoft visualizations to GitHub, so that developers can study them, learn from them, and improve on them. Meaning developers can create new charts that Microsoft haven’t got around to making yet, haven’t thought of yet, have stuffed up, or wouldn’t bother with.

Imagine how cool life would be if we could do this too.

Cue dream sequence

Say you’ve got some internal migration data sourced from a couple of Population Census tables. And say you want to see what net inflow and outflows look like in your home town. You go look on the ribbon for some suitable geographical chart to display the data, but find the cupboard is pretty bare:

So you head over to the Office store, and see something that looks promising, put together by some generous non-Microsoft developer with far too much time on his or her hands:

You download and import the visualization template into your Excel project, see its icon appear in the ribbon, click it, connect the resulting chart up to your own data, and reap the insights:

Man, that kind of custom functionality would be enough to make Jon Peltier do this:

Like my dream? Vote for it at UserVoice. Your vote matters.

Look Ma…No PowerPivot!

So back in 2015 when I was trying to write a book, I took time out to rant that despite every version of Excel since 2013 having the Data Model baked in as standard, it wasn’t a heck of a lot of use to you in the event that:

  1. You wanted to mash together say a UnitPrice from one Table with a SalesQuantity from another, in order to display the derived Total Revenue in a PivotTable; and
  2. You were a tightwad that didn’t happen to have one of the premium PowerPivot SKUs installed.

Because while you could create a relationship between two Tables without PowerPivot:

…the option to create a calculated field to multiply Qty against Price was greyed out.

Disheartened with the lack of functionality in the non-Premium-Excel space (and sternly talked-to by my wife about the need to shave and go out earn a crust) I flushed the nearly-finished manuscript down the toilet in disgust. But now that I’m flush again, I’ve dusted off that manuscript (it wasn’t really down the toilet) and am updating it, so that I can nearly-finish it all over again.

Anyways, I was subsequently looking at this DataModel thing again, and rewriting the bit that said how sucky it is that you need to pay a premium in order to do a pretty standard thing, when I came across this:

Wow…It IS possible. Tight-wad analysts of the world: You too can do PowerPivoty stuff, without paying a cent more. You heard it here first. Or second, rather.
Thanks, Mike Girvin, from the bottom of my wallet. :-)

Look Ma…no PowerPivot:

I wonder what limitations this PowerPivot back door has, if any?

—UPDATE—
I found this from Chris Webb’s blog:
PowerPivot has been integrated into Excel, kind of. This means that the xVelocity (aka Vertipaq) engine is now native to Excel, and you can do all the basic PowerPivot stuff like loading vast amounts of data from multiple data sources and querying it via PivotTables directly in Excel, without installing any extra addins. PowerPivot does still exist as an optional extra however: you need it if you want to use the more advanced functionality that exists in PowerPivot today, such as filtering data before import, using diagram view, defining hierarchies and perspectives and so on.

And further down the page there’s some great commentary between Chris and Colin Banfield, including this from Colin:

  • xVelocity is divorced from PowerPivot, and built into Excel
  • PowerPivot is now simply a UI to administer the data model.

I’d always thought that while any edition of Office 2013/2016 can interact with a PowerPivot Excel Workbook (open it, click on slicers, change the pivot table etc.), if you wanted to “author” the Model you would need the correct version of Office, which includes the Power Pivot authoring functionality, i.e. the PowerPivot window, which lets you define relationships/measures etc. But to some degree it appears not. Which has got me wondering…to what extent is PowerPivot just a UI for the Data Model baked in to every version of Excel? Sure, it can help you filter data before import, but so can PowerQuery, which is free in every version of Excel 2013 or later. Meaning there’s probably nothing stopping a savvy Excel VBA developer from simply rolling their own UI to help users of any post-2010 version of Excel do 99% of what they’d likely otherwise need PowerPivot for.
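As a small illustration of just how much of the Data Model is already scriptable without the add-in, here’s a sketch that builds a relationship straight from VBA via the Model object, which has been in the Excel object model since 2013 (the table and column names are made up; later versions also expose measures through a ModelMeasures collection):

    Sub AddRelationshipWithoutPowerPivot()
        'Sketch only: assumes tables named Sales and Products have already been
        'added to this workbook's Data Model, each with a SKU column to join on
        Dim mdl As Model
        Set mdl = ThisWorkbook.Model

        'first argument is the foreign key column, second is the primary key column
        mdl.ModelRelationships.Add _
            mdl.ModelTables("Sales").ModelTableColumns("SKU"), _
            mdl.ModelTables("Products").ModelTableColumns("SKU")
    End Sub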

To PowerBI Premium, or not to PowerBI Premium…

…that is the question.

Heck, never mind PowerBI…I’ve only recently started using PowerPivot and PowerQuery. (Not because I’m a slow adopter mind, but rather because the places I’ve been working have been slow adopters.) And yet here I am considering waging war on a new front entirely. Will PowerBI let me make considerable further gains without imposing considerable future cost to my organization’s long-term budget? Or can a large part of what PowerBI offers be achieved at much lower cost using my modern version of Excel, in conjunction with models and dashboards being built by my colleagues using the SQL Server stack – technology we already own?

On that first point, I found Matt Allington’s post Which To Use – PowerBI Or Excel? immensely helpful. On that second point, being a purely Excel guy, I’ve never learned much about SQL Server other than how to write T-SQL queries to get the data I need into a PivotTable. I’ve often heard other MS SQL Server terms such as SSRS, SSAS, and SSIS bandied around by IT types, but they have never really meant anything much to me, and so have always gone over my head. I’m just an Excel guy, after all. So here I come along talking about this thing called PowerBI that can be used to easily share data and insights, and my colleague doing stuff in the SQL stack says “Hmm…sounds awfully like this SSRS/SSAS/SSIS thing that I’m using. Compared to SSRS/SSAS/SSIS, what exactly *is* PowerBI?”

Not being an expert in either product, I couldn’t answer. So I did the next best thing: got the boss to pay for getting it certified on our IT system, and then got it installed on all my unit’s PCs, so that my colleague could find out, and then explain it to me. It took her just 5 minutes to understand what PowerBI is…or in her words:

OMG. I know what this PowerBI thing is. It’s easy.

So there’s your answer: PowerBI is much easier and quicker than developing in the SQL Server stack. But of course, while you can easily develop reports using the completely free client tool (i.e. PowerBI Desktop), you’ve still got to share those reports with end users. So what does that cost? $9.99 per user per month, regardless of whether that user is a Rob Collie/Matt Allington type, or the cleaner (and I’m not talking about Harvey Keitel here). Yup, everybody has to have a Pro licence in order to consume shared reports, regardless of whether they are star developers, or occasional peekers.

That $9.99 per user per month is incredible value if you’re providing dashboards to people like us: folk who put the ‘intelligence’ in ‘Business Intelligence’. But it’s pretty expensive across a whole bunch of non-self-service BI types, who only occasionally want to look at things that are already filtered down to what is relevant to them, and who perhaps glance at only a handful of reports, a handful of times per year. So it’s hard to justify a costly, organisation-wide roll-out of a promising new technology for them. Currently, I can produce and distribute on-demand static reports for these folk for free, using Excel, VBA, and Outlook. When you don’t intend to click on filters, a PDF will work just fine.

This left me puzzled as to how to justify a much easier development tool to the bean counters. Quite coincidentally, yesterday MS announced the release of their PowerBI Premium plan. I thought this might help the business case for our organisation, but unfortunately it offers even worse value for mid-sized organisations than the $9.99-per-month flat rate that I’ll already struggle to get across the line with. Under the premium plan, you still pay the $9.99-per-month flat rate for each developer, as this gives them the Pro version of PowerBI that they need to create shareable reports. But in addition you also effectively buy a lump of capacity on a reporting server (either on premise or in the cloud) big enough to handle the load from servicing the demands of the folk actually consuming these reports. But those lumps are pretty big: they come in $60,000-per-year clumps. Yikes!

You can take a look at the pricing via Microsoft’s online calculator here. Here’s a cut-down view of how it looks:

Let’s look at the default view shown in the calculator above: an organisation of 5000 users, 20% of which are Pro, 35% ‘Frequent’, and the remaining 45% ‘Occasional’. MS estimate the number of nodes you’d consume given this mix is 3, at $4,995/Node/Month. When you do the math, 45% of that $25k total is due to the Pro user licences, and the remaining 55% is effectively the cost of sharing the reports with the wider organisation.

The bottom line here is that:

  • Premium is an add-on to PowerBI Pro. You still need Pro licences for your developers.
  • Setting aside performance benefits, Premium is effectively the cost of sharing your reports with users without Pro licences.
  • The break-even point between the old plan and this enterprise plan under Microsoft’s modelling is 625 users: roughly the point at which Pro licences for the 80% of users who aren’t developers would cost as much as the first $4,995/month node (0.8 × 625 × $9.99 ≈ $4,995).
  • You are not licensed per user, but for capacity. So as guavaq says over at the powerbi community site, if your users and data sets do not require a lot of compute power, you could potentially have a large userbase on the base level hardware (BUT IT ALL DEPENDS on your workloads and users). In other words, that break-even point above is only a very rough guestimate.

I agree with Matt Allington’s excellent post-mortem of this pricing announcement that while this might offer better value for very large customers, it’s completely unaffordable for medium-sized organisations. The middle guy is left out in the cold.

So how do MS work out the number of nodes you’re likely to need, given your Total User count and the split between user types? I did a bit of reverse engineering to determine that Microsoft’s calculator is based on:

  • 1 node for every 1000 pro users
  • 1 node for every 1430 Frequent users
  • 1 node for every 3030 Occasional users

Note the disclaimer from MS below the calculator:  This calculator provides a rough and conservative estimate based on simple usage logic. Actual needs to support a given workload could significantly vary based on data models, data volumes, number of queries and their complexities, refresh rates, usage distribution and pattern changes over time, and other factors. This estimate should be regarded only as guidance and is not providing any guarantee of performance.

With these node calculations in hand, I rolled my own calculator, using Excel. Try it out in the web app below. (Orange cells are input cells).
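If you just want the arithmetic without the web app, here’s a sketch of it in VBA, using the reverse-engineered per-node thresholds above and the list prices mentioned earlier (all of which are estimates, not official figures):

    Public Function EstimatedNodes(lPro As Long, lFrequent As Long, lOccasional As Long) As Long
        'sum the fractional node demand of each user type, then round up
        EstimatedNodes = Application.WorksheetFunction.RoundUp( _
            lPro / 1000 + lFrequent / 1430 + lOccasional / 3030, 0)
    End Function

    Public Function PremiumMonthlyCost(lUsers As Long, dPctPro As Double, dPctFrequent As Double) As Double
        Dim lPro As Long, lFrequent As Long, lOccasional As Long
        lPro = lUsers * dPctPro
        lFrequent = lUsers * dPctFrequent
        lOccasional = lUsers - lPro - lFrequent
        'Pro licences at $9.99/user/month plus capacity at $4,995/node/month
        PremiumMonthlyCost = lPro * 9.99 + EstimatedNodes(lPro, lFrequent, lOccasional) * 4995
    End Function

    'e.g. ?PremiumMonthlyCost(5000, 0.2, 0.35) returns 24975, in line with the
    'roughly $25k/month default scenario in Microsoft's calculator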


So will we proceed down the PowerBI path? Maybe. But I’m tempted to stick with my current ‘old-school’ approach of building Dashboards in Excel, and then using VBA to automatically filter them, PDF them, and email them to the cleaners. Rather than being taken to the cleaners, courtesy of Microsoft’s licensing.

You?