The Promise Of PowerQuery Custom Data Types

The Promise Of PowerQuery Custom Data Types

PowerQuery custom data types offer promise of bold and powerful new features in the future; however, their current implementation is currently limited to simple, non-hierarchical data.

Philip Trick

Microsoft has brought some excellent features to Excel within the last few years. Among these, is rich data types such as the Stock and Geography data types and the variety of Wolfram Alpha data types.

Connect to your own data with more new data types in Excel - Microsoft 365 Blog
The evolution of Excel Excel is the ultimate decision-making tool. We’re amazed every day by the ways in which you, our customers, use Excel to make better decisions, leveraging the flexibility of the 2D grid and formulas to capture, analyze and collaborate on data. Up to this point, Excel has only …

Along with these rich data types, Microsoft has added the ability to create custom data types using PowerQuery. Custom data types ofer a promise of vastly improved data organization, accessibility, and usability for Excel users of all skill levels.

Re: Announcing Power Query Data Types
Custom Data Types cannot have hierarchical records without flattening them out first and re-creating a new custom object. Another missing feature is a tie-in with dynamic arrays. A Custom Data Type with a List object within one of the properties would be an exceptional opportunity. Any chance th…

Microsoft promises rich data types that go beyond simple numbers and strings. Data types containing hierarchical data, references to other data types, images, and even intelligent actions.

My question is simple. Are custom data types "rich"?

The custom data types blog post does not discuss how well these objects handle data beyond simple primitives, so I constructed a basic data set to perform my own tests.

A terrible zoo's collection of animals.

Each animal contains the number of animals, a description of the animal, their exhibit location, and then a list of their names, how popular each animal is, and a brief biography. I want to create the following simple report:

Normally, this report would be constructed with VLOOKUP or INDEX/MATCH for the Description, Count, and Location. The harder part is the table of names, attitudes, and popularity metrics. Currently, the easiest way to do that is via a PowerQuery table; however, that requires a manual refresh after each change or a VBA OnChange event macro.

Let's start with the easy part. The selection of animal.

In short, your VLOOKUP results can be tied directly to your drop-down selections instead of searched for based upon the user's selection.
A data validation list linked to a list of Custom Data Types will bring the entire data type upon selection.

My data validation list for the animal selection is tied directly to the table shown above, where I defined the custom data types. As a result, when I pick an animal - like Monkey - I get the entire rich data type instead of simply the word "Monkey". This means I can reference to cell D3's sub-property "Count" in order to populate the count in cell D6.

The lists associated with each animal are the harder section to populate. Normally, I would do this with a PowerQuery that uses the drop-down selection in cell D3 to apply a merge with a table output, but with dynamic arrays and the lists already linked to the rich data type, this should be easy right?

Turns out, list objects do not populate into dynamic arrays.

Turns out that the easy way is a no-go. The list object is flattened into a single text value indicating that the property contains a "List". Super useful. Hopefully Microsoft plans to allow these lists to populate into dynamic arrays in the future, but this isn't the end of the world. Let's take another approach using more custom data types.

The idea is to add a custom data object that contains properties 1, 2, 3 ... n, each of which containing the item that would have been in a list. Slightly less dynamic, but it could be workable.
I converted my lists into indexed custom data types.

Creating these records took a little bit of "M" prowess and effort whereas those list items were created using simple GUI tools, so definitely an extra level of complexity. Even more so when considering that the M Language Reference doesn't contain much information about how to create a record and assign the DisplayNameColumn and TypeName properties to it outside of the CombineColumnsToRecord function. Anyhow ... my new data object will not contain the "Popularities", "Names", or "Abouts" lists and will instead contain a single "Animals" rich data type. Or, at least, I hope it will.

A "Custom Data Type" is simply a PowerQuery Record with a pair of extra properties, Upon adding another Custom Data Type as a property within a Custom Data Type, the underlying details are flattened and lost into a single text value very much like the List type I tested before.
The Custom Data Type does not get added to our Custom Data Type, confusing enough?

Sadly, and much to my chagrin, the custom data type "Animals" was not accessible as a sub-type within our primary data type. In fact, the underlying custom data type's display value did not even carry over, with only a text value of "[Record]" being returned. The implication is that our custom data types are not far from being rich data types, being limited to simple text and numerical values with no ability to manage any type of hierarchical or advanced data. While Microsoft has implemented rich data types for specific types of objects as defined by Wolfram Alpha, custom types lack any such level of sophistication.

I hold a great deal of excitement for the future of custom data types. They offer a great amount of promise and an opportunity to vastly improve many, many Excel reports. As of December 2020; however, they fulfill only a small sliver of that promise and, as such, will offer only minor and niche improvements to data management in the near term.

Updated January 20, 2021

Microsoft does have plans to expand the functionality of custom data types in their next phase with nested data types (including lists!). The future is very, very bright for PowerQuery.

Photo by Dimitry Anikin on Unsplash
black computer keyboard on brown wooden desk