The Promise Of PowerQuery Custom Data Types
Microsoft has brought some excellent features to Excel within the last few years. Among these, is rich data types such as the Stock and Geography data types and the variety of Wolfram Alpha data types.
Along with these rich data types, Microsoft has added the ability to create custom data types using PowerQuery. Custom data types ofer a promise of vastly improved data organization, accessibility, and usability for Excel users of all skill levels.
Microsoft promises rich data types that go beyond simple numbers and strings. Data types containing hierarchical data, references to other data types, images, and even intelligent actions.
My question is simple. Are custom data types "rich"?
The custom data types blog post does not discuss how well these objects handle data beyond simple primitives, so I constructed a basic data set to perform my own tests.
Each animal contains the number of animals, a description of the animal, their exhibit location, and then a list of their names, how popular each animal is, and a brief biography. I want to create the following simple report:
Normally, this report would be constructed with VLOOKUP or INDEX/MATCH for the Description, Count, and Location. The harder part is the table of names, attitudes, and popularity metrics. Currently, the easiest way to do that is via a PowerQuery table; however, that requires a manual refresh after each change or a VBA OnChange event macro.
Let's start with the easy part. The selection of animal.
My data validation list for the animal selection is tied directly to the table shown above, where I defined the custom data types. As a result, when I pick an animal - like Monkey - I get the entire rich data type instead of simply the word "Monkey". This means I can reference to cell D3's sub-property "Count" in order to populate the count in cell D6.
The lists associated with each animal are the harder section to populate. Normally, I would do this with a PowerQuery that uses the drop-down selection in cell D3 to apply a merge with a table output, but with dynamic arrays and the lists already linked to the rich data type, this should be easy right?
Turns out that the easy way is a no-go. The list object is flattened into a single text value indicating that the property contains a "List". Super useful. Hopefully Microsoft plans to allow these lists to populate into dynamic arrays in the future, but this isn't the end of the world. Let's take another approach using more custom data types.
Creating these records took a little bit of "M" prowess and effort whereas those list items were created using simple GUI tools, so definitely an extra level of complexity. Even more so when considering that the M Language Reference doesn't contain much information about how to create a record and assign the DisplayNameColumn and TypeName properties to it outside of the CombineColumnsToRecord function. Anyhow ... my new data object will not contain the "Popularities", "Names", or "Abouts" lists and will instead contain a single "Animals" rich data type. Or, at least, I hope it will.
Sadly, and much to my chagrin, the custom data type "Animals" was not accessible as a sub-type within our primary data type. In fact, the underlying custom data type's display value did not even carry over, with only a text value of "[Record]" being returned. The implication is that our custom data types are not far from being rich data types, being limited to simple text and numerical values with no ability to manage any type of hierarchical or advanced data. While Microsoft has implemented rich data types for specific types of objects as defined by Wolfram Alpha, custom types lack any such level of sophistication.
I hold a great deal of excitement for the future of custom data types. They offer a great amount of promise and an opportunity to vastly improve many, many Excel reports. As of December 2020; however, they fulfill only a small sliver of that promise and, as such, will offer only minor and niche improvements to data management in the near term.
Updated January 20, 2021
Microsoft does have plans to expand the functionality of custom data types in their next phase with nested data types (including lists!). The future is very, very bright for PowerQuery.