10 Hot 🔥 Data & Analytics Trends to Watch in 2022

Question: What do you get when you put 4 data junkies from 3 countries in a car for 3 hours?

Answer: 1. A musical education, and 2. the contents of this article.

Last month as our team retreat concluded, four of us got in Ollie’s car and started the long road trip from northeastern Wales to London. Even though we’d spent the last few days talking ad nauseam about the data industry and its challenges, we soon found ourselves heatedly discussing how our industry is changing for the better, and worse.

This article is a summary of that conversation (minus the soundtrack of '90s grunge, 2000s hip-hop, and '80s pop), and reflects both the serious and the more... out there... ideas we discussed. To keep it balanced, I've alternated between serious predictions (safe bets) and some more fun ideas (hot takes).

1. The data stack goes niche and data teams burn out on choice [safe bet]

No, it wasn’t just you; there were a lot of new data startups popping up in 2021, each seeming to help with a smaller and smaller part of your stack. Benn Stancil put it best in a recent post:

“Instead, the front of the data stack is represented by an explosion of tools, all tacking in slightly different directions. There’s traditional BI; there’s modern BI; there’s headless BI; there’s open-source BI; there’s Bitcoin-based BI. There are notebooks for analysis, notebooks for SQL, notebooks for collaboration, notebooks for apps, and apps for notebooks. There are data visualization tools, data visualizations for notebooks, and notebooks for data visualizations. There are SQL editors for teams, SQL editors for people who don’t want to write SQL, and SQL editors for Snowflake customers. There are collaborative workspaces, and tools that combine lots of things together. There are spreadsheets we can’t get rid of and spreadsheets replacing the spreadsheets we can’t get rid of; there are rebuilt spreadsheets; there are spreadsheets, but BI. And more of everything is coming.” [1]

This strategy isn’t surprising — startups have to focus on solving small problems well before they can achieve their full vision. What is surprising is that we seem to be buying the narrative that having more specialized tools in our stack is better than fewer general tools. For some situations that may be the case, but for the majority of companies I speak to, this spells trouble.

But burnout looms...

Already our team is spread thin, regularly switching tools for most of their tasks: querying the database, analyzing the results from that query, visualizing the results from that analysis, sharing the findings of those visuals, building data models, documenting those models, version controlling those models, etc. And they’re getting pretty fed up with it [2].

To add more tools to that workflow is downright terrifying. As we know from research [3] (and personal experience), we become "less rational, less intelligent, less focused" the more our attention is divided across different tasks and tools. I worry too many teams will learn this lesson the hard way this year.

2. Data teams will make “mission statements” and abandon them [hot take]

I started hearing whispers of this trend in late 2021, and it's picked up steam already in the first few months of 2022. The impulse to create a data team mission statement is understandable; it creates a sense of purpose, a way to reset expectations within the team, and more importantly, with the rest of the business.

All that being said, I can't help but roll my eyes at this a bit. I have flashbacks to sessions I was a part of as an analyst in which we discussed lofty ambitions for our team, how we were going to "enable better business decisions" with "the highest quality data and technology." But the reason we are failing at that objective is not that we've forgotten it; it's that it's genuinely hard to do.

By all means, have that mission statement session, but please also have the essential follow-up sessions to let your team talk about the things they really want to change, and then you might start to move that needle.

3. Notebooks and Data Catalogs go enterprise [safe bet]

2021 saw a big transition for notebooks and data catalogs as people stopped asking “Why do I need a [notebook | data catalog]?” and started asking “Which one should I get?” Finally, we’ve all come around to the idea that we need more than dashboards to do our jobs well, and notebooks are a nice addition to the toolkit. Moreover, the data catalog feels like a natural consequence of the data modeling movement, one that helps make that clean data and those slick new notebooks easy to find.

The big question is who will win this game this year? Will an existing enterprise tool try to build their own notebooks like Snowflake and Azure have already done? Will anyone go for their own data cataloging feature?

Or will it be one of the little guys that breaks through into enterprise ubiquity?

I for one am always rooting for an underdog...

4. The data “Jungle Book” goes the way of Facebook [hot take]

Python’s been untouchable for the last 5 years. It’s gone from “the new R” to the must-know language of every data engineer, data scientist, and even data analyst. But will this hold much longer?

For data engineers, as Mehdi Ouazza argues in his article "The Battle for Data Engineer's Favorite Programming Language Is Not Over Yet" [4], Rust might just lure them right out of their Python dens. Rust is more highly rated than Python, has a steeper learning curve (so they can get more credibility points for learning it), and, most importantly, might actually be better suited for data engineering tasks.

On the other end of the spectrum, analysts who have been using Python (and 🐼) for data transformation and exploration have never had more reasons not to use it. SQL data warehouses like Snowflake are faster and are upping the capabilities of regular old SQL every day, which makes the decision to leave your SQL IDE and do some transformations in super slow Python ever more costly.

5. Collaboration will mean more than “Google Docs for data” [safe bet]

Often the best innovations are borrowed, or at the very least inspired by something else. In data, we see this most often with the “X for data” taglines on many popular tools. For example, dbt’s software engineering for data engineering took the industry by storm as it cleverly brought in the best that software engineers could offer for data engineers.

The cons to this approach appear when we don’t quite get our analogies right. In the case of dbt, they did not just recreate all aspects of a software engineer’s world — they took only the parts that were needed, that make things better. They understood what made data engineering different from software engineering just as much as they saw the similarities.

Unfortunately, not everyone is as clever. Most recently, we’ve been flooded with tools promising “collaboration”, but really what they’ve done is give you Google Sheets for data and call it a day. No one wants to build a SQL query with someone in real time, and adding comments to a notebook only captures a small portion of the collaboration we do on a daily basis. Frankly, it falls short.

2022 is the year someone [5] puts some creative and analytical thought behind what collaboration in data ought to look like, and really shakes things up.

6. Someone will do data in the metaverse...even if it has no business there [hot take]

Apologies to any metaverse lovers out there; this is really not a shot at you. The metaverse is an attractive green space of innovation, ideas, and potential right now. And someone is going to want to walk around in that world of possibilities and wish they could pivot data with their hands, or walk amongst the stalks of a tall bar chart, or some other asinine idea. This is the year the metaverse becomes accessible and exciting enough that we see a surge of new applications making a play there to see if it works, and data won't be left behind.

And, let's be honest, I'll be there to try it out.

7. The Last Mile becomes the next Data Mesh [safe bet]

The world was abuzz in 2021 with talks of the mysterious data mesh. What was it? Well, that depended on who you spoke to. But you can bet we all wanted to find out. I’m hearing similar fervor and equally disparate ideas on the Last Mile of Analytics. Is it reverse ETL? Is it headless BI? Something else entirely?

We’ve written up what it means to us [6], but we’re sure someone else will tell you something different. Either way, you’re going to be hearing a lot about the Last Mile this year, so buckle up.

8. We remember how much better in-person data events are than all those Slack channels [hot take]

This is the year we all remember the magic that is connecting with someone with shared interests over cold pizza and warm beer, of learning about cutting-edge work from uncomfortable chairs next to someone with a little too much body spray. This is the year we pull away from those Slack channels we joined in March 2020, unsubscribe from a few of those newsletters, and try to find the right balance of real and virtual connections again. Cheers to that 🍻.

9. Self-service finds new fervor [safe bet]

Don’t call it a comeback...because it never left. A few years ago I heard a lot about self-service. We all dreamed the same pipe dream that our business partners would be able to do most of their analysis on their own, so we as a data team could be freed up to do more fruitful activities. Then seemingly we all came to the same crushing conclusion: self-service won’t work until we get our own house in order. So we took a few years to focus on building reliable data pipelines, and telling better data stories, but now we’re ready to open the doors to the outside world again. But will it work?

You better believe we’re gonna do our best! If the Last Mile is going to be as big as I expect, then self-service will naturally experience a reinvigoration this year. Maybe this is even the year it becomes more than a dream...

10. Decision scientists are this year's analytics engineers [hot take]

As I’ve alluded to in earlier predictions, there will be a big focus this year on how we better integrate our data into the business, and specifically with decision making. One of the more obvious solutions to this is to create a position for someone to make sure data is used effectively in decision-making. Cue the decision scientist.

Decision scientists, or operations researchers, have been around for years, and companies like Meta already employ decision scientists [7]. But 2022 is the year decision scientists go from fringe to mainstream.

These mavericks operate at the intersection of data, business knowledge, and psychology, expertly ensuring data is used in the optimum way to influence the best business outcomes possible. In short, in the right organization and structure, they can make a big difference.

Closing Thoughts

Even if I've gotten these predictions completely wrong, there is one thing I'm sure of: 2022 is going to be an exciting year for data. Already in the first month and a half we've seen the Twitter-verse come alive with the idea of data stack "bundling", we're seeing new tools pop up promising new ways to explore data, and some big Series A announcements promise innovation and growth coming soon.

But what do you think? What did I get right? What did I get wrong? What did I miss entirely?

References

[1] Benn Stancil, "Business in the back, party in the front". Feb 4, 2022.

[2] Taylor Brownlow, "The Analytical Workflow is broken". Feb 18, 2021.

[3] Johann Hari, Stolen Focus: Why You Can't Pay Attention--And How to Think Deeply Again. 2022.

[4] Mehdi Ouazza, "The Battle for Data Engineer’s Favorite Programming Language Is Not Over Yet". Jan 27, 2022.

[5] Selfish plug for Count, where we're back in Beta working on this very problem. Learn more here.

[6] Taylor Brownlow, "Modern Data Stack, It's Time for Your Closeup". Nov 8, 2021.

[7] Open role at Meta for a Decision Scientist.