There are only so many things you can do to accelerate queries and make your BI tools more responsive and usable. You can write more efficient queries to begin with or reduce their scope. You can flatten your data or copy it into a flatter structure. You can throw hardware at the problem, as is done with in-memory databases or by spending more on compute. Or you can try to answer user queries in advance by computing and persisting (storing) the results of queries. This last one is what a Looker PDT attempts to do.
Answering queries in advance is a lot of work, and so are Looker PDTs. But I’ve got good news: you can turn this tedious work over to the robots. In this article, I’ll summarize what a PDT is, the pros and cons of them, and how to use AI to make this all easier.
What is a Looker PDT
Simply put, a PDT is a Persistent Derived Table. Let’s take a look at each of these words, in reverse order:
- Table. To your Looker dashboard, a PDT will look and act just like a table with rows and columns. This makes it very easy for analysts to use.
- Derived. But it isn’t really a table. It is actually derived from a query. That is to say, the table is an arbitrary query result set. By arbitrary, I mean that you get to determine it. This is what I mean when I say you are answering questions in advance.
- Persistent. Most query result sets are temporary. Not this one. It is stored into the Looker “scratch schema” of your database and can live there forever. It can even be updated incrementally and on a schedule.
The idea is you write a query–it can even be a complicated one from many tables–that answers most or all of the questions being analyzed by your Looker dashboard, and then store the results. When the dashboard calls for that information–presto, it’s ready to go.
Should You Use a Looker PDT?
Any query acceleration tactic will have pros and cons, and this includes Looker PDTs.
Looker PDT Pros
- Performance. Obviously, right? Persisting the data is much more efficient at query time than a usual LookML model. Your users will notice a dramatic difference.
- Safety. PDTs stay inside the Looker environment, created with the Looker UI. Your BI team can build them without needing access to the underlying data pipelines.
Looker PDT Cons
- Difficult to build. Just look at the documentation for yourself. This will be a lengthy learning process and you can’t afford to make mistakes, especially if your PDT relies on joined tables.
- Costly to maintain. You’ll need to decide on an update strategy, implement, and monitor it. In addition, you won’t know how much PDTs are costing you in extra compute time of your cloud data warehouse until you are running them.
- Non adaptive. The more things change with your requirements or environment, the more often you’ll need to create new PDTs, update existing ones, and remove obsolete ones.
- Not available outside Looker. And for all this effort, you won’t be able to use the results with other tools.
This is why I call it the “Looker PDT treadmill.” The work is substantial, the benefits apply only to Looker, and you need to constantly work on it. The performance benefits are probably worth it. But you can get the performance benefits without these drawbacks by using AI.
Using Keebo AI to Accelerate Looker Queries
Keebo is a data learning product well-suited to accelerating queries. It takes about 30 minutes to set up and doesn’t require any changes to your dashboards or data schema. Using machine learning, Keebo accelerates queries by examining each one and using machine learning to assess how it can be improved. Then Keebo creates smart models to persist data (similar to a materialized view) and rewrites queries on-the-fly to use the smart models. As such, these smart models should be considered as an alternative to Looker PDTs for the following reasons:
- Automatic. You don’t need to learn anything new or create anything. Instead, let machine learning figure out when to use a smart model and what should be in it. Keebo monitors dozens of performance parameters 24×7.
- Adaptive. As your workload, environment, or data change, Keebo gets rid of obsolete smart models and creates new ones automatically. And you also don’t have to find a human to fix a problem that of course will happen at odd hours and critical moments.
- Agnostic. Smart models aren’t unique to Looker and in future Keebo releases can be leveraged for other tools as well.
Some might be concerned about handing query performance over to AI robots. If so, I’ve got good news for you there, too. Keebo provides an interface where you can set freshness and budget parameters so the optimizations work within your SLAs and budgets, unlike PDTS, where cost impact won’t be known.
We’ve got a short demo video if you’d like to see it in action. Keebo Query Acceleration could be the first practical application of AI for your data team, accelerating performance and saving your time for more critical tasks.