Metrics Vault: is it exclusive to Data Vault processes' metrics?

Gytis

While exploring the Information Delivery Layer in Data Vault, I ran into a question about one of the optional vaults: the Metrics Vault.

The book Building a Scalable Data Warehouse with Data Vault 2.0 (2016) says:

The Metrics Vault is used to capture and record runtime information, including the run history, process metrics, and technical metrics, such as CPU loads, RAM usage, disk I/O metrics and network throughput. <...> It might include technical metadata and technical metrics of the ETL jobs or the data warehouse environment.

Is the Metrics Vault limited to storing only the metrics that describe ETL/ELT processes governed by the Data Vault 2.0 standard (landing > staging > integration > business vault > information delivery), or may it also be used as a central repository for other metrics of an organization, e.g. ones that indirectly affect the data warehouse, such as source loading timings?
 
Dan Linstedt answers:

Dan Linstedt said:
To answer your question: No. The metrics vault is not limited to just DV2 metrics. It can be leveraged with any metrics you are capturing. Remember: most "ELT and ETL" tools have their own metrics repository, some capture more metrics than others. But then, that leaves us with the same problem we have in data warehousing and BI : How to integrate and get analytics from ALL metrics we are interested in....

To that end, if you have additional metrics you want or need to capture, then by all means do so through the Metrics Vault. It is / or should be: a metrics warehouse.

That raises one more question: if we consider the Metrics Vault to be a Metrics Warehouse, does that mean the Integration layer is the logical place for it, or does it still belong in the Information Delivery layer?

What triggers this question is the keyword "integrate": according to Data Vault 2.0, are we allowed to perform integrations (gathering metrics from different sources and transforming them into a usable form) in a layer other than Integration?

Here Dan Linstedt responds:

Dan Linstedt said:
Information Delivery Layer is always the place for business rules. The Metrics Vault is equivalent in concept to a "Raw Data Vault", i.e. a data warehouse.

Gathering metrics from other sources is all part of the "raw data warehousing effort". Designing the Metrics Vault model by "Business Key" can be an interesting exercise, but it can be done by: machine (IPv6 address, machine name, geo-location), virtual process container (Docker, etc.), process name, date and time of the captured event, etc.

This really clarifies the idea of Metrics Vault 👍

I was curious because in the book (2016) the Metrics Vault is visualized next to the Business Vault in the Data Warehouse layer, and is essentially defined in the Information Delivery layer. But it certainly makes sense to treat metrics just like any other data coming into the Data Warehouse, and hence to pass them through all the corresponding layers.

So in the Raw Data Vault, metrics will most likely replicate tagged time series records (e.g. as in Prometheus), and the Business Key is very likely to be a combination of these tags, just as in the example you gave.
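
To make that idea concrete, here is a minimal Python sketch of how one tagged metric sample could be split into a hub row (identified by the tag combination) and a satellite row (the measured value). Everything in it is an assumption made for illustration: the tag names in `KEY_TAGS`, the `HubRow` / `SatelliteRow` shapes, the `;` delimiter, and the use of an MD5 hash over the trimmed, upper-cased business key. It is not taken from the book or from Dan's answer, and a real model may well choose different keys.

```python
import hashlib
from dataclasses import dataclass
from datetime import datetime, timezone

# Hypothetical choice of tags that together identify "the thing being measured".
KEY_TAGS = ("machine", "container", "process", "metric")


@dataclass(frozen=True)
class HubRow:
    hash_key: str
    business_key: str
    load_ts: datetime
    record_source: str


@dataclass(frozen=True)
class SatelliteRow:
    hash_key: str
    event_ts: datetime
    value: float
    load_ts: datetime
    record_source: str


def business_key(tags: dict[str, str]) -> str:
    # Fixed tag order, trimmed and upper-cased, so the same tag set
    # always yields the same key regardless of how the source ordered it.
    return ";".join(tags.get(name, "").strip().upper() for name in KEY_TAGS)


def hash_key(bk: str) -> str:
    # Assumed hashing convention for this sketch: MD5 hex digest of the key string.
    return hashlib.md5(bk.encode("utf-8")).hexdigest()


def to_vault_rows(tags, value, event_ts, record_source, load_ts=None):
    """Split one tagged sample into a hub row and a satellite row."""
    load_ts = load_ts or datetime.now(timezone.utc)
    bk = business_key(tags)
    hk = hash_key(bk)
    hub = HubRow(hk, bk, load_ts, record_source)
    sat = SatelliteRow(hk, event_ts, float(value), load_ts, record_source)
    return hub, sat


# Usage: one CPU-load sample, tagged the way a Prometheus-style source might tag it.
sample_tags = {
    "machine": "etl-01",
    "container": "loader",
    "process": "stage_orders",
    "metric": "cpu_load",
}
hub, sat = to_vault_rows(
    sample_tags, 0.73, datetime(2024, 1, 1, tzinfo=timezone.utc), "prometheus"
)
print(hub.business_key)  # ETL-01;LOADER;STAGE_ORDERS;CPU_LOAD
```

Repeated samples for the same tag combination map to the same hub hash key and only add satellite rows, which is what lets metrics from different tools be lined up against each other later.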
 