WhereScape

WhereScape consists of 2 applications:

  • WhereScape 3D
  • WhereScape RED

WhereScape 3D is a newer application and is the way to go for the future. With 3D I can dig into my source system, profile it, document it and design my future design and at the end it spills out any objects (including any needed intermediate tables such as stage tables) for the “machine room”.

WhereScape RED is the machine room application. RED creates tables, indexes, views in the way a user wants them. On top of it, it creates stored procedures to pump data from the source to the target table. For me this was a design shift from SSIS which is an application retrieving data, transforming it and inserting it into the target. All in-memory outside the database. Stored procedures are running inside the database engine.

When we started I had big hopes to get to my target design very fast. Hoped for a default path of reaching my goals. But the consultants just told us, we can do everything the way you want! So, what did I want? And which is the way we wanted it? We started to model stuff in 3D, exporting it, testing, remodel, exporting, testing, etc. As I didn’t know what RED can do and how to achieve things best, this took us some time.

Finally I focused on RED, the machine room, to learn how things are working and how to design things in a pattern which is repeatable.

Our first draft was to rebuild a kind of data flow with views: extract & transform → merge → clean business key → calculate hash diffs → remove duplicates (optional) → stage table → load → target table. With a service creating all those sql queries based on view type this was quite fast and simple. Along the way we learned also strange behaviors from SQL Server which I’m going to point out later. Creating all those tables from 3D took us some time and some features didn’t exist in the way we wanted it to work.

But to be clear: Working and figuring out was still faster then working in SSIS.

In June 2017 WhereScape released a template engine based on Pebble. Never heard of that before.

In a matter of days I redesigned all our objects, removing all those views and simplifying the whole stack. Mainly it looks like source → stage → target. I’m very satisfied with that!

So, what kind objects does WhereScape offer?

The table type “Stage” is the work horse for anything.

What target table types do I use (there are more)?

  • Load
  • History (renamed from Data Store)
  • Satellite
  • Link
  • Hub
  • Dimension
  • Fact

That’s it. Do I need only 7 templates “to rule them all” (including stage, without load)? Wow, that’s a short list. Ok, I added 3 more templates for special stage tables, but that’s it. Focusing on those types and defining the best templates gives us the power to develop fast our pipelines. Is this automation? I don’t know. Working with SSIS I needed to create the tables and indexes manually. In WhereScape I define and it creates. I believe when I start using 3D then we can talk about automation.

Something else. Maybe you wonder why “Load” is a target table type? Usually the books are telling about a stage table and a target table. But as stage tables have “logic” in it (at least in WhereScape) and is used to organize data towards the target table type, a “Load” table is the de facto landing zone for any data from the outside.

Our data flow looks like that:As you can see, the Stage table is the workhorse for any transfer from one model to another.

In the next blogs I’m going to present my recipes how I load data into each of those objects.