Nov 4 2017
As a Business Intelligence guy, it is my job to get data from any source system into a managed data system and to deliver reliable, cleaned results to my users, so that they can analyze the data and make the right decisions based on it.
How do we do that:
As you can see, we have lots of different source types and two ways to ingest the data: batch-oriented or streaming. Currently we are not at the level to work on streaming data. People are mostly interested in historical-to-present data and forecasts based on it.
What I’m going to talk about is batch loading of data: getting a grip on any data and loading it into our SQL Server, which hosts our data warehouse and the cubes we use for analysis.
Anything file-related goes to my managed file directory, which is also synced to Azure Blob Storage. I know, many prefer Amazon S3. For a change, and because I had the chance to try it, I gave Azure a go. It is very convenient, and by adding my own metadata it is very easy to recover data when it gets deleted.
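To give an idea of what "own metadata" could look like, here is a minimal sketch using only the Python standard library. The function name `build_blob_metadata`, the `source_system` parameter, and the chosen keys are all hypothetical and not taken from my actual setup. It only builds the key/value pairs for a file; the upload itself is left out.

```python
import hashlib
from datetime import datetime, timezone
from pathlib import Path


def build_blob_metadata(path: Path, source_system: str) -> dict[str, str]:
    """Describe a local file as string key/value pairs (hypothetical helper).

    Keys are kept as simple identifiers and values as plain strings,
    because blob metadata travels as name/value pairs of strings.
    """
    data = path.read_bytes()
    # Last modification time of the local file, normalized to UTC.
    modified = datetime.fromtimestamp(path.stat().st_mtime, tz=timezone.utc)
    return {
        "source_system": source_system,          # assumed label for the origin
        "original_name": path.name,              # file name in the managed directory
        "size_bytes": str(len(data)),            # lets you spot truncated copies
        "sha256": hashlib.sha256(data).hexdigest(),  # content fingerprint
        "modified_utc": modified.isoformat(),
    }
```

A dictionary like this can be attached to the blob when uploading, for example through the `metadata` argument of `upload_blob` in the `azure-storage-blob` Python package. With the name, hash, and source stored next to the blob, it becomes easier to tell which copy to restore and whether it still matches the original file.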
How to ingest data easily is something I might discuss in another blog post.