Getting data architecture right: Making the pieces fall into place
I have evaluated, designed and implemented many different data structures during my career. If there’s one thing I’ve learned from all that I have seen, it is that no data structure is perfect. The most successful ones strike a fine balance among the end user’s many different needs. These needs include performance, ease of use, presentation to the semantic layer, RDBMS optimization, natural query language and reusability. If you balance these requirements correctly, all of the other data architecture pieces will be much easier to fit into place.
Start at the beginning
Your data architecture is the pillar on which your platform stands, and it needs to be treated as such. Having a solid, well-thought-out and flexible data architecture allows your organization to use a multitude of front-end reporting and analytics tools, ETL tools and MDM applications. With that foundation, your organization can choose the best fit for its needs instead of settling for what “will work.” All of these pieces can be put into place around an effective data architecture. You can add a horn or a bell to your bike, but without a bike to start with, neither is much use. Think of your architecture as the bike.
Find the perfect fit
Most people who are new to data architecture design expect to follow a strict set of rules for creating a star schema, a third normal form schema or a data vault. In the real world, we do not work in a vacuum. Data is not perfect, and a single architecture does not always work for all uses. Be willing to be flexible and use what works best for your organization, but also be prepared to make mistakes. Learn from those mistakes and use the lessons to move forward. Use the experience you have gained to form a strong, resilient, well-connected architecture that lasts.
Connect it all together
Today’s BI applications are built with a thinner-than-ever semantic layer. Historically, the semantic layer could cover up or correct some data architecture issues, for example by using outer joins instead of inner joins to ensure correct answers for the end user. With some of the tools available today, this is no longer the case. End users can now create joins themselves without really knowing why or when to use a non-inner join. Self-service has put a lot of power in the hands of your users, and decisions can and will be made based on incorrect ad hoc queries in today’s open self-service environments. While this might be the future of BI, you must respond by mitigating as much of the risk of end-user mistakes as possible. And what’s the best way to mitigate mistakes? By having a great data architecture that ensures all pieces fall into place perfectly.
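To make the join risk concrete, here is a minimal sketch of how an ad hoc inner join can quietly produce an incomplete answer. The customer and sales tables, their columns and the sample rows are all hypothetical, invented for illustration; the example uses Python's built-in sqlite3 module with an in-memory database.

```python
import sqlite3

# Hypothetical two-table model: every name and value here is illustrative only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE sales (sale_id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
    INSERT INTO customer VALUES (1, 'Acme'), (2, 'Globex'), (3, 'Initech');
    INSERT INTO sales VALUES (10, 1, 100.0), (11, 1, 50.0), (12, 2, 75.0);
""")

# "Sales per customer" written with an inner join, as a self-service user might.
# Customers with no sales rows drop out of the result without any warning.
inner_rows = conn.execute("""
    SELECT c.name, COALESCE(SUM(s.amount), 0) AS total
    FROM customer c
    JOIN sales s ON s.customer_id = c.customer_id
    GROUP BY c.name
    ORDER BY c.name
""").fetchall()

# The same question with a left outer join keeps every customer,
# reporting a total of 0 for those with no sales.
outer_rows = conn.execute("""
    SELECT c.name, COALESCE(SUM(s.amount), 0) AS total
    FROM customer c
    LEFT JOIN sales s ON s.customer_id = c.customer_id
    GROUP BY c.name
    ORDER BY c.name
""").fetchall()

print("inner join:", inner_rows)
print("left outer join:", outer_rows)
```

In this sketch the inner join returns two customers and the outer join returns three. Both queries run without error, so a user has no signal that the first result is incomplete. That is the kind of mistake the architecture has to help prevent once the semantic layer no longer does.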