The decision points that you need to address with your data management strategy are:
- Consolidate data. A fundamental function of data management is to gather data from various sources and in various forms. This data is then turned into potential intelligence (actionable information) to aid decision making.
- Provide intelligence. Data has no inherent value until it is used by someone, or some software, to make a decision. Data management provides appropriate access to data, and access to tools to manipulate and communicate it, to turn it into intelligence (actionable information).
- Ensure data security. Data security is an important aspect of your overall security strategy. The fundamental issue is to ensure that people can access only the information that they should, and that information is not available to people who shouldn’t have it. This is called the principle of “least access,” more commonly known as the principle of least privilege. Data security must be addressed at both the virtual and physical levels.
- Analyze data. Existing data sources, and the data that they contain, must be explored and examined so that the data can be understood and, ideally, turned into intelligence.
- Evolve data assets. There are several categories of data that prove to be true assets over the long term: test data, which is used to support your testing efforts; reference data, also called lookup data, which describes relatively static entities such as states/provinces, product categories, or lines of business; master data, which is critical to your business, such as customer or supplier data; and meta data, which is data about data. Traditional data management tends to be reasonably good at this, although it can be heavy-handed at times and may lack the configuration management discipline that is common within the agile community.
- Specify data assets. At the enterprise level your models should be high level – lean thinking is that the more complex something is, the less detailed your models should be to describe it. This is why, in most cases, it is better to have a high-level conceptual model than a detailed enterprise data model (EDM). Detailed models, such as physical data models (PDMs), are often needed by delivery teams for specific legacy data sources.
- Improve data quality. There is a range of strategies that you can adopt to ensure data quality. The agile community has developed concrete quality techniques – in particular database testing, continuous database integration, and database refactoring – that prove more effective than traditional strategies. Meta data management (MDM) proves to be fragile in practice because the overhead of collecting and maintaining the meta data is often far greater than the benefit of doing so. Extract, transform, and load (ETL) strategies are commonplace for data warehouse (DW) efforts, but they are in effect band-aids that do nothing to fix data quality problems at the source.
- Refactor legacy data sources. Database refactoring is a key technique for safely improving the quality of your production databases. While delivery teams perform the short-term work of implementing the refactoring, there is also organizational work to be done: communicating the refactoring, monitoring usage of the deprecated schema, and eventually removing the deprecated schema and any scaffolding required to implement the refactoring.
- Govern data. Data, and the activities surrounding it, should be governed within your organization. Data governance is part of your overall governance efforts.
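To make the database refactoring lifecycle described in the list above more concrete (implement the change, keep scaffolding in place during a transition period, then remove the deprecated schema), here is a minimal sketch of a "rename column" refactoring. It is illustrative only: the `customer` table, the `fname` and `first_name` columns, the trigger names, and the functions `start_rename` and `finish_rename` are all hypothetical, and an in-memory SQLite database is assumed purely so that the example is self-contained. A production database would use its own migration tooling and a much longer deprecation period.

```python
import sqlite3


def start_rename(conn):
    """Step 1: add the new column plus scaffolding that keeps it in sync.

    Hypothetical example: customer.fname is being renamed to first_name.
    """
    conn.executescript("""
        ALTER TABLE customer ADD COLUMN first_name TEXT;
        UPDATE customer SET first_name = fname;

        -- Scaffolding: legacy applications still write fname during the
        -- transition period, so copy their writes into first_name.
        CREATE TRIGGER customer_sync_insert AFTER INSERT ON customer
        WHEN NEW.first_name IS NULL
        BEGIN
            UPDATE customer SET first_name = NEW.fname WHERE id = NEW.id;
        END;

        CREATE TRIGGER customer_sync_update AFTER UPDATE OF fname ON customer
        BEGIN
            UPDATE customer SET first_name = NEW.fname WHERE id = NEW.id;
        END;
    """)


def finish_rename(conn):
    """Step 2: after the transition period, remove the deprecated column
    and the scaffolding by rebuilding the table without fname."""
    conn.executescript("""
        DROP TRIGGER customer_sync_insert;
        DROP TRIGGER customer_sync_update;
        CREATE TABLE customer_new (id INTEGER PRIMARY KEY, first_name TEXT);
        INSERT INTO customer_new (id, first_name)
            SELECT id, first_name FROM customer;
        DROP TABLE customer;
        ALTER TABLE customer_new RENAME TO customer;
    """)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, fname TEXT)")
conn.execute("INSERT INTO customer (fname) VALUES ('Ada')")

start_rename(conn)

# During the transition period a legacy application still uses fname.
conn.execute("INSERT INTO customer (fname) VALUES ('Grace')")
conn.execute("UPDATE customer SET fname = 'Augusta' WHERE id = 1")
transition_rows = conn.execute(
    "SELECT id, fname, first_name FROM customer ORDER BY id"
).fetchall()

finish_rename(conn)

# Column names are the second field of each PRAGMA table_info row.
final_columns = [row[1] for row in conn.execute("PRAGMA table_info(customer)")]
final_rows = conn.execute(
    "SELECT id, first_name FROM customer ORDER BY id"
).fetchall()
```

The checks that a team might run against `transition_rows`, `final_columns`, and `final_rows` are a small instance of database testing: they confirm that the scaffolding kept both columns consistent during the transition and that no data was lost when the deprecated column was removed. Monitoring usage of the deprecated column, which the organizational work also calls for, is not shown here.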
Looking at the diagram above, traditional data management professionals may believe that some activities are missing. These activities may include:
- Enterprise data architecture. This is addressed by the enterprise architecture process blade. The DA philosophy is to optimize the whole. When data architecture (or security architecture, or network architecture, or…) is split out from EA it often tends to be locally optimized and as a result does not fit well with the rest of the architectural vision.
- Operational database administration. This is addressed by the IT operations process blade, once again to optimize the operational whole over locally optimizing the “data part.”
As you can see, we’re not talking about your grandfather’s approach to data management. As Figure 2 summarizes, organizations are now shifting from the slow, documentation-heavy, bureaucratic strategies of traditional data management towards collaborative, streamlined, and quality-driven agile/lean strategies that focus on enabling others rather than controlling them.