Disciplined Agile

Data Management Mindset

To capture the mindset for effective data management, we extend the principles, promises, and guidelines of the Disciplined Agile® (DA™) mindset with philosophies specific to data management.

Figure 1. The Disciplined Agile (DA) mindset for data management.

To be effective at data management, we embrace these philosophies:

  1. Work closely with others. Data professionals need to actively work with people throughout the organization, helping them to leverage data in their decision making. They also need to work closely with delivery teams to produce and work with high-quality data sources.
  2. Transfer skills and knowledge. The aim is to enable others to better understand and become more effective at working with data.
  3. Usage-driven data. Traditional strategies promote a data-driven approach in which design efforts focus on data structure and semantics first, in the belief that usage will follow. This tends to result in data sources that aren’t readily usable by their intended audience. The agile strategy is to start with an understanding of what people want to do and then design data sources to support that.
  4. Timely, secure, and auditable intelligence. When people are making decisions, they need easy access to the right data at the right time. Having said that, they should only have access to the data that they are allowed to see and no more – this is referred to as the principle of least access. Data activities must also be auditable – we should know where data comes from, when and how it changes, and, in the case of private/secure data, who accessed it and when.
  5. Fix the source. Every organization has technical debt – poor-quality assets that hamper its ability to operate effectively. Technical debt in data sources includes inaccurate data, inconsistent data, poorly formatted data, and many other problems. The traditional strategy for dealing with these problems is to transform the data as you bring it into your data warehouse (DW) environment or into other systems, an approach that proves expensive and inconsistent over time. A better option, albeit harder in the short term, is to fix the problems at the source via data refactoring strategies.
  6. Model to understand. Models are very good at communicating and capturing high-level concepts but rather poor at capturing details. Visual models showing major concepts – such as entity types and the relationships between them on conceptual diagrams, or systems and data flows on architectural diagrams – prove useful in practice because they show you the lay of the land. The challenge is that people tend not to keep the details up to date and very often don’t bother to read them anyway. The implication is that we need something else to capture the details, which is what executable tests are good at doing.
  7. Test to specify. Strategies such as Acceptance Test-Driven Development (ATDD) and Sustainable Test-Driven Development (STDD) are quickly becoming the norm for exploring and capturing detailed requirements and design specifications, respectively. Supported by automation, this promotes greater quality and ease of change while providing superior specifications for future efforts. The tests form executable specifications and thus add real value to delivery teams: because the tests validate the teams’ work, they are much more likely to be kept up to date than static documentation.
  8. Automate, automate, automate. Automated checking of the work of delivery teams via ATDD and STDD is enabled by continuous integration (CI) and continuous deployment (CD) technologies. These strategies have their roots in software development, but they are also being applied in the data realm, where they contribute to improving overall data quality and decreasing the time to safely deploy database updates into production. In parallel, automation is common in the data warehouse (DW)/business intelligence (BI) environment as it evolves from batch processing to full real-time processing across the entirety of your data infrastructure. See the Accelerate Value Delivery process goal for automation options.
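The auditability described in philosophy 4 can be made concrete in code. The sketch below is a minimal illustration only – the `audited` decorator, the `fetch_customer` function, and the `data_audit` logger are all hypothetical names, not part of DA – showing how a data access layer might record who accessed a dataset, and when, before running the query:

```python
import functools
import logging
from datetime import datetime, timezone

logging.basicConfig(level=logging.INFO)
audit_log = logging.getLogger("data_audit")

def audited(dataset_name):
    """Record who accessed which dataset, and when, before the query runs."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(user, *args, **kwargs):
            audit_log.info("%s accessed %s at %s", user, dataset_name,
                           datetime.now(timezone.utc).isoformat())
            return func(user, *args, **kwargs)
        return wrapper
    return decorator

@audited("customer_records")
def fetch_customer(user, customer_id):
    # Placeholder for the actual, access-controlled query.
    return {"id": customer_id}
```

In practice the same wrapper is also where a least-access check would live, so that authorization and audit logging are enforced in one place rather than in every query.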
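Philosophy 5's "fix it at the source" idea can be sketched as a small data refactoring. The example below is an illustration under assumed names – the `customer` table and the `standardize_phone` rule are hypothetical – repairing poorly formatted values once, in place, instead of re-transforming them in every downstream ETL job:

```python
import sqlite3

def standardize_phone(raw):
    """Normalize a phone number to digits only, e.g. '(555) 123-4567' -> '5551234567'."""
    return "".join(ch for ch in raw if ch.isdigit())

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, phone TEXT)")
conn.executemany("INSERT INTO customer (phone) VALUES (?)",
                 [("(555) 123-4567",), ("555.987.6543",)])

# Data refactoring: repair the values once, at the source. Fetch first,
# then update, so we are not modifying the table mid-iteration.
rows = conn.execute("SELECT id, phone FROM customer").fetchall()
for row_id, phone in rows:
    conn.execute("UPDATE customer SET phone = ? WHERE id = ?",
                 (standardize_phone(phone), row_id))
conn.commit()
```

A real database refactoring would also deprecate the old format gradually (for example via a transition period with a compatibility view) so existing consumers keep working while they migrate.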
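To illustrate philosophy 7, detailed data rules can be captured as executable tests rather than as static documents. A minimal sketch, assuming a hypothetical customer record with `email` and `age` fields; each rule in `check_customer_record` doubles as a specification of what "valid" means, and the assertions below it are the executable specification run against sample rows:

```python
def check_customer_record(record):
    """Executable specification for a (hypothetical) customer record."""
    errors = []
    if not record.get("email") or "@" not in record["email"]:
        errors.append("email must contain '@'")
    if record.get("age") is not None and not (0 <= record["age"] <= 150):
        errors.append("age must be between 0 and 150")
    return errors

# The specification in action: valid and invalid sample records.
assert check_customer_record({"email": "a@example.com", "age": 30}) == []
assert check_customer_record({"email": "bad", "age": 30}) == ["email must contain '@'"]
```

Run under CI against every change to the data source, checks like these stay current because the build breaks when they drift, which is exactly the advantage over static documentation.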