Think about all the data in your personal life – passwords, calendars, finances, phone numbers, and so much more. Then think about how that data intersects with the other people in your circle. It could be a family calendar with all the doctor appointments and soccer games. Or it’s a shared bank account with a partner for which you both have the login.
But maybe one day, someone forgets to add a game to the calendar or share a password change. Suddenly, the whole process breaks down because the various parties are not bringing the necessary data together.
And even though we’re talking about a family, the same scenario may be true for your work family, where data isn’t always flowing effortlessly among the various systems and processes. Without proper integration, a company can feel a bit lost and unsure about its data. But there are steps to set your organization on the right path to achieving seamless data integration.
Data integration should occur when data or information from one business context or department generates value if used in another context or department. This may sound a bit abstract, but bear with me, as we are going to cover a lot of ground – and I want to build a shared foundation.
Integration is an immense problem space because most companies require a multitude of systems to operate. The cost structure of business models often depends on how effectively and efficiently these integrations are implemented with constrained resources. Most companies recognize that as the marginal cost of compute moves toward zero, the effective digitization of workflows will yield market winners. Marc Andreessen, the progenitor of the modern browser, described this mega-trend back in 2011 in his article “Why Software Is Eating the World.”
As our global economy moves into the cloud ecosystem, “digital transformations” are accelerating the demand curve for integrated operations and analysis. In the early phases of these “digital transformations,” companies were shifting away from manual processes like spreadsheets and using modern, governable systems to store and analyze their data.
These opportunities are buoyed by the 100-billion-dollar market cap of the growing data and analytics space, where the major players are Snowflake, Microsoft Azure, AWS, Databricks, and Google Cloud. Snowflake represents the most innovative approach to databasing since PostgreSQL was introduced in the 1990s. Virtual database technology fundamentally changes the velocity of innovation and experimentation with a company’s harvested analytical data (more on that later). Luckily, Snowflake’s competitors realized this and have made their own substantial investments in building products and services in this space as well.
According to Gartner, 89% of board directors say digital business is now embedded in all business growth strategies. Still, a mere 35% of board directors report that they have achieved or are on track to achieving digital transformation goals. This may be because managing data across the enterprise and the timely integration of information among departments are critical aspects of all “digital transformations.” We often see companies look at the complexity of this task and try to over-engineer a perfect solution. This is a mistake.
In the cloud, infrastructure can be spun up and torn down multiple times in a day, and because this problem space is so novel, mistakes can be contained and remediated. Furthermore, those mistakes represent the best way to learn and grow your company’s capabilities. It’s time to take on this challenge, and that can mean starting small.
So, with that context in place, let’s get back to that shared foundation we were building. There are two main types of data that are valuable to an organization: operational data and analytical data.
When integrating operational data, you are attempting to provide an accurate and relevant model of the world as it currently is. The relevancy of the data or an event to the domain being integrated is a critical limiting factor on whether a particular integration should be implemented. If irrelevant data is moved from one system to another, you’ve incurred an operational cost with no benefit, and over time this behavior will erode data quality in the analytical plane.
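One way to picture that relevance check is as a filter applied before anything crosses a system boundary. The Python below is only a minimal sketch under stated assumptions: the `Event` shape, the `RELEVANT_TO_FULFILLMENT` set, and the `select_relevant` function are hypothetical names invented for illustration, not part of any particular integration product.

```python
from dataclasses import dataclass


@dataclass
class Event:
    """A hypothetical operational event emitted by a source system."""
    kind: str
    payload: dict


# Assumption: the receiving domain (fulfillment, in this sketch) declares
# which event kinds it actually acts on. Everything else stays at the source.
RELEVANT_TO_FULFILLMENT = {"order_placed", "order_cancelled"}


def select_relevant(events, relevant_kinds):
    """Keep only the events the target domain has declared relevant."""
    return [event for event in events if event.kind in relevant_kinds]


events = [
    Event("order_placed", {"order_id": "A-100"}),
    Event("newsletter_signup", {"email": "pat@example.com"}),
    Event("order_cancelled", {"order_id": "A-099"}),
]

# Only the two order events would be moved; the signup is never copied.
to_integrate = select_relevant(events, RELEVANT_TO_FULFILLMENT)
print([event.kind for event in to_integrate])
```

Keeping the relevance rule as explicit data, rather than burying it in transfer logic, makes it easier for the receiving department to review what it is actually being sent.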
In the analytical plane, you look at data describing the past. Data should be moved from the operational to the analytical plane, where cross-domain analysis and experimentation can proceed without interrupting business operations. This is where baselines and benchmarks get created. As new data flows from the operational to the analytical plane, we can assess how the business is measuring against those established KPIs and indexes.
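To make the baseline idea concrete, here is a small Python sketch. It assumes daily order counts have already been copied from an operational system into the analytical plane; the numbers, the variable names, and the `percent_vs_baseline` helper are all made up for illustration.

```python
from statistics import mean

# Assumption: historical daily order counts that already live in the
# analytical plane. These figures are invented for the example.
historical_daily_orders = [120, 135, 128, 142, 130]

# The baseline is simply the average of what has happened so far.
baseline = mean(historical_daily_orders)


def percent_vs_baseline(value, baseline_value):
    """Express a new observation as a percent change from the baseline."""
    return (value - baseline_value) / baseline_value * 100


# A new day's figure arrives from the operational plane.
new_day_orders = 150
change = percent_vs_baseline(new_day_orders, baseline)
print(f"{change:+.1f}% against the baseline")
```

A real benchmark would likely account for seasonality and a longer history, but the shape is the same: history sets the reference point, and new operational data is measured against it.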
Business operations require different people to do different jobs and use different systems for various aspects of an organization’s wealth-generating and value-producing activities. Sometimes the activities from one department are an input to another department’s work. The ability to efficiently execute workflows and optimally distribute across resources requires an organization to carefully consider how operational data from one department is used – and is useful – to another department’s operations.
Activities supported by operational data may include scheduling services, ordering products, or shipping materials. For example, when operational data is integrated, it allows orders placed on an e-commerce site to make it to the ERP system for fulfillment. Tracking numbers created by the warehouse team can be accessed by the customer using the Amazon marketplace.
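The order-to-fulfillment flow described above can be sketched in a few lines of Python. Every field name and function here (`to_erp_order`, `attach_tracking`, `customer_view`) is a hypothetical stand-in: real storefront and ERP systems define their own schemas and APIs, so treat this as an illustration of the mapping step rather than a working connector.

```python
def to_erp_order(web_order):
    """Translate a hypothetical storefront order into a hypothetical ERP shape."""
    return {
        "external_ref": web_order["order_id"],
        "lines": [
            {"sku": item["sku"], "qty": item["quantity"]}
            for item in web_order["items"]
        ],
        "ship_to": web_order["shipping_address"],
        "tracking_number": None,  # set later by the warehouse team
    }


def attach_tracking(erp_order, tracking_number):
    """Return a copy of the ERP order with the tracking number filled in."""
    return {**erp_order, "tracking_number": tracking_number}


def customer_view(erp_order):
    """Expose only what the customer-facing channel needs to show."""
    return {
        "order_id": erp_order["external_ref"],
        "tracking_number": erp_order["tracking_number"],
    }


web_order = {
    "order_id": "W-1001",
    "items": [{"sku": "MUG-01", "quantity": 2}],
    "shipping_address": "123 Example St",
}

erp_order = to_erp_order(web_order)
shipped = attach_tracking(erp_order, "TRACK-0001")
print(customer_view(shipped))
```

Note that data flows in both directions: the order moves toward fulfillment, and the tracking number moves back toward the customer.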
Data integration is a hot topic because of the explosion of software systems and digitization. Software systems generally produce operational data, store it internally, and provide back-end support for the operational workflows employees are executing.
Operational systems generally have dissimilar perspectives of the external world and require different data to model those aspects effectively. CRM systems don’t care about bank account numbers, and financial systems don't care about how many candidates the HR team might be recruiting. But this does not imply that data from one system will not be useful in another.
Finding the right tools for operational data integration depends on your data architecture, cloud footprint, and application landscape. Here are some technologies and patterns to consider:
Back to that shipping example: a company needs operational data integration to ensure the raw materials physically show up at the manufacturer’s warehouse, but it also needs to be able to use that data and combine it with other information known to be true. This allows the company to draw deeper insights into its own business.
In much the same way that derivatives in calculus give us an understanding of how functions behave, analytical data gives us an understanding of operational processes. Analytical data is generally represented and accessed in schemas, models, and views using database technology. That modeling and analysis requires high-quality data, applied in the correct context, to accurately represent the state of any business.
Modeled data describes the real world the way a map describes the terrain. “Fitness” is a measure of how well the modeled data, a map derived from historical information, lets the “map holder” navigate the current environment. The key to evolving a company’s map-making capabilities is the correct toolkit and framework for mapping the terrain. Here are some suggestions that we have implemented at Kenway:
Kenway offers a flexible and tailored approach to data integration by guiding clients with a data strategy that aligns with corporate objectives and drives long-term, sustainable value. Based on our experience with a wide array of data integration projects, we generally keep the following in mind when handling data in the analytical plane:
If you’re looking for data integration solutions for your organization, connect with us to discover how to complement your business objectives and maximize return on investment while minimizing operational overhead.