Smart Data Workflows: The Evolution of Unstructured Data Management

Source: Komprise

“Moving data to the cloud can help you optimize your infrastructure, but the bigger value is in leveraging the compute power and data services in the cloud,” remarked Komprise co-founder and COO Krishna Subramanian in our second Cloud Field Day 14 presentation.

Subramanian defines unstructured data as any data we can access as a file or as an object that does not fit neatly into database rows and columns. These data sets are piling up in the data centre, at the edge and in the cloud. A smart strategy to modernize your data storage practice is an unstructured data management solution that looks across all your sources of unstructured data, provides an analytical view of this data, mobilizes it, and lets users index, search and deliver only what data consumers need.

In this session, Subramanian briefly introduced the Komprise SaaS platform and presented our latest product update: Smart Data Workflows. Here’s an overview of what her session covered.

With Smart Data Workflows, IT users can create automated workflows for all the steps required to find the right unstructured data across storage assets, tag and enrich the data and send it to external tools for analysis. This eliminates manual effort in unstructured data management and helps organizations speed time to value from new cloud-native tools.

With Smart Data Workflows, you can deliver only the right file and object data into a data lake, preventing the dreaded data swamp.

Does Komprise alter the data? No. Data remains in native format. When a file is moved to an object store, Komprise does not “munge it up.” We call it file-object duality. Read about it in this post: Why Cloud Native Data Access Matters.

The Power of Global Unstructured Data Visibility

Krishna introduced the Global File Index, which gives you a unified view of your data without moving the data. Today, enterprise IT organizations are flying blind. They don’t know what data is sitting where, who is using the data, how the data is growing and what the data is costing them. End users can’t find the data they need when they need it. With today’s data volumes, organizations must have full visibility to make good decisions. Once you have this data visibility, Komprise makes it actionable. This is where the magic happens.

With billions of files and objects, analytics plus continuous mobilization is essential because data has a lifecycle and data management is not a one-time thing.

Smart Data Workflow Use Cases

Before the demonstration, Krishna reviewed a series of Smart Data Workflow use cases, including:

Legal Hold

  • Search & Curate: Define and execute a custom query to find all data related to a divestiture project with Komprise Deep Analytics and the Komprise Global File Index.
  • Execute & Enrich: Execute an external function to identify PII data and tag it.
  • Cull & Extract: Move sensitive data to an object-locked cloud storage bucket and move the rest to a writable cloud bucket using Komprise Deep Analytics Actions.
  • Manage Lifecycle: Move the data to a lower storage tier for cost savings once the analysis is complete.
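
The Execute & Enrich and Cull & Extract steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the FileRecord type, the regex-based PII check and the bucket names are hypothetical stand-ins, not Komprise APIs, and a real PII detector would be far more thorough.

```python
import re
from dataclasses import dataclass, field

# Hypothetical record for one indexed file; not a Komprise data structure.
@dataclass
class FileRecord:
    path: str
    text: str
    tags: set = field(default_factory=set)

# Toy PII detector: flags anything shaped like a US Social Security number.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")

def tag_pii(records):
    """Execute & Enrich: tag records whose content looks like it holds PII."""
    for rec in records:
        if SSN_PATTERN.search(rec.text):
            rec.tags.add("PII")
    return records

def route(records, locked_bucket="legal-hold-locked",
          writable_bucket="legal-hold-working"):
    """Cull & Extract: map each (assumed) bucket name to the paths bound for it."""
    plan = {locked_bucket: [], writable_bucket: []}
    for rec in records:
        target = locked_bucket if "PII" in rec.tags else writable_bucket
        plan[target].append(rec.path)
    return plan

# Made-up sample data standing in for the results of a curated query.
records = [
    FileRecord("/divestiture/payroll.csv", "Jane Doe, 123-45-6789"),
    FileRecord("/divestiture/roadmap.txt", "Q3 milestones"),
]
plan = route(tag_pii(records))
```

Here `plan` only describes where each file should go; the actual move to object-locked or writable storage would be carried out by the data management layer.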

Genomics Sequencing

  • Search & Curate: Define and execute a custom query to find all data for Project X with Komprise Deep Analytics and the Komprise Global File Index.
  • Execute & Enrich: Execute an external function on Project X data to look for the specific DNA sequence of a mutation and tag matching data as “Mutation XYZ”.
  • Cull & Extract: Move only Project X data tagged with “Mutation XYZ” to the cloud using Komprise Deep Analytics Actions.
  • Manage Lifecycle: Move the data to a lower storage tier for cost savings once the analysis is complete.
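
As a rough sketch of what the external function in the Execute & Enrich step might look like, the Python below scans FASTA-style text files for a motif and returns a tag for each match. The function name, the motif and the sample files are all assumptions made for illustration; the source does not describe the actual function or file formats used.

```python
import tempfile
from pathlib import Path

def find_mutation(paths, motif, tag="Mutation XYZ"):
    """Return {path: tag} for every file whose sequence contains the motif."""
    tags = {}
    for path in paths:
        lines = Path(path).read_text().splitlines()
        # Skip FASTA header lines (they start with '>') and join the rest,
        # so a motif that spans a line break is still found.
        sequence = "".join(
            line.strip() for line in lines if not line.startswith(">")
        ).upper()
        if motif.upper() in sequence:
            tags[str(path)] = tag
    return tags

# Made-up sample files standing in for Project X data.
with tempfile.TemporaryDirectory() as tmp:
    sample_a = Path(tmp, "sample_a.fasta")
    sample_b = Path(tmp, "sample_b.fasta")
    sample_a.write_text(">sample_a\nACGTGA\nTTACAG\n")  # motif spans two lines
    sample_b.write_text(">sample_b\nACGTACGT\n")        # motif absent
    tagged = find_mutation([sample_a, sample_b], motif="GATTACA")
    tagged_names = sorted(Path(p).name for p in tagged)
```

Only the files named in `tagged` would then be candidates for the Cull & Extract step.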

Autonomous Vehicles

  • Search & Curate: Find crash test data related to abrupt stopping of a specific vehicle model with Komprise Deep Analytics and the Komprise Global File Index.
  • Execute & Enrich: Execute an external function to identify and tag data with “Reason = Abrupt Stop”.
  • Cull & Extract: Use Komprise Deep Analytics Actions to move only the related data to the cloud data lakehouse, reducing the time and cost of moving and analyzing unrelated data.
  • Manage Lifecycle: Move the unrelated data to a lower storage tier for cost savings (or delete it) once the analysis is complete.

Smart Data Workflow Demonstration

Komprise CTO Mike Peercy delivered a demo using autonomous vehicle data (because, let’s face it, none of us will be driving in 10 years). The topic was also discussed in this recent webinar with AWS: A Modern Data Strategy for the Automotive Industry.

Here is the flow:

  1. Enable the shares containing autonomous vehicle data for processing.
  2. Show Deep Analytics, the UI for the Global File Index (GFI).
  3. Show a query for crash reports from 2019.
  4. Show a query for the tag “Stopped in traffic”: there will be no results.
  5. Create a plan that analyzes the contents of the 2019 files, using a local function to tag matching files.
  6. Activate the plan to analyze and tag the files.
  7. Show the query for the tag “Stopped in traffic” again: there will now be many results.
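
The query, tag and re-query pattern in the flow above can be mimicked with a small in-memory index. This is a sketch under stated assumptions, not the product's behaviour: the Entry type, the query helper, the local function and the sample log contents are all hypothetical.

```python
from dataclasses import dataclass, field

# Hypothetical index entry; the real Global File Index schema is not shown in the source.
@dataclass
class Entry:
    path: str
    year: int
    content: str
    tags: set = field(default_factory=set)

TAG = "Stopped in traffic"

# Made-up crash reports standing in for the enabled shares (step 1).
index = [
    Entry("/av/2019/crash_001.log", 2019, "vehicle stopped in traffic before impact"),
    Entry("/av/2019/crash_002.log", 2019, "lane departure at speed"),
    Entry("/av/2019/crash_003.log", 2019, "Stopped in traffic, rear collision"),
    Entry("/av/2020/crash_004.log", 2020, "stopped in traffic at junction"),
]

def query(entries, year=None, tag=None):
    """Return entries matching every filter that was supplied."""
    return [
        e for e in entries
        if (year is None or e.year == year) and (tag is None or tag in e.tags)
    ]

def tag_if_stopped(entry):
    """Local function: tag entries whose content mentions stopping in traffic."""
    if "stopped in traffic" in entry.content.lower():
        entry.tags.add(TAG)

reports_2019 = query(index, year=2019)   # step 3: crash reports from 2019
before = query(index, tag=TAG)           # step 4: nothing is tagged yet
for entry in reports_2019:               # steps 5 and 6: run the plan
    tag_if_stopped(entry)
after = query(index, tag=TAG)            # step 7: tagged files now appear
```

Note that the 2020 report is never tagged even though its content matches, because the plan was scoped to the 2019 query results.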

Image source: Shutterstock (1493123198)