Designing Data Platforms to Harness the Power of Fog Computing

Learn how to harness the power of the cloud all the way out to the edge to build the dynamic real-time systems of tomorrow.

Ryan Gross
18 min read · Nov 3, 2021
Image Source: Wikimedia

Note: This article originally appeared in The BI Journal, Volume 26, Issue 1, from TDWI under the title “Unleash the Power of Fog Computing.”

The Current State of Data Processing Systems

For the last five years, enterprises have been scrambling to centralize their analytics processing in the cloud (hence the current $20B+ valuations of Databricks and Snowflake). More recently, all of the major data platform vendors have converged their messaging on the concept of a “lakehouse” architecture, which brings the best attributes of traditional data warehouses to platforms built on data lake storage. For near-real-time scenarios, several streaming platforms have been built as well (e.g., Storm, Spark Streaming, Pulsar, and Flink). These systems also adopt a cloud-based, centralized architecture and assume that data ingestion will direct edge streams to cloud message brokers such as Kafka, Kinesis, or Event Hubs. They have been able to scale to handle petabytes of data, but often at great cost.


Ryan Gross

Emerging Tech & Data Leader at Credera | Interested in how people & machines learn, and how to bring them together.