This is my second blog on new applications built on the SAP HANA platform, and this one discusses one of today's hottest topics: applications for the Internet of Things (IoT).
We are all foreseeing a new connected world where sensors and networking chips are everywhere. In this world, everything is digitized – from light bulbs, to the human body, to industrial equipment – and real-time data continuously flows across the network and through software systems. Applications that harness these data flows can greatly transform the way we live and do business. These IoT applications will allow us to create unprecedented personalized offers and services, innovative business processes and models, and greater business opportunities. For example, adaptive logistics applications can leverage sensors that deliver geospatial information on truck, train, and vehicle movements to help businesses optimize operations at their logistics hubs, with predictive maintenance on capital equipment or real-time fleet management. SAP does provide multiple ready-to-use IoT solutions today, but in this blog
I will focus on how SAP HANA advanced data provisioning technology can help you develop new IoT applications to satisfy your unique business requirements.
Two of the most fundamental characteristics of IoT applications are their continuous data input and the sheer size of the data generated. Additionally, not all data is relevant, so it needs to be selectively processed and stored. SAP HANA advanced data provisioning technology helps your applications cope with these new data flows. I introduced SAP HANA advanced data provisioning technology and how it
adapts to different scenarios in the previous blog.
The following diagram shows different options, such as real-time data replication among
relational databases (SAP Replication Server), complex and high-performance extract, transform, and load operations (SAP Data Services), and data
exchange with the Business Suite (SAP Landscape Transformation), all used together. In this blog, I want to emphasize the following two features:
1. Smart Data Streaming: As mentioned, not all streaming data is useful, but it is often critical to process out-of-range data immediately.
For example, if a patient's pacemaker data suddenly shows an irregular heartbeat, a nurse might need to take immediate action. In such situations we
should process the data before storing it.
SAP HANA provides a smart data streaming component that allows the processing of high-velocity, high-volume event streams in real time.
Specifically, with smart data streaming your application can filter, aggregate, and enrich raw data before storing it in your database.
Depending on the situation, the filters you apply to actively monitor the data stream can be very simple or can include complex sets of
rules that aggregate or correlate incoming events. In this way, you can truly take advantage of live streams of data and implement continuous
intelligence solutions.
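To make the filter-and-aggregate idea concrete, here is a minimal sketch in plain Python. This is not smart data streaming's actual CCL language or API, just an illustration of the pattern: out-of-range events are acted on immediately, while in-range readings are summarized per window so the database only stores one row per batch. The thresholds, window size, and field names are all hypothetical.

```python
from statistics import mean

# Hypothetical thresholds and window size for illustration only
LOW_BPM, HIGH_BPM = 50, 120
WINDOW_SIZE = 5  # aggregate every 5 in-range readings into one summary row

def process_stream(readings, store, alert):
    """Filter and aggregate raw readings before persisting them.

    Out-of-range values trigger an immediate alert and are stored
    individually; in-range values are aggregated per window so the
    store only receives one summary per WINDOW_SIZE readings.
    """
    window = []
    for bpm in readings:
        if bpm < LOW_BPM or bpm > HIGH_BPM:
            alert(bpm)  # act on the anomaly before (and besides) storing it
            store({"type": "anomaly", "bpm": bpm})
        else:
            window.append(bpm)
            if len(window) == WINDOW_SIZE:
                store({"type": "summary", "avg_bpm": round(mean(window), 1)})
                window.clear()

stored, alerts = [], []
process_stream([72, 75, 70, 74, 73, 160, 71], stored.append, alerts.append)
```

Here the seven raw readings collapse to just two stored rows (one window summary and one anomaly), while the alert fires the moment the 160 bpm reading arrives, which is exactly the "process before storing" behavior described above.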
2. SAP Remote Data Sync: In some situations, sensors are not connected to the Internet all the time, and applications have to find ways to store data at
the edge of the network and periodically synchronize it with other data stores for consistency. For example, how can we connect large drilling devices in
underground wells when the network is unreliable? So, IoT applications might need to interact not only with incoming streams but
also with a number of different applications that process streaming data locally and store it in remote data repositories.
Efficiently moving data between SAP HANA and these other applications or data stores is critical. The remote data sync capability was recently released in
SPS 10. This feature leverages SAP SQL Anywhere/UltraLite databases to capture data at thousands of locations at the edge of the network and provides
transaction-consistent data synchronization to the SAP HANA database. In this way your IoT application can easily support an offline operation mode.
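The store-and-forward pattern behind this capability can be sketched in a few lines of Python. To be clear, this is a toy model, not how SQL Anywhere/UltraLite synchronization actually works internally: the local list stands in for the edge database, `upload` stands in for the sync call to HANA, and the acknowledged offset models the idea that a failed (offline) sync loses nothing and can simply be retried.

```python
class EdgeNode:
    """Toy store-and-forward buffer mimicking edge capture with periodic sync.

    Readings are appended to a local log; sync() pushes only rows the
    central store has not yet confirmed, and is safe to retry after an
    offline attempt because the acknowledged offset advances only on success.
    """
    def __init__(self):
        self.local_log = []  # stands in for a local edge database
        self.acked = 0       # index just past the last row confirmed upstream

    def capture(self, reading):
        self.local_log.append(reading)

    def sync(self, upload, online=True):
        if not online:
            return 0         # nothing lost: rows stay in the local log
        pending = self.local_log[self.acked:]
        upload(pending)      # one transaction-like batch to the central store
        self.acked = len(self.local_log)
        return len(pending)

central = []
node = EdgeNode()
node.capture({"well": "A7", "pressure": 101.3})
node.sync(central.extend, online=False)  # network down: data kept locally
node.capture({"well": "A7", "pressure": 99.8})
node.sync(central.extend)                # back online: both rows uploaded
```

After the second sync, both readings have reached the central store in order, even though the first attempt happened while the device was offline, which is the behavior an IoT application needs from an edge data store.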
HANA scenarios for IoT
There are typically four types of IoT scenarios. Let me propose this taxonomy:
M2A -> Machine to Analytic -> When a "Thing" or "Machine" needs to be monitored, it typically requires taking a look at sensor information. For example, in Oil & Gas, you would monitor the pressure of a well by reading the sensors of parts that regulate pressure. In automotive (let's take race cars), you would monitor the engine temperature and the status of electrical equipment such as lights. There are many scenarios where you need real-time monitoring of a device. I would even argue that the "Find My Phone" app for an iPhone is a simple version of this: the thing is the phone, and the analytic is a map that tells you where your phone is. This doesn't require storing large amounts of data in Hadoop or looking back at historical data. The value is seeing what's going on now. This could even be something like Dropcam, where the machine is a video camera and the analytic is a "screen" for viewing a video. There are lots of home security applications here. There are also modern analytical use cases leveraging predictive techniques for fault detection or predictive maintenance, but you get the idea: it could be operational analytics to see what's going on now or predictive analytics to forecast future patterns.
M2M -> Machine to Machine -> This is the scenario people talk about the most. Imagine your alarm clock going off and triggering your coffee maker and toaster to start making breakfast. Very cool! Very consumer oriented. However, this is not new. In reality, machines have been talking to each other (without human intervention) for decades. In manufacturing, back in the 1980s, there was always a conveyor belt and a mechanical arm. If the mechanical arm broke, the conveyor belt stopped. If the belt broke, the arm stopped. So there you go: a machine-to-machine scenario. It wasn't using the internet, but does anyone care? This was two machines talking to each other. What's different now, with the introduction of the internet and some standardization, is that the type and amount of data you can collect is vastly enriched, and it is much cheaper for two machines to talk to each other, which makes the technology affordable enough for consumer scenarios (like the alarm clock/toaster scenario). Cheaper is a result of adoption and scale as well as Internet connectivity. To put this into perspective, I'll ask the question you've all heard before: "If a tree falls in a forest and no one is there to hear it, does it make any sound?" In the past, the fact that two machines talked was known only to those two machines and possibly the operators on the shop floor. Now, everyone from the floor operator to the business executive can be made aware of the event (possibly with different alert levels) and suitably adapt their operations and processes.
M2D -> Machine to Data Lake -> This one, I believe, is where most people go with IoT. They believe that machines are collecting a ton of data, and if they don't capture and store that data, they won't be able to derive value in the future because they missed collecting all of it. Sometimes the idea and value of capturing the data is well understood up front, but more often, I find that people don't identify the value of capturing information up front and therefore spend a ton of time and money building out data lakes to look for future value. It's a "fear-driven" approach. I do find that a lot of folks who buy Hadoop do this for their IoT-driven big data scenarios. Then they unleash data scientists and try to find the magic value. Usually, something meaningful can be derived, but it takes time.
M2P -> Machine to Process -> The last scenario, where I believe there is a ton of value, is when machines can communicate directly with business processes and be part of that process. What if your washing machine broke and could trigger a warranty claim, or a service technician call if you're not in warranty? Nine times out of ten, the "process side" of this is already handled in an SAP system. The question is connecting the machine to trigger the business process action.
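The washing-machine example can be sketched as a tiny routing function. This is purely illustrative: the event fields, the `warranty_end` parameter, and the returned action dict are hypothetical stand-ins for whatever the backend business system actually expects, but they show the shape of "machine event in, process action out".

```python
from datetime import date

def handle_fault(event, warranty_end, today=None):
    """Map a machine fault event to a business-process action.

    Hypothetical routing logic: in-warranty faults open a warranty
    claim; out-of-warranty faults schedule a paid technician visit.
    The returned dict stands in for a call into the backend system.
    """
    today = today or date.today()
    action = "warranty_claim" if today <= warranty_end else "service_call"
    return {"action": action, "device": event["device"], "fault": event["code"]}

event = {"device": "washer-42", "code": "E18"}
handle_fault(event, warranty_end=date(2026, 1, 1), today=date(2025, 6, 1))
# in warranty on 2025-06-01, so this opens a warranty claim
```

The interesting part is not the two-line rule itself but where it runs: once the machine can emit the fault event, the decision and the downstream process (claim, appointment, parts order) can live in the existing business system.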
Now, some IoT scenarios may have elements of multiple categories. For example, you may want to do real-time analytics off a car engine (M2A), store the data (M2D) to look for maintenance failure patterns over time, trigger a service appointment if the engine fails (M2P), and finally display a light on the car dashboard indicating that the engine died (M2M). So, some scenarios fit just one of the above categories, but sometimes a scenario will involve multiple approaches.