I am excited to post my first blog as part of Neuralytix. As you would expect from a Neuralytix analyst, I hope to provoke conversation, questions, and (dare I say) arguments.
I will focus on Big Data, cloud computing, and the Internet of Things. My analysis will:
- provide actionable advice to companies that want to leverage the value data and technology can bring (either as a user or as a solution provider); and
- highlight applications of technologies in specific verticals and industries, including the economic opportunities as well as the barriers to entry and implementation.
Simply put, I want to separate the hype from the actual information you need to leverage the technology to your advantage.
Bringing Common Sense to Big (Common) Data
So, onto Big Data. I love the concept, but there remains a lot of misunderstanding about Big Data. I say this because the concept of Big Data has become so “big” (pun intended) that it is very easy to get lost in all the talk about it. I believe strongly (and that is putting it mildly) that users must understand what is needed to maximize the use of any Big Data technology before investing in it.
So, as I said earlier, my job is to consider the impact of technology on enterprises and enterprise users. From that perspective, these are the challenges that I see before any Big Data solution:
- No matter what data is collected, users must ensure that their data is clean. In other words, does your dataset consist of records that are complete, accurate, and representative of the “event” you are trying to capture? The fact that data is growing at an alarming rate should not be a surprise. At the end of the day, a user will most likely not use all that stored data. It is very possible that TBs of data to be analyzed can be reduced to MBs. And MBs do not, technically, fall under the Big Data criteria (thus failing the “volume” test for Big Data).
- Data can reveal patterns that guide how to answer questions related to the problem you are trying to solve. However, it is also important to know the context behind those patterns. Data is always generated by an underlying process, within the specific timeframe that is being examined. If that context is not understood, then any conclusions drawn can lead you down the wrong path. (Therefore, the “variety” of data cannot be properly ascertained.)
- Getting insights in “real-time” may not be applicable in a majority of cases. The last criterion of Big Data is “velocity”, and velocity is key when making business decisions and adding value. However, the speed at which those decisions are made is dictated by the process of collecting, cleaning up, understanding, and analyzing the data before any legitimate decisions can be made.
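To make the first challenge concrete, here is a minimal Python sketch of what checking for “clean” data could look like. It is only an illustration under stated assumptions: the field names (event_id, timestamp, amount) and the rule that an amount must be a non-negative number are hypothetical, and real completeness and accuracy rules depend entirely on the process that generates your data.

```python
# Hypothetical required fields; a real dataset defines its own.
REQUIRED_FIELDS = ("event_id", "timestamp", "amount")


def is_complete(record):
    """A record is complete when every required field is present and non-empty."""
    return all(record.get(field) not in (None, "") for field in REQUIRED_FIELDS)


def is_accurate(record):
    """Assumed accuracy rule for this sketch: amount is a non-negative number."""
    amount = record.get("amount")
    return isinstance(amount, (int, float)) and amount >= 0


def clean(records):
    """Keep only the records that pass both checks."""
    return [r for r in records if is_complete(r) and is_accurate(r)]


# Illustrative sample data, not taken from any real system.
sample = [
    {"event_id": "e1", "timestamp": "2014-01-01T00:00:00", "amount": 10.0},
    {"event_id": "e2", "timestamp": None, "amount": 5.0},                     # incomplete
    {"event_id": "e3", "timestamp": "2014-01-02T00:00:00", "amount": -3.0},   # inaccurate
    {"event_id": "", "timestamp": "2014-01-03T00:00:00", "amount": 1.0},      # incomplete
]

cleaned = clean(sample)
print(f"{len(cleaned)} of {len(sample)} records survive cleaning")
```

Even in this toy case, most of the collected records drop out once the checks are applied, which is the same effect that can shrink TBs of stored data down to MBs worth analyzing.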
I am not suggesting that Big Data is not a worthy endeavor; it is! The generation and variety of data have exploded, and there is a need to process data in a manner that can keep up with that explosion.
However, it is important to remember that faster collection, analysis, and visualization of data will not, on their own, solve the ultimate challenge of Big Data. To solve business problems in light of the increasing variety, velocity, and volume of data, the people and processes that generate Big Data will need to change to fully extract value from it.
How they change has yet to be fully explored; and so, my work here at Neuralytix begins. What are your thoughts?