I just Googled “Big Data” and got 20,000,000 results. About two years ago there was virtually nothing, and now there is huge, unprecedented hype.

Big data is not a single technology or a shortlist of vendors. It’s a loose collection of evolving tools, techniques and talent. In practice, big data can be divided into three categories: storage, processing and analytics.

1) Storage

Large-scale data processing operations access data in ways that traditional file systems were not designed for. Data tends to be written and read in large batches, multiple megabytes at a time. Efficiency takes priority over features such as directories that help organize information in a user-friendly way.

“Cloud” is a very vague term, but there has been a real change in the availability of computing resources. Where buying or long-term leasing a physical machine used to be the norm, it is now much more common to rent computers that run as virtual instances. This makes it economical for providers to offer very short-term rentals of flexible numbers of machines, which is ideal for many data processing applications.

Enterprise data is traditionally stored in relational databases, which are structured as tables that join with other tables in carefully defined ways. Big data strains this approach because there is too much data to fit easily into big enterprise databases, and many uses require faster processing and analysis. Big data storage differs significantly from relational databases because it stores data that has not been mapped to a particular format or structure. Because the data is not confined to such a structure, it is available for use much more rapidly.
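To make the idea of storing data without a predefined structure concrete, here is a minimal Python sketch. It assumes raw events are kept as JSON text lines and that structure is applied only when the data is read. The sample records, the field names and the `read_with_schema` helper are all invented for illustration and do not come from any particular product.

```python
import json

# Raw events are stored exactly as they arrive, with no fixed schema.
# These sample records are invented for illustration.
RAW_EVENTS = [
    '{"user": "a1", "page": "/home", "ms": 120}',
    '{"user": "b2", "page": "/pricing", "referrer": "newsletter"}',
    '{"user": "a1", "action": "click", "target": "signup"}',
]


def read_with_schema(raw_lines, fields):
    """Apply a structure at read time (hypothetical helper).

    Keeps only the requested fields and fills in None where a
    record does not have them.
    """
    for line in raw_lines:
        record = json.loads(line)
        yield {field: record.get(field) for field in fields}


# Ask for a "page view" shape only when we need it; records without
# a page (such as the click event) are filtered out afterwards.
page_views = [
    row for row in read_with_schema(RAW_EVENTS, ["user", "page"]) if row["page"]
]
print(page_views)
```

The point of the sketch is that nothing had to be decided about columns or tables when the events were written. A different question later on can read the same raw lines with a different list of fields.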

2) Processing

Processing means mastering the proper tools for efficient analysis under different conditions (different data sets, varied business environments and so on). Although we web analysts are undoubtedly experts at leveraging web analytics tools, most of us lack broader expertise in business intelligence and statistical analysis tools such as Tableau, SAS and Cognos.

Processing big data means collecting it and moving it into storage or other systems in an organized way. Big data needs to be distributed across a number of different hardware locations and is generally not in a predefined format, so it requires its own approach to processing.

Batch processing works with data that sits in a constellation of database clusters spread across hundreds or thousands of different pieces of hardware. A number of frameworks execute batch processing of data, including MapReduce and Spark.

Real-time processing works on data that is “in motion,” potentially at or near the point of data capture. Think of a marketer being able to process behavior data from a website visitor in the moment, in order to serve that same visitor relevant ads, promotions or content throughout his or her site visit.
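The difference between batch and real-time processing can be sketched in a few lines of plain Python. This is only a toy illustration of the map-then-reduce idea and of processing events as they arrive. It does not use MapReduce or Spark themselves, and the visitor data and function names are invented.

```python
from collections import Counter, defaultdict
from functools import reduce

# Invented sample data: (visitor_id, page) pairs, split into two
# partitions to stand in for data spread across separate machines.
PARTITIONS = [
    [("v1", "/home"), ("v2", "/home"), ("v1", "/pricing")],
    [("v3", "/home"), ("v2", "/pricing"), ("v1", "/signup")],
]


def map_partition(events):
    """Map step: count page views within one partition."""
    return Counter(page for _, page in events)


def merge_counts(left, right):
    """Reduce step: combine partial counts from two partitions."""
    return left + right


# Batch: process every stored partition, then merge the partial results.
batch_counts = reduce(merge_counts, map(map_partition, PARTITIONS), Counter())


def stream_profiles(event_stream):
    """Real time: update a visitor's history as each event arrives."""
    seen = defaultdict(list)
    for visitor, page in event_stream:
        seen[visitor].append(page)
        # A marketer could choose an ad or promotion at this point.
        yield visitor, list(seen[visitor])


live_updates = list(
    stream_profiles(event for part in PARTITIONS for event in part)
)
print(batch_counts)
print(live_updates[-1])
```

In the batch half, nothing is known until all partitions have been counted and merged. In the streaming half, an up-to-date view of each visitor is available after every single event, which is what makes in-the-moment targeting possible.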

3) Analytics

Analytics means developing expertise in analyzing unstructured data such as social media, call center logs and emails. From a processing perspective, the goal should be to identify and master the most appropriate tools in this space, whether social media sentiment analysis tools or more sophisticated platforms.

Adoption of specialized big data tools is still growing. Yet many analytics techniques can and do make use of big data stores, generally by transforming the data into structured formats first. One area of growing interest in big data analysis is machine learning, which uses software to find patterns within large amounts of data in ways that don’t rely on explicit programming and can surpass human capabilities. There is also a clear opportunity for digital analysts to develop expertise in dashboarding and, more broadly, data visualization.
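As a rough illustration of transforming unstructured data into a structured format before analysis, the Python sketch below turns free-text call center notes into rows and then counts topics. The notes, the dates and the topic list are all invented, and simple keyword matching stands in for the far more capable tools a real project would use.

```python
import re
from collections import Counter

# Invented sample of unstructured call center notes.
NOTES = [
    "2000-01-01 customer unhappy about billing, asked for refund",
    "2000-01-01 question about shipping times",
    "2000-01-02 billing error reported, refund issued",
]

# Hypothetical list of topics we care about.
TOPICS = {"billing", "refund", "shipping"}


def to_record(note):
    """Turn one free-text note into a structured row."""
    date, text = note.split(" ", 1)
    words = set(re.findall(r"[a-z]+", text.lower()))
    return {"date": date, "topics": sorted(words & TOPICS)}


# Once the notes are rows, ordinary analytics techniques apply.
records = [to_record(note) for note in NOTES]
topic_counts = Counter(topic for row in records for topic in row["topics"])
print(records)
print(topic_counts)
```

Once the notes are in this shape, they can be loaded into a table, charted on a dashboard or fed into a machine learning tool like any other structured data.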

DataMaestro

Big Data For Digital Marketers
