As a term, big data describes large volumes of data, both structured and unstructured. This data can be analyzed for insights that lead to better decisions and strategic business moves, and many organizations use big data technologies to collect, store, and analyze massive amounts of it.
Having defined big data, we now turn to the sources from which it is captured.
For example, every mouse click on a website can be captured in web log files and analyzed to better understand shoppers’ buying behaviors and to influence their shopping by dynamically recommending products.
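As a rough illustration of the web-log idea, the sketch below groups clicks by visitor and suggests the products most often viewed alongside a given one. The tab-separated log format, the sample data, and the helper names (`parse_clicks`, `co_view_counts`, `recommend`) are all assumptions made for this example, not a real web server format or recommendation engine.

```python
from collections import Counter, defaultdict
from itertools import permutations

# Hypothetical click log: one "visitor_id<TAB>product_id" entry per click.
LOG_LINES = [
    "v1\tlaptop",
    "v1\tmouse",
    "v2\tlaptop",
    "v2\tmouse",
    "v2\tkeyboard",
    "v3\tlaptop",
    "v3\tkeyboard",
    "v4\tmouse",
]

def parse_clicks(lines):
    """Group the distinct products each visitor clicked on."""
    viewed = defaultdict(set)
    for line in lines:
        visitor, product = line.strip().split("\t")
        viewed[visitor].add(product)
    return viewed

def co_view_counts(viewed):
    """Count, for each product, how many visitors also viewed each other product."""
    counts = defaultdict(Counter)
    for products in viewed.values():
        for a, b in permutations(sorted(products), 2):
            counts[a][b] += 1
    return counts

def recommend(product, counts, k=2):
    """Return up to k products most often viewed alongside `product`."""
    # Sort by descending count, then by name so ties are deterministic.
    ranked = sorted(counts[product].items(), key=lambda kv: (-kv[1], kv[0]))
    return [name for name, _ in ranked[:k]]

counts = co_view_counts(parse_clicks(LOG_LINES))
print(recommend("mouse", counts))
```

A production system would read the server's actual log format and work at far larger scale, but the shape of the analysis (parse, group by visitor, count co-occurrences, rank) is the same.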
Social media sources such as Facebook and Twitter generate tremendous amounts of comments and tweets. This data can be captured and analyzed to understand.
Machines such as smart meters generate data. These meters continuously stream data about electricity, water, or gas consumption that can be shared with customers and combined with pricing plans to motivate customers to move some of their energy consumption.
There is a tremendous amount of geospatial data, such as that created by cell phones, that can be used by applications like Foursquare to help you know the locations of friends and to receive offers from nearby stores and restaurants.
Image, voice, and audio data can be analyzed for applications such as facial recognition systems in
BIG DATA ANALYTICS
Big data analytics is the often complex process of examining large and varied data sets to uncover information, including hidden patterns, unknown correlations, market trends, and customer preferences, that can help organizations make informed business decisions. It enables big data analysts, data scientists, predictive modelers, statisticians, and other analytics professionals to analyze growing volumes of structured transaction data, plus other forms of data that are often left untapped by conventional business intelligence and analytics programs.
Many of the organizations that collect, process, and analyze big data turn to NoSQL databases, as well as Hadoop and its companion tools, including:
- YARN: a cluster management technology and one of the key features in second-generation Hadoop.
- MapReduce: a software framework that allows developers to write programs that process massive amounts of unstructured data in parallel across a distributed cluster of processors or stand-alone computers.
- Spark: an open source, parallel processing framework that enables users to run large-scale data analytics applications across clustered systems.
- HBase: a column-oriented key/value data store built to run on top of the Hadoop Distributed File System.
- Hive: an open source data warehouse system for querying and analyzing large data sets stored in Hadoop files.
- Kafka: a distributed publish/subscribe messaging system designed to replace traditional message brokers.
- Pig: an open source technology that offers a high-level mechanism for the parallel programming of MapReduce jobs executed on Hadoop clusters.
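To make the MapReduce idea in the list above more concrete, here is a minimal single-process sketch of its map, shuffle, and reduce phases, applied to counting words. It is plain Python and does not use the Hadoop API; the function names and sample documents are assumptions for illustration. On a real cluster, the framework would run the map and reduce steps in parallel on many machines.

```python
from collections import defaultdict

def map_phase(document):
    """Emit a (word, 1) pair for every word in one input record."""
    return [(word.lower(), 1) for word in document.split()]

def shuffle_phase(pairs):
    """Group all emitted values by key, as the framework does between phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(key, values):
    """Combine all values for one key into a single result."""
    return key, sum(values)

def word_count(documents):
    """Run the three phases in sequence over a list of text records."""
    mapped = [pair for doc in documents for pair in map_phase(doc)]
    grouped = shuffle_phase(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())

# Hypothetical input records standing in for files on a cluster.
docs = ["big data big insights", "data drives decisions"]
result = word_count(docs)
print(result)
```

Because each call to `map_phase` looks at only one record and each call to `reduce_phase` looks at only one key, both steps can be spread across machines without coordination, which is what lets the model scale.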
Once the data is ready, it can be analyzed with the software commonly used for advanced analytics processes. That includes tools for data mining, which sift through data sets in search of patterns and relationships; predictive analytics, which builds models to forecast customer behavior and other future developments; and machine learning, which uses algorithms to analyze large data sets.
Finally, we may ask why big data is important now.
The importance of big data doesn’t revolve around how much data you have, but around what you do with it. You can take data from any source and analyze it to find answers that enable:
- cost reductions,
- time reductions,
- new product development and optimized offerings,
- smart decision making.
When you combine big data with high-powered analytics, you can accomplish business-related tasks such as:
- Determining root causes of failures and defects in near-real time.
- Generating coupons at the point of sale based on the customer’s buying habits.
- Recalculating entire risk portfolios in minutes.
- Detecting fraudulent behavior before it affects your organization.