Big Data refers to data sets that are too large or complex for traditional data-processing application software to handle adequately. Data with many cases offers greater statistical power, while data with higher complexity may lead to a higher false discovery rate. The term describes data that is high in volume, velocity, and/or variety; that requires new technologies and techniques to capture, store, and analyze; and that is used to enhance decision making, provide insight and discovery, and support and optimize processes.
Types of Big Data
Structured data is data that has a defined length and format. It is highly organized information that can be readily stored in a database and retrieved with simple search algorithms. It is entered in specific fields that hold textual or numeric values, and these fields often have a maximum or expected size defined. In addition to this firm structure, structured data follows strict rules about how it can be accessed. For instance, the employee table in a company database is structured: employee details, job positions, salaries, and so on are stored in an organized manner. Examples of structured data include numbers, dates, and strings (groups of words and numbers).
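The employee table mentioned above can be sketched with Python's built-in sqlite3 module. This is only a minimal illustration: the column names, sizes, and rows are made up for the example, and SQLite itself records declared sizes such as VARCHAR(50) without enforcing them.

```python
import sqlite3

# Hypothetical schema and rows, for illustration only.
conn = sqlite3.connect(":memory:")

# Each column has a declared type; text columns also declare an expected size.
conn.execute(
    """
    CREATE TABLE employee (
        id        INTEGER PRIMARY KEY,
        name      VARCHAR(50) NOT NULL,
        position  VARCHAR(30) NOT NULL,
        salary    NUMERIC NOT NULL,
        hired_on  DATE NOT NULL
    )
    """
)

rows = [
    (1, "Ada", "Engineer", 95000, "2021-03-01"),
    (2, "Ben", "Analyst", 70000, "2022-07-15"),
    (3, "Cy", "Engineer", 88000, "2023-01-09"),
]
conn.executemany("INSERT INTO employee VALUES (?, ?, ?, ?, ?)", rows)

# Because every record has the same fields, a simple query is enough
# to find what we need.
engineers = conn.execute(
    "SELECT name, salary FROM employee WHERE position = ? ORDER BY salary DESC",
    ("Engineer",),
).fetchall()
print(engineers)
```

The point of the sketch is that the query can name fields directly (position, salary) because the structure is known in advance.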
Unstructured data is data that doesn’t fit neatly into the traditional row-and-column structure of relational databases, which makes it difficult and time-consuming to process and analyze. Examples of files generally considered unstructured data include books, some health records, satellite images, Adobe PDF files, a warranty request created by a customer service representative, notes in a web form, objects from presentations, blogs, text messages, Word documents, videos, photos, and other images. These files are not organized beyond being placed in a file system, object store, or other repository.
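To see why unstructured data takes more effort to process, consider a free-text warranty note like the one a customer service representative might write. The sketch below uses Python's re module to pull out two details; the note text and the patterns (an ISO-style date and a serial number starting with "SN-") are assumptions made up for this example, and real notes would rarely be this regular.

```python
import re

# Hypothetical free-text note; there are no fields, only prose.
note = (
    "Customer called on 2023-11-04 about a cracked screen. "
    "Serial no. SN-48213. Wants a replacement, says purchase was in October."
)

# With no schema to rely on, we have to guess at patterns in the text.
date_match = re.search(r"\b\d{4}-\d{2}-\d{2}\b", note)
serial_match = re.search(r"\bSN-\d+\b", note)

extracted = {
    "date": date_match.group(0) if date_match else None,
    "serial": serial_match.group(0) if serial_match else None,
}
print(extracted)
```

Details such as "purchase was in October" are not captured at all, which hints at how much harder unstructured data is to analyze than a table with a purchase-date column.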
Semi-structured data contains semantic tags but does not conform to the structure associated with typical relational databases, a definition used by Global Product Marketing for Management at SAS. Although such data has not been classified under a particular repository or database, it still carries vital information or tags that separate the individual elements within it. Examples include email, XML, and other markup languages.
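As a rough illustration of how tags separate individual elements without imposing a fixed table layout, the sketch below parses a small XML snippet with Python's standard xml.etree.ElementTree module. The document content and tag names are invented for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML: both records are tagged, but they do not share
# the same set of fields, so they would not fit one fixed table row.
xml_text = """
<messages>
  <email id="1">
    <from>ana@example.com</from>
    <subject>Quarterly report</subject>
    <body>Please find the numbers attached.</body>
  </email>
  <email id="2">
    <from>li@example.com</from>
    <body>No subject on this one.</body>
    <attachment>notes.txt</attachment>
  </email>
</messages>
"""

root = ET.fromstring(xml_text.strip())

# Turn each <email> element into a dict keyed by its child tag names.
records = []
for email in root.findall("email"):
    records.append({child.tag: child.text for child in email})

# Tags let us ask for a field by name, with a fallback when it is absent.
subjects = [record.get("subject", "(none)") for record in records]
print(subjects)
```

The tags make each element easy to find, yet the second record lacks a subject and adds an attachment, which is the kind of variation a relational table would not accept without changes.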