What is Big Data?

The term “Big Data” (or massive data) refers to data sets so large that they are difficult to handle with conventional tools. The data often come from multiple sources and are recorded so that they can be used and analyzed later, without a predetermined goal and with no time limit.

Two developments drove the emergence of Big Data. First, the growth of the Internet and of the number of connected objects has contributed to the creation of large volumes of data. Second, advances in storage and computing capacity allow that data to be processed at ever lower cost.

In principle, Big Data is defined by four characteristics: volume, velocity, variety, and value.

Some examples

The uses vary widely. Examples include the analysis of crowd movements using cell phone data to facilitate the delivery of aid after the 2010 earthquake in Haiti, the adaptation of President Obama’s speeches during the 2012 campaign based on reactions posted on Twitter, and the identification of the areas and hours in a city where crimes are most likely to be committed, in order to better allocate resources.

Another famous example is the US retailer Target, which can identify customers who are expecting a child in order to offer them products for infants. To do this, the company analyzed millions of loyalty-card records from women who had opened a baby shower gift registry. It observed, for example, that they began to buy unscented lotions at about three months of pregnancy, and certain dietary supplements at a different stage of pregnancy. By applying these criteria (combined with others) to all its customers, Target can identify pregnant women very effectively.
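To make the idea of "applying criteria to all customers" concrete, here is a minimal sketch of a purchase-based score. It is not Target's actual model: the category names, weights, and threshold are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict

# Hypothetical signal weights and cut-off; illustrative only,
# not taken from any real retailer's model.
SIGNAL_WEIGHTS = {"lotion": 2.0, "dietary_supplement": 1.5}
THRESHOLD = 3.0


def score_customers(purchases):
    """Sum the weights of the distinct signal categories each customer bought.

    purchases: iterable of (customer_id, product_category) pairs.
    Customers with no signal purchases do not appear in the result.
    """
    seen = defaultdict(set)
    for customer_id, category in purchases:
        if category in SIGNAL_WEIGHTS:
            seen[customer_id].add(category)
    return {cid: sum(SIGNAL_WEIGHTS[c] for c in cats) for cid, cats in seen.items()}


def flag_customers(purchases, threshold=THRESHOLD):
    """Return the sorted ids of customers whose score reaches the threshold."""
    scores = score_customers(purchases)
    return sorted(cid for cid, score in scores.items() if score >= threshold)


purchases = [
    ("c1", "lotion"),
    ("c1", "dietary_supplement"),
    ("c2", "lotion"),
    ("c3", "bread"),
]
print(flag_customers(purchases))
```

A real system would likely learn such weights from historical data and take the timing of purchases into account. The point of the sketch is only that a few innocuous-looking purchases, once combined, can single out a customer.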

Where is data protection in all of this?

Big Data poses a real challenge for data protection because it puts many basic principles at risk. Data protection requirements apply only to the processing of personal data, that is, data relating to an identified or identifiable person. Anonymous data are therefore excluded. The problem is that when anonymous data are combined with other data, the individuals behind them can quickly become identifiable.
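The re-identification risk can be illustrated with a small sketch. It assumes two made-up data sets: an "anonymous" one without names, and a public one with names. They share a few attributes (here postal code, birth year, and sex). All records, field names, and the `link_records` helper are hypothetical.

```python
def link_records(anonymous_rows, public_rows, keys):
    """Map the index of each anonymous row to a name when exactly one
    public record shares the same values for the given keys."""
    index = {}
    for row in public_rows:
        index.setdefault(tuple(row[k] for k in keys), []).append(row["name"])

    matches = {}
    for i, row in enumerate(anonymous_rows):
        candidates = index.get(tuple(row[k] for k in keys), [])
        if len(candidates) == 1:  # a unique match re-identifies the person
            matches[i] = candidates[0]
    return matches


# Made-up "anonymous" records: no names, but shared attributes remain.
anonymous = [
    {"zip": "1000", "birth_year": 1980, "sex": "F", "diagnosis": "asthma"},
    {"zip": "1200", "birth_year": 1975, "sex": "M", "diagnosis": "diabetes"},
]

# Made-up public records that include names.
public = [
    {"name": "Alice Example", "zip": "1000", "birth_year": 1980, "sex": "F"},
    {"name": "Bob Example", "zip": "1200", "birth_year": 1975, "sex": "M"},
    {"name": "Carl Example", "zip": "1200", "birth_year": 1975, "sex": "M"},
]

reidentified = link_records(anonymous, public, keys=("zip", "birth_year", "sex"))
print(reidentified)
```

In this example the first record is linked to a single named person, while the second stays ambiguous because two public records share the same attributes. The more attributes a data set keeps, the fewer such ambiguous cases remain.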

Data may be processed only for the purpose for which they were collected, and they must be destroyed once that purpose has been achieved. Big Data relies instead on using data for other purposes, and even on retaining data for future uses whose goal has not yet been determined.

The person concerned must be informed of the processing of their data, including any transmission to third parties, which requires clear and accurate information on the terms and purposes of the processing. These requirements are difficult to meet when processing Big Data. Keeping the data accurate and guaranteeing a right of access can also be problematic.

This does not mean that data protection standards do not apply or should be changed. Anyone who collects and analyzes Big Data must simply act in good faith and with transparency. They must also take the necessary measures to keep the data as anonymous as possible and to keep it secure.