
Question: 476286 - FGV - 2016 - IBGE - Analyst - Project Analysis, FGV - 2016 - IBGE - Analyst - Systems Analysis - Application Development - Web Mobile, FGV - 2016 - IBGE - Analyst - Systems Analysis - Operational Support, FGV - 2016 - IBGE - Analyst - Biodiversity Analysis, FGV - 2016 - IBGE - Analyst - Auditing, FGV - 2016 - IBGE - Analyst - Accounting, FGV - 2016 - IBGE - Analyst - Instructional Design, FGV - 2016 - IBGE - Analyst - Corporate Education, FGV - 2016 - IBGE - Analyst - Journalism - Social Media, FGV - 2016 - IBGE - Analyst - Geoprocessing, FGV - 2016 - IBGE - Analyst - Agronomic Engineering, FGV - 2016 - IBGE - Analyst - Civil Engineering



The backlash against big data


Big data refers to the idea that society can do things with a large body of data that weren't possible when working with smaller amounts. The term was originally applied a decade ago to massive datasets from astrophysics, genomics and internet search engines, and to machine-learning systems (for voice-recognition and translation, for example) that work well only when given lots of data to chew on. Now it refers to the application of data-analysis and statistics in new areas, from retailing to human resources. The backlash began in mid-March, prompted by an article in Science by David Lazer and others at Harvard and Northeastern University. It showed that a big-data poster-child—Google Flu Trends, a 2009 project which identified flu outbreaks from search queries alone—had overestimated the number of cases for four years running, compared with reported data from the Centres for Disease Control (CDC). This led to a wider attack on the idea of big data.

The criticisms fall into three areas that are not intrinsic to big data per se, but endemic to data analysis, and have some merit. First, there are biases inherent to data that must not be ignored. That is undeniably the case. Second, some proponents of big data have claimed that theory (ie, generalisable models about how the world works) is obsolete. In fact, subject-area knowledge remains necessary even when dealing with large data sets. Third, the risk of spurious correlations—associations that are statistically robust but happen only by chance—increases with more data. Although there are new statistical techniques to identify and banish spurious correlations, such as running many tests against subsets of the data, this will always be a problem.

There is some merit to the naysayers' case, in other words. But these criticisms do not mean that big-data analysis has no merit whatsoever. Even the Harvard researchers who decried big data "hubris" admitted in Science that melding Google Flu Trends analysis with CDC's data improved the overall forecast—showing that big data can in fact be a useful tool. And research published in PLOS Computational Biology on April 17th shows it is possible to estimate the prevalence of the flu based on visits to Wikipedia articles related to the illness. Behind the big data backlash is the classic hype cycle, in which a technology's early proponents make overly grandiose claims, people sling arrows when those promises fall flat, but the technology eventually transforms the world, though not necessarily in ways the pundits expected. It happened with the web, and television, radio, motion pictures and the telegraph before it. Now it is simply big data's turn to face the grumblers.

(From explains/201


The use of the phrase “the backlash” in the title of Text II means the:
