Big data has increased the demand for information management specialists so much that
Software AG,
Oracle Corporation,
IBM,
Microsoft,
SAP,
EMC,
HP, and
Dell have spent more than $15 billion on software firms specializing in data management and analytics. In 2010, this industry was worth more than $100 billion and was growing at almost 10 percent a year, about twice as fast as the software business as a whole. Developed economies increasingly use data-intensive technologies. There are 4.6 billion mobile-phone subscriptions worldwide, and between 1 billion and 2 billion people accessing the internet. Between 1990 and 2005, more than 1 billion people worldwide entered the middle class, which means more people became more literate, which in turn led to information growth. The world's effective capacity to exchange information through telecommunication networks was 281
petabytes in 1986, 471 petabytes in 1993, 2.2 exabytes in 2000, and 65
exabytes in 2007. According to one estimate, one-third of the globally stored information is in the form of alphanumeric text and still image data, which is the format most useful for most big data applications. This also shows the potential of yet unused data (i.e., in the form of video and audio content). While many vendors offer off-the-shelf products for big data, experts promote the development of in-house custom-tailored systems if the company has sufficient technical capabilities.
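The telecommunication capacity figures quoted above imply a compound annual growth rate, which the following minimal sketch works out. It assumes decimal units (1 exabyte = 1,000 petabytes); the function name `cagr` and the variable names are illustrative and do not come from the cited estimates.

```python
# Effective telecommunication capacity quoted in the text, in petabytes.
# Assumption: decimal units, so 1 exabyte = 1,000 petabytes.
capacity_pb = {1986: 281, 1993: 471, 2000: 2_200, 2007: 65_000}


def cagr(start_value, end_value, years):
    """Compound annual growth rate between two values `years` apart."""
    return (end_value / start_value) ** (1 / years) - 1


# Growth rate for each interval between consecutive data points.
years = sorted(capacity_pb)
for start, end in zip(years, years[1:]):
    rate = cagr(capacity_pb[start], capacity_pb[end], end - start)
    print(f"{start}-{end}: {rate:.1%} per year")

# Growth rate over the whole 1986-2007 period.
overall = cagr(capacity_pb[1986], capacity_pb[2007], 2007 - 1986)
print(f"1986-2007: {overall:.1%} per year")
```

Under these assumptions the overall rate comes out at just under 30 percent a year, and the interval rates show that most of the growth happened after 2000.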
===Government===
The use and adoption of big data within governmental processes allows efficiencies in terms of cost, productivity, and innovation, but comes with flaws. Data analysis often requires multiple parts of government (central and local) to work in collaboration and create new and innovative processes to deliver the desired outcome. One government organization that makes use of big data is the National Security Agency (NSA), which constantly monitors Internet activity in search of patterns of suspicious or illegal activity that its systems may pick up.
Civil registration and vital statistics (CRVS) collects the status of all certificates from birth to death. CRVS is a source of big data for governments.
===International development===
Research on the effective usage of information and communication technologies for development (also known as "ICT4D") suggests that big data technology can make important contributions but also presents unique challenges to
international development. Advancements in big data analysis offer cost-effective opportunities to improve decision-making in critical development areas such as health care, employment,
economic productivity, crime, security, and
natural disaster and resource management. Additionally, user-generated data offers new opportunities to give the unheard a voice. However, longstanding challenges for developing regions, such as inadequate technological infrastructure and economic and human resource scarcity, exacerbate existing concerns with big data such as privacy, imperfect methodology, and interoperability issues. The challenge of "big data for development" is currently evolving toward the application of this data through machine learning, known as "artificial intelligence for development" (AI4D).
====Benefits====
A major practical application of big data for development has been "fighting poverty with data". In 2015, Blumenstock and colleagues predicted poverty and wealth from mobile phone metadata, and in 2016 Jean and colleagues combined satellite imagery and machine learning to predict poverty. Using digital trace data to study the labor market and the digital economy in Latin America, Hilbert and colleagues argue that digital trace data has several benefits, such as:
• Thematic coverage: including areas that were previously difficult or impossible to measure
• Geographical coverage: providing sizable and comparable data for almost all countries, including many small countries that usually are not included in international inventories
• Level of detail: providing fine-grained data with many interrelated variables, and new aspects, like network connections
• Timeliness and timeseries: graphs can be produced within days of the data being collected
====Challenges====
At the same time, working with digital trace data instead of traditional survey data does not eliminate the traditional challenges involved when working in the field of international quantitative analysis. Priorities change, but the basic discussions remain the same. Among the main challenges are:
• Representativeness. While traditional development statistics is mainly concerned with the representativeness of random survey samples, digital trace data is never a random sample.
• Generalizability. While observational data always represents its source very well, it only represents what it represents, and nothing more. While it is tempting to generalize from specific observations of one platform to broader settings, this is often very deceptive.
• Harmonization. Digital trace data still requires international harmonization of indicators. It adds the challenge of so-called "data-fusion", the harmonization of different sources.
• Data overload. Analysts and institutions are not used to dealing effectively with a large number of variables, a task that interactive dashboards handle efficiently. Practitioners still lack a standard workflow that would allow researchers, users and policymakers to deal with data efficiently and effectively.

===Finance===
The financial applications of big data include investing decisions and trading (processing volumes of available price data, limit order books, economic data and more, all at the same time), portfolio management (optimizing over an increasingly large array of financial instruments, potentially selected from different asset classes), risk management (credit rating based on extended information), and any other aspect where the data inputs are large. Big data has also become a common concept within the field of
alternative financial services. Some of the major areas involve crowdfunding platforms and cryptocurrency exchanges.
===Healthcare===
Big data analytics has been used in healthcare to provide personalized medicine and
prescriptive analytics, clinical risk intervention and predictive analytics, waste and care variability reduction, automated external and internal reporting of patient data, standardized medical terms and patient registries. Some areas of improvement are more aspirational than actually implemented. The level of data generated within
healthcare systems is not trivial. With the added adoption of mHealth, eHealth and wearable technologies the volume of data will continue to increase. This includes
electronic health record data, imaging data, patient-generated data, sensor data, and other forms of difficult-to-process data. There is now an even greater need for such environments to pay attention to data and information quality. "Big data very often means 'dirty data' and the fraction of data inaccuracies increases with data volume growth." Human inspection at the big data scale is impossible, and health services urgently need intelligent tools for controlling accuracy and believability and for handling missing information. While extensive information in healthcare is now electronic, it fits under the big data umbrella because most of it is unstructured and difficult to use. The use of big data in healthcare has raised significant ethical challenges ranging from risks for individual rights, privacy and
autonomy, to transparency and trust. Big data in health research is particularly promising in terms of exploratory biomedical research, as data-driven analysis can move forward more quickly than hypothesis-driven research. Trends seen in data analysis can then be tested in traditional, hypothesis-driven follow-up biological research and eventually clinical research. A related application sub-area within healthcare that relies heavily on big data is
computer-aided diagnosis in medicine. For instance, for
epilepsy monitoring it is customary to create 5 to 10 GB of data daily. Similarly, a single uncompressed image of breast
tomosynthesis averages 450 MB of data. These are just a few of the many examples where
computer-aided diagnosis uses big data. For this reason, big data has been recognized as one of the seven key challenges that computer-aided diagnosis systems need to overcome in order to reach the next level of performance.
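To give a sense of scale for the figures quoted above, the following minimal sketch converts them into yearly storage totals. It assumes decimal units (1 TB = 1,000 GB = 1,000,000 MB) and a single monitoring stream that runs every day of the year; the helper names are illustrative only.

```python
# Assumption: decimal storage units.
GB_PER_TB = 1_000
MB_PER_GB = 1_000


def yearly_terabytes(gb_per_day, days=365):
    """Terabytes accumulated by a stream producing `gb_per_day` each day."""
    return gb_per_day * days / GB_PER_TB


def images_per_terabyte(mb_per_image):
    """How many images of the given size fit into one terabyte."""
    return GB_PER_TB * MB_PER_GB / mb_per_image


# Epilepsy monitoring: 5 to 10 GB of data per day (figures from the text).
low = yearly_terabytes(5)
high = yearly_terabytes(10)
print(f"Epilepsy monitoring: {low:.2f} to {high:.2f} TB per year")

# Breast tomosynthesis: about 450 MB per uncompressed image (from the text).
images = images_per_terabyte(450)
print(f"Tomosynthesis: about {images:.0f} uncompressed images per TB")
```

Under these assumptions a single monitoring stream accumulates a few terabytes per year, and one terabyte holds only a couple of thousand uncompressed tomosynthesis images.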
===Education===
A McKinsey Global Institute study found a shortage of 1.5 million highly trained data professionals and managers. In the specific field of marketing, one of the problems stressed by Wedel and Kannan is that marketing has several subdomains (e.g., advertising, promotions, product development, branding) that all use different types of data.
===Media===
To understand how the media uses big data, it is first necessary to provide some context on the mechanisms used in the media process. Nick Couldry and Joseph Turow have suggested that practitioners in media and advertising approach big data as many actionable points of information about millions of individuals. The industry appears to be moving away from the traditional approach of using specific media environments such as newspapers, magazines, or television shows and instead taps into consumers with technologies that reach targeted people at optimal times in optimal locations. The ultimate aim is to serve or convey a message or content that is (statistically speaking) in line with the consumer's mindset. For example, publishing environments are increasingly tailoring messages (advertisements) and content (articles) to appeal to consumers, using information gleaned exclusively through various data-mining activities.
• Targeting of consumers (for advertising by marketers)
• Data capture
• Data journalism: publishers and journalists use big data tools to provide unique and innovative insights and infographics.
Channel 4, the British
public-service television broadcaster, is a leader in the field of big data and
data analysis.
===Insurance===
Health insurance providers are collecting data on
social "determinants of health" such as food and
TV consumption, marital status, clothing size, and purchasing habits, from which they make predictions on health costs, in order to spot health issues in their clients. It is controversial whether these predictions are currently being used for pricing.
===Internet of things (IoT)===
Big data and the IoT work in conjunction. Data extracted from IoT devices provides a mapping of device inter-connectivity. Such mappings have been used by the media industry, companies, and governments to more accurately target their audience and increase media efficiency. The IoT is also increasingly adopted as a means of gathering sensory data, and this sensory data has been used in medical, manufacturing and transportation contexts.
Kevin Ashton, the digital innovation expert who is credited with coining the term, defines the Internet of things in this quote: "If we had computers that knew everything there was to know about things—using data they gathered without any help from us—we would be able to track and count everything, and greatly reduce waste, loss, and cost. We would know when things needed replacing, repairing, or recalling, and whether they were fresh or past their best."
===Information technology===
Especially since 2015, big data has come to prominence within business operations as a tool to help employees work more efficiently and streamline the collection and distribution of
information technology (IT). The use of big data to resolve IT and
data collection issues within an enterprise is called
IT operations analytics (ITOA). By applying big data principles to the concepts of
machine intelligence and deep computing, IT departments can predict potential issues and prevent them.

[...] a special issue in the
Social Science Computer Review, a special issue in
Journal of the Royal Statistical Society, and a special issue in
EPJ Data Science, and a book called
Big Data Meets Social Sciences edited by
Craig Hill and five other
Fellows of the American Statistical Association. In 2021, the founding members of BigSurv received the Warren J. Mitofsky Innovators Award from the
American Association for Public Opinion Research.
===Marketing===
Big data is notable in marketing due to the constant "datafication" of everyday consumers of the internet, in which all forms of data are tracked. The datafication of consumers can be defined as quantifying many or all human behaviors for the purpose of marketing. The size of big data can often be difficult for marketers to navigate. As a result, adopters of big data may find themselves at a disadvantage. Algorithmic findings can be difficult to achieve with such large datasets. Big data in marketing is a highly lucrative tool for large corporations; its value derives from the ability to predict significant trends, interests, or statistical outcomes in a consumer-based manner. There are three significant factors in the use of big data in marketing:
• Big data provides customer behavior pattern spotting for marketers, since all human actions are being quantified into readable numbers for marketers to analyze and use for their research. In addition, big data can also be seen as a customized product recommendation tool. Specifically, since big data is effective in analyzing customers' purchase behaviors and browsing patterns, this technology can assist companies in promoting specific personalized products to specific customers.
• Real-time market responsiveness is important for marketers because it lets them shift marketing efforts and adjust to current trends, which helps maintain relevance to consumers. This can supply corporations with the information necessary to predict the wants and needs of consumers in advance.

==Case studies==