Newz from Limbo

Friday, March 30, 2012

White House boosts 'big data' drive,
aims for expansion of techie force
Following recent disclosure of the National Security Agency's super-data crunching center, the White House, the Energy Department and five other government agencies unveiled a national effort to upgrade massive data analysis for military and civilian purposes.

A big thrust would use university programs to expand the nation's work force capable of dealing with the new data-analysis technologies.

"As the amount of data continues to grow, scientists who already are falling behind are in danger of being engulfed by massive datasets," a federal spokesman said.

As supercomputers have become ever more powerful, now capable of performing quadrillions of calculations per second, they allow researchers to conduct simulations of scientific problems at an unprecedented level of detail. The technologies for sifting such volumes of data are eagerly sought by scientists and others.

"As scientists around the world address some of society’s biggest challenges, they increasingly rely on tools ranging from powerful supercomputers to one-of-a-kind experimental facilities to dedicated high-bandwidth research networks," a spokesman said.

The Pentagon is "placing a big bet on big data" by investing approximately $250 million annually, with $60 million available for new research projects, across the military departments in a series of programs that will harness massive data in new ways and bring together sensing, perception and decision support to make truly autonomous systems that can maneuver and make decisions on their own, a spokesman said.

The Pentagon seeks a 100-fold increase in the ability of analysts to extract information from texts in any language, and a similar increase in the number of objects, activities, and events that an analyst can observe.

On March 18, Newz from Limbo covered the National Security Agency's building of a major data analysis center. The agency plans to use the new data mining and analysis technologies to tackle previously uncrackable codes, it has been reported.

The Obama administration created the Big Data Research and Development Initiative in order to "advance state-of-the-art core technologies needed to collect, store, preserve, manage, analyze, and share huge quantities of data" and to "harness these technologies to accelerate the pace of discovery in science and engineering, strengthen our national security, and transform teaching and learning; and expand the workforce needed to develop and use Big Data technologies."

Arie Shoshani of Berkeley Lab will lead a five-year Energy Department project to help scientists extract insights from massive research datasets. The national effort is budgeted at $200 million, with the Energy Department spending $25 million.

The Energy Department's project is known as the Scalable Data Management, Analysis, and Visualization (SDAV) Institute. The idea is to improve researchers' ability to extract knowledge and insights from large and complex collections of digital data.

Among the other projects announced was a $10 million award to the University of California, Berkeley, as part of the National Science Foundation’s “Expeditions in Computing” program. The five-year NSF Expedition award to UC Berkeley will fund the campus’s new big data program.

SDAV is a collaboration tapping the expertise of researchers at Argonne, Lawrence Berkeley, Lawrence Livermore, Los Alamos, Oak Ridge and Sandia national laboratories and at seven universities: Georgia Tech, North Carolina State, Northwestern, Ohio State, Rutgers, the University of California at Davis and the University of Utah. Kitware, a company that develops and supports specialized visualization software, is also a partner in the project.

Among other things, scientists are looking forward to developing a new generation of particle accelerators, with applications ranging from nuclear medicine to power generation. Simulations of these accelerators can model fields with millions of moving particles.

But that makes it difficult to pull out the most interesting information, such as just the most energetic particles. In the past it could take hours to sift through the data, but FastBit, an innovative method for indexing data by characteristic features, allows researchers to perform the task in just seconds, dramatically increasing scientific productivity. At the same time, by reducing the amount of data being visualized, researchers will be able to see phenomena they would otherwise miss.
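The bitmap-indexing idea behind FastBit can be illustrated with a minimal sketch. This is not the actual FastBit library (a compressed-bitmap C++ package); every name below is invented for the example. The point is that the data are binned once up front, and a query like "energy above a threshold" is then answered from per-bin bitmaps, with only the single boundary bin checked record by record:

```python
import numpy as np

# Toy bitmap index in the spirit of FastBit (illustrative only).
rng = np.random.default_rng(0)
energy = rng.exponential(scale=1.0, size=100_000)  # fake particle energies

# Build the index once: one boolean bitmap per energy bin.
edges = np.linspace(0.0, energy.max(), 65)
bin_ids = np.clip(np.digitize(energy, edges) - 1, 0, len(edges) - 2)
bitmaps = [bin_ids == b for b in range(len(edges) - 1)]

def select_energetic(threshold):
    """Indices of particles with energy > threshold, via the bitmap index."""
    hit = np.zeros(energy.size, dtype=bool)
    for b in range(len(edges) - 1):
        if edges[b] > threshold:
            # Bin lies entirely above the threshold: take it wholesale.
            hit |= bitmaps[b]
        elif edges[b + 1] > threshold:
            # Boundary bin: only here do we touch the raw records.
            hit |= bitmaps[b] & (energy > threshold)
    return np.flatnonzero(hit)

# Same answer as a brute-force scan, without rescanning every record.
assert np.array_equal(select_energetic(5.0), np.flatnonzero(energy > 5.0))
```

In a real deployment the bitmaps are compressed and built once at write time, so each subsequent query touches only a handful of bins instead of the full dataset, which is where the hours-to-seconds speedup comes from.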

The next step to be tackled under SDAV is to develop a way to interact with the data as it is being created in a simulation. This technique would allow researchers to monitor and steer the simulation, adjusting or even stopping it if there is a problem.

Because such simulations can run for hours or days on thousands of supercomputer processors, such a capability would help researchers make the most efficient use of these high-demand computing cycles. Similarly, such tools will allow scientists to analyze and visualize data as it is being generated and could help them summarize and reduce the amount of data to a manageable level, resulting in datasets with only the most valuable aspects of the simulated experiment.
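A toy version of that monitor-and-steer loop might look like the following. This is a sketch under stated assumptions, not an SDAV interface: the solver, the diagnostic, and the action names are all invented for illustration.

```python
# Hypothetical in-situ monitoring: a cheap diagnostic runs every step,
# and its verdict can steer the simulation or abort it early.

def step(state, dt):
    """One step of a toy damped oscillator; stands in for a real solver."""
    x, v = state
    v -= (x + 0.1 * v) * dt   # spring force plus damping (semi-implicit Euler)
    x += v * dt
    return x, v

def run(n_steps, dt, monitor):
    state = (1.0, 0.0)
    for i in range(n_steps):
        state = step(state, dt)
        action = monitor(i, state)     # in-situ check, no post-hoc file sifting
        if action == "stop":           # abort instead of wasting compute cycles
            return i, state
        if action == "halve_dt":       # steer: shrink the time step and go on
            dt *= 0.5
    return n_steps, state

def monitor(i, state):
    x, v = state
    if abs(x) > 10 or x != x:          # divergence or NaN: kill the run
        return "stop"
    if abs(v) > 2:                     # solution getting stiff: steer it
        return "halve_dt"
    return "continue"

steps_done, final = run(10_000, dt=0.1, monitor=monitor)
```

The design point is that the diagnostic is far cheaper than the simulation step itself, so checking every step costs almost nothing while a caught divergence can save hours of wasted processor time.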

This capability will also benefit scientists using large-scale experimental facilities, such as the Energy Department's Advanced Light Source, where scientists use powerful X-ray beams to study materials. Data was once collected at one frame per second but now arrives at up to 100 frames per second, and the proposed Next Generation Light Source will pour out data at 1,000,000 frames per second. Again, having tools to manage the data as it is being generated is critical, as the results of one experiment are often used to guide the next one.

Scientists don’t want to have to wait six months just to sort out the science from the data. Awaiting discovery may be critical insight into the cause and treatment of diseases or the development of innovative materials for industry.

Mining the data for patterns that indicate whether a simulation will proceed successfully can help researchers catch problems early and modify the parameters to avoid them.

SDAV is meant to counter data management overload even when overload might not be obvious. For example, having too much data for a computer simulation can dramatically slow a supercomputer’s performance as it moves data in and out of processors. This not only wastes time, but also the power needed to run the system. By developing methods to manage, organize, analyze and visualize data, SDAV aims to greatly improve the productivity of scientists.

“In the same way that past federal investments in information-technology R&D led to dramatic advances in supercomputing and the creation of the Internet, the initiative we are launching today promises to transform our ability to use Big Data for scientific discovery, environmental and biomedical research, education, and national security,” said Dr. John P. Holdren, director of the White House Office of Science and Technology Policy.

The initiative responds to recommendations by the President's Council of Advisors on Science and Technology, which last year concluded that the Federal Government is under-investing in technologies related to Big Data.

Also involved in the Big Data initiative are the National Science Foundation (NSF) and the National Institutes of Health. NSF is implementing a comprehensive, long-term strategy that includes new methods to derive knowledge from data; infrastructure to manage, curate, and serve data to communities; and new approaches to education and workforce development.

Specifically, NSF is encouraging research universities to develop interdisciplinary graduate programs to prepare the next generation of data scientists and engineers; funding a $10 million Expeditions in Computing project based at the University of California, Berkeley, that will integrate three powerful approaches for turning data into information: machine learning, cloud computing, and crowdsourcing; providing the first round of grants to support "EarthCube," a system that will allow geoscientists to access, analyze and share information about our planet; and issuing a $2 million award for a research training group to support training for undergraduates to use graphical and visualization techniques for complex data.

NSF is also providing $1.4 million in support for a focused research group of statisticians and biologists to determine protein structures and biological pathways, and convening researchers across disciplines to determine how Big Data can transform teaching and learning.

In addition, the Defense Advanced Research Projects Agency (DARPA) is beginning the XDATA program, which intends to invest approximately $25 million annually for four years to develop computational techniques and software tools for analyzing large volumes of data. Challenges include developing scalable algorithms for processing imperfect data in distributed data stores; and creating effective human-computer interaction tools for facilitating rapidly customizable visual reasoning for diverse missions.

The U.S. Geological Survey will run a Big Data for Earth System Science program.

New hacking uproar engulfs Murdoch

Murdoch has been firing off Tweets in his company's defense. Yet, several years ago his U.S. arm ended a hacking trial by paying off and buying out a rival.
