Data farming

Wikipedia

Data farming is the process of using designed computational experiments to “grow” data, which can then be analyzed using statistical and visualization techniques to obtain insight into complex systems. These methods can be applied to any computational model.

Data farming differs from data mining, as the following metaphors indicate:

Miners seek valuable nuggets of ore buried in the earth, but have no control over what is out there or how hard it is to extract the nuggets from their surroundings. ... Similarly, data miners seek to uncover valuable nuggets of information buried within massive amounts of data. Data-mining techniques use statistical and graphical measures to try to identify interesting correlations or clusters in the data set.

Farmers cultivate the land to maximize their yield. They manipulate the environment to their advantage using irrigation, pest control, crop rotation, fertilizer, and more. Small-scale designed experiments let them determine whether these treatments are effective. Similarly, data farmers manipulate simulation models to their advantage, using large-scale designed experimentation to grow data from their models in a manner that easily lets them extract useful information. ...the results can reveal root cause-and-effect relationships between the model input factors and the model responses, in addition to rich graphical and statistical views of these relationships.

A NATO modeling and simulation task group has documented the data farming process in the Final Report of MSG-088. The report describes data farming as a collaborative process that combines rapid scenario prototyping, simulation modeling, design of experiments, high-performance computing, and analysis and visualization in an iterative loop of loops.
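As a rough illustration of the idea of "growing" data from a model, the sketch below runs a small designed experiment against a toy stochastic simulation and summarizes the responses at each design point. Everything in it is an assumption made for illustration: the function `toy_simulation`, the factors `speed` and `sensor_range`, and the linear response with Gaussian noise are hypothetical and do not come from MSG-088 or any real data farming tool.

```python
import itertools
import random
import statistics


def toy_simulation(speed, sensor_range, rng):
    """Hypothetical stochastic model: returns one noisy 'mission score'."""
    return 2.0 * speed + 0.5 * sensor_range + rng.gauss(0.0, 1.0)


def farm_data(levels, replications, seed=0):
    """Run every design point several times and summarize the responses."""
    rng = random.Random(seed)  # seeded so the experiment is repeatable
    rows = []
    for speed, sensor_range in itertools.product(
        levels["speed"], levels["sensor_range"]
    ):
        # Replicate each design point because the model is stochastic.
        scores = [
            toy_simulation(speed, sensor_range, rng)
            for _ in range(replications)
        ]
        rows.append(
            {
                "speed": speed,
                "sensor_range": sensor_range,
                "mean_score": statistics.mean(scores),
                "stdev": statistics.stdev(scores),
            }
        )
    return rows


# Assumed factor levels, chosen only for the example.
levels = {"speed": [1, 2, 3], "sensor_range": [10, 20]}
results = farm_data(levels, replications=30)
for row in results:
    print(row)
```

Because the analyst chooses the factor levels and the number of replications, the resulting table relates each input setting directly to the response, which is the sense in which the data is farmed rather than mined.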

The science of Design of Experiments (DOE) has been around for over a century, pioneered by R.A. Fisher for agricultural studies. Many of the classic experiment designs can be used in simulation studies. However, computational experiments have far fewer restrictions than do real-world experiments, in terms of costs, number of factors, time required, ability to replicate, ability to automate, etc. Consequently, a framework specifically oriented toward large-scale simulation experiments is warranted.

People have been conducting computational experiments for as long as computers have been around. The term “data farming” is more recent, coined in 1998 in conjunction with the Marine Corps' Project Albert, in which small agent-based distillation models (a type of stochastic simulation) were created to capture specific military challenges. These models were run thousands or millions of times at the Maui High Performance Computing Center and other facilities. Project Albert analysts worked with military subject matter experts to refine the models and interpret the results.

Initially, the use of brute-force full factorial (gridded) designs meant that the simulations needed to run very quickly and the studies required high-performance computing. Even so, only a small number of factors (at a limited number of levels) could be investigated, due to the curse of dimensionality.
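The growth behind this limitation can be shown with simple arithmetic: a full factorial design with k factors at n levels each has n^k design points. The sketch below is only illustrative; the helper names `full_factorial_size` and `full_factorial` are hypothetical and not taken from any data farming software.

```python
import itertools


def full_factorial_size(num_factors, levels_per_factor):
    """Number of design points in a gridded design: levels ** factors."""
    return levels_per_factor ** num_factors


def full_factorial(levels_by_factor):
    """Enumerate every combination of factor levels as a list of dicts."""
    names = list(levels_by_factor)
    return [
        dict(zip(names, combo))
        for combo in itertools.product(*levels_by_factor.values())
    ]


# Design size at 3 levels per factor, before any replication.
for k in (2, 5, 10, 20):
    print(k, full_factorial_size(k, 3))
# prints:
# 2 9
# 5 243
# 10 59049
# 20 3486784401
```

Each design point of a stochastic model would also need to be replicated, so the number of simulation runs is a multiple of these counts.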
