X-Informatics:
I-400 and I-59 Knowledge and Physical
Computation
Spring Semester 2002 MW 6:00 pm – 7:15 pm Indiana Time
Geoffrey Fox and Bryan Carpenter
PTLIU Laboratory for Community Grids
Informatics, (Computer Science, Physics)
Indiana University
Bloomington IN 47404
Knowledge
• Optimization includes many areas such as
– Design the best set of airline routes given a set of planes, airports, passenger preferences etc
• It also includes decision-making
– Given current understanding of “the world”, decide how to address the current crisis
– Given technical and political “understanding”, decide how to store nuclear waste
– Decide what product customers actually want
• Given our National Virtual Observatory, decide on best estimate for the age of the Universe …
Complex Systems
• Physics and Chemistry teach you about systems built from fundamental entities like molecules, atoms, quarks
• Biology teaches you about systems built from genes, neurons or cells …
• Engineering teaches you to build vehicles out of chunks of steel or equivalent
• Management teaches you about systems built from people
• Economics teaches you about systems built from companies, consumers, products, stocks ……
• We can abstract this to concept of a “complex system”
• Complex Systems are sets of entities which can be dynamic and heterogeneous
Complex Systems II
• The entities in a complex system are typically situated in a “space” and labeled by “time”
– “space” is often not a physical space and can be continuous or discrete
– “time” is often conventional time but can be anything labeling change such as a version or iteration number
• Class Scheduling as a complex system has
– Entities are (teacher, class) pairs
– Space is set of allowed (classroom, timeslot) pairs in a week
– There is no immediate “time” as we want to find a single configuration
• Sometimes, as in the above example, the main interest in a complex system is its equilibrium configuration
– However “solving problem” will often start with a different configuration and evolve it to the “desired configuration”.
Complex Systems III
• Informatics tends to deal with more abstract complex systems than those in “conventional science”
– Physics, Chemistry and Biology tend to discuss their complex systems in terms of rules (equations of motion) which are believed to follow more or less directly from Mother Nature
• Physics is often based on “fundamental equations” (such as those of Newton or Einstein) and as you get to more applied fields (Chemistry, Biology, Engineering) one uses “models”
– Biology might model the spread of disease in terms of probabilities of one person passing it on to others and progressing down the path of the disease
– These probabilities are derived from observations – not from Newton’s laws for motion of molecules in viruses
• Theory of Complex Systems tries to unify discussion of all systems whatever their description
Complex Systems IV
• Complex Systems – even when abstract – exhibit a set of common features
– These features are most interesting when there are a lot of entities in complex system and tend to be very insensitive to details
– Temperature: a measure of degree of random activity in system
– Entropy: a measure of number of states in system
– Phase Transitions: large change in configuration
– Small Worlds: tendency for efficient networks to be developed with entities close to each other
– Fractal Dimension: a measure of complexity of information and linkage between entities
Phase Transitions
• Current view of Information Technology represents a phase transition at Wall Street
• End of Cold War was a Phase Transition
• Some “Knowledge” e.g. how to predict earthquakes reliably is a phase transition
[Figure: Objective Function plotted against Configuration]
More on Phase Transitions
• Systems have objective functions, corresponding to an Energy E in the physics analogy; the natural state is then the state of lowest energy
• Suppose system depends on one or more parameters p such that energy surface varies as p varies
• Then as p varies a little the location of lowest state and value of associated energy will change a little
– Nothing special – system stays in this state and changes as needed as p varies
• However as p varies more, a “distinct” minimum can become the new lowest state but now how does system change?
• Problem is that “only continuous way” from original to new lowest energy (equilibrium) is through states of high energy
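The energy-surface picture can be made concrete with a toy example. The sketch below is only an illustration: the double-well form E(x, p) = x^4 - 2x^2 + p*x, the grid search and the step size are all assumptions chosen for clarity, not something taken from the slides. It shows the lowest state moving a little when p moves a little, jumping to a distinct minimum when p changes sign, and a system that can only change continuously staying behind in the old minimum.

```python
def energy(x, p):
    """Toy double-well energy (an assumed form); p tilts one well below the other."""
    return x**4 - 2.0 * x**2 + p * x


def lowest_state(p, lo=-2.0, hi=2.0, steps=4000):
    """Global minimum by brute-force search over a fixed grid of configurations."""
    xs = [lo + (hi - lo) * i / steps for i in range(steps + 1)]
    x_min = min(xs, key=lambda x: energy(x, p))
    return x_min, energy(x_min, p)


def relax(x, p, step=0.01, iters=20000):
    """Continuous evolution only: follow the local downhill slope from x."""
    for _ in range(iters):
        grad = 4.0 * x**3 - 4.0 * x + p  # dE/dx
        x -= step * grad
    return x


if __name__ == "__main__":
    for p in (0.2, 0.1, -0.1, -0.2):
        x_min, e_min = lowest_state(p)
        print(f"p={p:+.1f}  lowest state x={x_min:+.3f}  E={e_min:+.3f}")
    # Start in the left well, then tilt the surface so the right well is lower.
    print("relaxed from left well at p=-0.2:", round(relax(-1.0, -0.2), 3))
```

In this toy surface the two wells are separated by a barrier near x = 0, so the downhill-only system remains in the left well even after the right well has become the lowest state.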
Examples of a Phase Transition
• Water to Steam or ice or snow
• Opinion of Indiana about Bobby Knight and his successor
• Transition associated with end of cold war and strong dictatorial communist rule in Eastern Europe
• Bull or Bear view of stock market
• Ability to predict an earthquake
• Middle East today is struggling as it has not found a way to make the phase transition
– Perhaps the conflict today represents the difficulty of transitioning from current situation to a stable Israel+Palestine situation
• Emergence of a third party in a democracy
– Parties emerge but rarely have what it takes to surpass existing parties
• Emergence of Java as a dominant language
Nucleation of a Phase Transition
• Note Phase Transitions tend to occur in clumps and are initiated by small areas (nucleation points) growing
• This is because total energy often has terms f(i1,i2) which tend to “align i1 and i2” i.e. in Bobby Knight example, social forces tend to mean that in equilibrium each person (labeled by i) has same opinion
• In political party example, people are often loath to change their vote as they are interacting with others who try to make them not change their opinion (voting for a new party is a wasted vote etc.)
• One often gets “super-heating or super-cooling” effect – namely complex system changes later than it should
– i.e. if in phase I (communist say), it will stay communist long after natural forces make “non-communist” the equilibrium state
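A minimal sketch of nucleation, under stated assumptions: opinions are +1 or -1, people sit on a ring and interact only with their two neighbours, the aligning term f(i1,i2) is taken to be -J*s1*s2, and an external pressure h favours one opinion. None of these specifics (ring, J, h, function names) come from the slides. The code computes the energy cost of flipping a contiguous clump of k people out of a fully aligned population.

```python
def energy(spins, J=1.0, h=0.0):
    """Aligning pair terms on a ring plus an external pressure h (assumed model)."""
    n = len(spins)
    aligned = sum(spins[i] * spins[(i + 1) % n] for i in range(n))
    return -J * aligned - h * sum(spins)


def cluster_cost(n, k, J=1.0, h=-0.5):
    """Energy change when a clump of k neighbours switches from +1 to -1."""
    before = [1] * n
    after = [-1] * k + [1] * (n - k)
    return energy(after, J, h) - energy(before, J, h)


if __name__ == "__main__":
    # h < 0 means the "-1" opinion is the true equilibrium state.
    for k in (1, 2, 4, 5, 10, 20):
        print(f"clump size {k:2d}: energy change {cluster_cost(20, k):+.1f}")
```

With these assumed values a lone dissenter raises the energy, so the population stays in the old phase even though the new one is lower overall; only a clump above a critical size lowers the energy and can grow, which is one way to read the "super-heating" delay.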
Knowledge as a Phase Transition
• Consider X-Informatics as a Complex System
• The system consists of entities which are both information nuggets and the people interested in field X
• Using XML metadata we establish links (forces) between information nuggets
– These links can be found automatically using automated techniques like Google searches
• People are linked to information nuggets by web access and to each other by methodologies normal in the field
• Information and their links are just normal stuff
• Real Knowledge is a Phase Transition coming from integrating information
– One day we don’t really know how to predict earthquakes as a community BUT a few people think they know
Role of Informatics
• Goal is to prepare the information nuggets and the links between them
– Need to provide both systems (wizards) to enable “input” of links and to develop automatic ways of finding “unexpected” links
– Unexpected links could be correlations between nuggets which become apparent when details of nuggets are compared
• Then one needs to encourage the development of knowledge as properties of “emergent systems”
– Phase transitions and emergent properties are similar concepts
Internet as a Complex System
• The Internet is a very interesting complex system
• Members are computers
• Connectivity from ping, email, ftp, web-access
• Interesting phase transitions due to network traffic anomalies
– From local Ethernet to global
• Unusual structure from global interconnections
Different Approaches to Knowledge
• Mother Nature’s approach to Complex System is to find emergent properties as a resultant of interplay between a myriad of forces and pressures
– This is sort of democracy
• Original Artificial Intelligence approach to Complex System was a set of rules where you have a decision tree
– This is sort of dictatorship
• Rule-based approaches have had some successes but are clearly limited, and as we get more and more information, they will get less and less effective
– The “world” is billions of nuggets of half-baked information
Genetic Algorithm Example
• Consider a set of computer tasks labeled by i to be processed on a set of computer nodes
– Assume that tasks take a certain time to complete and also need to communicate between themselves
– Maybe one task per data item
• Complex system is set of tasks and “space” is finite with a number of locations equal to number of computer nodes
• If system was dynamic there would be a “time” associated with a problem but in simplest case not
• Tasks with a lot of communication between them are “attracted” to each other
• If tasks with a lot of compute time happen to be in same node, they tend to repel each other
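One possible way to turn these "attract" and "repel" statements into an objective function is sketched below. The particular form (sum of squared node loads plus a weighted count of communication crossing node boundaries) and all names are assumptions made for illustration; the slides do not specify the energy.

```python
def schedule_energy(assign, work, comm, n_nodes, comm_weight=1.0):
    """Energy of a task-to-node assignment (hypothetical form).

    assign[i] : node holding task i
    work[i]   : compute time of task i
    comm      : dict {(i, j): traffic between tasks i and j}
    """
    load = [0.0] * n_nodes
    for task, node in enumerate(assign):
        load[node] += work[task]
    # Heavy tasks sharing a node raise this term, so they "repel" each other.
    repel = sum(l * l for l in load)
    # Communicating tasks on different nodes raise this term, so they "attract".
    attract = sum(c for (i, j), c in comm.items() if assign[i] != assign[j])
    return repel + comm_weight * attract


if __name__ == "__main__":
    work = [1.0, 1.0, 1.0, 1.0]
    comm = {(0, 1): 2.0, (2, 3): 2.0}
    for assign in ([0, 0, 1, 1], [0, 1, 0, 1], [0, 0, 0, 0]):
        print(assign, schedule_energy(assign, work, comm, n_nodes=2))
```

In this small case the assignment that keeps communicating pairs together while balancing the load has the lowest energy; splitting the pairs or piling everything on one node both cost more.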
Complex Systems Approach to Scheduling Bunch of Computers
[Figure: each color is a different processor in top display]
GA Example I
• Here we have 2 processors and 9 data tasks
• Chromosome is a 9-bit binary number – one bit for each data task
– Bit is 0 if task in processor 0 and 1 if task in processor 1
• We have a “sea” of
GA Example II
• Crossover and Mutation are the two most important operators but others are also possible
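As a rough sketch of these two operators on the 9-bit chromosome, one bit per data task: single-point crossover and independent bit-flip mutation are common textbook choices, assumed here rather than taken from the slides, and the function names and mutation rate are hypothetical.

```python
import random


def crossover(parent_a, parent_b, rng):
    """Single-point crossover: swap the tails of two chromosomes."""
    point = rng.randrange(1, len(parent_a))  # cut strictly inside the string
    child_1 = parent_a[:point] + parent_b[point:]
    child_2 = parent_b[:point] + parent_a[point:]
    return child_1, child_2


def mutate(chromosome, rng, rate=0.1):
    """Flip each bit independently with probability `rate`."""
    return [bit ^ 1 if rng.random() < rate else bit for bit in chromosome]


if __name__ == "__main__":
    rng = random.Random(0)
    a = [0, 0, 0, 0, 0, 1, 1, 1, 1]  # tasks 0-4 on processor 0, tasks 5-8 on processor 1
    b = [1, 0, 1, 0, 1, 0, 1, 0, 1]
    c1, c2 = crossover(a, b, rng)
    print("children:", c1, c2)
    print("mutated: ", mutate(c1, rng))
```

Crossover recombines two existing assignments, while mutation moves individual tasks to the other processor; a full genetic algorithm would also need a fitness function and a selection step, which are left out here.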
Simulated Annealing
• We have same complex system as in genetic algorithm but a very different approach
• Genetic algorithms use an ensemble of different representatives of complex system
• Simulated Annealing uses a single representative and evolves it in time
– Works for systems with closely coupled members
– Genetic algorithms tend to be used when members loosely coupled but in scheduling problem and many others one can use either
• Simulated Annealing is based on physics analogy of how high quality materials are formed by annealing
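The single-representative idea can be sketched as follows. This is a minimal illustration under assumptions: the Metropolis acceptance rule, a geometric cooling factor of 0.9, the move set (reassign one task at random) and the toy load-imbalance energy are illustrative choices, and all names are hypothetical.

```python
import math
import random


def anneal(energy, state, n_nodes, t_start=10.0, t_end=0.01,
           cooling=0.9, sweeps=50, seed=0):
    """Evolve one assignment in "time", lowering the temperature gradually."""
    rng = random.Random(seed)
    current = list(state)
    e_cur = energy(current)
    best, e_best = list(current), e_cur
    t = t_start
    while t > t_end:
        for _ in range(sweeps):
            i = rng.randrange(len(current))
            old = current[i]
            current[i] = rng.randrange(n_nodes)  # propose moving one task
            e_new = energy(current)
            d = e_new - e_cur
            # Metropolis rule: always accept downhill, sometimes accept uphill.
            if d <= 0 or rng.random() < math.exp(-d / t):
                e_cur = e_new
                if e_cur < e_best:
                    best, e_best = list(current), e_cur
            else:
                current[i] = old  # reject: undo the move
        t *= cooling  # geometric cooling (assumed schedule)
    return best, e_best


if __name__ == "__main__":
    work = [4, 3, 3, 2, 2, 2, 1, 1, 1]  # 9 tasks, 2 processors

    def imbalance(assign):
        load_1 = sum(w for w, node in zip(work, assign) if node == 1)
        return (sum(work) - 2 * load_1) ** 2

    print(anneal(imbalance, [0] * 9, n_nodes=2))
```

At high temperature almost every move is accepted, so the representative wanders over the whole configuration space; as the temperature falls, uphill moves become rare and it settles into a low-energy assignment.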
Role of Temperature (in Informatics)
• Temperature is “smoother” – at temperature T you can jump heights in E of size kT
• Essentially at high Temperature you ignore detail and get global structure correct
– Note different phases are characterized by different global structure
• Annealing finds minima at each temperature and gradually lowers it
– Such as T_new = 0.9 * T_old
Example of difference between two scheduling strategies showing a phase transition as you change time dependence of load
We are assuming tasks operate iteratively where work done at each iteration changes with each iteration implying tasks may need to be moved to a different node
This could be a simulation of a bunch of stars with a globular
And
Hopfield and Tank showed how one could set up a problem as minimizing an Energy as a function of neural variables. This was analogous to way brai