References:
http://www.nitrd.gov/subcommittee/hec/hecrtf-outreach/
Look in Supercomputing (SC) and IPDPS for papers on the current state of parallel research.
Ideas
Debugger for parallel applications
1. The parallel computing era is entering everyday life (from multicore to supercomputers)
More vivid games (more compute devoted to better AI)
Grand challenge applications (scientific/engineering) impact our daily life.
2. Problems in utilizing this computing power
a) porting/scaling existing applications
b) ways to construct new software (the programming environment is important)
c) zero in on debugging
3. What is involved in the above? (programming environment)
Bugs always afflict us, particularly on larger machines.
4. Bugs cause a lot of damage (10% of US GDP? figure needs a source)
5. Debugging is hard in sequential code. Doing it in a parallel context is more difficult because interactions among
different processors complicate it: concurrency, data races, control flow, etc.
For small platforms - possible to pop up windows, each standing for an independent thread
For large platforms - impossible (and, more importantly, resource constraints)
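To make the data race point in item 5 concrete, here is a minimal sketch (not taken from any existing tool). It enumerates every interleaving of two simulated threads that each do an unsynchronized read-then-write increment of a shared counter. The names `interleavings` and `run_schedule` are made up for illustration, and the "threads" are simulated step by step so the result is deterministic.

```python
from itertools import combinations

STEPS_PER_THREAD = 2  # each simulated thread does: read, then write


def interleavings(n_a, n_b):
    """Yield every order in which n_a steps of thread A and n_b steps of thread B can run."""
    total = n_a + n_b
    for a_slots in combinations(range(total), n_a):
        a_set = set(a_slots)
        yield ["A" if i in a_set else "B" for i in range(total)]


def run_schedule(schedule):
    """Run one schedule of two threads doing an unsynchronized counter += 1."""
    shared = 0
    local = {"A": None, "B": None}   # each thread's private copy of the counter
    for tid in schedule:
        if local[tid] is None:
            local[tid] = shared       # step 1: read the shared counter
        else:
            shared = local[tid] + 1   # step 2: write back the incremented value
    return shared


# Group schedules by the final counter value they produce.
outcomes = {}
for schedule in interleavings(STEPS_PER_THREAD, STEPS_PER_THREAD):
    outcomes.setdefault(run_schedule(schedule), []).append("".join(schedule))

for final_value, schedules in sorted(outcomes.items()):
    print(final_value, schedules)
```

Only the two schedules where one thread finishes before the other starts give the correct count of 2; the other four lose an update. With real threads the schedule is chosen by the machine, which is why this kind of bug is hard to reproduce.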
6. Approach:
YY's approach
Simulation based
- data mining techniques in code, detecting code duplication
- simulation based
- integrated framework
- can't debug on the supercomputer
- statement of problem
- justification for why it's important
- Examples
- statement of area
- why we chose our approach
- lots of attention on models, less focus on tools
- efficiency has been done and done,
- automatic parallelization has been done and done, no success.
- why parallel debugging important
- bugs cost XXX money (Get values)
- indispensable in constructing and understanding software (need figure on time spent debugging)
- figure on utilization of supercomputers.
- why parallel debugging difficult
- lack of support for parallel debugging (??? begs list of what exists)
- race conditions hard to reproduce
- hard to freeze the global system state
- limited time on supercomputers
- conditions for success (what we want)
- baseline: better than just log files.
- scalability
- an approach
- look up similar problems in multi-thread OS, distributed systems, networking.
- save everything -- naive
- takes a long time to save things
- storage space is ridiculous
- better approach, purge after well known good states
- how do we find those states?
- still slow
- looking at debugging offline? state inspection not possible.
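The "purge after well known good states" idea above can be sketched roughly as follows. This is only an illustration under assumptions: `CheckpointLog`, `record`, and `mark_good` are hypothetical names, and how a state gets judged "good" (the open question above) is left to the caller.

```python
import copy


class CheckpointLog:
    """Keeps snapshots of program state, purging those older than the last known-good one."""

    def __init__(self):
        self.snapshots = []  # list of (step, state) pairs, oldest first

    def record(self, step, state):
        # Deep-copy so later mutation of the live state cannot alter the snapshot.
        self.snapshots.append((step, copy.deepcopy(state)))

    def mark_good(self, step):
        """Declare the state at `step` known good and drop everything before it."""
        self.snapshots = [(s, st) for s, st in self.snapshots if s >= step]

    def steps(self):
        return [s for s, _ in self.snapshots]


log = CheckpointLog()
state = {"counter": 0}
for step in range(6):
    state["counter"] += step
    log.record(step, state)
    if step == 3:          # assumption: some external check validated step 3
        log.mark_good(step)

print(log.steps())            # steps still available for replay/inspection
print(log.snapshots[0][1])    # the retained known-good state
```

Storage stays bounded by the distance between good states, but every `record` still pays the full copy cost, so the "still slow" concern is not addressed by this sketch.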
- who's interested
- need to find software companies expressing interest
- grants we could apply for
- papers published in the area recently
- what's been done
- TotalView
- Vampir
- look up more debugging tools
- look up YY's CP-Miner
- papers published in the area recently
Topic revision: r1 - 12 Feb 2007 - 16:54:40 - RobBlake