Papers
2007-03-01
- http://www.hpcwire.com/hpc/1279576.html Marc Snir talking about how programming languages aren't the cause of concern. We need better educated people, better frameworks, and domain specific languages.
2007-02-23
States how POOMA's original developers all left, and now the project is being bounced around.
- An overview of the Trilinos project
- Heroux MA, Bartlett RA, Howle VE, Hoekstra RJ, Hu JJ, Kolda TG, Lehoucq RB, Long KR, Pawlowski RP, Phipps ET, Salinger AG, Thornquist HK, Tuminaro RS, Willenbring JM, Williams A, Stanley KS
In fact, much of POOMA terminology uses grid concepts. POOMA also focuses only on basic linear algebra computations.
In our experience mathematical libraries tend to be written by small teams of domain experts. For example, approximately 25 staff members (not including students) contribute to Trilinos development across approximately 25 different packages, but most individual Trilinos packages are developed by one to three staff members, and no single package has more than five developers. Some staff members contribute to more than one package, but very few contribute to more than three packages. Another observation is that mathematical libraries tend to be written by experienced numerical software developers who do not have much, if any, experience with formal software tools and processes. Both of these observations have motivated the Trilinos design and implementation. The Trilinos package architecture naturally supports small interrelated team development efforts. Trilinos services, provided on a package-by-package basis, directly address the second observation.
Fault identification: Bonsai [The Mozilla Organization 2004a] is a Web-based application that supports a variety of CVS repository browsing capabilities, and links changes in the repository to Bugzilla issues. Bonsai is most useful as a way to quickly identify changes in source code that have caused a software fault.
A full description of the Petra Object Model can be found in Boman et al. [2004].
- FIND: BOMAN, E., DEVINE, K., HEAPHY, R., HENDRICKSON, B., HEROUX, M., AND PREIS, R. 2004. LDRD report: Parallel repartitioning for optimal solver performance. Tech. rep. SAND2004-0365. Sandia National Laboratories, Albuquerque, NM.
2007-02-19
Contains a neat code example of a fault-tolerant finite difference scheme.
Also contains a reference stating that BigSim crashes with 30,000 processors.
Early investigations in the 1970’s showed that chaotic relaxation has quite restrictive convergence properties, which is the main reason why it never became popular. However, for 100,000-processor systems it may be time to once again look at this iteration-free method. When failures and failure recovery are factored into the solution time, chaotic relaxation has some attractive recovery properties. The tasks that communicate with a failed task can do recovery independently and locally. Furthermore, the information lost by a failed task does not need to be recovered. The calculations can be formulated to proceed and converge to the solution despite failures.
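To make the recovery argument concrete for myself, here is a minimal sketch of chaotic relaxation on a toy problem. It is not the paper's code example. Assumptions: the problem is the 1D Laplace equation u'' = 0 with boundary values 0 and 1, the "tasks" are simulated sequentially by updating randomly chosen points in place, and a failure is modelled by zeroing a block of points without restoring them. The names `chaotic_relaxation`, `fail_at`, and `failed` are made up for this note.

```python
import random

def chaotic_relaxation(n=20, left=0.0, right=1.0, sweeps=6000,
                       fail_at=1000, failed=range(8, 12), seed=0):
    """Relax u'' = 0 on n interior points using randomly ordered updates."""
    rng = random.Random(seed)
    u = [left] + [0.0] * n + [right]   # fixed boundary values at both ends
    for sweep in range(sweeps):
        if sweep == fail_at:
            # Simulated task failure: the state of these points is lost
            # and is NOT restored from a checkpoint.
            for i in failed:
                u[i] = 0.0
        # No fixed ordering and no barrier: each update simply uses
        # whatever neighbour values are current at that moment.
        for _ in range(n):
            i = rng.randint(1, n)
            u[i] = 0.5 * (u[i - 1] + u[i + 1])
    return u

u = chaotic_relaxation()
# The exact solution is the straight line between the boundary values.
exact = [i / 21 for i in range(22)]
max_error = max(abs(a - b) for a, b in zip(u, exact))
```

The point of the sketch is only that the iteration keeps converging after the lost block is zeroed; it says nothing about the restrictive convergence conditions for general matrices that the quote mentions.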
Older
We do not have a rigorous justification of why (5.5) is an appropriate replacement for (5.3), but here is a heuristic one. Equation (5.5) amounts to the statement that the operations performed in Gaussian elimination do not compound, from the point of view of growth factors; it is as if the kth elimination step were applied to the original matrix A = A^(1) rather than to A^(k). Why should this be? Our best answer is to describe the following mechanism, which suggests that the growth introduced at one elimination step tends not to contribute to further growth at later steps. At step k, the correction subtracted from A^(k) by (5.2) is a rank-1 matrix. Taking the extreme, suppose this correction happened to be much larger than the elements it was being added to. Then the new matrix A^(k+1) would be close to a matrix of rank one in its lower-right subsquare k+1 <= i, j <= n. Consequently, the large numbers just introduced would vanish at step k + 1.
This argument is certainly not complete, nor precise enough to distinguish (5.5) from various other possible modifications of (5.3). But we believe the feedback mechanism it describes is essential to the stability of Gaussian elimination: large growth makes the remaining matrix close to a matrix of low rank, which in turn inhibits large growth. Note that in keeping with the distinction in the Introduction between (2) and (2’), the low-rank property would be destroyed if the signs of the correction matrix were randomized. Experiments with a "lobotomized Gaussian elimination" algorithm of this kind confirm that (5.3) and (5.4) then become accurate. See §8 for the occurrence of this instability phenomenon in a computation of practical interest based on parallel pivoting.
Clearly, this parallel pivoting strategy is unstable. We explain this by observing that first, the multipliers are bigger than in standard Gaussian elimination (although still no greater than ); second and more important, the corrections introduced at each step are of high rank, so that there are no favorable dependences among signs to retard growth. The rough agreement of the data with the curve e^(n/(4 log n)) suggests that perhaps this particular pivoting strategy, unlike partial or complete pivoting, approximately satisfies hypotheses (1) and (2) of the Introduction.
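A small sketch to play with the growth factor discussed in these quotes. Assumptions: this is my own plain-Python Gaussian elimination with partial pivoting, not code from the paper; the growth factor is taken as the largest entry seen in any intermediate matrix divided by the largest entry of the original; `growth_factor` and `worst_case_matrix` are names made up for this note. The second function builds the classic worst-case example for partial pivoting, where the growth is 2^(n-1).

```python
import random

def growth_factor(a):
    """Growth factor of Gaussian elimination with partial pivoting."""
    a = [row[:] for row in a]          # work on a copy
    n = len(a)
    start = max(abs(x) for row in a for x in row)
    largest = start
    for k in range(n - 1):
        # Partial pivoting: bring the largest entry of column k to the diagonal.
        p = max(range(k, n), key=lambda i: abs(a[i][k]))
        a[k], a[p] = a[p], a[k]
        if a[k][k] == 0.0:
            continue                   # nothing to eliminate in this column
        for i in range(k + 1, n):
            m = a[i][k] / a[k][k]      # multiplier, |m| <= 1 after pivoting
            a[i][k] = 0.0
            for j in range(k + 1, n):
                a[i][j] -= m * a[k][j]  # one entry of the rank-1 correction
                largest = max(largest, abs(a[i][j]))
    return largest / start

def worst_case_matrix(n):
    """1 on the diagonal and in the last column, -1 below the diagonal."""
    return [[1.0 if (i == j or j == n - 1) else (-1.0 if i > j else 0.0)
             for j in range(n)] for i in range(n)]

rng = random.Random(0)
random_matrix = [[rng.gauss(0.0, 1.0) for _ in range(6)] for _ in range(6)]

worst = growth_factor(worst_case_matrix(6))   # 2**5 for n = 6
typical = growth_factor(random_matrix)        # much smaller in practice
```

Comparing `worst` with `typical` for a few sizes is a quick way to see the gap between the worst-case bound and the average-case behaviour the paper is about.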
- I like that in the introduction he gives acknowledgement to the people who got him interested in the subject.
Interesting presentation:
- orthogonal matrix solution methods have well understood stability properties, but cost twice as much.
- orthogonal methods are stable but cost more? Is there justification for this?
- Gaussian elimination has strange stability properties, but is stable most of the time
- can we come up with non-orthogonal methods that possibly run faster?
- L. N. Trefethen, The definition of numerical analysis, SIAM News, Nov 1992
- Trefethen, Lloyd N. (2006). "Numerical analysis", 20 pages. To appear in: Timothy Gowers and June Barrow-Green (editors), Princeton Companion of Mathematics, Princeton University Press.
- L. N. Trefethen, Predictions for scientific computing 50 years from now, Mathematics Today, 2000
Neat little paper and a moderately good introduction to social networks.
Need to look up the computer science paper database: NCSTRL
MEDLINE seems to be the standard for biomedical research.
Collaboration and personal communication are the means by which most collaborations form. This is healthy for science.
A database which would allow one conveniently and
quickly to extract shortest paths between scientists in
this way might have some practical use. Kautz et al. [60]
have constructed a web-based system which does just this
for computer scientists, with the idea that such a system
might help to create new professional contacts by providing
a “referral chain” of intermediate scientists through
whom contact may be established.
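The "referral chain" idea is just a shortest path in the coauthorship graph. A minimal sketch, assuming the graph is stored as a dictionary mapping each scientist to a list of coauthors; this is not how the system cited in the quote is implemented, and the function name `referral_chain` and the sample names are made up for this note.

```python
from collections import deque

def referral_chain(coauthors, source, target):
    """Shortest chain of coauthors from source to target, or None."""
    if source == target:
        return [source]
    parent = {source: None}            # also serves as the visited set
    queue = deque([source])
    while queue:
        person = queue.popleft()
        for other in coauthors.get(person, ()):
            if other in parent:
                continue
            parent[other] = person
            if other == target:
                # Walk the parent links back to the source.
                chain = [other]
                while parent[chain[-1]] is not None:
                    chain.append(parent[chain[-1]])
                return chain[::-1]
            queue.append(other)
    return None                        # no path: different components

# Hypothetical toy graph, symmetric coauthorship links.
graph = {
    "Ann": ["Bob", "Cat"],
    "Bob": ["Ann", "Dee"],
    "Cat": ["Ann"],
    "Dee": ["Bob", "Eve"],
    "Eve": ["Dee"],
    "Fay": [],
}
chain = referral_chain(graph, "Ann", "Eve")
```

Breadth-first search is enough here because every coauthorship link counts as one step; weighting links by number of joint papers would need a different algorithm.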
- Name: The Virtualization Approach to Parallel Programming: Runtime Optimizations and the State of the Art
- Authors: L.V. Kale
- Journal:
- Date Read: 2006-08-01
- Where to find: attachment:02-01-kale-virtualization.pdf
- Summary
Explains the benefits of virtualization, and the motivation behind the charm framework. There isn't much meat to the paper other than a very simple justification for computation-as-object.
Need to look up more on AMPI and the charm tutorial.
- Name: ParFUM: A Parallel Framework for Unstructured Meshes for Scalable Dynamic Physics Applications
- Authors: Orion S. Lawlor, Sayantan Chakravorty, Terry L. Wilmarth, Nilesh Choudhury, Isaac Dooley, Gengbin Zheng and Laxmikant V. Kale
- Journal:
- Date Read: 2006-08-01
- Where to find: http://charm.cs.uiuc.edu/papers/femINTL05.shtml
- Summary
Goes over the basic capabilities of ParFUM. I'm worried that this framework doesn't have the capabilities to do matrix math like PETSc. How well does Kale's group understand the numerical side of things?