2008-08-12
Large deformation diffeomorphic metric mapping
Idea: morph one shape into another, look into the morphing to classify disease.
Computational anatomy
Idea: I have no idea what this guy is saying.
2008-03-25
Stream Programming: Luring Programmers into the Multicore Era
Languages to look out for:
2008-03-24
Sparse Recovery Using Sparse Random Matrices
2008-02-20
Mixed finite element methods for elasticity
finite element exterior calculus
Elasticity: solving %$ A\sigma = \epsilon(u) $%, %$ -\mathrm{div}\, \sigma = f $%
- good FEM for heat conduction: Raviart-Thomas or Brezzi-Douglas-Marini families of elements. Nedelec 1st kind or second kind face elements in 3D.
- in 1960s, it must have become clear that naive element choices (linear stress, constant displacement) didn't work.
- 1968 Watwood & Hartz, use micromesh, proven stable
- 1984 Arnold Douglas Gupta: another micromesh
- Veubeke 1975, Amara-Thomas 1975 - modify variational form, only enforce weak conservation of angular momentum.
- 1984 PEERS element: add 'bubbles' to elements for stability
- 2002: Arnold-Winther, uses polynomials
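For reference, a sketch of the mixed weak form that the element families above discretize. This is the standard Hellinger-Reissner form, not copied from the talk; it assumes zero displacement on the boundary and symmetric stress fields, and the space names %$ \Sigma $% (stresses) and %$ V $% (displacements) are notation chosen here. Find %$ \sigma \in \Sigma $% and %$ u \in V $% such that

%$ (A\sigma, \tau) + (\mathrm{div}\, \tau, u) = 0 \quad \forall \tau \in \Sigma $%

%$ -(\mathrm{div}\, \sigma, v) = (f, v) \quad \forall v \in V $%

The first line comes from testing the constitutive equation with %$ \tau $% and integrating the strain term by parts. The weak-symmetry variants drop the symmetry requirement from %$ \Sigma $% and impose it through an extra Lagrange multiplier instead.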
2007-10-11
Panel on parallelism in multicore and differences
Speaker 1
- Differences with multicore compared to potential parallel computing
- %$ \sqrt{p} $% acceptable. # of transistors growing like %$ p^2 $%.
- compute times are short -- 16 msec frames
- has to be robust -- target audience is so much larger, can't resolve bugs for individual people
Speaker 3
- Nvidia chip programming
- users must see initial speedup in order to use the new hardware.
- performance tuning returns diminish.
- therefore, work on specialized hardware for the task, i.e., database processors would do well.
Speaker 5
- SciComp people aren't shocked by multicore
- Business people are shocked, expect a lot.
Other comments
- get rid of variables, make them read only, use functional programming, or use transactional memory
- who is the target audience for general parallel computing? Who needs it? What is the killer application?
2007-08-15
Thalamic Morphometric Analysis in ADHD
- size of the thalamus in ADHD.
- computes regular 3D square grid of nodes that make up the thalamus
- computes collection of tets that make up cube, include or reject them as part of the thalamus
- computes volume through divergence integral? Why? We have the tets.
- LDDMM Large Deformation Diffeomorphic Metric Mapping - allows you to compare arbitrary stuff to a standard template.
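To make the "why the divergence integral, we have the tets" question concrete, here is a minimal sketch of both ways to get the volume. It is not the speaker's code: the function names are made up for illustration, and the surface version assumes a closed triangle mesh with outward-oriented faces. Both routes should agree, so summing the tets directly looks like the simpler choice when the tets are already available.

```python
def tet_volume(a, b, c, d):
    """Signed volume of a tetrahedron: det[b-a, c-a, d-a] / 6."""
    u = [b[i] - a[i] for i in range(3)]
    v = [c[i] - a[i] for i in range(3)]
    w = [d[i] - a[i] for i in range(3)]
    det = (u[0] * (v[1] * w[2] - v[2] * w[1])
           - u[1] * (v[0] * w[2] - v[2] * w[0])
           + u[2] * (v[0] * w[1] - v[1] * w[0]))
    return det / 6.0


def volume_from_tets(tets):
    """Direct route: add up the (unsigned) volume of every accepted tet."""
    return sum(abs(tet_volume(*t)) for t in tets)


def volume_from_surface(triangles):
    """Divergence-theorem route with F = x/3 (div F = 1).

    For a closed surface of outward-oriented triangles (p0, p1, p2),
    the flux integral reduces to sum of p0 . (p1 x p2) / 6, which is the
    signed volume of the tet formed by the origin and each triangle.
    """
    origin = (0.0, 0.0, 0.0)
    return sum(tet_volume(origin, p0, p1, p2) for p0, p1, p2 in triangles)
```

The surface route only needs the boundary triangles, which may be why it was used, but that is a guess.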
Vascular Endothelial Cell Modeling
Interested in the mechanisms behind muscles in the blood vessel and how they regulate blood pressure. Cells between the internal blood vessel wall and smooth muscle are electrically active (signals propagate for mm).
Investigating failed computational shocks in a Computational Model of Calcium Overload.
2007-04-11
Book Recommendations
- Domain-Driven Design
- Patterns for Time-Triggered Embedded Systems.
2007-03-26
- Bill Gropp
- Challenges for MPI in the Petaflops Era
- Argonne National Labs
- "Learning from the Success of MPI" add to citeulike.org
2007-01-26
- Automated Composition of Reliable, Adaptive Parallel/Distributed Programs from Self-Describing Components
- Jim Browne
- UT Austin
- programming model
- flow of control versus interacting actors
- too many buzzwords, this talk isn't showing the big ideas
- basically, add a "contract" for each method, let the compiler do the matching between components. My idea for a programming model, but much more verbose.
2006-10-11
- Kale: migratable objects and adaptive runtime systems for parallel performance.
2006-10-04
2006-09-27
Talk is on the problems with parallel computing languages.
Parallel programming languages for Large Scale Parallel Computing
- interested in 1K nodes
- load balancing, locality, communication are algorithmic issues, need to be addressed by the user.
- scientific computing
- transformational rather than reactive code
- (not mentioned in talk, but low-hanging fruit: programs are highly regular, easy to work with)
- expect reproducibility and determinism
- exceptions: associative/commutative operations == reductions
- large codes (>> 100 KLOC)
- programming language domain: 3 dimensions
- application type: SC, transaction server, client application, web services
- software type: large, long-lived application; small prototype code
- platform type: large scale, SMP, 1000K processors
- need new languages?
- claim: bad mpi
- mpi codes are larger
- mpi is low level *not true? an MPI apology
- mpi communication is a small fraction of large frameworks and is hidden at bottom
- empirical studies show some problems are easier with some frameworks.
- code size bad predictor of devel time.
- tuning is harder with higher level languages
- can't control memory locality
- other SE aspects have larger effect
- parallel compilers aren't high quality.
- what snir wants
- performance - can beat mpi codes
- same reason we moved away from assembly
- faster compiled communication that avoids software overhead of mpi
- compiler optimizations of communications
- Co-Array Fortran faster than mpi on cray machines?
- semantics & performance transparency
- can analyze & understand outcome and performance of parallel code by looking at source code; language has simple (approximate) performance semantics.
- Time = work/p + depth.
- from parallel algorithms
- guy from cmu developed parallel functional language with exact performance semantics.
- Need approximate composition rules for work and depth.
- some user control of parallelism, load balancing, and communication
- algo design should be expressible
- non-determinism only when needed.
- support iterative refinement of program
- make performance decisions and optimizations later
- modularity & composability
- sequential method can be replaced by parallel method with no change in invoking code
- nested parallelism? what?
- easy to compose different parallel programs (only hard b/c mpi sucks).
- semantics and performance characteristics of parallel code can be easily inferred from semantics and characteristics of its components
- object orientation
- backwards compatibility
- interoperability with mpi codes
- similar to existing languages
- ways to do this:
- processor virtualization
- why not user controlled load balancing?
- failures of cores
- power saving
- compose large independent codes in other libraries
- global namespace
- untie memory location from memory name
- still need local copies for performance.
- problems - processors need to keep these variables in sync *how to do
- compiled if association is persistent
- managed in runtime (hash table)
- optimize if association is slow changing
- run time compilation can be used here (duh)
- global control and dynamic parallelism
- should focus globally on algo, processor behavior should be derived at run time
- differences between distributed computing and parallel computation
- what's happening today
- partitioned global array systems
- UPC and Co-Array Fortran
- global references are syntactically distinct from local references
- no overhead for local references
- sequential code executed once on each processor (local view of control)
- add global barriers and forall in UPC
- critique
- performance - can beat mpi
- simple model of execution
- :-/ user control of data and control partitioning - same as mpi
- non deterministic behavior, need to lock between reads and writes (but only because lack of dependencies)
- iterative refinement
- composability - hard to compose
- no object orientation (dynamic type resolution screws compiler-- why?)
- backwards compatibility
- no global namespace
- no dynamic parallelism
- X10
- no global namespace
- no simple performance
* my problems with snir's language
- scientists care the most about performance
- false! scientists care the most about turnaround time. Why else are most clusters underutilized?
- still assumes that the programmer is handling dependencies explicitly
- assumes programmer wants/needs to handle memory tuning
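A small sketch of the "Time = work/p + depth" rule from the notes above, applied to the one place the talk allows reordering: a reduction with an associative operation. This is an illustration under stated assumptions, not anything shown in the talk: the names `tree_reduce` and `estimated_time` are made up, every application of the operation counts as one unit of work, and the tree is evaluated sequentially here (only the counting is "parallel").

```python
def tree_reduce(op, values):
    """Reduce with an associative op as a balanced tree.

    Returns (result, work, depth): work is the total number of op
    applications, depth is the number of tree levels (the critical path).
    """
    level = list(values)
    if not level:
        raise ValueError("need at least one value")
    work = 0
    depth = 0
    while len(level) > 1:
        # All pairs on one level are independent, so they could run at once.
        nxt = [op(level[i], level[i + 1]) for i in range(0, len(level) - 1, 2)]
        work += len(nxt)
        if len(level) % 2:
            nxt.append(level[-1])  # odd element carries over to the next level
        level = nxt
        depth += 1
    return level[0], work, depth


def estimated_time(work, depth, p):
    """Approximate running time on p processors: work/p + depth."""
    return work / p + depth
```

Summing 1..8 gives work 7 and depth 3, so the estimate on 4 processors is 7/4 + 3 = 4.75 steps. Composition follows the usual rules: work adds in both sequential and parallel composition, depth adds sequentially and takes the max in parallel.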
2006-09-25
Vaidya came to talk about implementing a parallel Lanczos algorithm with Kale's Lab.
- Block Lanczos Algorithms
- get the name of the NA book he recommended.
- factor A
- symmetric pos def,
- use cholesky
- reorder so factorization is sparse
- Loop
- compute A^-1 * C for some columns C
- take that product, orthogonalize w/r/t block vectors
- clean Q vectors
- n = 10 million
- m' - 500-1000
- m -
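A rough sketch of one pass through the loop described above: factor A once with Cholesky, apply A^-1 to a block of columns by triangular solves, then orthogonalize the results against the vectors already kept. This is a simplified illustration, not the algorithm from the talk: it is dense pure Python, skips the sparsity-preserving reordering, has no breakdown handling for zero-norm vectors, and does not build the block tridiagonal matrix. All function names are made up.

```python
import math


def cholesky(A):
    """Lower-triangular L with A = L L^T, for symmetric positive definite A."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = A[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    return L


def solve_cholesky(L, b):
    """Solve A x = b using the factor: forward solve L y = b, then L^T x = y."""
    n = len(L)
    y = [0.0] * n
    for i in range(n):
        y[i] = (b[i] - sum(L[i][k] * y[k] for k in range(i))) / L[i][i]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def orthonormalize(w, basis):
    """Modified Gram-Schmidt: remove components along basis, then normalize."""
    w = list(w)
    for q in basis:
        h = dot(q, w)
        w = [wi - h * qi for wi, qi in zip(w, q)]
    norm = math.sqrt(dot(w, w))
    return [wi / norm for wi in w]


def block_step(L, block, basis):
    """Apply A^-1 to each column in block, orthonormalize against basis
    and against the new columns produced so far."""
    new = []
    for c in block:
        w = solve_cholesky(L, c)
        new.append(orthonormalize(w, basis + new))
    return new
```

The factorization is the expensive part and is done once; each loop iteration then costs only triangular solves plus the orthogonalization, which is presumably why the sparse reordering matters at n = 10 million.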
2006-09-21
- A+ language came from Morgan Stanley.
- Java xml broker - executable flowchart
- contributed to AFS
- contributes to cpan
2006-09-14
- Who: Douglas Hofstadter
- By the guy who wrote Gödel, Escher, Bach.
talk focuses on analogy and how it forms the basis of cognition.
Overall, I thought the talk was ok, but not great. Hofstadter is an entertaining speaker, but I thought his argument was weak. He argued that things acquire their value by making comparisons against other things. To back this up, he provided lots of anecdotal evidence where his mind made comparisons. His conclusions didn't follow from his premises.
2006-08-24
- Software implications of multi core architectures
- siddhartha <didn't get last name, from IBM>
- death of Moore's law? Due to small size of transistors, consumes more and more power without doing work.
- languages to look up.
- chapel
- fortress
- X10
- toronto - sh language. stream language. Is that what I'm looking for?
- he thinks that new programming paradigms won't appear, that we won't be able to move to functional programming
- if the concepts are proven, I disagree.
- ways to do parallelization
- SPMD
- better locking mechanisms
- release consistency ??
- lazy release consistency ??
- transactional consistency ??
- toronto -
big idea: you can do algo analysis on parallel programs.