2008-08-12

Large deformation diffeomorphic metric mapping

Idea: morph one shape into another, then examine the morphing itself to classify disease.

Computational anatomy

Idea: I have no idea what this guy is saying.

2008-03-25

Stream Programming: Luring Programmers into the Multicore Era
  • Bill Thies

Languages to look out for:

  • pH
  • HAL
  • Esterel

2008-03-24

Sparse Recovery Using Sparse Random Matrices

2008-02-20

Mixed finite element methods for elasticity
  • Douglas N. Arnold

finite element exterior calculus

Elasticity: solving %$ A\sigma = \epsilon(u) $%, %$ -\mathrm{div}\,\sigma = f $%

  • good FEM for heat conduction: Raviart-Thomas or Brezzi-Douglas-Marini families of elements; Nédélec first-kind or second-kind face elements in 3D.
  • in the 1960s, it must have become clear that naive element choices (linear stress, constant displacement) didn't work.
  • 1968 Watwood & Hartz: use micromesh, proven stable
  • 1984 Arnold, Douglas, Gupta: another micromesh
  • Veubeke 1975, Amara-Thomas 1975 - modify variational form, only enforce weak conservation of angular momentum.
  • 1984 PEERS element: add 'bubbles' to elements for stability
  • 2002 Arnold-Winther: uses polynomials
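
For reference, a sketch of the standard mixed (Hellinger-Reissner type) weak form behind the two elasticity equations above. It assumes homogeneous displacement boundary conditions; the space names %$ \Sigma $% and %$ V $% are notation introduced here, not taken from the talk.

```latex
Find $\sigma \in \Sigma \subset H(\mathrm{div},\Omega;\mathbb{S})$ and
$u \in V \subset L^2(\Omega;\mathbb{R}^n)$ such that
\[
\begin{aligned}
\int_\Omega A\sigma : \tau \, dx + \int_\Omega \mathrm{div}\,\tau \cdot u \, dx &= 0
  && \forall \tau \in \Sigma, \\
-\int_\Omega \mathrm{div}\,\sigma \cdot v \, dx &= \int_\Omega f \cdot v \, dx
  && \forall v \in V.
\end{aligned}
\]
```

The first line comes from testing the constitutive equation against a symmetric %$ \tau $% and integrating by parts. The elasticity entries in the list are, as far as I understand, different attempts at choosing discrete stress and displacement spaces so that this saddle-point problem stays stable.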

2007-10-11

Panel on parallelism in multicore and differences

Speaker 1

  • Differences with multicore compared to potential parallel computing
    • %$ \sqrt{p} $% acceptable. Number of transistors growing like %$ p^2 $%.
    • compute times are short -- 16 msec frames
    • has to be robust -- target audience is so much larger, can't resolve bugs for individual people

Speaker 3

  • Nvidia chip programming
    • users must see initial speedup in order to use the new hardware.
  • performance tuning returns diminish.
  • therefore, work on specialized hardware for the task, i.e. database processors would do well.

Speaker 5

  • SciComp people aren't shocked by multicore
  • Business people are shocked, expect a lot.

Other comments

  • get rid of variables, make them read only, use functional programming, or use transactional memory
  • who is the target audience for general parallel computing? Who needs it? What is the killer application?

2007-08-15

Thalamic Morphometric Analysis in ADHD
  • Ann Kenedy

  • size of the thalamus in ADHD.
  • computes regular 3D square grid of nodes that make up the thalamus
  • computes collection of tets that make up cube, include or reject them as part of the thalamus
  • computes volume through divergence integral? Why? We have the tets.
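
On the "why a divergence integral?" question: for a closed tet mesh the two computations agree, since %$ V = \frac{1}{3}\oint x \cdot n \, dS $%. A minimal sketch, assuming a single unit cube split into six tets (a Kuhn split, my choice, not necessarily what the speaker used); all function names are hypothetical.

```python
from itertools import permutations


def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def kuhn_tets():
    """Split the unit cube into 6 tets, one per ordering of the axes."""
    tets = []
    for perm in permutations(range(3)):
        v = [0, 0, 0]
        verts = [tuple(v)]
        for axis in perm:
            v[axis] = 1
            verts.append(tuple(v))
        tets.append(tuple(verts))
    return tets


def tet_volume(a, b, c, d):
    """Volume of one tet from the scalar triple product."""
    return abs(dot(sub(b, a), cross(sub(c, a), sub(d, a)))) / 6.0


def boundary_triangles(tets):
    """Faces owned by exactly one tet, oriented with outward normals."""
    faces = {}
    for tet in tets:
        for i in range(4):
            a, b, c = [tet[j] for j in range(4) if j != i]
            opposite = tet[i]
            normal = cross(sub(b, a), sub(c, a))
            if dot(normal, sub(opposite, a)) > 0:
                b, c = c, b  # flip so the normal points away from the tet
            faces.setdefault(frozenset((a, b, c)), []).append((a, b, c))
    return [tris[0] for tris in faces.values() if len(tris) == 1]


def volume_by_sum(tets):
    """Direct approach: add up the tet volumes."""
    return sum(tet_volume(*t) for t in tets)


def volume_by_divergence(triangles):
    """Surface approach: each flat triangle contributes a . (b x c) / 6."""
    return sum(dot(a, cross(b, c)) for a, b, c in triangles) / 6.0


tets = kuhn_tets()
print(volume_by_sum(tets), volume_by_divergence(boundary_triangles(tets)))
```

The surface form only needs the boundary triangles, which might be why it was used, though that is a guess on my part.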

  • LDDMM Large Deformation Diffeomorphic Metric Mapping - allows you to compare arbitrary stuff to a standard template.


Vascular Endothelial Cell Modeling

  • Timothy Lay

Interested in the mechanisms behind muscles in the blood vessel and how they regulate blood pressure. Cells between the internal blood vessel wall and smooth muscle are electrically active (signals propagate for mm).


Investigating failed computational shocks in a Computational Model of Calcium Overload.
  • Linmiao Xu

2007-04-11

Book Recommendations
  • Domain-Driven Design
  • Patterns for Time-Triggered Embedded Systems

2007-03-26

  • Bill Gropp
  • Challenges for MPI in the Petaflops Era
  • Argonne National Labs

  • "Learning from the Success of MPI" add to citeulike.org

2007-01-26

  • Automated Composition of Reliable, Adaptive Parallel/Distributed Programs from Self-Describing Components
  • Jim Browne
  • UT Austin

  • programming model
    • flow of control versus interacting actors
  • too many buzzwords, this talk isn't showing the big ideas
  • basically, add a "contract" for each method, let the compiler do the matching between components. My idea for a programming model, but much more verbose.

2006-10-11

  • Kale: migratable objects and adaptive runtime systems for parallel performance.

2006-10-04

  • start with a good OO language
    • preferred java
  • remove java's restrictions to good performance
    • take out dynamic loading
    • don't use jvm -- compile to source
    • no reflection
    • do not require exact reproducibility of exceptions and floating point operations
  • add to java
    • true multi dimensional arrays
    • immutable classes :: classes that behave like a value, ie complex numbers.
      • taken from titanium
    • operator overloading
    • deep copying (at least for immutables)
      • using the := operator
  • modern scicomp apps use irregular structures
    • sparse arrays
    • graphs
    • mapping of data structures handled by user software, not compiler, which makes optimization hard
    • move these structures to the compiler level
  • Add ordered sets as basic types
  • maps: support accessing and updating of maps.
    • too restrictive, doesn't handle projections that reduce dimensionality???
  • sites
    • run one thing or less at a site
    • store data at a site
    • cohort: set of sites
    • io can be associated with a particular site
  • shared variables destroy history of modification
    • goto's destroy history of execution, same argument here.
    • unrestricted usage unnecessary since a PRAM program can be converted to a Bulk Synchronous Parallel program (assuming logarithmic parallel slack)
    • BSP model - any two conflicting accesses to shared variables are ordered by a barrier
  • allow nested barriers
    • do we allow named barriers?
    • other sugar to describe how barriers nest?
  • global int i, j, sum;
    Site[] c = new Site[2];
    sync {
       parallel {
         on c[0] : {
            i = 3;
            barrier();
            i = sum;
            }
         on c[1] : {
            j = 7;
            barrier();
            j = sum;
            }
         default: sum = i + j;
       }
    }
    
    • barrier is like a thrown exception, default is like a catch
    • should barrier be a function instead?
  • questions:
    • assume transactional memory that supports transactions of a few locations. Does this cover all reductions of interest?
  • types of variables -
    • global :: read only everywhere
    • shared :: read or invalid everywhere
    • private :: read write in one place, invalid on others
    • invalid :: not accessible
  • user can only re-cast to more restrictive types
  • research hypotheses:
    • java's approach of static and dynamic checks results in type- and memory-safe codes, with acceptable overhead. Simple extensions allow concurrency-safe codes with acceptable overheads
    • scientific codes can be expressed efficiently using a nested BSP model, augmented with atomic sections for commuting updates
    • IDEs can provide feedback
  • goals
    • performance: tbd
    • semantic & performance transparency: better than current
    • some user control of parallelism, load balancing, locality and communication: yes, runtime, yes, yes.
    • support for iterative refinement: yes, charm LB and update variables
    • modularity and composability: good
    • OO: good
    • virtualization: yes
    • global namespace: yes
    • global control: yes
    • dynamic parallelism: yes
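
The `sync`/`parallel`/`barrier` example in the notes above maps fairly directly onto a barrier with a completion action. A minimal Python sketch, assuming threads stand in for sites and that `default` runs exactly once after every site has reached the barrier (my reading of the notes, not something the talk spelled out); `run_two_sites` and the `state` dict are illustrative names.

```python
import threading


def run_two_sites():
    # Stand-ins for the "global int i, j, sum" declarations.
    state = {"i": None, "j": None, "sum": None}

    def default_action():
        # Plays the role of the "default" clause: called by one thread,
        # once both sites have arrived and before either is released.
        state["sum"] = state["i"] + state["j"]

    barrier = threading.Barrier(2, action=default_action)

    def site0():
        state["i"] = 3
        barrier.wait()              # conflicting accesses ordered by barrier
        state["i"] = state["sum"]

    def site1():
        state["j"] = 7
        barrier.wait()
        state["j"] = state["sum"]

    threads = [threading.Thread(target=site0),
               threading.Thread(target=site1)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return state


print(run_two_sites())
```

Each variable is written by a single site before the barrier and the reduction happens in the barrier action, so the result does not depend on thread scheduling, which seems to be the point of the BSP restriction.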

2006-09-27

Talk is on the problems with parallel computing languages.

Parallel Programming Languages for Large Scale Parallel Computing

  • interested in 1K nodes
  • load balancing, locality, communication are algorithmic issues, need to be addressed by the user.
  • scientific computing
    • transformational rather than reactive code
    • (not mentioned in talk, but low-hanging fruit: programs are highly regular, easy to work with)
    • expect reproducibility and determinism
      • exceptions: associative/commutative operations == reductions
    • large codes (>> 100 KLOC)
  • programming language domain: 3 dimensions
    • application type: SC, transaction server, client application, web services
    • software type: large, long-lived application, small prototype code
    • platform type: large scale, SMP, 1000K processors
  • need new languages?
    • claim: bad mpi
      • mpi codes are larger
      • mpi is low level *not true? an MPI apology
      • mpi communication is a small fraction of large frameworks and is hidden at bottom
      • empirical studies show some problems are easier with some frameworks.
      • code size bad predictor of devel time.
      • tuning is harder with higher level languages
        • can't control memory locality
      • other SE aspects have larger effect
      • parallel compilers aren't high quality.
  • what snir wants
    • performance - can beat mpi codes
      • same reason we moved away from assembly
      • faster compiled communication that avoids software overhead of mpi
      • compiler optimizations of communications
      • co-array fortran faster than mpi on cray machines?
    • semantics & performance transparency
      • can analyze & understand outcome and performance of parallel code by looking at source code; language has simple (approximate) performance semantics.
      • Time = work/p + depth.
        • from parallel algorithms
        • guy from cmu developed parallel functional language with exact performance semantics.
        • Need approximate composition rules for work and depth.
      • some user control of parallelism, load balancing, and communication
        • algo design should be expressible
      • non-determinism only when needed.
      • support iterative refinement of program
        • make performance decisions and optimizations later
      • modularity & composability
        • sequential method can be replaced by parallel method with no change in invoking code
          • nested parallelism? what?
        • easy to compose different parallel programs (only hard b/c mpi sucks).
        • semantics and performance characteristics of parallel code can be easily inferred from semantics and characteristics
      • object orientation
      • backwards compatibility
        • interoperability with mpi codes
        • similar to existing languages
  • ways to do this:
    • processor virtualization
    • why not user controlled load balancing?
      • failures of cores
      • power saving
      • compose large independent codes in other libraries
    • global namespace
      • untie memory location from memory name
      • still need local copies for performance.
      • problems - processors need to keep these variables in sync *how to do
        • compiled if association is persistent
        • managed in runtime (hash table)
        • optimize if association is slow changing
        • run time compilation can be used here (duh)
      • global control and dynamic parallelism
        • should focus globally on algo, processor behavior should be derived at run time
        • differences between distributed computing and parallel computation
  • what's happening today
    • partitioned global array systems
      • UPC and co-array fortran
      • global references are syntactically distinct from local references
        • no overhead for local references
      • sequential code executed once on each processor (local view of control)
      • add global barriers and forall in UPC
    • critique
      • :-) performance - can beat mpi
      • :-) simple model of execution
      • :-/ user control of data and control partitioning - same as mpi
      • :-( non deterministic behavior, need to lock between reads and writes (but only because of lack of dependencies)
      • :-( iterative refinement
      • :-( composability - hard to compose
      • :-( no object orientation (dynamic type resolution screws compiler-- why?)
      • :-) backwards compatibility
      • :-( no global namespace
      • :-( no dynamic parallelism
    • X10
      • no global namespace
      • no simple performance
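
The "Time = work/p + depth" rule and the wish for composition rules can be made concrete. A minimal sketch, assuming the usual work-depth conventions (sequential composition adds both work and depth; parallel composition adds work and takes the maximum depth). The talk did not spell these rules out, and `Cost`, `then`, `par` and `tree_reduce_cost` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cost:
    work: int   # total number of unit operations
    depth: int  # length of the longest dependency chain

    def then(self, other):
        """Sequential composition: work and depth both add."""
        return Cost(self.work + other.work, self.depth + other.depth)

    def par(self, other):
        """Parallel composition: work adds, depth is the maximum."""
        return Cost(self.work + other.work, max(self.depth, other.depth))

    def time_estimate(self, p):
        """Approximate running time on p processors: work/p + depth."""
        return self.work / p + self.depth


ZERO = Cost(0, 0)
UNIT = Cost(1, 1)  # one addition


def tree_reduce_cost(n):
    """Cost of summing n values with a balanced binary reduction tree."""
    if n <= 1:
        return ZERO
    left = n // 2
    halves = tree_reduce_cost(left).par(tree_reduce_cost(n - left))
    return halves.then(UNIT)  # combine the two partial sums


cost = tree_reduce_cost(1024)
for p in (1, 16, 256):
    print(p, cost.time_estimate(p))
```

For 1024 values this gives work 1023 and depth 10, so the estimate flattens out near the depth once p gets large.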

* my problems with snir's language

  • scientists care the most about performance
    • false! scientists care the most about turn around time. Why else are most clusters under utilized?
  • still assumes that the programmer is handling dependencies explicitly
  • assumes programmer wants/needs to handle memory tuning

2006-09-25

Vaidya came to talk about implementing a parallel Lanczos algorithm with Kale's Lab.

  • Block Lanczos Algorithms
    • get the name of the NA book he recommended.

  • factor A
    • symmetric pos def,
    • use cholesky
    • reorder so factorization is sparse
  • Loop
    • compute A^-1 * C for some columns C
    • take that product, orthogonalize w/r/t block vectors
  • clean Q vectors
    • n = 10 million
    • m' - 500-1000
    • m -
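
A rough sketch of the factor / solve / orthogonalize loop above, heavily simplified: dense pure-Python matrices, block size 1, no sparse reordering, and no Lanczos recurrence bookkeeping. It only shows the Cholesky solve followed by orthogonalization against the vectors kept so far; the function names are hypothetical.

```python
import math


def cholesky(A):
    """Lower-triangular L with A = L L^T, for symmetric positive definite A."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = A[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    return L


def solve_cholesky(L, b):
    """Solve A x = b given the factor L (forward, then back substitution)."""
    n = len(L)
    y = [0.0] * n
    for i in range(n):
        y[i] = (b[i] - sum(L[i][k] * y[k] for k in range(i))) / L[i][i]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def orthonormalize(v, basis):
    """Two passes of modified Gram-Schmidt against basis, then normalize."""
    for _ in range(2):
        for q in basis:
            c = dot(q, v)
            v = [vi - c * qi for vi, qi in zip(v, q)]
    norm = math.sqrt(dot(v, v))
    return [vi / norm for vi in v]


def inverse_krylov_basis(A, start, steps):
    """Factor A once, then repeatedly apply A^-1 and orthogonalize."""
    L = cholesky(A)
    basis = [orthonormalize(start, [])]
    for _ in range(steps - 1):
        w = solve_cholesky(L, basis[-1])         # compute A^-1 * c
        basis.append(orthonormalize(w, basis))   # clean against the Q vectors
    return basis


A = [[2.0, -1.0, 0.0, 0.0],
     [-1.0, 2.0, -1.0, 0.0],
     [0.0, -1.0, 2.0, -1.0],
     [0.0, 0.0, -1.0, 2.0]]
Q = inverse_krylov_basis(A, [1.0, 0.0, 0.0, 0.0], 3)
print([[round(dot(qi, qj), 12) for qj in Q] for qi in Q])
```

At the sizes in the notes (n = 10 million) the factorization would have to be sparse and distributed, which is presumably where the parallel work comes in.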

2006-09-21

  • A+ language came from Morgan Stanley.
  • Java xml broker - executable flowchart
  • contributed to AFS
  • contributes to cpan

2006-09-14

  • Who: Douglas Hofstadter
    • By the guy who wrote Godel Escher Bach.

talk focuses on analogy and how it forms the basis of cognition.

Overall, I thought the talk was ok, but not great. Hofstadter is an entertaining speaker, but I thought his argument was weak. He argued that things acquire their value by making comparisons against other things. To back this up, he provided lots of anecdotal evidence where his mind made comparisons. His conclusions didn't follow from his premises.

2006-08-24

  • Software implications of multi core architectures
  • siddhartha <didn't get last name, from IBM>

  • death of moore's law? Due to small size of transistors, consumes more and more power without doing work.
  • languages to look up.
    • chapel
    • fortress
    • extent
    • toronto - sh language. stream language. Is that what I'm looking for?
  • he thinks that new programming paradigms won't appear, that we won't be able to move to functional programming
    • if the concepts are proven, I disagree.
  • ways to do parallelization
    • SPMD
      • openmp
      • PGAS ??
      • UPC ??
    • better locking mechanisms
      • release consistency ??
      • lazy release consistency ??
      • transactional consistency ??
      • toronto -


big idea: you can do algo analysis on parallel programs.

Topic revision: r12 - 12 Aug 2008 - 13:52:26 - RobBlake
 