2007-06-05

Things to discuss with Gernot:

  • input/output formats between CARP and memfem
  • getting limpet working in memfem and vice versa
  • standardizing on a build of petsc

2007-06-04

I ran into some design problems while trying to parallelize the way that tetra files are read into memory.

  • Text files lack fixed-length records, which makes seeking to a specific line number very difficult.
  • When reading in tetra files, we need to keep the .tetras, .fibers, and .spec files synchronized with each other.

Here are approaches I could use to solve the problem

  • Binary formats
    • Disadvantages: Changes the input format, endian-ness
    • Advantages: fixes seeking problem, relatively easy to implement
  • Parallel Ascii reads: have the first processor do a pre-scan and figure out where we should be seeking in all the input files
    • Disadvantages: all processors will be waiting on one processor to read in the whole file, hard to code
    • Advantages: no change in input files
  • Split the input files into a separate input file for each processor
    • Disadvantages: not as flexible, need to plan processor utilization ahead of time
    • Advantages: little change to input files or the way that input files are read in.
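
The pre-scan idea in the second approach can be sketched roughly as follows. This is a minimal illustration in Python, not memfem code: the function names `scan_line_offsets` and `read_lines` are hypothetical, and it assumes one record per line. One process would build the offset table and share it, after which every process can seek straight to its own lines.

```python
def scan_line_offsets(path):
    """Pre-scan: return the byte offset at which each line of the file starts."""
    offsets = []
    pos = 0
    with open(path, "rb") as f:
        for line in f:
            offsets.append(pos)
            pos += len(line)
    return offsets


def read_lines(path, offsets, first, count):
    """Seek directly to line `first` (0-based) and read `count` lines."""
    if count == 0:
        return []
    with open(path, "rb") as f:
        f.seek(offsets[first])
        return [f.readline().decode("ascii").rstrip("\n") for _ in range(count)]
```

The cost noted in the list still applies: the pre-scan touches every byte of the file once, and the other processors wait until the offsets have been distributed.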

Since most of these options require changing the way that input files are read in, I decided to search for other input formats. I wanted to see if anyone else had already solved this parallel IO problem. I tried I-DEAS UNV, AVS UCD, and a few other formats that libMesh supports. All of these seemed to want to be read in serial, and I couldn't find code that implemented the file IO. My most promising lead was HDF5, which supports named attributes, data, and parallel data IO on multiple processors (I found it by looking at ParaView's support for parallel input files). HDF5 comes with a library for reading its files, and unfortunately it comes with its own build system. However, HDF5 uses MPI-IO underneath, and MPI-IO currently has problems with NFS. This means that we wouldn't be able to use HDF5 on most of our clusters.

If we have to roll our own input format, the easiest approach is to split up the input files. This avoids the whole endian-ness issue entirely. Memfem supports input directories, so pre-partitioned data could be archived. In the case that we don't want to split things up, we could fall back to the original input files without rewriting the code.
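
As a rough sketch of the pre-partitioning step, the helper below splits one text input file into a directory per processor. The `part0000`, `part0001`, ... directory names and the function name are assumptions made for illustration, not memfem conventions, and the sketch assumes one record per line with no header line.

```python
import os


def split_into_directories(src_path, out_root, nparts):
    """Write contiguous chunks of src_path into out_root/partNNNN/<same file name>."""
    with open(src_path) as f:
        lines = f.readlines()
    # Spread the remainder over the first `extra` parts so sizes differ by at most one.
    base, extra = divmod(len(lines), nparts)
    start = 0
    paths = []
    for rank in range(nparts):
        count = base + (1 if rank < extra else 0)
        part_dir = os.path.join(out_root, "part%04d" % rank)  # hypothetical layout
        os.makedirs(part_dir, exist_ok=True)
        out_path = os.path.join(part_dir, os.path.basename(src_path))
        with open(out_path, "w") as out:
            out.writelines(lines[start:start + count])
        paths.append(out_path)
        start += count
    return paths
```

Because each piece keeps the original text format and file name, the existing reader could in principle be pointed at one directory per processor, which is the appeal of this option.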

While deciding how to split up the input files, I had to decide how this relates to the parallel partitioning of the data. Can more than one processor own the same tetrahedron? The same point? In order to best support how the code currently splits up the data, I decided that it was best if each processor had all the input data necessary to implement the local matrix equations.
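
One way to read that decision, sketched below under my own assumptions: each node has exactly one owning processor, and a processor receives every tetrahedron that touches at least one of its nodes. A tetrahedron on a partition boundary then appears in more than one processor's input. The names `local_tetras` and `local_nodes` are hypothetical.

```python
def local_tetras(tetras, node_owner, rank):
    """Indices of all tetrahedra with at least one node owned by `rank`."""
    return [i for i, tet in enumerate(tetras)
            if any(node_owner[n] == rank for n in tet)]


def local_nodes(tetras, tetra_ids):
    """All nodes referenced by the given tetrahedra, owned or not."""
    return sorted({n for i in tetra_ids for n in tetras[i]})


# Three tetrahedra over seven nodes; nodes 0-3 belong to rank 0, nodes 4-6 to rank 1.
tetras = [(0, 1, 2, 3), (1, 2, 3, 4), (3, 4, 5, 6)]
node_owner = [0, 0, 0, 0, 1, 1, 1]
rank1_tetras = local_tetras(tetras, node_owner, 1)   # tetrahedra 1 and 2
rank1_nodes = local_nodes(tetras, rank1_tetras)      # includes nodes owned by rank 0
```

Under this scheme the answer to both questions is yes for the input data: shared tetrahedra and their nodes are duplicated, while ownership of each matrix row stays with a single processor.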

When I decided to finally get down to coding yesterday, I noticed that many memfem tests were no longer working. I went and took snapshots of the code every month that I had been gone, looking for the changes that caused the testing code to break. I think I can pin the cause to one or more problems:

  • Umar made a change in the input path code so that multiple files in the path with the same name caused memfem to barf and die
  • Umar made some change to the way that restart files are written, so now restart tests no longer pass.
  • Umar changed the electroporation currents so that now the code is much more sensitive to transmembrane stimuli.

2007-05-29

Read http://bruteforce.dyndns.org/cepwiki/RobSummer07ToDoList

I went over the above list with Natalia today. In the immediate future, the big todo items are parallel input file reads, getting our code to work nicely with limpet and vice versa, and implementing proper operator splitting inside memfem. While I'm working on those things, Umar can continue to work on the prototype for the libMesh version of memfem and test its parallelization. When either I'm done with my stuff or Umar is done with his, we can sit down with Natalia and decide whether to pursue the libMesh project further or continue with development in memfem.

2007-05-21

I had a conversation with Umar on Thursday about what needs to be done to memfem. Here is what we came up with:

  • change input files from text to binary
    • Currently, one of the big problems slowing memfem down is that all the processors read in all the data from all the input files. If we switch to a binary format with fixed-length records, each processor can easily seek to a specific record in the file and start reading there. This would allow us to construct the matrices more quickly; it takes minutes to assemble the matrices with our current methods.
  • change the matrix equations so that they are more stable
    • We should be separating the elliptic and hyperbolic systems of equations. This is more numerically stable and allows us to take advantage of AMG and hypre libraries.
  • change the indexing scheme
    • Right now, the way that I've chosen to control indexing inside memfem was a bad choice for parallel computation. It won't scale to Blue Gene machines. Fixing this could be very, very difficult.
  • take advantage of other FEM packages
    • considering the effort that both Umar and I have had in learning and reimplementing a basic FEM package, the easiest way to parallelize our current code base might be to just re-implement the FEM portion of memfem in another framework. Umar has been looking at libMesh.
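
For the first item above, a binary file with fixed-length records turns "seek to record k" into a single multiplication. The sketch below is only an illustration: the record layout (four little-endian 32-bit node indices per tetrahedron) and the function names are assumptions, not the actual memfem format. Fixing the byte order in the format string is one way to sidestep the endian-ness concern.

```python
import struct

# Assumed layout: four little-endian 32-bit integers per tetrahedron.
RECORD = struct.Struct("<4i")


def write_tetras(path, tetras):
    """Write each tetrahedron as one fixed-length binary record."""
    with open(path, "wb") as f:
        for tet in tetras:
            f.write(RECORD.pack(*tet))


def read_tetras(path, first, count):
    """Read `count` records starting at record `first` without touching the rest."""
    with open(path, "rb") as f:
        f.seek(first * RECORD.size)
        data = f.read(count * RECORD.size)
    n = len(data) // RECORD.size
    return [RECORD.unpack_from(data, i * RECORD.size) for i in range(n)]
```

Each processor would call something like `read_tetras(path, first, count)` with its own range, so no processor needs to scan the records that belong to the others.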

After talking with Umar, we created a plan. We'd try to get a really simple version of memfem working in libMesh before you got back, and test its parallelization. I've compiled the library for HPCx and here, and Umar has been completing the memfem clone that he has already been working on.

2007-05-18

On Wednesday, I talked with Slava. He laid out an agenda for what he would like to see done inside memfem.

  • Step one is taking his mechanical code and putting it alongside memfem so that the mechanical code can run alongside the electrical code.
    • Slava doesn't want to integrate his code into memfem exactly, he'd rather put it into a library that memfem can use.
    • This step involves figuring out a good abstraction layer so our codes can communicate well.
    • Slava says his code is hacky, and he doesn't have the nice user interface that memfem has. I predict that this interface will have to be built from scratch, and I also predict that it will necessitate moderate to significant changes to Slava's code.
  • Slava needs Purkinje fibers for his experiments.
    • He says he thinks he can find a way to generate the Purkinje fiber data, he just needs the simulation code
  • After the above items are done, Slava wants to run experiments where the capacitance changes on an element by element basis.
    • This requires changing the matrix every timestep, and rebuilding the matrix is currently a very expensive operation in memfem
    • This step requires changing the way that matrices are built, and changing the way that we read in our input.
  • Finally, Slava would like to change the nodal coordinates dynamically in the application
    • This also seems to require rebuilding the matrix information every timestep.
    • This method could also get very complex, because nodal values would no longer be constant. I don't know the best way to synchronize displacements across processors.

2007-05-15

Mike and Russ recommended some restaurants to me before I left:

  • Bertha's Mussels in Fells Point
  • Little Italy
  • Faccaros

Talked with Molly about the optical mapping problem. Apparently, the whole LV is hyperpolarized when run through my code.

  • test one: run the optical mapping code on the same time step over and over, see if I get different results.
    • result: every run gave the same results, so it's not a simple programming error.
  • test two: look at the graphs of voltage versus distance from the epi surface, see if I can see a pattern
    • steps: cleaned up a script that allows me to create and edit simple sets
    • steps: asked Umar to write a script to compute differences
  • test 3: get Martin's code, see if I can compare our results before and after restriction to the surface

I read over all of Martin's emails and realized that he had already guessed where the bug was in my code. I was using the Vm values from the t-file for the internal blood cavity nodes. Right after shock, this causes huge false Vm potentials in the blood cavities, which screws up the results.
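
To illustrate the kind of bug described above (this is not the actual fix that was checked in), the sketch below computes a weighted average of Vm while skipping blood cavity nodes. The function name, the per-node weights, and the `is_cavity` flags are all hypothetical.

```python
def optical_signal(vm, weights, is_cavity):
    """Weighted average of Vm over tissue nodes only; cavity nodes are skipped."""
    num = 0.0
    den = 0.0
    for v, w, cavity in zip(vm, weights, is_cavity):
        if cavity:
            continue  # the stored Vm for blood cavity nodes is not a membrane potential
        num += w * v
        den += w
    if den == 0.0:
        raise ValueError("no tissue nodes contribute to this signal")
    return num / den


# Two tissue nodes at rest plus one cavity node carrying a large false value.
signal = optical_signal([-80.0, -80.0, 500.0], [1.0, 1.0, 2.0], [False, False, True])
```

In this toy case, including the cavity node would pull the average far away from the tissue value, which is the same flavor of distortion described in the entry above.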

I fixed the code, checked in the fix, reran the optical compute stuff and sent it over to Molly for visualization.

Topic revision: r8 - 24 Apr 2008 - 16:46:08 - RobBlake
 
