Software
Services
News
Company Info
APTL
SCTL
Documentation
Download
Performance
Opstone Benchmarks
 Porting to SCTL: Example
Converting standard C codes to C++ is almost painless when using SCTL. For example, the following is a simple C function that computes the Cholesky factorization of a dense matrix that will be translated into SCTL C++:
int Cholesky( float** A, int n )
{
  int i, j, k;
  
  for( i = 0; i < n; ++i )
  {
    for( j = 0; j <= i; ++j )
    {
      float sum = A[i][j];
      for( k = 0; k < j; ++k )
        sum -= A[i][k] * A[j][k];
      
      if( i == j )
      {
        if( sum < 0 )  return( -1 );
	
        sum = sqrt( sum );
        if( sum < EPSILON )  return( -1 );
	
        A[i][i] = sum;
      }
      else
        A[i][j] = sum / A[j][j];
    }
  }
  
  return( 0 );
}
              
For a 512 x 512 matrix, this factorization requires approximately 0.16 seconds on an AMD™ Athlon™ XP 2000+ compiled with GNU 'gcc' (compile flags '-O3 -fno-exceptions -march=athlon-xp')
By simply using the SCTL 'StaticMatrix' class instead of the C-style 'pointer to pointers' 2D array style, the code can now be rewritten as: (modified code highlighted red)
int Cholesky( SCTL::StaticMatrix<float>& A )
{
  int i, j, k;
  
  if( A.NumRows() != A.NumCols() )
    return( -1 );
  
  int n = A.NumRows();
  
  for( i = 0; i < n; ++i )
  {
    for( j = 0; j <= i; ++j )
    {
      float sum = A[i][j];
      
      for( k = 0; k < j; ++k )
        sum -= A[i][k] * A[j][k];
      
      if( i == j )
      {
        if( sum < 0 )  return( -1 );
	
        sum = sqrt( sum );
        if( sum < EPSILON )  return( -1 );
	
        A[i][i] = sum;
      }
      else
        A[i][j] = sum / A[j][j];
    }
  }
  
  return( 0 );
}
             
Notice how most of the original C code remained the same. For the same size matrix and on the same platform with the same compiler flags (except that 'gcc' is replaced with 'g++') as above, this code requires the same amount of execution time (C++ is just as fast as C). The real benefits of using SCTL can be seen when the code is modified to take advantage of the auto-vectorization and auto-optimization SIMD assembly coding abilities:
int Cholesky( SCTL::StaticMatrix<float>& A )
{
  int i, j;
  
  if( A.NumRows() != A.NumCols() )
    return( -1 );
  
  int n = A.NumRows();
  
  for( i = 0; i < n; ++i )
  {
    for( j = 0; j <= i; ++j )
    {
      SCTL::StaticSubVector<float> rowj( A[j], j );
      
      float sum = A[i][j] - ScalarProduct( A[i], rowj );
      
      if( i == j )
      {
        if( sum < 0 )  return( -1 );
	
        sum = sqrt( sum );
        if( sum < EPSILON )  return( -1 );
	
        A[i][i] = sum;
      }
      else
        A[i][j] = sum / A[j][j];
    }
  }
  
  return( 0 );
}
             
For the same size matrix and on the same platform with the same compiler flags as above, this code is approximately 35% faster than the above codes due to the automatic vectorization and tuned SSE instructions that SCTL has built into it. Along with the increase in performance, this code is very similar to standard C (double square-bracket indexing for 2D arrays & no weird 'iterator' objects) as well as providing higher level functions that operate on vectors.

For sales information regarding SCTL or other inquiries, please send us an email to sctl@bluesailsoftware.com.

top of page
   Info Copyright ©2004 Blue Sail Software. All rights reserved. Legal