|
|
|
|
Converting standard C codes to C++ is almost painless when using SCTL.
For example, the following is a simple C function that computes the
Cholesky factorization of a dense matrix that will be translated into
SCTL C++:
|
|
int Cholesky( float** A, int n )
{
int i, j, k;
for( i = 0; i < n; ++i )
{
for( j = 0; j <= i; ++j )
{
float sum = A[i][j];
for( k = 0; k < j; ++k )
sum -= A[i][k] * A[j][k];
if( i == j )
{
if( sum < 0 ) return( -1 );
sum = sqrt( sum );
if( sum < EPSILON ) return( -1 );
A[i][i] = sum;
}
else
A[i][j] = sum / A[j][j];
}
}
return( 0 );
}
|
|
|
|
For a 512 x 512 matrix, this factorization requires approximately
0.16 seconds on an AMD™ Athlon™ XP 2000+ compiled with GNU
'gcc' (compile flags '-O3 -fno-exceptions -march=athlon-xp')
|
|
|
By simply using the SCTL 'StaticMatrix' class instead of the C-style
'pointer to pointers' 2D array style, the code can now be rewritten as:
(modified code highlighted red)
|
|
int Cholesky( SCTL::StaticMatrix<float>& A )
{
int i, j, k;
if( A.NumRows() != A.NumCols() )
return( -1 );
int n = A.NumRows();
for( i = 0; i < n; ++i )
{
for( j = 0; j <= i; ++j )
{
float sum = A[i][j];
for( k = 0; k < j; ++k )
sum -= A[i][k] * A[j][k];
if( i == j )
{
if( sum < 0 ) return( -1 );
sum = sqrt( sum );
if( sum < EPSILON ) return( -1 );
A[i][i] = sum;
}
else
A[i][j] = sum / A[j][j];
}
}
return( 0 );
}
|
|
|
|
Notice how most of the original C code remained the same.
For the same size matrix and on the same platform with the same compiler
flags (except that 'gcc' is replaced with 'g++') as above, this code
requires the same amount of execution time (C++ is just
as fast as C). The real benefits of using SCTL can be seen when
the code is modified to take advantage of the auto-vectorization and
auto-optimization SIMD assembly coding abilities:
|
|
int Cholesky( SCTL::StaticMatrix<float>& A )
{
int i, j;
if( A.NumRows() != A.NumCols() )
return( -1 );
int n = A.NumRows();
for( i = 0; i < n; ++i )
{
for( j = 0; j <= i; ++j )
{
SCTL::StaticSubVector<float> rowj( A[j], j );
float sum = A[i][j] - ScalarProduct( A[i], rowj );
if( i == j )
{
if( sum < 0 ) return( -1 );
sum = sqrt( sum );
if( sum < EPSILON ) return( -1 );
A[i][i] = sum;
}
else
A[i][j] = sum / A[j][j];
}
}
return( 0 );
}
|
|
|
|
For the same size matrix and on the same platform with the same compiler
flags as above, this code is approximately 35% faster than the
above codes due to the automatic vectorization and tuned SSE instructions
that SCTL has built into it. Along with the increase in performance,
this code is very similar to standard C (double square-bracket indexing
for 2D arrays & no weird 'iterator' objects) as well as providing
higher level functions that operate on vectors.
For sales information regarding SCTL or other inquiries, please send us
an email to
sctl@bluesailsoftware.com.
|
|
|
top of page
|
|