Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Parallel rns fgemv #273

Open
wants to merge 36 commits into
base: master
Choose a base branch
from
Open

Parallel rns fgemv #273

wants to merge 36 commits into from

Conversation

ZHG2017
Copy link
Contributor

@ZHG2017 ZHG2017 commented Jun 27, 2019

Implementation of fgemv for rns with corresponding helpers

ZHG added 28 commits May 3, 2019 17:01
…on into pfgemv which will no more require to be labeled with PAR_BLOCK
…gemv-mp has been restructured for different parameter values
@Breush
Copy link

Breush commented Jul 1, 2019

@ClementPernet Looks like all Travis builds are failing even on master. Have there been any updates in the configuration recently?

@@ -158,7 +158,7 @@ AC_PROG_LIBTOOL
AC_PROG_EGREP
AC_PROG_SED
# newer libtool...
LT_PREREQ([2.4.3])
LT_PREREQ([2.4.2])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should not have to change this

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had to do so because I once got error with travis as if the libtool version was not corret. By changing the version to a lower number, I passed the travis compilation without the strange build failure.

{
for (size_t j=0; j<N; ++j) {
const Givaro::Integer & x(A[i*lda+j]);
if (Givaro::absCompare(x,vmax[i])>0){ vmax[i] = x;}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updating vmax[i] in each parallel task creates contention on every operation. This code is likely not parallel at all.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes not so correctly parallelized but timing showed a few speedup. I cannot see any easier way to find the local max value for each thread then search for the global max value outside the parallel region.

@@ -410,7 +410,9 @@ namespace FFLAS {
else if (!std::is_same<Field,Givaro::ModularBalanced<float> >::value){
if (F.characteristic() < DOUBLE_TO_FLOAT_CROSSOVER)
return Protected::fgemm_convert<Givaro::ModularBalanced<float>,Field>(F,ta,tb,m,n,k,alpha,A,lda,B,ldb,beta,C,ldc,H);
else if (!std::is_same<Field,Givaro::ModularBalanced<double> >::value && 16*F.cardinality() < Givaro::ModularBalanced<double>::maxCardinality())
else if (!std::is_same<Field,Givaro::ModularBalanced<double> >::value &&
!std::is_same<Field,Givaro::ModularBalanced<double> >::value &&
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why is this line duplicated?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I remember well, we did this together to correct some compile time error, but I could not remember of the exact reason.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants