c++ - Changing a vector into an array makes my program slower -
i profiled program of mine , found out hotspot levenshtein_distance, called recursively. decided try , optimize it.
lvh_distance levenshtein_distance( const std::string & s1, const std::string & s2 ) {     const size_t len1 = s1.size(), len2 = s2.size();     std::vector<unsigned int> col( len2+1 ), prevcol( len2+1 );      const size_t prevcolsize = prevcol.size();     for( unsigned int = 0; < prevcolsize; i++ )         prevcol[i] = i;      for( unsigned int = 0, j; < len1; ++i )     {         col[0] = i+1;         const char s1i = s1[i];         for( j = 0; j < len2; ++j )         {             const auto minprev = 1 + std::min( col[j], prevcol[1 + j] );             col[j+1] = std::min( minprev, prevcol[j] + ( static_cast<unsigned int>( s1i != s2[j] ) ) );         }         col.swap( prevcol );     }     return prevcol[len2]; } tl;dr: changed std::string → std::array
war story: , after running vtune on it, found line updates col[j+1] 1 slowing down (90% of time spent on it). thought: ok, maybe aliasing problem, maybe compiler cannot determine character arrays within string objects unaliased masked string interface , spends 90% of time checking no other part of program modified them.
so changed string static array, because there, there no dynamic memory, , next step have been using restrict compiler. in meantime, decided check if i had gained performance doing so.
lvh_distance levenshtein_distance( const std::string & s1, const std::string & s2 ) {     const size_t len1 = s1.size(), len2 = s2.size();     static constexpr unsigned max_string_size = 512;     assert(len1 < max_string_size && len2 < max_string_size);     static std::array<unsigned int, max_string_size> col, prevcol;      for( unsigned int = 0; < len2+1; ++i )         prevcol[i] = i;      // rest unchanged } tl;dr: now runs slow.
what happened lost performance. lot. instead of running in ~ 6 seconds, sample program runs in 44 seconds. using vtune profile again shows function called on , on again: std::swap (for you, gcc folks, in bits/move.h), in turn called std::swap_ranges (bits/stl_algobase.h).
i suppose std::min implemented using quicksort, explains why there swapping around, don’t understand why swapping, in case, takes time.
edit: compiler options: using gcc options "-o2 -g -dndebug" , bunch of warning specifiers.
for experiment ran version of original code largely unmodified pair of short strings got timings of ~36s array version , ~8s vector version.
your version seems depend on choice of max_string_size. when used 50 instead of 512 (which fitted strings), timing array version went down 16s.
i performed by-hand translation of main loop rid of explicit swap. further reduced time of array version 11s, , more interestingly, made array version timing independent of choice of max_string_size. when putting 512, array version still took 11s.
this evidence explicit swap of arrays bulk of perfomance issue version was.
there still significant difference between array , vector version array version talking around 40% longer. haven't had chance investigate why might be.
for( unsigned int = 0, j; < len1; ++i ) {     {         col[0] = i+1;         const char s1i = s1[i];         for( j = 0; j < len2; ++j )         {             const auto minprev = 1 + std::min( col[j], prevcol[1 + j] );             col[j+1] = std::min( minprev, prevcol[j] + ( static_cast<unsigned int>( s1i != s2[j] ) ) );         }     }      if (!(++i < len1))         return col[len2];      {         prevcol[0] = i+1;         const char s1i = s1[i];         for( j = 0; j < len2; ++j )         {             const auto minprev = 1 + std::min( prevcol[j], col[1 + j] );             prevcol[j+1] = std::min( minprev, col[j] + ( static_cast<unsigned int>( s1i != s2[j] ) ) );         }     } } return prevcol[len2]; 
Comments
Post a Comment