c++ - Changing a vector into an array makes my program slower -


i profiled program of mine , found out hotspot levenshtein_distance, called recursively. decided try , optimize it.

lvh_distance levenshtein_distance( const std::string & s1, const std::string & s2 ) {     const size_t len1 = s1.size(), len2 = s2.size();     std::vector<unsigned int> col( len2+1 ), prevcol( len2+1 );      const size_t prevcolsize = prevcol.size();     for( unsigned int = 0; < prevcolsize; i++ )         prevcol[i] = i;      for( unsigned int = 0, j; < len1; ++i )     {         col[0] = i+1;         const char s1i = s1[i];         for( j = 0; j < len2; ++j )         {             const auto minprev = 1 + std::min( col[j], prevcol[1 + j] );             col[j+1] = std::min( minprev, prevcol[j] + ( static_cast<unsigned int>( s1i != s2[j] ) ) );         }         col.swap( prevcol );     }     return prevcol[len2]; } 

tl;dr: changed std::stringstd::array

war story: , after running vtune on it, found line updates col[j+1] 1 slowing down (90% of time spent on it). thought: ok, maybe aliasing problem, maybe compiler cannot determine character arrays within string objects unaliased masked string interface , spends 90% of time checking no other part of program modified them.

so changed string static array, because there, there no dynamic memory, , next step have been using restrict compiler. in meantime, decided check if i had gained performance doing so.

lvh_distance levenshtein_distance( const std::string & s1, const std::string & s2 ) {     const size_t len1 = s1.size(), len2 = s2.size();     static constexpr unsigned max_string_size = 512;     assert(len1 < max_string_size && len2 < max_string_size);     static std::array<unsigned int, max_string_size> col, prevcol;      for( unsigned int = 0; < len2+1; ++i )         prevcol[i] = i;      // rest unchanged } 

tl;dr: now runs slow.

what happened lost performance. lot. instead of running in ~ 6 seconds, sample program runs in 44 seconds. using vtune profile again shows function called on , on again: std::swap (for you, gcc folks, in bits/move.h), in turn called std::swap_ranges (bits/stl_algobase.h).

i suppose std::min implemented using quicksort, explains why there swapping around, don’t understand why swapping, in case, takes time.

edit: compiler options: using gcc options "-o2 -g -dndebug" , bunch of warning specifiers.

for experiment ran version of original code largely unmodified pair of short strings got timings of ~36s array version , ~8s vector version.

your version seems depend on choice of max_string_size. when used 50 instead of 512 (which fitted strings), timing array version went down 16s.

i performed by-hand translation of main loop rid of explicit swap. further reduced time of array version 11s, , more interestingly, made array version timing independent of choice of max_string_size. when putting 512, array version still took 11s.

this evidence explicit swap of arrays bulk of perfomance issue version was.

there still significant difference between array , vector version array version talking around 40% longer. haven't had chance investigate why might be.

for( unsigned int = 0, j; < len1; ++i ) {     {         col[0] = i+1;         const char s1i = s1[i];         for( j = 0; j < len2; ++j )         {             const auto minprev = 1 + std::min( col[j], prevcol[1 + j] );             col[j+1] = std::min( minprev, prevcol[j] + ( static_cast<unsigned int>( s1i != s2[j] ) ) );         }     }      if (!(++i < len1))         return col[len2];      {         prevcol[0] = i+1;         const char s1i = s1[i];         for( j = 0; j < len2; ++j )         {             const auto minprev = 1 + std::min( prevcol[j], col[1 + j] );             prevcol[j+1] = std::min( minprev, col[j] + ( static_cast<unsigned int>( s1i != s2[j] ) ) );         }     } } return prevcol[len2]; 

Comments

Popular posts from this blog

jquery - How can I dynamically add a browser tab? -

node.js - Getting the socket id,user id pair of a logged in user(s) -

keyboard - C++ GetAsyncKeyState alternative -