c++ - Changing a vector into an array makes my program slower -
i profiled program of mine , found out hotspot levenshtein_distance
, called recursively. decided try , optimize it.
lvh_distance levenshtein_distance( const std::string & s1, const std::string & s2 ) { const size_t len1 = s1.size(), len2 = s2.size(); std::vector<unsigned int> col( len2+1 ), prevcol( len2+1 ); const size_t prevcolsize = prevcol.size(); for( unsigned int = 0; < prevcolsize; i++ ) prevcol[i] = i; for( unsigned int = 0, j; < len1; ++i ) { col[0] = i+1; const char s1i = s1[i]; for( j = 0; j < len2; ++j ) { const auto minprev = 1 + std::min( col[j], prevcol[1 + j] ); col[j+1] = std::min( minprev, prevcol[j] + ( static_cast<unsigned int>( s1i != s2[j] ) ) ); } col.swap( prevcol ); } return prevcol[len2]; }
tl;dr: changed std::string
→ std::array
war story: , after running vtune on it, found line updates col[j+1]
1 slowing down (90% of time spent on it). thought: ok, maybe aliasing problem, maybe compiler cannot determine character arrays within string objects unaliased masked string interface , spends 90% of time checking no other part of program modified them.
so changed string static array, because there, there no dynamic memory, , next step have been using restrict
compiler. in meantime, decided check if i had gained performance doing so.
lvh_distance levenshtein_distance( const std::string & s1, const std::string & s2 ) { const size_t len1 = s1.size(), len2 = s2.size(); static constexpr unsigned max_string_size = 512; assert(len1 < max_string_size && len2 < max_string_size); static std::array<unsigned int, max_string_size> col, prevcol; for( unsigned int = 0; < len2+1; ++i ) prevcol[i] = i; // rest unchanged }
tl;dr: now runs slow.
what happened lost performance. lot. instead of running in ~ 6 seconds, sample program runs in 44 seconds. using vtune profile again shows function called on , on again: std::swap
(for you, gcc folks, in bits/move.h), in turn called std::swap_ranges
(bits/stl_algobase.h).
i suppose std::min
implemented using quicksort
, explains why there swapping around, don’t understand why swapping, in case, takes time.
edit: compiler options: using gcc options "-o2 -g -dndebug" , bunch of warning specifiers.
for experiment ran version of original code largely unmodified pair of short strings got timings of ~36s array version , ~8s vector version.
your version seems depend on choice of max_string_size
. when used 50 instead of 512 (which fitted strings), timing array version went down 16s.
i performed by-hand translation of main loop rid of explicit swap. further reduced time of array version 11s, , more interestingly, made array version timing independent of choice of max_string_size
. when putting 512, array version still took 11s.
this evidence explicit swap of arrays bulk of perfomance issue version was.
there still significant difference between array , vector version array version talking around 40% longer. haven't had chance investigate why might be.
for( unsigned int = 0, j; < len1; ++i ) { { col[0] = i+1; const char s1i = s1[i]; for( j = 0; j < len2; ++j ) { const auto minprev = 1 + std::min( col[j], prevcol[1 + j] ); col[j+1] = std::min( minprev, prevcol[j] + ( static_cast<unsigned int>( s1i != s2[j] ) ) ); } } if (!(++i < len1)) return col[len2]; { prevcol[0] = i+1; const char s1i = s1[i]; for( j = 0; j < len2; ++j ) { const auto minprev = 1 + std::min( prevcol[j], col[1 + j] ); prevcol[j+1] = std::min( minprev, col[j] + ( static_cast<unsigned int>( s1i != s2[j] ) ) ); } } } return prevcol[len2];
Comments
Post a Comment