multithreading - C# Multithreaded Mass Parse -


I'm making a sort of 'web parser', except that it targets one website and parses many different pages at one time.

Currently, there may be 300,000 pages that I need to parse in a relatively fast manner (I'm only grabbing a tiny amount of information that doesn't take long to go through; every page takes ~3 seconds max on the network). Of course, 900,000 seconds is about 10 days, which is terrible performance. I'd like to reduce this to a couple of hours at most. I'm reasonable about the time given the number of requests, but it still needs to be 'fast'. I also know that I can't send all 300,000 at one time, or the website will block all of my requests, so there will have to be a few seconds of delay between each and every request.

I currently have the processing in a single foreach loop, not taking advantage of multithreading at all. I know I could take advantage of it, but I'm not sure which path I should take, whether that is thread pools or some other type of threading system or design.

Basically, I'm looking for someone to point me in the right direction for using multithreading efficiently, so that I can cut the time it takes to parse that many pages on my end, with some sort of system or structure for threading.

Thanks

Check out the answer to this question; it sounds like you might want to look at Parallel.ForEach.
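As a rough illustration of what Parallel.ForEach could look like here, the following is a minimal sketch, not a definitive implementation. `ParsePage`, `ParseAll`, and the `example.com` URLs are hypothetical names made up for this example; `ParsePage` only simulates a download with a short sleep so the sketch runs without network access. `MaxDegreeOfParallelism` caps how many pages are processed at once, which matters if the site blocks clients that send too many requests.

```csharp
using System;
using System.Collections.Concurrent;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class ParallelParseSketch
{
    // Hypothetical stand-in for "download and parse one page".
    // A real version would fetch the page and extract the data you need.
    public static int ParsePage(string url)
    {
        Thread.Sleep(10);     // simulates network latency (assumption)
        return url.Length;    // simulates the parsed result (assumption)
    }

    public static ConcurrentDictionary<string, int> ParseAll(string[] urls, int maxParallel)
    {
        var results = new ConcurrentDictionary<string, int>();
        var options = new ParallelOptions { MaxDegreeOfParallelism = maxParallel };

        // Iterations may run on different thread-pool threads,
        // so results are written to a thread-safe collection.
        Parallel.ForEach(urls, options, url =>
        {
            results[url] = ParsePage(url);
        });

        return results;
    }

    public static void Main()
    {
        // Placeholder URLs; replace with the real page list.
        string[] urls = Enumerable.Range(1, 20)
            .Select(i => "http://example.com/page/" + i)
            .ToArray();

        var results = ParseAll(urls, maxParallel: 4);
        Console.WriteLine("Parsed pages: " + results.Count);
    }
}
```

Note that Parallel.ForEach blocks a thread for each in-flight page, so for work that is mostly waiting on the network, an asynchronous approach may scale better.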

There are various other methods of achieving what you want in a multithreaded fashion. To give you an idea of how this works:

  1. Download LINQPad. (It should be a prerequisite for any C# developer, IMHO!)
  2. In "Samples", choose "Download/import more samples..." and ensure that you have "Asynchronous Functions in C#" downloaded.
  3. Work through the samples, seeing how they all fit together.

In fact, here is one of the asynchronous examples that works with URIs:

    // The await keyword is useful when you want to run something in a loop. For instance:

    string[] uris =
    {
        "http://linqpad.net",
        "http://linqpad.net/downloadglyph.png",
        "http://linqpad.net/linqpadscreen.png",
        "http://linqpad.net/linqpadmed.png",
    };

    // Try doing the following without the await keyword!

    int totalLength = 0;
    foreach (string uri in uris)
    {
        // WebClient lives in System.Net; each download completes before the next one starts.
        string html = await new WebClient().DownloadStringTaskAsync(new Uri(uri));
        totalLength += html.Length;
    }
    totalLength.Dump();   // Dump() is a LINQPad extension method

    // The continuation is not just 'totalLength += html.Length', but the rest of the loop!
    // (And the final call to 'totalLength.Dump()' at the end.)

    // Logically, execution exits the method and returns to the caller upon reaching the
    // await statement. This is rather like a 'yield return' (in fact, the compiler uses the
    // same state-machine engine to rewrite asynchronous functions as it does iterators).
    //
    // When the task completes, the continuation kicks off and execution jumps back into the
    // middle of the loop, right where it left off!
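The sample above awaits each download before starting the next, so it does not by itself reduce the total time. One way to run several downloads at once while still limiting the load on the site is to gate the tasks with a SemaphoreSlim. The following is a minimal sketch under stated assumptions, not part of the LINQPad samples: `TotalLengthAsync`, `maxConcurrent`, `delayPerRequest`, and the fake `fetch` delegate are names invented for this example. In real use, `fetch` could wrap something like `HttpClient.GetStringAsync`.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class ThrottledFetchSketch
{
    // Runs 'fetch' for every URI with at most 'maxConcurrent' requests in flight,
    // and returns the total length of everything fetched.
    public static async Task<int> TotalLengthAsync(
        IEnumerable<string> uris,
        Func<string, Task<string>> fetch,
        int maxConcurrent,
        TimeSpan delayPerRequest)
    {
        using (var gate = new SemaphoreSlim(maxConcurrent))
        {
            // ToList() starts every task; the semaphore does the actual throttling.
            List<Task<int>> tasks = uris.Select(async uri =>
            {
                await gate.WaitAsync();                  // wait for a free slot
                try
                {
                    string html = await fetch(uri);
                    await Task.Delay(delayPerRequest);   // pause before freeing the slot
                    return html.Length;
                }
                finally
                {
                    gate.Release();
                }
            }).ToList();

            int[] lengths = await Task.WhenAll(tasks);
            return lengths.Sum();
        }
    }

    public static void Main()
    {
        // Fake fetch (assumption): simulates latency and echoes the URI back as the "page".
        Func<string, Task<string>> fakeFetch = async uri =>
        {
            await Task.Delay(20);
            return uri;
        };

        string[] uris = { "http://linqpad.net", "http://linqpad.net/downloadglyph.png" };

        int total = TotalLengthAsync(uris, fakeFetch, 2, TimeSpan.FromMilliseconds(50))
            .GetAwaiter().GetResult();
        Console.WriteLine(total);
    }
}
```

Using the figures from the question, 300,000 pages at ~3 seconds each is 900,000 seconds of waiting, so finishing within two hours (7,200 seconds) would need roughly 125 requests in flight at once. Whether the site tolerates that rate is something only testing against it can tell, so both `maxConcurrent` and `delayPerRequest` are values to tune.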
