html - PHP, extracting mailing address


I have a problem that needs fixing. I am trying to create a script that crawls websites for mailing addresses. These are German addresses, and I am unsure how to create such a script; I have already created one that extracts email addresses from said websites. The address one is puzzling because there isn't a real format. Here are a couple of example German addresses; any ideas on a way to extract this data would be welcome.

Ilona Mustermann
Hauptstr. 76
27852 Musterheim

Andreas Mustermann
Schwarzwaldhochstraße 1
27812 Musterhausen

D. Mustermann
Kaiser-Wilhelm-Str.3
27852 Mustach

Those are a few examples of what I am looking to extract from websites. Is this possible with PHP?

Edit:

This is what I have so far:

function extract_address($str) {
    $str = strip_tags($str);
    $name = null;
    $zcc = null;
    $street = null;

    // Split on any run of characters that cannot be part of an address token.
    foreach(preg_split('/([^a-zA-Z0-9üß\-\@\.\(\) .])+/', $str) as $token) {
        if(preg_match('/([a-zA-Z\.])+ ([a-zA-Z\.])+/', $token)){
            $name = $token;
        }

        // Placeholder: this is the pattern I am unsure about.
        if(preg_match('/ /', $token)){
            $street = $token;
        }

        if(preg_match('/[0-9]{5} [a-zA-Zü]+/', $token)){
            $zcc = $token;
        }

        // Print and reset once all three parts have been seen.
        if(isset($name) && isset($zcc) && isset($street)){
            echo($name."<br />".$street."<br />".$zcc."<br /><br />");
            $name = null;
            $street = null;
            $zcc = null;
        }
    }
}

It works to retrieve the $name (i.e. Ilona Mustermann) and the city/zip code (27852 Musterheim), but I am unsure of the regex to retrieve the streets.


Well, I have come this far, and it seems to be working about 60% of the time on streets. The zip/city matching works 100% of the time, and so does the name. It is when it tries to extract the street that it fails. Any idea why?

function extract_address($str) {
    $str = strip_tags($str);
    $name = null;
    $zcc = null;
    $street = null;

    foreach(preg_split('/([^a-zA-Z0-9üß\-\@\.\(\)\& .])+/', $str) as $token) {
        if(preg_match('/([a-zA-Z\&.])+ ([a-zA-Z.])+/', $token) && !preg_match('/([a-zA-Zß])+ ([0-9])+/', $token)){
            //echo("n:$token<br />");
            $name = $token;
        }

        if(preg_match('/(\.)+/', $token) || preg_match('/(ß)+/', $token) || preg_match('/([a-zA-Zß\.])+ ([0-9])+/', $token)){
            $street = $token;
        }

        if(preg_match('/([0-9]){5} [a-zA-Züß]+/', $token)){
            $zcc = $token;
        }

        /*echo("<br />
            n:$name
            <br />
            s:$street
            <br />
            z:$zcc
            <br />
            ");*/

        if(isset($name) && isset($zcc) && isset($street)){
            echo($name."<br />".$street."<br />".$zcc."<br /><br />");
            $name = null;
            $street = null;
            $zcc = null;
        }
    }
}

Of course it is possible. You just need to use the preg_match() function; it is only a matter of making a good regex pattern.

For example, for the post code:

<?php
$str = "your addresses string here";
// $matches[1] holds the digits, $matches[2] the word that follows them.
preg_match('/([0-9]+) ([a-zA-Z]+)/', $str, $matches);
print_r($matches);
?>
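German postal codes are five digits long, so the pattern can be tightened, and preg_match_all() collects every match instead of only the first. The following is a minimal sketch, assuming UTF-8 input; the function name find_zip_city is hypothetical and only used for illustration.

```php
<?php
// Hypothetical helper: returns every "zip city" pair found in $text.
function find_zip_city(string $text): array
{
    // \b([0-9]{5})      : exactly five digits starting at a word boundary
    // (\p{L}[\p{L}.\-]*) : city name made of letters (incl. ü, ß), dots, hyphens
    // The /u modifier makes PCRE treat pattern and subject as UTF-8.
    preg_match_all('/\b([0-9]{5}) (\p{L}[\p{L}.\-]*)/u', $text, $matches, PREG_SET_ORDER);

    $result = [];
    foreach ($matches as $m) {
        $result[] = ['zip' => $m[1], 'city' => $m[2]];
    }
    return $result;
}

print_r(find_zip_city('Hauptstr. 76 27852 Musterheim, Kaiser-Wilhelm-Str.3 27852 Mustach'));
```

House numbers such as 76 or 3 are skipped because they are not five digits long.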

This regex matches the addresses you've given, but you need to put in the native characters.

 [a-zA-Züß.]+ [a-zA-Z.üß]+\s[a-zA-Z. 0-9ß-]+\s[0-9]+ [a-zA-Züß.]+ 
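As a rough sketch of how a pattern like this could be applied, the version below adds named groups and the /u modifier, so that ü and ß are handled as single UTF-8 characters, and uses \p{L} to accept any letter. The function name parse_german_addresses and the group names are assumptions made for illustration, and the pattern is tuned only to the three sample addresses from the question; real-world addresses vary far more (titles, middle names, multi-word cities).

```php
<?php
// Hypothetical helper: pulls name/street/zip/city out of text shaped like the samples.
function parse_german_addresses(string $text): array
{
    $pattern = '/(?<name>[\p{L}.]+ [\p{L}.\-]+)\s+'               // first name (or initial) + last name
             . '(?<street>\p{L}[\p{L}.\- ]*?\s?[0-9]+[a-z]?)\s+'  // street name + house number
             . '(?<zip>[0-9]{5})\s+'                              // five-digit postal code
             . '(?<city>\p{L}[\p{L}\-]*)/u';                      // city; /u enables UTF-8 mode

    preg_match_all($pattern, $text, $matches, PREG_SET_ORDER);

    $addresses = [];
    foreach ($matches as $m) {
        $addresses[] = [
            'name'   => $m['name'],
            'street' => $m['street'],
            'zip'    => $m['zip'],
            'city'   => $m['city'],
        ];
    }
    return $addresses;
}

$sample = 'Ilona Mustermann Hauptstr. 76 27852 Musterheim   '
        . 'Andreas Mustermann Schwarzwaldhochstraße 1 27812 Musterhausen   '
        . 'D. Mustermann Kaiser-Wilhelm-Str.3 27852 Mustach';

print_r(parse_german_addresses($sample));
```

Anchoring the street on a trailing house number followed by a five-digit postal code is what separates it from the name, so an address without a house number will not match this sketch.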
