f# - Parsing the full input twice -


To achieve case-insensitive infix operators using OperatorPrecedenceParser, I'm preprocessing my input, parsing out the text delimited by string literals. The text portion is then searched for infix operators, which need to be uppercased (to conform to the operator known to the OPP). Then the actual parsing takes place.

My question is, can both phases be combined into a single parser? I tried

    // preprocess: Parser<string,_>
    // scalarExpr: Parser<ScalarExpr,_>
    let filter = (preprocess .>> eof) >>. (scalarExpr .>> eof)

but it fails at the end of the input, seemingly expecting scalarExpr. The input can be parsed by preprocess and scalarExpr independently, so I'm guessing it's an issue with eof, but I can't seem to get it right. Is this possible?

Here are the other parsers for reference.

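    open System
    open System.Collections.Generic
    open System.Text.RegularExpressions
    open FParsec

    // Parses a double-quoted string literal; a doubled quote ("") stands for one quote character.
    let stringLiteral =
        let subString = manySatisfy ((<>) '"')
        let escapedQuote = stringReturn "\"\"" "\""
        between (pstring "\"") (pstring "\"") (stringsSepBy subString escapedQuote)

    // Uppercases the known operator keywords, whatever their original casing.
    let canonicalizeKeywords =
        let keywords =
            [
                "or"
                "and"
                "contains"
                "startswith"
                "endswith"
            ]
        let caseInsensitiveKeywords = HashSet(keywords, StringComparer.InvariantCultureIgnoreCase)
        fun (text: string) ->
            let re = Regex(@"([\w][\w']*\w)")
            re.Replace(text, MatchEvaluator(fun m ->
                if caseInsensitiveKeywords.Contains(m.Value) then m.Value.ToUpperInvariant()
                else m.Value))

    // Splits the input at string literals and canonicalizes only the text between them.
    let preprocess =
        stringsSepBy
            ((manySatisfy ((<>) '"')) |>> canonicalizeKeywords)
            (stringLiteral |>> (fun s -> "\"" + s + "\""))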

The simplest way to parse case-insensitive operators with FParsec's OperatorPrecedenceParser is to add operator definitions for every casing you want to support. If you only need to support short operator names, such as "and" or "or", you can just add all possible case combinations. If you want to use operator names that are too long for this approach, you might consider supporting only the sane casings, i.e. lowercase, UPPERCASE, camelCase and PascalCase. When you want to support multiple casings, it is convenient to write a helper function that automatically generates the needed casings from a standard one.
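Such a helper might look like the sketch below. The names `allCasings`, `saneCasings` and `addInfix` are made up for this illustration, and the boolean term parser with its "or"/"and" semantics is only a stand-in for a real expression grammar. It assumes an F# script with FParsec pulled in from NuGet.

```fsharp
#r "nuget: FParsec"
open System
open FParsec

/// Every upper/lower combination of `name` (2^n strings), so only use it for short names.
let allCasings (name: string) : string list =
    let extend (prefixes: string list) (c: char) =
        let variants = List.distinct [ Char.ToLowerInvariant c; Char.ToUpperInvariant c ]
        [ for p in prefixes do
            for v in variants do
                yield p + string v ]
    Seq.fold extend [ "" ] name

/// lowercase, UPPERCASE, camelCase and PascalCase, derived from a PascalCase name.
let saneCasings (pascal: string) : string list =
    let camel = string (Char.ToLowerInvariant pascal.[0]) + pascal.Substring 1
    List.distinct [ pascal.ToLowerInvariant(); pascal.ToUpperInvariant(); camel; pascal ]

// Stand-in grammar (an assumption for this sketch): boolean terms "true" and "false".
let opp = OperatorPrecedenceParser<bool, unit, unit>()
opp.TermParser <- (stringCIReturn "true" true <|> stringCIReturn "false" false) .>> spaces

/// Registers one left-associative infix operator definition per casing.
let addInfix (casings: string list) (precedence: int) (f: bool -> bool -> bool) =
    for casing in casings do
        opp.AddOperator(InfixOperator(casing, spaces, precedence, Associativity.Left, f))

addInfix (allCasings "or") 1 (fun a b -> a || b)   // or, oR, Or, OR
addInfix (allCasings "and") 2 (fun a b -> a && b)  // binds tighter than "or"

let result = run (opp.ExpressionParser .>> eof) "true AnD false oR true"
printfn "%A" result
```

For a long name such as "StartsWith", passing `saneCasings "StartsWith"` to `addInfix` instead would register four definitions rather than 1024.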

If you have long operator names and you really want to support all casings, the OperatorPrecedenceParser's dynamic configurability also allows the following approach, which should be easier and more efficient than transforming the input:

  1. Search the input for case-insensitive occurrences of the supported operators. The search must not miss any occurrences, but it is no problem if it finds false positives, e.g. if an operator name is used inside a function name or inside a string literal.
  2. Add any unique casings found in step 1 to the OperatorPrecedenceParser. (Usually there won't be many casings of the same operator.)
  3. Parse the input with the configured OperatorPrecedenceParser.

When you parse multiple inputs, you can keep the OperatorPrecedenceParser instance around and lazily add new operator casings as you need them.
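The three steps, with one parser instance reused across inputs, could be sketched roughly as follows. The operator table, the boolean term parser and the names `registered`, `finders` and `parseExpr` are assumptions made for this example, not part of FParsec. The search uses plain case-insensitive regular expressions, which can produce false positives; as noted in step 1, those only cost an unused operator definition.

```fsharp
#r "nuget: FParsec"
open System
open System.Collections.Generic
open System.Text.RegularExpressions
open FParsec

// Hypothetical operator table: canonical name, precedence, semantics.
let operators : (string * int * (bool -> bool -> bool)) list =
    [ ("or", 1, (fun a b -> a || b))
      ("and", 2, (fun a b -> a && b)) ]

// Stand-in grammar (an assumption for this sketch): boolean terms "true" and "false".
let opp = OperatorPrecedenceParser<bool, unit, unit>()
opp.TermParser <- (stringCIReturn "true" true <|> stringCIReturn "false" false) .>> spaces

// Casings already added to opp. Ordinal comparison keeps "And" and "AND" distinct.
let registered = HashSet<string>(StringComparer.Ordinal)

// One case-insensitive search pattern per operator (used for step 1).
let finders =
    [ for (name, prec, f) in operators ->
        (Regex(Regex.Escape(name), RegexOptions.IgnoreCase ||| RegexOptions.CultureInvariant), prec, f) ]

let parseExpr (input: string) =
    for (re, prec, f) in finders do
        // Step 1: find every occurrence, whatever its casing.
        for m in re.Matches(input) |> Seq.cast<Match> do
            // Step 2: add each casing only once; HashSet.Add returns false for known casings.
            if registered.Add(m.Value) then
                opp.AddOperator(InfixOperator(m.Value, spaces, prec, Associativity.Left, f))
    // Step 3: parse with the operators configured so far.
    run (spaces >>. opp.ExpressionParser .>> eof) input

printfn "%A" (parseExpr "true AND false Or true")  // adds "Or" and "AND"
printfn "%A" (parseExpr "false and true")          // adds only "and"
```

Because `opp` and `registered` live outside `parseExpr`, each casing is added the first time it is seen, and later inputs reuse the existing definitions.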

