The pdf file is analyzed at two levels: a syntactic level and a lexical level.
The grammar of the syntactic level is presented in YACC format:
    file          : protocol_list

    protocol_list : pheader ptrailer
                  | protocol_list pheader ptrailer

    pheader       : PROTOCOL name structs ENDPR
                  | PROTOCOL name ENDPR

    skip          : SKIP ':' expression

    ptrailer      : skip next_protocol
                  | next_protocol

    next_protocol : ELSE name
                  | n_protocol ELSE name

    n_protocol    : CASE expression ':' name
                  | n_protocol CASE expression ':' name

    structs       : structs def bit_defs
                  | def bit_defs

    bit_defs      : '{' b_defs '}'
                  | /* empty */

    b_defs        : BIT name '(' number ',' number ')'
                  | b_defs BIT name '(' number ',' number ')'

    def           : BYTE name
                  | WORD name
                  | DWORD name

    name          : NAME

    number        : NUMBER
                  | NUMBER_H

    expression    : name
                  | number
                  | '(' expression ')'
                  | expression '+' expression
                  | expression NOT_EQ expression
                  | expression MIU expression
                  | expression MAU expression
                  | expression '<' expression
                  | expression '>' expression
                  | expression '=' expression
                  | expression AND expression
                  | expression OR expression
                  | expression '-' expression
                  | expression '*' expression
                  | expression '/' expression
                  | expression '%' expression
                  | '-' expression
                  | NOT expression
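To make the grammar more concrete, here is a minimal Python sketch of a recursive-descent recognizer for the file structure. It is not the YACC-generated parser: it assumes a simplified token splitter, keeps each expression as a raw token list instead of parsing it, and ignores comments and keyword aliases. The protocol names (`eth`, `ip`, `unknown`), the field names, and the values passed to `BIT` in the sample input are hypothetical and only illustrate the shape of a valid file.

```python
import re

KEYWORDS = {"PROTOCOL", "ENDPR", "SKIP", "CASE", "ELSE", "BYTE", "WORD", "DWORD", "BIT"}

def split_tokens(text):
    # Simplified splitter for this sketch: words/numbers, two-character
    # operators, then any other single non-space character.
    return re.findall(r"\w+|[!<>=]=|&&|\|\||[^\s\w]", text)

class Parser:
    def __init__(self, tokens):
        self.toks, self.pos = tokens, 0

    def peek(self):
        return self.toks[self.pos] if self.pos < len(self.toks) else None

    def take(self, expected=None):
        tok = self.peek()
        if tok is None or (expected is not None and tok != expected):
            raise SyntaxError(f"expected {expected!r}, got {tok!r}")
        self.pos += 1
        return tok

    def name(self):
        tok = self.take()
        if tok in KEYWORDS or not re.fullmatch(r"[A-Za-z_]\w*", tok):
            raise SyntaxError(f"expected a name, got {tok!r}")
        return tok

    def number(self):
        tok = self.take()
        try:
            return int(tok, 0)  # base 0 accepts both decimal and 0x... literals
        except ValueError:
            raise SyntaxError(f"expected a number, got {tok!r}")

    def expression_until(self, stops):
        # Sketch only: collect raw tokens up to a stop token, no expression tree.
        out = []
        while self.peek() is not None and self.peek() not in stops:
            out.append(self.take())
        if not out:
            raise SyntaxError("empty expression")
        return out

    def protocol(self):
        # pheader : PROTOCOL name [structs] ENDPR
        self.take("PROTOCOL")
        proto = {"name": self.name(), "fields": [], "skip": None, "cases": []}
        while self.peek() in ("BYTE", "WORD", "DWORD"):      # def bit_defs
            field = {"size": self.take(), "name": self.name(), "bits": []}
            if self.peek() == "{":                           # '{' b_defs '}'
                self.take("{")
                while True:
                    self.take("BIT")
                    bit = self.name()
                    self.take("(")
                    first = self.number()
                    self.take(",")
                    second = self.number()
                    self.take(")")
                    field["bits"].append((bit, first, second))
                    if self.peek() != "BIT":
                        break
                self.take("}")
            proto["fields"].append(field)
        self.take("ENDPR")
        # ptrailer : [skip] next_protocol
        if self.peek() == "SKIP":
            self.take("SKIP")
            self.take(":")
            proto["skip"] = self.expression_until({"CASE", "ELSE"})
        while self.peek() == "CASE":                         # n_protocol
            self.take("CASE")
            cond = self.expression_until({":"})
            self.take(":")
            proto["cases"].append((cond, self.name()))
        self.take("ELSE")
        proto["else"] = self.name()
        return proto

def parse_file(text):
    # file : protocol_list, i.e. one or more (pheader ptrailer) pairs
    parser = Parser(split_tokens(text))
    protocols = [parser.protocol()]
    while parser.peek() is not None:
        protocols.append(parser.protocol())
    return protocols

SAMPLE = """
PROTOCOL eth
  WORD ethertype
ENDPR
CASE ethertype = 0x0800 : ip
ELSE unknown

PROTOCOL ip
  BYTE ver_ihl { BIT version(4, 4) BIT ihl(0, 4) }
  BYTE tos
ENDPR
SKIP : ihl * 4
ELSE unknown
"""

protocols = parse_file(SAMPLE)
```

Each entry in `protocols` is a plain dictionary, so `protocols[0]["cases"]` holds the `CASE` branches of the first protocol and `protocols[1]["skip"]` the tokens of the second protocol's `SKIP` expression.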
Operator precedence in expressions:
Operators (lowest precedence first) | Associativity |
OR | left |
AND | left |
'=', NOT_EQ, '<', '>', MIU, MAU | left |
'+', '-' | left |
'*', '/', '%' | left |
NOT, MENO_UNARIO (unary minus) | right |
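The effect of this table can be illustrated with a small Python sketch that groups an expression by precedence and associativity. It uses precedence climbing, which is a different technique from the table-driven parser YACC generates, and it assumes a simplified tokenizer that only knows the keyword spellings `AND`, `OR`, and `NOT`. The function names and the numeric precedence levels are illustrative choices, not part of the original tool.

```python
import re

# Levels follow the table: a higher number binds tighter.
BINARY = {
    "OR": 1,
    "AND": 2,
    "=": 3, "!=": 3, "<": 3, ">": 3, "<=": 3, ">=": 3,
    "+": 4, "-": 4,
    "*": 5, "/": 5, "%": 5,
}
UNARY_PREC = 6  # NOT and unary minus: tightest binding, right-associative

def tokenize(text):
    # Simplified for this sketch: two-character comparisons, words, single operators.
    return re.findall(r"[!<>]=|\w+|[-+*/%()=<>]", text)

def parse(tokens, min_prec=1):
    """Consume tokens from the front and return a fully parenthesized string."""
    tok = tokens.pop(0)
    if tok in ("NOT", "-"):
        # A unary operator applies to the next operand only.
        left = f"({tok} {parse(tokens, UNARY_PREC)})"
    elif tok == "(":
        left = parse(tokens, 1)
        tokens.pop(0)  # discard the closing ')'
    else:
        left = tok
    while tokens and BINARY.get(tokens[0], 0) >= min_prec:
        op = tokens.pop(0)
        # Left associativity: the right operand may only contain tighter operators.
        right = parse(tokens, BINARY[op] + 1)
        left = f"({left} {op} {right})"
    return left

def group(text):
    return parse(tokenize(text))

grouped = group("a + b * c = d AND NOT e OR f")
# grouped == "((((a + (b * c)) = d) AND (NOT e)) OR f)"
```

Because every binary operator is left-associative, `group("a - b - c")` yields `((a - b) - c)`, while the unary operators nest to the right, so `group("NOT NOT a")` yields `(NOT (NOT a))`.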
The scanner recognizes the tokens in the file as follows, presented in LEX format:
    SPACE            ([\t\r\n ])
    SPACES           ({SPACE}*)
    LETTER           ([a-zA-Z])
    DIGIT            ([0-9])
    F_DIGIT          ([1-9])
    DECIMAL          ({DIGIT}*{F_DIGIT}|0)
    HEX              ((("0x")|("0X"))([a-fA-F]|{DIGIT})+)
    INTEGER          ({F_DIGIT}{DIGIT}*)
    VAR_NAME         ((("_")|{LETTER})({LETTER}|{DIGIT}|("_"))*)
    CONST_INT_SHORT  ({INTEGER}|[0])
    OP               ([-+=*/%():{},])
    VAR              ({VAR_NAME})
    COMMENT          (("/*"([^*]*|"*"+[^*/])"*"+\/)|("//".*))
    %%
    BYTE               {return BYTE;}
    BIT                {return BIT;}
    BITS               {return BIT;}
    WORD               {return WORD;}
    DWORD              {return DWORD;}
    AND                {return AND;}
    "&&"               {return AND;}
    "&"                {return AND;}
    OR                 {return OR;}
    "||"               {return OR;}
    "|"                {return OR;}
    NOT                {return NOT;}
    "!"                {return NOT;}
    "~"                {return NOT;}
    PROTOCOL           {return PROTOCOL;}
    ENDPR              {return ENDPR;}
    NOT                {return NOT;}
    SKIP               {return SKIP;}
    CASE               {return CASE;}
    ELSE               {return ELSE;}
    "!="               {return NOT_EQ;}
    "<="               {return MIU;}
    ">="               {return MAU;}
    {HEX}              {return NUMBER_H;}
    {CONST_INT_SHORT}  {return NUMBER;}
    {COMMENT}          {/* Discards comments */ }
    {OP}               {return yytext[0];}
    {VAR}              {return NAME;}
    {SPACE}+           {/*ignores spaces*/}
    .                  {printf("Scanner error at line %d\n",yylineno);return (-1);}
    %%
'AND' can be written: 'AND', '&&', '&'
'OR' can be written: 'OR', '||', '|'
'NOT' can be written: 'NOT', '!', '~'
'=' can be written: '==', '='
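The matching behaviour of the scanner rules can be approximated with the following Python sketch. It is an illustration, not the LEX-generated scanner: it assumes the usual LEX policy (the longest match wins, and ties go to the earlier rule), follows the rules exactly as listed in the scanner specification, and replaces the COMMENT pattern with a simpler regular expression. The `RULES` table and the `tokenize` function are names introduced for this example.

```python
import re

# (token, pattern) pairs in the same order as the scanner rules.
# A token of None means the matched text is discarded.
RULES = [
    ("BYTE", r"BYTE"), ("BIT", r"BIT"), ("BIT", r"BITS"),
    ("WORD", r"WORD"), ("DWORD", r"DWORD"),
    ("AND", r"AND|&&|&"), ("OR", r"OR|\|\||\|"), ("NOT", r"NOT|!|~"),
    ("PROTOCOL", r"PROTOCOL"), ("ENDPR", r"ENDPR"), ("SKIP", r"SKIP"),
    ("CASE", r"CASE"), ("ELSE", r"ELSE"),
    ("NOT_EQ", r"!="), ("MIU", r"<="), ("MAU", r">="),
    ("NUMBER_H", r"0[xX][0-9a-fA-F]+"),
    ("NUMBER", r"[1-9][0-9]*|0"),
    (None, r"(?s:/\*.*?\*/)|//[^\n]*"),  # comments (simplified pattern)
    ("OP", r"[-+=*/%():{},]"),
    ("NAME", r"[A-Za-z_][A-Za-z0-9_]*"),
    (None, r"[ \t\r\n]+"),               # whitespace
]
COMPILED = [(kind, re.compile(pattern)) for kind, pattern in RULES]

def tokenize(text):
    """Return a list of (token, lexeme) pairs for the given input text."""
    pos, line, out = 0, 1, []
    while pos < len(text):
        best_kind, best_len = None, 0
        for kind, pattern in COMPILED:
            match = pattern.match(text, pos)
            # Strict '>' keeps the earlier rule when two matches tie in length.
            if match and match.end() - pos > best_len:
                best_kind, best_len = kind, match.end() - pos
        if best_len == 0:
            raise SyntaxError(f"Scanner error at line {line}")
        lexeme = text[pos:pos + best_len]
        if best_kind == "OP":
            out.append((lexeme, lexeme))  # like 'return yytext[0]'
        elif best_kind is not None:
            out.append((best_kind, lexeme))
        line += lexeme.count("\n")
        pos += best_len
    return out

tokens = tokenize("CASE (flags & 0x0F) != 0 && !done : tcp // next layer\n")
kinds = [kind for kind, _ in tokens]
```

For this input, both `&` and `&&` come back as `AND`, `!` as `NOT`, and `!=` as the single token `NOT_EQ`, because the longer match is preferred over `!` followed by `=`. An identifier that merely starts with a keyword is returned as `NAME` for the same reason.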