Statistics files grammar

Filters and Statistics expressions grammar

This section presents the syntactic and lexical levels used to parse the expressions found in filters and in statistics files.

Syntactic level

The following grammar is given in YACC format. Terminal symbols are written in capital letters (or as quoted single characters), while nonterminal symbols are written in lowercase letters.

   1  name : NAME
   2  field : name '.' name
   3  cost_str : COST_STRING
   4  number : NUMBER
   5         | C_H_NUM
   6         | C_O_NUM
   7         | LIT_CH
   8  expression : field
   9              | number
  10              | TOK_TRUE
  11              | TOK_FALSE
  12              | TOK_CANTTELL
  13              | '(' expression ')'
  14              | expression '+' expression
  15              | expression LSHIFT expression
  16              | expression RSHIFT expression
  17              | expression NOT_EQ expression
  18              | expression MIU expression
  19              | expression MAU expression
  20              | expression '<' expression
  21              | expression '>' expression
  22              | expression '=' expression
  23              | expression AND expression
  24              | expression OR expression
  25              | expression '-' expression
  26              | expression '*' expression
  27              | expression '/' expression
  28              | expression '%' expression
  29              | '-' expression
  30              | NOT expression
  31              | SIZEOF '(' field ')'
  32              | name
  33              | GETDATALEN '(' name ')'
  34              | SIZEOF '(' name ')'
  35              | GETVMVAR '(' cost_str ')'

'expression' is the start symbol.
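The four alternatives of 'number' differ only in how the literal is spelled (decimal, hexadecimal, octal, or character literal; see the lexical level). As a rough Python sketch, assuming that each literal simply denotes an integer value (the grammar itself does not state the conversion), a hypothetical helper named literal_value could map a token and its text to a value. Escape sequences inside character literals are deliberately not handled, because their meaning is not defined in this document.

```python
def literal_value(token, lexeme):
    """Convert the text of a 'number' token to an integer (illustrative only)."""
    if token == "NUMBER":
        return int(lexeme, 10)      # plain decimal, e.g. "42"
    if token == "C_H_NUM":
        return int(lexeme, 16)      # base 16 accepts the 0x / 0X prefix
    if token == "C_O_NUM":
        return int(lexeme, 8)       # leading "0" marks octal, e.g. "017"
    if token == "LIT_CH":
        body = lexeme[1:-1]         # strip the surrounding single quotes
        if len(body) == 1:
            return ord(body)        # assumption: value is the character code
        raise ValueError("escape sequences are not covered by this sketch")
    raise ValueError(f"not a number token: {token}")


print(literal_value("C_H_NUM", "0x1F"))
print(literal_value("C_O_NUM", "017"))
```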

The following table lists the operators in order of precedence, from lowest to highest, together with their associativity:

Low precedence
Operators	Associativity
OR	left
AND	left
'=',NOT_EQ,'<','>',MIU,MAU	left
LSHIFT,RSHIFT	left
'+','-'	left
'*','/','%'	left
NOT,MENO_UNARIO (unary '-')	right
High precedence
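
To illustrate how the table above is read, here is a small Python sketch that groups an already tokenized expression by precedence climbing. It is not the YACC-generated parser: the names LEVELS, PRECEDENCE and parse are hypothetical, operands (names, fields, numbers) are treated as opaque strings, and the result is only a nested tuple showing the grouping.

```python
# Binary operators from the table, lowest precedence first.
# All of them are left-associative.
LEVELS = [
    ["OR"],
    ["AND"],
    ["=", "NOT_EQ", "<", ">", "MIU", "MAU"],
    ["LSHIFT", "RSHIFT"],
    ["+", "-"],
    ["*", "/", "%"],
]
PRECEDENCE = {op: level for level, ops in enumerate(LEVELS) for op in ops}
UNARY_LEVEL = len(LEVELS)  # NOT and unary minus bind tighter than any binary operator


def parse(tokens):
    """Return a nested tuple showing how the token list is grouped."""
    tree, rest = _expression(list(tokens), 0)
    if rest:
        raise SyntaxError(f"unexpected token {rest[0]!r}")
    return tree


def _expression(tokens, min_level):
    left, tokens = _operand(tokens)
    while tokens and tokens[0] in PRECEDENCE and PRECEDENCE[tokens[0]] >= min_level:
        op = tokens[0]
        # Left associativity: the right operand may only contain
        # operators of strictly higher precedence than 'op'.
        right, tokens = _expression(tokens[1:], PRECEDENCE[op] + 1)
        left = (op, left, right)
    return left, tokens


def _operand(tokens):
    if not tokens:
        raise SyntaxError("unexpected end of expression")
    head, tail = tokens[0], tokens[1:]
    if head in ("NOT", "-"):
        # Unary operators: highest precedence, right-associative.
        inner, tail = _expression(tail, UNARY_LEVEL)
        return ("NOT" if head == "NOT" else "NEG", inner), tail
    if head == "(":
        inner, tail = _expression(tail, 0)
        if not tail or tail[0] != ")":
            raise SyntaxError("missing ')'")
        return inner, tail[1:]
    if head == ")" or head in PRECEDENCE:
        raise SyntaxError(f"unexpected token {head!r}")
    return head, tail  # a name, field or number, kept as an opaque string


print(parse(["a", "+", "b", "*", "c"]))
print(parse(["a", "-", "b", "-", "c"]))
```

With these rules, "a + b * c" groups the multiplication first, and "a - b - c" groups from the left, as the associativity column indicates.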

Lexical level

The following is a summary of the scanner that recognizes the tokens in the files. The summary is given in LEX format.

SPACE ([\t\r\n ])
SPACES ({SPACE}*)
LETTER ([a-zA-Z])
DIGIT ([0-9])
F_DIGIT ([1-9])
DECIMAL ({DIGIT}*{F_DIGIT}|0)
UNTIL_END ([^\n]*)
R_REM (\/\/)
HEX ((("0x")|("0X"))([a-fA-F]|{DIGIT})+)
OCT (("0")([0-7])+)
INTEGER ({F_DIGIT}{DIGIT}*)
VAR_NAME ((("_")|{LETTER})({LETTER}|{DIGIT}|("_"))*)
COST_STRING (\"[^\"\n]*\")
COST_INT_CORTO ({INTEGER}|[0])
QUOTE	\'([^\\']|\\[^'\\]+|\\\\)\'
OP	([-+=*/<>(),.%])
VAR ({VAR_NAME})
COMMENT ("/*"([^*]*|"*"+[^*/])"*"+\/)
COMMENT_L ("//".*)
COMMENT_P (";".*)
%%
\xd	/* Discard CR */
{COMMENT}	{ /* Discard comments */ }
{COMMENT_L}	{ /* Discard comments */ }
{COMMENT_P}	{ /* Discard comments */ }
sizeof	{return SIZEOF; }
offsetof	{return OFFSETOF; }
getvmvar	{return GETVMVAR; }
getdatalen	{return GETDATALEN; }
{COST_INT_CORTO}	{return NUMBER; }
{COST_STRING}	{return COST_STRING; }
{HEX}	{return C_H_NUM; }
{OCT}	{return C_O_NUM; }
{QUOTE}	{return LIT_CH; }
AND {return AND;}
"&" {return AND;}
"&&" {return AND;}
OR {return OR;}
"|" {return OR;}
"||" {return OR;}
NOT {return NOT;}
"!" {return NOT;}
"~" {return NOT;}
TRUE {return TOK_TRUE;}
FALSE {return TOK_FALSE;}
CANTTELL {return TOK_CANTTELL;}
{OP}	{return *yytext; }
"=="	{return '='; }
"!="	{return NOT_EQ; }
">="	{return MAU; }
"<="	{return MIU; }
"<<"	{return LSHIFT; }
">>"	{return RSHIFT; }
{VAR}	{return NAME;}
{SPACE}+ {/*ignore the spaces*/}
. {/*Ignore all the rest*/}
%%

'AND' can be written as: 'AND', '&&', '&'

'OR' can be written as: 'OR', '||', '|'

'NOT' can be written as: 'NOT', '!', '~'

'=' can be written as: '==', '='
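
The scanner rules and the alternative spellings above can be approximated with a short Python sketch built on the standard re module. This is an illustration rather than the real LEX-generated scanner: the names KEYWORDS, SYMBOLS, TOKEN_RE and tokenize are hypothetical, and the block-comment pattern is a simplified non-greedy match instead of the COMMENT definition. LEX always prefers the longest match, so the sketch lists multi-character operators before single-character ones and classifies keywords only after a whole word has been read.

```python
import re

# Keywords and word-like operators. Words are matched in full first so that,
# as with LEX's longest-match rule, "sizeofx" or "ANDx" stays a plain NAME.
KEYWORDS = {
    "sizeof": "SIZEOF", "offsetof": "OFFSETOF",
    "getvmvar": "GETVMVAR", "getdatalen": "GETDATALEN",
    "AND": "AND", "OR": "OR", "NOT": "NOT",
    "TRUE": "TOK_TRUE", "FALSE": "TOK_FALSE", "CANTTELL": "TOK_CANTTELL",
}

# Symbolic spellings, including the aliases for AND, OR, NOT and '='.
SYMBOLS = {
    "&&": "AND", "&": "AND", "||": "OR", "|": "OR", "!": "NOT", "~": "NOT",
    "==": "=", "!=": "NOT_EQ", ">=": "MAU", "<=": "MIU",
    "<<": "LSHIFT", ">>": "RSHIFT",
}

TOKEN_RE = re.compile(r"""
    (?P<SKIP>        /\*[\s\S]*?\*/ | //[^\n]* | ;[^\n]* | [ \t\r\n]+ )
  | (?P<C_H_NUM>     0[xX][0-9a-fA-F]+ )
  | (?P<C_O_NUM>     0[0-7]+ )
  | (?P<NUMBER>      [1-9][0-9]* | 0 )
  | (?P<COST_STRING> "[^"\n]*" )
  | (?P<LIT_CH>      '(?:[^\\']|\\[^'\\]+|\\\\)' )
  | (?P<WORD>        [_a-zA-Z][_a-zA-Z0-9]* )
  | (?P<SYMBOL>      && | \|\| | == | != | >= | <= | << | >> | [&|!~] )
  | (?P<OP>          [-+=*/<>(),.%] )
  | (?P<OTHER>       . )
""", re.VERBOSE)


def tokenize(text):
    """Return a list of (token, lexeme) pairs for one expression."""
    tokens = []
    for match in TOKEN_RE.finditer(text):
        kind, lexeme = match.lastgroup, match.group()
        if kind in ("SKIP", "OTHER"):
            continue  # comments, white space and unknown characters are dropped
        if kind == "WORD":
            kind = KEYWORDS.get(lexeme, "NAME")
        elif kind == "SYMBOL":
            kind = SYMBOLS[lexeme]
        elif kind == "OP":
            kind = lexeme  # single-character operators are their own token
        tokens.append((kind, lexeme))
    return tokens


for token in tokenize("ip.src == 0x0A && !tcp.syn // trailing comment"):
    print(token)
```

Note that '==' and '=' both produce the '=' token, and that '&&', '&' and 'AND' all produce AND, which is why the grammar only needs one rule per operator.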