lexer implementation notes
SinghCoder edited this page Feb 16, 2020 · 1 revision
- Reduce the number of I/O operations
  - Don't read character by character
  - Read block by block from disk
  - To test performance, for now create large test cases by duplicating code
  - Use a twin buffer
- Avoid modularity at the very basic level
  - Don't use isalpha/isdigit-style functions
  - Maybe use inline functions or macros instead
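The block-by-block reading and twin buffer notes above could be sketched roughly as follows. This is only a minimal sketch: `TwinBuffer`, `tb_init`, `tb_next` and the 4096-byte `HALF_SIZE` are hypothetical names and values chosen for illustration, not part of the actual lexer.

```c
#include <stdio.h>

#define HALF_SIZE 4096   /* assumed block size; tune it to the disk block size */

/* Hypothetical twin buffer: two halves that are refilled alternately. */
typedef struct {
    FILE  *fp;
    char   buf[2][HALF_SIZE];
    size_t len[2];   /* number of valid bytes in each half */
    int    cur;      /* half currently being scanned (0 or 1) */
    size_t pos;      /* index of the next unread char in that half */
} TwinBuffer;

/* One fread per block instead of one I/O call per character. */
static void tb_fill(TwinBuffer *tb, int half) {
    tb->len[half] = fread(tb->buf[half], 1, HALF_SIZE, tb->fp);
}

void tb_init(TwinBuffer *tb, FILE *fp) {
    tb->fp = fp;
    tb->cur = 0;
    tb->pos = 0;
    tb->len[1] = 0;
    tb_fill(tb, 0);
}

/* Returns the next character, or EOF when the input is exhausted. */
int tb_next(TwinBuffer *tb) {
    if (tb->pos == tb->len[tb->cur]) {
        /* A short read means the file ended inside this half. */
        if (tb->len[tb->cur] < HALF_SIZE) return EOF;
        /* Switch halves; the old half stays intact, so a lexeme that
           started there is still available while scanning continues. */
        tb->cur ^= 1;
        tb_fill(tb, tb->cur);
        tb->pos = 0;
        if (tb->len[tb->cur] == 0) return EOF;
    }
    return (unsigned char)tb->buf[tb->cur][tb->pos++];
}
```

A caller would open the source file, call `tb_init` once, and then call `tb_next` in the scanning loop until it returns `EOF`. Retracting the forward pointer across the boundary between halves is left out of this sketch.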
-
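A possible shape for the macro / inline replacements of isalpha and isdigit mentioned above. The names `IS_DIGIT`, `IS_ALPHA` and `is_ident_char` are made up for this sketch, and it assumes ASCII input. Whether it is actually faster than `<ctype.h>` depends on the compiler, so it is worth measuring.

```c
/* Assumes ASCII. The argument is evaluated more than once, so do not
   pass expressions with side effects such as *p++. */
#define IS_DIGIT(c) ((c) >= '0' && (c) <= '9')
#define IS_ALPHA(c) (((c) >= 'a' && (c) <= 'z') || ((c) >= 'A' && (c) <= 'Z'))

/* Inline alternative: evaluates its argument once, and the compiler
   can still remove the call overhead. */
static inline int is_ident_char(int c) {
    return IS_ALPHA(c) || IS_DIGIT(c) || c == '_';
}
```

The inline function is the safer of the two forms when the argument is something like `*p++`.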
A token
- name of token
  - i.e. which accept state was reached
- lexeme recorded
  - prefer char [] instead of char * (avoid pointers as much as possible)
  - the lexeme is the characters from begin to forward_ptr - 1
  - then move the begin pointer to forward_ptr
- line number
  - unsigned int
- value
  - union { int, float }
  - tag for value
    - name of token
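The token record described above might look something like this in C. It is a sketch under assumptions: the token names (`TK_ID`, `TK_NUM`, `TK_RNUM`), the `MAX_LEXEME_LEN` limit and the helper `record_lexeme` are all hypothetical and only serve to show the layout.

```c
#include <string.h>

#define MAX_LEXEME_LEN 32   /* assumed limit, not from the notes */

/* Hypothetical token names; the real set depends on the language. */
typedef enum { TK_ID, TK_NUM, TK_RNUM } TokenName;

typedef struct {
    TokenName    name;                        /* also the tag for the value union */
    char         lexeme[MAX_LEXEME_LEN + 1];  /* char [] owned by the token, not a char * */
    unsigned int line_num;
    union {
        int   int_val;    /* meaningful when name == TK_NUM  */
        float float_val;  /* meaningful when name == TK_RNUM */
    } value;
} Token;

/* Copies the characters begin .. forward_ptr - 1 into the token
   (truncating overlong lexemes) and returns the new begin pointer. */
const char *record_lexeme(Token *tok, const char *begin, const char *forward_ptr) {
    size_t n = (size_t)(forward_ptr - begin);
    if (n > MAX_LEXEME_LEN) n = MAX_LEXEME_LEN;
    memcpy(tok->lexeme, begin, n);
    tok->lexeme[n] = '\0';
    return forward_ptr;   /* begin now starts where forward_ptr stopped */
}
```

Using the token name as the union tag avoids a separate tag field: the parser checks `name` before reading `value`.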
-
For now, print the lexer output as:
- token_name | value | line_num \n
- printing token_name requires a mapping table (maps each token enum value to its corresponding string)
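One way the mapping table and the output line could be written. The enum members and the helpers `token_name_str` and `format_token` are placeholders for this sketch; the value is passed in as a string to keep the example small.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical token set; TK_COUNT is only there to size the table. */
typedef enum { TK_ID, TK_NUM, TK_RNUM, TK_COUNT } TokenName;

/* Mapping table: token enum value -> printable name. */
static const char *const token_names[TK_COUNT] = {
    [TK_ID]   = "ID",
    [TK_NUM]  = "NUM",
    [TK_RNUM] = "RNUM",
};

const char *token_name_str(TokenName name) {
    return ((unsigned)name < TK_COUNT) ? token_names[name] : "UNKNOWN";
}

/* Writes "token_name | value | line_num\n" into out; returns the snprintf result. */
int format_token(char *out, size_t cap, TokenName name,
                 const char *value, unsigned int line_num) {
    return snprintf(out, cap, "%s | %s | %u\n",
                    token_name_str(name), value, line_num);
}
```

With these definitions, `format_token(buf, sizeof buf, TK_NUM, "42", 7)` fills `buf` with the line `NUM | 42 | 7`. Designated initializers keep the table in sync with the enum even if the members are reordered.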
-
The parser retrieves tokens one by one.
get_next_token()
- The lexer stores a single record for a token.
- Lexing and parsing go hand in hand
- On the parser's demand, the lexer creates a token and returns it to the parser
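A rough sketch of the on-demand flow above, reading from an in-memory string to stay short. It assumes "one single record" means the lexer keeps one token record and overwrites it on each call. The `Lexer` struct, `lexer_init`, the token names and the tiny grammar (identifiers and integers only) are made up for illustration.

```c
#include <string.h>

#define IS_DIGIT(c) ((c) >= '0' && (c) <= '9')
#define IS_ALPHA(c) (((c) >= 'a' && (c) <= 'z') || ((c) >= 'A' && (c) <= 'Z'))
#define MAX_LEXEME_LEN 32   /* assumed limit */

typedef enum { TK_ID, TK_NUM, TK_EOF, TK_ERROR } TokenName;  /* hypothetical */

typedef struct {
    TokenName    name;
    char         lexeme[MAX_LEXEME_LEN + 1];
    unsigned int line_num;
} Token;

typedef struct {
    const char  *forward_ptr;  /* next unread character */
    unsigned int line_num;
    Token        current;      /* the single record, overwritten on every call */
} Lexer;

void lexer_init(Lexer *lx, const char *src) {
    lx->forward_ptr = src;
    lx->line_num = 1;
}

/* Called by the parser each time it needs one more token. */
const Token *get_next_token(Lexer *lx) {
    const char *p = lx->forward_ptr;
    while (*p == ' ' || *p == '\t' || *p == '\r' || *p == '\n') {
        if (*p == '\n') lx->line_num++;
        p++;
    }
    const char *begin = p;
    Token *tok = &lx->current;
    tok->line_num = lx->line_num;

    if (*p == '\0') {
        tok->name = TK_EOF;
    } else if (IS_ALPHA(*p)) {
        while (IS_ALPHA(*p) || IS_DIGIT(*p)) p++;
        tok->name = TK_ID;
    } else if (IS_DIGIT(*p)) {
        while (IS_DIGIT(*p)) p++;
        tok->name = TK_NUM;
    } else {
        p++;                    /* consume one unknown character */
        tok->name = TK_ERROR;
    }

    size_t n = (size_t)(p - begin);
    if (n > MAX_LEXEME_LEN) n = MAX_LEXEME_LEN;
    memcpy(tok->lexeme, begin, n);
    tok->lexeme[n] = '\0';
    lx->forward_ptr = p;        /* the next call starts from here */
    return tok;
}
```

Because the record is reused, the parser has to copy anything it wants to keep before asking for the next token.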
-
Implement a hash table for keyword lookup; it should be collision resistant.
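A minimal sketch of such a keyword table, assuming open addressing with linear probing and the FNV-1a string hash. Both are choices made for this example, not something the notes prescribe. `kw_insert`, `kw_lookup`, `TABLE_SIZE` and the integer token codes are hypothetical.

```c
#include <stdint.h>
#include <string.h>

#define TABLE_SIZE 64   /* power of two, kept well above the keyword count */

typedef struct {
    const char *word;   /* NULL marks an empty slot */
    int         token;  /* hypothetical token code for the keyword */
} KeywordEntry;

static KeywordEntry kw_table[TABLE_SIZE];   /* zero-initialized: all slots empty */

/* 32-bit FNV-1a hash of a NUL-terminated string. */
static uint32_t fnv1a(const char *s) {
    uint32_t h = 2166136261u;
    while (*s) {
        h ^= (unsigned char)*s++;
        h *= 16777619u;
    }
    return h;
}

/* Inserts or updates a keyword; returns 0 on success, -1 if the table is full. */
int kw_insert(const char *word, int token) {
    uint32_t i = fnv1a(word) & (TABLE_SIZE - 1);
    for (int probes = 0; probes < TABLE_SIZE; probes++) {
        if (kw_table[i].word == NULL || strcmp(kw_table[i].word, word) == 0) {
            kw_table[i].word = word;
            kw_table[i].token = token;
            return 0;
        }
        i = (i + 1) & (TABLE_SIZE - 1);   /* collision: try the next slot */
    }
    return -1;
}

/* Returns the keyword's token code, or -1 if the lexeme is not a keyword. */
int kw_lookup(const char *lexeme) {
    uint32_t i = fnv1a(lexeme) & (TABLE_SIZE - 1);
    for (int probes = 0; probes < TABLE_SIZE; probes++) {
        if (kw_table[i].word == NULL) return -1;
        if (strcmp(kw_table[i].word, lexeme) == 0) return kw_table[i].token;
        i = (i + 1) & (TABLE_SIZE - 1);
    }
    return -1;
}
```

The lexer would fill the table once at startup and call `kw_lookup` on every identifier-shaped lexeme; a result of -1 means it is an ordinary identifier. Collisions are not eliminated here, only kept rare by the sparse table and resolved by probing.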