[DRAFT] Make an interpreter for this abomination #74

Open · wants to merge 78 commits into base: main

Commits (78)
2ffb75a  Groundwork (Bytestorm5, Jun 4, 2023)
5aa5b4a  typo (Bytestorm5, Jun 4, 2023)
7dc2119  tokenizer nearly done? (Bytestorm5, Jun 4, 2023)
d32846c  mobile coding??????? (Bytestorm5, Jun 4, 2023)
75ead99  added more yelling (Bytestorm5, Jun 4, 2023)
2a9fb21  newline and debugging (Bytestorm5, Jun 4, 2023)
f68141e  Merge pull request #1 from TodePond/main (Bytestorm5, Jun 5, 2023)
7aba489  very indecisive about the naming (Bytestorm5, Jun 5, 2023)
347110d  Merge branch 'main' of https://github.com/Bytestorm5/DreamBerd (Bytestorm5, Jun 5, 2023)
1d3a98d  forgot to save (Bytestorm5, Jun 5, 2023)
23407e7  tokenizer probably works (Bytestorm5, Jun 5, 2023)
50732c2  Put Tokenizer in a box (Bytestorm5, Jun 5, 2023)
e678549  Tokenizer error handling (Bytestorm5, Jun 5, 2023)
c52f87b  Generalize crawler to hopefully make parsing easier (Bytestorm5, Jun 5, 2023)
448aef7  Improved AI (Bytestorm5, Jun 5, 2023)
7a558c2  Implicit AI with newlines (Bytestorm5, Jun 5, 2023)
d3c5655  Added Americentrism (Bytestorm5, Jun 5, 2023)
25bd3ad  Removed Americentrism (Bytestorm5, Jun 5, 2023)
2be54ab  Merge branch 'TodePond:main' into main (Bytestorm5, Jun 5, 2023)
0ccea99  Added className and other misc changes (Bytestorm5, Jun 5, 2023)
d392846  Individual tokens for Exclamations (Bytestorm5, Jun 5, 2023)
d9320f5  totally didn't forget boolean operators who would do that not this gu… (Bytestorm5, Jun 5, 2023)
6ab9618  Merge branch 'TodePond:main' into main (Bytestorm5, Jun 5, 2023)
806a8b3  Check for internet connection (Bytestorm5, Jun 6, 2023)
2f0f120  Parser groundwork (Bytestorm5, Jun 6, 2023)
c074a2f  Improved "FUNCTION" keyword checks. (Odinmylord, Jun 6, 2023)
872ba48  Reformatted file. (Odinmylord, Jun 6, 2023)
62b2214  Update template.tsx (Bytestorm5, Jun 6, 2023)
64f17fe  REGEX REGEX REGEX REGEX REGEX REGEX REGEX REGEX REGEX REGEX REGEX REGEX (Bytestorm5, Jun 6, 2023)
dfede24  Made the regex even more cursed, now matches triple const (Odinmylord, Jun 7, 2023)
2865c10  Started implementing single line function declaration (Odinmylord, Jun 7, 2023)
0ae9440  Fixed the regex (splitting it in multiple lines wasn't a good idea) a… (Odinmylord, Jun 7, 2023)
11881fc  Fixed small issue with indentation (Odinmylord, Jun 7, 2023)
d82fd46  Fixed issues with variable assignment and function keyword (Odinmylord, Jun 7, 2023)
df530c0  Re-enable identification requirement && Lifetime functionality in com… (Bytestorm5, Jun 8, 2023)
7917404  it is way too late to figure out time travel (Bytestorm5, Jun 8, 2023)
5493a41  Preparing for chicanery (Bytestorm5, Jun 8, 2023)
5a0a0e8  Variables tracked in a map now for greater flexibility (Bytestorm5, Jun 8, 2023)
9036f89  This is a first try to check function syntax it isn't complete yet (Odinmylord, Jun 8, 2023)
ccaf1a4  some processor functions (Bytestorm5, Jun 8, 2023)
efac972  Wrap expr in get_var (Bytestorm5, Jun 8, 2023)
3445a8d  forgot to delete a keyword (Bytestorm5, Jun 8, 2023)
265a27f  Dynamically determine if quotes are needed (Bytestorm5, Jun 8, 2023)
645885a  expressions are hard (Bytestorm5, Jun 9, 2023)
b7c0580  Merge remote-tracking branch 'origin/main' into functions (Odinmylord, Jun 9, 2023)
babf1d2  Added indentation check and fixed issue with newlines on linux (Odinmylord, Jun 12, 2023)
9082e9c  Now the script works with python 3.8+ (Odinmylord, Jun 12, 2023)
945b278  almost finished parsing expressions (Bytestorm5, Jun 13, 2023)
e3d7fab  why was I parsing to begin with? (Bytestorm5, Jun 13, 2023)
4036885  Revert "why was I parsing to begin with?" (Bytestorm5, Jun 13, 2023)
336ccac  Figured out parsing but do exponents *really* have to be right-associ… (Bytestorm5, Jun 13, 2023)
04165bc  Finished expression parsing except for right associativity (Bytestorm5, Jun 13, 2023)
a3bc397  Line preprocessor now solves precise equalities (Bytestorm5, Jun 13, 2023)
08cae0c  Moved condition block manager to helper file (Bytestorm5, Jun 13, 2023)
91dd845  keeping equality check simple for now, might change (Bytestorm5, Jun 13, 2023)
ee6b452  Moved precise equality to expr processor (Bytestorm5, Jun 13, 2023)
f218614  When check conditions and advanced variable geting (Bytestorm5, Jun 13, 2023)
d88389b  We have some sort of function syntax check (Odinmylord, Jun 13, 2023)
92f60d7  Added const const var to invalid mix (Odinmylord, Jun 13, 2023)
aceda74  Merge pull request #2 from Bytestorm5/functions (Bytestorm5, Jun 13, 2023)
8155a62  Recursive scoping (Bytestorm5, Jun 13, 2023)
08c55da  Added type identifiers (they don't do anything) (Bytestorm5, Jun 13, 2023)
11e4722  All functions lead to arrows (Bytestorm5, Jun 13, 2023)
91a11b4  attempt at block handling (Bytestorm5, Jun 13, 2023)
c06b6df  fix(compiler): variable declaration statements (gabrielchl, Jun 22, 2023)
8f71680  Add slots and fix type annotations (CoolCat467, Jun 22, 2023)
3fd1c02  Add pyproject.toml for dependancies and mypy flags (CoolCat467, Jun 22, 2023)
793ef9b  Merge pull request #3 from gabrielchl/main (Bytestorm5, Jun 22, 2023)
fd84740  Merge branch 'pr/4' (Bytestorm5, Jun 22, 2023)
2fd43e7  Merge pull request #5 from TodePond/main (Bytestorm5, Jun 22, 2023)
03dc576  "serious" compiler boilerplate (Bytestorm5, Jun 23, 2023)
baa183a  Reduce code duplication (CoolCat467, Jun 23, 2023)
59701a6  Add code formatting (CoolCat467, Jun 23, 2023)
ff68d1d  Revert "Reduce code duplication", wrong branch (CoolCat467, Jun 23, 2023)
78255b8  refactor (Bytestorm5, Jun 23, 2023)
f1c62af  feat: use filepath from args (gabrielchl, Jun 23, 2023)
6dbda7d  Merge pull request #8 from gabrielchl/feat-filepath-from-args (Bytestorm5, Jun 23, 2023)
435eba3  Merge pull request #7 from CoolCat467/add-formatting (Bytestorm5, Jun 23, 2023)
Files changed · Changes from 17 commits
1 change: 1 addition & 0 deletions .gitignore
@@ -0,0 +1 @@
venv/
2 changes: 2 additions & 0 deletions README.md
Original file line number Diff line number Diff line change
@@ -330,6 +330,8 @@ import add!
add(3, 2)!
```

Note that because it is impossible to export all functions of a file to all other DreamBerd files in existence, there is __no__ DreamBerd standard library.

By the way, to see DreamBerd in action, check out [this page](https://github.com/TodePond/DreamBerd/blob/main/LICENSE.md).

## Class
5 changes: 5 additions & 0 deletions src/README.md
@@ -0,0 +1,5 @@
**Note:** This code is __under development__ and is not yet functional, let alone efficient.
# Compinterpreting
Running a perfect programming language requires a perfect compiler. As such, the DreamBerd foundation has devised the next innovation in the field of compiling: the **Compinterpreter**.

The Compinterpreter interprets DreamBerd while simultaneously transpiling it to JavaScript, maximizing efficiency while staying true to DreamBerd.
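
The sketch below is not part of this PR; it is only a hypothetical illustration of that dual pipeline, with stand-in `evaluate` and `transpile` helpers that handle a single toy statement:

```python
# A minimal sketch of the compinterpret idea. `evaluate` and `transpile`
# are hypothetical stand-ins, not the real interpreter or transpiler.

def evaluate(stmt: str) -> None:
    # Stand-in interpreter: only understands print("...")! statements.
    if stmt.startswith('print("') and stmt.endswith('")!'):
        print(stmt[len('print("'):-len('")!')])

def transpile(stmt: str) -> str:
    # Stand-in transpiler: DreamBerd's '!' terminator becomes JS's ';'.
    return stmt[:-1] + ';' if stmt.endswith('!') else stmt

def compinterpret(source: str) -> str:
    js = []
    for stmt in source.splitlines():
        if stmt.strip():
            evaluate(stmt)              # run it now (the interpreter half)
            js.append(transpile(stmt))  # emit JavaScript (the compiler half)
    return '\n'.join(js)

print(compinterpret('print("Hello world")!'))  # prints the greeting, then the JS
```
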
283 changes: 283 additions & 0 deletions src/compinterpret.py
@@ -0,0 +1,283 @@
# NOTE: this file uses the match statement, so it requires Python 3.10+.
import locale
import os

tokens = ["STRING", "NOT", "!", "IF", 'ELSE', '(', ')', '[', ']', 'TRUE', 'FALSE', 'CONST', 'VAR', '<', '>', 'INT', 'REAL', 'INFINITY', 'FUNCTION', 'PREVIOUS',
          'NEXT', 'AWAIT', 'NEW_FILE', 'EXPORT', 'TO', 'CLASS', 'NEW', '.', 'USE', 'PLUS', 'MINUS', 'MULTIPLY', 'DIVIDE', '=', 'IDENTIFIER', 'INDENT',
          'SPACE', 'DELETE', 'EOF', 'NEWLINE', '{', '}', 'INC', 'DEC', 'LOOSE_EQUALITY', 'PRECISE_EQUALITY', 'LITERAL_EQUALITY', 'ERROR', 'CURRENCY']
locale.setlocale(locale.LC_ALL, '')


class Token():
    def __init__(self, token: str, lexeme: str) -> None:
        assert token.upper() in tokens

        self.token = token.upper()
        self.lexeme = lexeme

    def __repr__(self) -> str:
        return f'{self.token}({repr(self.lexeme)})'

    def __str__(self) -> str:
        return f'{self.token}({repr(self.lexeme)})'


class SimpleListCrawler():
    def __init__(self, raw) -> None:
        self.raw = raw
        self.cursor = 0

    def pop(self):
        # Keep advancing the cursor even past the end so that a back()
        # after an EOF read stays symmetric; otherwise pushing back at EOF
        # would re-read the last character forever.
        self.cursor += 1
        if self.cursor > len(self.raw):
            return ''
        return self.raw[self.cursor - 1]

    def back(self, count=1):
        self.cursor -= count

    def peek(self, count=1):
        # Slicing already yields '' (or a short read) at the end of input
        return self.raw[self.cursor:self.cursor + count]


class Tokenizer():
    def __init__(self) -> None:
        self.operators = '+-*/<>=()[] '
        self.reserved_chars = '!;.{}' + self.operators

        self.basic_mappings = {
            ';': 'NOT',
            # '=' is not mapped here; runs of '=' get their own branch below
            '*': 'MULTIPLY',
            '/': 'DIVIDE',
            '.': '.',
            '(': '(',
            ')': ')',
            '[': '[',
            ']': ']',
            '<': '<',
            '>': '>',
            '{': '{',
            '}': '}'
        }

        regional_currency = locale.localeconv()['currency_symbol']
        if regional_currency == '':
            # Americentrism, baby 😎🦅🔫🔫🦅🦅🦅🔫🦅 🦅🔫🔫🔫🦅🔫🦅🔫 🦅🔫🔫🔫🦅🦅🔫🔫 🦅🔫🔫🦅🔫🦅🦅🦅 🦅🦅🔫🦅🦅🦅🦅🦅 🦅🔫🔫🦅🦅🔫🦅🦅 🦅🔫🔫🦅🔫🦅🦅🔫 🦅🔫🔫🦅🦅🔫🦅🦅 🦅🦅🔫🦅🦅🦅🦅🦅 🦅🦅🔫🔫🔫🦅🦅🔫 🦅🦅🔫🦅🔫🔫🔫🔫 🦅🦅🔫🔫🦅🦅🦅🔫 🦅🦅🔫🔫🦅🦅🦅🔫😎
            regional_currency = '$'
        self.basic_mappings[regional_currency] = 'CURRENCY'

    def is_fn_subset(self, string):
        # DreamBerd accepts any ordered subset of FUNCTION as the function
        # keyword (FUNCTION, FUNC, FN, UNION, ...), so check that every
        # character of the candidate appears in "FUNCTION" in order.
        target = "FUNCTION"
        i = 0

        for char in string:
            i = target.find(char, i)
            if i == -1:
                return False
            i += 1

        return True

    def getNextToken(self, file: SimpleListCrawler):
        def readchar(i=1):
            return ''.join([file.pop() for _ in range(i)])

        c = readchar()

        if c == '':
            # The file has ended
            return Token('EOF', '')

        lexeme = ''

        if c == ' ':
            if file.peek(2) == '  ':
                file.pop()
                file.pop()
                # 3-space indent
                return Token('INDENT', '   ')
            else:
                return Token('SPACE', ' ')

        elif c == '!':
            marks = 0
            while c == '!':
                c = readchar()
                marks += 1
            # Push back the character that ended the run; this is safe even
            # when the file ends after a statement, because pop() keeps
            # advancing past the end.
            file.back()  # pushback
            return Token('!', '!' * marks)

        elif c in '+-':
            next_char = readchar()
            if c == next_char:
                return Token('INC' if c == '+' else 'DEC', c * 2)
            else:
                file.back()
                return Token('PLUS' if c == '+' else 'MINUS', c)

        elif c == '=':
            equals = 0
            while c == '=':
                c = readchar()
                equals += 1
            file.back()  # pushback
            match equals:
                case 1:
                    return Token('=', '=')
                case 2:
                    return Token('LOOSE_EQUALITY', '==')
                case 3:
                    return Token('PRECISE_EQUALITY', '===')
                case 4:
                    return Token('LITERAL_EQUALITY', '====')
                case _:  # TODO: File splits (might have to be a preprocessor thing)
                    return Token('ERROR', 'Too much Equality (max is 4)')

        elif c in '\"\'':
            quote_format = ''
            while c in '\"\'':
                quote_format += c
                c = file.pop()

            # leave c at the next char, it'll be added to the string

            quote = ''
            while c not in '\"\'\n' and c != '':
                quote += c
                if c == '\\':
                    if file.peek() in '\"\'':
                        quote += file.pop()  # character already escaped
                c = file.pop()
            file.back()

            # check for end quotes
            if c == '':
                # EOF reached; user probably forgot a closing quote.
                # Due to ambiguity, the rest of the file is now a string.
                # End quotes are presumed present, thus satisfying the AI requirement.
                # Diagnosis: skill issue
                return Token('STRING', quote)
            elif c == '\n':
                # Line breaks within strings are not allowed, so the string ends here
                return Token('STRING', quote)
            else:
                # If there are end quotes, they must match the opening quote
                # format exactly, in reverse order
                for i in range(len(quote_format)):
                    c = file.pop()
                    if c != quote_format[-(i + 1)]:
                        # Mismatch
                        return Token('ERROR', 'String quote format mismatched')

                return Token('STRING', quote)

        elif c == '/' and file.peek() == '/':
            file.pop()  # get rid of the next slash
            while c not in '\n\r':
                c = file.pop()
            file.back()
            return self.getNextToken(file)  # should capture the newline

        elif c in self.basic_mappings:
            return Token(self.basic_mappings[c], c)

        # INT and REAL
        elif c.isdigit():
            while c.isdigit():
                lexeme += c
                c = readchar()

            # c is now one character beyond the end of the integer part
            if c == '.':
                # REAL
                lexeme += '.'
                c = readchar()
                if c.isdigit():
                    while c.isdigit():
                        lexeme += c
                        c = readchar()
                elif c not in self.operators:
                    return Token('ERROR', 'Non-Operator immediately after real; letters are not real')

                file.back()  # pushback
                return Token('REAL', float(lexeme))
            else:
                # INT
                file.back()  # pushback
                return Token('INT', int(lexeme))

        while not c.isspace() and c not in self.reserved_chars:
            lexeme += c
            c = readchar()

        if len(lexeme) > 0:
            file.back()
            tok = lexeme.upper()
            if tok in tokens:
                return Token(tok, lexeme)

            # check for function
            if self.is_fn_subset(tok):
                return Token('FUNCTION', lexeme)
            else:
                return Token('IDENTIFIER', lexeme)
        else:
            # c is not alphanumeric; the only remaining cases are special
            # characters that count as whitespace
            if c == '\n':
                if readchar() != '\r':
                    file.back()
                return Token('!', c)  # newlines implicitly end statements
            elif c == '\r':
                if readchar() != '\n':
                    file.back()
                return Token('!', c)
            elif c == '\t':
                # Was very tempted to force you to only use the 3 spaces but this is complicated enough already
                return Token('INDENT', c)
            else:
                return Token('SPACE', c)

    def tokenize_file(self, path):
        with open(path, 'r') as reader:
            crawler = SimpleListCrawler(reader.read())

        token = self.getNextToken(crawler)
        while token.token != 'EOF':
            yield token
            token = self.getNextToken(crawler)
        yield token  # yield the EOF token too


def catch_tokenizer_errors(tokens: list[Token]):
    line = 1
    has_errors = False
    for token in tokens:
        # Newlines come through as '!' tokens whose lexeme is the raw character
        if token.lexeme in ('\n', '\r'):
            line += 1
        elif token.token == 'ERROR':
            print(f'-Tokenizer: ParseError on Line {line}: {token.lexeme}')
            has_errors = True
    return has_errors


class Parser():
    def __init__(self) -> None:
        pass


if __name__ == '__main__':
    # os.path.join keeps the test path working on Windows and Linux alike
    token_stream = list(Tokenizer().tokenize_file(os.path.join('test', 'db', 'db', 'time_travel.db')))

    if catch_tokenizer_errors(token_stream):
        print('\n')
        print("Tokenizer reports L code, fix your code or I won't compile this garbage")
        exit(1)
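
For a quick sanity check outside the bundled test files, a hypothetical usage sketch like this (assuming `src/compinterpret.py` is importable as `compinterpret`) tokenizes a single DreamBerd statement:

```python
# Hypothetical usage sketch, assuming src/compinterpret.py is on the path.
from compinterpret import SimpleListCrawler, Tokenizer

crawler = SimpleListCrawler('print("Hello world")!\n')
tokenizer = Tokenizer()
token = tokenizer.getNextToken(crawler)
while token.token != 'EOF':
    # Expected stream: IDENTIFIER, (, STRING, ), !, and a newline-as-! token
    print(token)
    token = tokenizer.getNextToken(crawler)
```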

2 changes: 2 additions & 0 deletions test/db/README.md
@@ -0,0 +1,2 @@
# INFOHAZARDS AHEAD
**Warning:** This folder contains potentially dangerous files that could pose information hazards. Proceed with caution.
8 changes: 8 additions & 0 deletions test/db/db/basic.db
@@ -0,0 +1,8 @@
print("Hello world")!
print("Hello world")!!!
print("Hello world")?

if (;false) {
print("Hello world")!
}

40 changes: 40 additions & 0 deletions test/db/db/time_travel.db
@@ -0,0 +1,40 @@
const var x = 5!
x++!
print(x)
previous x = 7!!!

"""6\n8\n"""

const var x = 5!
x++!
print(x)
print(previous x)

"""6\n5\n"""

fnc gaming() => {
if (x == 5) {
print(previous x * x)
}
}

class Player {
const var health = 10!
}

const var player1 = new Player()!
const var player2 = new Player()! //Error: Can't have more than one 'Player' instance!

class PlayerMaker {
function makePlayer() => {
class Player {
const var health = 10!
}
const const player = new Player()!
return player!
}
}

const const playerMaker = new PlayerMaker()!
const var player1 = playerMaker.makePlayer()!
const var player2 = playerMaker.makePlayer()!