-
Notifications
You must be signed in to change notification settings - Fork 159
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix(python): Handle Duplicative Import Names #5377
Conversation
@@ -44,47 +48,99 @@ export class PythonFile extends AstNode { | |||
} | |||
|
|||
public write(writer: Writer): void { | |||
const uniqueReferences = this.deduplicateReferences(); | |||
|
|||
this.updateWriterRefNameOverrides({ writer, uniqueReferences }); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This method is where the magic happens. Within it, we add state to Writer
this.statements.forEach((statement, idx) => { | ||
statement.write(writer); | ||
writer.newLine(); | ||
if (idx < this.statements.length - 1) { | ||
writer.newLine(); | ||
} | ||
}); | ||
|
||
writer.unsetRefNameOverrides(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wipe the state we had set on the writer just to be safe.
"from .cars import Car as Car_1 | ||
from .vehicles import Car as Car_2 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I chose a _n
suffix, but am open to suggestions here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@dvargas92495 lmk if you have a preferred naming convention here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
during normal development, I usually do something module dependent. eg
from .cars import Car as CarsCar
from .vehicles import Car as VehiclesCar
const nameOverride = writer.getRefNameOverride(this); | ||
writer.write(nameOverride.name); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here's the second (of 2) place where we use state that was added to the Writer
. Now, a Reference
object gets its name from the Writer
during its own write
method.
@@ -44,47 +48,99 @@ export class PythonFile extends AstNode { | |||
} | |||
|
|||
public write(writer: Writer): void { | |||
const uniqueReferences = this.deduplicateReferences(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I hope to get rid of the need for this method altogether later. Right now, the data structure we use for ref tracking allows for duplicative refs, but there's no reason it needs to. We can deduplicate upon adding a new ref.
In any case, this is just a refactor to take existing logic that was there and move it into a helper so its return value can be used in two places directly below.
} | ||
|
||
private getImportName({ writer, reference }: { writer: Writer; reference: Reference }): string { | ||
const nameOverride = writer.getRefNameOverride(reference); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here is one of two places where we use the state that was added to the Writer
.
|
||
// Build up a map of refs to their name overrides, keeping track of howmany times we've seen a name as we go. | ||
const completeRefPathsToNameOverrides: Record<string, { name: string; isAlias: boolean }> = {}; | ||
const nameUsageCounts: Record<string, number> = {}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we actually need to track the counts? We only care if it's used at all right? Could we just use reservedNames
instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We do for the current naming convention scheme, but once I adapt to match Vargas's proposal, we won't need to.
2430804
to
ddf4a63
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Stamping to unblock, but there's a few things I think we can improve here. I trust you to consider them before merging, but I'll leave you to making a judgement call for your use case.
uniqueReferences | ||
}: { | ||
writer: Writer; | ||
uniqueReferences: Map<string, { modulePath: ModulePath; references: Reference[]; referenceNames: Set<string> }>; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: Should we define an interface for this map value?
public getFullyQualifiedModulePath(): string { | ||
public getFullyQualifiedPath(): string { | ||
return this.modulePath.join("."); | ||
} | ||
|
||
public getCompletePath(): string { | ||
return `${this.getFullyQualifiedPath()}.${this.name}`; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: I would use getFullyQualifiedModulePath
and getFullyQualifiedPath
(since omitting the Module
component implies its for the reference), but feel free to ignore if you disagree. Not a big deal either way, and it's easy to change later (if ever).
class Car: | ||
car = CarsCar() | ||
automobile = AutomobilesCar() | ||
vehicle = VehiclesAutomobilesCar() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👏
export function createPythonClassName(input: string): string { | ||
// Handle empty input | ||
if (!input) { | ||
return "Class"; | ||
} | ||
|
||
// Clean up the input string | ||
let cleanedInput = input | ||
.replace(/[^a-zA-Z0-9\s_-]/g, " ") // Replace special characters with spaces | ||
.replace(/[-_\s]+/g, " ") // Replace hyphens, underscores and multiple spaces with single space | ||
.trim(); // Remove leading/trailing spaces | ||
|
||
// Handle numeric-only or empty string after cleanup | ||
if (!cleanedInput || /^\d+$/.test(cleanedInput)) { | ||
return "Class" + (cleanedInput || ""); | ||
} | ||
|
||
// Handle strings starting with numbers | ||
if (/^\d/.test(cleanedInput)) { | ||
cleanedInput = "Class" + cleanedInput; | ||
} | ||
|
||
// Split into words and handle special cases | ||
const words = cleanedInput | ||
.split(/(?=[A-Z])|[-_\s]+/) | ||
.filter((word) => word.length > 0) | ||
.map((word) => { | ||
// Fix any garbled text by splitting on number boundaries | ||
return word.split(/(?<=\d)(?=[a-zA-Z])|(?<=[a-zA-Z])(?=\d)/).filter((w) => w.length > 0); | ||
}) | ||
.flat(); | ||
|
||
// Process each word | ||
return words | ||
.map((word, index) => { | ||
// If it's the first word and starts with a number, prepend "Class" | ||
if (index === 0 && /^\d/.test(word)) { | ||
return "Class" + word.charAt(0).toUpperCase() + word.slice(1).toLowerCase(); | ||
} | ||
// Preserve words that are all uppercase and longer than one character | ||
if (word.length > 1 && word === word.toUpperCase() && !/^\d+$/.test(word)) { | ||
return word; | ||
} | ||
// Capitalize first letter, lowercase rest | ||
return word.charAt(0).toUpperCase() + word.slice(1).toLowerCase(); | ||
}) | ||
.join(""); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's a lot of custom regex in here - could we have used some combination of lodash functions for the same effect? This might be as simple as just calling upperFirst(camelCase(name))
to generate PascalCase like this (plus potentially some other minor cleanups).
Would you mind looking into this a bit to see if we can remove this altogether?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The current implementation is one we use ourselves and was the result of a lot of collabing with cursor until it passed all test cases. I'll make a timebound effort to see if it can be simplified with lodash functions, but default plan will be to roll with this as is, relying on the thorough tests in this PR to protect against future refactors if they happen.
if (modulePathIdx < 0 || !module) { | ||
nameOverride = `_${nameOverride}`; | ||
} else { | ||
nameOverride = `${createPythonClassName(module)}${nameOverride}`; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note that if we don't use a simple _
strategy, the generated names can potentially get long unless you maintain a collision map and assign type names more carefully.
For example, imagine the following module name (which we actually sometimes see in practice):
from .resources.v1.api.vehicles.automobiles import Car as ResourcesV1ApiVehiclesAutomobilesCar
class Car:
vehicle = ResourcesV1ApiVehiclesAutomobilesCar()
For what it's worth, we do this in the original Go generator but only prepend names as needed (until we find a unique name). Given that we traverse the statements in a deterministic order, the import alias assignment is always guaranteed to be unique. You can check out the method docs here, and this test case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps I'm misunderstanding, but I assert the current implementation does only prepend names as needed until it comes up with one that's unique. It goes from right to left, adding one module to the name at a time, until it finds that it's come up with a unique name.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
d'oh! I completely misread the implementation - that's on me. It might be worth adding a test case to confirm that the module paths are as succinct as possible, but only if it's straightforward for you. It looks like the cases in PythonFile.test.ts.snap
all use the full module path.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updating the test to make this clear makes sense. Will do!
Description
Sibling PR: #5365
The goal of this PR is to handle auto-aliasing of imports when there are naming conflicts.
Changes Made
Writer
PythonFile
'swrite
methodTesting