-
-
Notifications
You must be signed in to change notification settings - Fork 3.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fixup info plugin's exclusion logic #6874
Conversation
material/plugins/info/patterns.py
Outdated
# | ||
r".*\.DS_Store", # macOS | ||
# | ||
r"^/\.[^/]+$", # .dotfiles in the root directory |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is risky. I can think of one case where this might not work: the awesome pages plugin with the docs_dir
set to the root. I think we should remove this line or better specify what to exclude.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
AFAIK MkDocs doesn't allow to set docs_dir: .
. Do you know if there are any plans to change it?
Also awesome-pages
looks only within the docs_dir
, so root of the project should be fine.
EDIT: I see it is in the plans for 1.6.0 ehh: https://github.com/mkdocs/mkdocs/milestone/18
ERROR - Config value 'docs_dir': The 'docs_dir' should not be the parent directory of the config file.
Use a child directory instead so that the 'docs_dir' is a sibling of the config file.
I also don't think there are any other cases where a .dotfile
is used with a plugin. Any files related to theme's plugins are in the docs_dir
too, like .meta.yml
or .authors.yml
. The social layouts are not a .dotfile
. So the user would have to deliberately break the "recommended way to to things".
I'll see how feasible the explicit patterns for the files would be 🤔
This repository has 9 .dotfiles, but some of them are related to TypeScript, so perhaps not everything is required, or there are even more "standard" files people use 😵 💫
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps it's time to add inclusion_patterns
as well 😆
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
They are at least actively considering it: mkdocs/mkdocs#3519
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I also don't think there are any other cases where a .dotfile is used with a plugin.
We should only exclude files if we are a 100% sure we don't need them for reproductions. Users might put configuration in dotfiles and read them via hooks. Moreover, the path to the authors_file
, albeit relative to the docs_dir
, can be changed to be located outside of the docs_dir
. Same for all other plugin settings that allow to change file locations. Thus, please remove the dotfile rule.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another example is somebody storing Snippets in a dotfile. Dotfiles are only a convention, and Snippets will not care whether your file is called something.md
or .awesome-stuffs
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I kept the exclusion only logic, replaced the match
function with search
to match subdirectories without having to resort to .*
at the beginning. I still have to properly test it though 😅, a multilayered project with nested .git repositories. This also kind of forced me to use exact names, because [^/]*lint[^/]*
would match a documentation file /To lint or not to lint.md
, and adding exclusions for .md
didn't seem right...
Please don't use automatic code formatters for now. We'll revisit this topic later this year. However the rouge |
I tried to produce a reproduction the other day and found that the info plugin added my whole It turns out that the info plugin does not just pack up a venv that is in the project root directory but also a venv that is elsewhere. (I have one installed globally that I usually use so I don't have to upgrade Insiders in a million individual venvs.) Looks like it was a design decision to pack up some elements of the venv but it does more than it should? Here is reproduction: I removed the packed up venv manually, so you need to run a build yourself to see the resulting large .zip file. |
I did claim that the user could have 2 venv directories in the project and proposed having another safeguard in the shape of generic names exclusion patterns like Your case is different, as you don't have 2 vens inside the same directory but one global, one local, and you used the global one. We currently only exclude the active venv if it's inside the reproduction directory. @alexvoss would a generic name exclusion exclude your local one in this instance? Similar to the exclusion of the site_dir based on the sitemap.xml we could search |
I'm a little confused, as what @alexvoss is saying is that the "venv is located elsewhere". Those directories should not be packed up. What I said was that we should not optimize for the case that two venvs exist inside one project, i.e., a root directory, that is being packed up. |
We should very hard not to go down the path where we have a list where we add name of virtual environment directories. This is completely up to the user and thus, arbitrary, so we would need to keep adding and adding thing to the ignore file + those might catch directories that should be packed up that are not venvs. I'm just saying that it's risky. |
Sorry for the sporadic interaction today. Sorry that I manage to produce the weirdest cases... For background: because I have a ton of projects over time, I created a single venv in my home directory that has Insiders in it and the plugins that are commonly used. That makes it easier to run the most up-to-date version most of the time. Because I test things with both the public edition and with Insiders, I actually have two such environments,
IMHO, excluding |
So this should not happen, the path should be excluded because it's activated...
Perhaps another finding in my lacking testing, I only tested the behaviour of
Yes, this was expected, only the activated venv was supposed to be excluded. EDIT: "sys.path": [
"C:\\MyFiles\\_git\\base-material-info-tests\\13-infoenv",
"C:\\MyFiles\\Python312\\python312.zip",
"C:\\MyFiles\\Python312\\DLLs",
"C:\\MyFiles\\Python312\\Lib",
"C:\\MyFiles\\Python312",
"C:\\MyFiles\\PythonVENV",
"C:\\MyFiles\\PythonVENV\\Lib\\site-packages"
] "sys.path": [
"/Users/avoss/venvs/mkd_insiders/bin",
"/Users/avoss/src/mkdocs-material/reproduce/infovenv",
"/Users/avoss/src/mkdocs-material/reproduce/infovenv/projects/childrensrights",
"/Users/avoss/src/mkdocs-material/reproduce/infovenv/projects/bookproposal",
"/opt/homebrew/Cellar/python@3.12/3.12.2_1/Frameworks/Python.framework/Versions/3.12/lib/python312.zip",
"/opt/homebrew/Cellar/python@3.12/3.12.2_1/Frameworks/Python.framework/Versions/3.12/lib/python3.12",
"/opt/homebrew/Cellar/python@3.12/3.12.2_1/Frameworks/Python.framework/Versions/3.12/lib/python3.12/lib-dynload",
"/Users/avoss/venvs/mkd_insiders/lib/python3.12/site-packages"
] |
@alexvoss I can also see the paths in the
but they are not in the zip provided in the previous comment, are the directories empty or did the script omit them, or did you delete them? Please run this command from the same directory you packaged up the zip file before: python -c "import site, sys, os; from pprint import pprint as pp; print(os.getcwd()); pp(site.getsitepackages()); pp(sys.path)" |
Right, these exist simply because I reused a terminal session that had a |
|
What I meant to say is that they may look to you to be related to the reproduction when, in fact, they are not. Interpret |
d5d4908
to
d28901e
Compare
Here is the current form of the
{
"system": "Windows-10-10.0.19045-SP0",
"architecture": [
"64bit",
"WindowsPE"
],
"python": "3.12.0",
"cwd": "C:\\MyFiles\\_git\\base-material",
"command": "__main__.py build -v",
"env:$PYTHONPATH": "",
"sys.path": [
"C:\\MyFiles\\_git\\base-material",
"C:\\MyFiles\\Python312\\python312.zip",
"C:\\MyFiles\\Python312\\DLLs",
"C:\\MyFiles\\Python312\\Lib",
"C:\\MyFiles\\Python312",
"C:\\MyFiles\\PythonVENV",
"C:\\MyFiles\\PythonVENV\\Lib\\site-packages",
"C:\\MyFiles\\_git\\base-material"
],
"excluded_entries": [
"/\\.git/ - /.git/",
"/\\.idea/ - /.idea/",
"/\\.vscode/ - /.vscode/",
"pyvenv.cfg - /custom-v-e-n-v/",
"^/material/ - /material/",
"/node_modules/ - /node_modules/",
"/site/ - /site/",
"^/src/ - /src/",
"/__pycache__/ - /__pycache__/",
"/\\.browserslistrc$ - /.browserslistrc",
"/\\.dockerignore$ - /.dockerignore",
"/\\.editorconfig$ - /.editorconfig",
"/\\.eslintignore$ - /.eslintignore",
"/\\.eslintrc$ - /.eslintrc",
"/\\.gitattributes$ - /.gitattributes",
"/\\.gitignore$ - /.gitignore",
"/\\.stylelintignore$ - /.stylelintignore",
"/\\.stylelintrc$ - /.stylelintrc",
"/[^/]+\\.zip$ - /9.5.13-fixed.zip",
"/[^/]+\\.zip$ - /performance_debug.zip"
]
} The |
I don't really know much about Node, so I guess the |
Looking at the exclusion patterns, I believe there are far too many files there. |
I'm sorry that iteration on this feels so sluggish, but I'm not sure if we have the same vision for the plugin. Maybe we should first write up what the plugin aims to achieve. As said, in my view, it is only meant for packing up mint reproductions. On all other case, it is okay to fail spectacularly if we can somehow signal the user why it does. Adding additional complexity to the plugin makes it more error prone, breaking our bug reporting process. We should keep this in mind at all times when adding more lines and logic to the plugin. The plugin must be as dumb as possible. That is at least my experience from troubleshooting issues for this project. The previous state of the info plugin, albeit not perfect, made interaction with bug reporters so much more sensical and simple, that I cannot stress enough that things should be as simple and dumb as possible to work. |
In the following just my 2p. Feel free to ignore. I realize I am coming to this rather late and may not be sufficiently aware of the discussion that has come before. I ignored it because the case seemed to be in the best hands, then probably only got to reads bits of it. I can see where you are coming from @squidfunk. The I do wonder how many people start with an empty project when producing a reproduction. My workflow would be to take my project, copy it and strip it of things that are not needed for a reproduction. The reason being that removing things bit by bit allows me to narrow down there the problem lies and is easier to do than building things up from scratch. On a good day I would take the result and build it up from an empty project, having narrowed down what needs to be included. On a bad day I would feel convinced that I have produced something "minimal" and would try to zip it up. It seems to me that there is evidence in the forum that people follow the same process. How else would we be getting so many reproductions that do not meet the "minimal" requirement? Could we make it easier for people to have good days and do the second step of creating the reproduction from the ground up? Alternatively, could we somehow detect if a reproduction has been built from the ground up instead of by stripping down an existing project? |
If we get a reproduction, most reproductions meet the minimal requirements. There are only a handful of cases, mostly from users that did not provide one in the first place, that are not minimal. If a reproduction is not minimal, our policy states that we are reserving the right to close the issue, until one is provided, because it can't be ruled out that the problem hides in customizations. This is a mandatory requirement, since we are providing support for free.
Please, let's not overcomplicate things. We have much more impactful areas to work on. I already feel that the latest efforts in improving the info plugin might have complicated things from a maintenance and usage perspective, but that is yet to be learned. Again, please, please, please, let's keep it as simple as possible. |
The idea about excluding the dotfiles is to make sure that the zipped file isn't cluttered with them, I didn't add them to save on storage.
I didn't have a specific vision of the plugin in mind, maybe that's my issue. Only the general idea of helping the user to pack up files. Therefore, I added the validation to make sure the files are included etc., and I considered adding a configurable I agree that the info plugin's aim shouldn't be to automate the "minimization" process of creating a reproduction, but at the same time I feel it doesn't have to be constrained by that logic to not "help" the users if they decide to pack up their Git repository with their development files in-place. I also tried adding a "currently processed file" indicator, so that people can see why the zipping process takes so much time, so they can see for example that mintty_jAP3kC2wLf.mp4
So after reading this, I consider removing the dotfiles would be for the best, as their existence in the zip file will prove the reproduction wasn't built from scratch. 👍 |
d28901e
to
8a93af6
Compare
@alexvoss If you have the time please try the info plugin from this PR to see if everything will be packed up and excluded as it should on your machine. I did run a test in my macOS repository and a venv should include the ⬇️ The latest force push only changed the commit message |
8a93af6
to
6539859
Compare
@alexvoss there's your indicator if a project was built from scratch 😉 Okay, so we exclude the virtual environment resolved via site packages (if we can make that work on all OSs), and then just pack up everything (or am I forgetting something again?), keeping the warnings about the latest version, hooks and customizations. My learning: I for my part need to better explain what the direction and vision of a plugin/extension etc. is from the vantage point of a long-term maintainer next time we work on a PR together. Not explaining this from the start made iteration more tedious, but I think we also learned a lot in the process how we can better work together |
The validation with sys.exit includes version, custom_dir, hooks, and invalid project structure where all files wouldn't be packed up. The current patterns exclude only files that could be auto-generated during the "minimal reproduction" creation process. mkdocs-material/material/plugins/info/patterns.py Lines 16 to 26 in 6539859
This change kind of invalidated the switch from fnmatch to regex 😅 as those patterns aren't that complex. Additionally dynamic patterns (aka exact POSIX paths) are generated for venv, Additionally we support 2 "edge-cases":
The edge-cases support makes the dynamic patterns redundant, so dynamic patterns could be removed to de-bloat the code. To aid the user if they mess up there is the current path indicator, so it's easily understood that the infinite black hole of To aid the debug process, CWD, $PYTHONPATH, and the excluded paths are also stored @alexvoss
simply can't be true, as the
The code only scans the |
You are right. I must have got confused about the content of the local venv, thinking it was still empty. Apologies for complicating things. |
OK, after further consideration:
Handling this edge-case, does "fix" the user's mistake of running And following the "should fail spectacularly" idea, and "should not automate the minimization" idea, I should not exclude inactive venvs, Instead, I need to fix the other issue to make sure not just the
|
6539859
to
743a713
Compare
Done, and I found a bug that stemmed from changing from This tests all edge-cases I'm aware of, and all current patterns:
Here is a test run on the repository that checks the values: EDIT: Found one last styling issue ⬇️ fixed in the force push. Unless there are any styling issues, I consider this ready to merge ✌️ |
743a713
to
2e117c2
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did not yet run it, but skimmed over the changes and it LGTM. Considering ready for merge and giving it a spin! ✌️
material/plugins/info/patterns.py
Outdated
# | ||
r".*\.DS_Store", # macOS | ||
# | ||
r"^/\.[^/]+$", # .dotfiles in the root directory |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
They are at least actively considering it: mkdocs/mkdocs#3519
material/plugins/info/patterns.py
Outdated
# | ||
r".*\.DS_Store", # macOS | ||
# | ||
r"^/\.[^/]+$", # .dotfiles in the root directory |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I also don't think there are any other cases where a .dotfile is used with a plugin.
We should only exclude files if we are a 100% sure we don't need them for reproductions. Users might put configuration in dotfiles and read them via hooks. Moreover, the path to the authors_file
, albeit relative to the docs_dir
, can be changed to be located outside of the docs_dir
. Same for all other plugin settings that allow to change file locations. Thus, please remove the dotfile rule.
material/plugins/info/patterns.py
Outdated
# | ||
r".*\.DS_Store", # macOS | ||
# | ||
r"^/\.[^/]+$", # .dotfiles in the root directory |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another example is somebody storing Snippets in a dotfile. Dotfiles are only a convention, and Snippets will not care whether your file is called something.md
or .awesome-stuffs
.
No worries! Since the .dotfiles catch-all was removed, I thought you read them although they were not resolved. |
Continuation of #6826 or rather a fixup of my mistake.
After the change from gitignore-like fnmatch to regex it will be better to store it directly in Python.
I decided to return the list from a function to assure a new copy every time. Not that it matters in the setup, since the plugin runs only once, but operating on a global list didn't feel right to me in this context.
I hope the styling is OK,
black
doesn't like it, and it liked it even less if I omitted the#
to add empty space between entries.I also noticed that when resolving patterns I stripped "/" for directories to only add them later 🤦, so I strip them now for every path. It's not really a bug fix, as in my testing no "path source" returned a path with a slash at the end, so the strip is just for sanity 😄
Changed slash removal from resolved patterns
Reverted build assets gathering dc808ca
Reverted info.gitignore file 79129d5