Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add --ignored Flag to Exclude Specific Files and Directories During Ingestion #1432

Merged
merged 10 commits into from
Feb 7, 2024

Conversation

smbrine
Copy link
Contributor

@smbrine smbrine commented Dec 20, 2023

This pull request introduces the --ignored flag to the ingestion script. The motivation for this change stems from encountering UnicodeDecodeError when the script processes Python-generated __pycache__ folders and MacOS-specific .DS_Store files. These files are not relevant to our data processing and can cause errors due to their format.

The --ignored flag allows users to specify a list of files or directories to exclude from the ingestion process, enhancing the script's flexibility and reliability, especially in Python and Mac environments.

Added the --ignored flag bc python projects with __pycache__ and Mac projects in general with .DS_Store caused UnicodeDecodeError
imartinez
imartinez previously approved these changes Dec 22, 2023
Copy link
Collaborator

@imartinez imartinez left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great contribution!

@smbrine
Copy link
Contributor Author

smbrine commented Dec 25, 2023

It's interesting to note that the test pipeline's failure was due to not running 'black .' for code formatting, rather than actual code errors. To prevent such issues, i might integrate 'black' as a step in your GitHub Actions. This way, code is automatically formatted before it's even merged, ensuring consistency and preventing similar failures. This would streamline our development process by catching formatting issues early and automatically, without relying on manual execution. Would you merge if I make these changes?

@smbrine smbrine requested a review from imartinez December 25, 2023 23:40
@imartinez
Copy link
Collaborator

It's interesting to note that the test pipeline's failure was due to not running 'black .' for code formatting, rather than actual code errors. To prevent such issues, i might integrate 'black' as a step in your GitHub Actions. This way, code is automatically formatted before it's even merged, ensuring consistency and preventing similar failures. This would streamline our development process by catching formatting issues early and automatically, without relying on manual execution. Would you merge if I make these changes?

I'd rather add black as a pre-commit action than a github action. I'm no big fan of Github changing code without the developer noticing. Thanks for your feedback!

@imartinez
Copy link
Collaborator

black still fails. Run make check and commit the format fixes please. Thanks!

@smbrine
Copy link
Contributor Author

smbrine commented Dec 26, 2023

Ran make check twice just in case. Also added some .gitignore entries and a make list command to list all available commands.

Copy link
Contributor

Stale pull request

@github-actions github-actions bot added the stale label Jan 11, 2024
@github-actions github-actions bot closed this Jan 23, 2024
@tylervick
Copy link

Hey @imartinez @smbrine, it looks like the feedback in this PR was addressed. Is there anything else preventing it from being merged?

I've been using this branch for some weeks now, I'd love this feature in main 🙂

@imartinez imartinez reopened this Feb 7, 2024
@imartinez
Copy link
Collaborator

Hey @imartinez @smbrine, it looks like the feedback in this PR was addressed. Is there anything else preventing it from being merged?

I've been using this branch for some weeks now, I'd love this feature in main 🙂

Thanks for the heads up! I've disabled the GitHub action that closed the PR. Merging it now!

@imartinez imartinez merged commit b178b51 into zylon-ai:main Feb 7, 2024
5 checks passed
simonbermudez pushed a commit to simonbermudez/saimon that referenced this pull request Feb 24, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants