Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add events to QueryBuilder class #36572

Closed
wants to merge 1 commit into from

Conversation

summersab
Copy link
Contributor

@summersab summersab commented Feb 7, 2023

Summary

This provides events before and after the execution of QueryBuilder objects in order to modify QueryBuilder elements, the query results, or both. While it may seem like this is only useful for niche applications (or potentially dangerous), there are valid instances where intercepting DB queries and results are extremely useful (or even necessary).

Being able to intercept queries and/or results in order to restrict access, redact information, or even generate query-level logs would require the ability to access the QueryBuilder class before and/or after execution.

One application that I have implemented for my organization is adding column-level encryption on a per-user basis. Without being able to access the QueryBuilder class immediately before execution, this integration would be impossible. Once this PR is accepted and my integration is tested more thoroughly in production, I plan to make my code publicly available so administrators may encrypt information in the NC database such as contact and calendar data.

Checklist

@summersab summersab force-pushed the querybuilder_events branch 2 times, most recently from 331d4a8 to 5145ef6 Compare February 7, 2023 06:31
@nickvergessen
Copy link
Member

or even generate query-level logs

This is already available by setting the following config:

/**
* Log all queries into a file
*
* Warning: This heavily decreases the performance of the server and is only
* meant to debug/profile the query interaction manually.
* Also, it might log sensitive data into a plain text file.
*/
'query_log_file' => '',

Regarding the events:From my POV this looks like an interesting idea, but it has the option to fail very hard performance-wise.

Regarding column encryption: Also any user that has access to the database mostlikely also has access to the source code, so the encryption is not helping a lot. Which use/attack case are you trying to prevent?

@summersab
Copy link
Contributor Author

summersab commented Feb 7, 2023

Regarding the events:From my POV this looks like an interesting idea, but it has the option to fail very hard performance-wise.

Yes, this is true - if used incorrectly, these events can be a huge performance failure. However, based on my early usage of this code, there really isn't a noticible performance hit when done well. (Also, let's be honest. Plenty of integrations have the potential for bad performance unless you build them right. Some might be even worse than this).

Regarding column encryption: Also any user that has access to the database mostlikely also has access to the source code, so the encryption is not helping a lot. Which use/attack case are you trying to prevent?

The integration I wrote uses AES_ENCRYPT/AES_DECRYPT (MariaDB) or pgcrypto (PostgreSQL) in combination with the IProvideUserSecretBackend class to encrypt user data in the DB using a user-specific secret. Based on the tables and columns present in a given query, my code adds the required functions to the QueryBuilder object in order to encrypt and decrypt user data. It may seem questionable, but it works quite well, and I have not noticed a performance hit. There really is no other way to accomplish this sort of integration without modifying core files or using events like these.

That said, yes - this is a very unique implementation with very specific use-cases, and it could lead to negative effects. However, those use-cases DO exist. Also, there are other NC APIs that can impact performance even worse when used incorrectly.

Thank you for your feedback. Let me know if you have further questions, comments, or suggestions - happy to discuss!

@nickvergessen
Copy link
Member

in combination with the IProvideUserSecretBackend class to encrypt user data in the DB using a user-specific secret

I get that, but the server might need to write data without the user password being available (e.g. in a background job, or when others trigger an update of "your" data. So the server is able to encrypt/decrypt and therefor also the sysadmin and any one that hacks themselves access to the server.
That's the reason why I wonder which of the attack vectors you are trying to prevent.

@summersab
Copy link
Contributor Author

summersab commented Feb 7, 2023

Well, I decided to use IProvideUserSecretBackend in my implementation (which comes with a number of limitations like you mentioned like background jobs - I have taken this into account with my setup). It is also possible to use something like the NC server secret for encrypting database columns, and all features work as expected.

Database encryption is my particular use-case for this PR, but I am sure there are other reasons why these events might be useful.

@summersab
Copy link
Contributor Author

summersab commented Feb 7, 2023

@nickvergessen, I took some metrics to address the subject of performance. This is not the most scientific calculation, but it provides a rough estimate. I created a contact in NC and edited it multiple times. Then, I enabled my QueryBuilder events and encryption logic, created a new contact, and edited it the same way. This was done on a pretty small VM with 1vCPU and 8GB RAM. Here are the results:

# of Queries  Avg (Default)    Avg (Encrypted)  Difference         Query
6             0.0009931723277  0.00150179863    0.0005086263021    DELETE FROM `*PREFIX*cards_properties` WHERE (`cardid` = :dcValue1) AND (`addressbookid` = :dcValue2)
36            0.0005187855826  0.00106041299    0.0005416274071    INSERT INTO `*PREFIX*cards_properties` (`addressbookid`, `cardid`, `name`, `value`, `preferred`) VALUES(:dcValue1, :dcValue2, :name, :value, :preferred)
6             0.0003132025401  0.000682870547   0.0003696680069    SELECT `id` FROM `*PREFIX*cards` WHERE (`uri` = :dcValue1) AND (`addressbookid` = :dcValue2)
18            0.0004307428996  0.0009056540097  0.0004749111101    SELECT `id`, `uri`, `lastmodified`, `etag`, `calendarid`, `size`, `calendardata`, `componenttype`, `classification`, `deleted_at` FROM `*PREFIX*calendarobjects` WHERE (`calendarid` = :dcValue1) AND (`uri` = :dcValue2) AND (`calendartype` = :dcValue3)
30            0.0004283030828  0.002046084404   0.001617781321     SELECT `id`, `uri`, `lastmodified`, `etag`, `size`, `carddata`, `uid` FROM `*PREFIX*cards` WHERE (`addressbookid` = :dcValue1) AND (`uri` = :dcValue2) LIMIT 1
5             0.001100826263   0.002524375916   0.001423549652     UPDATE `*PREFIX*cards` SET `carddata` = :dcValue1, `lastmodified` = :dcValue2, `size` = :dcValue3, `etag` = :dcValue4, `uid` = :dcValue5 WHERE (`uri` = :dcValue6) AND (`addressbookid` = :dcValue7)

Again, this is just my use-case for adding Event dispatchers to the QueryBuilder class. There are probably other uses. With that said, it is pretty clear that the performance hit is negligible (when logic is implemented carefully, of course).

@summersab
Copy link
Contributor Author

What else is required for this PR? I'm not sure how to make CI happy...

@nickvergessen
Copy link
Member

The autoloaders are not up to date
Please run: bash build/autoloaderchecker.sh
And commit the result

But the biggest problem is technical. This is a very fragile thing to be added. It comes with great risk (performance and listing to the event while triggering a SQL query will cause an infinite loop) and therefore we are currently unsure if this is the way to go, especially since the target usecase is soo limited. It does not even prevent admins from doing "bad" things, as they have access to the keys as well. So the only thing it would help against is when your database storage is stolen but not the nextcloud one?

@summersab
Copy link
Contributor Author

summersab commented Feb 14, 2023

I completely understand the concerns regarding the events being very fragile and risky. However, I would like to respectfully push back a little.

Many existing events in NC are fragile, risky, can lead to infinite loops, and allow admins to do bad things. There are events that allow developers to interact with the authentication process, user/group creation, and even password updates. Are there really that many use cases for OCP\Security\Events\GenerateSecurePasswordEvent? As far as I can see, it is only used by the password_policy app. However, it IS available for all developers to use if they wish. That would allow them to do some pretty bad things.

My proposed events really aren't much different. Yes, the use cases are limited, but so are many events. That doesn't mean that there aren't valid use cases.

Yes, directly accessing database queries is generally avoided because it is risky, but there are TONS of internal elements of the NC framework already exposed to developers. Many of them are just as risky.

Yes, this could lead to infinite loops if improperly used, but even the most benign APIs can lead to infinite loops if not used correctly. This has happened to every developer at some point.

I think this is the nature of open source and security. Systems should be as malleable as possible (even if it is risky) so developers don't have to "hack" system internals. In turn, developers should make their code available for public review and scrutiny. With that in mind, which option is more responsible?

  1. I could fork NC, edit these classes, and implement my code. This would allow me to move forward, but I would have to edit core files, and it breaks the code signing in NC.
  2. I could continue to politely petition for this PR to officially provide access to the QueryBuilder class even though there are risks and very few use cases (ahem - GenerateSecurePasswordEvent). This would make NC more malleable so I can write a proper integration for NC without hacking into core classes.

Either method will accomplish the same goal, but I think the second option is much more responsible. That is why I am pursuing it. I am certainly open to feedback and suggestions, however.

@summersab
Copy link
Contributor Author

summersab commented Feb 17, 2023

Any thoughts on how to better structure this and still provide the same sort of functionality, @nickvergessen? It's a touchy integration, but it's better than overriding classes in my NC instances. Plus, I could provide my app that enables per-user column encryption, and I think a lot of people would benefit from that (there are open tickets asking for that feature).

@nickvergessen
Copy link
Member

You modified 3rdparty? That should not be needed.

On that note, what is needed is bash build/autoload....

@summersab
Copy link
Contributor Author

@nickvergessen, the postgres11-php8.0 CI test is failing because Postgres fails to respond. I don't think this is an issue with my code - the test just failed to properly run, I believe.

@nickvergessen
Copy link
Member

Yeah, seems unrelated

@summersab
Copy link
Contributor Author

summersab commented Feb 24, 2023

@nickvergessen, looks like I've finally gotten past CI - hurrah!

Sorry for being such a noob on this PR. It's my first PR to a project this complex, so I had a lot to learn.

@summersab

This comment was marked as resolved.

clean up php lint

update autoloaders

add phpdoc to public methods

fix docblock

rebase to master

Update lib/public/DB/QueryBuilder/Events/AfterQueryExecuted.php

Co-authored-by: Joas Schilling <213943+nickvergessen@users.noreply.github.com>
Signed-off-by: summersab <18727110+summersab@users.noreply.github.com>

Update lib/public/DB/QueryBuilder/Events/BeforeQueryExecuted.php

Co-authored-by: Joas Schilling <213943+nickvergessen@users.noreply.github.com>
Signed-off-by: summersab <18727110+summersab@users.noreply.github.com>

update class file names to match changes from nickvergessen

update class file names to match changes from nickvergessen; fix DCO

Signed-off-by: Andrew Summers <18727110+summersab@users.noreply.github.com>

update autoloaders
@summersab summersab marked this pull request as ready for review August 24, 2023 16:22
@joshtrichards joshtrichards added 3. to review Waiting for reviews and removed 2. developing Work in progress labels Sep 10, 2023
@skjnldsv skjnldsv mentioned this pull request Nov 1, 2023
This was referenced Nov 6, 2023
This was referenced Nov 14, 2023
@blizzz blizzz modified the milestones: Nextcloud 28, Nextcloud 29 Nov 23, 2023
This was referenced Mar 12, 2024
This was referenced Mar 20, 2024
@skjnldsv skjnldsv mentioned this pull request Mar 28, 2024
81 tasks
@skjnldsv skjnldsv modified the milestones: Nextcloud 29, Nextcloud 30 Mar 28, 2024
@skjnldsv skjnldsv added 2. developing Work in progress stale Ticket or PR with no recent activity and removed 3. to review Waiting for reviews labels Jul 27, 2024
This was referenced Jul 30, 2024
@skjnldsv skjnldsv closed this Aug 3, 2024
@skjnldsv skjnldsv removed this from the Nextcloud 30 milestone Aug 14, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
2. developing Work in progress enhancement pending documentation This pull request needs an associated documentation update stale Ticket or PR with no recent activity technical debt
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add support for database encryption
8 participants