Build PROJ without sqlite dependencies #1552
Comments
sqlite is quite a ubiquitous dependency (especially for anyone working with geospatial data, with the GeoPackage or Spatialite formats for example), and not a hard one to build, so I'm not sure there's really interest in offering that option. |
While I agree with that, in our case we use PROJ inside of a database (SAP HANA). Having sqlite as a dependency of HANA is the thing we want to avoid.
So do you see any possibility of removing the dependency (maybe via a CMake build option) and having restricted capabilities when running without DB access? |
Not really, no. The old PROJ didn't run without its database either, but its database was a bunch of CSV files. An interesting question is why is the SQLite dependency an issue for you? |
That's an easy one to hide. Just do a static build of sqlite & proj
It could be technically possible, but I'm not sure the project wants to maintain that. Another issue I see is that the test suite ("make check") is dependent in a lot of places on having the database available, so that would mean: either sacrifice "make check" in that degraded environment (but who knows if that works correctly then), or do a lot of changes in the test suite so that it can run in the no-sqlite3 case by skipping tests or allowing 2 possible outputs: a lot of initial work to do that, and a maintenance burden over the long term. |
As a database we do not want to have dependencies to an embedded database.
I see. But AFAIK you did not need to ship the CSV files with the library pre v6 right?
We would not have a lib-dependency to sqlite (although the code is still compiled into the static proj library), but still would need to deliver the .db file. |
You could possibly put its content inside a .text entry in some DLL, and use SQLite's in-memory capabilities to work with it.
SQLite could, with a bit of work, be replaced by another SQL92-capable database, but you need to be able to support JOIN, WHERE, ORDER BY, etc. A simple CSV file reader won't do. |
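To make the suggestion above concrete, here is a minimal sketch of loading registry content that ships inside the binary into an in-memory SQLite database and running the kind of JOIN / WHERE / ORDER BY query a plain CSV reader could not answer. It uses Python's standard `sqlite3` module purely for illustration. The table names, column names and rows are hypothetical and are not the real proj.db schema.

```python
import sqlite3

# Hypothetical stand-in for registry content embedded in the binary.
# Table and column names are illustrative only, not the proj.db schema.
EMBEDDED_SQL = """
CREATE TABLE datum (code INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE crs (code INTEGER PRIMARY KEY, name TEXT, datum_code INTEGER);
INSERT INTO datum VALUES (1, 'Datum A');
INSERT INTO datum VALUES (2, 'Datum B');
INSERT INTO crs VALUES (10, 'CRS X', 1);
INSERT INTO crs VALUES (11, 'CRS Y', 2);
INSERT INTO crs VALUES (12, 'CRS Z', 1);
"""


def open_embedded_registry(sql_text=EMBEDDED_SQL):
    """Build the registry in memory; no database file is read from disk."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(sql_text)
    return conn


def crs_names_for_datum(conn, datum_name):
    """Return CRS names attached to a datum, sorted by name."""
    rows = conn.execute(
        "SELECT crs.name FROM crs "
        "JOIN datum ON crs.datum_code = datum.code "
        "WHERE datum.name = ? ORDER BY crs.name",
        (datum_name,),
    )
    return [row[0] for row in rows]
```

Calling `crs_names_for_datum(open_embedded_registry(), "Datum A")` goes through the SQL engine without touching the file system, which is the point of the in-memory route. The engine itself is still a dependency in this sketch.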
The core issue for us is the entanglement of the actual projections (which we want) and the metadata management, which we don't want, because we're having our own.
The old libproj.so ran well enough without the CSV files that we could use it that way. When you have your own CRS table to look up transformation parameters, those files are never read, and so we didn't even install them. (We have to maintain that table on our side because we have to ensure transactional integrity and other shenanigans. Needless to say, we also want to allow users to add reference systems.) We have no issue with the CRS data migrating from CSV to a SQLite database, only with it having become a hard dependency even when the data isn't needed. There is also metadata compiled into the library: the data for the
Our issue with SQLite is neither its size nor availability nor delivery. Certainly we can handle all that. The issue is about a leap in complexity, and that we need a dependency for a feature that we don't use. The projection library that we want (and it's how we have used it in the past) is a library of plain computations that runs just on CPU and memory and doesn't know about the rest of the system. With SQLite that goes out of the window, as it does all kinds of low-level file I/O. Using a SQL database engine also adds a lot to runtime complexity, which isn't really needed here:
But we aren't concerned with how the metadata management is handled, it's rather that we handle it ourselves and thus don't want to carry another metadata manager around that we don't use. We don't care whether it's an optional or replaceable component, which leaves two possibilities:
The first route sounds more promising to me, although the work of separating the components would need to be done in both variants. Ideally metadata management would be handled by an entirely separate library which is an optional dependency for PROJ. That library might then as well use SQLite.
I don't see this as a big issue, on the contrary: separating tests for the projections from tests for metadata management should lead to a cleaner structure of the entire test suite. |
I agree SQLite should be optional and it would enable using PROJ in a lot of additional contexts where this dependency is not available/not acceptable. |
None of the active PROJ developers are interested in liberating PROJ from sqlite. After all, we decided to put it in there because we thought that was a good idea. We did so knowing that a small fraction of the community would not like it. The added benefit of being able to use sqlite to handle the complexity of the CRS registry was, and is, worth it to us. With that said, if the community can provide a solution that is acceptable we can't ignore that. Should anyone want to take on this task, an RFC with a detailed description of the proposed solution should first be presented to the PROJ PSC. Such an RFC has a much better chance of receiving a positive vote by the PSC if it comes with a preliminary proposed implementation. It will be quite a disruptive change to the current code and we need to be assured that it is sane and possible to maintain. |
@kbevers Sounds fair! I had a closer look to check the extent of the usage of sqlite in the code, and it seems to be very well architected and fairly contained within https://github.com/OSGeo/PROJ/blob/3ae09c3ba164728e200a3b930b72ed5fc24ef6ee/src/iso19111/factory.cpp. There's around 50 different queries but only 5 are complex. I quite like the direction @rouault was going (.text entry in some DLL), but I think the queries are simple enough that it might be possible to use in-memory data-structures rather than through SQL. Alternatively it might be possible to just make the |
I wouldn't want the code to be bound to the existing SQL requests. Newer ones might be added, or existing ones changed. At some point, the EPSG dataset structure will be changed (presumably end of this year), and so the proj.db schema will also have to adapt, and the queries will change for sure. |
Cool. I understand proj.db is fine for 99% of use-cases, but it'd be nice if we could make this work for the 1% of users who are either not interested in the grid files or need a self-contained build without external resources. I like your idea of using DatabaseContext. Basically I think an implementation plan could work along the lines of:
|
#1595 is a very rough draft which is only intended to show that the approach I proposed can potentially work. I'd be happy to spend more time to finish and clean it up. I'd like to move the SQLite DatabaseContext to its own file so it can be conditionally compiled, add a cmake build option, write some tests, and perhaps an example of how a user can implement their own DatabaseContext interface. Please let me know if there's interest in merging something along those lines if I were to spend more time on it. |
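As a rough illustration of what a user-supplied database-context interface could look like, the sketch below separates "look up a definition" from the code that consumes it. It is written in Python for brevity and is not the design of #1595 or of PROJ's DatabaseContext. The class names, the `resolve` helper, and the `MYORG` authority with its definition string are all hypothetical.

```python
from abc import ABC, abstractmethod


class CrsBackend(ABC):
    """Hypothetical lookup interface a host application could implement."""

    @abstractmethod
    def lookup_definition(self, authority, code):
        """Return a definition string, or None if the code is unknown."""


class DictBackend(CrsBackend):
    """Backend fed from the host's own CRS table (here a plain dict)."""

    def __init__(self, table):
        self._table = dict(table)

    def lookup_definition(self, authority, code):
        return self._table.get((authority.upper(), str(code)))


class NullBackend(CrsBackend):
    """Backend for builds that ship no registry at all."""

    def lookup_definition(self, authority, code):
        return None


def resolve(backend, spec):
    """Pass explicit definitions through; look up 'AUTH:CODE' specs."""
    if spec.startswith("+"):
        return spec  # already a full definition, no registry access needed
    authority, _, code = spec.partition(":")
    definition = backend.lookup_definition(authority, code)
    if definition is None:
        raise LookupError("no definition available for " + spec)
    return definition
```

With `NullBackend`, explicit definitions such as `"+proj=merc"` still resolve, while `AUTH:CODE` specs fail loudly. That is roughly the restricted mode discussed in this thread.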
It is not likely that you will be able to build PROJ without SQLite in the future. The reasons for this have been expressed quite clearly in the above comments.
If, after reading the above discussion, you are still interested in making PROJ build without SQLite, you would have to write up an RFC describing in detail how you would do it. The RFC would have to address all the various concerns the PROJ core contributors have expressed above and should preferably include a prototype that demonstrates a real world implementation of a PROJ that can build and be used without an SQLite dependency. |
I also just faced a problem with the embedded SQLite db: it's very slow to initialize lots of proj contexts and uses a lot of memory. My whole application used to consume about 10 MB; with Proj 6.2.1 and about 100 Proj instances it uses about 500 MB (before somebody asks: those 100 instances are all required). I'm about to profile a bit more. But it looks like Proj 6 initializes by reading the DB again for every instance. And the unit tests take several minutes longer (from 2 to 5 minutes). Update: |
Yes, you need to minimize the number of contexts
Make sure you use sqlite >= 3.11 |
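One common way to keep the number of contexts small is to create one per worker thread and reuse it for every object that thread creates. The sketch below shows only that pattern, in Python. `FakeContext` is a hypothetical stand-in for an expensive context object and does not call the PROJ API.

```python
import threading


class FakeContext:
    """Hypothetical stand-in for an expensive context (not the PROJ API)."""

    created = 0
    _lock = threading.Lock()

    def __init__(self):
        # Count constructions so the effect of reuse is observable.
        with FakeContext._lock:
            FakeContext.created += 1


_local = threading.local()


def context_for_current_thread():
    """Create the context on first use in a thread, then reuse it."""
    ctx = getattr(_local, "ctx", None)
    if ctx is None:
        ctx = _local.ctx = FakeContext()
    return ctx


def run_workers(n_threads, jobs_per_thread):
    """Run jobs on worker threads; return how many contexts were created."""
    before = FakeContext.created

    def work():
        for _ in range(jobs_per_thread):
            context_for_current_thread()  # each job asks for a context

    threads = [threading.Thread(target=work) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return FakeContext.created - before
```

In this sketch the number of contexts follows the number of threads rather than the number of jobs, so 100 jobs spread over 4 workers create 4 contexts.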
That's a quick response =). Sqlite Version is 3.28, System is a Macbook Pro 16 / Core i9. I'll investigate how to reduce the contexts. But I have a bad feeling that it would require quite a lot of work to introduce a relationship between actual worker threads in the main application and the library where the projection methods are in. NB: There are a few leaks in proj_normalize_for_visualization(). |
Reproducer ? |
Sorry, the leak was my fault (faulty copy constructor). The context creation will be challenging. I'll investigate further but even if that works somehow and reduces memory usage, it will still be consuming too much for my application. Optimally it would only initialize as long and take as much memory as required for the actual projection. |
To give some feedback, I did some further tests and considered my options for integrating proj 6 into my project. Unfortunately I had to remove proj 6 and use another library because of slow initialization, memory usage and architectural choices (e.g. (re-)using proj contexts with worker threads was cumbersome). I second the initial request to get rid of the SQLite DB, and additionally trim down memory usage and initialization time. I'd like to use proj again in the future and will try again if there are improvements to these issues. |
I would like to add my vote for getting rid of the SQLite dependency, at least optionally. I have been trying to move my product, CartoType, to proj 6 and have found it impossible for that reason. An earlier commenter said "The projection library that we want (and it's how we have used it in the past) is a library of plain computations that runs just on CPU and memory and doesn't know about the rest of the system." His comments apply to us precisely. |
It would require much more than a vote: significant effort to implement, and thus someone to take the lead or fund someone else to do it. The SQLite dependency enables PROJ to provide much higher value than just map projections. Today's and tomorrow's geodetic needs, and also cartographic ones (as the frontier between both tends to blur), require being able to deal with datum shifting, time-based transformations, etc., and relying on a database providing the necessary information. I also question the claimed impossibility of using PROJ6 as it is today; it looks to me like a lack of effort in trying. It should be possible to create static builds with a sqlite3 static lib, and if just using proj_create() with pipeline strings, no database access should be attempted (or could probably be made to work with modest effort even if it is currently tried) |
This is very reasonable and I fully agree with this statement. However I think we need to look at it from different angles:
I'm affected by issue 3. Issue 2 is potentially a non-issue when optimized. SQLite itself isn't a problem when integrated well. Idea: keep SQLite but change the behaviour to initialize the data lazily. It might still require some work, but it's a compromise to reduce file I/O for those who don't yet want the SQLite dependency, and it would allow running Proj 6 in memory- and CPU-constrained environments (I have a define in my app to seamlessly switch between Proj 6 and another lib; I really can't use Proj 6 on several targeted devices due to memory requirements and init times as outlined above). A further optimization for issue 2 might be to embed the db and have it load from memory, entirely avoiding any further file I/O. SQLite makes that rather easy. |
That's mostly the current behaviour. But createOperations() needs to explore a lot of objects in the database (see https://proj.org/operations/operations_computation.html for how it works), hence a significant part of it is probably accessed. I'm not sure having the DB embedded in the binary will change I/O related performance, and certainly it will not have any impact on memory consumption. |
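The lazy-initialization idea can be sketched as follows: the registry is only opened the first time a lookup actually needs it, so callers that pass explicit pipeline strings never pay for it. This is an illustration in Python with the standard `sqlite3` module, not how PROJ is implemented. `LazyRegistry`, `create`, the `crs` table and its single row are hypothetical.

```python
import sqlite3

# Hypothetical registry content; not the real proj.db schema.
SAMPLE_SQL = """
CREATE TABLE crs (code TEXT PRIMARY KEY, definition TEXT);
INSERT INTO crs VALUES ('X1', '+proj=utm +zone=32 +ellps=GRS80');
"""


class LazyRegistry:
    """Opens the in-memory database on first lookup only (single-threaded)."""

    def __init__(self, sql_text=SAMPLE_SQL):
        self._sql_text = sql_text
        self._conn = None
        self.open_count = 0  # exposed so the laziness can be observed

    def _connection(self):
        if self._conn is None:
            self._conn = sqlite3.connect(":memory:")
            self._conn.executescript(self._sql_text)
            self.open_count += 1
        return self._conn

    def definition(self, code):
        row = self._connection().execute(
            "SELECT definition FROM crs WHERE code = ?", (code,)
        ).fetchone()
        return row[0] if row else None


def create(registry, spec):
    """Explicit '+proj=' strings bypass the registry entirely."""
    if spec.startswith("+proj="):
        return spec
    return registry.definition(spec)
```

After `create(reg, "+proj=merc")` the counter `reg.open_count` is still 0. It becomes 1 on the first code lookup and stays there for later lookups.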
... but about 5MB memory per instance and the initialization times I see is a lot. |
Re-reading this thread, it seems that for most people who don't want the SQLite dependency, just using plain cartographic projection and providing already constructed pipelines to proj_create() would be sufficient. It doesn't appear that abstracting the database interface to be able to plug another DB would be that needed. What would be lost by building without the SQLite dependency:
Besides code changes in the library, the main impacts I see are more on the test suite where we'd have to separate tests that directly or indirectly depend on the database. If some of the persons participating in this thread are interested in funding such effort, they can email me (even.rouault at spatialys.com) |
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. |
That would be what @yonarw and I want. The limitations that you listed are what I expected.
We obviously need to discuss with our management about this, and will reopen if we have a path forward. |
Good discussion. Only thing I would add is the niche of using GDAL/PROJ4 on Android. SQLite dependency is not satisfied in cross-compilation to Android. But I could probably include the source from sqlite.org. Just not very clean imo. |
My product, CartoType, uses PROJ4 on Android, iOS, Windows, Linux, Mac OS, Raspberry PI, Jetson Nano and many other platforms. It's an absolute must that PROJ4 doesn't have the SQLite dependency. Luckily I can carry on using the previous version without any difficulty, but at some point I might want to revisit this problem. If I have the leisure I might attempt my own fork. It was a strategic error to introduce the SQLite dependency because of its great impact on runtime memory use, among other things. @rouault said 'The SQLite dependency enables PROJ to provide much higher value than just map projections.'; but actually a library that just provides map projections is exactly what many of us need, and it ought to be possible to separate that out from the other functionality, which actually reduces the net value to us. |
I'm going to be a bit provocative because I'm a bit annoyed by the constant whining in this thread: why should we care about what value PROJ brings to you, if you don't bring value to PROJ? The changes of PROJ 6 were done on purpose (this was a strategic decision) because people badly needed a first class geodetic transformation library, which it is now, and gathered the needed funding to make that happen. As outlined before, I guess we could still be able to produce a cut down version that has projection support only, but that won't happen by magic. Otherwise nostalgic people can just indeed fork off and live their lives without being tied to our craziness :-) |
Being provocative is never productive or useful. I have avoided personal remarks or slurs like 'whining' and 'fork off' and I suggest that we all do that here. My criticism, and that of others, is well-meant. We believe that we are bringing value to the project by our criticisms, and, as I and others have mentioned, we might also help with making desired changes to make PROJ more useful to everybody. I have benefited hugely over the last 18 years from criticism of my product, CartoType. While a complaint may initially cause me to feel resentful, I always try to think it over and work out how I might learn from it and improve my work. |
At this point you are not. Actual contributions, funding or code, would bring value to the project. As has been expressed many times, none of the current active developers has any use for a SQLite-less PROJ (quite the contrary) so the changes you seek are not going to happen by asking for it over and over. Step up and provide some actual value and we can continue this discussion. Until then you can use PROJ 5 and earlier which comes without the SQLite dependency free of charge. |
@kbevers We obviously disagree about what constitutes value, and you already know that I believe that constructive criticism has a value of its own. However, I'll leave it there until I have time to take another look at the code. I did in fact start to attempt a clean removal of the SQLite dependency, but it was too hard, and as you say, I can continue to use the older version. |
That much is clear. It is obviously of value to you if the PROJ project takes on the burden of developing a SQLite-free version of the software. You can update your commercial software package with 3 years of PROJ bug fixes and new projections and bring value to your customers. And I suspect also increase your revenue. If there's a kickback from that revenue to the project we would likely be more welcoming to the suggested change, but so far there has been no indication of that from any of the inquiries we've had regarding this topic. So as far as I can tell your "constructive criticism" only leaves us with an increased maintenance burden of a feature the project doesn't really want. From the viewpoint of a maintainer of the project that is not value. I am going to lock this discussion. I believe everything that needs to be said on this topic has been touched on several times in this discussion and reiterating those arguments doesn't get us anywhere. Should someone in the future actually want to make this change happen, here are a few options that will get you off to a good start:
|
We are currently using PROJ v5.1.0 and want to continue using the library. We are especially interested in the new WKT parsing capabilities of version 6+. Looking at the code a bit it seems sqlite is a hard dependency when we want to be able to parse WKT.
Our question is: Would there be a possible/reasonable way of providing a variant of PROJ without sqlite that still is able to parse WKT?
We'd like to work on and contribute a possible solution for this but first wanted to ask what the community thinks about it.