Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

6819 - Include PostgreSQL JDBC driver in WAR #7048

Closed
wants to merge 25 commits into from

Conversation

poikilotherm
Copy link
Contributor

@poikilotherm poikilotherm commented Jul 1, 2020

What this PR does / why we need it:

With Payara 5, we don't have to deal with pushing the Postgres driver JAR into the right places. Instead it's much easier to have a single place to run to: a version used in pom.xml, shipped with the WAR, used consistently on the app server deployment.

Which issue(s) this PR closes:

Closes #6819

Special notes for your reviewer:
Nada. Maybe the docs changes about the timer issues. Would be very strange to see those again.

Suggestions on how to test this:
Follow release notes docs and guide docs. Deploy to Payara. Look for the EJB timers being created in the DB. Test as usual. This should work as before. Looking at the Flyway Schema History should also reveal that the migrations have been applied as usual.

Does this PR introduce a user interface change? If mockups are available, please link/include them here:
Nope.

Is there a release notes update needed for this change?:
🔋 included.

Additional documentation:
Nada.

@poikilotherm poikilotherm changed the title 6819 include postgres 6819 - Include postgres in WAR Jul 1, 2020
@poikilotherm poikilotherm changed the title 6819 - Include postgres in WAR 6819 - Include PostgreSQL JDBC driver in WAR Jul 1, 2020
@coveralls
Copy link

coveralls commented Jul 1, 2020

Coverage Status

Coverage decreased (-0.0005%) to 19.485% when pulling 93d3b6a on poikilotherm:6819-include-postgres into 5deeec4 on IQSS:develop.

@pdurbin pdurbin self-assigned this Jul 8, 2020
Copy link
Member

@pdurbin pdurbin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The idea behind this pull request is that we don't need to instruct people to put the postgres driver in the "lib" directory of Payara. (Typically this is done with the installer.) Rather, we can bundle it with the Dataverse war file. It's small, less than 900K.

I did the following quick test:

  • Remove postgres driver from "lib" in Payara.
  • Try to run "develop". Fails, as expected.
  • Switch to this branch. Works. The postgres driver doesn't need to be in "lib". Great.

Changes are made to the installer in this pull request and I didn't test them. They should be tested. As of this writing the Perl installer hasn't been updated. I'm not sure if we're maintaining both.

The Makefile was also updated to no longer include the driver.

tests/jenkins/groupvars.yml Show resolved Hide resolved
scripts/installer/install.py Show resolved Hide resolved
@pdurbin pdurbin removed their assignment Jul 8, 2020
@pdurbin
Copy link
Member

pdurbin commented Jul 8, 2020

After chatting with @landreev I pulled this out of QA so he has time to comment.

@poikilotherm one thing that you could perhaps comment on... What is the short term or long term vision for this change? I saw stuff about "modern" and Testcontainers and configuration via annotations. Is that where we're going? Thanks. 😄

@landreev
Copy link
Contributor

landreev commented Jul 9, 2020

This may not be an issue necessarily, I'm just curious to understand how this works. If the driver is part of the application war file, this means it is only available while the application is deployed. But the database is defined on the domain level (<jdbc-connection-pool datasource-classname="org.postgresql.ds.PGPoolingDataSource" name="dvnDbPoo l" res-type="javax.sql.DataSource">). Does that really mean that you can define a connection pool, and specify a datasource driver that's not actually available to the application server?

On our production servers we define this data source with the extra "validation-required" attributes: validation-table-name="setting" is-connection-validation-required="true" - so we are telling the application server to keep pinging the database to confirm that it's alive. What's going to happen when the application is undeployed?

What about the ejb timer - in our setup, it uses the main PG database. But the timer is an application of its own, and it appears to still be there (deployed under .../domain1/applications/ejb-timer-service-app) even after the main application is undeployed? Granted, we do not need the timer when the application is not running; and it may be automatically deactivated when there are no applications deployed (?).

I just want to know for sure that Payara is not going to be confused and crash after asadmin undeploy dataverse.

@poikilotherm
Copy link
Contributor Author

I did not test what happens when you undeploy, as this no scenario for me.

As long as the app will be deployed, the domain level and timer database connections seem to work just fine.

@landreev
Copy link
Contributor

Does that really mean that you can define a connection pool, and specify a datasource driver that's not actually available to the application server?

The answer appears to be "yes" - I was able to do things like asadmin create-jdbc-connection-pool ... --datasourceclassname org.postgresql.ds.PGPoolingDataSource without the PostgreSQL driver being present.

As long as the app will be deployed, the domain level and timer database connections seem to work just fine.

Sure, but undeploying and redeploying is part of normal operations. So we need know for sure it's not going to result in Payara crashing every time, especially with the database monitoring in the configuration. Should also be a straightforward thing to test; but it is an extra thing to test. I'm wondering if it would be ok to delay testing and merging this PR until after dataverse v5 is released? I don't have a strong opinion on this, but will ask others on the team.

@poikilotherm
Copy link
Contributor Author

poikilotherm commented Jul 13, 2020

Hi Leonid @landreev, my mid term goal for all of this is moving the database definitions from the installer/server level to the application.
See https://javaee.github.io/javaee-spec/javadocs/javax/annotation/sql/DataSourceDefinition.html and others.

Not just the DB connection, but ideally all resource definitions currently done by the installer. It will be configurable, so no one has to be afraid of lost control.
See https://docs.payara.fish/community/docs/5.201/documentation/payara-server/server-configuration/var-substitution/types-of-variables.html

Once this is done, there will be no problem with a dangling connection, as it is available as long as the app is deployed. 😉

I felt like this would be beyond scope for this issue/PR, but if you think this should be done in the same go, I'd be happy to add more. We can keep it to the DB connection for now.

IMHO this would be a major step forward in terms of modern deployment strategies.

This commit is a first test to apply database configuration and
credentials to the application server without using a config
on domain level but on application level. It is not very configurable yet.

The JNDI names have been changed to be conform with Java EE 7.
See https://github.com/javaee-samples/javaee7-samples/tree/master/jpa/datasourcedefinition-applicationxml-pu
and others for working examples. (Staying with the old name was not successfull.)

We had to use the "global" scope, so the persistent EJB timers are successfully created
and stored in the database. Using the "app" scope crashes the deployment.
@poikilotherm poikilotherm marked this pull request as draft July 13, 2020 22:45
@poikilotherm
Copy link
Contributor Author

poikilotherm commented Jul 13, 2020

@landreev and @pdurbin I changed this to a draft PR again to avoid accidents.

Please note that I added two commits (e90d9e7 & 0aaac25) that add the necessary bits for defining the database within the app.
It's not very configurable yet and lacks some specials like the validation @landreev mentioned, but I wanted to give you a quick demo. This works flawless, including EJB timers, when testing on Payara on K8s (mainly to use the static definitions of server etc). Please see also the commit messages for even more details.

Some notes about this. Please find some literature about how to do this and advantages/disadvantages here:

And please note that I switched from PGPoolingDataSource because it has been deprecated since 2017. See pgjdbc/pgjdbc#729. A pool is auto-created by the annotated definition. FWIW: bigger pooling like those mentioned in the driver upstream issue and deprecation warning don't seem necessary. They are also mentioned in the Payara blog article mentioned above.

@poikilotherm
Copy link
Contributor Author

Did you upgrade Payara?

payara/Payara#4864

@kcondon
Copy link
Contributor

kcondon commented Nov 6, 2020

@poikilotherm I think I have found it but not sure yet why:
[2020-11-06T17:26:52.236+0000] [Payara 5.2020] [SEVERE] [] [javax.enterprise.resource.resourceadapter.com.sun.gjc.util] [tid: _ThreadID=89 _ThreadName=admin-thread-pool::admin-listener(1)] [timeMillis: 1604683612236] [levelValue: 1000] [[
RAR5103 : Error setting java bean value : [dvndbsimpledvndbsimple]]]

the db name is dvndbsimple but it is listed twice. probably something I did, will need to look over where it is configured.

  <servers>
    <server config-ref="server-config" name="server">
      <system-property name="dataverse.db.user" value="dvnapp"></system-property>
      <system-property name="dataverse.db.name" value="dvndbsimple"></system-property>
      <resource-ref ref="jdbc/__TimerPool"></resource-ref>
      <resource-ref ref="jdbc/__default"></resource-ref>
      <resource-ref ref="concurrent/__defaultManagedExecutorService"></resource-ref>
      <resource-ref ref="concurrent/__defaultManagedScheduledExecutorService"></resource-ref>
      <resource-ref ref="jms/IngestQueueConnectionFactory"></resource-ref>
      <resource-ref ref="jms/DataverseIngest"></resource-ref>
      <resource-ref ref="mail/notifyMailSession"></resource-ref>
    </server>

@poikilotherm
Copy link
Contributor Author

@kcondon not sure if your last comment was a question, a note to self or a hint I should chime in...?

@kcondon
Copy link
Contributor

kcondon commented Nov 6, 2020

@poikilotherm Apologies, was distracted but will dive back in.
i have stack traces in the server log when it fails. There are a couple. I provided a small part of the first one above. It was all of the above: an observation, something I needed to look at, and a question to you about whether you recognize how it might have happened. I looked and my entries appeared correct. I will look once more for duplicates but aside from command line cut and paste, not sure how it would have happened? Don't want to keep you up any later so can follow up on Monday if needed.

@kcondon
Copy link
Contributor

kcondon commented Nov 7, 2020

@poikilotherm
After some missteps on my part (apologies for the noise!) I can confirm the installer works and I am also able to deploy to an upgraded system if I use the domain.xml from the clean install.
I don't see anything unusual about the domain.xml so I'll try once more to use the manual steps.

OK, I've found what didn't work with the manual steps. It turns out I needed to specify host and port even though it was running on localhost, 5432 so the following instructions in release notes appear to be incorrect:
If you are using a PostgreSQL server on localhost:5432, you can omit dataverse.db.host and dataverse.db.port.

Also: I am not able to get the harvest timer to work and restarting payara does not reload the application, though it is listed in memory as deployed.

Issues:
-update release notes, opinions may vary here, but note that upgrading to .5 payara is required for this release and that it should be done before adjusting db conn pool and deploying war file and they should confirm the existing deployed app works with the new payara config before proceeding.
-update release notes, in the delete db pool command, need to change dvnDBpool to be dvnDbPool
-update release notes, remove part about omitting host and port if localhost:5432 since on my box at least one cannot deploy without them, gets invalid siteUrl error.
-cannot restart app server and have application load up although it is still shown as deployed by asadmin.
-was not able to get harvest client timer to fire off but this may have been complicated by the restart issue above. can try another approach but so far no luck.
-have noticed deploying and undeploying seem a bit slower. not a major concern but @pdurbin had opened an issue about making it faster.

Friday was not my best testing day, given all the blind alleys I went down, so perhaps some of the above will resolve on Monday upon review. I will go through them once more to confirm but it might help to have someone else on the team validate the update from existing payara, pre .5.

@kcondon
Copy link
Contributor

kcondon commented Nov 9, 2020

@poikilotherm Passing back for comments or resolution on issues.

@kcondon kcondon removed their assignment Nov 9, 2020
@poikilotherm
Copy link
Contributor Author

poikilotherm commented Nov 10, 2020

Issues found by QA (@kcondon):

  • update release notes, opinions may vary here, but note that upgrading to .5 payara is required for this release and that it should be done before adjusting db conn pool and deploying war file and they should confirm the existing deployed app works with the new payara config before proceeding. (joined the files, emphasized and updated to be more like other release notes on upgrading.)
  • update release notes, in the delete db pool command, need to change dvnDBpool to be dvnDbPool
  • update release notes, remove part about omitting host and port if localhost:5432 since on my box at least one cannot deploy without them, gets invalid siteUrl error. (fixed by adding missing properties file for MP Config)
  • cannot restart app server and have application load up although it is still shown as deployed by asadmin. (Tried really hard to reproduce. No luck. Used local installation.)
  • was not able to get harvest client timer to fire off but this may have been complicated by the restart issue above. can try another approach but so far no luck.
  • have noticed deploying and undeploying seem a bit slower. not a major concern but @pdurbin had opened an issue about making it faster. (Both ways of deployment had reproducable times of 32.5 secs +-1.0 sec. Ran both 5 times each.)

@kcondon can we try this again...?

EDIT:
Today @kcondon I saw some errors with the timer service database connection. There seem to be some race conditions during deployment and when shutting down the domain.
EDIT2:
This is due to "undeploy". When the app is gone, the EJB timers go crazy because the DB is not available anymore. It looks like the EJB timers are picky when you redeploy after that and don't pick up the new connection. This has never happened with complete stop and start of domain. As the release notes state "undeploy first", this is likely to cause trouble.

Maybe we should solve #5345 first, before attacking this. Feel free to move back to Code Review / Community Dev.

@poikilotherm
Copy link
Contributor Author

@donsizemore as you mentioned it today on IRC: do you want us to update to PostgreSQL Client 42.2.18?

From what I can see from the Changelog between .12 and .18, there shouldn't be anything biting back when pushed.

@pdurbin
Copy link
Member

pdurbin commented Nov 13, 2020

I'm keeping this pull request in code review for now but I expect some movement after @poikilotherm @scolapasta @landreev @qqmyers and I discuss it after the community call on Tuesday.

My take (after discussing with @poikilotherm yesterday) is that this pull request and the 5345-nonpersistent-ejb branch linked from #5345 could be split up into four smaller pull requests that could move across the board in this order:

Even if we don't split up this pull request, I don't believe we want to merge it as-is because there are problems with undeploying (having to do with timers using the database).

@poikilotherm
Copy link
Contributor Author

poikilotherm commented Nov 17, 2020

Closing in favor to #7416, #7419, #7420 and #7422

@pdurbin pdurbin removed their assignment Nov 17, 2020
@poikilotherm poikilotherm deleted the 6819-include-postgres branch February 1, 2021 10:45
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Include Postgres JDBC driver in pom.xml
6 participants