apix does not filter broadcasted messages #138

Open
christopher-johnson opened this issue Aug 20, 2017 · 4 comments

@christopher-johnson

christopher-johnson commented Aug 20, 2017

In my configuration, I am using the broadcaster with several message queues. APIX is generating a lot of extra requests if I broadcast from the fcrepo queue to its route. See attached log.
fcrepo_access.txt

If I remove broker:queue:apix from the fcrepo queue broadcast, the services cannot self-register on initialization and I get this:
fcrepo-api-x-routing - 0.3.0.SNAPSHOT | No binding for http://apix/services//apix:load

@christopher-johnson christopher-johnson changed the title apix-listener does not filter broadcasted messages apix does not filter broadcasted messages Aug 21, 2017
@birkland
Contributor

There are almost certainly redundant requests that can be eliminated with minimal caching. This should mostly be low-hanging fruit.
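As a rough illustration of what "minimal caching" could look like, here is a sketch of a fetcher that remembers response bodies by URI and only goes back to the repository when an entry is missing or has been invalidated. It is not API-X code: the class name `CachingFetcherSketch`, the `backend` function, and the `invalidate` hook are all hypothetical names chosen for the example.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical sketch, not part of API-X: caches fetched documents by URI.
final class CachingFetcherSketch {
    private final Map<String, String> cache = new HashMap<>();
    private final Function<String, String> backend; // stands in for an HTTP GET
    int backendRequests = 0;                        // counts real fetches

    CachingFetcherSketch(Function<String, String> backend) {
        this.backend = backend;
    }

    // Return the cached body if present; otherwise fetch once and remember it.
    String fetch(String uri) {
        String cached = cache.get(uri);
        if (cached != null) {
            return cached;
        }
        backendRequests++;
        String body = backend.apply(uri);
        cache.put(uri, body);
        return body;
    }

    // Drop a single entry, e.g. when an update message arrives for that URI.
    void invalidate(String uri) {
        cache.remove(uri);
    }

    public static void main(String[] args) {
        CachingFetcherSketch fetcher = new CachingFetcherSketch(uri -> "body of " + uri);
        fetcher.fetch("http://example.org/rest/a");
        fetcher.fetch("http://example.org/rest/a"); // served from the cache
        System.out.println("backend requests: " + fetcher.backendRequests);
    }
}
```

With something along these lines, only the resource named in an update message would need to be invalidated and re-fetched; everything else would be answered from memory.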

@christopher-johnson
Author

Can you explain a bit more why this happens? I have to admit that I am a bit confused by the source. The behaviour I see is that whenever the triplestore indexer receives an actionable broadcast message from the fcrepo queue and sends a request through the fcrepo camel endpoint, the apix instance duplicates it, though not to the same resource, only to service endpoints. I have tried to isolate this to some part of Apix such as the listener, but that does not seem to be the source. Perhaps this is something directly in fcrepo-camel? I agree that the requests have minimal performance impact on fcrepo, but with so many concurrent requests, does blocking not occur? What I want to do is dramatically improve the repo response times for indexer requests, which I think are rather slow. I am interested in HTTP/2 multiplexing or async as a solution.

@birkland
Contributor

Ah, the service indexer would move out of the core API-X code base and into something like fcrepo-camel-toolbox; it's really just a value-add component for those who wish to index service documents.

It's a little bit tricky. A given service document can change whenever (a) its corresponding resource changes, or (b) an extension changes, or is added or removed.

When (a) happens, the camel route that indexes service docs will fetch the service doc for that resource (by following the link header) and ship it off to be indexed. Ideally, with proper caching, the service doc generator would only need to fetch the content of the updated resource. As it stands now, I believe it makes several requests. That can be cut down considerably.

When (b) happens, it essentially triggers a reindex of all service docs. There's no easy way a priori to know that an individual resource's service doc is affected, so it reindexes all of them. It may be helpful to be able to turn this feature on and off (for example if there are several planned updates to extensions, or if it is known that an update won't affect binding, etc). Additional logic can also be added in order to automatically filter out updates to existing service definitions that won't affect binding.
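The two cases above can be sketched as a small decision function. This is only an illustration under assumed names: `ServiceDocReindexSketch`, the `Change` enum, and the `reindexOnExtensionChange` switch are hypothetical and do not exist in API-X; the switch models the idea of turning the reindex-everything behaviour on and off.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch, not part of API-X: decides which service docs to reindex.
final class ServiceDocReindexSketch {
    enum Change { RESOURCE, EXTENSION }

    static List<String> toReindex(Change change,
                                  String changedUri,
                                  List<String> allResources,
                                  boolean reindexOnExtensionChange) {
        List<String> result = new ArrayList<>();
        if (change == Change.RESOURCE) {
            // Case (a): only the service doc of the changed resource is affected.
            result.add(changedUri);
        } else if (reindexOnExtensionChange) {
            // Case (b): no way to know a priori which docs are affected, so take all.
            result.addAll(allResources);
        }
        // Case (b) with the switch off: nothing is reindexed.
        return result;
    }

    public static void main(String[] args) {
        List<String> all = Arrays.asList("/rest/a", "/rest/b", "/rest/c");
        System.out.println(toReindex(Change.RESOURCE, "/rest/b", all, true));
        System.out.println(toReindex(Change.EXTENSION, "/rest/apix/extensions/x", all, true));
        System.out.println(toReindex(Change.EXTENSION, "/rest/apix/extensions/x", all, false));
    }
}
```

A finer-grained version could replace the boolean with a check for whether the extension update actually changes binding, but that check is not sketched here.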

@christopher-johnson
Author

christopher-johnson commented Aug 25, 2017

I took another look at this from the karaf point of view and have confirmed my initial suspicion that the source is the listener-update-apix route. If I suspend this route in karaf, the extra requests stop. I attach a console log and a debug message for reference.
apix-camel:route-list.txt

I am pretty sure that this is a bug here

I have made it work filtering by URI only:

 .filter(or(header(FCREPO_URI).contains(TYPE_APIX_SERVICE),
            header(FCREPO_URI).contains(TYPE_APIX_EXTENSION)))

I defined these types as Strings (i.e. "services" and "extensions"). Filtering by CamelFcrepoResourceType like this:

                .simple("${in.header.CamelFcrepoResourceType} in '"
                        + "http://fedora.info/definitions/v4/service#ServiceRegistry,"
                        + "http://fedora.info/definitions/v4/api-extension#Extension,"
                        + "http://fedora.info/definitions/v4/service#ServiceInstance,"
                        + "http://fedora.info/definitions/v4/service#Service'")

does not work because a Service Definition does not have a unique CamelFcrepoResourceType ... as seen in the following header dump:

apix                        | 2017-08-25 10:07:51,780 | INFO  | msConsumer[apix] | UpdateListener                   | 63 - fcrepo-api-x-listener - 0.3.0.SNAPSHOT | Processing service update for {breadcrumbId=ID:68ff47879c88-46441-1503655660655-1:1:1:1:13, CamelFcrepoAgent=[bypassAdmin, Apache-HttpClient/4.5.2 (Java/1.8.0_111)], CamelFcrepoDateTime=2017-08-25T10:07:50.702Z, CamelFcrepoEventId=urn:uuid:ea9afd6a-f06c-4423-a983-4e4a38ca3fef, 

CamelFcrepoEventType=[http://fedora.info/definitions/v4/event#ResourceCreation, http://fedora.info/definitions/v4/event#ResourceModification, http://www.w3.org/ns/prov#Activity], 

CamelFcrepoResourceType=[http://www.w3.org/ns/ldp#Container, http://fedora.info/definitions/v4/repository#Resource, http://fedora.info/definitions/v4/repository#Container, http://www.w3.org/ns/ldp#RDFSource, http://www.w3.org/ns/prov#Entity], 

CamelFcrepoUri=http://fcrepo:8080/fcrepo/rest/apix/services/definitions-v4-api-extension-LoaderService, 

CamelJmsDeliveryMode=2, JMSCorrelationID=null, JMSCorrelationIDAsBytes=null, JMSDeliveryMode=2, JMSDestination=queue://apix, JMSExpiration=0, JMSMessageID=ID:messaging.pandorademo_default-40611-1503655654005-1:5:1:1:39, JMSPriority=4, JMSRedelivered=false, JMSReplyTo=null, JMSTimestamp=1503655671000, JMSType=null, JMSXGroupID=null, JMSXUserID=null, org.fcrepo.jms.baseURL=http://fcrepo:8080/fcrepo/rest, org.fcrepo.jms.eventID=urn:uuid:ea9afd6a-f06c-4423-a983-4e4a38ca3fef, org.fcrepo.jms.eventType=http://fedora.info/definitions/v4/event#ResourceCreation,http://fedora.info/definitions/v4/event#ResourceModification, org.fcrepo.jms.identifier=/apix/services/definitions-v4-api-extension-LoaderService, org.fcrepo.jms.resourceType=http://www.w3.org/ns/ldp#Container,http://fedora.info/definitions/v4/repository#Resource,http://fedora.info/definitions/v4/repository#Container,http://www.w3.org/ns/ldp#RDFSource, org.fcrepo.jms.timestamp=1503655670702, org.fcrepo.jms.user=bypassAdmin, org.fcrepo.jms.userAgent=Apache-HttpClient/4.5.2 (Java/1.8.0_111)}
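To make the difference between the two filters concrete, here is a plain-Java sketch (no Camel dependency) that applies both checks to the header values from the dump above. The class and method names are made up for the example, and `matchesByType` is only my approximation of "any resource type is one of the listed API-X types"; it does not claim to reproduce the exact semantics of Camel's simple `in` operator.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: compares URI-based and type-based filtering.
final class ApixUpdateFilterSketch {
    // Values as described in the comment above.
    static final String TYPE_APIX_SERVICE = "services";
    static final String TYPE_APIX_EXTENSION = "extensions";

    // RDF types that the type-based filter looks for.
    static final List<String> APIX_TYPES = Arrays.asList(
            "http://fedora.info/definitions/v4/service#ServiceRegistry",
            "http://fedora.info/definitions/v4/api-extension#Extension",
            "http://fedora.info/definitions/v4/service#ServiceInstance",
            "http://fedora.info/definitions/v4/service#Service");

    // URI-based check: substring match on the resource URI.
    static boolean matchesByUri(String fcrepoUri) {
        return fcrepoUri != null
                && (fcrepoUri.contains(TYPE_APIX_SERVICE)
                    || fcrepoUri.contains(TYPE_APIX_EXTENSION));
    }

    // Type-based check: true if any resource type is one of the API-X types.
    static boolean matchesByType(List<String> resourceTypes) {
        for (String type : resourceTypes) {
            if (APIX_TYPES.contains(type)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Header values copied from the dump above.
        String uri = "http://fcrepo:8080/fcrepo/rest/apix/services/"
                + "definitions-v4-api-extension-LoaderService";
        List<String> types = Arrays.asList(
                "http://www.w3.org/ns/ldp#Container",
                "http://fedora.info/definitions/v4/repository#Resource",
                "http://fedora.info/definitions/v4/repository#Container",
                "http://www.w3.org/ns/ldp#RDFSource",
                "http://www.w3.org/ns/prov#Entity");
        System.out.println("by URI:  " + matchesByUri(uri));    // true
        System.out.println("by type: " + matchesByType(types)); // false
    }
}
```

One caveat with the URI approach: a plain substring match would also let through any unrelated resource whose path happens to contain "services" or "extensions", so matching on a path prefix might be safer.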
