Read only config file doesn't dictate projects list in webapp #3414

Closed
Ymoise opened this issue Jan 30, 2021 · 8 comments

Ymoise commented Jan 30, 2021

I have two folders currently being shown as projects by the webapp - Folder1 and Folder2.

I want to get to a point where I still have Folder1 as a project, but Folder2 is replaced by Folder2's subfolders, so I would have Folder1, Subfolder1, and Subfolder2 as projects.

I created this read-only config file as a PoC, to see if I could get any of the subfolders to register as a project:

<?xml version="1.0" encoding="UTF-8"?>
<java version="11.0.6" class="java.beans.XMLDecoder">
        <object class="org.opengrok.indexer.configuration.Configuration">
                <void property="projects">
                        <void method="put">
                                <string>Subfolder1</string>
                                <object class="org.opengrok.indexer.configuration.Project">
                                        <void property="historyEnabled">
                                                <boolean>true</boolean>
                                        </void>
                                        <void property="indexed">
                                                <boolean>true</boolean>
                                        </void>
                                        <void property="name">
                                                <string>Subfolder1</string>
                                        </void>
                                        <void property="path">
                                                <string>/Folder2/Subfolder1</string>
                                        </void>
                                </object>
                        </void>
                </void>
        </object>
</java>

I added the -R flag to the indexer script and triggered it and from the looks of the console output, it used the file, but... when I went to the webapp, after the indexing was done, I still saw Folder1 and Folder2 as the projects, and not what I set up at all.

I'm running docker image 1.3.11

Please advise. Is this a bug or am I really off, here?

vladak added the question label Feb 1, 2021

Member

vladak commented Feb 1, 2021

By definition, every immediate subdirectory under the source root is a project, so you cannot really put /Folder2/Subfolder1 in the project path. You can create symlinks to these subdirectories in the source root directory; Folder2 will remain listed as a project, though.

Also, it depends on how you run the indexer. If it performs project detection and a repository scan, it may override the read-only configuration. The cure for that is to use the per-project workflow: basically, populate the project/repository list initially and then add or remove individual projects. For such a use case it might be overkill, though.

Author

Ymoise commented Feb 1, 2021

Well, my next task is actually to circumvent project detection and repository scan (the idea is to speed up the indexing so we can switch History back on; it's currently off to aid performance). So, assuming Overkill is my middle name... how do I go about populating the project/repository list and then managing it, like you said?

Member

vladak commented Feb 1, 2021

Building the list of projects is usually fast (assuming you don't have a huge list of projects), as it is essentially just a listing of the source root directory. It is not really possible to turn off the project "scan" because the -P indexer option has two meanings:

  • scan for projects
  • enable projects

It should be possible to set the projectsEnabled property in the read-only configuration and avoid -P (while using -R with said configuration). I have not tried that, though.
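
As a rough illustration of the idea above, the following Python sketch generates a minimal read-only configuration that sets only the projectsEnabled property, in the same XMLDecoder layout as the file posted at the top of this thread. The function name build_read_only_config is hypothetical, the Java version string is simply copied from that file, and whether the indexer honors the setting when run with -R and without -P is untested here, as noted above.

```python
import xml.etree.ElementTree as ET


def build_read_only_config(projects_enabled=True):
    """Return a read-only configuration XML string (hypothetical helper).

    The element layout mirrors the java.beans.XMLDecoder file shown earlier
    in this thread; only the projectsEnabled property is set.
    """
    java = ET.Element(
        "java", {"version": "11.0.6", "class": "java.beans.XMLDecoder"}
    )
    config = ET.SubElement(
        java,
        "object",
        {"class": "org.opengrok.indexer.configuration.Configuration"},
    )
    prop = ET.SubElement(config, "void", {"property": "projectsEnabled"})
    ET.SubElement(prop, "boolean").text = "true" if projects_enabled else "false"

    # ET.indent exists only on Python 3.9+; the output is valid either way.
    if hasattr(ET, "indent"):
        ET.indent(java)

    # Prepend the declaration by hand so the declared encoding is predictable.
    body = ET.tostring(java, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


if __name__ == "__main__":
    print(build_read_only_config())
```

The printed text could be saved to a file and passed to the indexer with -R; any other properties would be added as further void elements in the same way.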

Scanning for repositories inside projects is not so cheap, as it usually involves running a bunch of external commands (such as hg and git). It is at least parallelized, though there is room for more parallelization, which would require a smarter approach. The repository scan can be turned off by avoiding the -S indexer option.

Note that this is something where the OpenGrok Docker image is no longer helpful. That said, I have a set of changes to convert the mirroring/indexing in the Docker image to use the opengrok-sync Python program, which basically employs the per-project workflow as described on the wiki.

Member

vladak commented Feb 2, 2021

Back to the original problem description: if a set of projects is set (sic!) via the read-only configuration (which has the projectsEnabled property set to true) and the indexer is run without the -P option, the set of projects from the read-only configuration obliterates the pre-existing set of projects. So, in fact, it dictates the list of projects too strongly. There are two ways to overcome this: either use the config_merge tool to merge the read-only configuration and the active configuration (#2147 thwarts this, however), or add the project to the configuration via the projadm tool (or just use the RESTful API) and then retrieve the configuration from the web app. The latter is part of the per-project workflow.

Author

Ymoise commented Feb 3, 2021

Suppose I create symbolic links to the subfolders in the root folder, like you suggested... and then add Folder2 to the ignored list?

Would the indexer still index the projects represented by the symbolic links, or will it ignore them, too?

Member

vladak commented Feb 3, 2021

Any symlinks directly under source root will be automatically accepted and followed, even if they point to directories outside of source root.
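
The symlink approach discussed in this thread could be scripted roughly as follows. This is a minimal Python sketch, not part of OpenGrok: the helper name link_subfolders is hypothetical, and it assumes the process may create symlinks in the source root and that the link targets resolve to the same paths wherever the indexer runs (for example, inside the container).

```python
import os
from pathlib import Path


def link_subfolders(source_root, parent_name):
    """Symlink each immediate subdirectory of source_root/parent_name
    directly under source_root (hypothetical helper).

    Names that already exist in source_root are left untouched.
    Returns the names of the links that were created, in sorted order.
    """
    root = Path(source_root)
    parent = root / parent_name
    created = []
    for child in sorted(parent.iterdir()):
        if not child.is_dir():
            continue  # only directories are useful as projects
        link = root / child.name
        if link.exists() or link.is_symlink():
            continue  # do not clobber an existing project or link
        # Use the resolved absolute path as the link target.
        os.symlink(child.resolve(), link, target_is_directory=True)
        created.append(child.name)
    return created
```

For the layout in this issue, a call such as link_subfolders("/opengrok/src", "Folder2") would add Subfolder1 and Subfolder2 next to Folder1. Folder2 itself stays in the source root, so, as noted earlier in the thread, it would still be listed as a project.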

Author

Ymoise commented Feb 3, 2021

Cool. So I don't need the read-only config file for that.

I'll do that.

Before I close this, though, one last question about project and repo discovery: like I said, I had -S switched off to improve performance because, like you said, it ain't cheap. I tried to circumvent it by specifying

--repository /folder1
--repository /folder2

But no matter what I do, I get

Feb 03, 2021 11:54:31 AM org.opengrok.indexer.history.HistoryGuru getReposFromString
WARNING: Could not locate a repository for /opengrok/src/folder1
Feb 03, 2021 11:54:31 AM org.opengrok.indexer.history.HistoryGuru getReposFromString
WARNING: Could not locate a repository for /opengrok/src/folder2

And, consequently... I don't have a history for ANYTHING.

Author

Ymoise commented Feb 3, 2021

Read the reply in my previous ticket.

Thank you and sorry.

closing.

Ymoise closed this as completed Feb 3, 2021