Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve Documentation - Fix code links and improve tutorial layout #6321

Merged
merged 33 commits into from
Sep 26, 2023

Conversation

saurabheights
Copy link
Contributor

@saurabheights saurabheights commented Aug 18, 2023

Type

  • Bug fix (non-breaking change which fixes an issue): Fixes #
  • New feature (non-breaking change which adds functionality). Resolves #
  • Breaking change (fix or feature that would cause existing functionality to not work as expected) Resolves #

Motivation and Context

A lot of docs had gone out of sync in repo where it refered to some code file where changes were made. I tried to come up with approach which keeps docs in sync with new code changes. At the same time, I always wanted to rearrange tutorial layout so following docs is easier.

There were two options:

  • Add start-after/at;end-at/before comments which can be added to code making it easy to keep docs sync with code. However this reduces readability. (discussed with Wei on discord)
  • Finalized on using pyobject with prepend option which can be used to show a class/method/function along with filepath (using prepend) containing that code. Where docs refer to only some lines inside some method, I only fixed line number.

Note - prepend is not used everywhere, found it a little late. I will review the PR myself once I get some time free.
Update - Reviewed all.

Checklist:

  • I have run python util/check_style.py --apply to apply Open3D code style
    to my code.
  • This PR changes Open3D behavior or adds new functionality.
    • Both C++ (Doxygen) and Python (Sphinx / Google style) documentation is
      updated accordingly.
    • I have added or updated C++ and / or Python unit tests OR included test
      results
      (e.g. screenshots or numbers) here.
  • I will follow up and update the code if CI fails.
  • For fork PRs, I have selected Allow edits from maintainers.

Description

The reasoning behind changes of layout is mentioned here in Open3D discord : https://discord.com/channels/467404918333112321/573049248095141889/1141704493768134806

Commits are atomic, so even though they might take longer to review, reviewing one at a time is a lot less taxing.


This change is Reviewable

@review-notebook-app
Copy link

Check out this pull request on  ReviewNB

See visual diffs & provide feedback on Jupyter Notebooks.


Powered by ReviewNB

@saurabheights
Copy link
Contributor Author

CI failed here https://github.com/isl-org/Open3D/actions/runs/5907642946/job/16025904216?pr=6321#step:5:2442 due to download failure of armadillo mesh

@ssheorey
Copy link
Member

Hi @saurabheights thanks for the doc updates. We'll get back with reviews shortly. Don't worry about the CI failure - it's a network download error that happens occasionally on github.

@reyanshsolis
Copy link
Collaborator

Thanks @saurabheights for this much needed update in documentation. Reviewing changes in the documentation, will get back soon.

@reyanshsolis
Copy link
Collaborator

Bug:
image

Copy link
Collaborator

@reyanshsolis reyanshsolis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is also difference in this part:
Left - Version 0.14 Doc, Right - This PR

image

You may download the documentation generated from this PR from (https://github.com/isl-org/Open3D/actions/runs/5907642936). Then you may open it side-by-side with version 0.14 documentation (http://www.open3d.org/docs/0.14.1/tutorial/t_reconstruction_system/integration.html) which is the stable one, and compare changes for the same. Once they are same, push the changes and re-request for review.

@saurabheights
Copy link
Contributor Author

@reyanshsolis - Thank you for the review. A bit busy but I will finish comparison of v0.14.0 and current by EOW.

P.S. -
#6321 (comment) - I intentionally skipped this and kept it for later PR, only to keep current PR shorter (it was already getting quite large).

The issue here is that code for saving and loading of vbg to numpy npz file has been removed from tutorial, so just changing line number wont solve the problem. I would rather keep the tutorial as it is, and add saving and loading in code as many would find save/load SDF functionality useful.

#6321 (review) - Fixed.

@saurabheights
Copy link
Contributor Author

@reyanshsolis So, I noticed one regression in display of line numbers. A sample screenshot attached:-

Screenshot from 2023-08-30 11-28-15

I have pyobject directive that allows to add a whole method and tried to fix the line number using lineno-match directive. This has fixed line numbers as in this commit - fc04d88 for interactive_visualization.rst and non_blocking_visualization.rst.

The problem is in reconstruction_system tutorials:-

  • register_fragments.rst
  • refine_registration.rst
  • integrate_scene.rst
  • make_fragments.rst

Files here usually have a python method and filename added to docs. I initially used pyobject with prepend option to ensure code changes in future does not make docs go out of sync, as done below:-

.. literalinclude:: ../../../examples/python/reconstruction_system/register_fragments.py
   :language: python
   :pyobject: preprocess_point_cloud
   :linenos:
   :prepend: # examples/python/reconstruction_system/register_fragments.py

The problem is prepend doesnt work with lineno-match, i.e. no code is rendered in docs. Right now, I have 3 options:-

  1. Use old line number approach. Include code by lines and not use pyobject.
  2. Keep the changes as it is - Ignore the change in line number. - (Not good, IMO)
  3. Remove prepend which removes filename and add lineno-match.

I think 3 is fine to go for as http://www.open3d.org/docs/release/tutorial/reconstruction_system/system_overview.html#quick-start already provides location of code.

Let me know if 3 is ok and I will wrap this up this weekend.

@reyanshsolis
Copy link
Collaborator

Yes (3) seems like the most reasonable approach to me as well.

Copy link
Contributor Author

@saurabheights saurabheights left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed with side-by-side comparison of current and v0.14.1 docs. Looks good. @reyanshsolis You can review now. Thank you.

@saurabheights
Copy link
Contributor Author

saurabheights commented Aug 31, 2023

I think an upgrade of documentation libraries would be good as sphinx version and open3d_sphinx_theme based on sphinx-rtd-theme version are quite old. For sphinx, nbsphinx, sphinx-rtd-theme, the versions used are over 2-3 years old. I already faced few issues due to older versions of documentation libraries when testing out newer sphinx/m2r2/ nbsphinx features to correct documentation. Will it be alright to gradually upgrade these packages in future prs?

@ssheorey
Copy link
Member

I think an upgrade of documentation libraries would be good as sphinx version and open3d_sphinx_theme based on sphinx-rtd-theme version are quite old. For sphinx, nbsphinx, sphinx-rtd-theme, the versions used are over 2-3 years old. I already faced few issues due to older versions of documentation libraries when testing out newer sphinx/m2r2/ nbsphinx features to correct documentation. Will it be alright to gradually upgrade these packages in future prs?

Yes, good idea!

Copy link
Member

@ssheorey ssheorey left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @saurabheights for these updates / improvements to the docs. Mostly everything looks good, except for a couple of comments below.

:linenos:

The main function calls each individual function explained above.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we keep this section? It refers to the run() function in make_fragments.py that calls all the previous functions.

Copy link
Contributor Author

@saurabheights saurabheights Sep 16, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ssheorey The reason I removed this is because the run method calls only process_single_fragment, however process_single_fragment is never discussed in documentation before. Also, run method focuses on use of multiprocessing to speedup the code, but thats not really where open3d documentation focus is.

I dived deeper into git history and originally process_single_fragment used to be process_fragments which did all prep before calling previous functions for each fragment. It was documentated in Batch Processing section - https://github.com/isl-org/Open3D/pull/521/files.

See http://www.open3d.org/docs/0.6.0/tutorial/ReconstructionSystem/make_fragments.html#batch-processing for easy check (although code lines went a little out of sync with docs here)

See http://www.open3d.org/docs/0.7.0/tutorial/ReconstructionSystem/make_fragments.html#batch-processing where this got changed to run method.

Will it be better to revert my change but instead we show either

  • process_single_fragment. OR
  • both process_single_fragment and run method. This would go well with addition of few statements explaining use of multiprocessing to speedup.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Option 2 sounds good. Let's show both process_single_fragment and run and explain use of multiprocessing to speed up processing. Optionally the parallel execution in run() can be cleaned up in the example. joblib can be replaced with concurrent.futures.ProcessPoolExecutor or multiprocessing.pool, which is part of the Python standard library.

Copy link
Contributor Author

@saurabheights saurabheights Sep 17, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ssheorey Need your input. I tried replacing joblib, but in an earlier problem at work, I had faced similar issue where open3d hangs with multiprocessing. As per your comment here, #1552 (comment), I moved to forkserver and that is working fine. However, forkserver is not available on Windows, and in this usecase, spawn should be fine as well. Shall I go with this?

max_workers = min(max(1, multiprocessing.cpu_count()-1), n_fragments)
mp_context = multiprocessing.get_context('spawn')
with mp_context.Pool(processes=max_workers) as pool:
    args = [(fragment_id, color_files, depth_files, n_files, n_fragments, config) for fragment_id in range(n_fragments)]
    pool.starmap(process_single_fragment, args)

P.S. Was curious, and found joblib went with forkserver for same reason but later reverted back to fork -

Make joblib use the ‘forkserver’ start method by default under Python 3.4+ to avoid causing crash with 3rd party libraries (such as Apple vecLib / Accelerate or the GCC OpenMP runtime) that use an internal thread pool that is not reinitialized when a fork system call happens.

Later they implemented loky backend which circumvent this issue. https://joblib.readthedocs.io/en/latest/parallel.html#bad-interaction-of-multiprocessing-and-third-party-libraries

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that's a good idea. This will also be a reference to users about how to use Open3D with multiprocessing.
Just add a comment saying "Linux default method 'fork' cannot be used with multithreaded Open3D."

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

python_mp_spawn is using spawn as starting method.
python_mp_forkserver is using forkserver as starting method.
python_joblib_mp_backend is using joblib with multiprocessing backend.
python_joblib_mp_backend is using joblib with loky backend.
python_single_process is just main process.

Only for 4 fragments with each fragment made of 10 images, I get following measurements:-

Code block 'python_mp_spawn' took:            79258.28545 ms
Code block 'python_mp_forkserver' took:       81994.75125 ms
Code block 'python_joblib_mp_backend' took:   92798.19047 ms
Code block 'python_joblib_loky_backend' took:  8461.00342 ms
Code block 'python_single_process' took:      28939.51906 ms
# Without over allocation of openmp threads, See https://github.com/joblib/joblib/pull/940
os.environ['OMP_NUM_THREADS'] = str((multiprocessing.cpu_count() - 1) // max_workers)
Code block 'python_mp_spawn' took:            7003.84680 ms
Code block 'python_mp_forkserver' took:       7078.38375 ms
Code block 'python_joblib_mp_backend' took:   8058.99887 ms
Code block 'python_joblib_loky_backend' took: 7350.69630 ms
Code block 'python_single_process' took:     30476.80167 ms
# I also tested with OMP_NUM_THREADS=1. Very slightly slower.
Code block 'python_mp_spawn' took:            7040.57154 ms
Code block 'python_mp_forkserver' took:       7271.84740 ms
Code block 'python_joblib_mp_backend' took:   8278.31439 ms
Code block 'python_joblib_loky_backend' took: 7983.64183 ms
Code block 'python_single_process' took:     31091.80214 ms

Didnt make much of a difference, despite with 16 cores and 4 child processes. As this is an introductory example, just setting 1 would be simpler and easier for new users IMO.

Final committed code test

Multiprocess takes Making fragments 0:05:11.274633 (15 cores)
Singleprocess takes Making fragments 0:32:18.475112 (OpenMP number of threads unchanged)

Copy link
Member

@ssheorey ssheorey left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @saurabheights . This PR is a really good contribution!

@ssheorey ssheorey merged commit f1a0f3e into isl-org:master Sep 26, 2023
34 of 36 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants