Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cleaned up docs. #1642

Closed
wants to merge 19 commits into from
Closed

Cleaned up docs. #1642

wants to merge 19 commits into from

Conversation

gunjanbaid
Copy link
Contributor

@gunjanbaid gunjanbaid commented Jul 26, 2017

Cleaned up formatting, language, and broken links. @fnothaft Let me know if it's easier for me to open multiple PRs (one for each file?).

@coveralls
Copy link

coveralls commented Jul 26, 2017

Coverage Status

Coverage remained the same at 83.961% when pulling 5f8d939 on gunjanbaid:docs into c8a2202 on bigdatagenomics:master.

@AmplabJenkins
Copy link

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/ADAM-prb/2288/
Test PASSed.

@fnothaft
Copy link
Member

Thanks @gunjanbaid! A single PR is a-OK by me! I'll make a pass shortly.

Copy link
Member

@fnothaft fnothaft left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @gunjanbaid! Looks solid!

I saw lots of places where we lost anchors/anchor links in the documentation. I'm guessing this was intentional? My preference is to keep the links, as they aid navigation in the HTML and PDF copies. What's your general line on this?

* View our software artifacts on Maven Central at [http://search.maven.org/#search%7Cga%7C1%7Corg.bdgenomics](http://search.maven.org/#search%7Cga%7C1%7Corg.bdgenomics)
* See our snapshots at [https://oss.sonatype.org/index.html#nexus-search;quick~bdgenomics](https://oss.sonatype.org/index.html#nexus-search;quick~bdgenomics)
* Look at our CHANGES file at [https://github.com/bigdatagenomics/adam/blob/master/CHANGES.md](https://github.com/bigdatagenomics/adam/blob/master/CHANGES.md)
* [Follow](https://twitter.com/bigdatagenomics/) our Twitter account
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for updating this, it reads much nicer! Would you mind copying this over to README.md? There is a very similar section of links in there, IIRC.

@@ -1,4 +1,4 @@
# Building ADAM from Source {#build-from-source}
# Building ADAM from Source
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OOC, why'd we lose the anchor tag here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't think this was the correct formatting for these anchors, based on adam docs. There seemed to be two issues:

Since you can link to headers with just the text of the header, separated by dashes (for example, http://adam.readthedocs.io/en/latest/source/02_installation/#running-adam), I just removed the anchors and updated existing links to this format.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The anchors in the first format don't work when displayed on Github but they do work after generating the docs. Is the second form also compatible with Sphinx?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For now, I would consider the output of the pandoc scripts (docs/build.sh) to be the golden copy.

OOC, did one of us set up the readthedocs page?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I set up the one here https://readthedocs.org/projects/adam/

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mine is gunjanbaid

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! I am fnothaft, same as here.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, if you log into readthedocs, you should be able to go to https://readthedocs.org/dashboard/adam and hit the Admin button

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @heuermh! Works for me now.

Copy link
Member

@devin-petersohn devin-petersohn Jul 30, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

edit: My readthedocs username is devin.petersohn

@@ -49,15 +49,15 @@ to have the Spark binaries on your system; prebuilt binaries can be downloaded f
https://amplab.cs.berkeley.edu/jenkins/job/ADAM/) builds ADAM against Spark versions 1.6.1 and 2.0.0,
Scala versions 2.10 and 2.11, and Hadoop versions 2.3.0 and 2.6.0.

Once this alias is in place, you can run ADAM by simply typing `adam-submit` at the commandline, e.g.
Once this alias is in place, you can run ADAM by simply typing `adam-submit` at the commandline.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While we're at it, let's change commandline -> command line


```bash
$ adam-submit
```

## Building for Python {#python-build}
## Building for Python
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why'd we lose the anchor tag here and the anchor link two lines down?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Explanation for anchor tags provided in earlier comment.

For the anchor link, I thought it was for an outdated anchor that had been removed, there's no #python anchor. When you click on that link on this page, it does not take you anywhere.

@@ -100,7 +100,7 @@ ASSEMBLY_JAR="$(ls -1 "$ASSEMBLY_DIR" | grep "^adam[0-9A-Za-z\.\_-]*\.jar$" | gr
export PYSPARK_SUBMIT_ARGS="--jars ${ASSEMBLY_DIR}/${ASSEMBLY_JAR} --driver-class-path ${ASSEMBLY_DIR}/${ASSEMBLY_JAR} pyspark-shell"
```

This assumes that the [ADAM JARs have already been built](#build-from-source).
This assumes that the [ADAM JARs have already been built](#).
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ditto here RE: anchor link.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Clicking on previous link from docs page does not have any effect. Updated link takes you to the top header. I can update this to be #building-adam-from-source, that is probably more clear than just #.


Note, the steps to register your ssh key and create the template boxes below
need only be done once.
Note, the steps to register your ssh key and create the template boxes below only need to be done once.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thoughts on removing below in this line? Reads a bit awkward to me.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good, will update.

in AWS console that your instances have terminated to avoid unintended costs.

spot instances in order to save on costs. To avoid unintended costs,
it's a good idea to use the AWS console to double check that your
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thoughts on removing all contractions in the docs? My preference is to not use contractions in formal writing.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would be in favor of this! Will update.


#### Access Spark GUI

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While we're cleaning up this section, would the header read better as Accessing the Spark GUI?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will update.

properly configured for your YARN deployment. Typically, if your Spark
assembly is configured properly to use YARN, there will be symbolic link at
assembly is properly configured to use YARN, there will be symbolic link at
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

be symbolic -> be a symbolic

@@ -253,9 +254,9 @@ include:

* [adam-kmers](https://github.com/BD2KGenomics/toil-scripts/tree/master/src/toil_scripts/adam_kmers):
this workflow was demonstrated in [@vivian16] and sets up a Spark cluster
which then runs ADAM's [`countKmers` CLI](#countKmers).
which then runs ADAM's `countKmers` CLI.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lost the anchor link.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also two lines below.

Shutdown the cluster using:
```
cgcloud terminate-cluster -c cluster1 spark
```

#### CGCoud options and Spot Instances
View help docs for all options of the the `cgcloud create-cluster` command:

View help docs for all options of the `cgcloud create-cluster` command:
```
cgcloud create-cluster -h
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some code blocks with bash commands (like these) start with $, while some don't. Is there a preferred style? I can update so they are all consistent.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For me, if the bash code block shows one or commands that the user would be likely to copy and paste into a bash shell, then I typically don't include the $.

mvn install
./bin/adam-submit transformAlignments foo.bam bar.adam
spark-submit conductor-0.5-SNAPSHOT-distribution.jar hdfs://bar.adam s3://bar.adam

If the code block is showing the output of a command, then I would include the $.

$ adam-submit --help
Using ADAM_MAIN=org.bdgenomics.adam.cli.ADAMMain
Using SPARK_SUBMIT=/usr/local/bin/spark-submit

       e         888~-_          e             e    e
      d8b        888   \        d8b           d8b  d8b
     /Y88b       888    |      /Y88b         d888bdY88b
    /  Y88b      888    |     /  Y88b       / Y88Y Y888b
   /____Y88b     888   /     /____Y88b     /   YY   Y888b
  /      Y88b    888_-~     /      Y88b   /          Y888b

Usage: adam-submit [<spark-args> --] <adam-args>
...

@coveralls
Copy link

coveralls commented Jul 27, 2017

Coverage Status

Coverage remained the same at 83.961% when pulling 3329607 on gunjanbaid:docs into c8a2202 on bigdatagenomics:master.

@AmplabJenkins
Copy link

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/ADAM-prb/2292/
Test PASSed.

@coveralls
Copy link

coveralls commented Jul 27, 2017

Coverage Status

Coverage remained the same at 83.961% when pulling 7643f64 on gunjanbaid:docs into c8a2202 on bigdatagenomics:master.

@AmplabJenkins
Copy link

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/ADAM-prb/2293/
Test PASSed.

@coveralls
Copy link

coveralls commented Jul 27, 2017

Coverage Status

Coverage remained the same at 83.961% when pulling 7cb8ea5 on gunjanbaid:docs into c8a2202 on bigdatagenomics:master.

@AmplabJenkins
Copy link

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/ADAM-prb/2294/
Test PASSed.

@coveralls
Copy link

coveralls commented Jul 27, 2017

Coverage Status

Coverage remained the same at 83.961% when pulling 7cb8ea5 on gunjanbaid:docs into c8a2202 on bigdatagenomics:master.

@AmplabJenkins
Copy link

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/ADAM-prb/2295/
Test PASSed.

@coveralls
Copy link

coveralls commented Jul 27, 2017

Coverage Status

Coverage remained the same at 83.961% when pulling 7cb8ea5 on gunjanbaid:docs into c8a2202 on bigdatagenomics:master.

@AmplabJenkins
Copy link

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/ADAM-prb/2296/
Test PASSed.

@coveralls
Copy link

coveralls commented Jul 27, 2017

Coverage Status

Coverage remained the same at 83.961% when pulling 7cb8ea5 on gunjanbaid:docs into c8a2202 on bigdatagenomics:master.

@AmplabJenkins
Copy link

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/ADAM-prb/2297/
Test PASSed.

@coveralls
Copy link

coveralls commented Jul 27, 2017

Coverage Status

Coverage remained the same at 83.961% when pulling 7cb8ea5 on gunjanbaid:docs into c8a2202 on bigdatagenomics:master.

@AmplabJenkins
Copy link

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/ADAM-prb/2298/
Test PASSed.

Copy link
Member

@fnothaft fnothaft left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Thanks @gunjanbaid!


From ADAM shell, or as parameter to ADAM submit, you would refer HDFS URLs
such as:
From the ADAM shell, or as a parameter to ADAM submit, you would refer to HDFS URLs like this:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note to self: break this line at 80 characters when merging.

on your local machine point your web-browser to http://ip_of_spark_master:4040/
8080 on `spark-master`, go to Security Groups in the AWS console
and open inbound TCP for those ports from your local IP address. Find the
IP address of `spark-master`, which is part of the Linux command prompt. On your local machine,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note to self: break this line at 80 characters when merging.

@@ -520,8 +521,8 @@ For those groups with access to a HPC cluster with [Slurm](https://en.wikipedia.
managing a number of compute nodes with local and/or network attached storage, it is possible to spin up a
temporary Spark cluster for use by ADAM.

While the full IO bandwidth benefits of Spark processing are likely best realized through a set of
co-located compute/storage nodes, depending on your network setup you may find Spark deployed on HPC
The full IO bandwidth benefits of Spark processing are likely best realized through a set of
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note to self: break this line at 80 characters when merging.

We'd love to hear feedback on your experience running ADAM on HPC/Slurm or other deployment architectures,
and let us know of any problems you run into via the mailing list or Gitter.

We would love to hear feedback on your experience running ADAM on HPC/Slurm or other deployment architectures.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note to self: break this line at 80 characters when merging.

@fnothaft
Copy link
Member

fnothaft commented Aug 2, 2017

Thanks @gunjanbaid! I've merged this manually as 6573997 and 9032698.

@fnothaft fnothaft closed this Aug 2, 2017
@heuermh heuermh added this to the 0.23.0 milestone Dec 7, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants