Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update symbol packages guidance #9110

Merged
merged 6 commits into from
Nov 20, 2018
Merged

Conversation

JamesNK
Copy link
Member

@JamesNK JamesNK commented Nov 19, 2018

Remove mention of ye olde SymbolSource, include details about NuGet.org symbol server.

Open question:

I haven't updated the guidance, currently it recommends embedding PDBs over a source package. Should .NET libraries continue to embed PDBs in the main NuGet package? Download size is larger, but doesn't require developers to configure the source in VS. Are there other advantages or disadvantages?

@clairernovotny
Copy link
Member

IMO, PDB's should be in the main package, always. It's much faster to have the pdb's around instead of downloading on-demand. Also, some things like .NET Native need the symbols to be on disk in advance in order to munge the PDB's to match the native output. Symbol server does not work for that scenario.

I believe snupkg should only be used for full pdb's, which are bigger, but that's the one type of symbol they don't support. PPDB's are small. Better yet, use embedded as the symbol type and it'll be compressed and put directly in the DLL. No extra symbols to keep track of. The size difference is not that large.

@clairernovotny
Copy link
Member

Other issues I can think of:

  • If you use a mirroring feed, you don't get the pdb's. Clients must have access to the symbol server. Many corporate environments restrict access to NuGet in favor of an approved package list.
  • If you use a local file share feed, there's no place for the snupkg

I think using pdb's in the main nupkg or an embedded pdb has the least friction across almost every scenario.

@JamesNK
Copy link
Member Author

JamesNK commented Nov 19, 2018

Example:

  • Newtonsoft.Json nupkg with PDB files = 3.2 mb
  • Newtonsoft.Json nupkg without PDB files = 2.35 mb

Note that Newtonsoft.Json has 9 targets. Most packages are a lot smaller 😄

@jeromelaban
Copy link

I agree with @onovotny, though dll-embedded (not package embedded) symbols can be a problem for the platforms for which the package size can be a problem (iOS/Android, and to some extents WebAssembly as well). If embedding becomes a best practice at some point, the ILLinker has to be able to strip those as well.

For nuget debugging story, keeping the symbols next (or embedded) in the dlls is easier to work with when altering a published package for debugging and development purposes. This does happen when developing a package that cannot be integrated with the source that consumes that same package. If the symbols are located somewhere else, this will complicate this particular, yet very common, debugging/development scenario.

@clairernovotny
Copy link
Member

@JamesNK What about embedded pdb's? Those are compressed in the dll.

@clairernovotny
Copy link
Member

@JamesNK I'd also point to Newtonsoft.Json as an outlier in terms of targets. Version 12 has 9 different versions in the same nuget package. I'd suggest that the per-platform difference isn't high and that most libraries have 1-3 targets (maybe netstandard 1.x, .NET, and .NET Core.) or some other combination.

@JamesNK
Copy link
Member Author

JamesNK commented Nov 19, 2018

How do you create embedded pdbs?

Since nupkg files are themselves compressed, would they be much smaller?

@clairernovotny
Copy link
Member

@JamesNK Set DebugType to embedded wherever you have it (you can do it once in your Directory.Build.Props file). Depends on what algorithm each uses.

@clairernovotny
Copy link
Member

clairernovotny commented Nov 19, 2018

cc @tmat for additional thoughts

@JamesNK
Copy link
Member Author

JamesNK commented Nov 19, 2018

Embedded PDBs in DLLs produces a package that is 1KB smaller than loose PDBs 😋

@JamesNK
Copy link
Member Author

JamesNK commented Nov 19, 2018

Newtonsoft.Json reduced to two targets:

  • Newtonsoft.Json nupkg with PDB files = 0.79 mb
  • Newtonsoft.Json nupkg without PDB files = 0.58 mb

There is about a 25-30% overhead of embedding portable PDBs.

@jnm2
Copy link
Contributor

jnm2 commented Nov 19, 2018

25-30% for portable sounds about right. Here is a table of measurements I did for NUnit when we were considering different options. Another example shows how the ratios can vary from lib to lib.

@seangwright
Copy link
Contributor

Excuse my ignorance but how does the new(er) SourceLink tech vs traditional Symbol Servers play into all of this?

What are the main arguments against a slightly larger package, especially in .NET Core where each solution doesn't have its own packages folder and instead a central/per-user cache is used? Is it download time/bandwidth/data-cap limitations?

Is the plan for the docs to provide the default guidance for teams and OSS (eg include pdbs) but also list the alternatives available?

@JamesNK
Copy link
Member Author

JamesNK commented Nov 19, 2018

As far as I know:

Symbol servers are used to acquire symbol files (PDBs). Symbol files map executing compiled code to its location in source code.

Source Link is the next step: once the source code location is known, Source Link gets the file from the source repository.

@ctaggart
Copy link

I pushed for pdb files being embedded in the main nupkg packages for several years before Microsoft agreed and recommended doing so. However, @vancem also said:

Ultimately we want to have a secure symbol server story that everyone can use

Now that it exists, I'm guessing the recommendation will be to use it. It would be nice if there was a dotnet global tool that pre-fetched all of the symbols and put them side-by-side so symbols wouldn't have to be looked up on-demand by the debugger every run.

@clairernovotny
Copy link
Member

While I agree that it's useful to have a symbol server for some specific cases, in general, I don't believe the extra complexity is worth it for most packages. I would like to see some explanation/justification for any recommendation that makes things more complicated, and to me, adding another set of packages and the related configuration for symbol server, is definitely more complicated for the average user.

@ctaggart
Copy link

how does the new(er) SourceLink tech vs traditional Symbol Servers play into all of this?

@seangwright SourceLink will work either way. The pdb file simply needs the information added. This discussion is just about how to obtain the pdb files. Either they come in the nupkg or they are downloaded separately on-demand using a symbol server (pdb file server).

Copy link

@vancem vancem left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could probably improve this documentation more, but this is certainly an improvement (it removes obsolete things), this is OK

@vancem
Copy link

vancem commented Nov 19, 2018

as @JamesNK and @jnm2 has pointed out there is a non-trivial (25-30%) size cost to including the PDBs into to the main package.

NEVER-THE-LESS, as @ctaggart mentions, because of its simplicity on both the production and consumption side, recommending that most package authors (that probably don't care too much about their package size), is REASONABLE.

However even @onovotny agrees that there are times where symbol packages would be helpful, in particular the native case. Now it is true that the native case is not supported right now, but that will be fixed in the future. Thus the NEED for symbol packages SOMETIMES is clear.

The complexity of symbol packages is now actually reasonably low.

  1. on the consumption side, all that is needed is that the tool (debugger) have https://symbols.nuget.org/download/symbols in its symbol server list, and it is my expectation that we will make even this step trivial in the future). Other tools have a similar incentive to make this step easy (and even if they don't it is not hard to work around), My expectation is it will happen relatively quickly.

  2. Similarly, the 'complexity cost' of making symbol packages is now pretty low, and if there are still bumps we can smooth them over very soon.

So while I don't mind telling people their options, and I don't mind biasing toward simplicity, I think the complexity is now pretty low in both cases, so it is not clear to me there is a clear winner, (but conversely there is a 25% size cost).

Note that currently this edit does not try to change any guidance, but frankly fix what is now clearly obsolete. Thus I recommend we check that in.

We should open a separate Issue/PR for changing the guidance. I would assert that in that Issue/PR, we need to detail the experiences in both production and use (ideally as updated docs), and that will probably make the decision (We will mention both, and if there is a obvious user experience complexity win, then we suggest people do the simple thing unless size is important (which it would be for framework and very popular packages).

@clairernovotny
Copy link
Member

@vancem As of today, the .NET Native case isn't handled by symbol server. That toolchain needs the pdb's already on disk to munge them. Would be great to see options for that.

@diverdan92
Copy link

In the original spec we do call out this desire and are still looking for ways to improve this:

The default symbols experience will be to create multiple packages - one for symbols (.snupkg) and one for the binaries (.nupkg) - so that we do not bloat package size and impact package restore times. Even though portable PDBs are drastically smaller than Microsoft PDBs, they still account for ~50% of the binary size (a 5 MB .nupkg may yield a 2.5 MB .snupkg with Portable PDB). With this in mind, we do recognize some members of the community asking for improved embedded PDB support in the .nupkg and will be looking for feedback to best solve this problem. We still need to evaluate the best experience for including a single-file solution, and would love your feedback on this topic in the GitHub issue.

If a .pdb is included in a normal .nupkg, NuGet and NuGet.org will not do any special validation or indexing of the .pdbs. Debugging of the .nupkg will work in a limited way as the symbols will be cached next to the .dll, but it will not be indexed and downloaded from a symbol server. The only way to ensure indexing on the symbol server is to include the .snupkg. If the symbols are not indexed, certain scenarios will not work, such as crash dump debugging and attaching to an application.

While size increase varies, even a 25-30% increase in package size spread across the entire .NET ecosystem will have down-level impact.

@onovotny - perhaps we can have another conversation more targeted on the improvement of symbol generation with .NET Native. Are there any other big exceptions worth calling out currently?

@clairernovotny
Copy link
Member

@diverdan92 What extra symbol indexing do I need if I already use Sourcelink? Crash dump debugging will work then.

Also, one other major downside of source server is debug latency. It significantly slows down debug startup as it probes the symbol servers. With the symbols all locally, it's a non issue.

@clairernovotny
Copy link
Member

I created a discussion issue for this here
dotnet/sdk#2679

@JamesNK
Copy link
Member Author

JamesNK commented Nov 19, 2018

How about this for updating the guidance:

✔️ CONSIDER embedding symbol files in the main NuGet package.

Embedding symbol files in the main NuGet package gives developers a better debugging experience by default. They do not need to find and configure the correct symbol server in their IDE to get symbol files..

The downsize to embedded symbol files is they increase the package size by about 30% for .NET libraries compiled using SDK-style projects. If package size is a concern then publish your symbols in a symbol package instead.

❌ AVOID creating a symbols package containing symbol files.

Thoughts?

@Thorium
Copy link

Thorium commented Nov 19, 2018

If considered as a general guideline: Performance? Security?

I've never had a need to debug Newtonsoft.Json.

Copy link
Contributor

@rpetrusha rpetrusha left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've left a minor comment, but you can merge this when you're satisfied with it, @JamesNK., Ping me if you make changes and want them reviewed.

docs/standard/library-guidance/nuget.md Outdated Show resolved Hide resolved
@vancem
Copy link

vancem commented Nov 19, 2018

@JamesNK - I would change

They do not need to find and configure the correct symbol server in their IDE to get symbol files' to 

to

They do not need to find and configure the nuget symbol server in their IDE to get symbol files

I am OK with the advice however.

@JamesNK
Copy link
Member Author

JamesNK commented Nov 19, 2018

Thanks for the review!

@rpetrusha Please review the addition for language

Copy link
Contributor

@rpetrusha rpetrusha left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left one nit, but otherwise LGTM, @JamesNK

docs/standard/library-guidance/nuget.md Outdated Show resolved Hide resolved
@JamesNK JamesNK merged commit 3f583de into dotnet:master Nov 20, 2018
@KylePreuss
Copy link

Why is a symbol server always required here?

If source is embedded into the portable PDB, and the portable PDB is inside a .snupkg, why can I not push the .snupkg to disk (e.g. a network share) -- the same location that I push the the regular .nupkg to -- and have Visual Studio correctly pull in the .snupkg as necessary? That way you don't bloat your regular .nupkg AND you don't have to come up with some sort of symbol server.

We have a use case for supporting the debugging of our own nuget packages on an air-gapped network, as well as when disconnected from the air-gapped network.

Barring the second part of our use case, I see people recommending a locally hosted symbol server. However, I know of no open-source, reliably maintained solutions (given SymbolSource is deprecated and doesn't support portable PDBs anyway). Until NuGet.org at least provides some sort of NuGetSymbolServer.exe that I can download, transfer to offline machine, and run as a service or something (cheap, easy, free,) our only solution is to put PDBs in the original NuGet package.

To support being disconnected from the air-gapped network, embed source code in the PDB and set network share to "make available offline". Is it not reasonable for Visual Studio to support reading the .snupkg directly from disk?

bgrainger added a commit to mysql-net/MySqlConnector that referenced this pull request Nov 20, 2018
The DLL size increase is not large, platforms where this package would be deployed can support it, and it improves debuggability by default.

See discussion at dotnet/docs#9110 (comment) and dotnet/sdk#2679 (comment).
@ctaggart
Copy link

So the first pull request that references this guidance change thinks the recommendation is to use <DebugType>embedded</DebugType>, which it is not. The guidance is to embed add the portable pdb files in the NuGet package, not embed the pdb files in the dll files. May want to clarify.

@JamesNK
Copy link
Member Author

JamesNK commented Nov 21, 2018

Oren knows the difference 😋

There are instructions immediately above the guidance that demonstrate how to embed (as in add) the PDBs to the NuGet package so I'm not worried.

If you are concerned then you could create a new PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.