Review adjustments continued
Signed-off-by: Michał Staniewski <m.staniewzki@gmail.com>
staniewzki committed Mar 8, 2023
1 parent 98782db commit b516c69
Showing 1 changed file with 38 additions and 39 deletions.
77 changes: 38 additions & 39 deletions thesis-en.tex
@@ -333,12 +333,12 @@ \section{Problems with using semver in Rust}\label{r:section_usageofsemver}
\end{verbatim}
\vspace{-5pt}

Changing the type of {\ttfamily Foo.x} from {\ttfamily String} to {\ttfamily Rc<str>}
causes a semver break, even though it is a non-public field of a non-public struct.
That's because {\ttfamily String} implements the {\ttfamily Send} and {\ttfamily Sync} traits.
Both are auto traits, implemented automatically for any type whose fields all implement them,
so both {\ttfamily Foo} and {\ttfamily Bar} implement {\ttfamily Send} and {\ttfamily Sync}.
In contrast, {\ttfamily Rc<str>} implements neither of them,
so the change results in the publicly visible struct {\ttfamily Bar} losing both traits.

The given example is not only non-obvious, but also is even harder to notice
@@ -352,20 +352,19 @@ \section{Problems with using semver in Rust}\label{r:section_usageofsemver}
\section{Consequences of breaking semver}

When you publish a new version of a crate that breaks semver,
you cause a major inconvenience for the crate's users.
Their code might simply stop compiling when the offending version gets downloaded.
This can also happen when the crate containing the violation is not a direct dependency,
so a single semver break can result in many broken crates.

Debugging a cryptic compilation error that starts showing up one day,
without any change to the code, can be really frustrating.
We experienced this ourselves while contributing, when one of the dependencies broke semver.
This is a major problem, as it might drive users to stop using your crate.

Because of that, maintainers have to yank
the incorrect releases as soon as possible;
otherwise more users would encounter the problem, and their trust
in the crate (and in crates using it as a dependency)
would decrease. Even though yanking a release seems easy, fixing a semver break can
also mean a lot of additional work for the maintainers: they have to investigate
the break when it is reported, inform the users about the yank,
and possibly help some of them move away from the faulty release.

\section{Real-life examples of semver breaks} \label{r:section_real_life_semver_breaks}

@@ -374,45 +373,45 @@ \section{Real-life examples of semver breaks} \label{r:section_real_life_semver_
\item {\ttfamily pyo3 v0.5.1} accidentally changed a function signature\footnote{https://github.com/PyO3/pyo3/issues/285},
\item {\ttfamily clap v3.2.0} accidentally had a type stop implementing an auto-trait\footnote{https://github.com/clap-rs/clap/issues/3876},
\item multiple {\ttfamily block-buffer} versions accidentally broke their MSRV contract\footnote{https://github.com/RustCrypto/utils/issues/22},
\item and many more. We have developed a script that scans all releases
for the semver breaks we can detect. The results are covered in section \ref{r:section_scanning_script}.
\end{itemize}

Those were examples of popular crates with experienced maintainers, but the problem is even more prominent in less popular crates,
whose developers might not know the common semver pitfalls. A paper\footnote{https://arxiv.org/pdf/2201.11821.pdf}
claims that among the yanked (unpublished) releases,
semver breaks were the leading reason for yanking, at a striking 43\% rate.
It also mentions that 3.7\% of all releases (and there are already more than 300\,000 of them)
are yanked, which shows the scale of the problem: thousands of detected semver breaks.

\section{Existing tools for detecting semver breaks}\label{r:section_existing_semver_tools}

There aren't many great existing tools for semver checking.
The main reason is that the semantics of popular languages
do not allow for complete automatic verification.
There are some initiatives to combat this. For example,
the Elm language\footnote{https://elm-lang.org/} enforces semantic versioning by design.
Its type system enables automatic detection of all API changes.
Outside of that, tools for checking semver
in established languages like Python or C++ do not appear to be commonly used in the industry.

Unfortunately, the semantics of the Rust language were also not designed with semver in mind.
Despite this, there are some existing tools for semver checking.
The first of them, \texttt{cargo-breaking}, works on the abstract syntax tree.
The problem here is that to compare API changes, you must navigate two trees at once,
which can get really complex and tedious (especially when checking for moved or removed items), because the abstract syntax tree can change quite a lot
even without any public API changes.
Another issue is that both the language syntax and the structure of the abstract syntax tree
often change as the language develops, which makes maintenance time-consuming.

The second existing tool is \texttt{rust-semverver}, which focuses on
the metadata present in the Rust-specific rlib binary static library format.
Because of that, unfortunately, the user experience is far from ideal,
as it forces the user to use specific unstable versions of the language, and the quality of error messages is limited.

In comparison, the approach of \texttt{cargo-semver-checks}, which writes lints as queries, seems to work really well.
Adding new queries is designed to be quite accessible, and the maintenance comes down to
keeping up with rustdoc API changes, which seems to be about as low-effort as it could be.

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Vision %
@@ -541,7 +540,7 @@ \section{Project baseline}
\item some existing lints had false positives,
\item the codebase was not in a state where new contributors could easily begin making changes
to the project (which is crucial for the project to flourish in the long term).
For example, adding new lints and tests wasn't intuitive and required many manual steps,
the filenames and variable names were not always descriptive enough
and the code lacked comments that explained some of the logic and decisions behind it.
\end{itemize}
@@ -749,7 +748,7 @@ \section{Responsibilities}
https://semver.org/

% State of the art references:
\bibitem[1]{beaman} Predrag Gruevski,
\textit{Toward fearless cargo update} (2022) \\
https://predr.ag/blog/toward-fearless-cargo-update/
