Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Avoid calling normalize_path with relative paths that extend beyond the current directory #3013

Merged
merged 2 commits into from
Apr 12, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
19 changes: 19 additions & 0 deletions Cargo.lock

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

1 change: 1 addition & 0 deletions Cargo.toml
Original file line number Diff line number Diff line change
Expand Up @@ -99,6 +99,7 @@ miette = { version = "7.2.0" }
nanoid = { version = "0.4.0" }
once_cell = { version = "1.19.0" }
owo-colors = { version = "4.0.0" }
path-absolutize = { version = "3.1.1" }
pathdiff = { version = "0.2.1" }
petgraph = { version = "0.6.4" }
platform-info = { version = "2.0.2" }
Expand Down
10 changes: 7 additions & 3 deletions crates/pep508-rs/src/verbatim_url.rs
Original file line number Diff line number Diff line change
Expand Up @@ -36,11 +36,13 @@ impl VerbatimUrl {
}

/// Create a [`VerbatimUrl`] from a file path.
///
/// Assumes that the path is absolute.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we panic if it isn't?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah.

pub fn from_path(path: impl AsRef<Path>) -> Self {
let path = path.as_ref();

// Normalize the path.
let path = normalize_path(path);
let path = normalize_path(path).expect("path is absolute");

// Extract the fragment, if it exists.
let (path, fragment) = split_fragment(&path);
Expand Down Expand Up @@ -76,7 +78,7 @@ impl VerbatimUrl {
};

// Normalize the path.
let path = normalize_path(path);
let path = normalize_path(&path).expect("path is absolute");

// Extract the fragment, if it exists.
let (path, fragment) = split_fragment(&path);
Expand Down Expand Up @@ -110,7 +112,9 @@ impl VerbatimUrl {
};

// Normalize the path.
let path = normalize_path(path);
let Ok(path) = normalize_path(&path) else {
return Err(VerbatimUrlError::RelativePath(path));
};

// Extract the fragment, if it exists.
let (path, fragment) = split_fragment(&path);
Expand Down
1 change: 1 addition & 0 deletions crates/uv-fs/Cargo.toml
Original file line number Diff line number Diff line change
Expand Up @@ -22,6 +22,7 @@ fs-err = { workspace = true }
fs2 = { workspace = true }
junction = { workspace = true }
once_cell = { workspace = true }
path-absolutize = { workspace = true }
tempfile = { workspace = true }
tokio = { workspace = true, optional = true }
tracing = { workspace = true }
Expand Down
54 changes: 49 additions & 5 deletions crates/uv-fs/src/path.rs
Original file line number Diff line number Diff line change
Expand Up @@ -84,8 +84,21 @@ pub fn normalize_url_path(path: &str) -> Cow<'_, str> {
/// Normalize a path, removing things like `.` and `..`.
///
/// Source: <https://github.com/rust-lang/cargo/blob/b48c41aedbd69ee3990d62a0e2006edbb506a480/crates/cargo-util/src/paths.rs#L76C1-L109C2>
pub fn normalize_path(path: impl AsRef<Path>) -> PathBuf {
let mut components = path.as_ref().components().peekable();
///
/// CAUTION: Assumes that the path is already absolute.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here. Can we panic if it isn't absolute?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah.

///
/// CAUTION: This does not resolve symlinks (unlike
/// [`std::fs::canonicalize`]). This may cause incorrect or surprising
/// behavior at times. This should be used carefully. Unfortunately,
/// [`std::fs::canonicalize`] can be hard to use correctly, since it can often
/// fail, or on Windows returns annoying device paths.
///
/// # Errors
///
/// When a relative path is provided with `..` components that extend beyond the base directory.
/// For example, `./a/../../b` cannot be normalized because it escapes the base directory.
pub fn normalize_path(path: &Path) -> Result<PathBuf, std::io::Error> {
let mut components = path.components().peekable();
let mut ret = if let Some(c @ Component::Prefix(..)) = components.peek().copied() {
components.next();
PathBuf::from(c.as_os_str())
Expand All @@ -101,14 +114,29 @@ pub fn normalize_path(path: impl AsRef<Path>) -> PathBuf {
}
Component::CurDir => {}
Component::ParentDir => {
ret.pop();
if !ret.pop() {
return Err(std::io::Error::new(
std::io::ErrorKind::InvalidInput,
"cannot normalize a relative path beyond the base directory",
));
}
}
Component::Normal(c) => {
ret.push(c);
}
}
}
ret
Ok(ret)
}

/// Convert a path to an absolute path, relative to the current working directory.
///
/// Unlike [`std::fs::canonicalize`], this function does not resolve symlinks and does not require
/// the path to exist.
pub fn absolutize_path(path: &Path) -> Result<Cow<Path>, std::io::Error> {
use path_absolutize::Absolutize;

path.absolutize_from(&*CWD)
}

/// Like `fs_err::canonicalize`, but with permissive failures on Windows.
Expand Down Expand Up @@ -235,7 +263,7 @@ mod tests {
use super::*;

#[test]
fn normalize() {
fn test_normalize_url() {
if cfg!(windows) {
assert_eq!(
normalize_url_path("/C:/Users/ferris/wheel-0.42.0.tar.gz"),
Expand Down Expand Up @@ -272,4 +300,20 @@ mod tests {
);
}
}

#[test]
fn test_normalize_path() {
let path = Path::new("/a/b/../c/./d");
let normalized = normalize_path(path).unwrap();
assert_eq!(normalized, Path::new("/a/c/d"));

let path = Path::new("/a/../c/./d");
let normalized = normalize_path(path).unwrap();
assert_eq!(normalized, Path::new("/c/d"));

// This should be an error.
let path = Path::new("/a/../../c/./d");
let err = normalize_path(path).unwrap_err();
assert_eq!(err.kind(), std::io::ErrorKind::InvalidInput);
}
}
4 changes: 1 addition & 3 deletions crates/uv-interpreter/src/find_python.rs
Original file line number Diff line number Diff line change
Expand Up @@ -6,7 +6,6 @@ use std::path::PathBuf;
use tracing::{debug, instrument};

use uv_cache::Cache;
use uv_fs::normalize_path;
use uv_toolchain::PythonVersion;

use crate::interpreter::InterpreterInfoError;
Expand Down Expand Up @@ -52,8 +51,7 @@ pub fn find_requested_python(request: &str, cache: &Cache) -> Result<Option<Inte
Interpreter::query(executable, cache).map(Some)
} else {
// `-p /home/ferris/.local/bin/python3.10`
let executable = normalize_path(request);

let executable = uv_fs::absolutize_path(request.as_ref())?;
Interpreter::query(executable, cache).map(Some)
}
}
Expand Down
4 changes: 1 addition & 3 deletions crates/uv-interpreter/src/interpreter.rs
Original file line number Diff line number Diff line change
Expand Up @@ -488,12 +488,10 @@ impl InterpreterInfo {
/// unless the Python executable changes, so we use the executable's last modified
/// time as a cache key.
pub(crate) fn query_cached(executable: &Path, cache: &Cache) -> Result<Self, Error> {
let executable_bytes = executable.as_os_str().as_encoded_bytes();
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

\cc @BurntSushi - I changed this to just hash the path directly. Not sure if it's equivalent?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm trying to grok why you need the hashes to be the same here. I'm not understanding that park. Like, even if this generates a different hash, that seems okay, it just means you'll get a cache miss? Or is there something deeper here that I'm missing?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They don't need to be the same!

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was mostly a Chesterton's Fence of trying to understand why we did this in the first place.


let cache_entry = cache.entry(
CacheBucket::Interpreter,
"",
format!("{}.msgpack", digest(&executable_bytes)),
format!("{}.msgpack", digest(&executable)),
);

let modified = Timestamp::from_path(uv_fs::canonicalize_executable(executable)?)?;
Expand Down
1 change: 1 addition & 0 deletions req.txt
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
httpx @ git+https://github.com/encode/httpx@master
Loading