feat: Add support for range mappings proposal #77

Merged
merged 64 commits into from
Mar 20, 2024
Changes from 63 commits
Commits
327c2af
rsm
kdy1 Jan 25, 2024
879fb1b
range_mapping
kdy1 Jan 25, 2024
9de6692
range_mapping
kdy1 Jan 25, 2024
b13e00c
error variant
kdy1 Jan 25, 2024
7558335
rmi_result
kdy1 Jan 25, 2024
502e9a9
fix compilation
kdy1 Jan 25, 2024
9292815
json
kdy1 Jan 25, 2024
0cc6b20
WIP
kdy1 Jan 25, 2024
cecc1bf
error variant
kdy1 Jan 25, 2024
88a054c
decode_rmi
kdy1 Jan 25, 2024
9631444
is_range
kdy1 Jan 25, 2024
4935322
u32
kdy1 Jan 25, 2024
24cda13
doc
kdy1 Jan 25, 2024
26eeeed
fix rmi decoding
kdy1 Jan 25, 2024
3d43ddf
minus one
kdy1 Jan 25, 2024
f517b36
Add a test for decoder
kdy1 Jan 25, 2024
e4ad541
Remove dbg
kdy1 Jan 25, 2024
a7f241e
serialize range mappings
kdy1 Jan 25, 2024
46d337d
Add a test
kdy1 Jan 25, 2024
65ba538
Remove useless TODO
kdy1 Jan 25, 2024
5cd2046
review
kdy1 Jan 25, 2024
4535a2b
repeat empty string
kdy1 Jan 25, 2024
fc2ee72
Fix doc CI
kdy1 Jan 25, 2024
1a1dbc5
Dep on bitvec
kdy1 Jan 26, 2024
af19537
fix decode_rmi
kdy1 Jan 28, 2024
be20676
Use it
kdy1 Jan 28, 2024
3b72696
to: &mut BitVec
kdy1 Jan 29, 2024
59b3241
Fix partial eq impl
kdy1 Jan 29, 2024
0834058
Debug impl
kdy1 Jan 29, 2024
26c6280
Add safety comment
kdy1 Jan 29, 2024
3f80f13
Declare `encode_rmi`
kdy1 Jan 29, 2024
0be3f85
Declare encode_byte
kdy1 Jan 29, 2024
a8d6470
encode_rmi
kdy1 Jan 29, 2024
a040b52
Patch encoder to support multiple mappings
kdy1 Jan 30, 2024
8383467
Remove dbg
kdy1 Jan 30, 2024
6f5a4ae
Fix decoder
kdy1 Jan 30, 2024
3cf9dd6
Use binary search
kdy1 Jan 31, 2024
bc2a237
Remove comments
kdy1 Jan 31, 2024
38bb516
assert 1-based index
kdy1 Jan 31, 2024
2ab0d3f
Use lsb and remove reverse
kdy1 Jan 31, 2024
37cdbc7
Add tests
kdy1 Jan 31, 2024
3150d83
Use 0-based index
kdy1 Feb 1, 2024
38785c3
fix test
kdy1 Feb 1, 2024
09b72f3
lint
kdy1 Feb 1, 2024
b0d9e1d
resize
kdy1 Feb 28, 2024
f4867b1
panic => error
kdy1 Feb 28, 2024
46d64fa
store_le
kdy1 Feb 28, 2024
f70a754
unsafe
kdy1 Feb 28, 2024
a025248
`RawToken.is_range`
kdy1 Mar 4, 2024
92ce26d
Update test
kdy1 Mar 4, 2024
af22bf6
Make encoding efficient
kdy1 Mar 4, 2024
f80aa98
dynamic size
kdy1 Mar 4, 2024
f8d1edd
Use resize
kdy1 Mar 4, 2024
544605d
fix
kdy1 Mar 4, 2024
e53ed5f
Fix unit test
kdy1 Mar 4, 2024
ca6d0c5
fix clipy
kdy1 Mar 4, 2024
b6fd204
Update test refs
kdy1 Mar 13, 2024
d8e5bbe
Add a test
kdy1 Mar 13, 2024
2fedd9f
fixup
kdy1 Mar 13, 2024
8538bd6
offset
kdy1 Mar 13, 2024
a7f3356
saturate
kdy1 Mar 13, 2024
95ff8b8
Update test
kdy1 Mar 13, 2024
b5c4ca8
Fix test & logic
kdy1 Mar 13, 2024
df6cc81
Merge branch 'master' into range-mapping
kdy1 Mar 19, 2024
1 change: 1 addition & 0 deletions Cargo.toml
@@ -33,6 +33,7 @@ scroll = { version = "0.10.1", features = ["derive"], optional = true }
data-encoding = "2.3.3"
debugid = {version = "0.8.0", features = ["serde"] }
base64-simd = { version = "0.7" }
bitvec = "1.0.1"

[build-dependencies]
rustc_version = "0.2.3"
12 changes: 11 additions & 1 deletion src/builder.rs
@@ -181,6 +181,7 @@ impl SourceMapBuilder {
}

/// Adds a new mapping to the builder.
#[allow(clippy::too_many_arguments)]
pub fn add(
&mut self,
dst_line: u32,
@@ -189,8 +190,11 @@ impl SourceMapBuilder {
src_col: u32,
source: Option<&str>,
name: Option<&str>,
is_range: bool,
) -> RawToken {
self.add_with_id(dst_line, dst_col, src_line, src_col, source, !0, name)
self.add_with_id(
dst_line, dst_col, src_line, src_col, source, !0, name, is_range,
)
}

#[allow(clippy::too_many_arguments)]
@@ -203,6 +207,7 @@ impl SourceMapBuilder {
source: Option<&str>,
source_id: u32,
name: Option<&str>,
is_range: bool,
) -> RawToken {
let src_id = match source {
Some(source) => self.add_source_with_id(source, source_id),
@@ -219,12 +224,14 @@ impl SourceMapBuilder {
src_col,
src_id,
name_id,
is_range,
};
self.tokens.push(raw);
raw
}

/// Adds a new mapping to the builder.
#[allow(clippy::too_many_arguments)]
Member

TBH, clippy is absolutely right to complain here. Trying to review the tests for this on github, I completely lose track of what all these parameters mean (without inlay hints).

Contributor Author

Yeah, but I'm not sure how I should fix it.

Member

Yes, me neither :-( I think this is fine this way, just wanted to call it out as a future improvement opportunity.

Contributor

Yeah this is definitely on the list for an eventual rework. These signatures can't stay like this.
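
One common way to tame signatures like these is to pass a parameter struct instead of positional arguments, so call sites name every field. The following is only a sketch of that idea: `MappingParams` and `describe_mapping` are hypothetical names invented for illustration and are not part of this crate or of the rework the maintainers have in mind.

```rust
/// Hypothetical parameter struct (an assumption, not the crate's API).
/// Field names mirror the arguments of `SourceMapBuilder::add` in this PR.
#[derive(Debug, Clone, Copy, Default, PartialEq, Eq)]
pub struct MappingParams<'a> {
    pub dst_line: u32,
    pub dst_col: u32,
    pub src_line: u32,
    pub src_col: u32,
    pub source: Option<&'a str>,
    pub name: Option<&'a str>,
    pub is_range: bool,
}

/// Stand-in for a builder method that would accept the struct.
/// It only formats the mapping so the example stays self-contained.
pub fn describe_mapping(p: MappingParams<'_>) -> String {
    format!(
        "{}:{} -> {}:{} (range: {})",
        p.dst_line, p.dst_col, p.src_line, p.src_col, p.is_range
    )
}
```

With struct update syntax a caller would only spell out the fields that differ from the defaults, for example `MappingParams { dst_line: 1, dst_col: 4, is_range: true, ..Default::default() }`, which keeps tests readable without inlay hints.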

pub fn add_raw(
&mut self,
dst_line: u32,
@@ -233,6 +240,7 @@ impl SourceMapBuilder {
src_col: u32,
source: Option<u32>,
name: Option<u32>,
is_range: bool,
) -> RawToken {
let src_id = source.unwrap_or(!0);
let name_id = name.unwrap_or(!0);
@@ -243,6 +251,7 @@ impl SourceMapBuilder {
src_col,
src_id,
name_id,
is_range,
};
self.tokens.push(raw);
raw
@@ -260,6 +269,7 @@ impl SourceMapBuilder {
token.get_source(),
token.get_src_id(),
name,
token.is_range(),
)
}

62 changes: 60 additions & 2 deletions src/decoder.rs
@@ -1,6 +1,9 @@
use std::io;
use std::io::{BufReader, Read};

use bitvec::field::BitField;
use bitvec::order::Lsb0;
use bitvec::vec::BitVec;
use serde_json::Value;

use crate::errors::{Error, Result};
@@ -120,6 +123,29 @@ pub fn strip_junk_header(slice: &[u8]) -> io::Result<&[u8]> {
Ok(&slice[slice.len()..])
}

/// Decodes a range mapping bitfield string into an index
fn decode_rmi(rmi_str: &str, val: &mut BitVec<u8, Lsb0>) -> Result<()> {
    val.clear();
    val.resize(rmi_str.len() * 6, false);

    for (idx, &byte) in rmi_str.as_bytes().iter().enumerate() {
        let byte = match byte {
            b'A'..=b'Z' => byte - b'A',
            b'a'..=b'z' => byte - b'a' + 26,
            b'0'..=b'9' => byte - b'0' + 52,
            b'+' => 62,
            b'/' => 63,
            _ => {
                fail!(Error::InvalidBase64(byte as char));
            }
        };

        val[6 * idx..6 * (idx + 1)].store_le::<u8>(byte);
    }

    Ok(())
}

pub fn decode_regular(rsm: RawSourceMap) -> Result<SourceMap> {
let mut dst_col;
let mut src_id = 0;
@@ -129,20 +155,28 @@ pub fn decode_regular(rsm: RawSourceMap) -> Result<SourceMap> {

let names = rsm.names.unwrap_or_default();
let sources = rsm.sources.unwrap_or_default();
let range_mappings = rsm.range_mappings.unwrap_or_default();
let mappings = rsm.mappings.unwrap_or_default();
let allocation_size = mappings.matches(&[',', ';'][..]).count() + 10;
let mut tokens = Vec::with_capacity(allocation_size);

let mut nums = Vec::with_capacity(6);
let mut rmi = BitVec::new();

for (dst_line, line) in mappings.split(';').enumerate() {
for (dst_line, (line, rmi_str)) in mappings
.split(';')
.zip(range_mappings.split(';').chain(std::iter::repeat("")))
.enumerate()
{
if line.is_empty() {
continue;
}

dst_col = 0;

for segment in line.split(',') {
decode_rmi(rmi_str, &mut rmi)?;

for (line_index, segment) in line.split(',').enumerate() {
if segment.is_empty() {
continue;
}
@@ -176,13 +210,16 @@ pub fn decode_regular(rsm: RawSourceMap) -> Result<SourceMap> {
}
}

let is_range = rmi.get(line_index).map(|v| *v).unwrap_or_default();

tokens.push(RawToken {
dst_line: dst_line as u32,
dst_col,
src_line,
src_col,
src_id: src,
name_id: name,
is_range,
});
}
}
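
The line pairing in `decode_regular` relies on `zip` stopping when its first iterator ends, while `chain(std::iter::repeat(""))` pads the range-mapping side so a missing or short `rangeMappings` string behaves as "no range bits". A minimal sketch of just that pairing, with `pair_lines` as a hypothetical helper name that does not appear in the PR:

```rust
/// Pairs each `;`-separated line of `mappings` with the matching line of
/// `range_mappings`, padding with "" when `range_mappings` has fewer lines.
/// `pair_lines` is an illustrative helper, not a function from the crate.
fn pair_lines<'a>(mappings: &'a str, range_mappings: &'a str) -> Vec<(&'a str, &'a str)> {
    mappings
        .split(';')
        // `repeat("")` never ends, but `zip` stops once `mappings` is exhausted.
        .zip(range_mappings.split(';').chain(std::iter::repeat("")))
        .collect()
}
```

Because the padding lives on the second iterator only, extra lines in `range_mappings` beyond the last mappings line are silently ignored in this sketch.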
@@ -311,3 +348,24 @@ fn test_bad_newline() {
}
}
}

#[test]
fn test_decode_rmi() {
    fn decode(rmi_str: &str) -> Vec<usize> {
        let mut out = bitvec::bitvec![u8, Lsb0; 0; 0];
        decode_rmi(rmi_str, &mut out).expect("failed to decode");

        let mut res = vec![];
        for (idx, bit) in out.iter().enumerate() {
            if *bit {
                res.push(idx);
            }
        }
        res
    }

    // This is 0-based index of the bits
    assert_eq!(decode("AAB"), vec![12]);
    assert_eq!(decode("g"), vec![5]);
    assert_eq!(decode("Bg"), vec![0, 11]);
}
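
The expected values in this test follow from the bit layout: each base64 character carries six bits, least significant bit first, so bit `b` of character `p` corresponds to segment index `6 * p + b`. The sketch below reproduces the same decoding with the standard library only, to make the arithmetic easy to check. `decode_range_indices` is a hypothetical name, and returning the offending character as the error is a simplification of the PR's `Error::InvalidBase64`.

```rust
/// Returns the 0-based indices of the set bits in a range-mapping string,
/// or the first character that is not in the base64 alphabet.
/// Illustrative only; the PR itself decodes into a `BitVec`.
fn decode_range_indices(rmi: &str) -> Result<Vec<usize>, char> {
    let mut indices = Vec::new();
    for (pos, byte) in rmi.bytes().enumerate() {
        // Map the character to its 6-bit value.
        let sextet = match byte {
            b'A'..=b'Z' => byte - b'A',
            b'a'..=b'z' => byte - b'a' + 26,
            b'0'..=b'9' => byte - b'0' + 52,
            b'+' => 62,
            b'/' => 63,
            _ => return Err(byte as char),
        };
        // Least significant bit first: bit 0 is the first segment of the group.
        for bit in 0..6usize {
            if ((sextet >> bit) & 1) == 1 {
                indices.push(pos * 6 + bit);
            }
        }
    }
    Ok(indices)
}
```

For example, `"g"` has the value 32 (`0b100000`), so only bit 5 is set; in `"Bg"` the `B` contributes bit 0 and the `g` contributes bit 6 + 5 = 11.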
100 changes: 100 additions & 0 deletions src/encoder.rs
kdy1 marked this conversation as resolved.
@@ -1,5 +1,8 @@
use std::io::Write;

use bitvec::field::BitField;
use bitvec::order::Lsb0;
use bitvec::view::BitView;
use serde_json::Value;

use crate::errors::Result;
@@ -21,6 +24,78 @@ fn encode_vlq_diff(out: &mut String, a: u32, b: u32) {
encode_vlq(out, i64::from(a) - i64::from(b))
}

fn encode_rmi(out: &mut Vec<u8>, data: &mut Vec<u8>) {
    fn encode_byte(b: u8) -> u8 {
        match b {
            0..=25 => b + b'A',
            26..=51 => b + b'a' - 26,
            52..=61 => b + b'0' - 52,
            62 => b'+',
            63 => b'/',
            _ => panic!("invalid byte"),
        }
    }

    let bits = data.view_bits_mut::<Lsb0>();

    // trim zero at the end
    let mut last = 0;
    for (idx, bit) in bits.iter().enumerate() {
        if *bit {
            last = idx;
        }
    }
    let bits = &mut bits[..last + 1];

    for byte in bits.chunks(6) {
        let byte = byte.load::<u8>();

        let encoded = encode_byte(byte);

        out.push(encoded);
    }
}

fn serialize_range_mappings(sm: &SourceMap) -> Option<String> {
    let mut buf = Vec::new();
    let mut prev_line = 0;
    let mut had_rmi = false;

    let mut idx_of_first_in_line = 0;

    let mut rmi_data = Vec::<u8>::new();

    for (idx, token) in sm.tokens().enumerate() {
        if token.is_range() {
            had_rmi = true;

            let num = idx - idx_of_first_in_line;

            rmi_data.resize(rmi_data.len() + 2, 0);

            let rmi_bits = rmi_data.view_bits_mut::<Lsb0>();
            rmi_bits.set(num, true);
        }

        while token.get_dst_line() != prev_line {
            if had_rmi {
                encode_rmi(&mut buf, &mut rmi_data);
                rmi_data.clear();
            }

            buf.push(b';');
            prev_line += 1;
            had_rmi = false;
            idx_of_first_in_line = idx;
        }
    }
    if had_rmi {
        encode_rmi(&mut buf, &mut rmi_data);
    }

    Some(String::from_utf8(buf).expect("invalid utf8"))
}

fn serialize_mappings(sm: &SourceMap) -> String {
let mut rv = String::new();
// dst == minified == generated
@@ -89,6 +164,7 @@ impl Encodable for SourceMap {
sources_content: if have_contents { Some(contents) } else { None },
sections: None,
names: Some(self.names().map(|x| Value::String(x.to_string())).collect()),
range_mappings: serialize_range_mappings(self),
mappings: Some(serialize_mappings(self)),
x_facebook_offsets: None,
x_metro_module_paths: None,
@@ -121,6 +197,7 @@ impl Encodable for SourceMapIndex {
.collect(),
),
names: None,
range_mappings: None,
mappings: None,
x_facebook_offsets: None,
x_metro_module_paths: None,
@@ -139,3 +216,26 @@ impl Encodable for DecodedMap {
}
}
}

#[test]
fn test_encode_rmi() {
    fn encode(indices: &[usize]) -> String {
        let mut out = vec![];

        // Fill with zeros while testing
        let mut data = vec![0; 256];

        let bits = data.view_bits_mut::<Lsb0>();
        for &i in indices {
            bits.set(i, true);
        }

        encode_rmi(&mut out, &mut data);
        String::from_utf8(out).unwrap()
    }

    // This is 0-based index
    assert_eq!(encode(&[12]), "AAB");
    assert_eq!(encode(&[5]), "g");
    assert_eq!(encode(&[0, 11]), "Bg");
}
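
The encoder is the inverse of the decoder: set bit `idx % 6` in group `idx / 6`, then emit one base64 character per group, dropping trailing all-zero groups. A standard-library sketch of that arithmetic follows. `encode_range_indices` is a hypothetical name and this is not the PR's implementation, which works on a `bitvec` view of a byte buffer. One deliberate difference, stated as an assumption: this sketch returns an empty string when no index is given.

```rust
/// Encodes 0-based segment indices as a base64 bitfield string,
/// six bits per character, least significant bit first.
/// Illustrative only; names and the empty-input behaviour are assumptions.
fn encode_range_indices(indices: &[usize]) -> String {
    const ALPHABET: &[u8; 64] =
        b"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

    // One 6-bit group per output character, sized to hold the highest index,
    // so no trailing zero groups are ever produced.
    let len = indices.iter().max().map_or(0, |&max| max / 6 + 1);
    let mut sextets = vec![0u8; len];
    for &idx in indices {
        sextets[idx / 6] |= 1u8 << (idx % 6);
    }

    sextets
        .iter()
        .map(|&s| ALPHABET[s as usize] as char)
        .collect()
}
```

Index 12 falls in the third group (12 / 6 = 2) at bit 0, giving two zero groups (`A`, `A`) followed by the value 1 (`B`), which matches the `"AAB"` assertion in the test.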
12 changes: 12 additions & 0 deletions src/errors.rs
@@ -45,6 +45,10 @@ pub enum Error {
InvalidRamBundleEntry,
/// Tried to operate on a non RAM bundle file
NotARamBundle,
/// Range mapping index is invalid
InvalidRangeMappingIndex(data_encoding::DecodeError),
Comment on lines +48 to +49
Member

Just a NOTE: this is a breaking change as Error is not (yet) marked as #[non_exhaustive]

Contributor

I would say if we're ok with making a breaking change here, then we should also put the information whether a token is a range token directly on RawToken.
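
For context on the note above: when an enum is marked `#[non_exhaustive]`, code in other crates must include a wildcard arm when matching on it, which is why adding variants later is no longer a breaking change. The sketch below uses a hypothetical `DemoError` enum as a stand-in for the crate's `Error`. Within a single crate the attribute does not force the wildcard arm, so the match is simply written the way a downstream crate would have to write it.

```rust
/// Hypothetical error enum standing in for the crate's `Error`.
#[non_exhaustive]
#[derive(Debug, PartialEq)]
pub enum DemoError {
    NotARamBundle,
    InvalidBase64(char),
}

/// Matches the way a downstream crate would be required to: the wildcard
/// arm absorbs any variant added in a later release.
pub fn describe_error(err: &DemoError) -> String {
    match err {
        DemoError::InvalidBase64(c) => format!("invalid base64 character: {c}"),
        _ => "other error".to_string(),
    }
}
```

Without the attribute, a downstream `match` that lists every variant stops compiling as soon as a variant such as `InvalidBase64` is added, which is the breakage the reviewer is pointing at.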


InvalidBase64(char),
}

impl From<io::Error> for Error {
@@ -78,6 +82,12 @@ impl From<serde_json::Error> for Error {
}
}

impl From<data_encoding::DecodeError> for Error {
fn from(err: data_encoding::DecodeError) -> Error {
Error::InvalidRangeMappingIndex(err)
}
}

impl error::Error for Error {
fn cause(&self) -> Option<&dyn error::Error> {
match *self {
@@ -114,6 +124,8 @@ impl fmt::Display for Error {
Error::InvalidRamBundleIndex => write!(f, "invalid module index in ram bundle"),
Error::InvalidRamBundleEntry => write!(f, "invalid ram bundle module entry"),
Error::NotARamBundle => write!(f, "not a ram bundle"),
Error::InvalidRangeMappingIndex(err) => write!(f, "invalid range mapping index: {err}"),
Error::InvalidBase64(c) => write!(f, "invalid base64 character: {}", c),
}
}
}
2 changes: 2 additions & 0 deletions src/jsontypes.rs
@@ -42,6 +42,8 @@ pub struct RawSourceMap {
pub sections: Option<Vec<RawSection>>,
#[serde(skip_serializing_if = "Option::is_none")]
pub names: Option<Vec<Value>>,
#[serde(rename = "rangeMappings", skip_serializing_if = "Option::is_none")]
pub range_mappings: Option<String>,
#[serde(skip_serializing_if = "Option::is_none")]
pub mappings: Option<String>,
#[serde(skip_serializing_if = "Option::is_none")]
1 change: 1 addition & 0 deletions src/ram_bundle.rs
@@ -402,6 +402,7 @@ impl<'a> SplitRamBundleModuleIter<'a> {
token.get_src_col(),
token.get_source(),
token.get_name(),
false,
);
if token.get_source().is_some() && !builder.has_source_contents(raw.src_id) {
builder.set_source_contents(