-
Notifications
You must be signed in to change notification settings - Fork 174
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Async or sync APIs #19
Comments
I think we should aim to have our I/O async, because it enforces a lot of downstream async decisions, and plugging sync into async is easy. The reverse is hard. For async to work we mainly have to decide between |
I think that core OmnICU code shouldn't care about where and how data comes to it. Wrapper layer should take care of sync/async decision. For example, Wasm port would be wrapped in JS with async code that downloads locale data from the server, and passes it down to Wasm layer for final date formatting. |
I'm not sure if its possible. If you have any code that does:
and that code has to be possible to be async, then it trickles down to each and every caller who has to accept an async callsite. There are core algos that are sync input, sync output, sure, but if my constructor takes a |
I think constructors / factories should be async, and methods should be sync. In other words, you can "await" a new DateTimeFormat, and then when the DateTimeFormat is ready, it has all data that it needs ready to go for the format methods. |
Then I would say that your constructors/factories belong to the client wrapper code. The low level constructor should be:
The wrapper should get data part from file/server and serve to the sync, base code. So I agree that some part may have to be async - e.g. Chrome loads all of the data on IO thread, but it doesn't have to go all the way down. |
Yes, that's one option, but I don't know if this is what we want to do. This basically means that we're writing a fully synchronous API which handles dataprovider only from the outside and expects it to be synchronous later. That means that the provider has to have all the data at the moment when its fed to the constructor and cannot be augmented during lifetime and cannot be requested for any data update or additional data download. That may be severely limiting. |
I had discussed the "inline wrapper layer" approach last week (I still need to make an explainer for it), and in that model, it could be possible for data loading to occur exclusively in the wrapper layer. In that case, we could expose user-facing async functions, but the functional exports from OmnICU would be synchronous and have the data already loaded for them. On the other hand, putting data loading in the wrapper layer makes the wrapper layer even more complex than it already is, and it may be limiting to us down the road in unknown ways. |
But aren't we aiming to have immutable objects? Once I create a DateFormatter for sr I don't expect it to change. If we add data for pl after that, it will only affect new objects created with Polish locale. Or am I misunderstanding the issue? |
I think the question of whether data loading goes into the wrapper layer or deeper in the implementation code is tangential to the question of whether data loading is sync or async. And, even if data loading happens in the wrapper layer, it still requires async APIs in Rust. |
I think async requirement was never a question for me. So to answer @kpozin question - async API will be available, from client's perspective. How deep would asyncness go is another question (that we should hash out before implementing first API). |
That's not necessarily the case. I can see a scenario where you may want to pull additional data from the provider or react to some dynamic event during lifetime of the object by updating data in the provider. Two scenarios I have in mind:
Without that option, all data has to be loaded before the provider gets passed to the constructor. That may be limiting and costly. lazy data loading is powerful and allows for significant perf/mem wins.
Without that option, you need to handle data updates separately and invalidate/reinitiate formatters. We could handle that in several ways, including recreating instances on data updates, or having just one event to override dataprovider in the instance once the data loads. That may fix (2), but likely will still be insufficient or clunky for (1). I think we should keep an option open that we'll want to handle async data providers. |
Clients tend to assume calling a format function is synchronous; introducing asynchronicity into every terminal function* call is a huge deal that we shouldn't take lightly. To respond to Zibi's two scenarios:
The formatter should know that it needs currency names for polish in the constructor. The constructor knows the locale, and it knows whether or not currency long names were requested as a formatting options. This is data that should additionally be known by static analysis, as I explained last week.
If the constructor is async, then the data should be loaded before the promise resolves to the immutable formatter object. I'm not convinced this is a scenario we should necessarily plan for, but if you did want to do it, you still can, externally to OmnICU: you can wait for an event to fire when new data is available, and then build new formatters and swap them into place, replacing the old formatters. Also: even if we wanted to cover this scenario within OmnICU, I think doesn't actually require async terminal functions*; the function can check synchronously if updated data is available in a cache somewhere, and then swap it in and use it, all synchronously. * A "terminal function" is a method that you call on a constructed OmnICU object, like |
(updated my previous post a bit) To be clear, I'm all in favor of async data providers; I'm just saying that we should assume we no longer need the data providers after the factory methods return. During the construction phase, OmnICU should get all the data it might need (based on the input options), and store it in the memory space associated with the formatter instance such that terminal method calls can be fast and synchronous. |
I am not suggesting that.
I can see scenarios where ability to lazily pull data would allow us to produce better quality software. It may not require us to make all methods asynchronous tho.
Right, or we could allow for updating providers inside the formatter: async fn on_update() {
let new_provider = fetch_provider().await;
formatter.set_provider(new_provider);
} right? |
This leaves one scenario - lazy data loading. Seems like Shane and Nebojsa are of the opinion that the provider should have all data needed for the constructor available synchronously at construction time. I'm suggestion that there may be a value in having ability to lazily fetch additional data. Not sure yet how it would work, and it may end up being on the wrapper, but something like: async fn create_currency_formatter() -> NumberFormat {
data_provider.ensure_currency().await;
let nf = NumberFormat::new_with_provider(&data_provider);
return nf;
} |
I think Scenario 2 is something that can be built on top of OmnICU pretty easily with the model of async constructor, sync terminal. For example (apologies if my syntax is not correct; still learning): // OmnICU code:
struct ICUCurrencyFormatter {
data: /* a type that includes all data required for this particular formatter */
}
async fn create_icu_currency_formatter(data_provider: &dyn DataProvider) {
let data = data_provider.get_currency_data().await;
return ICUCurrencyFormatter { data: data };
}
// Client code:
struct AutoUpdatingCurrencyFormatter {
p: DataProvider;
f: Option<ICUCurrencyFormatter>;
}
impl AutoUpdatingCurrencyFormatter {
async fn format(&self, num: i64) -> Result<Err, String> {
if (self.f == None || self.p.has_new_data()) {
self.f = create_icu_currency_formatter(self.p).await;
}
return self.f.format(i64);
}
}
// Example call site:
data_provider: MyDataProvider = // ...
auto_fmt = AutoUpdatingCurrencyFormatter {
p: data_provider
};
let result = auto_fmt.format(100).await;
Not exactly; I'm of the opinion that the constructor should be async such that it can use an async data provider. Once the constructor resolves, it should no longer need any more data from the async data provider, and the object can become synchronous from that point forward. (Thus, "async factory, sync terminal")
Yes, that is an example of an async factory method loading data and passing it synchronously into OmnICU. Cira suggested something similar in #19 (comment). This is one possible model we could adopt. Another model is if we put the data resolution a little deeper in the constructor code, rather than up front in the wrapper layer. |
I think if possible all asyncness should be on the provider, and totally up to the provider on how to make work. So individual providers may have pluggable networking hooks, or whatever. We should not be picking async-std or tokio here, at best we should have APIs that use Futures, letting users pick whichever framework they'd like. I'd point out that async IO frameworks like async-std and tokio are mostly useful when you're in a web server context, for applications they're less useful and really just spawning a thread to do the fetch should be enough. |
I'm 100% in agreement about not having any async or risk of failure in the fetching of individual pieces of data". Yes, this means that a message formatter will need to be constructed with all knowledge of any data it might need (which will affect the API for constructing it) but there's really no other option, since lazy loading can always fail, and failure isn't something the APIs should expose to users at that level (e.g. "You 20th message formatter call failed inside the UI thread because data wasn't available for that one locale" - no business logic will have a good response to this). Failure of the data suppliers is all about a failure of system configuration. It's not in the hands of business logic, so business logic shouldn't be expected to code around it. When constructing these things we should think about returning some semantically understandable information about the available data (e.g. ask for Polish formatter, get English - code will not fail at runtime and it's not likely the business logic can do better, but you might want to apologise to the end user). Just returning "this is the locale of the data we have" isn't great since there's a gulf between "pt_PT/RB" and "en_GB/US". Was it a "good" match or a "bad" match is probably the sort of level an end user will care about. |
I'm happy with the conclusion here. Let's focus on keeping async to Provider. |
Should we close this one as resolved and move discussion to actual proposals for providers? |
OK. I think there's not much more to discuss here. I mention in #28 that sync/async is up to the host language and data provider. |
Our data loading involves reading files on demand, and sometimes even requesting data from a REST provider over HTTP.
Is the plan to make all the APIs asynchronous? Or to have two versions of each method?
CC: @sffc @zbraniecki
The text was updated successfully, but these errors were encountered: