Replacement for tuxi: *oi* and *iris* #197
I think I managed this project very poorly and it can be very useful.
I'm thinking of rewriting the whole code in a readable manner. I'll start early next month and try to strike the best balance of performance and modular code that can be written in POSIX-compliant shell.
Really sorry I haven't been around for the past 7-8 months (had a bunch of IRL stuff going on, not been online at all). Before I went away I wrote a fork of tuxi in Rust, and I have updated it this week to make sure everything is working in it again. If you just want a working tool for now you could use that. If you check the selectors.rs file and all the files in the selectors directory, they contain all the updated IDs you need to feed pup to get the various bits of info from the HTML (saves you having to spend time in Chrome dev tools looking it up yourself). Again, sorry about kinda ruining this project, most of this is on me. Just let me know if you need anything, happy to either help or stay out of your way.
I'll try to lend a hand with small stuff. Just recently I thought I wouldn't ever have to open YouTube again thanks to tuxi (cuz with mpv and yt-dlp I could just play vids directly from the CLI), but then just a few days later the YouTube link parsing stopped working T.T P.S. Hopefully you'd pick Rust as the main language :D
All right, so thinking this through over the last few weeks (while I was studying for a few exams), I came to the conclusion that we basically have two options here. The first, and of course the easiest, would be to just create a new project with the same features tuxi offers now, but with simpler code in which we can easily maintain the selectors and the most essential functions. It would be written in shell script, so here are a few things to consider: Pros:
Cons:
The second would be to write it in a compiled language such as C, C++ or Rust, for performance and a richer feature set, so here are a few things to consider: Pros:
Cons:
Let's not try to make this a vote issue. We can definitely work through this with conversation.
Now, the thing about tuxi is that we feed it a query and it tries to find something about that query. If we limit it to just Google for simplicity, we don't have an efficient tool, because it could do way more than it does. Of course, this creates a priority problem that we already faced before PR #162, but if we choose the second option it wouldn't be too hard to actually fix it: maybe we could add an option that tries to find the best source based on what the query says, maybe we could let the user choose the type of source they want to use, etc. It can have all of those features without losing performance, that's the point.
Of course I'm not an expert, nor knowledgeable enough to write the best code in C, C++ or Rust from scratch (maybe nobody here is), but as long as we can keep improving it, that won't be much of an issue.
Waiting for your responses. If nobody responds (sadly it is an option), I'll just choose the second option and the language Rust -- maybe not...
And, @simdimdim, I believe there's already a CLI tool for that.
Hey dude, do you want to check this out? It is basically tuxi but written in Rust. I've just updated it and it should be bug-free (as far as I'm aware). Let me know what you think of it: any improvements to be made, stuff missing, whether you could work with it, etc.
@PureArtistry yeah, it is pretty nice. Since you're already programming it in Rust, we could definitely use it as a base. Maybe the name too if you want to. The thing is, the word 'oi' is 'hi' in Portuguese, so it's kinda strange to type it in; what do you think about "iris"? Then we can create an organization already. (Edit: though there's already someone named iris on GitHub...)
How we handle other sources in the code is something to think about when we actually have other things to add (open to ideas for that btw). As for the name, oi is what I used to alias tuxi to, because tuxi is a little awkward to type on a QWERTY keyboard, whereas o and i are right next to each other. Also, oi in English is used kinda like hi, as a way to grab someone's attention to ask them a question. I don't mind changing the name (or using tuxi) - I do quite like oi though.
Iris is such a cool name... that's pretty much why.
And to note, I'll already start looking at the source code to see how everything is being handled, so I can start recreating/maintaining it and adding more sources (primarily for lyrics, since Google doesn't scrape most of the websites).
The reason I suggested Rust is because I've already worked on something similar (using different selectors according to a site/purpose) here(trait) and here and here(usage). The concept is to implement a trait for a new struct/selector (in an add-on), then just stick it in a HashMap<domain, Box>. In the case of an assistant it should be more along the lines of Map<(command,domain), Box> I guess, but the idea is the same. P.S. As an anime fan I find 'oi' (it's basically a 'hey') quite cute as a name. Lastly, I'm more in favor of using a compiled language, specifically Rust, which with a requests lib and an HTML parsing lib is very easy to work with in my opinion.
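A minimal sketch of the trait-plus-map idea from the comment above, using only the standard library. Everything named here is an assumption for illustration: the `Source` trait, the `Between` struct, the `Box<dyn Source>` value type, and the example commands, domains and markers are hypothetical and are not taken from oi, tuxi or the linked project.

```rust
use std::collections::HashMap;

/// Hypothetical trait: each source/selector set implements it once.
trait Source {
    /// Try to pull an answer out of HTML that was already downloaded.
    fn extract(&self, html: &str) -> Option<String>;
}

/// Illustrative implementation: returns the text between two markers.
/// A real add-on would use an HTML parsing lib instead of string search.
struct Between {
    start: &'static str,
    end: &'static str,
}

impl Source for Between {
    fn extract(&self, html: &str) -> Option<String> {
        let from = html.find(self.start)? + self.start.len();
        let len = html[from..].find(self.end)?;
        Some(html[from..from + len].trim().to_string())
    }
}

/// Map<(command, domain), Box<dyn Source>>; owned Strings keep lookups simple.
type Registry = HashMap<(String, String), Box<dyn Source>>;

/// Register one source under a (command, domain) pair.
fn register(registry: &mut Registry, command: &str, domain: &str, source: Box<dyn Source>) {
    registry.insert((command.to_string(), domain.to_string()), source);
}

/// Build a registry with two made-up entries (domains and markers are placeholders).
fn build_registry() -> Registry {
    let mut registry = Registry::new();
    register(&mut registry, "lyrics", "example.org", Box::new(Between { start: "<pre>", end: "</pre>" }));
    register(&mut registry, "define", "example.com", Box::new(Between { start: "<dd>", end: "</dd>" }));
    registry
}

/// Look up the source for (command, domain) and run it on the page.
fn answer(registry: &Registry, command: &str, domain: &str, html: &str) -> Option<String> {
    let key = (command.to_string(), domain.to_string());
    registry.get(&key)?.extract(html)
}
```

Adding a new source under this layout would mean writing one more `impl Source` and one more `register` call; the lookup code stays untouched, which is presumably the appeal of the approach.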
My only fear with writing in Rust, and so using libraries as dependencies, is dependency hell. A much bigger ecosystem like JS with npm makes it easy to create this problem; in Rust it isn't that easy for now, but how long until it gets there? For a project like a CLI assistant it shouldn't be too hard if we're handling only simple things, the raw of the raw, nothing more than that. But when we have to create pretty outputs for some scraper functions, like tables, we have to rely on dependencies. Taking for example the current I want to be careful here, because comparing this to C or C++, we know that there it's pretty complicated for you to actually do that, while in Rust the cost of memory safety (+ certain things) creates this other window of bad possibilities. But then there's that thing, a small thing as I consider it, which is other people being able to contribute to such a project. Rust, for now, is a more popular language, but is that really important when the tool just does what it can do in the best way? I think not. Is C++ or C still on the table? I would say yes. It is efficient and, although slower to develop since we would be starting from ground zero, it is still something that I/(we) can make good in almost every possible way... so, any thoughts?
That's actually a pretty cool project, and the code is similar to PureArtistry's, so I guess that's how Rust is able to handle this.
In my opinion Rust is much easier to work with (closer to Python's capacity for rapid development) than C/++ is. As for dependencies, as long as one doesn't have any git repositories or wildcard * versions, there's virtually no chance for breakage to occur as far as I'm aware. There's also still a Cargo.lock for extreme cases. 150 packages doesn't sound like a lot to me (I'd expect 200-400 to be normal for a mature project), since most of them are just dependencies of dependencies. For the most part I'd say we need just reqwest, select, tokio, url, futures and maybe chrono, plus quality-of-life stuff like log and itertools (and maybe something that integrates with OS commands for external app execution).
Thanks, I have been working on it on and off, offline, for some time; I've had more time lately so will probably finish the file handling (on disc and freshly downloaded) soon. And yes, Rust ain't quite like C/++, but I'm yet to hear of it being unable to handle stuff C/++ can. It just does so via different tools/concepts. P.S. I feel like I sound like quite the Rust fanboy, but it's not due to a belief that it's a great language, rather it's because I'm yet to run into any nastiness while working with it. But in the end a language is also just a tool, and it is also my belief that one should choose the optimal tool for the job (so in the end it's a matter of determining what the 'job' is I guess :D)
The job for now is the essential one: scrape a page, then loop through its content with the selectors and print out what was found. For something so simple, of course the best choice is shell script. But eventually we'll go further than that: a prettier output may rely on other things, multiple sources is an idea, and long-term ideas like these are better suited to compiled languages. Using oi as the follow-up example: what can Rust do that C/++ can't, besides being 55mb in binary size and having far more libraries/dependencies? Easier to code, modularity of dependencies? I think I'm relying on the idea that Rust isn't a replacement for C/++ here; we should use it if its benefits outweigh the disadvantages. Nonetheless, relative is relative: for one, easier/faster to code is a feature that can easily become a disadvantage in this situation when we talk about performance. And dependencies? Compared to npm, 150+ packages is nothing; compared to C/++, that's far more libraries than even big C/++ projects have.
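For reference, the "loop through the content with the selectors and print what was found" job described above could be sketched roughly as follows, with the download step left out and only the standard library used. The `Selector` struct, the function names and the HTML markers are all made up for illustration; they are not the real IDs used by tuxi or oi.

```rust
/// A named extractor: a label plus a function that looks for one kind of answer.
struct Selector {
    name: &'static str,
    find: fn(&str) -> Option<String>,
}

/// Return the trimmed text between `start` and `end`, if both markers are present.
fn between(html: &str, start: &str, end: &str) -> Option<String> {
    let from = html.find(start)? + start.len();
    let len = html[from..].find(end)?;
    Some(html[from..from + len].trim().to_string())
}

/// Placeholder extractor for a calculator-style result (marker is hypothetical).
fn find_math(html: &str) -> Option<String> {
    between(html, "<span id=\"result\">", "</span>")
}

/// Placeholder extractor for a dictionary-style result (marker is hypothetical).
fn find_definition(html: &str) -> Option<String> {
    between(html, "<div class=\"definition\">", "</div>")
}

/// Selectors in priority order: earlier entries win when several match.
fn selectors() -> Vec<Selector> {
    vec![
        Selector { name: "math", find: find_math },
        Selector { name: "definition", find: find_definition },
    ]
}

/// Run every selector in order and return the first hit with its label.
fn first_answer(html: &str) -> Option<(&'static str, String)> {
    selectors()
        .into_iter()
        .find_map(|s| (s.find)(html).map(|text| (s.name, text)))
}
```

A caller would print the returned label and text, or report that nothing was found when the result is `None`. The order of the vector is the whole priority policy here, so reordering it changes which answer wins when a page matches more than one selector.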
As far as I'm aware C/++ and Rust are comparable in performance (that's what most articles on the Internet say anyway). Also as far as I'm aware there's little essential difference in what C/++ can do and Rust can't, or vice versa. Not sure what that 55mb binary is about. With Rust I'd say if the code compiles you only need to look for logical errors; meanwhile C/++ is just able to compile quicker, but you still have to keep a lot of things in mind. Also multi-threading in Rust is way easier.
Multi-threading for a cli assistant is too much for what it is supposed to do, it's not happening; it is a less-than-a-second run, let's keep it simple there because it should be simple here.
I mean, one of the points of dependency hell is exactly that, it is having dependencies of dependencies or dependencies of dependencies of dependencies and so forth, all of which you still have to compile in Rust, so it definitely counts.
Static libraries compiled inside the binary, I guess? I will add a few things later on this.
I think you have some misconceptions about how cargo/Rust works. There are a lot of dependencies because the Rust std library is quite small: lots of features built into the C/++ std lib are instead provided as separate crates, and a lot of projects like to break up the various functions of a program into their own crates for clean code reuse. Dependency hell also isn't worse in Rust compared to anything else: every version of every package/crate is stored on crates.io. You can specify exactly which version you want to use in the Cargo.toml file, and after compilation a Cargo.lock file is generated with the details (version and checksum) of every crate used in the compilation; when building a release version it will fetch the exact crates listed in the lock file. Rust binaries are typically larger than the C/++ equivalent, but you're looking at the wrong file for the size.
Well, that's true and that was my mistake, thanks for pointing it out. However, the dependency hell I'm looking at here isn't about versions or dependencies that require certain versions; Cargo and even npm handle that well and I'm aware of it. I'm talking about the number of dependencies -- it is a lot. In C++ specifically that never happens if you take a little care of your project. What is in those projects (dependencies) that asks for so many? Do those dependencies really need it all? Sometimes I think if I dig deeper I can find the equivalent of this on Cargo. Since you're already here, @PureArtistry, what feature of oi is producing this huge dependency tree? The tables feature? Like, to download a source page in C++ I need one library, curl (which is available everywhere), or if we're taking an easier path we can use cpr (two libraries) -- very simple there. What would the equivalent be in Rust, since you're both obviously more experienced with it than me?
Most of the crates in oi are deps of scraper (the HTML parsing lib), if you run I think the number of packages is sort of a cultural thing: Rust as a language was created around having cargo as a package manager and repo for libs. When it's so easy to find, share & use libraries, people tend to write more modular code. Also, the compiler only uses the code from those libs that's needed, so it doesn't affect the final binary, only compilation time and disk space while working with / building the binary. For a curl equivalent in Rust you can use:
I also think trying to minimize the number of dependencies is the wrong thing to focus on; in Rust that's somewhat similar to saying you want to minimize the usage of reusable code. I'm sure Rust also has silly libraries like that, but choosing to use them is an entirely different thing (we're not compelled to use something like 'true' in Rust :D). Usually the number of dependencies a project has is more a measure of how many things the project does (or needs), or how good it is at handling code reuse. Random example: if I recall correctly, random number generation in C++ is part of the std, while in Rust it's its own crate. It's not that std is crippled, it just doesn't need to be part of std.
What kind of powerful general-purpose language, and a systems one to note, doesn't come with the equivalent of rand? Perhaps that's more of a problem of the modularity-driven mindset that Rust is always in.
Code reuse is necessary when it is necessary, like when we need a whole HTTP request implementation that just does what it needs to do -- nobody wants to rewrite such a huge thing for their project only. But not everything needs to be reused, and definitely not everything needs to be its own dependency; we all have the capacity to write certain code specifically for our own project, for efficiency or even for our own modularity of code. Look, if none of you are going to give C++ or C a chance here, we ultimately have to go with Rust, but right now I can't see a single benefit of using it over C++ for this project.
you_know_what, after trying more with Rust for the past few days, I can see some reasons for using it over C++ overall, but I still don't think it will be better for this project in the long run, besides the uglier syntax and bigger binary sizes 😄 (I'm joking, kinda). However, the point of starting this thread is that I wanted to get the best performance and modularity out of our code, not a feature-rich tool built from dependencies for tables or for prettier output of what we scrape, and if there is a language in which this project can be rewritten and in which I can see a future ahead, it is C++. From @PureArtistry at the start,
If I can kindly use this, I would be happy if you actively maintain oi as you want it to be maintained. The thought of having a Rust competitor can give me enough motivation to maintain a better C++ project until it can actually replace tuxi entirely and even, if we get there, oi as well. I will take my time to actually create the first working version of the project in C++, since it is -- you know -- C++, but we all know that doubt kills more dreams than failure ever will, so that's that.
@BeyondMagic, sounds good mate! - the offer for help is always open but my C++ is meh (my brain doesn't do OOP)
Perhaps re-writing or forking it to keep it alive as I think this can be used smartly by some people or new projects.
iris is written in C++ and is supposed to be a replacement for tuxi as well as for oi.
oi is written in Rust and is supposed to be a replacement for tuxi.
Move your ideas, issues, and PRs to one of those projects instead of tuxi, as it is not being actively developed anymore and, for the most part, no longer works.