Don't require cproject structure #21
Comments
Norma accepts single filenames in the form:
It can also accept a single directory as a
(Note that This is mainly a question of documentation. Note also that norma accepts wildcards, e.g.
The problem with not using directories and reserved names is that the Note also that wrapping Peter Murray-Rust |
I just tested this and for me it doesn't work.
@tarrow what were you expecting? |
I was expecting it to be fine since we didn't specify it was in a CProject. Reserved name shouldn't matter if you aren't using it in a project environment. In any case using foo.xml is even worse:
So, to clarify, I think the answer is that one cannot use norma on just a single input file. It can only be used to convert a single input file to a CProject with a CTree called fulltext.xml (or html etc.), which you can then run norma on again to do the actual conversion. @petermr do you think this is the case? (i.e. |
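The first step of that workaround (wrapping a lone file in a project layout) can be sketched outside norma. This is only an illustration, not norma's own code: the function name `wrap_in_cproject` is hypothetical, and the layout `project/<file stem>/fulltext.<ext>` is an assumption based on the reserved name mentioned in this thread, not something confirmed by the norma documentation.

```python
import shutil
from pathlib import Path


def wrap_in_cproject(source_file, project_dir):
    """Copy one input file into a minimal CProject-style layout.

    Assumed layout (hypothetical, not taken from the norma docs):
        project_dir/<stem of source file>/fulltext<suffix>
    """
    source = Path(source_file)
    if not source.is_file():
        raise FileNotFoundError(f"no such input file: {source}")

    # One CTree directory per input file, named after the file stem.
    ctree = Path(project_dir) / source.stem
    ctree.mkdir(parents=True, exist_ok=True)

    # Reserved name: 'fulltext' plus the original extension (.xml, .html, ...).
    target = ctree / f"fulltext{source.suffix.lower()}"
    shutil.copyfile(source, target)
    return target
```

For example, `wrap_in_cproject("foo.xml", "project")` would copy the file to `project/foo/fulltext.xml`, leaving the original untouched. Whether norma accepts a directory built this way would still need to be checked against the tool itself.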
without reading the docs I'd say -q is how we started. in fact you can write There is certainly the logic to build a CProject from a list of files, but On Tue, May 17, 2016 at 12:25 PM, tarrow notifications@github.com wrote:
Peter Murray-Rust |
So, it isn't possible to run norma without a Should it be possible to do a conversion without being in a CTree? (i.e. I don't believe it is possible now; is this a bug or a feature?) |
So, the situation is that it isn't currently possible. I think, like Richard, that it should be possible. Perhaps not as a particularly high-priority task, but it should be possible. The hurdle I see is that requiring a CProject, or creating one, is currently an integral part of the "main control loop" that can be found in org.xmlcml.cmine.args.DefaultArgProcessor. |
So what's the use case for this? A major part of norma/ami is that it But how many people want to transform just one file? And what can you do On Wed, May 18, 2016 at 5:14 PM, tarrow notifications@github.com wrote:
Peter Murray-Rust |
I'm exploring whether or not I can use Another use case is just learning how to use the |
The problem is that tables in PDF are very hard. No-one has solved it. TabulaPDF went some way, CM goes a different partial way. It's often possible to do a single source but not generalize:
even recognising tables is hard. |
I've started looking at Tabula via the R tabulizer package, which wraps the command line. I also started pondering cribs for Tabula, e.g. giving it keywords for things it might expect to find in a table heading to help it get its eye in! I saw you'd done some work extracting data from line charts. Is that part of the contentmine toolset? |
It's not clear to me from the command-line --help, or from the README, whether norma requires a cproject structure, but it seems to.
What most users will want to do is:
I think we shouldn't require a specific input filename or enforce a specific output filename, because it restricts what the user can do and creates work for them. There are a lot of ways of getting NLM XML files without using contentmine tools. Using the contentmine project conventions should be an additional option.