Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ImportError needs attributes for module and file name #43980

Closed
nedbat opened this issue Sep 15, 2006 · 40 comments
Closed

ImportError needs attributes for module and file name #43980

nedbat opened this issue Sep 15, 2006 · 40 comments
Assignees
Labels
interpreter-core (Objects, Python, Grammar, and Parser dirs) type-feature A feature request or enhancement

Comments

@nedbat
Copy link
Member

nedbat commented Sep 15, 2006

BPO 1559549
Nosy @loewis, @brettcannon, @ncoghlan, @abalkin, @pitrou, @vstinner, @giampaolo, @nedbat, @merwok, @bitdancer, @Trundle, @briancurtin, @cool-RR, @ericsnowcurrently
Files
  • 1559549_3.patch
  • issue1559549.diff
  • issue1559549_v2.diff
  • importerror.diff
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = 'https://github.com/brettcannon'
    closed_at = <Date 2012-04-13.00:27:13.367>
    created_at = <Date 2006-09-15.19:55:04.000>
    labels = ['interpreter-core', 'type-feature']
    title = 'ImportError needs attributes for module and file name'
    updated_at = <Date 2012-04-13.00:27:13.366>
    user = 'https://github.com/nedbat'

    bugs.python.org fields:

    activity = <Date 2012-04-13.00:27:13.366>
    actor = 'brett.cannon'
    assignee = 'brett.cannon'
    closed = True
    closed_date = <Date 2012-04-13.00:27:13.367>
    closer = 'brett.cannon'
    components = ['Interpreter Core']
    creation = <Date 2006-09-15.19:55:04.000>
    creator = 'nedbat'
    dependencies = []
    files = ['21284', '23522', '24467', '24476']
    hgrepos = []
    issue_num = 1559549
    keywords = ['patch']
    message_count = 40.0
    messages = ['54901', '54902', '125660', '125661', '125662', '130190', '130218', '130243', '130250', '130950', '131027', '131147', '131153', '131368', '140266', '140307', '142251', '146404', '149517', '149521', '149539', '149592', '149593', '149603', '149628', '152921', '152930', '152956', '152974', '152977', '152978', '153047', '153048', '153051', '153054', '153060', '153070', '158002', '158188', '158189']
    nosy_count = 17.0
    nosy_names = ['loewis', 'brett.cannon', 'ncoghlan', 'belopolsky', 'pitrou', 'vstinner', 'techtonik', 'giampaolo.rodola', 'nedbat', 'eric.araujo', 'r.david.murray', 'Trundle', 'brian.curtin', 'gruszczy', 'cool-RR', 'python-dev', 'eric.snow']
    pr_nums = []
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'enhancement'
    url = 'https://bugs.python.org/issue1559549'
    versions = ['Python 3.3']

    @nedbat
    Copy link
    Member Author

    nedbat commented Sep 15, 2006

    Exceptions would be more useful if they had some
    structured information attached to them. For example,
    an ImportError could have the name of the module that
    could not be imported. This would make it possible to
    deal with exceptions in more powerful ways.

    For more discussion:
    http://www.nedbatchelder.com/blog/200609.html#e20060906T055924

    @nedbat nedbat added interpreter-core (Objects, Python, Grammar, and Parser dirs) type-feature A feature request or enhancement labels Sep 15, 2006
    @loewis
    Copy link
    Mannequin

    loewis mannequin commented Sep 16, 2006

    Logged In: YES
    user_id=21627

    Exceptions do have structured informations, for example

    py> try:
    ... open("/tmp/xxx")
    ... except IOError, e:
    ... print e.__dict__
    ...
    {'errno': 2, 'args': (2, 'No such file or directory'),
    'strerror': 'No such file or directory', 'filename': '/tmp/xxx'}

    It's just that ImportError doesn't, so I'm retitling this
    request to restrict attention to ImportError.

    If you have other proposals for specific information that
    should be on specific exceptions, please submit a separate
    issue.

    Would you like to work on this specific problem? I think
    ImportError should get a module attribute (always set), and
    a filename attribute (set only if a file was selected, yet
    failed to import; otherwise set to None). It probably will
    require some refactoring of C code to simplify raising
    ImportError in the importing code.

    Explicit raises of ImportError should also be considered;
    those in the standard library should be fixed to include
    atleast the module; raising ImportError without giving a
    module should set the module to None.

    @techtonik
    Copy link
    Mannequin

    techtonik mannequin commented Jan 7, 2011

    I think we should investigate deeper why this enhancement request didn't get into Python 3.

    Another user story is: "from xxx import yyy", if yyy not found, the args message is ('cannot import name yyy',)

    Python4?

    label:api

    @bitdancer
    Copy link
    Member

    There's no need for any deeper investigation. The answer is "nobody wrote the patch". If someone writes a good patch, it will go in.

    @abalkin
    Copy link
    Member

    abalkin commented Jan 7, 2011

    I think we should investigate deeper why this enhancement
    request didn't get into Python 3.

    There is nothing to investigate here. This is a request for a marginal improvement and OP did not follow up even though a core developer responded on the next day after his post.

    The only lesson to be learned from this is that Python improvements should be discussed on the tracker or appropriate python mailing lists and not on private blogs.

    @gruszczy
    Copy link
    Mannequin

    gruszczy mannequin commented Mar 6, 2011

    This is a draft of a patch. I have only used this new ImportError api in once place, so it would work with following code:

    >>> try:
    ...  import nosuchmodule  
    ... except ImportError as e:
    ...  print(e.module)
    ... 
    nosuchmodule

    I have literally no experience with Python core, so I would be very grateful for comments and advice, so I could make this patch meet all requirements.

    @Trundle
    Copy link
    Mannequin

    Trundle mannequin commented Mar 7, 2011

    There are some issues with the patch:

    • The check for size of args in ImportError_init() is wrong: You can't create an ImportError with 3 arguments now ("TypeError: ImportError expected 2 arguments, got 3")
    • ImportError_clear() doesn't clear self->msg.
    • ImportError_str() doesn't increase the reference count for self->msg before returning it.
    • The code that raises the exception (Python/import.c) doesn't check for error return values: All the calls can return NULL which will cause a segfault due to the uncoditional Py_DECREFs.
    • Using PyUnicode_DecodeASCII() for the module name is wrong (IMHO), it should rather be something like PyUnicode_DecodeUTF8() as module names can contain non-ASCII characters in Python 3 and are AFAIK (at least right now) always encoded using utf-8.

    @vstinner
    Copy link
    Member

    vstinner commented Mar 7, 2011

    The module name is a UTF-8 encoded string yes. It should be documented in PyModuleDef structure. I already documented the encoding in PyModule_New().

    @gruszczy
    Copy link
    Mannequin

    gruszczy mannequin commented Mar 7, 2011

    I am sorry again for those mistakes, it's all completely new to me. I have fixed those issues and created new patch. Using hg export, that now spans over two commits. Is it the way those patches should be provided, or should I gather all changes into a one commit?

    @Trundle
    Copy link
    Mannequin

    Trundle mannequin commented Mar 15, 2011

    I am sorry again for those mistakes, it's all completely new to me.

    No worries!

    I have fixed those issues and created new patch. Using hg export, that now spans over two commits. Is it the way those patches should be provided, or should I gather all changes into a one commit?

    A single patch (where all changesets are flattened into one) is the
    preferred way. For general advice, you can read the Developer's Guide
    at http://docs.python.org/devguide/ (e.g.
    http://docs.python.org/devguide/patch.html).

    @gruszczy
    Copy link
    Mannequin

    gruszczy mannequin commented Mar 15, 2011

    I didn't know about this mq extension. I will do a proper patch with a test after friday.

    @brettcannon
    Copy link
    Member

    At the PyCon 2011 sprint we discussed this issue and Nick, myself, and some other people agreed that using a keyword-only argument for passing in the module name is probably a better solution. While it won't be backwards-compatible (BaseException does not accept keyword arguments), it does provide a very clean API with an unambiguous way of specifying the module. Going another route (e.g., a constructor method) has the same backwards-compatibility issue. But a reason to use a solution other than the magical handling of the second argument is that it prevents doing the wrong thing simply because someone passes two or more arguments to ImportError.

    Another nicety of a new API for ImportError is that it can be made such that if the keyword-only argument ('module_name'?) is the only thing supplied (e.g., no positional arguments) then the message can be auto-generated, which would be a nice way to solve issue bpo-8754.

    @bitdancer
    Copy link
    Member

    +1 for providing a way to autogenerate the message.

    @gruszczy
    Copy link
    Mannequin

    gruszczy mannequin commented Mar 18, 2011

    Ok, here is a patch created using mq. I have a problem, however. I managed to solve following situation:

    try:
    raise ImportError('failed import' module_name='somemodule')
    except ImportError as e:
    print(e.module_name)

    that would print somemodule.

    However, I can't pass kwargs to ImportError_init from load_next. If someone could instruct me, how this can be achieved, I'll be happy to do that.

    @merwok
    Copy link
    Member

    merwok commented Jul 13, 2011

    How about this case:

      from validmodule import name_with_typo

    Do we need one keyword argument module_name and one attribute_name?

    @brettcannon
    Copy link
    Member

    No, we don't need attribute_name as that is getting too specific. Your example is simply importing validmodule.name_with_typo which happens to possibly be an attribute on the module instead of another module or subpackage.

    @brettcannon
    Copy link
    Member

    I have a use for this in my bootstrapping for importlib, so I am assigning this to myself in order to make sure that at least ImportError grows the needed keyboard argument and attribute. I will probably not bother with tweaking import.c, though.

    @brettcannon brettcannon self-assigned this Aug 17, 2011
    @briancurtin
    Copy link
    Member

    Here's an updated patch, plus support for a second attribute that I need for bpo-10854. I previously wrote a patch that does this same thing for that issue, but this one handles things a lot more nicely :)

    I renamed "module_name" to just be "name" since I was adding "path" and didn't want to have to name it "module_path" and have two "module_" attributes since I think it's clear what they are for. We can go back to "module_" names if you want - no big deal.

    It's really just the same patch as before but updated for a few minor changes (the ComplexExtendsException sig changed), and the moving of keyword handling in ImportError_init into a macro since we have to do the same initialization twice now. I'm not married to that implementation, it just minimized duplication and seems to work alright.

    Also, this removes the import.c change (see Brett's last message).

    @ericsnowcurrently
    Copy link
    Member

    Just following up on this ticket. Anyone have any objections to Brian's patch?

    Also, would 'fullname' be more appropriate than 'name', to be more in sync with that identifier in importlib?

    @pitrou
    Copy link
    Member

    pitrou commented Dec 15, 2011

    The patch doesn't appear to have the necessary mechanics to pass the new arguments from the import machinery when an ImportError is raised, is this deliberate?

    Also, I'm not sure why the new arguments are keyword-only.

    @ncoghlan
    Copy link
    Contributor

    The keyword-only idea is a backwards compatibility hack we discussed at the PyCon US sprints because ImportError currently accepts an arbitrary number of arguments:

    >>> raise ImportError(1, 2, 3)
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    ImportError: (1, 2, 3)

    We don't really want to be claiming that the module name is some strange value due to existing code that passes multiple arguments, hence the suggestion to make this a keyword-only argument.

    Regarding import.c, I think Brian misinterpreted Brett's last message. import.c absolutely *should* be modified by this patch, since it's still the default import implementation for the moment. Brett's comment was just to say that *he* wasn't going to do that part, since his aim with the importlib bootstrapping exercise is to nuke most of import.c from orbit :)

    @ericsnowcurrently
    Copy link
    Member

    I'm guessing that more than just Python/import.c should be updated, and more than one spot in import.c.

    @briancurtin
    Copy link
    Member

    If I add back in the import.c change, would this then be alright?

    Eric - fullname seems fine, I'll update that.

    @pitrou
    Copy link
    Member

    pitrou commented Dec 16, 2011

    "fullname" is technically wrong:

    >>> import logging.bar
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    ImportError: No module named 'bar'

    "module_name" sounds fine and explicit enough to me. Also, as Eric points out, there are many places where an ImportError is raised. You might want to create a private helper API.

    @briancurtin
    Copy link
    Member

    I think I'm going to stick with name unless anyone is super opposed. If we can eventually import something else (sausages?), then setting module_name with a sausage name will seem weird.

    I'll work up a more complete patch. The private helper is a good idea.

    @brettcannon
    Copy link
    Member

    Brian, what is left for updating your patch? I'm going to need this to fix test_pydoc to not fail in my importlib bootstrap work so I can (finally) help with this. Is it just actually using the new features of the exception?

    @brettcannon brettcannon removed their assignment Feb 9, 2012
    @briancurtin
    Copy link
    Member

    Yep, I just need to actually make use of the feature. I'll generate a new patch shortly.

    @brettcannon
    Copy link
    Member

    On Thu, Feb 9, 2012 at 00:03, Brian Curtin <report@bugs.python.org> wrote:

    Brian Curtin <brian@python.org> added the comment:

    Yep, I just need to actually make use of the feature. I'll generate a new
    patch shortly.

    If you can generate it before 17:30 EST today I can review it or at least
    help use it in the codebase.

    @briancurtin
    Copy link
    Member

    Here's an updated patch which creates two convenience functions, PyErr_SetFromImportErrorWithName and PyErr_SetFromImportErrorWithNameAndPath. I figure the two common cases are that you'll want to set just a name or you'll want a name and a path.

    *WithName is all that I've applied for now, and I've only done it in import.c to allow Brett to continue with his work. I'll come back and make changes elsewhere in the code, and I'll apply the *WithNameAndPath function where I need it for bpo-10854.

    All tests pass and some IRL testing works nicely.

    Note: I'm a little rushed at the moment so I have not included docs but I will surely update them. I just want to get Brett what he needs by this afternoon.

    @brettcannon
    Copy link
    Member

    Thanks, Brian! I'll do a review tonight on the drive home (and maybe even write the docs up).

    @bitdancer
    Copy link
    Member

    On the drive home...are you borrowing one of Google's self driving cars? :)

    @brettcannon
    Copy link
    Member

    The patch looks fine for its current semantics except for the fact that the macro doesn't work under clang; ##macro_var is supposed to be used only when concatenating parts of a string to create a complete identifier, not concatenating two identifiers next to each other (e.g. gooblede##gook is fine, but self->##gook isn't because '->' is its own identifier).

    But one issue is that for my purposes I need 'name' to be the full name of the module, not just the tail of the full name. The reason is that pydoc needs a way to tell when a module import failed because the module isn't there vs. when a module had an actual error itself. Using the tail end of the name doesn't help me because it means I can't tell if an import of a module named 'bunk' containing the line 'import test.bunk' fails because the module 'bunk' doesn't exist or because the import line failed since in both cases name == 'bunk'. But if name == 'test.bunk' there is no ambiguity. Pydoc used to do this by inspecting the stack and seeing where the exception was thrown which is just nasty and brittle (i.e. doesn't work with importlib since the stack goes much farther until ImportError is thrown).

    IOW I don't think we should be tossing out valuable data just because historically modules that didn't exist only returned the tail end of the full module name. Anyone object to switching to using the full name of the module triggering the exception? The message wouldn't change to use the full name (unfortunately) for backwards-compatibility reasons of course.

    @brettcannon
    Copy link
    Member

    And to answer David's joke, I carpool from Toronto to Waterloo four days a week so I have an hour each direction to work on stuff.

    @brettcannon
    Copy link
    Member

    Oh, and there are no forward declarations for the new functions added to errors.c

    @brettcannon
    Copy link
    Member

    Attached is Brian's patch with the macro fix and forward declarations.

    @brettcannon
    Copy link
    Member

    So I just tried to pass fullname in import.c and it didn't work (of course). Looks like even when I try to use fullname in find_module_path_list() and load_next() it is still leaving out the package name. Ugh. That means either refactoring import.c to get the full name in all places, not get the full name (and thus I'm still screwed), have importlib do something different than import.c in hopes that importlib replaces import.c before this goes public, or add a full_name attribute for importlib to fill in with the full details.

    @bitdancer
    Copy link
    Member

    However you do it, I'm very much in favor of having the full name available. I either wrote or fixed (I can't remember which) that stack walk in pydoc, and you are right, it is very very ugly. This would also be a big benefit for unittest, which currently *doesn't* do the stack walk and therefore generates incorrect error messages when sub-imports fail.

    @brettcannon
    Copy link
    Member

    This is now officially blocking my importlib bootstrap work (issue bpo-2377), so I have assigned this to myself to get done and I hope to get this committed within a week *without* updating Python/import.c and the requisite tests or pydoc (which I will make part of my merge instead). If Brian beats me to getting this checked in that's great, but otherwise I will get to it myself.

    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented Apr 13, 2012

    New changeset 6825fd9b00ed by Brett Cannon in branch 'default':
    Issue bpo-1559549: Add 'name' and 'path' attributes to ImportError.
    http://hg.python.org/cpython/rev/6825fd9b00ed

    @brettcannon
    Copy link
    Member

    Patch for the attributes is in (no use by import.c)! I'm going to do a separate commit for use by importlib and then fix pydoc in my bootstrap branch.

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    interpreter-core (Objects, Python, Grammar, and Parser dirs) type-feature A feature request or enhancement
    Projects
    None yet
    Development

    No branches or pull requests

    10 participants