Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add mime type for rar files #4054

Merged
merged 1 commit into from
Jan 16, 2021

Conversation

michalsn
Copy link
Member

@michalsn michalsn commented Jan 2, 2021

Description
This PR adds yet another mime type for rar files.

Ref: https://developer.mozilla.org/en-US/docs/Web/HTTP/Basics_of_HTTP/MIME_types/Common_types

Most likely will fix #3979

Checklist:

  • Securely signed commits

@sfadschm
Copy link
Contributor

sfadschm commented Jan 2, 2021

This MIME type should indeed be added, however it will likely not fix the issue.

The problem reported in #3979 is, that the uploaded file gets assigned the application/octet-stream type.
In that case, the MIME type guesser will do the following steps:

  1. Check if application/octet-stream is in the MIME type list for the rar extension. It is not.
  2. Loop all extensions and return the first one to contain application/octet-stream. That is csv.

The underlying problem is that during file uploads the browser determines the media type it sends to the server based on the users setup which might vary strongly. See https://stackoverflow.com/a/26303098 for reference.

I think the most useful implementation would be to add a fallback after step 1 (see above) that simply keeps the file extension if an extension was passed and the media type is application/octet-stream, because this is just the default type passed by the uploading browser if it cannot determine any other.

@sfadschm
Copy link
Contributor

sfadschm commented Jan 2, 2021

On a side note, if we implement such fallback for uploaded files, the changes from #3956 can be reverted, too. (I think application/octet-stream could actually be purged from the complete mime array then.)

@michalsn
Copy link
Member Author

michalsn commented Jan 2, 2021

Relying on the file extension is not an option for security reasons. This way, we could always upload any file.

@sfadschm
Copy link
Contributor

sfadschm commented Jan 2, 2021

That makes sense. However, the way we currently treat 'application/octet-stream' it will always be uploaded as a csv file. As the octet-stream is a fallback for the uploading browser if it cannot determine a real mime type, maybe we should catch it and abort file upload with an exception stating that no valid mime type could be detected? This would be rather helpful than just uploading any octet-stream as csv I think.

@MGatner
Copy link
Member

MGatner commented Jan 2, 2021

I would like to throw an exception but I'm afraid that would constitute a breaking change and not be something we could implement until version 5. I agree though that matching to csv feel buggy and is definitely not helpful.

Are there no standards we can reference for handling this? What do other frameworks do?

@sfadschm
Copy link
Contributor

sfadschm commented Jan 2, 2021

I just checked the Laravel implementation and they just default all octect-streams to .bin which feels about more right than csv.
However, this does not solve the issues that were reported with other extensions that receive the octet-stream type during upload. For now I feel we should add
'bin' => 'application/octet-stream'
as the first enty in the mimes array and remove all other octet-stream entries. As the file is in app/config, people can add octet-stream to every extension they want it to match (like in #3956) but we should likely not promote this from the framework side. I think this will not be a BC changr but should be mentioned in the update instructions, as people would need to check this file for changes manually.

@MGatner
Copy link
Member

MGatner commented Jan 2, 2021

I like this solution, and agree that it would not be a BC.

this does not solve the issues that were reported with other extensions that receive the octet-stream type during upload

I don't really know what we can do about this. Let me try to restate it and you can correct me if I'm wrong:

  1. We rely on the request to determine the MIME
  2. If the MIME is indistinguishable for any reason it is assigned octet-stream
  3. We expose the original filename and extension to developers via UploadedFile

If that's all true, I think we just need to add some docs to note that developers who care about extensions/MIMEs need to handle octet-stream. Maybe we could add a note that future versions will throw an exception?

@sfadschm
Copy link
Contributor

sfadschm commented Jan 2, 2021

I think these points are true, however I would like to wait for @michalsn to confirm.
For 2. I additionally believe that sometimes the browser will just directly send octet-stream to the server if it cannot determine a type for an uploading file itself.

@michalsn
Copy link
Member Author

michalsn commented Jan 2, 2021

  1. We rely on the information returned by the browser only when the fileinfo extension is not available. Otherwise, the mime-type is always set separately by the finfo_file() function.

  2. If the fileinfo extension is not there and also there is no mime-type, then we set application/octet-stream as a default.

Personally, I feel like the current route is okay. The most common cases are covered. And if you handle upload for some specific files you can always add mime type by yourself so it can be recognized correctly.

I'm not sure about removing the application/octet-stream because some file extensions rely exclusively on it. Maybe the better solution would be to change the way we're guessing extension from the type? 🤔

I mean this:

public static function guessExtensionFromType(string $type, string $proposedExtension = null)
{
$type = trim(strtolower($type), '. ');
$proposedExtension = trim(strtolower($proposedExtension));
if ($proposedExtension !== '' && array_key_exists($proposedExtension, static::$mimes) && in_array($type, is_string(static::$mimes[$proposedExtension]) ? [static::$mimes[$proposedExtension]] : static::$mimes[$proposedExtension], true))
{
return $proposedExtension;
}
foreach (static::$mimes as $ext => $types)
{
if ((is_string($types) && $types === $type) || (is_array($types) && in_array($type, $types, true)))
{
return $ext;
}
}
return null;
}

The change wouldn't be anything big. When we have $proposedExtension set and there is no match we should simply return null. Likewise, we should fire the foreach loop only when there is no $proposedExtension set.

@sfadschm
Copy link
Contributor

sfadschm commented Jan 2, 2021

Thanks for clarifying.
The solution sounds reasonable to me, but I'm not sure if returning null might cause a BC break.
Of course we should keep the octet-streams where they fit, but I think they can be deleted for e.g. csv, vtt, xls and some others. If it is deleted in csv, at least also octet-streams will default to binary if they reach the foreach loop.
However, I think we should maybe open a new topic for that and the changes you mentioned.

The current changes in this PR are fine besides the discussion, but will not resolve the linked issue (your comment in the issue itself will likely do though) :-)

@michalsn
Copy link
Member Author

michalsn commented Jan 2, 2021

I don't think this would be a BC change. Looking at the current state it's more like a bugfix for me. If we have an extension and the mime-type doesn't match then we shouldn't check anything else and just return null.

Definitely - if we decide to make these changes, they won't be added to this PR.

@sfadschm sfadschm mentioned this pull request Jan 3, 2021
2 tasks
@michalsn
Copy link
Member Author

Mime issues were resolved in different PRs, but I will merge this too since the proposed mime-type is recommended.

@michalsn michalsn merged commit 8f88e9c into codeigniter4:develop Jan 16, 2021
mostafakhudair pushed a commit to mostafakhudair/CodeIgniter4 that referenced this pull request Jan 16, 2021
mostafakhudair pushed a commit to mostafakhudair/CodeIgniter4 that referenced this pull request Jan 17, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Bug: RAR file detected as CSV
3 participants