This issue was moved to a discussion.

Audio portal #1129

Closed
GeorgesStavracas opened this issue Oct 6, 2023 · 11 comments

@GeorgesStavracas
Member

Right now, the almost ubiquitous way for sandboxed applications to access audio is through PulseAudio (even if the other end of the socket is PipeWire). On Flatpak, that translates to most apps having the pulseaudio socket always enabled as a static permission.

This is not ideal for multiple reasons: we're trying to move away from static permissions as much as possible; and this mechanism doesn't give much granularity over permissions to control audio.

We should discuss and explore introducing an Audio portal that would operate much like the Camera portal, but focused on the various ways apps can use audio devices.

The D-Bus interface would look roughly like this:

<?xml version="1.0"?>
<node name="/" xmlns:doc="http://www.freedesktop.org/dbus/1.0/doc.dtd">
  <interface name="org.freedesktop.portal.Audio">

    <!--
        AccessAudio:
        @options: Vardict with optional further information
        @handle: Object path for the #org.freedesktop.portal.Request object representing this call

        Request to gain access to audio devices.

        Supported keys in the @options vardict include:
        <variablelist>
          <varlistentry>
            <term>sources as</term>
            <listitem><para>
              A list of audio sources to request access to. Accepted values are:

              <simplelist>
                <member>'microphone'</member>
                <member>'speakers'</member>
                <member>'camera'</member>
                <member>'applications'</member>
              </simplelist>

            </para></listitem>
          </varlistentry>
          <varlistentry>
            <term>handle_token s</term>
            <listitem><para>
              A string that will be used as the last element of the @handle. Must be a valid
              object path element. See the #org.freedesktop.portal.Request documentation for
              more information about the @handle.
            </para></listitem>
          </varlistentry>
        </variablelist>

        Following the #org.freedesktop.portal.Request::Response signal, if
        granted, org.freedesktop.portal.Audio.OpenPipeWireRemote() can be used to
        open a PipeWire remote.
    -->
    <method name="AccessAudio">
      <arg type="a{sv}" name="options" direction="in"/>
      <arg type="o" name="handle" direction="out"/>
    </method>

    <!--
        OpenPipeWireRemote:
        @options: Vardict with optional further information
        @fd: File descriptor of an open PipeWire remote.

        Open a file descriptor to the PipeWire remote where the audio nodes
        are available. The file descriptor should be used to create a
        <classname>pw_core</classname> object, by using
        <function>pw_context_connect_fd</function>.

        This method will only succeed if the application already has permission
        to access audio devices.

        Supported keys in the @options vardict include:
        <variablelist>
          <varlistentry>
            <term>sources as</term>
            <listitem><para>
              A list of audio sources to access. Accepted values are:

              <simplelist>
                <member>'microphone'</member>
                <member>'speakers'</member>
                <member>'camera'</member>
                <member>'applications'</member>
              </simplelist>

            </para></listitem>
          </varlistentry>
        </variablelist>
    -->
    <method name="OpenPipeWireRemote">
      <annotation name="org.gtk.GDBus.C.Name" value="open_pipewire_remote"/>
      <annotation name="org.gtk.GDBus.C.UnixFD" value="true"/>
      <arg type="a{sv}" name="options" direction="in"/>
      <arg type="h" name="fd" direction="out"/>
    </method>

    <property name="version" type="u" access="read"/>
  </interface>
</node>
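
As a rough illustration of the client side of this proposal, the sketch below validates the proposed `sources` values and computes the Request object path a client would expect back. The path convention comes from the existing org.freedesktop.portal.Request documentation; the helper names (`build_access_audio_options`, `expected_request_path`) are hypothetical, and the Audio interface itself is only a proposal, so treat this as a sketch rather than a real API.

```python
import re

# Source names copied from the proposed interface above; the list may change.
ACCEPTED_SOURCES = {"microphone", "speakers", "camera", "applications"}

# A D-Bus object path element may only contain the characters [A-Za-z0-9_].
_PATH_ELEMENT = re.compile(r"^[A-Za-z0-9_]+$")


def build_access_audio_options(sources, handle_token):
    """Return a plain dict for a hypothetical AccessAudio call.

    A real client would still need to wrap each value in a D-Bus variant
    to match the a{sv} signature.
    """
    unknown = set(sources) - ACCEPTED_SOURCES
    if unknown:
        raise ValueError(f"unsupported sources: {sorted(unknown)}")
    if not _PATH_ELEMENT.match(handle_token):
        raise ValueError("handle_token must be a valid object path element")
    return {"sources": list(sources), "handle_token": handle_token}


def expected_request_path(unique_name, handle_token):
    """Compute the Request object path the portal is expected to return.

    Per the Request documentation, the caller's unique bus name loses its
    leading ':' and has every '.' replaced by '_'.
    """
    sender = unique_name.lstrip(":").replace(".", "_")
    return f"/org/freedesktop/portal/desktop/request/{sender}/{handle_token}"
```

Computing the path up front lets a client subscribe to the Response signal before making the call, which avoids a race with a fast reply.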

Design Considerations

  • Granting permission to every app that wants to play some audio would be annoying, and playing audio is not really security sensitive (at most, a malicious app can be annoying), so permission to output audio ("speakers") should probably be given without popping up a dialog. Since this is still a permission, people will be able to revoke it in e.g. GNOME Settings or Flatseal. The important bit is to not be annoying while still using a permission to control access.
  • Not sure if 'applications' should be in this list, or in a potential new Media Sharing portal. We can probably leave it out for now.
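
The first consideration could be sketched as a small decision function. Everything here is hypothetical (the `SILENT_SOURCES` set, the `stored_decisions` mapping, and the function name are not part of any existing portal); it only illustrates "grant speakers silently unless the user revoked it, prompt for the rest".

```python
# Sources that would be granted without a dialog (assumption from the
# design consideration above: only audio output).
SILENT_SOURCES = {"speakers"}


def split_by_decision(requested, stored_decisions):
    """Split requested sources into (granted, denied, to_prompt).

    stored_decisions maps a source name to True (allowed) or False
    (revoked, e.g. from a settings UI). Sources without a stored decision
    are granted silently if they are in SILENT_SOURCES, otherwise they
    need an interactive prompt.
    """
    granted, denied, to_prompt = [], [], []
    for source in requested:
        decision = stored_decisions.get(source)
        if decision is True:
            granted.append(source)
        elif decision is False:
            denied.append(source)
        elif source in SILENT_SOURCES:
            granted.append(source)
        else:
            to_prompt.append(source)
    return granted, denied, to_prompt
```

With no stored decisions, a request for speakers and microphone grants the former and prompts for the latter; a stored `False` for speakers moves it to the denied list without any dialog.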

CC @hfiguiere @matthiasclasen @jadahl

@GeorgesStavracas GeorgesStavracas added new portals This requires creating a new portal interface needs discussion Needs discussion on how to implement or fix the corresponding task labels Oct 6, 2023
@github-project-automation github-project-automation bot moved this to Needs Triage in Triage Oct 6, 2023
@GeorgesStavracas GeorgesStavracas moved this from Needs Triage to Triaged in Triage Oct 6, 2023
@matthiasclasen
Contributor

One question here is: does it make sense to keep the camera separate?

Or could this just be a 'PipeWire access' portal?

I seem to remember that people have asked for screencast+audio before

@Obsessee

Obsessee commented Oct 6, 2023

Screencast + audio has been requested on the PipeWire repo and here on x-d-p before. It's a sought-after feature for livestreaming use cases as well as for applications like Zoom and Discord. I think Zoom already has a working implementation as of right now, but I could be wrong.

Something else for consideration: maybe certain applications could be restricted to certain speaker devices. A similar WirePlumber issue exists for this, but it hasn't been addressed so far. That is not to say there should be a pop-up to select which device to play audio on (which would be quite ridiculous), but a way to restrict applications to a single PipeWire sink, whose output can be transformed later down the line, would be terrific.

@GeorgesStavracas
Member Author

I think the screencast+audio case would be entirely contained within the Screencast portal. Apps request a screencast with audio, get a PipeWire remote, and connect to whatever streams are available there.

@Obsessee per-device control sounds like a policy to implement on the media session (WirePlumber) level, not on portal level.

@Mikenux

Mikenux commented Oct 7, 2023

Since it's just audio, why not group camera and microphone as a request for audio input?

@GeorgesStavracas
Member Author

> Since it's just audio, why not group camera and microphone as a request for audio input?

Sure

@Obsessee

Obsessee commented Oct 7, 2023

Differentiating between switches for input and output would be really terrific. I'd like to disallow badly behaving apps from capturing my microphone for voice input when I'm just using it to play music. Would especially help for fat fingering buttons on a Linux phone.

I feel like bringing this up now before there's an implementation because I don't think it'd be considered a big enough issue to revise the portal later down the line.

> Not sure if 'applications' should be in this list, or in a potential new Media Sharing portal. We can probably leave it out for now.

I think application audio would probably be a better fit for a Media Sharing portal, as it's not a (physical) device.

@GeorgesStavracas
Member Author

Another device category I did not consider: MIDI devices. They're managed by PipeWire as well.

@hfiguiere
Collaborator

hfiguiere commented Oct 8, 2023

PulseAudio access is done with --socket=pulseaudio, which seems to grant ALSA (including MIDI), while other PipeWire access including JACK is done through the --filesystem=xdg-run/pipewire-0 permission.
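
To see which of these static permissions an app currently holds, one could inspect its Flatpak metadata. The sketch below assumes the usual keyfile layout with a `[Context]` group holding semicolon-separated `sockets` and `filesystems` lists; the function name and the returned keys are made up for illustration.

```python
import configparser


def static_audio_permissions(metadata_text):
    """Report the static audio-related permissions in Flatpak metadata.

    Looks for the two permissions mentioned above: the pulseaudio socket
    and filesystem access to xdg-run/pipewire-0.
    """
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep key names case-sensitive
    parser.read_string(metadata_text)

    def entries(key):
        raw = parser.get("Context", key, fallback="")
        # Drop empty items and any ":ro"/":rw"/":create" style suffix.
        return [item.split(":")[0] for item in raw.split(";") if item]

    return {
        "pulseaudio_socket": "pulseaudio" in entries("sockets"),
        "pipewire_socket": "xdg-run/pipewire-0" in entries("filesystems"),
    }
```

An app that only plays audio would typically report the first flag, while JACK-style clients and graph tools would report the second.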

About the sources:

  • microphone: this should be something like audio_inputs.
  • speakers: this should be something like audio_outputs.
  • camera: this should be something like video_inputs, or does that also include video_outputs?
  • applications: I don't even know what that is.
  • midi: (see Audio portal #1129 (comment)) the MIDI bus, so this name seems to be unambiguous.

Also, how do you specify control of the graph? Helvum is one example. Another is a DAW that wants to control all the audio but does not need the video. Shall we have a permission for fuller access, or shall we grant it as is currently done with --filesystem=xdg-run/pipewire-0?

@jadahl
Collaborator

jadahl commented Oct 8, 2023

This sounds a bit like the devices portal that has been discussed as a "replacement" for the camera portal. Perhaps a less generic name is needed than "devices".

An "audio" portal that doesn't involve cameras could work too I guess, but it should ideally be easy to open a PipeWire remote that gets you both microphone and camera streams.

The "OpenPipeWireRemote" you add here adds different sources that would be exposed in the remote, including "camera". Is that intentional?

> microphone: this should be something like audio_inputs.
> speakers: this should be something like audio_outputs.
> applications: I don't even know what that is.

The point here is to create some predictable expectations. If an app gets "microphone" access, it shouldn't be able to eavesdrop on every app's PipeWire source. In other words, applications can be audio input, but that doesn't mean they are a "microphone".
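
One way to picture that expectation is as a filter over the nodes exposed in the remote. The sketch below is hypothetical: nodes are plain dicts, and the mapping from proposed source names to PipeWire `media.class` values is an assumption made for illustration, not something the proposal specifies.

```python
# Assumed mapping from a granted source to the media.class values it
# would expose. Application playback streams are deliberately kept
# separate from hardware capture devices.
SOURCE_TO_MEDIA_CLASSES = {
    "microphone": {"Audio/Source"},
    "speakers": {"Audio/Sink"},
    "applications": {"Stream/Output/Audio"},
}


def visible_nodes(nodes, granted_sources):
    """Return only the nodes an app should see for its granted sources."""
    allowed = set()
    for source in granted_sources:
        allowed |= SOURCE_TO_MEDIA_CLASSES.get(source, set())
    return [node for node in nodes if node.get("media.class") in allowed]
```

With only "microphone" granted, a node of class `Stream/Output/Audio` belonging to another app stays hidden even though it could technically serve as an audio input.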

@matthiasclasen
Contributor

> The point here is to create some predictable expectations. If an app gets "microphone" access, it shouldn't be able to eavesdrop on every app's PipeWire source. In other words, applications can be audio input, but that doesn't mean it's a "microphone".

Yes indeed.

@GeorgesStavracas
Member Author

We seem to cyclically float between "specific portals for independent categories of devices" and "generic portal covering pretty much all that PipeWire covers". I can see pros and cons for both approaches, and I haven't really made up my mind about it.

I like the elegance and explicit semantics of specific portals. Want a camera? Camera portal; need audio? Audio portal; etc. But the downside is that we have to copy-paste an almost identical API in each one of them (AccessSomething, OpenPipeWireRemote). We also force apps to open different PipeWire remotes for each class of devices they need access to. It may or may not be a problem.

A catch-all PipeWire portal ("Devices" portal) would solve the multiple-remotes problem: apps would be able to request multiple classes of devices in a single call (AccessDevices(['camera', 'speakers', 'microphone']), etc.). But I wonder if we wouldn't be putting this portal in (ultimately) a dead end... what if we later find out that we need different options and flows for different devices?

I don't have a conclusive answer here, but I'm slightly happier with separate portals, even if that means apps having to open multiple PipeWire remotes. We can encode domain-specific knowledge in these portals - like the discussion above about audio device types, which wouldn't apply to camera video.
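
The multiple-remotes trade-off can be made concrete with a small counting sketch. The mapping of device classes to portals below is an assumption for illustration only (the thread has not settled which portal would own which class), and `remotes_needed` is a hypothetical helper.

```python
# Assumed ownership of device classes under the separate-portals approach.
DEVICE_CLASS_TO_PORTAL = {
    "camera": "Camera",
    "microphone": "Audio",
    "speakers": "Audio",
}


def remotes_needed(device_classes, catch_all):
    """Count the PipeWire remotes an app would open for its device classes.

    With a catch-all portal one remote covers everything; with separate
    portals the app opens one remote per distinct portal involved.
    """
    if not device_classes:
        return 0
    if catch_all:
        return 1
    return len({DEVICE_CLASS_TO_PORTAL[name] for name in device_classes})
```

A video-call app wanting camera, microphone, and speakers would open two remotes with separate portals under this mapping, versus one with a catch-all portal.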

@flatpak flatpak locked and limited conversation to collaborators Oct 11, 2023
@GeorgesStavracas GeorgesStavracas converted this issue into discussion #1142 Oct 11, 2023


6 participants