Audio portal #1129

GeorgesStavracas · 2023-10-06T18:55:13Z

Right now, the almost ubiquitous way for sandboxed applications to access audio is through PulseAudio (even if the other end of the socket is PipeWire). On Flatpak, that translates to most apps having the pulseaudio socket always enabled as a static permission.

This is not ideal for multiple reasons: we're trying to move away from static permissions as much as possible; and this mechanism doesn't give much granularity over permissions to control audio.

We should discuss and explore introducing an Audio portal that would operate much like the Camera portal, but focused on the various ways apps can use audio devices.

The D-Bus interface would look roughly like this:

<?xml version="1.0"?>
<node name="/" xmlns:doc="http://www.freedesktop.org/dbus/1.0/doc.dtd">
  <interface name="org.freedesktop.portal.Audio">

    <!--
        AccessAudio:
        @options: Vardict with optional further information
        @handle: Object path for the #org.freedesktop.portal.Request object representing this call

        Request to gain access to audio devices.

        Supported keys in the @options vardict include:
        <variablelist>
          <varlistentry>
            <term>sources as</term>
            <listitem><para>
              A list of audio sources to request access to. Accepted values are:

              <simplelist>
                <member>'microphone'</member>
                <member>'speakers'</member>
                <member>'camera'</member>
                <member>'applications'</member>
              </simplelist>

            </para></listitem>
          </varlistentry>
          <varlistentry>
            <term>handle_token s</term>
            <listitem><para>
              A string that will be used as the last element of the @handle. Must be a valid
              object path element. See the #org.freedesktop.portal.Request documentation for
              more information about the @handle.
            </para></listitem>
          </varlistentry>
        </variablelist>

        Following the #org.freedesktop.portal.Request::Response signal, if
        granted, org.freedesktop.portal.Audio.OpenPipeWireRemote() can be used to
        open a PipeWire remote.
    -->
    <method name="AccessAudio">
      <arg type="a{sv}" name="options" direction="in"/>
      <arg type="o" name="handle" direction="out"/>
    </method>

    <!--
        OpenPipeWireRemote:
        @options: Vardict with optional further information
        @fd: File descriptor of an open PipeWire remote.

        Open a file descriptor to the PipeWire remote where the audio nodes
        are available. The file descriptor should be used to create a
        <classname>pw_core</classname> object, by using
        <function>pw_context_connect_fd</function>.

        This method will only succeed if the application already has permission
        to access audio devices.

        Supported keys in the @options vardict include:
        <variablelist>
          <varlistentry>
            <term>sources as</term>
            <listitem><para>
              A list of audio sources to access. Accepted values are:

              <simplelist>
                <member>'microphone'</member>
                <member>'speakers'</member>
                <member>'camera'</member>
                <member>'applications'</member>
              </simplelist>

            </para></listitem>
          </varlistentry>
        </variablelist>
    -->
    <method name="OpenPipeWireRemote">
      <annotation name="org.gtk.GDBus.C.Name" value="open_pipewire_remote"/>
      <annotation name="org.gtk.GDBus.C.UnixFD" value="true"/>
      <arg type="a{sv}" name="options" direction="in"/>
      <arg type="h" name="fd" direction="out"/>
    </method>

    <property name="version" type="u" access="read"/>
  </interface>
</node>
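
As a rough illustration of the call flow in the proposal, here is a Python sketch that builds the options for AccessAudio and computes the Request object path the caller would expect back. The path layout follows the documented org.freedesktop.portal.Request convention (unique name with the leading ':' dropped and '.' replaced by '_'). The helper names, the example unique name and the token are made up for illustration, a plain dict stands in for the a{sv} vardict, and no D-Bus call is actually made.

```python
# Sketch only: AccessAudio is a proposal, and these helpers are hypothetical.
ACCEPTED_SOURCES = ("microphone", "speakers", "camera", "applications")


def _is_path_element(token):
    """True if token is a non-empty string of [A-Za-z0-9_] characters."""
    return bool(token) and all(
        (c.isascii() and c.isalnum()) or c == "_" for c in token
    )


def build_access_audio_options(sources, handle_token):
    """Return a plain-dict stand-in for the a{sv} options of AccessAudio."""
    unknown = [s for s in sources if s not in ACCEPTED_SOURCES]
    if unknown:
        raise ValueError(f"unsupported audio sources: {unknown}")
    if not _is_path_element(handle_token):
        raise ValueError("handle_token must be a valid object path element")
    return {"sources": list(sources), "handle_token": handle_token}


def expected_request_path(unique_name, handle_token):
    """Compute the Request object path for a caller's unique bus name."""
    sender = unique_name.lstrip(":").replace(".", "_")
    return f"/org/freedesktop/portal/desktop/request/{sender}/{handle_token}"
```

An app would subscribe to the Response signal on the computed path before making the call, then call OpenPipeWireRemote once access is granted.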

Design Considerations

Granting permission to every app that wants to play some audio would be annoying, and playing audio is not really security sensitive (at worst, a malicious app can be annoying), so permission to output audio ("speakers") should probably be given without popping up a dialog. Since this is still a permission, people will be able to revoke it in e.g. GNOME Settings or Flatseal. The important bit is to avoid nagging the user while still using a permission to control access.
Not sure if 'applications' should be in this list, or in a potential new Media Sharing portal. We can probably leave it out for now.
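
A minimal sketch of that policy, assuming a permission store that maps each source to a stored "yes" or "no". The function name, the store layout and the set of silently granted sources are all hypothetical:

```python
# Hypothetical policy sketch; the store layout and names are assumptions.
SILENTLY_GRANTED = {"speakers"}


def decide(sources, stored):
    """Split requested sources into granted, denied, and needs-prompt sets.

    `stored` maps a source name to "yes" or "no" for decisions the user has
    already made (for example by revoking access in a settings panel).
    """
    granted, denied, prompt = set(), set(), set()
    for source in sources:
        choice = stored.get(source)
        if choice == "no":
            # An explicit revocation always wins, even for "speakers".
            denied.add(source)
        elif choice == "yes" or source in SILENTLY_GRANTED:
            granted.add(source)
        else:
            # No stored decision and not auto-granted: ask the user.
            prompt.add(source)
    return granted, denied, prompt
```

With an empty store, a request for speakers and microphone grants the former without a dialog and prompts only for the latter.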

CC @hfiguiere @matthiasclasen @jadahl


matthiasclasen · 2023-10-06T19:43:43Z

One question here is: does it make sense to keep the camera separate?

Or could this just be a 'PipeWire access' portal?

I seem to remember that people have asked for screencast+audio before

Obsessee · 2023-10-06T19:52:26Z

Screencast + audio has been requested on the PipeWire repo and here on x-d-p before. It's a sought-after feature for livestreaming use cases as well as for applications like Zoom and Discord. I think Zoom already has a working implementation as of right now, but I could be wrong.

Something else for consideration: maybe you could restrict certain applications to certain speaker devices. A similar WirePlumber issue exists for this, but it hasn't been addressed so far. That is not to say there should be a pop-up to select which device plays audio (which would be quite ridiculous), but having a way to restrict an application to a single PipeWire sink, whose output can then be transformed further down the line, would be terrific.

GeorgesStavracas · 2023-10-06T20:00:52Z

I think the screencast+audio case would be entirely contained within the Screencast portal. Apps request a screencast with audio, get a PipeWire remote, and connect to whatever streams are available there.

@Obsessee per-device control sounds like a policy to implement on the media session (WirePlumber) level, not on portal level.

Mikenux · 2023-10-07T00:09:14Z

Since it's just audio, why not group camera and microphone as a request for audio input?

GeorgesStavracas · 2023-10-07T19:27:16Z

> Since it's just audio, why not group camera and microphone as a request for audio input?

Sure

Obsessee · 2023-10-07T23:40:18Z

Differentiating between switches for input and output would be really terrific. I'd like to disallow badly behaving apps from capturing my microphone for voice input when I'm just using them to play music. This would especially help with fat-fingering buttons on a Linux phone.

I feel like bringing this up now before there's an implementation because I don't think it'd be considered a big enough issue to revise the portal later down the line.

> Not sure if 'applications' should be in this list, or in a potential new Media Sharing portal. We can probably leave it out for now.

I think application audio would probably be a better fit for a Media Sharing portal, as it's not a (physical) device.

GeorgesStavracas · 2023-10-07T23:47:01Z

Another device category I did not consider: MIDI devices. They're managed by PipeWire as well.

hfiguiere · 2023-10-08T04:05:22Z

PulseAudio access is done with --socket=pulseaudio, which seems to grant ALSA (including MIDI), while other PipeWire access including JACK is done through the --filesystem=xdg-run/pipewire-0 permission.

About the sources:

microphone: this should be something like audio_inputs.
speakers: this should be something like audio_outputs.
camera: this should be something like video_inputs, or does that also include video_outputs?
applications: I don't even know what that is.
midi: (see the comment above) the MIDI bus, so this name seems to be unambiguous.

Also, how do you specify control of the graph? Take Helvum, for example, or a DAW that wants to control all the audio but does not need the video. Shall we have a permission for fuller access, or shall we grant it as is currently done with --filesystem=xdg-run/pipewire-0?

jadahl · 2023-10-08T09:57:47Z

This sounds a bit like the devices portal that has been discussed as a "replacement" for the camera portal. Perhaps a less generic name is needed than "devices".

An "audio" portal that doesn't involve cameras could work too I guess, but it should ideally be easy to open a PipeWire remote that gets you both microphone and camera streams.

The "OpenPipeWireRemote" you add here adds different sources that would be exposed in the remote, including "camera", is that intentional?

> microphone: this should be something like audio_inputs.
> speakers: this should be something like audio_outputs.
> applications: I don't even know what that is.

The point here is to create some predictable expectations. If an app gets "microphone" access, it shouldn't be able to eavesdrop on every app's PipeWire source. In other words, applications can be audio inputs, but that doesn't mean they are "microphones".
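
To make that expectation concrete, here is a small sketch of filtering the nodes exposed on a remote by the sources that were granted. The media.class strings are the values PipeWire commonly reports for device sources, device sinks and application playback streams. The mapping from portal source names to those classes, and the function name, are assumptions for illustration only. 'camera' is left out because the thread has not settled what it would cover.

```python
# Assumed mapping from proposed portal sources to PipeWire media.class values.
SOURCE_TO_MEDIA_CLASS = {
    "microphone": "Audio/Source",            # capture devices
    "speakers": "Audio/Sink",                # playback devices
    "applications": "Stream/Output/Audio",   # other apps' playback streams
}


def visible_nodes(nodes, granted_sources):
    """Return only the nodes whose media.class matches a granted source."""
    allowed = {SOURCE_TO_MEDIA_CLASS[s] for s in granted_sources}
    return [n for n in nodes if n.get("media.class") in allowed]
```

Under this mapping, an app granted only "microphone" sees capture devices but none of the other applications' streams.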

matthiasclasen · 2023-10-09T13:54:28Z

> The point here is to create some predictable expectations. If an app gets "microphone" access, it shouldn't be able to eavesdrop on every app's PipeWire source. In other words, applications can be audio inputs, but that doesn't mean they are "microphones".

Yes indeed.

GeorgesStavracas · 2023-10-09T16:45:23Z

We seem to cyclically float between "specific portals for independent categories of devices" and "generic portal covering pretty much all that PipeWire covers". I can see pros and cons for both approaches, and I haven't really made up my mind about it.

I like the elegance and explicit semantics of specific portals. Want a camera? Camera portal; need audio? Audio portal; etc. But the downside is that we have to copy-paste an almost identical API in each one of them (AccessSomething, OpenPipeWireRemote). We also force apps to open different PipeWire remotes for each class of devices they need access to. It may or may not be a problem.

A catch-all PipeWire portal ("Devices" portal) would solve the multiple remotes problem: apps would be able to request multiple classes of devices in a single call (AccessDevices(['camera', 'speakers', 'microphone']), etc.). But I wonder if we wouldn't be putting this portal in (ultimately) a dead end... what if we later find out that we need different options and flows for different devices?

I don't have a conclusive answer here, but I'm slightly happier with separate portals, even if that means apps having to open multiple PipeWire remotes. We can encode domain-specific knowledge in these portals - like the discussion above about audio device types, which wouldn't apply to camera video.
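
The multiple-remotes trade-off can be sketched as a count of the PipeWire remotes an app would open under each design. The portal-per-class mapping and the function name below are hypothetical; only the Camera portal exists today, and both "Audio" and "Devices" are proposals from this thread.

```python
# Hypothetical: which portal would serve each device class with specific portals.
PORTAL_FOR_CLASS = {
    "camera": "Camera",
    "microphone": "Audio",
    "speakers": "Audio",
}


def remotes_needed(device_classes, catch_all):
    """List the portals (one PipeWire remote each) an app would have to use."""
    if catch_all:
        # A single "Devices" portal hands out one remote for everything.
        return ["Devices"] if device_classes else []
    # Specific portals: one remote per distinct portal involved.
    return sorted({PORTAL_FOR_CLASS[c] for c in device_classes})
```

An app that needs a camera, speakers and a microphone would open two remotes with specific portals and one with a catch-all portal.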

GeorgesStavracas added the "new portals" and "needs discussion" labels Oct 6, 2023

github-project-automation bot added this to Triage Oct 6, 2023

github-project-automation bot moved this to Needs Triage in Triage Oct 6, 2023

GeorgesStavracas assigned hfiguiere Oct 6, 2023

GeorgesStavracas added this to XDG Portals workboard Oct 6, 2023

GeorgesStavracas moved this from Needs Triage to Triaged in Triage Oct 6, 2023

tytan652 mentioned this issue Oct 7, 2023

App to App Media Sharing #1130

Closed

This was referenced Oct 11, 2023

Add portal / permission for recording system audio #751

Closed

Add a microphone portal #615

Closed

flatpak locked and limited conversation to collaborators Oct 11, 2023

GeorgesStavracas converted this issue into discussion #1142 Oct 11, 2023

github-project-automation bot moved this to Done in XDG Portals workboard Oct 11, 2023
