Galicaster should not use episode.xml DC temporal metadata #567

Closed
smarquard opened this issue Dec 15, 2017 · 7 comments

@smarquard

Opencast 4.x introduces a split between dublin-core temporal metadata and scheduling metadata.

The rationale for this is to have descriptive metadata for the time of an event (say 6pm to 7pm), which could be different from the actual recording times (say 5:50pm to 7:30pm).

The descriptive metadata appears in the episode.xml as DC terms created and temporal:

<dcterms:created>2017-12-15T14:00:00.000Z</dcterms:created>
<dcterms:temporal xsi:type="dcterms:Period">start=2017-12-15T14:00:00Z; end=2017-12-15T14:05:00Z; scheme=W3C-DTF;</dcterms:temporal>

The scheduling metadata appears in the iCal feed as DTSTART and DTEND

DTSTART:20171215T090000Z
DTEND:20171215T090500Z

Galicaster should only ever use the DTSTART and DTEND fields. It should ignore the dcterms:created and temporal fields.
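As a rough illustration of deriving the schedule from the iCal fields alone, here is a minimal sketch. It assumes plain `DTSTART:`/`DTEND:` lines in the UTC basic format shown above; the function name `schedule_from_ical` is hypothetical and is not part of Galicaster. A real feed can carry property parameters (such as a TZID) and folded lines, which this sketch does not handle.

```python
from datetime import datetime

# Assumed format of the values above, e.g. 20171215T090000Z (UTC, basic format).
ICAL_UTC_FORMAT = "%Y%m%dT%H%M%SZ"


def schedule_from_ical(lines):
    """Return (start, duration_ms) taken only from DTSTART and DTEND.

    Hypothetical helper for illustration; not Galicaster's actual code.
    """
    fields = {}
    for line in lines:
        name, sep, value = line.strip().partition(":")
        if sep and name in ("DTSTART", "DTEND"):
            fields[name] = datetime.strptime(value, ICAL_UTC_FORMAT)
    start, end = fields["DTSTART"], fields["DTEND"]
    # total_seconds() covers the whole span; timedelta.seconds alone
    # would drop any whole days.
    duration_ms = int((end - start).total_seconds() * 1000)
    return start, duration_ms


start, duration_ms = schedule_from_ical([
    "DTSTART:20171215T090000Z",
    "DTEND:20171215T090500Z",
])
print(start.isoformat(), duration_ms)  # 2017-12-15T09:00:00 300000
```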

In ./galicaster/mediapackage/mediapackage.py, the temporal data is parsed and used to update scheduling info:

        # Parse temporal metadatum
        if self.metadata_episode.has_key('temporal') and self.metadata_episode['temporal'] and not self.hasTracks():
            try:
                g = re.search('start=(.*); end=(.*); ', self.metadata_episode['temporal'])
                start = parser.parse(g.group(1)).astimezone(tz.tzutc()).replace(tzinfo=None)
                stop = parser.parse(g.group(2)).astimezone(tz.tzutc()).replace(tzinfo=None)
                diff = stop - start
                self.setDuration(diff.seconds*1000)
                self.setDate(start)
            except:
                pass

This code should be removed as it can lead to Galicaster recording at the wrong times when the temporal and scheduling metadata are different.
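To make the mismatch concrete, the following sketch compares the descriptive start from the `dcterms:temporal` value above with the iCal `DTSTART` from the same example. The helper `parse_period` is hypothetical and assumes both timestamps are in UTC with a trailing `Z`, as in the snippets above.

```python
import re
from datetime import datetime


def parse_period(temporal):
    """Parse a dcterms:Period string into (start, end) datetimes.

    Hypothetical helper; assumes UTC timestamps ending in 'Z'.
    """
    match = re.search(r"start=([^;]+);\s*end=([^;]+);", temporal)
    if match is None:
        raise ValueError("not a dcterms:Period value: %r" % temporal)
    fmt = "%Y-%m-%dT%H:%M:%SZ"
    return tuple(datetime.strptime(g.strip(), fmt) for g in match.groups())


temporal = "start=2017-12-15T14:00:00Z; end=2017-12-15T14:05:00Z; scheme=W3C-DTF;"
dc_start, dc_end = parse_period(temporal)
ical_start = datetime.strptime("20171215T090000Z", "%Y%m%dT%H%M%SZ")

# The descriptive start is five hours after the scheduled start, so
# scheduling from the temporal field would record at the wrong time.
offset = dc_start - ical_start
print(offset)  # 5:00:00
```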

@smarquard
Author

Possibly related: #470, #365, #566

@smarquard
Author

Sven notes here:

https://bitbucket.org/opencast-community/opencast/pull-requests/1885/mh-12609-as-a-user-i-expect-scheduling-of/diff#comment-52017765

Note that for ad-hoc recordings (not scheduled by Opencast), capture agents indeed should set starttime, duration and location in the dublincore as Opencast would not know this data. I think that is consistent with setting starttime, duration and location in the dublincore to the initial technical metadata values when creating scheduled events in Opencast.

So if Galicaster uses the iCal DTSTART and DTEND for scheduled events, everything would be just fine.
As for scheduled events, it likely would be preferable if capture agents don't modify the starttime/duration/location in the dublincore/episode they receive as iCal attachment.

smarquard added a commit to cilt-uct/Galicaster that referenced this issue Jan 11, 2018
Always use the iCal start/end time for scheduling, and do not override this with
temporal data from the mediapackage.
@smarquard
Author

smarquard commented Jan 11, 2018

One problem with this is that Galicaster relies on the mediapackage start and end dates to get the recording duration. This should be calculated from DTEND less DTSTART from the iCal feed, and for a scheduled event, should then be saved in the manifest.xml as an attribute of the mediapackage element.

So to summarize what I think is correct behaviour:

episode.xml created and temporal metadata are cosmetic and should be ignored by Galicaster (also never changed by Galicaster)

The iCal DTSTART and DTEND define the actual recording start time and duration. These should be recorded in manifest.xml. Prior to recording, the mediapackage element's start time is DTSTART and its duration is DTEND minus DTSTART. After recording, the duration is the actual recorded media duration.
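A minimal sketch of saving the scheduled values on the mediapackage element, under some assumptions: the attribute names `start` and `duration` follow the manifest.xml properties discussed in this thread, the manifest here is a bare placeholder without the namespace or child elements, and `set_schedule_on_manifest` is a hypothetical helper rather than Galicaster code.

```python
import xml.etree.ElementTree as ET
from datetime import datetime


def set_schedule_on_manifest(manifest_xml, start, duration_ms):
    """Set start and duration attributes on the root mediapackage element.

    Hypothetical helper; 'start' is a naive datetime assumed to be UTC.
    """
    root = ET.fromstring(manifest_xml)
    root.set("start", start.strftime("%Y-%m-%dT%H:%M:%SZ"))
    root.set("duration", str(duration_ms))
    return ET.tostring(root, encoding="unicode")


# Placeholder manifest; a real one carries a namespace and child elements.
manifest = '<mediapackage id="example"/>'

# Scheduled values from the example above: DTSTART, and DTEND minus DTSTART.
updated = set_schedule_on_manifest(manifest, datetime(2017, 12, 15, 9, 0, 0), 300000)
print(updated)
```

After the recording finishes, the same attribute would be overwritten with the actual media duration.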

@Alfro
Contributor

Alfro commented Jan 17, 2018

I agree that relying on dcterms:temporal for scheduling purposes is not the best strategy. I'll make some changes to use DTSTART and DTEND to calculate the start time and initial duration instead.

However, if DTSTART and dcterms:created differ, shouldn't we update the latter? Or, for example, if a recording starts earlier than scheduled, shouldn't we update both the manifest.xml start property and the dcterms:created field?

@smarquard
Author

The separation in OC 4.x between descriptive metadata (dublincore) and scheduling metadata (in the ical) basically means that Galicaster should both ignore and never update the dublincore episode.xml (unless it's creating it for an ad-hoc recording, in which case it sets the DC values to be the actual recording start/stop dates).

An example is a public lecture advertised for 5pm to 6:30pm: that is what goes in the DC metadata for the event, and ultimately it would get published with that info. But the scheduling could be, say, 4:50pm to 6:45pm in case it starts early or runs over, and the trimmed media may in any event start at 5:03pm or so, with a shorter media duration.

This separation is a new concept to Opencast so there may be different views, but the above is my interpretation.

I think the manifest.xml start property should be the actual recording start date/time, because the manifest is about the actual media created, rather than the metadata that describes it.

@rubenrua
Contributor

#573 merged.

@Alfro
Contributor

Alfro commented Feb 8, 2018

@smarquard We have added this to the 2.1.0 release. We essentially never use the dcterms: fields now, and rely on the manifest.xml duration and start instead. For unscheduled recordings, we set both values when the mediapackage is created.

Thanks!

@Alfro Alfro closed this as completed Feb 8, 2018